How to Pull MLS Listings into MongoDB on a Schedule with the HTTP Connector

Build a Spojit workflow that polls your MLS or RESO Web API on a cron schedule, flattens the listing feed, skips records that have not changed, and upserts each listing into a local MongoDB warehouse for downstream search and reporting.

What This Integration Does

Real estate teams usually have listing data trapped behind an MLS or RESO Web API that no portal, CRM, or reporting tool can read directly. This workflow gives you a clean, queryable copy of that feed. A Schedule trigger calls the listings endpoint at a fixed interval, a Transform node reshapes the nested feed into flat rows, a Condition node skips records whose last-modified timestamp has not advanced, and the mongodb connector upserts the remaining records into a single collection. The result is a warehouse you can search, slice into board reports, or feed into other Spojit workflows.

The run model is a scheduled poll. On every tick the workflow fetches the current page of listings, compares each listing's last-modified value against the copy already stored, and writes only the new or changed ones. Because each write is an upsert keyed on the listing's unique id, re-runs are idempotent: running the same poll twice does not create duplicates. An unchanged listing is skipped, and a changed one overwrites the fields of its existing document. State lives entirely in your MongoDB collection, so the workflow holds no memory between runs and you can safely pause, resume, or replay it.

Prerequisites

  • An MLS or RESO Web API endpoint that returns listings as JSON, plus an API key or bearer token you can pass in an Authorization header. Spojit reaches this system through the built-in http connector, so no vendor-specific connection is required.
  • A mongodb connection configured under Connections, pointing at the database that will hold your warehouse. See Adding a New Connection if you have not set one up yet.
  • The name of the field your MLS uses as a stable unique key for each listing (commonly ListingKey or ListingId) and the field that carries the last-modified timestamp (commonly ModificationTimestamp).
  • A target collection name in MongoDB, for example listings. It does not need to exist in advance; the first upsert creates it.

Step 1: Add a Schedule trigger

Start a new workflow and set the Trigger node type to Schedule. The schedule uses a 5-field Unix cron expression plus an IANA timezone. To poll every 30 minutes on weekdays, from 08:00 through 18:30 (the hour range 8-18 is inclusive), use:

*/30 8-18 * * 1-5

Set the timezone to your local zone, for example America/New_York or Australia/Sydney. The trigger output is {{ scheduledAt }}, the timestamp of the tick, which is handy when you later want to log when a poll ran. A single Schedule trigger can hold several schedules if you want different cadences for peak and off-peak hours.
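
If you want to check which ticks an expression produces before saving the trigger, a small Python sketch can expand it for a given day. This is an illustration only: the helpers field_matches, matches_cron, and ticks_on are hypothetical, they handle just the wildcard, step, range, and plain-number forms used in this guide, and they are not how Spojit evaluates schedules.

```python
from datetime import datetime, timedelta


def field_matches(spec: str, value: int) -> bool:
    """Match one cron field; supports '*', '*/n', 'a-b', and plain numbers only."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0
    if "-" in spec:
        low, high = (int(part) for part in spec.split("-"))
        return low <= value <= high
    return value == int(spec)


def matches_cron(expr: str, moment: datetime) -> bool:
    """Check a 5-field cron expression (minute hour day month weekday)."""
    minute, hour, day, month, weekday = expr.split()
    # cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


def ticks_on(expr: str, day: datetime) -> list[str]:
    """List every HH:MM on the given calendar day at which the expression fires."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    minutes = (start + timedelta(minutes=offset) for offset in range(24 * 60))
    return [m.strftime("%H:%M") for m in minutes if matches_cron(expr, m)]


monday_ticks = ticks_on("*/30 8-18 * * 1-5", datetime(2024, 1, 8))    # a Monday
saturday_ticks = ticks_on("*/30 8-18 * * 1-5", datetime(2024, 1, 6))  # a Saturday
print(len(monday_ticks), monday_ticks[0], monday_ticks[-1])
print(saturday_ticks)
```

If you would rather have the last weekday poll fire at 17:30, change the hour field to 8-17.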

Step 2: Fetch the listings with the http connector

Add a Connector node in Direct mode on the http connector and choose the http-get tool. Point the url at your MLS or RESO listings endpoint and pass your credential in the Authorization header. A typical RESO Web API request looks like this:

url: https://api.yourmls.com/reso/odata/Property?$top=200&$orderby=ModificationTimestamp desc
headers:
  Authorization: Bearer YOUR_API_TOKEN
  Accept: application/json

Use Direct mode here because the call is deterministic and predictable, which keeps it fast and free of AI cost. Name the node result so later steps can reference it, for example mls. The parsed response is then available as {{ mls.data }}. If your feed paginates, keep $top modest (for example 200). Because the request sorts by ModificationTimestamp descending, each poll returns the most recently modified listings, so frequent polls pick up changes as they happen. If more than $top listings can change between two polls, read How to Connect to Any REST API Using HTTP Requests for paging patterns.
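
To try the same request outside Spojit, for example to confirm your token works, the sketch below builds it with Python's standard library. It only constructs the request and does not send it. The host is the placeholder from the example above, and build_listings_request is a hypothetical helper. The space in the $orderby value is percent-encoded here. Whether the http connector encodes it for you is not covered in this guide, so treat that as something to verify.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Placeholder endpoint taken from the example above; replace with your MLS host.
BASE_URL = "https://api.yourmls.com/reso/odata/Property"


def build_listings_request(token: str, top: int = 200) -> Request:
    """Build (but do not send) a GET request for the most recently modified listings."""
    params = {"$top": top, "$orderby": "ModificationTimestamp desc"}
    # quote_via=quote encodes the space as %20; safe="$" leaves the OData "$" prefix as-is.
    query = urlencode(params, quote_via=quote, safe="$")
    return Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


request = build_listings_request("YOUR_API_TOKEN")
print(request.full_url)
```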

Step 3: Flatten the feed with a Transform node

RESO and most MLS feeds return listings wrapped in an envelope (for example a top-level value array) with nested address and media objects. Add a Transform node to reshape each listing into a flat row that maps cleanly onto a MongoDB document. Pull the array out of the envelope and select just the fields you want to warehouse, projecting nested values up to the top level:

{{ mls.data.value }} mapped to:
{
  listingKey:    item.ListingKey,
  status:        item.StandardStatus,
  listPrice:     item.ListPrice,
  bedrooms:      item.BedroomsTotal,
  city:          item.City,
  postalCode:    item.PostalCode,
  modifiedAt:    item.ModificationTimestamp
}

If you prefer to do this with utility tools rather than a single Transform, the json connector's flatten tool collapses nested objects into dotted keys, and the array connector's pluck and filter tools extract and trim collections. Name this node result rows so the next steps can iterate over {{ rows }}. Keeping a flat shape now makes both the change check and the upsert filter trivial.
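
As a reference for what the Transform should produce, here is the same projection written as a plain Python function. The function name flatten_listings and the sample payload are made up for illustration. The field names follow the mapping above, and a field missing from a listing becomes None instead of raising an error.

```python
def flatten_listings(response: dict) -> list[dict]:
    """Turn a RESO-style envelope ({"value": [...]}) into flat rows, one per listing."""
    rows = []
    for item in response.get("value", []):
        rows.append(
            {
                "listingKey": item.get("ListingKey"),
                "status": item.get("StandardStatus"),
                "listPrice": item.get("ListPrice"),
                "bedrooms": item.get("BedroomsTotal"),
                "city": item.get("City"),
                "postalCode": item.get("PostalCode"),
                "modifiedAt": item.get("ModificationTimestamp"),
            }
        )
    return rows


# Invented sample payload, shaped like the envelope described above.
sample = {
    "value": [
        {
            "ListingKey": "ABC123",
            "StandardStatus": "Active",
            "ListPrice": 450000,
            "BedroomsTotal": 3,
            "City": "Austin",
            "PostalCode": "78701",
            "ModificationTimestamp": "2024-05-01T12:00:00Z",
            "Media": [{"MediaURL": "https://example.com/1.jpg"}],  # not projected, so dropped
        }
    ]
}
rows = flatten_listings(sample)
print(rows)
```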

Step 4: Loop over listings and read the stored copy

Add a Loop node in ForEach mode over {{ rows }} so each listing is processed individually. Inside the loop, add a Connector node in Direct mode on the mongodb connector and choose the find-documents tool to read the existing copy of this listing. Set the collection to listings and a filter that matches on the unique key:

collection: listings
filter: { "listingKey": "{{ item.listingKey }}" }
limit: 1

Name the result existing. When the listing is brand new, the find returns an empty list, which the next step treats as "changed" so the record is written. This read is what lets you skip unchanged listings instead of blindly rewriting the entire feed on every poll.

Step 5: Skip unchanged records with a Condition node

Add a Condition node that compares the incoming last-modified timestamp against the stored one. Configure the true branch to fire only when the listing is new or has been updated since the last poll:

{{ existing.documents.length }} is 0
  OR {{ item.modifiedAt }} is after {{ existing.documents.0.modifiedAt }}

If your timestamps are strings that do not compare cleanly, normalize them first with the date connector's parse or unix tool so you are comparing numeric values. Route the true branch into the upsert step in Step 6; leave the false branch empty so unchanged listings simply fall through and the loop moves on. This is the gate that keeps each run cheap: only genuinely changed listings reach the write.
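
The logic of this gate can be sketched in Python to show why normalizing matters. The helpers parse_timestamp and should_write are hypothetical stand-ins for the date connector and the Condition node, and the sketch assumes ISO 8601 timestamps, treating values without an offset as UTC.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' or a missing offset is read as UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def should_write(row: dict, existing_documents: list[dict]) -> bool:
    """True when the listing is new or its modifiedAt is later than the stored copy."""
    if not existing_documents:
        return True  # nothing stored yet, so treat the listing as changed
    stored = existing_documents[0]["modifiedAt"]
    return parse_timestamp(row["modifiedAt"]) > parse_timestamp(stored)


stored_copy = [{"modifiedAt": "2024-05-01T12:00:00Z"}]
incoming = {"modifiedAt": "2024-05-01T08:30:00-04:00"}  # 12:30 UTC, later than the stored copy

print(should_write(incoming, stored_copy))                   # compared as instants
print(incoming["modifiedAt"] > stored_copy[0]["modifiedAt"])  # compared as raw strings
```

In this example the two comparisons disagree: the parsed values show the incoming listing is newer, while the raw string comparison says it is older because "08" sorts before "12".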

Step 6: Upsert each listing into MongoDB

On the true branch of the Condition, add a Connector node in Direct mode on the mongodb connector and choose the update-documents tool. Match on the unique key, set every field of the row with the $set operator, and turn on upsert so a missing listing is inserted and an existing one is updated in place:

collection: listings
filter: { "listingKey": "{{ item.listingKey }}" }
update: { "$set": {
  "listingKey":  "{{ item.listingKey }}",
  "status":      "{{ item.status }}",
  "listPrice":   {{ item.listPrice }},
  "bedrooms":    {{ item.bedrooms }},
  "city":        "{{ item.city }}",
  "postalCode":  "{{ item.postalCode }}",
  "modifiedAt":  "{{ item.modifiedAt }}"
} }
upsert: true

The tool returns matchedCount, modifiedCount, upsertedCount, and upsertedId, which you can sum across the loop to report how many listings were inserted versus updated on each poll. Because the filter is keyed on listingKey and upsert is on, re-running the workflow never creates duplicates. Create a unique index on listingKey in your database once, up front, so the warehouse stays clean even under concurrent polls.
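
To make the counters concrete, the sketch below imitates a $set upsert against an in-memory dictionary and sums the results across two polls. It is not the MongoDB driver or the update-documents tool: upsert_listing and run_poll are hypothetical helpers, the dictionary keyed on listingKey stands in for the collection and its unique index, and the sample rows are invented.

```python
from collections import Counter


def upsert_listing(store: dict, row: dict) -> dict:
    """Apply a $set-style upsert keyed on listingKey and report what happened."""
    key = row["listingKey"]
    current = store.get(key)
    if current is None:
        store[key] = dict(row)  # no match: insert a new document
        return {"matchedCount": 0, "modifiedCount": 0, "upsertedCount": 1}
    merged = {**current, **row}  # $set overwrites the named fields and keeps the rest
    changed = merged != current
    store[key] = merged
    return {"matchedCount": 1, "modifiedCount": int(changed), "upsertedCount": 0}


def run_poll(store: dict, rows: list[dict]) -> Counter:
    """Upsert every row and sum the counters across the loop."""
    totals = Counter()
    for row in rows:
        totals.update(upsert_listing(store, row))
    return totals


warehouse = {}
rows = [
    {"listingKey": "ABC123", "listPrice": 450000, "modifiedAt": "2024-05-01T12:00:00Z"},
    {"listingKey": "DEF456", "listPrice": 615000, "modifiedAt": "2024-05-01T12:05:00Z"},
]
first_poll = run_poll(warehouse, rows)
second_poll = run_poll(warehouse, rows)  # replaying the same rows adds nothing
print(dict(first_poll), dict(second_poll), len(warehouse))
```

Replaying identical rows matches both documents but modifies none, which is the idempotent behavior the workflow relies on.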

Tips

  • Ask Miraxa, the intelligent layer across your automation, to scaffold the skeleton for you: try "Build a workflow with a Schedule trigger that calls http-get, loops over the results, and upserts each into the mongodb listings collection." Then fine-tune each node in the properties panel.
  • Most MLS feeds support server-side sorting and filtering. Adding $orderby=ModificationTimestamp desc and a $filter on a recent timestamp shrinks each poll dramatically and reduces the work the Condition node has to do.
  • For very large feeds, run the upsert step in batches rather than one document at a time by collecting changed rows first and using the mongodb insert-documents tool for net-new records, reserving update-documents for changes.
  • Store the highest modifiedAt you have seen in a small control document in the same collection so the next poll can request only listings changed after it, turning a full scan into an incremental sync.
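
The incremental sync idea in the last tip can be sketched as two small Python functions. Both names (next_watermark and incremental_query) are hypothetical, and the sketch assumes every modifiedAt has already been normalized to a UTC string in the same ISO 8601 format, so that plain string comparison orders them correctly. Sorting ascending here is also an assumption, made so that a capped page advances the watermark without skipping older changes.

```python
from typing import Optional
from urllib.parse import quote


def next_watermark(rows: list[dict], previous: Optional[str]) -> Optional[str]:
    """Return the highest modifiedAt seen so far (assumes normalized UTC strings)."""
    candidates = [row["modifiedAt"] for row in rows if row.get("modifiedAt")]
    if previous:
        candidates.append(previous)
    return max(candidates) if candidates else None


def incremental_query(watermark: Optional[str], top: int = 200) -> str:
    """Build the OData query string, adding a $filter only once a watermark exists."""
    parts = [f"$top={top}", "$orderby=" + quote("ModificationTimestamp asc")]
    if watermark:
        parts.append("$filter=" + quote(f"ModificationTimestamp gt {watermark}"))
    return "&".join(parts)


rows = [
    {"listingKey": "ABC123", "modifiedAt": "2024-05-01T12:00:00Z"},
    {"listingKey": "DEF456", "modifiedAt": "2024-05-02T09:15:00Z"},
]
watermark = next_watermark(rows, previous="2024-04-30T00:00:00Z")
print(incremental_query(None))       # first run: no filter yet
print(incremental_query(watermark))  # later runs: only listings changed after the watermark
```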

Common Pitfalls

  • Schedule cron runs in the timezone you pick, not the server's. A real estate office that expects 9am polls in America/Chicago will see them fire at the wrong local time if the timezone field is left on a default. Always set it explicitly.
  • Timestamp comparison fails silently when one side is a string and the other is a date. Normalize both with the date connector before the Condition node. Otherwise every listing may look unchanged, so nothing is written, or every listing may look changed, so the whole feed is rewritten on each poll.
  • Forgetting upsert: true means new listings are matched against nothing and never written. Confirm the flag is on, and confirm your filter uses the same key field you index on.
  • MLS APIs rate-limit aggressively. Keep $top reasonable, avoid polling more often than the feed updates, and respect any Retry-After header the API returns rather than hammering the endpoint on a tight cron.
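
If you handle rate limiting in your own tooling, the Retry-After header can carry either a number of seconds or an HTTP date. The Python sketch below converts both forms into a wait time. The helper retry_after_seconds is hypothetical, and how Spojit itself reacts to this header is not described in this guide.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str], now: datetime) -> float:
    """Convert a Retry-After header (delta-seconds or HTTP-date) into seconds to wait."""
    if not header_value:
        return 0.0
    value = header_value.strip()
    if value.isdigit():
        return float(value)  # delta-seconds form, for example "120"
    try:
        retry_at = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return 0.0  # unparseable header: no extra wait in this sketch
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())


now = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
print(retry_after_seconds("120", now))
print(retry_after_seconds("Wed, 01 May 2024 12:00:30 GMT", now))
```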

Testing

Before turning the schedule loose, narrow the scope. Temporarily set the http-get url to request a single known listing (for example by adding a $filter on one ListingKey), then use the Run button to fire the workflow manually and watch the execution. Confirm the Transform produced a clean flat row, the Condition correctly reported the listing as new on the first run, and the update-documents result shows upsertedCount of 1. Run it a second time with no change to the source and verify the Condition skips the record and nothing is written. Once both paths behave, widen $top, remove the test filter, and enable the schedule.
