How to Validate and Clean a Partner EDI-Style CSV Before Loading Orders into NetSuite

Build a scheduled Spojit workflow that pulls a partner's EDI-style order CSV from an FTP server, parses and validates required columns and SKU formats, removes duplicate lines, quarantines bad rows through a reusable Subworkflow, and creates NetSuite sales orders only from clean rows.

What This Integration Does

Wholesale and manufacturing partners rarely send you clean EDI. Instead of a strict 850 document you get a flat CSV dropped onto an FTP folder overnight: missing customer references, blank quantities, malformed SKUs, and the same line repeated twice because their export ran again. Loading that straight into your ERP creates broken sales orders, wrong shipments, and hours of manual cleanup. This workflow puts a validation gate in front of NetSuite so only rows that pass your rules ever become orders, and everything that fails is set aside in a quarantine file your team can review.

The run model is fully scheduled and unattended. A Schedule trigger fires each morning, a Connector node on the ftp connector downloads the latest partner file, and the csv connector parses it into rows. Each row is checked against required-column and SKU-format rules, exact duplicate lines are stripped, and a reusable Subworkflow writes the rejects to a quarantine CSV. The clean rows are looped into create-record calls on the netsuite connector. Re-runs are safe by design: the workflow reads whatever file is currently on FTP, and because validation and dedupe run before any order is created, a duplicate or partially-bad file produces no duplicate or broken sales orders.

Prerequisites

  • An ftp connection in Spojit with read access to the folder where your partner drops order files, and write access to a quarantine folder (see the FTP/SFTP connector article).
  • A netsuite connection authorized to create sales-order records (see the NetSuite connector article).
  • The exact partner file layout: the column headers you expect, which columns are required, your customer/entity reference, and your SKU (item) format so you can write a regex for it.
  • A mapping from the partner's customer reference and SKU values to the matching NetSuite entity and item internal IDs.
  • The csv and validation utility connectors are built in and need no setup.

Step 1: Schedule the pull with a Schedule trigger

Add a Trigger node and set its type to Schedule. Use a 5-field Unix cron expression plus an IANA timezone so the workflow runs after the partner's nightly drop has landed. For a 6 AM weekday pull in Sydney, set the cron to 0 6 * * 1-5 and the timezone to Australia/Sydney. A single Schedule trigger can hold more than one schedule if different partners deliver at different times. The trigger output is simply { scheduledAt }, which you can use later to name the quarantine file.

Step 2: Download the partner file from FTP

Add a Connector node on the ftp connector in Direct mode and select the download-file tool. Set the remote path to the partner's file, for example /inbound/partner-acme/orders-latest.csv. If the partner names files by date, build the path with the date connector first (a format call on {{ trigger.scheduledAt }}) and reference that value in the path. The download output exposes the file body as {{ ftp_download.data.content }}. If you want the workflow to skip cleanly when no file is present, add a get-file-info call ahead of the download and gate it with a Condition node.

Step 3: Parse the CSV into rows

Add a Connector node on the csv connector in Direct mode and select the parse tool. Feed the downloaded file body into the input and confirm the header row is treated as column names so each parsed row becomes a keyed object. Name the output variable parsed. From here you can reference column values per row as, for example, {{ row.customer_ref }}, {{ row.sku }}, and {{ row.quantity }}. If the partner sends tab-separated EDI exports, the same parse tool handles them: set the delimiter to a tab.

Step 4: Validate required columns and SKU format, then split good from bad

You want two checks: required fields are present, and each SKU matches your item format. Add a Connector node on the csv connector with the filter tool to keep rows where required columns are non-empty, for example a filter that requires customer_ref, sku, and quantity to be present. For the SKU shape, add a Connector node on the validation connector: use the numeric check for purely numeric item codes, or run a regex connector test against a pattern such as ^[A-Z]{3}-\d{4,6}$ for prefixed SKUs. Combine these into a clean set and a reject set. A reliable pattern is to run two filter passes that mirror each other so every parsed row lands in exactly one of cleanRows or badRows:

cleanRows  = rows where customer_ref, sku, quantity are all present
             AND sku matches ^[A-Z]{3}-\d{4,6}$
             AND quantity is numeric and > 0

badRows    = every other parsed row

For each rejected row, attach a short reason (missing field, bad SKU, non-numeric quantity) using a Transform node so reviewers can act on the quarantine file without guessing.

Step 5: Remove duplicate order lines

Partner exports frequently repeat lines when their job runs twice. Add a Connector node on the csv connector with the dedupe tool and run it over cleanRows. Choose the key columns that define a unique order line: for EDI-style files that is usually the combination of customer_ref, the partner's purchase-order number, sku, and quantity. Dedupe keeps the first occurrence and drops exact repeats, so a doubled file produces a single sales order. Name the output dedupedRows. Run dedupe after validation so malformed rows never participate in the uniqueness key.

Step 6: Quarantine the bad rows with a reusable Subworkflow

Build the quarantine step once and reuse it across every partner feed. Create a separate workflow named something like quarantine-rows with a Manual trigger that expects an input of { rows, source, scheduledAt }. Inside it, convert the incoming rejects back to CSV with the csv connector from-json tool, then add a Connector node on the ftp connector with upload-file to write the quarantine file to a path such as /quarantine/partner-acme-{{ input.scheduledAt }}.csv. Optionally end it with a Send Email node to alert your operations inbox with the reject count.

Back in the main workflow, add a Subworkflow node, set Workflow to quarantine-rows, and pass Input as:

{
  "rows": {{ badRows }},
  "source": "partner-acme",
  "scheduledAt": "{{ trigger.scheduledAt }}"
}

The parent pauses while the child runs and resumes once the quarantine file is written. Because the logic lives in one child workflow, changing the quarantine path or alert format updates every partner feed that calls it at once.

Step 7: Create NetSuite sales orders from clean rows

Add a Loop node in ForEach mode over {{ dedupedRows }}. Inside the loop, add a Connector node on the netsuite connector in Direct mode with the create-record tool. Set the record type to a sales order and map the partner row into NetSuite fields, translating the partner's customer_ref to the NetSuite entity ID and the partner SKU to the item internal ID using your mapping table. A minimal create payload looks like:

{
  "recordType": "salesOrder",
  "values": {
    "entity": "{{ row.netsuiteEntityId }}",
    "otherRefNum": "{{ row.po_number }}"
  }
}

For multi-line orders, create the order header first, then add lines with the netsuite add-sublist-item tool against the item sublist, passing the mapped item ID and quantity for each line. If you would rather group all lines for one partner PO into a single sales order, group dedupedRows by PO number with the array connector group-by tool before the loop so each iteration builds one complete order.

Tips

  • Keep the SKU and required-column rules in the filter and validation nodes, not in the NetSuite step. Validating before create-record means a bad file costs zero ERP writes.
  • Always dedupe after validation, never before. Running dedupe first can collapse rows whose only difference is an invalid field and hide the problem.
  • Name the quarantine file with {{ trigger.scheduledAt }} so each day's rejects are a separate file and nothing is overwritten.
  • Ask Miraxa, the intelligent layer across your automation, to scaffold the skeleton with a prompt like "Add a Schedule trigger, an ftp download-file node, a csv parse node, and a Loop over the rows into a netsuite create-record node," then fine-tune each node in the properties panel.

Common Pitfalls

  • Header drift. Partners rename or reorder columns without warning. Reference columns by name from the parsed rows rather than by position, and let the required-column filter reject the whole run loudly if an expected header is missing.
  • Quantity as text. CSV values arrive as strings, so a non-numeric or zero quantity can slip past a presence-only check. Add an explicit validation numeric test and a greater-than-zero rule before creating any order.
  • Unmapped SKUs. A SKU can be correctly formatted yet have no NetSuite item mapping. Treat a missing mapping as a quarantine reason in Step 4 rather than letting create-record fail mid-loop.
  • Timezone mismatch. If the partner drops files in their local night, a UTC cron can run before the file lands. Set the Schedule trigger timezone to match the partner's delivery window, not your own.

Testing

Before scheduling, point the ftp download-file path at a small sample file that deliberately includes one missing-field row, one bad-SKU row, and one duplicated line. Run the workflow with the Run button and open the execution log to confirm the duplicate was dropped by dedupe, both bad rows landed in the quarantine file via the Subworkflow, and only the clean rows reached create-record. Verify in NetSuite that the expected sales orders exist with the right entity, items, and quantities and that no duplicates were created. Once a sample run is clean end to end, enable the Schedule trigger for the live partner folder.

Learn More

Did this answer your question? Thanks for the feedback There was a problem submitting your feedback. Please try again later.