How to Process Emailed Purchase Order PDFs into NetSuite Sales Orders with Transient Knowledge
Turn a PDF purchase order that a buyer emails to a Spojit mailhook address into a NetSuite sales order, using a transient Knowledge collection to read the attachment and a Human approval step to gate large orders.
What This Integration Does
Wholesale and manufacturing teams often receive purchase orders as PDF attachments rather than structured data: a buyer attaches a formatted PO to an email and sends it to your orders inbox. Someone then re-keys the buyer, the line items, the quantities, and the prices into NetSuite by hand. This workflow removes that step. A buyer emails the PDF to a dedicated Spojit mailhook address, Spojit fetches the attachment bytes, embeds the PDF into a one-off Knowledge collection, asks the intelligent layer to extract the order into a clean JSON shape, pauses for approval when the order is large, and then creates the sales order in NetSuite via the netsuite connector.
The workflow runs on a Mailhook trigger, so it fires within seconds of an email arriving at the generated address (no mailbox or OAuth needed). Each run handles a single inbound email and leaves one NetSuite sales order behind (plus a notification email back to the buyer). The transient Knowledge collection is created per run and cleaned up automatically when the run completes, so nothing is stored long term. The trigger deduplicates per message, not per PO, and NetSuite create-record is a single insert, so a buyer who re-sends the same unchanged PO in a new email starts a fresh run and creates a second sales order; check the buyer's own PO number against existing orders if you want to avoid duplicates. This tutorial centers on PDF-attachment extraction via transient Knowledge plus large-order approval; if your buyers send PO details as plain email text instead, see the related plaintext tutorial linked at the end.
Prerequisites
- A NetSuite connection added in Spojit under Connections, with permission to create salesOrder records and look up customers.
- The NetSuite internal IDs (or external IDs) for the customers and items you expect on incoming POs, or a reliable way to match them by name. A small mapping of buyer email or company name to NetSuite customer ID is recommended.
- The orders inbox or vendor address that will forward POs, so you can point it at the mailhook address you generate in Step 1.
- An approval threshold agreed with finance (for example, orders over 10,000 in your currency require sign-off) and at least one approver user, role, or team to assign.
- If you want to notify the buyer, your org's email recipient allowlist configured under Settings → General → Email recipients.
Step 1: Receive the PO email on a Mailhook trigger
Create a new workflow and add a Trigger node. Set Trigger Type to Mailhook. Enter an Address prefix such as po, then choose Generate email address to produce a unique address like po-1a2b3c4d5e6f7g8h@mailhook.spojit.com. Copy it and have buyers (or your forwarding rule) send POs there. Optionally set a From allowlist of known buyer domains and a Subject regex such as (?i)purchase\s*order|^PO[-\s] so only real POs start a run.
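Before relying on the Subject regex, it can help to try it against a few sample subjects. The sketch below uses Python's `re` module purely as a stand-in: which regex engine Spojit applies to mailhook subjects, and whether it uses search or full-match semantics, is an assumption to verify with a test email. The sample subjects and the `subject_matches` helper are made up for illustration.

```python
import re

# Pattern copied from the step above. Python's `re` is only a stand-in here;
# the engine Spojit uses may differ in edge cases.
SUBJECT_PATTERN = re.compile(r"(?i)purchase\s*order|^PO[-\s]")

def subject_matches(subject: str) -> bool:
    """Return True when the subject would pass the filter (search semantics assumed)."""
    return SUBJECT_PATTERN.search(subject) is not None

samples = [
    "Purchase Order 4411 from Acme",
    "PO-4411",
    "po 4411 attached",
    "FW: PO 4411",
    "Invoice 889",
]
for s in samples:
    print(f"{s!r}: {subject_matches(s)}")
```

Under these assumptions the first three subjects pass and the last two do not. Note that "FW: PO 4411" fails because `^` anchors the second alternative to the start of the subject, so a forwarding rule that prefixes subjects may need a looser pattern.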
The trigger fires per inbound email and exposes the message as {{ input }}, including {{ input.from }}, {{ input.subject }}, {{ input.replyTo }}, and an {{ input.attachments }} list where each entry is a reference of { id, filename, contentType }. The bytes themselves are fetched in the next step.
Step 2: Fetch the PDF bytes with an Attachment node
Add an Attachment node directly after the trigger. Set Mode to Single so you get the first matching attachment as one object. Set the Content type filter to application/pdf and the Filename pattern to *.pdf so only the PO PDF is selected. Turn on Fail if no attachment matches so an email with no PDF stops cleanly rather than producing an empty order.
In Single mode the node outputs { filename, contentType, size, content }, where content is the base64-encoded PDF. Name the node so you can reference it as {{ attachment }}. Keep in mind the default limits of 10 MB per attachment and 25 MB per run; a single PO PDF is comfortably within these.
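If you ever need to inspect the attachment outside the workflow (for example when debugging a failed run), the `content` field can be decoded and sanity-checked as in the minimal sketch below. The `decode_pdf` helper and the sample payload are hypothetical; only the field names and the 10 MB limit come from the description above, and whether `size` counts decoded or encoded bytes is an assumption.

```python
import base64

MAX_ATTACHMENT_BYTES = 10 * 1024 * 1024  # default per-attachment limit cited above

def decode_pdf(attachment: dict) -> bytes:
    """Decode the base64 `content` field and run two basic sanity checks."""
    raw = base64.b64decode(attachment["content"], validate=True)
    # Every PDF file starts with the "%PDF-" header.
    if not raw.startswith(b"%PDF-"):
        raise ValueError(f"{attachment.get('filename')!r} does not look like a PDF")
    if len(raw) > MAX_ATTACHMENT_BYTES:
        raise ValueError("attachment exceeds the 10 MB per-attachment limit")
    return raw

# Hypothetical Single-mode output with a tiny stand-in payload (not a real PDF body).
sample = {
    "filename": "po-4411.pdf",
    "contentType": "application/pdf",
    "size": 14,  # assumed here to be the decoded byte count
    "content": base64.b64encode(b"%PDF-1.7\n%EOF\n").decode("ascii"),
}
pdf_bytes = decode_pdf(sample)
print(len(pdf_bytes))
```

The header check is a cheap guard against files that carry a .pdf name or an application/pdf content type but contain something else.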
Step 3: Embed the PDF into a transient Knowledge collection
Add a Knowledge node in Embed mode. In the Collection dropdown choose Transient so a one-off collection is created just for this run and discarded when the run finishes (no file name or persistent storage needed). Set Document Type to PDF and set Document Input to the attachment bytes:
Document Input: {{ attachment.content }}
Give the node an Output Variable (for example embedResult) so you can confirm the chunk count. Transient collections are ideal for this "embed then query then discard" pattern on a single document. Remember to use the same embedding model for the query step that you use here; with a transient collection the default model is applied automatically across both nodes in the same run.
Step 4: Extract the order with a Knowledge query and Response Schema
Add a second Knowledge node in Query mode. Set Collection to Transient so it reads the document you just embedded in this same run. Write a Prompt that describes exactly what to pull out, and attach a Response Schema so the intelligent layer returns clean structured JSON instead of prose. A schema like this forces a predictable shape:
{
  "type": "object",
  "properties": {
    "buyerName": { "type": "string" },
    "buyerEmail": { "type": "string" },
    "poNumber": { "type": "string" },
    "currency": { "type": "string" },
    "lineItems": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "sku": { "type": "string" },
          "description": { "type": "string" },
          "quantity": { "type": "number" },
          "unitPrice": { "type": "number" }
        },
        "required": ["sku", "quantity", "unitPrice"]
      }
    },
    "orderTotal": { "type": "number" }
  },
  "required": ["buyerName", "poNumber", "lineItems", "orderTotal"]
}
For the Prompt, use something concrete such as: Extract the buyer name and email, the PO number, the currency, every line item with its SKU, description, quantity and unit price, and the order total from this purchase order. Set Result Count high enough to cover a multi-page PO (the default of 5 chunks is usually fine for a one-page order; raise it for longer documents) and give the node an Output Variable such as po. Downstream you can reference {{ po.poNumber }}, {{ po.orderTotal }}, and the array {{ po.lineItems }}.
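A schema guarantees shape, not correctness, so it can be worth cross-checking the extracted numbers before anything reaches NetSuite. The sketch below is one way to express that check; `check_po` is a hypothetical helper rather than a Spojit feature, and it assumes the PO total equals the sum of quantity times unit price. Totals that include tax or freight will not reconcile, so treat a mismatch as a flag for review rather than a hard error.

```python
def check_po(po: dict, tolerance: float = 0.01) -> list[str]:
    """Return a list of problems found in an extracted PO; an empty list means it looks consistent."""
    problems = []
    for field in ("buyerName", "poNumber", "lineItems", "orderTotal"):
        if field not in po:
            problems.append(f"missing required field: {field}")

    lines = po.get("lineItems", [])
    if not lines:
        problems.append("no line items extracted")

    computed = 0.0
    lines_ok = bool(lines)
    for i, line in enumerate(lines):
        missing = [k for k in ("sku", "quantity", "unitPrice") if k not in line]
        if missing:
            problems.append(f"line {i}: missing {', '.join(missing)}")
            lines_ok = False
            continue
        computed += line["quantity"] * line["unitPrice"]

    # Only compare totals when every line was complete enough to add up.
    if lines_ok and "orderTotal" in po and abs(computed - po["orderTotal"]) > tolerance:
        problems.append(f"orderTotal {po['orderTotal']} != sum of lines {computed:.2f}")
    return problems

# Made-up extraction result in the shape of the Response Schema above.
po = {
    "buyerName": "Acme Supply",
    "poNumber": "PO-4411",
    "currency": "USD",
    "lineItems": [
        {"sku": "A-100", "quantity": 10, "unitPrice": 12.5},
        {"sku": "B-200", "quantity": 4, "unitPrice": 250.0},
    ],
    "orderTotal": 1125.0,
}
print(check_po(po))
```

The same logic could live in a Transform or Condition step ahead of the approval gate, so that an inconsistent extraction is routed to a person instead of being created as an order.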
Step 5: Require approval for large orders
Add a Condition node that checks whether the order exceeds your threshold, for example whether {{ po.orderTotal }} is greater than 10000. Route the true branch into a Human node. Configure it with a clear Label (such as Approve large PO) and a Message that uses variables so approvers see the context:
PO {{ po.poNumber }} from {{ po.buyerName }} for {{ po.currency }} {{ po.orderTotal }} needs sign-off before it is created in NetSuite.
Set a Timeout (minutes) if you want stale requests to lapse, choose an Urgency, and fill the required Approval slots with the user, role, or team that signs off on large orders. Approval completes only when every slot is satisfied. Approvers respond in the Approvals inbox at /approvals; an approval continues the run, while a rejection or timeout halts it (there is no "on reject" branch). Connect both the false branch of the Condition and the Approved output of the Human node forward to the next step, so small orders skip approval and large approved orders proceed.
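The routing in this step reduces to a small decision rule, sketched below for clarity. The `Outcome` enum and `should_create_order` function are illustrative names only and do not correspond to anything in Spojit; the rule itself mirrors the wiring described above (false branch and Approved output both lead to the NetSuite step, rejection or timeout halts).

```python
from enum import Enum
from typing import Optional

class Outcome(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"
    TIMED_OUT = "timed_out"

def should_create_order(order_total: float, threshold: float,
                        outcome: Optional[Outcome] = None) -> bool:
    """Mirror the Condition + Human wiring: small orders pass, large ones need approval."""
    if order_total <= threshold:
        # Condition false branch: "greater than" means an order exactly at
        # the threshold skips approval.
        return True
    # Condition true branch: only an explicit approval continues the run.
    return outcome is Outcome.APPROVED

print(should_create_order(8000, 10000))
print(should_create_order(10000, 10000))
print(should_create_order(12000, 10000, Outcome.APPROVED))
print(should_create_order(12000, 10000, Outcome.REJECTED))
```

If finance expects an order of exactly 10,000 to require sign-off, use a greater-than-or-equal comparison in the Condition node instead.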
Step 6: Create the sales order in NetSuite
Add a Connector node on the netsuite connector in Direct mode. If you need to resolve the buyer to a NetSuite customer first, add a preceding NetSuite node using list-customers or get-customer to match on {{ po.buyerEmail }} or {{ po.buyerName }} and capture the internal ID. Then call create-record with recordType set to salesOrder and a body that maps the extracted fields and line items:
{
  "recordType": "salesOrder",
  "body": {
    "entity": { "id": "{{ customer.id }}" },
    "otherRefNum": "{{ po.poNumber }}",
    "item": {
      "items": [
        {
          "item": { "id": "{{ lineItem.netsuiteItemId }}" },
          "quantity": "{{ lineItem.quantity }}",
          "rate": "{{ lineItem.unitPrice }}"
        }
      ]
    }
  }
}
To build the item.items array from every PO line, add a Transform node before this step to reshape {{ po.lineItems }} into the NetSuite line shape (mapping each SKU to its NetSuite item ID). Alternatively, create the order first and then use a Loop over {{ po.lineItems }} that calls add-sublist-item against the created order. Run get-record-metadata with recordType set to salesOrder once during setup to confirm the exact field and sublist names your NetSuite account expects.
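The reshaping that the Transform node performs can be sketched as follows. This is an illustration of the mapping logic, not Spojit's Transform syntax: `build_sales_order_body`, the `sku_to_item_id` lookup, and the sample IDs are all hypothetical, and the field names simply mirror the request body shown above, which may differ in your NetSuite account.

```python
def build_sales_order_body(po: dict, customer_id: str, sku_to_item_id: dict) -> dict:
    """Map an extracted PO onto the create-record payload shape shown above."""
    # Fail loudly on unknown SKUs instead of creating a partial order.
    unmapped = [line["sku"] for line in po["lineItems"] if line["sku"] not in sku_to_item_id]
    if unmapped:
        raise KeyError(f"no NetSuite item ID for SKUs: {', '.join(unmapped)}")

    return {
        "recordType": "salesOrder",
        "body": {
            "entity": {"id": customer_id},
            "otherRefNum": po["poNumber"],
            "item": {
                "items": [
                    {
                        "item": {"id": sku_to_item_id[line["sku"]]},
                        "quantity": line["quantity"],
                        "rate": line["unitPrice"],
                    }
                    for line in po["lineItems"]
                ]
            },
        },
    }

# Made-up inputs for illustration.
po = {
    "poNumber": "PO-4411",
    "lineItems": [
        {"sku": "A-100", "quantity": 10, "unitPrice": 12.5},
        {"sku": "B-200", "quantity": 4, "unitPrice": 250.0},
    ],
}
payload = build_sales_order_body(po, customer_id="1287", sku_to_item_id={"A-100": "501", "B-200": "502"})
print(payload["body"]["item"]["items"])
```

Raising on an unmapped SKU keeps a mismatch visible in execution history rather than letting an order through with missing lines.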
Step 7: Confirm back to the buyer
Add a Send Email node so the buyer gets an acknowledgement. Set Recipients to {{ input.replyTo }} (the address that sent the PO), a templated Subject such as PO {{ po.poNumber }} received, and a Body that confirms the order was created in NetSuite and lists the line count. Set If sending fails to Continue anyway so a delivery hiccup does not roll back a successfully created sales order. Remember the recipient must be on your org's email allowlist, or send from your own domain using the resend or smtp connector instead.
Tips
- Use Miraxa, the intelligent layer across your automation, to scaffold the skeleton: try a prompt like "Build a workflow on a Mailhook trigger that fetches a PDF attachment, embeds it into a transient Knowledge collection, queries it with a Response Schema, and creates a NetSuite sales order." Then fine-tune each node in the properties panel.
- Keep the Response Schema strict (mark quantity and unitPrice as required) so half-read POs surface as missing fields rather than silently creating a wrong order.
- If buyers sometimes send two PDFs (a PO plus terms), tighten the Filename pattern on the Attachment node to match only your PO naming convention, or switch to Multiple mode and Loop to pick the right file.
- Store the buyer-to-NetSuite-customer mapping where you can update it easily; a Transform node or a small lookup keeps SKU-to-item-ID matching out of the prompt.
Common Pitfalls
- Querying the wrong collection: both Knowledge nodes must use Transient so the query reads the document embedded earlier in the same run. A persistent collection here would query the wrong data.
- Rejection halts the run: the Human node has no reject branch, so a rejected large PO stops before NetSuite is touched. Tell approvers that rejecting means the order is not created.
- Scanned image POs: if a PO is a flat scan with no text layer, plain PDF embedding may extract little. Set Document Type to Images via OCR for image-only PDFs, or pre-process accordingly.
- NetSuite field drift: sublist and field names (such as item vs a custom sublist) vary by account configuration. Confirm with get-record-metadata before going live, or create-record will fail validation.
- Duplicate POs: the mailhook deduplicates identical messages, but a buyer re-sending an edited PDF is a new run. Check the buyer's {{ po.poNumber }} against existing NetSuite orders if duplicates would cause problems.
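For the duplicate-PO check, the comparison itself might look like the sketch below. How you retrieve existing PO references from NetSuite is not shown and depends on your account; `normalize_po_number` and `is_duplicate` are hypothetical helpers, and the normalization rule (ignore case, spaces, and punctuation) is an assumption that could merge genuinely different numbers such as "PO-1-23" and "PO-12-3", so adjust it to your buyers' numbering.

```python
def normalize_po_number(po_number: str) -> str:
    """Uppercase and keep only letters and digits, so 'po-4411 ' equals 'PO 4411'."""
    return "".join(ch for ch in po_number.upper() if ch.isalnum())

def is_duplicate(po_number: str, existing_ref_nums: list) -> bool:
    """True when the PO number matches an existing order reference after normalization."""
    target = normalize_po_number(po_number)
    return any(normalize_po_number(ref) == target for ref in existing_ref_nums)

# Made-up references standing in for otherRefNum values already in NetSuite.
existing = ["PO 4411", "PO-4409"]
print(is_duplicate("po-4411 ", existing))
print(is_duplicate("PO-4500", existing))
```

A Condition node fed by this kind of comparison can stop the run, or route it to the approval step, before create-record is called.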
Testing
Before pointing real buyers at the address, email a sample PO PDF from your own account to the generated mailhook address and watch the run in execution history. Confirm the Attachment node returned content, the embed step reported a sensible chunk count, and the query output populated {{ po.poNumber }}, {{ po.lineItems }}, and {{ po.orderTotal }} correctly. Send one small PO (under the threshold) to verify it skips approval and creates the order, then one large PO to verify it pauses in the Approvals inbox and only creates the NetSuite sales order after you approve. Verify the created salesOrder in NetSuite matches the PDF line for line before enabling the From allowlist and handing the address to buyers.
Learn More
- How to Create NetSuite Sales Orders from Emailed PO PDFs for the related plaintext and PDF email-trigger approach.
- Setting Up a Mailhook Trigger for full mailhook address and filter options.
- Using Knowledge Nodes for Embed and Query mode details including transient collections.
- Using Human Approval Nodes for approval slots and outcomes.
- Attachment node reference on the Spojit developer docs.
- NetSuite connector reference for create-record and related tools.
- Knowledge platform overview covering collections and embedding models.