How to Archive Workflow Data to a Database
Store workflow execution results in a database for long-term tracking and analysis.
What This Integration Does
Spojit keeps execution history in its own store, but most teams eventually want their own copy: in a warehouse for BI, in a customer-facing audit log, or simply on hand for compliance reviews. This pattern adds an archive step at the tail of any workflow so each run lands as a row (or document) in your database with all the context you'll later need.
The archive step runs at the end of the host workflow, gathers the result, the input, the trigger metadata, and a status flag, then writes a single record. Because it's a final-stage step, it doesn't slow down the business logic and any failure to archive doesn't roll back the work that already happened.
Prerequisites
- A MySQL or MongoDB connection with write access to the archive table or collection.
- A target schema: at minimum
workflow_id,execution_id,started_at,finished_at,status,input(JSON),result(JSON). - An indexed column on
workflow_idandstarted_atso later queries stay fast.
Step 1: Identify the Host Workflow
Open the workflow whose results you want archived. The archive step is added as the final node, so locate the last business-logic node - the one that produces the result you'd want to look up later.
Step 2: Transform - Build the Archive Record
Add a Transform node after the final business-logic step. Build a single object with the fields you want archived:
{
"workflow_id": "{{ workflow.id }}",
"execution_id": "{{ workflow.executionId }}",
"started_at": "{{ workflow.startedAt }}",
"finished_at": "{{ now }}",
"status": "success",
"input": {{ trigger.payload }},
"result": {{ previousStep.output }}
}
If the result is large, use the json connector with pick to keep only the fields you actually need - archives bloat fast.
Step 3: Insert into the Archive Store
For SQL, add a Connector node for the mysql connector and pick the insert-rows tool. Point it at your archive table (e.g. workflow_archive) and pass the object from Step 2 as the row. For document storage, use the mongodb connector with insert-documents instead and write to a collection like workflow_archive.
Step 4: Capture Failures Too
A successful-only archive is a half-archive. Add a Condition branch earlier in the workflow (on the catch path of any node that can fail) that routes to a second archive step with status: "failed" and error populated from the caught exception. Both branches end at the same archive table so a single query gives you the full picture.
Step 5: Add Retention Logic
Build a second, scheduled workflow that prunes the archive. Use a Trigger set to Schedule (weekly is plenty), then a Connector node calling the mysql execute-query tool with something like:
DELETE FROM workflow_archive
WHERE finished_at < NOW() - INTERVAL 90 DAY
For MongoDB, use the delete-documents tool with a filter on finished_at. Adjust the window to match your compliance policy.
Step 6: Surface the Archive
If end users or support need access, expose the archive via a small read-only workflow: a Trigger of type Webhook takes a workflow_id, a Connector node queries MySQL or MongoDB, and a Response node returns the rows. That gives you an in-product audit log without giving anyone direct database access.
Tips
- Index
execution_id- it's the natural join key when someone reports "execution X did something weird" and you want to pull the matching archive row instantly. - Use JSON columns - both MySQL (8.0+) and MongoDB store JSON natively. Querying nested fields stays cheap and you don't have to predict every shape upfront.
- Archive size grows linearly - if your workflow runs every minute, you'll have ~525k rows a year. Partition the table by month or use TTL indexes in MongoDB.
Common Pitfalls
- Archiving secrets - if
trigger.payloadcontains API keys or PII, the archive will too. Usejsonomitto strip sensitive fields before insert. - Schema drift - if the host workflow's output shape changes, the archive will silently capture different fields each version. Either pin the shape in the Transform step or use a schemaless collection.
- Archive failures break the run - by default, a failed archive insert marks the whole workflow as failed. Turn on the "continue on error" flag for the archive node so business logic isn't held hostage by your audit trail.
Testing
Trigger the host workflow manually once. Query the archive table for that execution_id and confirm every field is populated and JSON-parseable. Then force a failure (bad input on purpose) and confirm a row with status: "failed" lands as well. Only then turn the workflow back on for real traffic.