Getting Started with AI-Powered Workflows

Learn how to leverage AI capabilities across your Spojit workflows.

What This Integration Does

Spojit treats AI as a first-class workflow building block. Instead of bolting something onto an automation, you use a Connector node in Agent mode on the same canvas where your triggers, connectors, and transforms live. In Agent mode, an AI agent reads your prompt and the upstream variables, decides which connector tools to call and in what order, and hands its output to downstream steps. The result is workflows that can reason about messy data, classify intent, extract structured fields from documents, and decide which path to take at runtime.

This guide is a tour rather than a single workflow. Each section below names a Spojit node or feature, when to reach for it, and which built-in pieces compose with it. Agent-mode runs cost AI credits, so the closing sections show how to keep deterministic work in Direct mode or a utility connector. Use the cross-links throughout to jump to a concrete end-to-end tutorial once you know which pattern fits your use case.

Prerequisites

A Spojit workspace with at least one workflow created (even an empty one is fine for clicking through nodes).
At least one connector configured. An Agent-mode node needs tools to call, so pick a connector that matches a real system you have access to (Shopify, Monday, Slack, MongoDB, and so on).
For Knowledge flows: documents you can upload (PDF, Word, Excel, CSV, Email, or images via OCR).
Familiarity with the canvas. If this is your first workflow, read the Trigger and Connector node docs linked at the bottom first.

Step 1: Agent Mode - Let the AI Pick Tools

Add a Connector node to the canvas and set its mode to Agent. Give it a goal in natural language and choose the connector and the tools it is allowed to call (for example monday with create-item and list-boards, or slack with send-message). The agent reads the prompt, inspects the allowed tools, and decides which to call and in what order. Use Agent mode when the exact steps depend on the input, for example: "Triage this support email. If it is billing, create a Monday item on the Finance board, otherwise reply on Slack." For deterministic single-tool calls (create an order, fetch a record), use Direct mode instead so there is no AI cost.

Step 2: Response Schema - Constrain the JSON Shape

In an Agent-mode Connector node, fill in the Response Schema field with a JSON schema. The model is then forced to return JSON that matches the schema, which means the next step can reference the node's output variable (for example {{ triage.category }}) without defensive parsing. A typical schema for order triage:

{
  "type": "object",
  "properties": {
    "category": { "type": "string", "enum": ["billing", "shipping", "product", "other"] },
    "urgency": { "type": "string", "enum": ["low", "medium", "high"] },
    "summary": { "type": "string" }
  },
  "required": ["category", "urgency", "summary"]
}

Downstream Condition nodes can branch on {{ triage.category }} with zero ambiguity.

Step 3: Knowledge - Ground the Answer in Your Docs

Add a Knowledge node in Query mode. Point its Collection at a persistent collection you have already embedded into (policy docs, product manuals, runbooks), write a natural-language Prompt, set Result Count (default 5), and pick a Model for synthesis. The node retrieves the most relevant passages and returns a grounded answer in its Output Variable. This is retrieval-augmented generation in one node. Use it whenever an answer needs to be grounded in something other than the model's training data, for example replying to a customer with the exact wording from your shipping policy. You can also add an optional Response Schema to force structured output. To build the collection first, add a Knowledge node in Embed mode (or use the Knowledge section of the sidebar to upload and embed documents).

Step 4: Document Processing - Pull Text Out of Files

Use the pdf connector's extract-text tool to pull text from a PDF, or the csv connector's parse tool to turn a spreadsheet into structured rows. For scanned images, embed the file with a Knowledge node using the Images via OCR document type. Feed the extracted text into an Agent-mode Connector node with a Response Schema. This pattern handles invoices, contracts, expense reports, and scanned forms: the agent does the field extraction, and your downstream nodes do the routing and posting.

Step 5: Pick the Right Model

Each Agent-mode Connector node and each Knowledge Query node exposes a Model field. The trade-off is cost, latency, and reasoning ability:

Faster, lower-cost models - good for classification, simple extraction, and high-volume routing.
Balanced models - handle most multi-step reasoning and tool use well.
Most capable models - reach for these when the task involves nuanced reasoning, code, or long synthesis, where a wrong answer is expensive.
Large-context models - useful when you need to process a long document in a single pass.

If you are unsure which to choose, ask Miraxa, the intelligent layer across your automation, right from the page you are working on. Miraxa knows the workflow you are editing and can recommend a model or explain the difference between Agent mode and Direct mode in context.

Step 6: Code Runner - Drop Down When AI Is Overkill

For deterministic logic (formatting, regex, math, custom data shaping), use a Connector node in Direct mode pointing at the code connector with execute-javascript or execute-python. It is cheaper and more predictable than Agent mode for things that don't need reasoning, and it costs no AI credits. A good rule: if you can describe the logic in three lines of pseudocode, use the code connector instead of an agent.

Tips

Match the model to the task. Do not pay top-tier model prices for what a faster, cheaper model can classify in a fraction of a second.
Prefer a Response Schema over free-form text any time the next node has to read a specific field.
Keep allowed tool lists small. Giving an Agent-mode node 30 tools confuses the model and balloons token usage. Five focused tools per agent is a sweet spot.
Reuse a persistent collection for high-volume flows so you embed your reference documents once and query them from many workflows, rather than re-embedding on every run.

Common Pitfalls

Skipping schema validation. Even with a Response Schema, a model can produce empty strings or missing optional fields. Add a Condition node to short-circuit on bad input.
Hidden costs. A Loop that calls an Agent-mode node per item can add up fast. Use a single agent call with batched input where possible.
Stale knowledge. If your source documents change, your collection needs re-embedding. Schedule a refresh workflow that re-embeds the updated files rather than embedding once and forgetting.
Prompt injection. User-supplied text can override your prompt. Treat external content as data, not instructions, and use a Response Schema to constrain what the model can return.

Testing

Build each Agent-mode Connector node in isolation first. Pin a small sample input as a variable, run the node alone, and inspect its output variable. Once the agent behaves on five hand-picked inputs, wire it into the surrounding flow and run end-to-end on a tiny scope (a single record, a single Slack channel) before turning the schedule on. You can also ask Miraxa "Why did my last run fail?" to investigate a failed execution in context.

Where to Start

These end-to-end tutorials put the pieces above together:

Agent mode vs Direct mode explains when to let the AI pick tools and when to call one tool deterministically.
Extracting structured output with a Response Schema walks through forcing clean JSON from a model.
Extracting invoice data with PDF tools and AI combines document processing with field extraction.
Querying your knowledge base covers grounding answers in your own documents.
Using the code connector to extend AI shows when to drop down to deterministic logic.