FTP and Knowledge: Nightly Document Index Template

A Spojit template that lists a vendor FTP directory every night, downloads each new document, and embeds it into a persistent Knowledge collection so your workspace can query the archive the next morning.

What It Builds

A Schedule trigger fires on a nightly cron. An FTP Connector node lists the target directory and downloads each file that has not been seen before. A Loop node walks the new documents, and a Knowledge node in Embed mode adds each one to a persistent collection, with OCR handling scans and PDFs. By morning, any workflow in the workspace can query that collection for answers grounded in the latest documents.
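To make the "not seen before" step concrete, here is a minimal Python sketch of the logic the nightly run follows. It is an illustration only: Spojit handles this inside the workflow, and this page does not say how already-indexed files are tracked, so the sketch assumes tracking by file name. The names `select_new_files`, `nightly_run`, and the `embed` callback are hypothetical and not part of any Spojit API.

```python
from typing import Callable, Iterable


def select_new_files(listing: Iterable[str], already_indexed: set[str]) -> list[str]:
    """Return the listed names that are not yet indexed, in sorted order.

    Assumption: a file counts as "seen" if its name is in already_indexed.
    """
    return sorted(name for name in listing if name not in already_indexed)


def nightly_run(
    listing: Iterable[str],
    already_indexed: set[str],
    embed: Callable[[str], None],
) -> list[str]:
    """Embed each new file once and record it as indexed.

    embed is a stand-in for the Knowledge node in Embed mode.
    """
    new_files = select_new_files(listing, already_indexed)
    for name in new_files:
        embed(name)                # hypothetical: add the document to the collection
        already_indexed.add(name)  # remember it so tomorrow's run skips it
    return new_files
```

Running the same listing twice embeds nothing the second time, which is why each night's run only picks up what arrived since the previous one.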

The Prompt

Paste this into Miraxa and it builds the workflow, connecting the tools for you:

Build a workflow that runs every night at 2am, lists the files in our vendor FTP directory /incoming/docs, downloads each new document that has not been indexed yet, and embeds each one into a persistent Knowledge collection called "vendor-docs" so other workflows can query it the next morning.

Connectors Used

  • Schedule trigger - fires nightly on a cron with your chosen IANA timezone.
  • FTP - lists the directory and downloads each new document.
  • Knowledge - embeds documents into a persistent, reusable collection.
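The timezone on the Schedule trigger matters because "2am" is a local time. As a rough sketch of what a nightly 2am schedule means in a given zone, the Python below computes the next run from the current moment. The function name `next_nightly_run` is hypothetical, the code is not how Spojit schedules runs, and it ignores daylight-saving edge cases where 2am is skipped or repeated.

```python
from datetime import datetime, time, timedelta, tzinfo


def next_nightly_run(now: datetime, tz: tzinfo, hour: int = 2) -> datetime:
    """Return the next occurrence of hour:00 local time in tz, strictly after now.

    now must be timezone-aware. DST gaps and overlaps are not handled.
    """
    local_now = now.astimezone(tz)
    candidate = datetime.combine(local_now.date(), time(hour), tzinfo=tz)
    if candidate <= local_now:
        # Today's slot has already passed, so schedule tomorrow's.
        candidate = datetime.combine(
            local_now.date() + timedelta(days=1), time(hour), tzinfo=tz
        )
    return candidate
```

With an IANA zone you would pass something like `zoneinfo.ZoneInfo("Australia/Sydney")` as `tz`. At noon UTC that zone is already in the late evening, so the next run lands at 2am local time on the following calendar day.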

Customize It

Change the cron time, the FTP directory path, and the collection name directly in the prompt. You can also narrow which files are picked up by naming a pattern (for example, only .pdf files or files with "invoice" in the name), or point at a different vendor folder for a second nightly run.
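If it helps to picture what a file pattern does, the Python sketch below filters a directory listing with shell-style wildcards. It is an illustration under assumptions: the function names are hypothetical, matching is made case-insensitive by choice, and Spojit's own pattern handling may differ.

```python
from fnmatch import fnmatchcase


def matches_filter(name: str, patterns: list[str]) -> bool:
    """True if name matches any wildcard pattern, ignoring case."""
    lowered = name.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


def filter_listing(listing: list[str], patterns: list[str]) -> list[str]:
    """Keep only the listed names that match at least one pattern."""
    return [name for name in listing if matches_filter(name, patterns)]
```

For example, the pattern `*.pdf` keeps only PDF files, while `*invoice*` keeps any file with "invoice" in its name regardless of extension.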

Tips

  • Give your FTP connection read access to the full directory so the nightly list does not miss files.
  • Use a stable, descriptive collection name so other workflows can reuse it in Query mode.
  • Knowledge Embed runs OCR, so scanned PDFs and images index cleanly alongside text documents.
