How to Build an AI Assistant in Retool with Workflows and Agents

OTC Team · April 18, 2026 · 5 min read

If you've ever wondered how to build an AI assistant in Retool using Workflows and Agents, Retool's own team just showed you how by open-sourcing the architecture behind RetoolGPT, their embedded AI assistant for docs and support. This post breaks down the main points from the Day 3 session of AI Build Week: the design decisions, the vector search strategy, the prompt chaining approach, and the hard-won lessons from shipping AI into a production support flow.

What Is RetoolGPT and What Powers It?

RetoolGPT is Retool's in-house AI assistant, embedded directly into their documentation and support experience. Under the hood, it relies on four building blocks, three Retool products and one prompting technique:

Retool Workflows — orchestrates the logic and data flow
Retool Agents — handles multi-step reasoning and tool use
Retool Vectors — stores and retrieves semantically relevant document chunks
Prompt chaining — structures how context is passed between LLM calls

The team also uses MCP (Model Context Protocol) to keep the system maintainable as it grows, and they've implemented failover logic designed to "fail loud": errors surface visibly so the team can catch and fix degraded responses quickly.
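Here is a minimal sketch of what "fail loud" can look like, written in plain Python rather than Retool's own blocks. The session does not say how RetoolGPT implements its failover, so every name here (`answer_with_failover`, `AssistantError`, the `alert` callback) is hypothetical.

```python
from typing import Callable


class AssistantError(RuntimeError):
    """Raised when every model in the failover chain has failed."""


def answer_with_failover(
    question: str,
    models: list[tuple[str, Callable[[str], str]]],
    alert: Callable[[str], None],
) -> str:
    """Try each model in order and report every failure instead of hiding it."""
    failures: list[str] = []
    for name, call_model in models:
        try:
            reply = call_model(question)
            if not reply.strip():
                raise ValueError("empty reply")
            return reply
        except Exception as exc:  # deliberately broad: any failure is reported
            message = f"{name} failed: {exc}"
            failures.append(message)
            alert(message)  # surface the problem right away
    # No silent fallback answer: the caller gets an explicit error.
    raise AssistantError("; ".join(failures))
```

The point of the design is that a backup model can still answer, but the team hears about the primary failure every time, and a total failure never turns into a quiet, low-quality reply.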

How the Workflow and Agent Architecture Works

The core of RetoolGPT is a Workflow that routes user queries through a series of steps: retrieving relevant context from vector stores, assembling a prompt, calling an LLM via a Retool Agent, and returning a response. The exported Workflow JSON is available in the official resource folder for you to import into your own Retool instance and inspect directly.
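The four steps can be sketched in a few lines of plain Python. This is not the exported Workflow, only an illustration of the data flow; `Chunk`, `build_prompt`, `run_assistant`, and the `retrieve` and `call_llm` callables are all assumed names standing in for the vector search and the Agent call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Chunk:
    source: str
    text: str
    score: float  # higher means more relevant


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Assemble a single prompt from the retrieved context and the question."""
    context = "\n\n".join(f"[{c.source}] {c.text}" for c in chunks)
    return (
        "Answer using only the context below. "
        "Say so if the context is insufficient.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def run_assistant(
    question: str,
    retrieve: Callable[[str], list[Chunk]],
    call_llm: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve context, assemble the prompt, call the model, return the reply."""
    ranked = sorted(retrieve(question), key=lambda c: c.score, reverse=True)
    prompt = build_prompt(question, ranked[:top_k])
    return call_llm(prompt)
```

Passing `retrieve` and `call_llm` in as functions keeps each step swappable, which mirrors how separate Workflow blocks can be replaced without touching the rest of the flow.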

One of the most discussed design decisions in the session was how to structure vector retrieval — specifically, whether to use a single large vector store with filters or multiple dedicated vector stores per source.

One Vector Store vs. Multiple Vector Stores: Which Should You Use?

This question sparked the most technical discussion during the session, and it's worth unpacking carefully. The RetoolGPT workflow uses separate vector stores per source (e.g., one for Confluence docs, one for support articles). Branching logic in the Workflow decides which stores to query based on the incoming request, for example by checking whether searchConfluence == true before triggering that branch.
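The branching idea boils down to mapping request flags to stores. The sketch below shows that logic in plain Python; only `searchConfluence` comes from the session, while `searchSupport`, the store names, and `select_stores` are made up for illustration.

```python
def select_stores(request: dict, store_by_flag: dict[str, str]) -> list[str]:
    """Return the vector stores whose flag is set to True on the request."""
    return [
        store
        for flag, store in store_by_flag.items()
        if request.get(flag) is True
    ]


# Hypothetical flag-to-store mapping; only searchConfluence appears in the session.
STORE_BY_FLAG = {
    "searchConfluence": "confluence_docs",
    "searchSupport": "support_articles",
}

request = {"question": "How do I rotate an API key?", "searchConfluence": True}
stores = select_stores(request, STORE_BY_FLAG)
```

Here `stores` contains only the Confluence store, because the support flag is absent from the request. A missing flag is treated the same as `False`, so a store is never searched by accident.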

A valid alternative is to store everything in one vector store and use metadata filters to narrow results at query time. Here's how the tradeoffs break down:

Single vector store with filters: Simpler to set up initially. The Select Vectors field in Retool supports a JavaScript expression (click the fx toggle) so you can dynamically select stores, but filter management lives on a separate screen from your Workflow logic, making it easy to forget to update when you add new sources.
Multiple vector stores (one per source): Slightly more setup upfront, but each store is scoped to a single, coherent body of content. You're less likely to accidentally exclude a relevant document because you forgot to update a filter. Adding a new source means adding a new store — a visible, deliberate action that's harder to miss.

The community consensus: default to one vector store per source. As your content library grows, a single monolithic store with filters becomes increasingly difficult to debug and maintain. Forgetting to include a filter for a new content type is a silent failure — forgetting to wire in a whole new vector store is much harder to overlook.
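To make the silent-versus-visible distinction concrete, here is a toy illustration in plain Python. It does not use Retool Vectors at all: the in-memory lists, the `notion` source, and both function names are assumptions chosen to show how each approach behaves when a new source is forgotten.

```python
DOCS = [
    {"source": "confluence", "text": "Deploy runbook"},
    {"source": "support", "text": "Billing FAQ"},
    {"source": "notion", "text": "Onboarding guide"},  # newly added source
]

# Approach 1: one store plus a filter that nobody updated for "notion".
ALLOWED_SOURCES = {"confluence", "support"}


def search_filtered(docs: list[dict]) -> list[str]:
    """A stale allow-list drops the new source without raising any error."""
    return [d["text"] for d in docs if d["source"] in ALLOWED_SOURCES]


# Approach 2: one store per source, looked up by name.
STORES = {
    "confluence": ["Deploy runbook"],
    "support": ["Billing FAQ"],
}


def search_stores(sources: list[str]) -> list[str]:
    """Asking for a store that was never wired in raises a KeyError."""
    results: list[str] = []
    for source in sources:
        results.extend(STORES[source])
    return results
```

With the stale filter, `search_filtered(DOCS)` quietly returns two results and the onboarding guide is simply missing. With per-source stores, `search_stores(["confluence", "notion"])` fails immediately, which is the kind of error someone notices and fixes.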

Step-by-Step: Building Your Own Internal AI Copilot in Retool

Step 1 — Set up your vector stores: Create one Retool Vector store per content source (e.g., Confluence, Notion, internal wikis). Ingest documents and let Retool handle chunking internally, or pre-process them with a text splitter such as LangChain's RecursiveCharacterTextSplitter for more control over chunk size and overlap.
Step 2 — Build the routing Workflow: Use a Workflow to evaluate the incoming query and determine which vector stores to search. Branch logic or a JS expression on the Select Vectors field can handle this dynamically.
Step 3 — Run vector search and merge context: Query the relevant stores and merge the top results into a single context block to pass to your LLM prompt.
Step 4 — Chain your prompts: Structure your prompt carefully — include retrieved context, conversation history, and a clear system instruction. Use a Retool Agent to handle multi-step reasoning if your use case requires it.
Step 5 — Add failover and scheduling logic: Implement "fail loud" error handling so degraded responses are visible. Schedule vector ingestion with Workflows and add concurrency guards to skip a run if a prior ingestion is still active.
Step 6 — Connect your LLM credentials: Retool provides managed credentials for OpenAI and Anthropic. For other models such as DeepSeek, or for provider-specific features such as Gemini grounding, you'll need to bring your own API key or build a custom REST API resource.
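The concurrency guard from Step 5 is the least obvious piece, so here is a rough sketch of the idea in plain Python. It uses an in-process lock purely for illustration; `run_ingestion_if_idle` and the `ingest` callable are hypothetical names, and the session does not describe how RetoolGPT tracks an active run.

```python
import threading
from typing import Callable

_ingestion_lock = threading.Lock()


def run_ingestion_if_idle(ingest: Callable[[], None]) -> str:
    """Run ingest() unless a previous run still holds the lock."""
    if not _ingestion_lock.acquire(blocking=False):
        return "skipped"  # a prior ingestion is still active
    try:
        ingest()
        return "completed"
    finally:
        # Always release, even if ingest() raised, so later runs are not blocked.
        _ingestion_lock.release()
```

An in-process lock only protects runs inside a single process. For scheduled Workflow runs you would more likely need a shared marker, such as a status row in a database, but the skip-if-busy logic stays the same.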

Key Limits and Platform Notes to Know Before You Build

Vector storage: Retool Vectors share the same 5 GB limit as Retool DB. Most teams don't hit this, but plan ahead if you're ingesting large document sets.
External vector DBs: Native Pinecone support doesn't exist yet — use a REST API resource to integrate it manually.
Confluence ingestion: Fully supported. Extract your content, convert it to an ingestible format, and pipe it into your vector store via a Workflow.
Multi-key API management: Assigning different API keys to different apps isn't supported today, but it's on the roadmap. Submit a feature request if this is a blocker for you.
Token cost tradeoff: More vector context = more tokens per request = higher LLM cost. Tune your chunk size and top-K retrieval count to balance relevance against cost.
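A quick back-of-the-envelope calculation helps when tuning chunk size and top-K. The helper below is a sketch in plain Python; the function name, its parameters, and the price used in the example are placeholders, not real provider pricing.

```python
def estimate_request_cost(
    chunk_tokens: int,
    top_k: int,
    stores_queried: int,
    overhead_tokens: int,
    price_per_1k_input_tokens: float,
) -> tuple[int, float]:
    """Return (input tokens, input cost) for a single request."""
    # Each queried store contributes top_k chunks of roughly chunk_tokens each.
    context_tokens = chunk_tokens * top_k * stores_queried
    # Overhead covers the system instruction, history, and the question itself.
    total_tokens = context_tokens + overhead_tokens
    cost = total_tokens / 1000 * price_per_1k_input_tokens
    return total_tokens, cost


# Placeholder price of 0.01 per 1,000 input tokens, for illustration only.
tokens, cost = estimate_request_cost(
    chunk_tokens=500,
    top_k=5,
    stores_queried=2,
    overhead_tokens=300,
    price_per_1k_input_tokens=0.01,
)
```

With 500-token chunks, a top-K of 5, two stores, and 300 tokens of overhead, each request carries 5,300 input tokens. Halving either the chunk size or the top-K removes 2,500 of them, which shows how quickly retrieval settings dominate the bill.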

Why Build This Instead of Using Guru or NotionAI?

Off-the-shelf tools like Guru or NotionAI are solid for general knowledge management, but they give you limited control over model behavior, retrieval logic, and integration with your internal systems. Building with Retool means your AI assistant can call internal Workflows, read from your own databases, respect your access control model, and be embedded directly inside the tools your team already uses — not bolted on externally. If your internal data is the moat, you probably want to own the AI layer too.

The full Workflow JSON from the RetoolGPT session is available in the AI Build Week resource folder. Import it, poke around, and use it as the foundation for your own internal copilot. Drop questions in the Retool Community thread if you get stuck — the team is actively monitoring it.
