
May 27, 2026 | 10 minutes

How to build an LLM integration: 2026 guide

A scenario-level walkthrough of connecting an LLM to your business systems with routing, validation, and operational guardrails.


A working prompt in a chat window is not LLM integration. Integration is the model reading from one system, deciding, and writing to another, on a schedule, without a human in the loop. 

This article shows engineering, ops, and RevOps leads how to build a production-grade LLM integration in a Make scenario: classify, validate, route. 

The walkthrough uses the Make AI Toolkit (built-in provider, no API key, every plan) so you can ship the pattern today, then swap in native OpenAI, Anthropic Claude, or Azure OpenAI modules when you have your own keys.

Gartner forecasts that 40% of enterprise applications will embed task-specific AI agents by the end of 2026, up from under 5% in 2025.

What do you need before integrating an LLM?

Before opening the Scenario Builder, three things determine whether your LLM integration ships or stalls in pilot: the right provider path, the right access, and a clear data shape.

Which LLM provider and model fits the job?

The fastest path is the Make AI Toolkit, which runs on Make's built-in AI provider. It needs no API key and no billing relationship with a model vendor, and it is available on every plan. The step-by-step build later in this article uses this path.

Reach for a native provider module when you need something the built-in provider doesn't cover:

  • OpenAI — widest range of modalities (text, image, audio, embeddings) and strict JSON-mode outputs.

  • Anthropic Claude — long-context reasoning and tool use for agent-style workflows.

  • Azure OpenAI — same OpenAI models, but with the data-residency and compliance posture some teams require.

Three criteria decide which path fits a given scenario:

| Criterion | What it means | What to pick |
|---|---|---|
| Latency tolerance | How fast the response must come back | Smaller, faster model for live chat; slower, smarter model for batch |
| Output structure | Whether downstream modules need strict JSON | JSON for classification and extraction; free-form for drafting |
| Monthly token volume | Inputs plus outputs across every run (a token is ~¾ of a word) | Premium model on low volume; cheaper model on high volume |
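To make the token-volume criterion concrete, here is a minimal sketch of the arithmetic, assuming the ~¾-word-per-token rule of thumb from the table. The function names and the sample workload are hypothetical, and real tokenizers will give different counts.

```python
def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate using the ~3/4 word-per-token rule of thumb."""
    return round(word_count / words_per_token)


def monthly_tokens(runs_per_month: int, input_words: int, output_words: int) -> int:
    """Inputs plus outputs across every run, expressed in tokens."""
    per_run = estimate_tokens(input_words) + estimate_tokens(output_words)
    return runs_per_month * per_run


# Hypothetical workload: 5,000 emails a month, ~300 words in, ~15 words out.
print(monthly_tokens(5000, 300, 15))  # prints 2100000
```

Treat the result as an order-of-magnitude check for choosing between a premium and a cheaper model, not as a billing figure.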

What access do you need in place?

You also need three things in place before you build:

  • An API key from your chosen provider (only if you use a native provider module; Make's built-in provider needs none)

  • A Make account (free works for this build; Core plan or above for production volume)

  • Write-permission on the destination system (CRM, Slack, database)


What data shape are you working with?

Define it up front. This article covers three input types: email body, webhook JSON, and Google Sheets row. 

An LLM that returns prose where you expected JSON breaks every module after it.
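One way to pin the data shape down is to normalise all three input types into a single record before anything reaches the model. The sketch below is illustrative only: the `TicketInput` type, the webhook keys (`email`, `message`), and the sheet column order are assumptions, not part of any Make module.

```python
from dataclasses import dataclass


@dataclass
class TicketInput:
    source: str  # "email", "webhook", or "sheet"
    sender: str
    text: str


def from_email(sender: str, body: str) -> TicketInput:
    return TicketInput("email", sender.strip().lower(), body.strip())


def from_webhook(payload: dict) -> TicketInput:
    # Assumed payload keys: "email" and "message".
    return TicketInput(
        "webhook", payload["email"].strip().lower(), payload["message"].strip()
    )


def from_sheet_row(row: list) -> TicketInput:
    # Assumed column order: [sender, message].
    return TicketInput("sheet", str(row[0]).strip().lower(), str(row[1]).strip())
```

With one shape on the way in, every downstream step can rely on the same three fields regardless of which trigger fired.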

How does an LLM integration work in Make?

Make turns the LLM into one module in a sequence, surrounded by retrieval, validation, and routing logic you can audit on every run.

What is a scenario doing when it calls an LLM?

A trigger module fires a bundle: a new email, a webhook payload, a new sheet row. 

A retrieval module enriches that bundle with context from your systems before anything reaches the model. 

The Anthropic Claude > Create a Prompt or OpenAI > Create a Chat Completion module then passes the enriched bundle as structured input, and downstream modules act on the parsed response. 

That sequence is the difference between asking a chatbot a question and operationalizing a model inside a business process. 

The four roles in order are:

  • Trigger

  • Retrieve

  • Reason

  • Act
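The four roles can be sketched as four small functions chained in order. This is a toy illustration, not how Make executes a scenario: every function name is hypothetical, and `reason` uses a keyword rule as a stand-in for the model call.

```python
def trigger() -> dict:
    # Stand-in for a new email, webhook payload, or sheet row.
    return {"sender": "ana@example.com", "text": "I was charged twice this month."}


def retrieve(bundle: dict, crm: dict) -> dict:
    # Enrich the bundle with context before anything reaches the model.
    return {**bundle, "customer": crm.get(bundle["sender"], {})}


def reason(bundle: dict) -> dict:
    # Placeholder for the LLM call: a keyword rule, NOT a real model.
    category = "billing" if "charged" in bundle["text"].lower() else "general"
    return {**bundle, "category": category}


def act(bundle: dict) -> str:
    # Downstream modules act on the parsed response.
    return f"route:{bundle['category']}"


crm = {"ana@example.com": {"plan": "Core"}}
result = act(reason(retrieve(trigger(), crm)))
print(result)  # prints route:billing
```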

How is this different from a direct API call?

Writing direct API calls to OpenAI or Anthropic Claude gives full control, but every retry, schema check, secret rotation, and routing branch becomes code your team maintains. 

For teams whose needs stop at a single linear API call, a script is a solid baseline. 

For teams whose requirements have outgrown it, Make handles those layers as native modules.

| Dimension | Direct API call | Make scenario |
|---|---|---|
| Retry on rate limit | Custom code | Built-in exponential backoff |
| Schema validation | Custom code | Filter + Text Parser modules |
| Multi-system routing | Custom code | Router with fallback route |
| Audit of each run | Custom logging stack | Per-bundle execution log |
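To show what "custom code" means in the retry row, here is a minimal sketch of exponential backoff around a direct API call. It is a generic pattern and makes no claim about how Make implements its own retries. `RateLimitError` and `call_with_backoff` are hypothetical names, and the delays are illustrative.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error standing in for an HTTP 429 from a provider."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry fn on RateLimitError, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, 0.1 * delay))  # small jitter
```

The `sleep` parameter is injectable so the retry logic can be tested without real waiting. This is one of four layers in the table that a script would have to own.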

How do you build an LLM integration scenario in Make?

This example build classifies inbound support emails using the Make AI Toolkit, validates the output, and routes by category in five simple steps.

Step 1: How do you connect your LLM provider to Make?

Open Make and add the Make AI Toolkit > Categorize Text module to a blank scenario. Click Add a connection and select Make's built-in AI provider.

No external API key is required, and it works on all plans.

Make also supports native modules for Anthropic Claude, OpenAI, Azure OpenAI, and other providers if you have your own API keys, but Make's built-in provider is the fastest path to a working build.

Three connection settings that matter:

  • Connection name: label by environment, not by app

  • AI provider: Make's built-in provider unless you have a custom API key

  • Model: inherited from Make's provider, no manual selection needed

💡 PRO TIP: Name connections make-ai-prod and make-ai-staging. When you clone a scenario for testing, swap connections instead of editing settings, so staging runs do not consume production credits.


Step 2: How do you structure the trigger and retrieval modules?

Your trigger choice sets the credit ceiling for the scenario. Gmail > Watch Emails polls on a schedule, consuming credits whether or not new emails arrived. 

Webhooks > Custom Webhook fires only when data arrives, making it the lower-cost default. Three trigger patterns this scenario supports:

  • Gmail > Watch Emails for teams routing support through Gmail

  • Webhooks > Custom Webhook for instant triggers from a help desk

  • Google Sheets > Watch Rows for batch processing from a spreadsheet

Add a HubSpot CRM > Get a Contact module between the trigger and the AI module to attach prior customer data to the input. 

How you structure agent workflow memory at this stage determines the quality of every downstream decision. 

Map retrieved fields into the AI module separately, not as one concatenated blob.
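As a rough illustration of "separate fields, not one blob", the sketch below labels each retrieved value on its own line before it is handed to the model. The field names (`plan`, `open_tickets`) are hypothetical examples of CRM data, not HubSpot property names.

```python
def build_input(email_body: str, contact: dict) -> str:
    """Label each retrieved field on its own line instead of one blob."""
    fields = {
        "Customer plan": contact.get("plan", "unknown"),
        "Open tickets": contact.get("open_tickets", "unknown"),
        "Email body": email_body.strip(),
    }
    return "\n".join(f"{label}: {value}" for label, value in fields.items())


print(build_input(" Charged twice ", {"plan": "Core", "open_tickets": 2}))
```

Keeping fields labelled makes it easier to see, in the execution log, which piece of context the model was given when a classification looks wrong.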


Step 3: How do you configure the LLM module and prompt?

Use Make AI Toolkit > Categorize Text for classification tasks like routing tickets by topic. 

Use Make AI Toolkit > Summarize Text when downstream modules need a condensed version of long input. Both use Make's built-in provider, so there is no third-party token billing to manage. 

This module sits at the heart of what AI automation is, turning unstructured input into structured output. 

Five fields that matter:

  • Input Text: mapped from the retrieved bundle in Step 2

  • Categories: the exact list the model must choose from

  • Language: auto-detect or fixed, depending on input variability

  • Confidence threshold: the minimum score required for a valid result

  • Output format: structured response the next module can parse

The category list is the integration's contract with the model. Define it precisely upfront. 

Vague categories like "other" produce routing failures downstream. 

For a support email classifier, define three to five exclusive categories. Add urgency as a second classification pass if you need it.

Categories: billing, technical, general

Confidence threshold: 0.7

Language: auto-detect

When to use each toolkit module:

  • Categorize Text for routing decisions

  • Summarize Text for condensing long input

  • Extract Information for pulling structured fields from prose

  • Analyze Sentiment for tone-based routing

💡 PRO TIP: Run an Analyze Sentiment module in parallel with Categorize Text and use sentiment score as a second routing dimension alongside category. Angry billing tickets route differently from neutral billing tickets.


Step 4: How do you validate the LLM response?

The Make AI Toolkit returns structured output directly, which reduces but does not eliminate validation. 

A filter still needs to confirm that the returned category matches your defined list and that the confidence score clears the threshold you set in Step 3. 

This validation layer is what separates a demo from a production scenario, and it is where most teams cut corners and pay for it later.

Three validation steps to add after the AI module:

  • Filter checks the category field is not empty and matches the defined list of values

  • Tools > Set Variable normalises casing and trims whitespace before downstream mapping

  • Filter checks the confidence score is above the 0.7 threshold you defined

Validation modules consume credits, but a miscategorised bundle routed to the wrong system costs more in cleanup than the credits spent checking it first. 

Treat this step as insurance against bad model output, not overhead on a working scenario.

| Validation module | What it checks | On failure |
|---|---|---|
| Filter (category) | Category matches the defined list | Route to manual review queue |
| Tools > Set Variable | Casing, whitespace consistency | Normalises silently |
| Filter (confidence) | Score is ≥ 0.7 threshold | Route to human review path |
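The three checks in the table can be sketched as one function. This is an illustration of the logic, not the Make filter syntax: the result keys (`category`, `confidence`) and the returned labels are assumptions. The sketch normalises casing and whitespace before matching the category, so a value like " Billing " still passes.

```python
ALLOWED = {"billing", "technical", "general"}
THRESHOLD = 0.7


def validate(result: dict) -> tuple[bool, str]:
    """Return (ok, normalised_category_or_reason) for one model result."""
    category = str(result.get("category") or "").strip().lower()
    if category not in ALLOWED:
        return False, "manual_review: unknown category"
    confidence = result.get("confidence")
    if not isinstance(confidence, (int, float)) or confidence < THRESHOLD:
        return False, "human_review: low confidence"
    return True, category


print(validate({"category": " Billing ", "confidence": 0.9}))
```

A missing or non-numeric confidence is treated as a failure rather than a pass, which is the safer default for a routing decision.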


Step 5: How do you route and act on the result?

The Router turns the AI module's structured output into business outcomes. 

Each route is a destination system, not a code branch. Configure routes in priority order, with a fallback at the end for any bundle that matches no filter condition. 

Without a fallback, bundles that fall outside your defined paths disappear silently with no log trail. 

The pattern from this build extends naturally into autonomous AI agents on the Make canvas when you outgrow rules-based routing.

Three routes the demo scenario produces:

  • Gmail > Create a Draft for high-confidence auto-replies that the support team approves before sending

  • Slack > Create a Message to the support channel for mid-confidence bundles needing human review

  • HubSpot CRM > Create a Ticket for low-confidence or sensitive bundles requiring full escalation

Configure the fallback route to log unmatched bundles to a Google Sheet or Airtable. 

This gives you a backlog to review weekly and tune your filter conditions against, rather than discovering missed cases through customer complaints.

| Category + confidence | Route | Action module |
|---|---|---|
| Any category, ≥ 0.85 | Auto-action | Gmail > Create a Draft |
| Any category, 0.7 to < 0.85 | Human review | Slack > Create a Message |
| Any category, < 0.7 | Escalate | HubSpot CRM > Create a Ticket |
| No match | Fallback | Log to Google Sheets |
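The routing table reads as an ordered set of conditions with the fallback last. A minimal sketch of that decision follows, assuming the three categories from Step 3. The route labels are hypothetical names, and a score of exactly 0.85 is treated as auto-action.

```python
def pick_route(category, confidence):
    """Map one result to a route name using the table's thresholds."""
    if category not in {"billing", "technical", "general"} or confidence is None:
        return "fallback"      # log to Google Sheets
    if confidence >= 0.85:
        return "auto_action"   # Gmail > Create a Draft
    if confidence >= 0.7:
        return "human_review"  # Slack > Create a Message
    return "escalate"          # HubSpot CRM > Create a Ticket


print(pick_route("billing", 0.9))    # prints auto_action
print(pick_route("billing", 0.75))   # prints human_review
print(pick_route("refund", 0.99))    # prints fallback
```

Checking the thresholds from highest to lowest means each bundle matches exactly one route, which mirrors configuring Router filters in priority order.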


How do you test and troubleshoot the integration?

Three failure modes account for almost every broken LLM integration. Test for each before turning scheduling on.

How do you run a single-bundle test?

Use the Run once button with a real sample bundle and inspect the output of every module via the bundle inspector. Three things to check on the first run:

  • AI module returned a structured category, not free-form prose

  • Filter passed the bundle through to the router

  • Router selected the route matching the category and confidence

What are the three most common LLM integration failures?

  • Malformed output: prompt drift returns prose instead of structured fields. Fix with a stricter category list and a Text Parser > Match Pattern fallback.

  • Rate limit (429): provider throttles under load. Fix with Make's built-in exponential backoff and the Break error handler.

  • Stale context: retrieval module read outdated CRM state. Fix by moving retrieval after deduplication or adding a timestamp filter.
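For the malformed-output case, the pattern-match fallback can be sketched with a regular expression: if the model answers in prose, pull out the category only when exactly one known category is named. This stands in for the idea behind a Text Parser > Match Pattern step. The pattern and function name are assumptions for illustration.

```python
import re

CATEGORY_PATTERN = re.compile(r"\b(billing|technical|general)\b", re.IGNORECASE)


def extract_category(raw: str):
    """Return the single category named in free-form text, else None."""
    found = {match.lower() for match in CATEGORY_PATTERN.findall(raw)}
    return found.pop() if len(found) == 1 else None


print(extract_category("This looks like a Billing issue."))  # prints billing
print(extract_category("Could be billing or technical."))    # prints None
```

Returning None for ambiguous text sends the bundle to review instead of guessing, which keeps the fallback from creating new misroutes.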

How do you handle errors in production?

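Attach an error handler route to the Make AI Toolkit module. Use the Break directive for transient failures such as rate limits, so the bundle is stored as an incomplete execution and can be retried. Use the Resume directive with a substitute value when a malformed response should continue on to manual review. In the handler route, record each failure in one of these places: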

  • Log to a Google Sheet

  • Post to a #scenario-errors Slack channel

  • Write a row to a dead-letter Airtable
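Whichever destination you choose, the log row should carry enough to replay the failure later. The sketch below shows one possible row shape, written to an in-memory CSV that stands in for the sheet or table. The column names and the 200-character cap are assumptions.

```python
import csv
import io
from datetime import datetime, timezone


def dead_letter_row(bundle_id: str, module: str, error: str, now=None) -> list:
    """Build one log row: when it failed, where, and why."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return [stamp, bundle_id, module, error[:200]]  # cap long error text


buffer = io.StringIO()  # stands in for a sheet or table
writer = csv.writer(buffer)
writer.writerow(["timestamp", "bundle_id", "module", "error"])
writer.writerow(dead_letter_row("b-123", "Categorize Text", "429 rate limit"))
print(buffer.getvalue())
```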

What variations and next steps can you build from here?

The classify-and-route scenario is the entry pattern. 

Once it runs reliably, three extensions deliver the next round of value without rebuilding the foundation you have already shipped. 

Each one solves a different production problem, so the right pick depends on what broke first during testing. 

Knowing when to use AI agents vs automation is the call you make before reaching for the long-running agent extension below.

| Extension | What it adds | Make module or feature |
|---|---|---|
| Multi-model routing | Cheap model for classification, premium model only for drafting | Two OpenAI > Create a Chat Completion modules behind a Router |
| Long-running agent | Multi-step reasoning with tool calls | Make AI Agents for autonomous decisions |
| RAG-style retrieval | Ground responses in your own knowledge base | OpenAI > Add Files to Vector Store plus retrieval before the prompt |

Pick the extension matching the failure mode your first scenario surfaced, not the one that sounds most ambitious.

So what should you ship first?

You can now ship a Make scenario that takes structured or unstructured input, calls an LLM, validates the response, and routes work to the right system. 

That capability is the foundation every more ambitious agent build extends from.

Pick the highest-volume unstructured input your team handles this week, apply the classify-and-route pattern from the five-step build above, and ship it.

Sign up for Make on the free tier, then move to Core when your volume calls for it. For ready-made starting points, browse the Anthropic Claude Integration pages.

Frequently asked questions

Q1: What is LLM integration?

LLM integration is the process of connecting a large language model to your business systems so it reads inputs from one app, applies reasoning, and writes outputs to another, on a schedule, without a human in the loop. The integration layer handles retrieval, validation, routing, and error recovery around the LLM call.

Q2: How do you integrate an LLM with an API?

Three paths: write direct API calls to a provider like OpenAI or Anthropic Claude, use a visual platform like Make where API connections are pre-built modules, or build with a framework like LangChain. Make is the fastest path because retries, validation, and routing are handled natively.

Q3: Do I need an API key to integrate an LLM with Make?

Not always. Make's built-in AI provider is available on every plan with no external API key required and powers the Make AI Toolkit and Make AI Agents apps. Native modules for OpenAI, Anthropic Claude, and others require your own provider API key.

Q4: How do I stop the LLM from returning unstructured prose?

Define the exact output schema in the system prompt and add a validation module after the LLM call. The Make AI Toolkit returns structured output by default. For free-text models, follow up with JSON > Parse JSON and a Filter checking required fields exist.

Q5: What does an LLM integration cost to run?

Two layers: Make charges credits per module operation, with AI-heavy modules consuming more, and the provider charges per token. Most production classify-and-route scenarios run on the Core plan plus $20–80 in monthly model spend, depending on volume. See Make's and your provider's pricing pages for current rates.

Raife Dowley


Raife is a Content Specialist with a background in marketing and campaign management. Transitioning from hands-on platform work to content, he developed a talent for translating technical concepts into clear, engaging narratives that actually resonate with readers.
