Software

2 minute read

Build a Simple RAG App with Telnyx AI Inference

Inbar Yagur

June 26, 2026

RAG is one of those patterns that sounds more complicated than it has to be.

At its core, retrieval-augmented generation is just:

Store some documents
Embed the user’s question
Find the most relevant docs
Send those docs to the model as context
Return an answer with sources

I built a small Python example that shows that flow end to end with Telnyx AI Inference.

Repo: https://github.com/team-telnyx/telnyx-code-examples/tree/main/build-rag-with-telnyx-inference-python

What it does

The app exposes a Flask API for asking questions against a tiny in-memory knowledge base.

You send a question like:

{ "question": "Production signup broke after rotating an API key. Logs show 401 errors. What should we check first?" }

The app

creates an embedding for the question
compares it against embeddings for the sample documents
retrieves the most relevant sources
sends those sources to a chat model
returns a grounded answer plus source titles

Why this pattern is useful

A normal LLM call only knows what is in the prompt and the model’s training data. RAG lets your app answer with your own docs, policies, product information, support notes, or internal knowledge base. That makes it useful for things like:

support assistants
internal docs search
onboarding copilots
product Q&A
troubleshooting workflows
agent tools that need source-grounded answers

How the example works

The example keeps the moving parts intentionally small.
There is an in-memory DOCUMENTS list. On the first request, the app creates embeddings for those documents and caches them. When a user asks a question, the app embeds the question, compares it to the document embeddings, and sends the best matches to the model.
The answer response includes source titles, so you can see what context the app used instead of treating the model like a black box.

Try it

Clone the repo:
git clone https://github.com/team-telnyx/telnyx-code-examples.git cd telnyx-code-examples/build-rag-with-telnyx-inference-python

Install dependencies and run the app:
pip install -r requirements.txt cp .env.example .env python app.py

Ask a question:
curl -X POST http://localhost:5000/rag/ask -H "Content-Type: application/json" -d '{ "question": "Production signup broke after rotating an API key. Logs show 401 errors. What should we check first?" }'

Why I like this example

It is deliberately small, but it gives you the core pieces of a real RAG workflow:

embeddings
retrieval
source grounding
chat completion
a clean API surface

From there, you could swap the in-memory docs for a vector database, pull content from product docs, or turn it into a support assistant.
The Telnyx code examples repo is also structured to be agent-readable, so coding agents can inspect these examples and help you extend them into fuller applications.

Resources

Code example
Telnyx AI repo with skills/toolkits
Telnyx AI Inference docs

Not Such a Cruel Summer: An Economic Outlook

June 26, 2026

Quality Assurance

What ISO 9001:2015 Really Means to an Organization

June 26, 2026

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Hand-Picked Top-Read Stories

What ISO 9001:2015 Really Means to an Organization

Build a Simple RAG App with Telnyx AI Inference

Not Such a Cruel Summer: An Economic Outlook

Trending Tags

Build a Simple RAG App with Telnyx AI Inference

What it does

The app

Why this pattern is useful

How the example works

Try it

Why I like this example

Resources

Leave a Reply Cancel reply

Previous Post

Not Such a Cruel Summer: An Economic Outlook

Next Post

What ISO 9001:2015 Really Means to an Organization

Build a Simple RAG App with Telnyx AI Inference

What it does

The app

Why this pattern is useful

How the example works

Try it

Why I like this example

Resources

Leave a Reply Cancel reply

Previous Post

Next Post

Related Posts