Securing a RAG Application built with OpenAI and FGA in Python
Published on February 21, 2025

Overview
In this lab, you'll learn how to build a secure Retrieval-Augmented Generation (RAG) application using the OpenAI SDK and Auth0 Fine-Grained Authorization (FGA) in Python. You'll see how to prevent sensitive information disclosure by implementing proper access controls in your RAG system.
You'll start by setting up a basic RAG pipeline with OpenAI and a FAISS vector store, then add FGA to secure access to sensitive documents. Along the way, you'll learn about:
- Building AI applications with OpenAI
- Implementing RAG for improved LLM accuracy
- Securing document access with Fine-Grained Authorization
Generative Artificial Intelligence (GenAI) is a subset of Artificial Intelligence (AI) focused on generating data, be it text, images, or other data formats. Large Language Models (LLMs) are another subset of AI, much like GenAI. An LLM is a machine learning model that is trained on gigantic amounts of data and is capable of understanding and responding in natural language.
Many GenAI applications today implement Retrieval-Augmented Generation (RAG), an architecture that improves the accuracy of LLMs by helping them avoid hallucinations. It does this by giving the LLM extra context to formulate an answer, context that is either more up to date than the training dataset or domain-specific.
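To make the "augmentation" part concrete, here is a minimal, illustrative sketch: the retrieved snippets are simply placed into the prompt before it is sent to the model. The helper name and prompt wording are made up for this example and are not part of the project you'll build below.

```python
def build_augmented_prompt(query: str, snippets: list[str]) -> str:
    # Retrieved, domain-specific context is injected ahead of the question
    # so the LLM can ground its answer in it instead of guessing.
    context = "\n\n".join(snippets)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```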
If you would like to learn the basics of using FGA for RAG, check out this blog post on RAG and Access Control: Where Do You Start?.
Prerequisites
To follow this tutorial you'll need:
- Python 3.10 or later.
- An Auth0 FGA account: it's free and you can sign up here.
- An OpenAI account and API key: read here on how to get yours.
Set Up an OpenAI RAG Application
To get started, you are going to clone a starter application from our AI samples repository like so:
```bash
git clone -b openai-fga-python-starter https://github.com/oktadev/auth0-ai-samples.git
```
Then navigate into the project folder inside the repository you just cloned:
```bash
cd auth0-ai-samples/authorization-for-rag/openai-fga-python
```
Now create a Python virtual environment and install the dependencies like so:
```bash
python -m venv .env && source .env/bin/activate && pip install -r requirements.txt
```
This will install:
- The OpenAI SDK, which will give us access to the `gpt-4o-mini` model so we can use it for the query.
- The OpenFGA SDK, which will enforce the authorization model to avoid leaking sensitive data.
- `faiss-cpu`, which will be the local vector store that we will use to index the embeddings generated from the documents (a minimal sketch of how these pieces fit together follows this list).
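The starter project wraps these libraries in helper classes, but the basic interaction between them looks roughly like the sketch below. The embedding model name is an assumption for illustration; the helpers in the repository may use a different one.

```python
import faiss
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

texts = ["a public document", "a private financial forecast"]

# Encode the documents with the OpenAI embeddings API.
response = client.embeddings.create(model="text-embedding-3-small", input=texts)
vectors = np.array([item.embedding for item in response.data], dtype="float32")

# Index the embeddings in a local FAISS index.
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)

# Embed the query and retrieve the two closest documents.
query = client.embeddings.create(model="text-embedding-3-small", input=["forecast"])
query_vec = np.array([query.data[0].embedding], dtype="float32")
distances, ids = index.search(query_vec, 2)
```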
The application will be structured as follows:
- `main.py`: The main entry point of the application that defines the RAG pipeline.
- `docs/*.md`: Sample markdown documents that will be used as context for the LLM. We have public and private documents for demonstration purposes.
Start by creating the RAG pipeline.
RAG Pipeline
The RAG pipeline will be defined in `main.py`. It uses the OpenAI GPT-4o mini model as the underlying LLM and retrieves data from the vector store, which acts as the database.
Here is a visual representation of the RAG pipeline:
Create a new file called `main.py` and add the following code:
```python
import asyncio

from helpers.documents import read_documents
from helpers.vector_store import LocalVectorStore
from helpers.retriever import FGARetriever
from helpers.documents import DocumentWithScore, generate


async def main(user: str = "notadmin", query: str = "Show me the forecast for ZEKO?"):
    # 1. RAG pipeline
    documents = read_documents()
    # `LocalVectorStore` is a helper class that creates a FAISS index
    # and uses OpenAI embeddings API to encode the documents.
    vector_store = await LocalVectorStore.from_documents(documents)

    # Perform a query
    search_results = await vector_store['search'](query, k=2)

    # Convert search results to DocumentWithScore
    documents_with_scores = [
        DocumentWithScore(document=result['document'], score=result['score'])
        for result in search_results
    ]

    # 2. Create an instance of the FGARetriever
    retriever = FGARetriever.create({
        "documents": documents_with_scores,
        "build_query": lambda doc: {
            "user": f"user:{user}",
            "object": f"doc:{doc['id']}",
            "relation": "viewer",
        }
    })

    # 3. Filter documents based on user permissions
    context = await retriever.retrieve()

    # 4. Generate a response based on the context
    # `generate` is a helper function that takes a query and a context and
    # returns a response using OpenAI chat completion API.
    answer = await generate(query, context)

    # 5. Print the answer
    print(f"Response to {user}:\n\n{answer}\n\n")


if __name__ == "__main__":
    # Jess only has access to public docs
    asyncio.run(main("jess"))
```
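The `generate` helper ships with the starter project. As a rough idea of what such a helper can look like with the OpenAI chat completions API, here is a minimal sketch; the prompt wording and the way the context is flattened into text are assumptions, not the project's exact implementation.

```python
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment


async def generate(query: str, context: list) -> str:
    # Flatten the filtered documents into a single context block.
    context_text = "\n\n".join(str(doc) for doc in context)
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "Answer using only the provided context. "
                "If the answer is not in the context, say so.",
            },
            {
                "role": "user",
                "content": f"Context:\n{context_text}\n\nQuestion: {query}",
            },
        ],
    )
    return response.choices[0].message.content
```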
Retrieving Documents
- The `read_documents()` function reads the private and public documents from the `docs/` folder.
- Then a `vector_store` is created using the FAISS library; the `LocalVectorStore` class implements both creating embeddings of the documents for storage and searching for appropriate docs based on the user's query.
- The `search_results` are used to create the documents list (`documents_with_scores`) that will be filtered down by the user's permissions.
```python
# 1. RAG pipeline
documents = read_documents()
# `LocalVectorStore` is a helper class that creates a FAISS index
# and uses OpenAI embeddings API to encode the documents.
vector_store = await LocalVectorStore.from_documents(documents)

# Perform a query
search_results = await vector_store['search'](query, k=2)

# Convert search results to DocumentWithScore
documents_with_scores = [
    DocumentWithScore(document=result['document'], score=result['score'])
    for result in search_results
]
```
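`read_documents()` is another helper included in the starter. A plausible minimal version simply loads the markdown files from `docs/` into lightweight records; the exact fields below (an `id` derived from the file name, plus the raw text) are assumptions based on how the documents are used later in the pipeline.

```python
from pathlib import Path


def read_documents(folder: str = "docs") -> list[dict]:
    documents = []
    for path in sorted(Path(folder).glob("*.md")):
        documents.append({
            # The file name (e.g. "public-doc") doubles as the FGA object ID.
            "id": path.stem,
            "text": path.read_text(encoding="utf-8"),
        })
    return documents
```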
FGA Retriever
The `FGARetriever` class filters documents based on the authorization model defined in Auth0 FGA and will be available as part of the Auth for GenAI SDK (to be released). This retriever is a post-search filter, ideal for scenarios where you already have documents in a vector store and want to filter the vector store results based on the user's permissions. Assuming the vector store already narrows the documents down to a few, the FGA retriever further narrows them down to only the ones the user has access to.
```python
retriever = FGARetriever.create({
    "documents": documents_with_scores,
    "build_query": ...
})
```
The `build_query` parameter receives a lambda function that is used to construct the query to the FGA store. The query is constructed using the `user`, `object`, and `relation`. The user is the user ID, the object is the document ID or the document name, and the relation is the permission that the user must have on the document.
```python
lambda doc: {
    "user": f"user:{user}",
    "object": f"doc:{doc['id']}",
    "relation": "viewer",
}
```
For example, the dictionary below can be used to represent the relation where "jess is a viewer of public-doc".
{"user": "user:jess","relation": "viewer","object": "docs:public-doc"}
Set Up an FGA Store
In the Auth0 FGA dashboard, navigate to Settings, and in the Authorized Clients section, click + Create Client. Give your client a name, mark all three client permissions, and then click Create.
Once your client is created, you'll see a modal containing Store ID, Client ID, and Client Secret like so:
Add a `.config` file with the following content to the root of the project, and fill in the corresponding values for `FGA_STORE_ID`, `FGA_CLIENT_ID`, and `FGA_CLIENT_SECRET` from the modal above.
```
# OpenAI
OPENAI_API_KEY=<your-openai-api-key>

# Auth0 FGA
FGA_STORE_ID=<your-fga-store-id>
FGA_CLIENT_ID=<your-fga-store-client-id>
FGA_CLIENT_SECRET=<your-fga-store-client-secret>

# Required only for non-US regions
FGA_API_URL=https://api.xxx.fga.dev
FGA_API_AUDIENCE=https://api.xxx.fga.dev/
FGA_API_TOKEN_ISSUER=auth.fga.dev
```
Click Continue in the dashboard modal to see the `FGA_API_URL` and `FGA_API_AUDIENCE` values and finish filling in the rest of the variables.
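For reference, this is roughly how those variables can be turned into an OpenFGA client with the `openfga_sdk` package, following its documented client-credentials setup. How the starter project actually loads the `.config` file and builds its client may differ, and the US-region defaults shown here are assumptions you should verify against your dashboard.

```python
import os

import openfga_sdk
from openfga_sdk.client import OpenFgaClient
from openfga_sdk.credentials import CredentialConfiguration, Credentials


def build_fga_configuration() -> openfga_sdk.ClientConfiguration:
    # Assumes the variables from `.config` have been loaded into the environment.
    return openfga_sdk.ClientConfiguration(
        api_url=os.getenv("FGA_API_URL", "https://api.us1.fga.dev"),  # verify in your dashboard
        store_id=os.environ["FGA_STORE_ID"],
        credentials=Credentials(
            method="client_credentials",
            configuration=CredentialConfiguration(
                api_issuer=os.getenv("FGA_API_TOKEN_ISSUER", "auth.fga.dev"),
                api_audience=os.getenv("FGA_API_AUDIENCE", "https://api.us1.fga.dev/"),
                client_id=os.environ["FGA_CLIENT_ID"],
                client_secret=os.environ["FGA_CLIENT_SECRET"],
            ),
        ),
    )


# Usage: `async with OpenFgaClient(build_fga_configuration()) as fga_client: ...`
```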
Check the instructions here to find your OpenAI API key.
Next, in the FGA Dashboard navigate to Model Explorer. You'll need to update the model information with this:
```
model
  schema 1.1

type user

type doc
  relations
    define owner: [user]
    define viewer: [user, user:*]
```
Remember to click Save. Your model preview should update and look like this:
Check out this documentation to learn more about creating an authorization model in FGA.
Now, to have access to the public information, you'll need to add a tuple on FGA. Navigate to the Tuple Management section, click + Add Tuple, and fill in the following information:
- User: `user:*`
- Object: select `doc` and add `public-doc` in the ID field
- Relation: `viewer`
A tuple signifies a user's relation to a given object. For example, the above tuple implies that all users can view the `public-doc` object.
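The dashboard is the quickest way to add tuples while following along, but the same tuple can also be created programmatically through the SDK's write API. A minimal sketch, again assuming a configured `fga_client`:

```python
from openfga_sdk.client.models import ClientTuple, ClientWriteRequest


async def grant_public_access(fga_client):
    # Everyone (`user:*`) becomes a viewer of the public document.
    await fga_client.write(
        ClientWriteRequest(
            writes=[
                ClientTuple(
                    user="user:*",
                    relation="viewer",
                    object="doc:public-doc",
                ),
            ],
        )
    )
```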
Test the Application
Now that you have set up the application and the FGA store, you can run the application using the following command:
```bash
python main.py
```
The application will run with the query "Show me the forecast for ZEKO?". Since this information is in a private document, and we haven't defined a tuple granting access to this document, the application will not be able to retrieve it. The FGA retriever will filter the private document out of the vector store results and print output similar to this:
```
Response to jess:

The provided context does not include specific forecasts
or projections for Zeko Advanced Systems Inc. ...
```
If you change the query to something that is available in the public document, the application will be able to retrieve the information.
Now, to have access to the private information, you'll need to update your tuple list. Go back to the Auth0 FGA dashboard, and in the Tuple Management section, click + Add Tuple and fill in the following information:
- User: `user:user1`
- Object: select `doc` and add `private-doc` in the ID field
- Relation: `viewer`
Now click Add Tuple and adjust `main.py`:
```python
# ... pre-existing code

if __name__ == "__main__":
    # Jess only has access to public docs
    asyncio.run(main("jess"))

    # User1 is part of the financial team and has access to financial reports
    asyncio.run(main("user1"))
```
And run the app again:
```bash
python main.py
```
This time, you should see both the response to `jess`, who only has access to public information, and the response containing the forecast information, since we added a tuple that defines the `viewer` relation for `user1` on the `private-doc` object, like so:
```
Response to user1:

The forecast for Zeko Advanced Systems Inc. (ZEKO)
for fiscal year 2025 indicates a bearish outlook.
Here are the key projections:

- **Revenue Growth:** Estimated increase of 2-3%, ...
```
Congratulations! You have run a simple RAG application using OpenAI and secured it using Auth0 FGA.
Recap
In this tutorial, you learned how to build a secure RAG application using OpenAI and Auth0 FGA in Python. You saw how to implement RAG for improved LLM accuracy and secure document access with Fine-Grained Authorization.
You can find the complete code for this tutorial on this GitHub repository.
Extra Resources
If you want to continue your studies in GenAI and auth, check out these other blog posts:
- RAG and Access Control: Where Do You Start?
- Tool Calling in AI Agents: Empowering Intelligent Automation Securely
- Building a Secure RAG with Python, LangChain, and OpenFGA
- Build a Secure RAG Agent Using LlamaIndex and Okta FGA on Node.js
Legal Disclosure
This document and any recommendations within are not legal, privacy, security, compliance, or business advice. This document is intended for general informational purposes only and may not reflect the most current security, privacy, and legal developments or all relevant issues. You are responsible for obtaining legal, security, privacy, compliance, or business advice from your lawyer or other professional advisor and should not rely on the recommendations herein. Okta is not liable to you for any loss or damages that may result from your implementation of any recommendations in this document. Okta makes no representations, warranties, or other assurances regarding the content of this document. Information regarding Okta's contractual assurances to its customers can be found at okta.com/agreements.