Securing a RAG Application built with OpenAI and FGA in Python

Published on February 21, 2025

Overview

In this lab, you'll learn how to build a secure Retrieval-Augmented Generation (RAG) application using OpenAI SDK and Auth0 Fine-Grained Authorization (FGA) in Python. You'll see how to prevent sensitive information disclosure by implementing proper access controls in your RAG system.

You'll start by setting up a basic RAG pipeline with OpenAI and a FAISS vector store, then add FGA to secure access to sensitive documents. Along the way, you'll learn about:

  • Building AI applications with OpenAI
  • Implementing RAG for improved LLM accuracy
  • Securing document access with Fine-Grained Authorization

Connect with the Author: Jessica Temporal

Jess is a Senior Developer Advocate at Auth0 and co-founder of the first Brazilian data science podcast, Pizza de Dados. She's the author of The Big Git Microbook, creator of GitFichas.com, and part of the LinkedIn Learning instructors team. Jessica helped develop the AI that identifies possible unlawful expenses from Brazilian politicians. She is part of the PyLadies initiative that works to improve diversity and inclusion in technology. In her free time, she likes to knit, play video games, use her 3D printer, and is learning to create 3D models.

Generative Artificial Intelligence (GenAI) is a subset of Artificial Intelligence (AI) focused on generating data, be it text, images, or other data formats. Large Language Models (LLMs), like GenAI, are also a subset of AI: an LLM is a machine learning model that is trained on gigantic amounts of data and is capable of understanding and responding in natural language.

Many GenAI applications today implement Retrieval-Augmented Generation (RAG). RAG is an architecture that improves the accuracy of LLMs by helping them avoid hallucinations. It also gives the LLM more information to formulate an answer by providing extra context that is either more up to date than the training dataset or domain-specific.

If you would like to learn the basics of using FGA for RAG, check out this blog post on RAG and Access Control: Where Do You Start?.

Prerequisites

To follow this tutorial you'll need:

  • Python and pip installed on your machine
  • An OpenAI account and API key
  • An Auth0 FGA account

Setup an OpenAI RAG Application

To get started, you are going to clone a starter application from our AI samples repository like so:

COMMAND
git clone -b openai-fga-python-starter https://github.com/oktadev/auth0-ai-samples.git
Note that this branch only contains the starter sample. If you want to check out the other AI samples or the final code for this app, switch to the main branch with git switch main.

Then navigate into the folder you just downloaded with the clone.

COMMAND
cd auth0-ai-samples/authorization-for-rag/openai-fga-python

Now create a Python environment and install the dependencies like so:

COMMAND
python -m venv .env && source .env/bin/activate && pip install -r requirements.txt

This will install:

  • The OpenAI SDK, which gives us access to the gpt-4o-mini model we'll use for the query.
  • The OpenFGA SDK, which enforces the authorization model to avoid leaking sensitive data.
  • faiss-cpu, the local vector store that we'll use to index the embeddings generated from the documents.

You can use other dependency and project management tools like uv or poetry if you prefer.

The application will be structured as follows:

  • main.py: The main entry point of the application that defines the RAG pipeline.
  • docs/*.md: Sample markdown documents that will be used as context for the LLM. We have public and private documents for demonstration purposes.
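To give a sense of the shape of the data, here is a rough sketch of what a helper like read_documents could look like. This is an illustration only; the actual helper in the sample repository may differ:

```python
from pathlib import Path


def read_documents(docs_dir: str = "docs"):
    """Read every markdown file in `docs_dir` into a list of dicts.

    Each document keeps its file stem as an `id` (e.g. `public-doc`)
    so it can later be matched against FGA objects like `doc:public-doc`.
    """
    documents = []
    for path in sorted(Path(docs_dir).glob("*.md")):
        documents.append({"id": path.stem, "text": path.read_text()})
    return documents
```

The important detail is that each document carries an id, which is what the FGA object (for example, doc:public-doc) will later refer to.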

Start by creating the RAG pipeline.

RAG Pipeline

The RAG pipeline will be defined in main.py. It uses OpenAI's GPT-4o mini model as the underlying LLM and retrieves data from the vector store.

Here is a visual representation of the RAG pipeline:

https://images.ctfassets.net/23aumh6u8s0i/76DegvQtjEx5jNDcqvy1VD/462977639c07dd1d92e82783d66aac7e/rag-with-fga-flow.png

Create a new file called main.py and add the following code:

main.py
import asyncio

from helpers.documents import DocumentWithScore, generate, read_documents
from helpers.retriever import FGARetriever
from helpers.vector_store import LocalVectorStore


async def main(user: str = "notadmin", query: str = "Show me the forecast for ZEKO?"):
    # 1. RAG pipeline
    documents = read_documents()
    # `LocalVectorStore` is a helper class that creates a FAISS index
    # and uses OpenAI embeddings API to encode the documents.
    vector_store = await LocalVectorStore.from_documents(documents)
    # Perform a query
    search_results = await vector_store['search'](query, k=2)
    # Convert search results to DocumentWithScore
    documents_with_scores = [
        DocumentWithScore(document=result['document'], score=result['score'])
        for result in search_results
    ]
    # 2. Create an instance of the FGARetriever
    retriever = FGARetriever.create({
        "documents": documents_with_scores,
        "build_query": lambda doc: {
            "user": f"user:{user}",
            "object": f"doc:{doc['id']}",
            "relation": "viewer",
        }
    })
    # 3. Filter documents based on user permissions
    context = await retriever.retrieve()
    # 4. Generate a response based on the context
    # `generate` is a helper function that takes a query and a context and
    # returns a response using OpenAI chat completion API.
    answer = await generate(query, context)
    # 5. Print the answer
    print(f"Response to {user}:\n\n{answer}\n\n")


if __name__ == "__main__":
    # Jess only has access to public docs
    asyncio.run(main("jess"))

Retrieving Documents

  1. The read_documents() function reads the private and public documents from the docs/ folder.
  2. Then a vector_store is created using the FAISS library. The LocalVectorStore class both creates the embeddings of the documents for storage and searches for the appropriate docs based on the user's query.
  3. The search_results are used to create the documents list (documents_with_scores) that will be filtered down by the user's permissions.
main.py
# 1. RAG pipeline
documents = read_documents()
# `LocalVectorStore` is a helper class that creates a FAISS index
# and uses OpenAI embeddings API to encode the documents.
vector_store = await LocalVectorStore.from_documents(documents)
# Perform a query
search_results = await vector_store['search'](query, k=2)
# Convert search results to DocumentWithScore
documents_with_scores = [
    DocumentWithScore(document=result['document'], score=result['score'])
    for result in search_results
]
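Conceptually, the search step ranks every document by the similarity between the query's embedding and each document's embedding, and keeps the top k. Stripped of FAISS and the OpenAI embeddings API, the idea reduces to something like this toy sketch with hand-made vectors (not the helper's actual code):

```python
import numpy as np


def search(query_vec, doc_vecs, docs, k=2):
    """Rank documents by cosine similarity to the query vector and
    return the top k with their scores, mimicking the shape of the
    helper's search results."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    top = np.argsort(scores)[::-1][:k]
    return [{"document": docs[i], "score": float(scores[i])} for i in top]


# Hand-made 3-dimensional "embeddings", for illustration only
docs = ["public-doc", "private-doc", "unrelated-doc"]
doc_vecs = np.array([[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 0.0, 1.0]])
search_results = search(np.array([1.0, 0.0, 0.0]), doc_vecs, docs, k=2)
```

FAISS performs the same kind of ranking, just much faster and over real embedding vectors with hundreds of dimensions.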

FGA Retriever

The FGARetriever class filters documents based on the authorization model defined in Auth0 FGA and will be available as part of the Auth for GenAI SDK (to be released). This retriever is a post-search filter, ideal for scenarios where you already have documents in a vector store and want to filter the vector store results based on the user's permissions. Assuming the vector store already narrows the documents down to a few, the FGA retriever narrows them down further to only those the user has access to.

main.py
retriever = FGARetriever.create({
    "documents": documents_with_scores,
    "build_query": ...
})

The build_query parameter receives a lambda function that is used to construct the query to the FGA store. The query is constructed from the user, object, and relation: the user is the user ID, the object is the document ID or the document name, and the relation is the permission that the user must have on the document.

main.py
lambda doc: {
    "user": f"user:{user}",
    "object": f"doc:{doc['id']}",
    "relation": "viewer",
}

For example, the dictionary below can be used to represent the relation where "jess is a viewer of public-doc".

{
    "user": "user:jess",
    "relation": "viewer",
    "object": "doc:public-doc"
}
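Reduced to its core, the post-search filtering step asks one authorization question per candidate document and keeps only the documents with a positive answer. Here is a conceptual sketch, with a stand-in check function in place of a real call to the FGA store (the SDK's actual implementation batches these checks against the FGA API):

```python
def filter_by_permission(documents_with_scores, build_query, check):
    """Keep only the documents for which the permission check passes.

    `check` stands in for a call to the FGA store's check API; it
    receives the (user, relation, object) query built by `build_query`.
    """
    return [
        item for item in documents_with_scores
        if check(build_query(item["document"]))
    ]


# Stand-in permission data: jess is only a viewer of the public doc
allowed = {("user:jess", "viewer", "doc:public-doc")}

def check(query):  # hypothetical stand-in for an FGA check call
    return (query["user"], query["relation"], query["object"]) in allowed

candidates = [
    {"document": {"id": "public-doc"}, "score": 0.91},
    {"document": {"id": "private-doc"}, "score": 0.87},
]
context = filter_by_permission(
    candidates,
    lambda doc: {"user": "user:jess", "object": f"doc:{doc['id']}", "relation": "viewer"},
    check,
)
```

The private document is dropped before the context ever reaches the LLM, which is what prevents sensitive information disclosure.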

Set Up an FGA Store

In the Auth0 FGA dashboard, navigate to Settings, and in the Authorized Clients section, click + Create Client. Give your client a name, mark all three client permissions, and then click Create.

Client creation page

Once your client is created, you'll see a modal containing Store ID, Client ID, and Client Secret like so:

client creation modal with variables

Add a .config file with the following content to the root of the project, filling in FGA_STORE_ID, FGA_CLIENT_ID, and FGA_CLIENT_SECRET with the corresponding values from the modal above.

.config
# OpenAI
OPENAI_API_KEY=<your-openai-api-key>
# Auth0 FGA
FGA_STORE_ID=<your-fga-store-id>
FGA_CLIENT_ID=<your-fga-store-client-id>
FGA_CLIENT_SECRET=<your-fga-store-client-secret>
# Required only for non-US regions
FGA_API_URL=https://api.xxx.fga.dev
FGA_API_AUDIENCE=https://api.xxx.fga.dev/
FGA_API_TOKEN_ISSUER=auth.fga.dev

Click Continue in the dashboard modal to see the FGA_API_URL and FGA_API_AUDIENCE and finish filling in the rest of the variables.

Check the instructions here to find your OpenAI API key.

Next, in the FGA Dashboard navigate to Model Explorer. You'll need to update the model information with this:

model
  schema 1.1

type user

type doc
  relations
    define owner: [user]
    define viewer: [user, user:*]

Remember to click Save. Your model preview should update and look like this:

model with updated preview in the dashboard

Check out this documentation to learn more about creating an authorization model in FGA.

Now, to have access to the public information, you'll need to add a tuple in FGA. Navigate to the Tuple Management section, click + Add Tuple, and fill in the following information:

  • User: user:*
  • Object: select doc and add public-doc in the ID field
  • Relation: viewer

Tuple in the FGA dashboard that allows all users to view the public doc

A tuple signifies a user's relation to a given object. For example, the above tuple implies that all users can view the public-doc object.
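The user:* wildcard is what makes this tuple apply to everyone. As a tiny illustration of that matching behavior (the real evaluation happens inside the FGA store, not in your application code):

```python
def tuple_grants(tuple_user: str, requesting_user: str) -> bool:
    """Return True if a stored tuple's user field covers the requesting user.

    A `user:*` tuple matches any user; a concrete user must match exactly.
    """
    return tuple_user == "user:*" or tuple_user == requesting_user


# The public-doc tuple uses the wildcard, so any user matches it
everyone_can_view = tuple_grants("user:*", "user:jess")
```

This is why jess can see public-doc without a tuple naming her explicitly, while private documents stay hidden until a tuple for a specific user is added.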

Test the Application

Now that you have set up the application and the FGA store, you can run the application using the following command:

COMMAND
python main.py

The application will start with the query, Show me the forecast for ZEKO? Since this information is in a private document, and we haven't defined a tuple granting access to it, the application will not be able to retrieve it. The FGA retriever will filter the private document out of the vector store results and print output similar to this:

Response to jess:
The provided context does not include specific forecasts
or projections for Zeko Advanced Systems Inc. ...

If you change the query to something that is available in the public document, the application will be able to retrieve the information.

Now, to have access to the private information, you'll need to update your tuple list. Go back to the Tuple Management section in the Auth0 FGA dashboard, click + Add Tuple, and fill in the following information:

  • User: user:user1
  • Object: select doc and add private-doc in the ID field
  • Relation: viewer

Now click Add Tuple and adjust main.py:

main.py
# ... pre-existing code

if __name__ == "__main__":
    # Jess only has access to public docs
    asyncio.run(main("jess"))
    # User1 is part of the financial team and has access to financial reports
    asyncio.run(main("user1"))

And run the app again:

COMMAND
python main.py

This time, you should see both the response to jess, who has access only to public information, and the response to user1 containing the forecast information, since we added a tuple that defines the viewer relation for user1 on the private-doc object:

Response to user1:
The forecast for Zeko Advanced Systems Inc. (ZEKO)
for fiscal year 2025 indicates a bearish outlook.
Here are the key projections:
- **Revenue Growth:** Estimated increase of 2-3%, ...

Congratulations! You have run a simple RAG application using OpenAI and secured it using Auth0 FGA.

Recap

In this tutorial, you learned how to build a secure RAG application using OpenAI and Auth0 FGA in Python. You saw how to implement RAG for improved LLM accuracy and secure document access with Fine-Grained Authorization.

You can find the complete code for this tutorial on this GitHub repository.

Extra Resources

If you want to continue your studies in GenAI and auth, check out these other blog posts:

Legal Disclosure

This document and any recommendations within are not legal, privacy, security, compliance, or business advice. This document is intended for general informational purposes only and may not reflect the most current security, privacy, and legal developments or all relevant issues. You are responsible for obtaining legal, security, privacy, compliance, or business advice from your lawyer or other professional advisor and should not rely on the recommendations herein. Okta is not liable to you for any loss or damages that may result from your implementation of any recommendations in this document. Okta makes no representations, warranties, or other assurances regarding the content of this document. Information regarding Okta's contractual assurances to its customers can be found at okta.com/agreements.