Build a Secure RAG Agent Using LlamaIndex and Auth0 FGA on Node.js

Published on February 7, 2025

Overview

In this lab, you'll learn how to build a secure Retrieval-Augmented Generation (RAG) agent using LlamaIndex and Auth0 Fine-Grained Authorization (FGA) on Node.js. You'll see how to prevent sensitive information disclosure by implementing proper access controls in your RAG system.

You'll start by setting up a basic RAG pipeline with LlamaIndex, then add FGA to secure access to sensitive documents. Along the way, you'll learn about:

  • Building AI agents with LlamaIndex
  • Implementing RAG for improved LLM accuracy
  • Securing document access with Fine-Grained Authorization

Connect with the Author

Deepu Sasidharan

Deepu Sasidharan is a Software Engineer by passion and profession. He is a Java Champion working as a Staff Developer Advocate at Auth0 by Okta. He is the co-chair of JHipster and the creator of KDash, JWT-UI, and JDL Studio. He is a polyglot programmer working with Java, Rust, JavaScript, Go, etc. He is also a cloud technology advocate and an open-source software aficionado. He has authored books on full-stack development and frequently writes about Java, Rust, JavaScript, Go, DevOps, Kubernetes, Linux, and so on, on his blog.

Generative Artificial Intelligence (GenAI) has massively changed the software landscape, and AI agents are all the rage now. AI agents can also power complex Retrieval-Augmented Generation (RAG) systems, where additional context is provided to the Large Language Model (LLM) by retrieving up-to-date or domain-specific data from a database or a search engine. This technique can reduce hallucinations and improve the accuracy of the LLM. AI agents can also use RAG as a tool to perform complex workflows.

Sensitive Information Disclosure is a common issue that plagues RAG-based systems. We don't want the LLM to accidentally access or expose sensitive data from a database. Traditional Role-Based Access Control (RBAC) systems are not enough to secure RAG applications and agents. This is where Fine-Grained Authorization (FGA) shines as a solution.

If you would like to learn the basics of using FGA for RAG, check out this blog post on RAG and Access Control: Where Do You Start?.

What Does LlamaIndex Do?

LlamaIndex is a flexible framework for building AI agents and RAG applications. It provides Python and JavaScript SDKs to interact with a variety of LLMs and databases. LlamaIndex can be used to build AI agents and workflows that interact with the user, retrieve data from a database, and generate responses using an LLM. It also provides tools like LlamaParse to transform unstructured data into LLM-optimized formats and LlamaCloud to store and retrieve LLM-ready data from the cloud.
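
As a quick orientation, here is roughly what a minimal LlamaIndex.TS RAG flow looks like before any authorization is added: load documents, build an in-memory vector index, and query it. This is a simplified sketch based on the same APIs used later in this lab (SimpleDirectoryReader, VectorStoreIndex, asQueryEngine); exact response shapes can vary between llamaindex versions.

import "dotenv/config";
import { SimpleDirectoryReader, VectorStoreIndex } from "llamaindex";

async function basicRag() {
  // Load markdown files from a local folder
  const documents = await new SimpleDirectoryReader().loadData("./assets/docs");

  // Embed them into an in-memory vector store index (uses OpenAI embeddings by default)
  const index = await VectorStoreIndex.fromDocuments(documents);

  // Ask a question grounded in the loaded documents
  const queryEngine = index.asQueryEngine();
  const response = await queryEngine.query({ query: "What does ZEKO do?" });
  console.log(response.toString());
}

basicRag().catch(console.error);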

In this tutorial, we will build a simple RAG agent using LlamaIndex.TS, the TypeScript/JavaScript version of the popular LlamaIndex framework, and secure it using Auth0 FGA.

Prerequisites

You need to set up the following tools and services:

  • Node.js 18 or later and npm installed on your machine
  • An OpenAI account and API key
  • An Auth0 FGA account

Set up a LlamaIndex RAG Application

To get started, create a new directory for the project and navigate into it:

COMMAND
mkdir gen-ai-llamaindex-rag-lab
cd gen-ai-llamaindex-rag-lab

Initialize a new Node.js project with the following command:

COMMAND
npm init -y

We will be using TypeScript for this tutorial. Install the following development dependencies:

COMMAND
npm install --save-dev tsx @types/node
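
The later steps run the application with npm start, but npm init does not create that script for you. Assuming you want to run index.ts directly with tsx, one way to wire it up is:

COMMAND
npm pkg set scripts.start="tsx index.ts"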

To use Auth0 FGA, you also need dotenv, the OpenFGA SDK, and the auth0-ai-js SDK (installed here from its alpha-2 branch on GitHub):

COMMAND
npm install dotenv @openfga/sdk "https://github.com/auth0-lab/auth0-ai-js.git#alpha-2" --save

Finally, install the LlamaIndex.TS SDK:

COMMAND
npm install llamaindex

The application will be structured as follows:

  • index.ts : The main entry point of the application that defines the RAG pipeline.
  • assets/docs/*.md : Sample markdown documents that will be used as context for the LLM. We have public and private documents for demonstration purposes. Download these files from GitHub.

Let us start by creating the RAG pipeline.

RAG Pipeline

The RAG pipeline will be defined in index.ts. It uses LlamaIndex Agents to interact with the underlying LLM and retrieve data from the database.

Here is a visual representation of the RAG pipeline:

RAG pipeline

Create a new file called index.ts and add the following code:

index.ts
import "dotenv/config";
import {
OpenAIAgent,
QueryEngineTool,
VectorStoreIndex,
SimpleDirectoryReader,
} from "llamaindex";
// Once published to NPM, this will become `import { FGARetriever } from "@auth0/ai-llamaindex";`
import { FGARetriever } from "auth0-ai-js/packages/ai-llamaindex/src";
/**
* Demonstrates the usage of the Auth0 FGA (Fine-Grained Authorization)
* with a vector store index to query documents with permission checks.
*
* The FGARetriever checks if the user has the "viewer" relation to the document
* based on predefined tuples in Auth0 FGA.
*
* Example:
* - A tuple {user: "user:*", relation: "viewer", object: "doc:public-doc"} allows all users to view "public-doc".
* - A tuple {user: "user:user1", relation: "viewer", object: "doc:private-doc"} allows "user1" to view "private-doc".
*
* The output of the query depends on the user's permissions to view the documents.
*/
async function main() {
console.log(
"\n..:: LlamaIndex Example: Retrievers with Auth0 FGA (Fine-Grained Authorization)\n\n"
);
// UserID
const user = "user1";
// 1. Read and load documents from the assets folder
const documents = await new SimpleDirectoryReader().loadData("./assets/docs");
// 2. Create an in-memory vector store from the documents using the default OpenAI embeddings
const vectorStoreIndex = await VectorStoreIndex.fromDocuments(documents);
// 3. Create a retriever that uses FGA to gate fetching documents on permissions.
const retriever = FGARetriever.create({
// Set the similarityTopK to retrieve more documents as SimpleDirectoryReader creates chunks
retriever: vectorStoreIndex.asRetriever({ similarityTopK: 30 }),
// FGA tuple to query for the user's permissions
buildQuery: (document) => ({
user: `user:${user}`,
object: `doc:${document.metadata.file_name.split(".")[0]}`,
relation: "viewer",
}),
});
// 4. Create a query engine and convert it into a tool
const queryEngine = vectorStoreIndex.asQueryEngine({ retriever });
const tools = [
new QueryEngineTool({
queryEngine,
metadata: {
name: "zeko-internal-tool",
description: `This tool can answer detailed questions about ZEKO.`,
},
}),
];
// 5. Create an agent using the tools array and OpenAI GPT-4 LLM
const agent = new OpenAIAgent({ tools });
// 6. Query the agent
let response = await agent.chat({ message: "Show me forecast for ZEKO?" });
/**
* Output: `The provided document does not contain any specific forecast information...`
*/
console.log(response.message.content);
/**
* If we add the following tuple to the Auth0 FGA:
*
* { user: "user:user1", relation: "viewer", object: "doc:private-doc" }
*
* Then, the output will be: `The forecast for Zeko Advanced Systems Inc. (ZEKO) for fiscal year 2025...`
*/
}
main().catch(console.error);

FGA Retriever

The FGARetriever class filters documents based on the authorization model defined in Auth0 FGA and will be available as part of the auth0-ai-js SDK. This retriever is a post-search filter, ideal for scenarios where you already have documents in a vector store and want to filter the results based on the user's permissions. Assuming the vector store has already narrowed the results down to a handful of chunks, the FGA retriever further narrows them down to only the documents the user has access to.
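
Conceptually, the post-search filtering works like the sketch below: for each retrieved chunk, ask FGA whether the user has the viewer relation on the corresponding document and drop anything that is not allowed. This is only an illustration of the pattern, not the actual FGARetriever implementation; it assumes an already-configured OpenFgaClient and nodes whose metadata carries a file_name, as in this tutorial.

import { OpenFgaClient } from "@openfga/sdk";
import type { NodeWithScore } from "llamaindex";

// Illustrative post-search filter: keep only the nodes the user is allowed to view.
async function filterByPermissions(
  fgaClient: OpenFgaClient,
  user: string,
  nodes: NodeWithScore[]
): Promise<NodeWithScore[]> {
  const allowed: NodeWithScore[] = [];
  for (const nodeWithScore of nodes) {
    const fileName = nodeWithScore.node.metadata.file_name as string;
    // Same tuple shape the buildQuery function produces: user, relation, object
    const { allowed: canView } = await fgaClient.check({
      user: `user:${user}`,
      relation: "viewer",
      object: `doc:${fileName.split(".")[0]}`,
    });
    if (canView) {
      allowed.push(nodeWithScore);
    }
  }
  return allowed;
}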

The buildQuery function constructs the authorization query sent to the FGA store. The query consists of the user, the object, and the relation: the user is the user ID, the object is the document ID (derived here from the document's file name), and the relation is the permission that the user must have on the document. For example, a chunk loaded from public-doc.md is checked against the object doc:public-doc.

buildQuery: (document) => ({
  user: `user:${user}`,
  object: `doc:${document.metadata.file_name.split(".")[0]}`,
  relation: "viewer",
}),

Retrieval Agent

  1. The queryEngine is created from the vector store index and configured to use our custom FGA retriever. The query engine handles searching through documents and retrieving relevant information based on user queries.

  2. The tools array contains a QueryEngineTool that wraps our query engine. The tool provides a structured interface for the agent to access the query engine's capabilities.

  3. The agent is created using OpenAI's GPT-4 model and the tools array. The agent acts as an intelligent interface between the user and the tools: it understands natural language queries, determines when to use the query engine tool, and formulates responses based on the retrieved information.

/** index.ts **/
const queryEngine = vectorStoreIndex.asQueryEngine({ retriever });
const tools = [
  new QueryEngineTool({
    queryEngine,
    metadata: {
      name: "zeko-internal-tool",
      description: `This tool can answer detailed questions about ZEKO.`,
    },
  }),
];

// 5. Create an agent using the tools array and OpenAI GPT-4 LLM
const agent = new OpenAIAgent({ tools });

Set up an FGA Store

In the Auth0 FGA dashboard, navigate to Settings, and in the Authorized Clients section, click + Create Client. Give your client a name, mark all three client permissions, and then click Create.

Create FGA client

Once your client is created, you’ll see a modal containing Store ID, Client ID, and Client Secret.

Add a .env file with the following content to the root of the project. Click Continue in the modal to also see the FGA_API_URL and FGA_API_AUDIENCE values.

.env
# OpenAI
OPENAI_API_KEY=<your-openai-api-key>
# Auth0 FGA
FGA_STORE_ID=<your-fga-store-id>
FGA_CLIENT_ID=<your-fga-store-client-id>
FGA_CLIENT_SECRET=<your-fga-store-client-secret>
# Required only for non-US regions
FGA_API_URL=https://api.xxx.fga.dev
FGA_API_AUDIENCE=https://api.xxx.fga.dev/

Check the instructions here to find your OpenAI API key.

Next, navigate to Model Explorer. You’ll need to update the model information with this:

schema.fga
model
  schema 1.1

type user

type doc
  relations
    define owner: [user]
    define viewer: [user, user:*]

Remember to click Save.

Check out this documentation to learn more about creating an authorization model in FGA.

Now, to grant access to the public information, you'll need to add a tuple in FGA. Navigate to the Tuple Management section, click + Add Tuple, and fill in the following information:

  • User : user:*
  • Object : select doc and add public-doc in the ID field
  • Relation : viewer

A tuple signifies a user’s relation to a given object. For example, the above tuple implies that all users can view the public-doc object.
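
If you prefer to manage tuples from code rather than the dashboard, the OpenFGA SDK can write the same tuple programmatically. The sketch below is optional and reuses this tutorial's .env values; FGA_API_TOKEN_ISSUER and the fallback API URL are assumptions added for illustration (your FGA dashboard shows the exact issuer and API URL for your store).

import "dotenv/config";
import { CredentialsMethod, OpenFgaClient } from "@openfga/sdk";

// Configure the client from the environment variables used in this tutorial.
// FGA_API_TOKEN_ISSUER is an extra, illustrative variable not defined in the .env above.
const fgaClient = new OpenFgaClient({
  apiUrl: process.env.FGA_API_URL ?? "https://api.us1.fga.dev", // assumed US-region default; confirm in your dashboard
  storeId: process.env.FGA_STORE_ID,
  credentials: {
    method: CredentialsMethod.ClientCredentials,
    config: {
      apiTokenIssuer: process.env.FGA_API_TOKEN_ISSUER!,
      apiAudience: process.env.FGA_API_AUDIENCE ?? "https://api.us1.fga.dev/",
      clientId: process.env.FGA_CLIENT_ID!,
      clientSecret: process.env.FGA_CLIENT_SECRET!,
    },
  },
});

async function seedPublicTuple() {
  // Grant every user the "viewer" relation on the public document.
  await fgaClient.write({
    writes: [{ user: "user:*", relation: "viewer", object: "doc:public-doc" }],
  });
}

seedPublicTuple().catch(console.error);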

Test the Application

Now that you have set up the application and the FGA store, you can run the application using the following command:

COMMAND
npm start

The application will start with the query Show me forecast for ZEKO? Since this information lives in a private document, and we haven't defined a tuple granting access to that document, the application will not be able to retrieve it. The FGA retriever filters the private document out of the vector store results, so you should see output similar to the following:

The provided context does not include specific forecasts or projections for Zeko Advanced Systems Inc. ...

If you change the query to something that is available in the public document, the application will be able to retrieve the information.

Now, to get access to the private information, you'll need to update your tuple list. Go back to the Tuple Management section in the Auth0 FGA dashboard, click + Add Tuple, and fill in the following information:

  • User : user:user1
  • Object : select doc and add private-doc in the ID field
  • Relation : viewer

Now click Add Tuple and then run the script again:

COMMAND
npm start

This time, you should see a response containing the forecast information since we added a tuple that defines the viewer relation for user1 to the private-doc object.

Congratulations! You have run a simple RAG application using LlamaIndex and secured it using Auth0 FGA.

Conclusion

In this tutorial, you learned how to build a secure RAG application using LlamaIndex and Auth0 FGA on Node.js. You saw how to implement RAG for improved LLM accuracy and secure document access with Fine-Grained Authorization.

You can find the complete code for this tutorial on GitHub.

Legal Disclosure

This document and any recommendations within are not legal, privacy, security, compliance, or business advice. This document is intended for general informational purposes only and may not reflect the most current security, privacy, and legal developments or all relevant issues. You are responsible for obtaining legal, security, privacy, compliance, or business advice from your lawyer or other professional advisor and should not rely on the recommendations herein. Okta is not liable to you for any loss or damages that may result from your implementation of any recommendations in this document. Okta makes no representations, warranties, or other assurances regarding the content of this document. Information regarding Okta's contractual assurances to its customers can be found at okta.com/agreements.