AskSmart: A RAG-Powered Intelligent Query System

AskSmart is a retrieval-augmented generation (RAG) document query system. Upload documents in PDF, DOCX, JSON, or TXT format, and the system retrieves the most relevant passages and generates context-aware answers to your queries.

Supported Formats: PDF, DOCX, JSON, TXT


Local Environment Setup

  1. Clone the repository

     ```bash
     git clone https://github.com/Rishi-Jain2602/AskSmart.git
     cd AskSmart
     ```

  2. Create a virtual environment

     ```bash
     cd backend
     virtualenv venv
     venv\Scripts\activate     # Windows
     source venv/bin/activate  # macOS/Linux
     ```

  3. Install the project dependencies

     3.1 In the backend directory, install the Python dependencies:

     ```bash
     pip install -r requirements.txt
     ```

     3.2 Navigate to the frontend directory and install the Node.js dependencies:

     ```bash
     cd ../frontend
     npm install
     ```

  4. Run the React app

     Start the React app with the following command:

     ```bash
     cd frontend
     npm start
     ```

  5. Run the backend (FastAPI app)

     Open a new terminal and run the backend:

     ```bash
     cd backend
     uvicorn main:app --host 0.0.0.0 --port 8000 --reload
     ```

     The server will be running at http://127.0.0.1:8000.

Ingestion Pipeline

This code is responsible for processing and storing documents that are uploaded by users, preparing them for retrieval and generation of context-based responses. Here's how the pipeline works:

  1. File upload and processing based on format: When a user uploads a document (PDF, DOCX, JSON, or TXT), the system selects a loader based on the file extension:

     ```python
     if file_path.endswith(".pdf"):
         loader = PyPDFLoader(file_path)
     elif file_path.endswith(".docx"):
         loader = Docx2txtLoader(file_path)
     elif file_path.endswith(".json"):
         loader = JSONLoader(file_path, jq_schema=".", text_content=False, json_lines=False)
     elif file_path.endswith(".txt"):
         loader = TextLoader(file_path)
     else:
         raise ValueError("Unsupported file format. Please upload a PDF, DOCX, JSON, or TXT file.")

     # Load the document content
     documents = loader.load()
     ```
  2. Document processing: The loaded content is split into smaller, overlapping text chunks so that relevant passages can be retrieved efficiently at query time.

     ```python
     text_splitter = CharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
     docs = text_splitter.split_documents(documents)
     ```
  3. Handling duplicate file names: The system checks whether a collection with the same name already exists. If it does, the old data is deleted and replaced with the new data, preventing conflicts when a file is re-uploaded under the same name.

     ```python
     if client.collections.exists(valid_name):
         client.collections.delete(valid_name)
     ```
  4. Data storage: Each chunk is wrapped in a Weaviate data object with metadata such as the document title, chunk index, and document ID.

     ```python
     data_properties = {
         "Document_title": str(doct_name),
         "chunk": str(chunk),
         "chunk_index": i,
         "DoctID": str(documentID)
     }
     data_object = wvc.data.DataObject(properties=data_properties)
     chunks_list.append(data_object)
     ```
  5. Chunk insertion: The data objects are inserted into Weaviate in batches to reduce round trips and improve performance.

     ```python
     for i in range(0, len(chunks_list), batch_size):
         batch = chunks_list[i:i + batch_size]
         chunks.data.insert_many(batch)
     ```
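The splitting (step 2) and batching (step 5) above are independent of any particular library. A minimal pure-Python sketch of the same mechanics, overlapping character chunks followed by fixed-size batches, looks like this (`chunk_text` and `batched` are illustrative helpers, not functions from this repository):

```python
def chunk_text(text, chunk_size=2000, chunk_overlap=200):
    """Split text into chunks of at most chunk_size characters,
    where consecutive chunks share chunk_overlap characters."""
    if chunk_size <= chunk_overlap:
        raise ValueError("chunk_size must be larger than chunk_overlap")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

def batched(items, batch_size):
    """Yield successive batches of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# 4500 characters with chunk_size=2000 and overlap=200 yields
# spans 0-2000, 1800-3800, 3600-4500: three chunks.
text = "a" * 4500
chunks = chunk_text(text)
assert [len(c) for c in chunks] == [2000, 2000, 900]
assert [len(b) for b in batched(chunks, 2)] == [2, 1]
```

The overlap ensures a sentence falling on a chunk boundary still appears intact in at least one chunk, which is why the pipeline above uses `chunk_overlap=200` rather than disjoint slices.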

API Endpoint Implementation

  1. Document Upload API:

    • Endpoint: POST https://asksmart.onrender.com/Rag/upload
    • Description: Upload a document in PDF, DOCX, JSON, or TXT format. The document is processed and stored in a Weaviate collection.

    Curl command:

    ```bash
    curl -X POST "https://asksmart.onrender.com/Rag/upload" -F "file=@/path/to/your/file.pdf"
    ```

    • Replace /path/to/your/file.pdf with the actual path of the document.
    • Response:

    ```json
    {
      "doct_id": "Generated_collection_name",
      "filename": "yourfile.pdf",
      "message": "Upload and processing complete"
    }
    ```
  2. Chatting API:

    • Endpoint: POST https://asksmart.onrender.com/Rag/chat
    • Description: Query an uploaded document and receive a context-based response.

    Curl command:

    ```bash
    curl -X POST "https://asksmart.onrender.com/Rag/chat" \
      -H "Content-Type: application/json" \
      -d '{
            "user_id": "SESSION_ID_VALUE",
            "query": "CURRENT_INPUT_VALUE",
            "doctID": "STORED_DOC_ID_VALUE"
          }'
    ```

    • Replace SESSION_ID_VALUE, CURRENT_INPUT_VALUE, and STORED_DOC_ID_VALUE with actual values.
    • Response:

    ```json
    {
      "response": "Generated response based on document"
    }
    ```
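The two curl commands above can also be driven from Python. The following is a minimal client sketch using the `requests` library; `upload_document`, `build_chat_payload`, and `chat` are illustrative helper names, not part of this repository:

```python
import requests

BASE_URL = "https://asksmart.onrender.com"

def upload_document(path):
    """POST a file to the upload endpoint; returns the parsed JSON response."""
    with open(path, "rb") as f:
        resp = requests.post(f"{BASE_URL}/Rag/upload", files={"file": f})
    resp.raise_for_status()
    return resp.json()  # contains doct_id, filename, and message

def build_chat_payload(user_id, query, doct_id):
    """Build the JSON body expected by the chat endpoint."""
    return {"user_id": user_id, "query": query, "doctID": doct_id}

def chat(user_id, query, doct_id):
    """POST a query against a previously uploaded document."""
    resp = requests.post(f"{BASE_URL}/Rag/chat",
                         json=build_chat_payload(user_id, query, doct_id))
    resp.raise_for_status()
    return resp.json()["response"]
```

A typical flow is to call `upload_document("report.pdf")` once, keep the returned `doct_id`, and then call `chat(session_id, question, doct_id)` for each follow-up query.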

Backend Deployment on Render:

  • The backend of this project is deployed on Render.

  • You can automate deployment with a render.yaml file to define the environment settings for Render or follow these manual steps:

    • Create a New Web Service:

      • Log into Render and create a new web service for the backend.
      • Connect it to your GitHub repository.
    • Build Command:

      • In the service settings, set the build command to install dependencies and run the FastAPI server:

        ```bash
        cd backend
        pip install -r requirements.txt
        uvicorn main:app --host 0.0.0.0 --port 8000
        ```
    • Environment Variables:

      • Add the necessary environment variables for Weaviate, OpenAI, and other integrations under the "Environment" tab.
    • Deploy:

      • Once configured, deploy the service and obtain the live URL for your backend (e.g., https://asksmart.onrender.com).
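The render.yaml mentioned above could look roughly like the following. This is a sketch under assumptions: the service name, root directory, and environment variable names are illustrative, so consult Render's Blueprint documentation for the exact fields your setup needs.

```yaml
services:
  - type: web
    name: asksmart-backend          # illustrative service name
    runtime: python
    rootDir: backend                # assumes the FastAPI app lives in backend/
    buildCommand: pip install -r requirements.txt
    startCommand: uvicorn main:app --host 0.0.0.0 --port 8000
    envVars:
      - key: WEAVIATE_URL           # assumed variable names; set values
        sync: false                 # in the Render dashboard, not in git
      - key: OPENAI_API_KEY
        sync: false
```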

Frontend Deployment on Vercel:

  • The frontend of this project is deployed on Vercel.

  • You can automate deployment via Vercel or follow these manual steps:

    • Create a New Project:

      • Log into Vercel and create a new project.
      • Select your GitHub repository and choose the frontend folder to deploy as a separate Vercel project.
    • Build Command:

      • In the project settings, set the build command to install dependencies and build the React app:
        cd frontend
        npm install
        npm run build
    • Output Directory:

      • Ensure that the output directory is set to build (as Vercel automatically looks for this folder in React projects).
    • Environment Variables:

      • Set any necessary environment variables, such as API URLs pointing to the backend (e.g., the Render URL).
    • Deploy:

      • Deploy the frontend, and Vercel will provide a live URL (e.g., https://asksmart-frontend.vercel.app).

Note

  1. Make sure you have Python 3.x and npm 10.x installed.
  2. It is recommended to use a virtual environment for the backend to avoid conflicts with other projects.
  3. If you encounter any issues during installation or usage, please contact rishijainai262003@gmail.com or rj1016743@gmail.com.
