PDFChatbot is a Python application that answers questions about a set of documents using OpenAI's GPT models and Pinecone for vector similarity search. It embeds each query, retrieves the most relevant document sections as context, and passes that context to a chat-based completion model to generate answers with citations.
Features:
- Vector Similarity Search: Uses Pinecone to find relevant document sections based on query embeddings.
- Chat-based Completion: Utilizes OpenAI's GPT models to generate answers from the extracted context.
- Citations: Provides source citations for the generated answers.
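
The three features above fit together as a retrieve-then-answer loop. The following is a minimal sketch of that flow, not the project's actual code: the names `build_prompt`, `answer`, `embed`, `search`, and `complete`, and the match shape with `text` and `source` metadata fields, are all assumptions made for illustration. The OpenAI and Pinecone calls are passed in as plain callables so the sketch runs without either SDK.

```python
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical match shape (an assumption, not the project's schema):
# {"score": 0.87, "metadata": {"text": "...", "source": "report.pdf"}}
Match = Dict[str, Any]


def build_prompt(query: str, matches: Sequence[Match]) -> str:
    """Number each retrieved chunk so the model can cite it as [1], [2], ..."""
    blocks = []
    for i, match in enumerate(matches, start=1):
        meta = match["metadata"]
        blocks.append(f"[{i}] (source: {meta['source']})\n{meta['text']}")
    context = "\n\n".join(blocks)
    return (
        "Answer using only the context below and cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def answer(
    query: str,
    embed: Callable[[str], List[float]],
    search: Callable[[List[float], int], Sequence[Match]],
    complete: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Embed the query, fetch the top_k closest chunks, then ask the chat model."""
    vector = embed(query)
    matches = search(vector, top_k)
    if not matches:
        # Nothing retrieved: skip the model call rather than answer without context.
        return "No relevant context found."
    return complete(build_prompt(query, matches))
```

In a real setup, `embed` would presumably wrap the OpenAI embeddings endpoint, `search` a Pinecone index query that returns metadata for the configured namespace, and `complete` a chat completion request. Injecting them as callables also makes the flow easy to exercise with stand-ins.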
Requirements:
- Python 3.7+
- OpenAI Python SDK
- Pinecone Python SDK
- python-dotenv (imported as dotenv)
- Clone the repository:

      git clone https://github.com/gayanMatch/PDFChatbot.git
      cd PDFChatbot
- Install dependencies:

      pip install -r requirements.txt
- Environment Variables: Create a .env file in the root directory and add your API keys and other configuration:

      OPENAI_API_KEY=your_openai_api_key
      PINECONE_API_KEY=your_pinecone_api_key
      PINECONE_INDEX_NAME=your_pinecone_index_name
      PINECONE_NAMESPACE=your_pinecone_namespace
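
A startup check along the following lines can catch a missing or empty variable before any API call is made. This is a sketch under assumptions, not the project's own loader: `load_env_file` and `load_settings` are hypothetical helper names, and only the four variable names come from the .env example above.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "PINECONE_NAMESPACE",
)


def load_env_file() -> None:
    """Copy values from a .env file into os.environ, if python-dotenv is installed."""
    try:
        from dotenv import load_dotenv
    except ImportError:
        return  # fall back to variables already set in the shell
    load_dotenv()


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required settings, or raise with a list of what is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_env_file()` followed by `load_settings()` at startup would fail fast with a readable message instead of an authentication error deep inside a request.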