Edge-LLM

This project demonstrates how to serve a locally stored language model through a Flask API. You send text to the model in an HTTP request and receive the generated output in the response.

Requirements

Python 3.12+
Install dependencies from requirements.txt:
```
pip install -r requirements.txt
```

Model Setup

Download the Model: Use the huggingface-cli to download the model and save it locally:
```
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include directml/* --local-dir .
```

Running the API

Start the Flask API: Run the api.py script in the command line:
```
python api.py
```
The Flask API will be running at http://127.0.0.1:5000.

Testing the API with Postman

Open Postman and create a POST request to the following endpoint:
```
http://127.0.0.1:5000/generate
```
In the Body tab, select raw, set the format to JSON, and enter the following JSON data:
```
{
  "input": "Tell me a joke."
}
```

You should receive a response similar to this:
```
{
  "input": "Tell me a joke.",
  "output": "Here's a joke..."
}
```
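
For scripted testing without Postman, the same request can be sent from Python. The following is a minimal sketch using only the standard library. The helper names (`build_generate_request`, `parse_generate_response`, `generate`) are hypothetical and not part of this repository, and the sketch assumes the API accepts and returns the `input` and `output` JSON fields shown in the example above.

```python
import json
import urllib.request

# Address taken from the "Running the API" section.
BASE_URL = "http://127.0.0.1:5000"


def build_generate_request(prompt, base_url=BASE_URL):
    """Build the POST request for /generate (hypothetical helper)."""
    body = json.dumps({"input": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_generate_response(raw):
    """Return the generated text, assuming the response shape shown above."""
    payload = json.loads(raw)
    return payload["output"]


def generate(prompt, base_url=BASE_URL, timeout=60):
    """Send the prompt to a running API and return the generated text."""
    request = build_generate_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_generate_response(response.read().decode("utf-8"))
```

With the API running, `print(generate("Tell me a joke."))` should print the model's reply. The timeout of 60 seconds is an arbitrary choice; generation on slower hardware may need a larger value.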

References

ONNX Runtime GenAI - Phi-3 Tutorial
