This project demonstrates how to use a language model with a Flask API. It allows you to send text inputs to the model and receive generated outputs via HTTP requests.
- Python 3.12+
- Install dependencies from `requirements.txt`: `pip install -r requirements.txt`
- Download the model: use `huggingface-cli` to download the model and save it locally: `huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include directml/* --local-dir .`
- Start the Flask API: run the `api.py` script from the command line: `python api.py`
- The Flask API will be running at `http://127.0.0.1:5000`.
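
Before sending requests, it can help to confirm that something is actually accepting connections on that address. The following is a minimal sketch using only the Python standard library; the helper name `is_listening` is hypothetical and not part of this project. It only checks that the TCP port is open, not that the model has finished loading.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: the defaults match the address given in this README.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_listening()` after `python api.py` has started should return `True`; if it returns `False`, the server is probably not running or is bound to a different port.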
- Open Postman and create a POST request to the following endpoint: `http://127.0.0.1:5000/generate`
- In the Body tab, set the request type to JSON and enter the following JSON data: `{ "input": "Tell me a joke." }`
- You should receive a response similar to this: `{ "input": "Tell me a joke.", "output": "Here's a joke..." }`
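
If you prefer a script to Postman, the same request can be sent from Python. This is a rough sketch using only the standard library. It assumes the endpoint and the JSON shape shown above (`input` in the request, `input` and `output` in the response). The function names `build_request` and `generate` and the 60-second timeout are illustrative choices, not part of this project.

```python
import json
import urllib.request

# Endpoint taken from the steps above.
API_URL = "http://127.0.0.1:5000/generate"


def build_request(prompt: str, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request whose JSON body is {"input": prompt}."""
    body = json.dumps({"input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, url: str = API_URL, timeout: float = 60.0) -> dict:
    """Send the prompt to the API and return the decoded JSON response.

    Assumes the server replies with a JSON object, as in the example above.
    """
    with urllib.request.urlopen(build_request(prompt, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `generate("Tell me a joke.")["output"]` should hold the generated text. Generation may take longer on slower hardware, so the timeout may need to be raised.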