Hi! I'm having trouble using instructor and need some help.

First, I confirmed that structured output works well when using instructor with OpenAI's API, as shown below:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel


class IntentSchema(BaseModel):
    intent: str


def structured_output(instruction: str) -> object:
    llm = OpenAI(
        api_key='__my_openai_api_key__'
    )
    client = instructor.from_openai(llm)
    res = client.chat.completions.create(
        model="gpt-4o",
        response_model=IntentSchema,
        messages=[{"role": "user", "content": instruction}],
    )
    return res


prompt = """
Please classify the intent as one of the options below:
- coffee
- beer
- tea
user question: ice americano
"""

structured_output(prompt)
# IntentSchema(intent='coffee')
```

The problem starts here. When I serve the 'Qwen2.5-14B-Instruct-AWQ' model locally with vLLM's OpenAI-compatible server and then apply instructor to it, a Connection Error occurs:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel


class IntentSchema(BaseModel):
    intent: str


def structured_output(instruction: str) -> object:
    llm = OpenAI(
        base_url="___local_api_url_vllm___",
        api_key="__api_key__"
    )
    instructor_client = instructor.from_openai(
        client=llm
    )
    res = instructor_client.chat.completions.create(
        model="Qwen/Qwen2.5-14B-Instruct-AWQ",
        response_model=IntentSchema,
        messages=[
            {"role": "user", "content": instruction}
        ],
        temperature=0.0
    )
    return res


prompt = """
Please classify the intent as one of the options below:
- coffee
- beer
- tea
user question: ice americano
"""

structured_output(prompt)
# InstructorRetryException: Connection error.
```

If you have had a similar experience or know a solution, please help!
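Since the instructor-wrapped client and a plain OpenAI client hit the same endpoint, one way to narrow down a connection error like this is to call the vLLM server directly, without instructor. Here is a minimal sketch, assuming the server runs at `http://localhost:8000/v1` (a hypothetical URL, substitute your actual base_url); if this also fails, the problem is the server or network rather than instructor:

```python
# Minimal connectivity check against the local vLLM OpenAI-compatible server,
# bypassing instructor entirely. The base_url below is an assumption.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local vLLM endpoint
    api_key="EMPTY",                      # vLLM typically accepts a placeholder key unless --api-key is set
)

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-14B-Instruct-AWQ",
    messages=[{"role": "user", "content": "Say hello"}],
    temperature=0.0,
)
print(resp.choices[0].message.content)
```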
I solved it. In my case, I found that instructor does not work when the speculative decoding option is enabled while serving the local LLM with vLLM. I will close this discussion!
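For anyone hitting the same issue, a minimal sketch of launching the vLLM OpenAI-compatible server without speculative decoding is shown below. This is an illustration, not the exact command used here: the speculative-decoding flag names vary between vLLM versions (e.g. `--speculative-model`/`--num-speculative-tokens` in older releases, `--speculative-config` in newer ones), and speculative decoding is only active when such flags are passed, so simply omitting them serves the model without it.

```python
# Sketch: start the vLLM OpenAI-compatible server WITHOUT speculative decoding
# (no speculative flags passed). Assumes the `vllm` CLI is installed and on PATH.
import subprocess

server = subprocess.Popen([
    "vllm", "serve", "Qwen/Qwen2.5-14B-Instruct-AWQ",
    "--port", "8000",
    # intentionally no speculative-decoding options here
])

# ...point instructor's OpenAI client at http://localhost:8000/v1...
# server.terminate() when done
```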