Add streaming support for zero shot inference by arnavgarg1 · Pull Request #3878 · ludwig-ai/ludwig · GitHub
Add streaming support for zero shot inference #3878


Merged

@arnavgarg1 merged 5 commits into master from streaming_generation on Jan 11, 2024

Conversation

@arnavgarg1 (Contributor) commented Jan 11, 2024

This PR introduces a new boolean flag, streaming, to the LudwigModel.generate() API, which allows users to see output streamed back as it is generated when performing zero-shot inference on single or multiple samples.

Demo

[Video attachment: Screen.Recording.2024-01-11.at.7.43.05.AM.mov]

Config to reproduce the demo video

import yaml
import logging
from ludwig.api import LudwigModel

config = yaml.safe_load(
    """
model_type: llm
base_model: meta-llama/Llama-2-7b-chat-hf

quantization:
  bits: 4

input_features:
  - name: instruction
    type: text

output_features:
  - name: output
    type: text

generation:
  max_new_tokens: 64
  temperature: 0.1

trainer:
  type: none

backend:
  type: local
"""
)

model = LudwigModel(config, logging_level=logging.INFO)

# Single sample - Normal generation
output = model.generate("What is the meaning of life?", generation_config={"max_new_tokens": 32})

# Single sample - Streaming generation
output = model.generate("What is the meaning of life?", generation_config={"max_new_tokens": 32}, streaming=True)

# Multi sample - Normal generation
output = model.generate(["What is the meaning of life?", "What is the weather like in Germany around December?"], generation_config={"max_new_tokens": 64})

# Multi sample - Streaming generation
output = model.generate(["What is the meaning of life?", "What is the weather like in Germany around December?"], generation_config={"max_new_tokens": 64}, streaming=True)
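
For readers curious about the mechanism behind streaming generation, here is a minimal sketch using Hugging Face transformers' TextIteratorStreamer, the standard way to stream tokens out of a blocking generate() call. Whether Ludwig's streaming=True path wraps this exact class internally is an assumption here; the sketch is illustrative, not Ludwig's actual implementation.

```python
# Minimal sketch of token streaming with Hugging Face transformers.
# Assumption: Ludwig's streaming=True path wraps something like
# TextIteratorStreamer; this is illustrative, not Ludwig's code.
from threading import Thread

from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_name = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("What is the meaning of life?", return_tensors="pt")

# skip_prompt=True yields only newly generated tokens, not the echoed prompt.
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)

# generate() blocks until completion, so it runs on a worker thread while
# the main thread consumes decoded text chunks from the streamer iterator.
thread = Thread(
    target=model.generate,
    kwargs=dict(**inputs, streamer=streamer, max_new_tokens=32),
)
thread.start()

for text_chunk in streamer:
    print(text_chunk, end="", flush=True)
thread.join()
```

The key design point is the producer/consumer split: the model thread pushes decoded token text into the streamer's internal queue as each token is sampled, and the caller iterates over the queue, which is what makes output appear incrementally rather than all at once.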

@justinxzhao (Contributor) left a comment

Very cool!

github-actions bot commented Jan 11, 2024

Unit Test Results

6 files (±0) · 6 suites (±0) · 14m 10s ⏱️ (-5s)
12 tests (±0): 9 ✔️ passed (±0), 3 💤 skipped (±0), 0 failed (±0)
60 runs (±0): 42 ✔️ passed (±0), 18 💤 skipped (±0), 0 failed (±0)

Results for commit caa8039. ± Comparison against base commit 22024d7.

♻️ This comment has been updated with latest results.

@arnavgarg1 changed the title from "Add support for zero shot streaming generation" to "Add streaming support for zero shot inference" on Jan 11, 2024
@justinxzhao (Contributor) left a comment

LGTM

@alexsherstinsky (Collaborator) left a comment

@arnavgarg1 This is great -- I just made some minor comments (please let me know whether or not they will help). Thank you!

@arnavgarg1 (Contributor, Author) commented

@alexsherstinsky I think the interface I had defined was poor! Can you take a look now and see whether it makes sense and your comments are addressed?

@alexsherstinsky (Collaborator) left a comment

LGTM -- so nice!

@arnavgarg1 merged commit 926c37e into master on Jan 11, 2024
@arnavgarg1 deleted the streaming_generation branch on January 11, 2024 at 21:51
vijayi1 pushed a commit to vijayi1/ludwig that referenced this pull request Jan 23, 2024