Description
What would you like to be added:
It would be great to support benchmarking LLM throughput and latency across different serving backends.
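A minimal sketch of what such a benchmark harness could measure, assuming a generic `generate` callable as a stand-in for any backend client (the function and key names here are illustrative, not part of any existing API):

```python
import statistics
import time
from typing import Callable, List


def benchmark(generate: Callable[[str], str], prompts: List[str]) -> dict:
    """Run each prompt through `generate`, recording per-request latency.

    Returns summary stats: mean and median latency (seconds) and
    overall throughput (requests per second).
    """
    latencies = []
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        generate(prompt)  # a real harness would call the backend here
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_s": statistics.mean(latencies),
        "p50_latency_s": statistics.median(latencies),
        "throughput_rps": len(prompts) / elapsed,
    }


# Example with a stand-in backend; swap in a real client call to compare backends.
def mock_backend(prompt: str) -> str:
    return prompt.upper()


stats = benchmark(mock_backend, ["hello world"] * 10)
print(sorted(stats))
```

A real harness would also need concurrency control and token-level metrics (e.g. time-to-first-token for streaming backends), but per-request latency and aggregate requests/second are the core measurements.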
Why is this needed:
Give users reproducible performance numbers they can use to compare backends.
Completion requirements:
This enhancement requires the following artifacts:
- Design doc
- API change
- Docs update
The artifacts should be linked in subsequent comments.