This is awesome - congrats.
Architecturally, it would be great to be able to specify one or more backends and use async to process responses as they come back from any of them. Do you think that is how it would be done, or should each LLM chat live in its own discrete async context? Either way, a parallel pattern would be very useful for evals (rough sketch below).
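
Here's a minimal sketch of the fan-out pattern I mean, assuming `tokio` and `futures`; the `chat` function and backend names are just placeholders, not this crate's actual API:

```rust
use futures::stream::{FuturesUnordered, StreamExt};

#[derive(Debug)]
struct ChatResponse {
    backend: &'static str,
    text: String,
}

// Hypothetical stand-in for a backend's chat call; in the real
// library this would hit the provider's API.
async fn chat(backend: &'static str, prompt: &str) -> ChatResponse {
    ChatResponse { backend, text: format!("reply to '{prompt}'") }
}

#[tokio::main]
async fn main() {
    let prompt = "What is 2 + 2?";
    let backends = ["openai", "anthropic", "local"];

    // One future per backend; FuturesUnordered yields each response
    // as soon as it completes, rather than in submission order.
    let mut responses: FuturesUnordered<_> =
        backends.into_iter().map(|b| chat(b, prompt)).collect();

    while let Some(resp) = responses.next().await {
        // For an eval harness, each response could be scored here
        // as it arrives instead of waiting for the slowest backend.
        println!("{}: {}", resp.backend, resp.text);
    }
}
```

The alternative (one discrete async context per chat) would just spawn each `chat` as its own task and join them, which is simpler but loses the "handle whichever finishes first" behavior.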
Also, do you think it would be possible to make this usable from wasm?