BART, Pegasus, GPT2 model benchmarks are slower compared to vanilla ORT · Issue #234 · hidet-org/hidet · GitHub
BART, Pegasus, GPT2 model benchmarks are slower compared to vanilla ORT #234
Closed
@varshith15

Hey @yaoyaoding!
First of all, amazing work with Hidet!
I have recently been experimenting with Hidet to see if it can outperform ORT.
Surprisingly, ORT with IO binding on an ONNX graph (BART, Pegasus, GPT2), without any graph optimisations, outperforms Hidet's optimised flow graph, even with search space 2 (on an Nvidia A100).
Did you previously run any benchmark comparisons between hidet and ORT? I would love to help debug this!

Also, I have experimented with transformer-deploy, which performs better than both vanilla ORT and Hidet. Replicating the optimisations from transformer-deploy would be a good next step. I would love to help with this as well!
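
For anyone trying to reproduce the comparison described above, here is a minimal sketch of a timing harness that treats each backend the same way (same warmup, same repeat count, median latency). It is not the script used for the numbers in this issue. The names `benchmark` and `speedup` and the stand-in workloads are hypothetical; in a real run each callable would wrap a single inference call (for example an ORT session invoked through `run_with_iobinding`, or a compiled Hidet graph) and would need to synchronise the GPU before returning so that the wall-clock time covers the whole kernel execution.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(run_once: Callable[[], object], warmup: int = 10, repeats: int = 100) -> Dict[str, float]:
    """Time a zero-argument callable and return latency statistics in milliseconds."""
    # Warmup runs are executed but not timed, so one-off costs
    # (lazy initialisation, caches, kernel tuning) stay out of the samples.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()  # assumption: the callable blocks until the work is finished
        samples.append((time.perf_counter() - start) * 1e3)

    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }


def speedup(baseline: Dict[str, float], candidate: Dict[str, float]) -> float:
    """Return how many times faster the candidate is than the baseline (by median)."""
    return baseline["median_ms"] / candidate["median_ms"]


if __name__ == "__main__":
    # Stand-in CPU workloads; replace with the real inference callables.
    baseline_stats = benchmark(lambda: sum(range(200_000)), warmup=3, repeats=20)
    candidate_stats = benchmark(lambda: sum(range(100_000)), warmup=3, repeats=20)
    print("baseline :", baseline_stats)
    print("candidate:", candidate_stats)
    print("speedup  :", round(speedup(baseline_stats, candidate_stats), 2))
```

Reporting the median rather than the mean keeps a few slow outlier iterations from skewing the comparison; the min and max are included so the spread is visible.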

Labels: bug