8000 ONNX compatible models · Issue #2640 · flairNLP/flair · GitHub
ONNX compatible models #2640
Closed
@helpmefindaname

Description



Is your feature/enhancement request related to a problem? Please describe.
ONNX support is a frequently requested feature; several issues mention it (#2625, #2451, #2317, #1936, #1423, #999), so there is clearly a strong desire in the community for it.
I suppose the usual ONNX compatibility work would also make the models compatible with torch.jit (#2528) or AWS Neuron (#2443).

ONNX provides large enhancements in terms of production readiness: it creates a static computational graph that can be quantized and optimized for specific hardware, see https://onnxruntime.ai/docs/performance/tune-performance.html (which claims up to 17x speed-ups)

Describe the solution you'd like
I'd suggest iterative progression as multiple architecture changes are required:

  1. split the forward/forward_pass methods, such that every model has a method _prepare_tensors which converts all DataPoints to tensors, and a forward which takes in tensors and outputs tensors (e.g. for the SequenceTagger, forward has the signature def forward(self, sentence_tensor: torch.Tensor, lengths: torch.LongTensor) and returns a single scores tensor)
    this change allows conversion to ONNX models; however, the surrounding logic (like decoding CRF scores, filling in sentence results, extracting tensors) won't be part of the exported model. Also, embeddings won't be part of the ONNX model.
  2. create the same forward/_prepare_tensors architecture for embeddings, such that those could be converted too.
    This would allow converting embeddings to ONNX models, but again without logic.
  3. change the architecture so that, for both embeddings and models, the logic part (creating inputs, adding outputs to data points) and the PyTorch part are split, such that the PyTorch part can be replaced by a converted ONNX model.
  4. create an end-to-end model wrapper, so that embeddings and the model can be converted together into a single ONNX model and used as such.
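The step-1 split described above can be sketched as follows. This is a hedged illustration of the proposed `_prepare_tensors`/`forward` separation, not the real flair SequenceTagger: the DataPoints are stood in for by plain token lists, and the embedding lookup is a zero-tensor placeholder.

```python
# Sketch of the proposed step-1 split: _prepare_tensors handles the
# DataPoint-to-tensor logic, while forward is pure tensor-in/tensor-out
# and therefore ONNX-exportable. All names here mirror the proposal but
# the internals are stand-ins, not flair's actual implementation.
from typing import List, Tuple

import torch
import torch.nn as nn


class SketchSequenceTagger(nn.Module):
    def __init__(self, embedding_dim: int = 8, num_tags: int = 4):
        super().__init__()
        self.embedding_dim = embedding_dim
        self.linear = nn.Linear(embedding_dim, num_tags)

    def _prepare_tensors(
        self, sentences: List[List[str]]
    ) -> Tuple[torch.Tensor, torch.LongTensor]:
        # Convert DataPoints (here: token lists) into a padded batch tensor
        # plus per-sentence lengths. Real code would run flair embeddings.
        lengths = torch.LongTensor([len(s) for s in sentences])
        max_len = int(lengths.max())
        sentence_tensor = torch.zeros(len(sentences), max_len, self.embedding_dim)
        return sentence_tensor, lengths

    def forward(
        self, sentence_tensor: torch.Tensor, lengths: torch.LongTensor
    ) -> torch.Tensor:
        # Pure tensor computation with the signature from the proposal;
        # this is the part that can be traced into an ONNX graph.
        return self.linear(sentence_tensor)


tagger = SketchSequenceTagger().eval()
tensors, lengths = tagger._prepare_tensors([["hello", "world"], ["hi"]])
scores = tagger(tensors, lengths)
print(scores.shape)  # (batch=2, max_seq_len=2, num_tags=4)
```

With this separation, step 3's swap becomes mechanical: the object holding `_prepare_tensors` and the decoding logic stays in Python, while the `forward` half can be replaced by an onnxruntime session that consumes the same tensors.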

Notice that this would be 4 different PRs, each probably very large, and each should be tested thoroughly before moving on to the next one.
I would offer to do the first one and then see how much effort it takes and how much time I have for this.

Metadata

Assignees: No one assigned
Labels: wontfix (This will not be worked on)
Milestone: No milestone