New iteration on convert HF, fix some models, support Qwen3/Qwen3MoE by francoishernandez · Pull Request #238 · eole-nlp/eole · GitHub

New iteration on convert HF, fix some models, support Qwen3/Qwen3MoE #238

Draft · wants to merge 14 commits into base: main
Conversation

francoishernandez (Member) commented May 7, 2025

We'll probably never have a perfect solution to handle every HF case, but it doesn't hurt to keep rationalizing a few things.

Addressed topics

  • centralize mappings and configs in a separate file (a rough sketch of what such a mapping can look like follows this list)
  • clarify the encoder/decoder key mappings (previously, decoder entries lived in the root mapping while encoder entries sat under a dedicated key)
  • first-shard params are grabbed transparently from the mapping root, instead of relying on a fixed set that is a hassle to maintain
  • move model-specific config flags to the "config" key of the main mapping
  • simplify the shard-building loop (ongoing -- should we loop over params/mapping entries instead of checkpoints?)
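
For illustration only, here is a rough Python sketch of how such a centralized, per-architecture mapping could be organized; the key names and eole-side parameter names below are hypothetical placeholders, not the actual converter structures:

```python
# Hypothetical sketch of a centralized per-architecture mapping.
# None of the key or parameter names below are taken from the actual eole code;
# they only illustrate the layout described above.
QWEN3_LIKE_MAPPING = {
    # Root entries: first-shard / shared params (embeddings, final norm, lm_head),
    # picked up transparently instead of being listed in a hard-coded set.
    "model.embed_tokens.weight": "tgt_emb.embeddings.weight",
    "model.norm.weight": "decoder.layer_norm.weight",
    "lm_head.weight": "generator.weight",
    # Per-layer decoder weights under their own key, so encoder/decoder roles are explicit.
    "decoder": {
        "model.layers.{i}.input_layernorm.weight":
            "decoder.transformer_layers.{i}.input_layernorm.weight",
    },
    # Encoder weights (when the architecture has one) under a dedicated key as well.
    "encoder": {},
    # Model-specific flags grouped under a single "config" key.
    "config": {
        "post_attention_layernorm": True,
    },
}
```

Grouping the flags under "config" keeps each model's weight-name mapping and its architecture switches in one place.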

Some notes:

  1. While testing this, I checked Mixtral quickly, and it appears to have been broken for a while (even before the previous refactoring); not sure if we'll fix this here or later. EDIT: MoE (Mixtral/Qwen3) seems fine after a few patches, but AWQ is not -- though it is deprecated, so not sure we want to dive back into it (we might be better off investigating llm-compressor, which replaces it).
  2. Did not test all architectures yet (e.g. gpt2/nllb/xlmroberta). EDIT: only XLM-RoBERTa remains not fully tested.
  3. The transformer decoder refactoring a while ago introduced post_attention_layernorm, which should probably be made optional (e.g. for phi-2). EDIT: introduced a post_attention_layernorm flag (default True); see the sketch after this list.
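
A minimal sketch of what gating the norm on such a flag could look like; the layer class and surrounding names here are hypothetical, only the post_attention_layernorm flag name and its default come from this PR:

```python
import torch.nn as nn


class DecoderLayerSketch(nn.Module):
    """Illustrative pre-norm decoder layer where the post-attention layernorm is optional."""

    def __init__(self, hidden_size, self_attn, mlp, post_attention_layernorm=True):
        super().__init__()
        self.input_layernorm = nn.LayerNorm(hidden_size)
        self.self_attn = self_attn  # any attention module mapping (B, T, H) -> (B, T, H)
        self.mlp = mlp              # any feed-forward module mapping (B, T, H) -> (B, T, H)
        # When the flag is off (e.g. phi-2-style layers), the norm becomes a no-op.
        self.post_attention_layernorm = (
            nn.LayerNorm(hidden_size) if post_attention_layernorm else nn.Identity()
        )

    def forward(self, x):
        # Self-attention block with residual connection.
        h = x + self.self_attn(self.input_layernorm(x))
        # Feed-forward block; its input is normalized only if post_attention_layernorm=True.
        return h + self.mlp(self.post_attention_layernorm(h))
```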

francoishernandez changed the title from "New iteration on convert HF" to "New iteration on convert HF, fix some models, support Qwen3/Qwen3MoE" on May 15, 2025