
Quantization warning: Token indices sequence length is longer than the specified maximum sequence length for this model (1085165 > 16384). Running this sequence through the model will result in indexing errors #2866

Closed
lingyezhixing opened this issue Dec 8, 2024 · 3 comments

@lingyezhixing

While quantizing InternVL2_5-8B, my network connection was unstable, so I first downloaded the ptb dataset and loaded it from local disk for calibration. The following warning then appeared: Token indices sequence length is longer than the specified maximum sequence length for this model (1085165 > 16384). Running this sequence through the model will result in indexing errors.
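For reference, a minimal sketch of one way to cache ptb once and reload it offline with the `datasets` library (the dataset id and config follow the Hugging Face `ptb_text_only` dataset; the local paths are illustrative, and the calibration loader would still need to be pointed at the local copy):

```python
# Sketch: cache the ptb calibration dataset once, then reuse it offline.
# Paths are illustrative.
from datasets import load_dataset, load_from_disk

# Run once while the network is up:
ptb = load_dataset("ptb_text_only", "penn_treebank")
ptb.save_to_disk("D:/LLM/datasets/ptb")

# Afterwards, fully offline:
ptb = load_from_disk("D:/LLM/datasets/ptb")
print(ptb["train"][0]["sentence"])
```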

@lingyezhixing (Author)

(lmdeploy) C:\Users\31940>lmdeploy lite auto_awq D:\LLM\models\LLM\OpenGVLab\InternVL2_5-8B --calib-dataset ptb --calib-samples 128 --calib-seqlen 2048 --w-bits 4 --w-group-size 128 --batch-size 1 --work-dir D:\LLM\models\LLM\OpenGVLab\InternVL2_5-8B-AWQ
2024-12-08 15:23:36,569 - lmdeploy - INFO - builder.py:64 - matching vision model: InternVLVisionModel
E:\Programming\pycodes\miniconda3\envs\lmdeploy\lib\site-packages\timm\models\layers\__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
warnings.warn(f"Importing from {name} is deprecated, please import via timm.layers", FutureWarning)
InternLM2ForCausalLM has generative capabilities, as prepare_inputs_for_generation is explicitly overwritten. However, it doesn't directly inherit from GenerationMixin. From 👉v4.50👈 onwards, PreTrainedModel will NOT inherit from GenerationMixin, and this model will lose the ability to call generate and other related functions.

  • If you're using trust_remote_code=True, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  • If you are the owner of the model architecture code, please modify your model class such that it inherits from GenerationMixin (after PreTrainedModel, otherwise you'll get an exception).
  • If you are not the owner of the model architecture class, please contact the model code owner to update it.
2024-12-08 15:24:03,429 - lmdeploy - INFO - internvl.py:120 - using InternVL-Chat-V1-5 vision preprocess
Move model.tok_embeddings to GPU.
Move model.layers.0 to CPU.
Move model.layers.1 to CPU.
Move model.layers.2 to CPU.
Move model.layers.3 to CPU.
Move model.layers.4 to CPU.
Move model.layers.5 to CPU.
Move model.layers.6 to CPU.
Move model.layers.7 to CPU.
Move model.layers.8 to CPU.
Move model.layers.9 to CPU.
Move model.layers.10 to CPU.
Move model.layers.11 to CPU.
Move model.layers.12 to CPU.
Move model.layers.13 to CPU.
Move model.layers.14 to CPU.
Move model.layers.15 to CPU.
Move model.layers.16 to CPU.
Move model.layers.17 to CPU.
Move model.layers.18 to CPU.
Move model.layers.19 to CPU.
Move model.layers.20 to CPU.
Move model.layers.21 to CPU.
Move model.layers.22 to CPU.
Move model.layers.23 to CPU.
Move model.layers.24 to CPU.
Move model.layers.25 to CPU.
Move model.layers.26 to CPU.
Move model.layers.27 to CPU.
Move model.layers.28 to CPU.
Move model.layers.29 to CPU.
Move model.layers.30 to CPU.
Move model.layers.31 to CPU.
Move model.norm to GPU.
Move output to CPU.
Loading calibrate dataset ...
Token indices sequence length is longer than the specified maximum sequence length for this model (1085165 > 16384). Running this sequence through the model will result in indexing errors
model.layers.0, samples: 128, max gpu memory: 7.14 GB
model.layers.1, samples: 128, max gpu memory: 9.14 GB
model.layers.2, samples: 128, max gpu memory: 9.14 GB
model.layers.3, samples: 128, max gpu memory: 9.14 GB
model.layers.4, samples: 128, max gpu memory: 9.14 GB
model.layers.5, samples: 128, max gpu memory: 9.14 GB
model.layers.6, samples: 128, max gpu memory: 9.14 GB
model.layers.7, samples: 128, max gpu memory: 9.14 GB
model.layers.8, samples: 128, max gpu memory: 9.14 GB
model.layers.9, samples: 128, max gpu memory: 9.14 GB
model.layers.10, samples: 128, max gpu memory: 9.14 GB
model.layers.11, samples: 128, max gpu memory: 9.14 GB
model.layers.12, samples: 128, max gpu memory: 9.14 GB
model.layers.13, samples: 128, max gpu memory: 9.14 GB
model.layers.14, samples: 128, max gpu memory: 9.14 GB
model.layers.15, samples: 128, max gpu memory: 9.14 GB
model.layers.16, samples: 128, max gpu memory: 9.14 GB
model.layers.17, samples: 128, max gpu memory: 9.14 GB
model.layers.18, samples: 128, max gpu memory: 9.14 GB
model.layers.19, samples: 128, max gpu memory: 9.14 GB
model.layers.20, samples: 128, max gpu memory: 9.14 GB
model.layers.21, samples: 128, max gpu memory: 9.14 GB
model.layers.22, samples: 128, max gpu memory: 9.14 GB
model.layers.23, samples: 128, max gpu memory: 9.14 GB
model.layers.24, samples: 128, max gpu memory: 9.14 GB
model.layers.25, samples: 128, max gpu memory: 9.14 GB

@AllentDan (Collaborator)

Just ignore the warning. It does not affect the quantization.
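For anyone curious why it is safe to ignore (my understanding of the flow, not an authoritative description of lmdeploy internals): the calibration loader tokenizes the entire joined ptb corpus as one long sequence, which is what trips the tokenizer's model_max_length check (1085165 > 16384), and only afterwards slices it into --calib-seqlen windows, so the model itself never sees the over-long sequence. A minimal sketch that reproduces the warning under these assumptions:

```python
# Sketch reproducing the tokenizer length warning (assumes transformers is
# installed; the repeated text stands in for the concatenated ptb corpus).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "OpenGVLab/InternVL2_5-8B", trust_remote_code=True)

corpus = "a sentence from ptb\n\n" * 200000  # far longer than model_max_length
enc = tokenizer(corpus, return_tensors="pt")  # the length warning fires here
ids = enc.input_ids

# Calibration then takes fixed-length windows, so nothing longer than
# --calib-seqlen (2048 in the command above) ever reaches the model:
seqlen = 2048
chunks = [ids[:, i:i + seqlen] for i in range(0, ids.shape[1], seqlen)]
print(len(chunks), chunks[0].shape)
```

The warning fires at the tokenizer call, not during the forward pass; every tensor fed to the model is at most seqlen tokens, which is why the quantization result is unaffected.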
