How to Insert High-Dimensional Matrix data without Protobuf Read Errors #6737
Comments
A protobuf message cannot exceed the size of 2GB. You can leverage the external tensor option to store the data outside of the protobuf file: Line 287 in 3d5acaf
|
Hi @justinchuby, I have already tried saving the model that way, but I still get the same error. |
When you load, did you set load_external_data to False? |
You can also look into using onnxscript.ir for this |
Hi @justinchuby, Yes, I tried this in init, but unfortunately it didn't work. It seems I should try the onnxscript.ir approach you mentioned, following the examples in https://github.com/microsoft/onnxscript/tree/main/examples/pattern_rewriting.py, for instance: model = get_rotary_model(True); rule.apply_to_model(ir_model). Or did I misunderstand? |
Hi @justinchuby, Sorry to bother you, but could you please provide an example that uses onnxscript.ir to insert a customized MatMul node? I would really appreciate it. |
from onnxscript import ir
import numpy as np

def insert_matmul_op(model: ir.Model, target_node_name: str, weight_matrix: np.ndarray):
    # Before: A -> Target
    # After:  {A, initializer} -> MatMul -> Target
    # Find the target node by name
    target_node = model.graph.node(target_node_name)
    # Wrap the weight matrix in a tensor and register it as a graph initializer
    tensor = ir.tensor(weight_matrix, name=f"{target_node.inputs[0].name}_initializer")
    initializer = ir.Input(tensor.name, tensor.shape, ir.TensorType(tensor.dtype))
    initializer.const_value = tensor
    model.graph.register_initializer(initializer)
    # Print the shape of the weight matrix
    print(f"Weight matrix shape: {weight_matrix.shape}")
    # Create the new MatMul node
    new_node = ir.Node("", "MatMul", inputs=[target_node.inputs[0], initializer])
    new_node.outputs[0].name = f"{target_node.inputs[0].name}_mul"
    # Place MatMul before the target node, then feed its output into the target's first input
    target_node.prepend(new_node)
    target_node.replace_input_with(0, new_node.outputs[0])
|
Hi @justinchuby, It seems the code didn't set the initializer's const value (File ".../lib/python3.10/site-packages/onnxscript/ir/_core.py", line 1973, in register_initializer).
Could you help me with this, or do you have any other advice? I would really appreciate it. |
I updated the code, please check. I had missed the line that assigns the tensor. |
Hi @justinchuby, thanks for your example, but it seems to produce the same outcome. I will provide a screenshot and the error later. In the meantime I am trying to insert nodes into smaller models such as phi to test and finish my task. If you have better solutions or other advice, you are welcome to share. |
You may leverage https://github.com/microsoft/onnxscript/blob/main/onnxscript/ir/external_data.py when working with ONNX IR to externalize the big weights. If you save the model as is, there is likely going to be a problem. |
Bug Report
Is the issue related to model conversion?
No, this issue is not related to model conversion. It occurs during the process of modifying an existing ONNX model by inserting nodes.
Describe the bug
I encountered an issue when trying to insert multiple nodes with high-dimensional matrices into an ONNX model of a large language model. Specifically, when inserting nodes with matrices of dimension [11008, 11008] in fp32 format, I can only insert them before the down_proj matmul nodes in up to 4 layers of the LLaMA 2 7B model. However, I can insert smaller matrices, such as [4096, 4096], in all layers without issue. This limitation seems to cause Protobuf to fail to read the model, suggesting a potential issue with ONNX's support for handling large matrices.
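A rough size check is consistent with the 2GB protobuf limit raised in the comments. The sketch below assumes the inserted weights are embedded inline in the serialized ModelProto and ignores everything else in the graph, so it only gives an upper bound. The function names are illustrative.

```python
# protobuf caps a single serialized message at 2 GiB (2**31 - 1 bytes)
PROTOBUF_LIMIT_BYTES = 2**31 - 1


def tensor_bytes(shape, bytes_per_element):
    """Raw payload size of a dense tensor with the given shape and element width."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


def max_inline_tensors(shape, bytes_per_element, limit=PROTOBUF_LIMIT_BYTES):
    """Upper bound on how many such tensors fit inline under the size limit."""
    return limit // tensor_bytes(shape, bytes_per_element)


if __name__ == "__main__":
    shape = (11008, 11008)
    print("bytes per fp32 matrix:", tensor_bytes(shape, 4))
    print("fp32 matrices that fit:", max_inline_tensors(shape, 4))
    print("fp16 matrices that fit:", max_inline_tensors(shape, 2))
```

Under these assumptions each fp32 matrix takes about 485 MB, so four fit and a fifth would not, which matches the observed limit of 4 layers. The fp16 bound comes out at 8, and any other data stored inline in the model would lower it in practice.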
error:
in deserialize_proto
decoded = typing.cast(Optional[int], proto.ParseFromString(serialized))
google.protobuf.message.DecodeError: Error parsing message with type 'onnx.ModelProto'
insert function:
System information
Expected behavior
I expected to be able to insert nodes with high-dimensional matrices into all layers of the model without encountering Protobuf read errors, similar to the behavior observed with smaller matrices.
Notes
Changing the format to fp16 allows insertion into 5-6 layers.
The maximum number of insertable nodes is not affected by the order of the layers.
Attempting to replace a large matrix with multiple smaller matrices still results in a limitation of inserting only up to 4 nodes in the same layer.
Any insights or suggestions on how to address this issue would be greatly appreciated.