-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Make debug_pkl smaller by only emitting unique traces. #72596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CI Flow Status⚛️ CI FlowRuleset - Version:
|
🔗 Helpful links
💊 CI failures summary and remediationsAs of commit 96db16e (more details on the Dr. CI page):
🕵️ 4 new failures recognized by patternsThe following CI failures do not appear to be due to upstream breakages:
|
This pull request was exported from Phabricator. Differential Revision: D33994011 |
This pull request was exported from Phabricator. Differential Revision: D33994011 |
1d35b17
to
41c9309
Compare
This pull request was exported from Phabricator. Differential Revision: D33994011 |
41c9309
to
b0a9daf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mentioned that unit tests are used as test plans, but I don't see a checked in unit test. Is it left out of commit accidentally?
serialized = c10::ivalue::Tuple::create( | ||
{s->text(), s->filename(), (int64_t)s->starting_line_no()}); | ||
{text_pos, fname_pos, (int64_t)s->starting_line_no()}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this will dedup source code, could we run a one-time manual test on that ads model to see how much saving we can get?
How about the stack trace part in debug_pkl? Could we compress them too? It looks like a lot of the strings have redundancy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this will dedup source code, could we run a one-time manual test on that ads model to see how much saving we can get?
I'll get help from the ads folks to see how to do this.
How about the stack trace part in debug_pkl? Could we compress them too? It looks like a lot of the strings have redundancy.
yes. SourceView object has string field "text", string field "filename". Sometimes the "text" field is stuffed with source code and sometimes stuffed with stack trace. I initially thought it's always source code until I unpicked and printed out what is in there and saw stack trace.
This pull request was exported from Phabricator. Differential Revision: D33994011 |
b0a9daf
to
9ea1cd4
Compare
9ea1cd4
to
b42cbca
Compare
This pull request was exported from Phabricator. Differential Revision: D33994011 |
I meant existing unit tests that do save/load sequences. Ex. test_jit |
This pull request was exported from Phabricator. Differential Revision: D33994011 |
32d5f8b
to
5ab3a46
Compare
This pull request was exported from Phabricator. Differential Revision: D33994011 |
This pull request was exported from Phabricator. Differential Revision: D33994011 |
a2e59ed
to
bb26508
Compare
bb26508
to
38107a3
Compare
This pull request was exported from Phabricator. Differential Revision: D33994011 |
1 similar comment
This pull request was exported from Phabricator. Differential Revision: D33994011 |
38107a3
to
0a35885
Compare
// A stringlike class backed by a vector of string_view | ||
// the string represented are logically the concatenation of the string_views | ||
// This has advantage of not needing continues memory. | ||
struct StringCordView { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider moving some of the heavy implementation of methods to cpp.
Additionally, are all the methods meant to be public?
}; | ||
|
||
// the interface | ||
struct SourceView { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like SourceView
is pure virtual now with Source
being the only class implementing it. If so, why do we need SourceView
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, we don't need.
SchemaParser(const std::string& str) | ||
: L(std::make_shared<SourceView>(c10::string_view(str))), | ||
explicit SchemaParser(const std::string& str) | ||
: L(std::make_shared<Source>(c10::string_view(str))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean we are copying entire string (within Source) when schema parser is created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Source created with string_view don't copy (equiv to old SourceView) and Source created with std::string copies (equiv to old Source)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, source
may or may not own the underlying string and that is dangerous. Could we preserve the original inheritance structure so that it is clear whether ownership is needed or not?
This pull request was exported from Phabricator. Differential Revision: D33994011 |
0a35885
to
e7c07b2
Compare
This pull request was exported from Phabricator. Differential Revision: D33994011 |
e7c07b2
to
848ecb5
Compare
Summary: Pull Request resolved: pytorch#72596 debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings. Since many SourceRange shares the same source, the string for trace can be deduped. The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression). The above helps with smaller file size. On loading, if we copy each trace to Source as string the runtime memory would still blowup. To mitigate this, we use SourceView directly instead of source which will take the reference of string inside of Deserializer and make that into string_view. This is safe because Deserializer is hold by Unpickler by shared_ptr, and Unpickler is also hold by shared_ptr by another Source object. That Source object will be alive during the model construction. Test Plan: unit test Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents: ``` [qihan@devvm5585.vll0 ~]$ du archive -h 4.0K archive/xl_model_weights 3.7M archive/extra 8.0K archive/code/__torch__/caffe2/torch/fb/model_transform/splitting 8.0K archive/code/__torch__/caffe2/torch/fb/model_transform 8.0K archive/code/__torch__/caffe2/torch/fb 8.0K archive/code/__torch__/caffe2/torch 8.0K archive/code/__torch__/caffe2 20M archive/code/__torch__/torch/fx/graph_module 20M archive/code/__torch__/torch/fx 8.0K archive/code/__torch__/torch/classes 20M archive/code/__torch__/torch 20M archive/code/__torch__ 20M archive/code 2.7M archive/constants 35M archive [qihan@devvm5585.vll0 ~]$ du resaved -h 4.0K resaved/extra 8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting 8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform 8.0K resaved/code/__torch__/caffe2/torch/fb 8.0K resaved/code/__torch__/caffe2/torch 8.0K resaved/code/__torch__/caffe2 1.3M resaved/code/__torch__/torch/fx/graph_module 1.3M resaved/code/__torch__/torch/fx 8.0K resaved/code/__torch__/torch/classes 1.4M resaved/code/__torch__/torch 1.4M resaved/code/__torch__ 1.4M resaved/code 2.7M resaved/constants 13M resaved [qihan@devvm5585.vll0 ~]$ ``` Reviewed By: JasonHanwen Differential Revision: D33994011 fbshipit-source-id: a6db7e1d4c616ced4442ec06936ae005bf269212
This pull request was exported from Phabricator. Differential Revision: D33994011 |
848ecb5
to
96db16e
Compare
Summary: Pull Request resolved: #72596 debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings. Since many SourceRange shares the same source, the string for trace can be deduped. The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression). The above helps with smaller file size. On loading, if we copy each trace to Source as string the runtime memory would still blowup. To mitigate this, we use SourceView directly instead of source which will take the reference of string inside of Deserializer and make that into string_view. This is safe because Deserializer is hold by Unpickler by shared_ptr, and Unpickler is also hold by shared_ptr by another Source object. That Source object will be alive during the model construction. Test Plan: unit test Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents: ``` [qihan@devvm5585.vll0 ~]$ du archive -h 4.0K archive/xl_model_weights 3.7M archive/extra 8.0K archive/code/__torch__/caffe2/torch/fb/model_transform/splitting 8.0K archive/code/__torch__/caffe2/torch/fb/model_transform 8.0K archive/code/__torch__/caffe2/torch/fb 8.0K archive/code/__torch__/caffe2/torch 8.0K archive/code/__torch__/caffe2 20M archive/code/__torch__/torch/fx/graph_module 20M archive/code/__torch__/torch/fx 8.0K archive/code/__torch__/torch/classes 20M archive/code/__torch__/torch 20M archive/code/__torch__ 20M archive/code 2.7M archive/constants 35M archive [qihan@devvm5585.vll0 ~]$ du resaved -h 4.0K resaved/extra 8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting 8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform 8.0K resaved/code/__torch__/caffe2/torch/fb 8.0K resaved/code/__torch__/caffe2/torch 8.0K resaved/code/__torch__/caffe2 1.3M resaved/code/__torch__/torch/fx/graph_module 1.3M resaved/code/__torch__/torch/fx 8.0K resaved/code/__torch__/torch/classes 1.4M resaved/code/__torch__/torch 1.4M resaved/code/__torch__ 1.4M resaved/code 2.7M resaved/constants 13M resaved [qihan@devvm5585.vll0 ~]$ ``` Reviewed By: JasonHanwen Differential Revision: D33994011 fbshipit-source-id: 8e6224c6e942e91c3403f686c8f0937d1002ed41
Hey @qihqi. |
This pull request has been reverted by 3bd1507. To re-land this change, please open another pull request, assignthe same reviewers, fix the CI failures that caused the revert and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk). |
This pull request has been reverted by 3bd1507. To re-land this change, please open another pull request, assignthe same reviewers, fix the CI failures that caused the revert and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk). |
Summary: Pull Request resolved: pytorch/pytorch#72596 debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings. Since many SourceRange shares the same source, the string for trace can be deduped. The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression). The above helps with smaller file size. On loading, if we copy each trace to Source as string the runtime memory would still blowup. To mitigate this, we use SourceView directly instead of source which will take the reference of string inside of Deserializer and make that into string_view. This is safe because Deserializer is hold by Unpickler by shared_ptr, and Unpickler is also hold by shared_ptr by another Source object. That Source object will be alive during the model construction. Test Plan: unit test Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents: ``` [qihan@devvm5585.vll0 ~]$ du archive -h 4.0K archive/xl_model_weights 3.7M archive/extra 8.0K archive/code/__torch__/caffe2/torch/fb/model_transform/splitting 8.0K archive/code/__torch__/caffe2/torch/fb/model_transform 8.0K archive/code/__torch__/caffe2/torch/fb 8.0K archive/code/__torch__/caffe2/torch 8.0K archive/code/__torch__/caffe2 20M archive/code/__torch__/torch/fx/graph_module 20M archive/code/__torch__/torch/fx 8.0K archive/code/__torch__/torch/classes 20M archive/code/__torch__/torch 20M archive/code/__torch__ 20M archive/code 2.7M archive/constants 35M archive [qihan@devvm5585.vll0 ~]$ du resaved -h 4.0K resaved/extra 8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting 8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform 8.0K resaved/code/__torch__/caffe2/torch/fb 8.0K resaved/code/__torch__/caffe2/torch 8.0K resaved/code/__torch__/caffe2 1.3M resaved/code/__torch__/torch/fx/graph_module 1.3M resaved/code/__torch__/torch/fx 8.0K resaved/code/__torch__/torch/classes 1.4M resaved/code/__torch__/torch 1.4M resaved/code/__torch__ 1.4M resaved/code 2.7M resaved/constants 13M resaved [qihan@devvm5585.vll0 ~]$ ``` Reviewed By: JasonHanwen Differential Revision: D33994011 fbshipit-source-id: 8e6224c6e942e91c3403f686c8f0937d1002ed41 (cherry picked from commit a7014dd4029308c95007f362a57c31796d686647)
Summary: Pull Request resolved: pytorch/pytorch#72596 debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings. Since many SourceRange shares the same source, the string for trace can be deduped. The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression). The above helps with smaller file size. On loading, if we copy each trace to Source as string the runtime memory would still blowup. To mitigate this, we use SourceView directly instead of source which will take the reference of string inside of Deserializer and make that into string_view. This is safe because Deserializer is hold by Unpickler by shared_ptr, and Unpickler is also hold by shared_ptr by another Source object. That Source object will be alive during the model construction. Test Plan: unit test Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents: ``` [qihan@devvm5585.vll0 ~]$ du archive -h 4.0K archive/xl_model_weights 3.7M archive/extra 8.0K archive/code/__torch__/caffe2/torch/fb/model_transform/splitting 8.0K archive/code/__torch__/caffe2/torch/fb/model_transform 8.0K archive/code/__torch__/caffe2/torch/fb 8.0K archive/code/__torch__/caffe2/torch 8.0K archive/code/__torch__/caffe2 20M archive/code/__torch__/torch/fx/graph_module 20M archive/code/__torch__/torch/fx 8.0K archive/code/__torch__/torch/classes 20M archive/code/__torch__/torch 20M archive/code/__torch__ 20M archive/code 2.7M archive/constants 35M archive [qihan@devvm5585.vll0 ~]$ du resaved -h 4.0K resaved/extra 8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting 8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform 8.0K resaved/code/__torch__/caffe2/torch/fb 8.0K resaved/code/__torch__/caffe2/torch 8.0K resaved/code/__torch__/caffe2 1.3M resaved/code/__torch__/torch/fx/graph_module 1.3M resaved/code/__torch__/torch/fx 8.0K resaved/code/__torch__/torch/classes 1.4M resaved/code/__torch__/torch 1.4M resaved/code/__torch__ 1.4M resaved/code 2.7M resaved/constants 13M resaved [qihan@devvm5585.vll0 ~]$ ``` Reviewed By: JasonHanwen Differential Revision: D33994011 fbshipit-source-id: 8e6224c6e942e91c3403f686c8f0937d1002ed41 (cherry picked from commit a7014dd4029308c95007f362a57c31796d686647)
Summary:
debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings.
Since many SourceRange contains similar looking strings, they can be deduped.
The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression).
Test Plan: unit test
Differential Revision: D33994011