8000 Trim file path from Kineto File name to avoid errors when running from another directory by spandoescode · Pull Request #182 · mlcommons/chakra · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Trim file path from Kineto File name to avoid errors when running from another directory #182

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

spandoescode
Copy link

Summary

Using the chakra_trace_link command from a directory other than the one where the Kineto trace is stored raises an error. This is because the absolute file path is directly appended to the argument passed with --chakra-device-trace. The solution is to sanitize the input so that the --chakra-device-trace does not contain any path, just the base file name. This way, the chakra_trace_link command can be used from both the same directory as the Kineto file, as well as any other directory. This is a necessary change when traces from different devices/runs are stored in separate folders.

Test Plan

  • Tested on PACE ICE (RHEL 9.4).
  • Test chakra_trace_link on any trace, in two ways: inside the /logs directory which stores the traces, as well as from outside the /logs directory, passing appropriate path inputs to the chakra_trace_link command.

@spandoescode spandoescode requested a review from a team as a code owner February 7, 2025 22:43
Copy link
github-actions bot commented Feb 7, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@JoongunPark
Copy link
Contributor

Hi @spandoescode,
Could we rebase the PR? I think it would be better if all commits pass the tests and we avoid having the dummy commit.

@spandoescode spandoescode force-pushed the smore39/fix_hta_kineto_file_path branch from 768c878 to a4a0ea2 Compare March 14, 2025 19:19
@spandoescode
Copy link
Author

Hey @JoongunPark, I've rebased the changes, and all the commits are now being passed. Could you please check?

@JoongunPark
Copy link
Contributor
JoongunPark commented Mar 28, 2025

Hi @spandoescode,
Your changes look good to me! Just a quick question—do you know why the changes shown in this commit don't appear to reflect the latest version?
It looks like the latest commit already includes this update: link.

It seems you first merge the latest main branch 8000 into yours, and then send PR to main again.

@spandoescode
Copy link
Author
spandoescode commented Apr 14, 2025

Hey @JoongunPark,

The change that you're showing is the commit that was redone because merge of main into my branch. The final PR only contains the single line change for the file name. From my perspective, the PR was made from the latest main branch into main.

Let me know if you think this is okay, or if I should try anything else.

@JoongunPark
Copy link
Contributor
JoongunPark commented May 16, 2025

Hey @JoongunPark,

The change that you're showing is the commit that was redone because merge of main into my branch. The final PR only contains the single line change for the file name. From my perspective, the PR was made from the latest main branch into main.

Let me know if you think this is okay, or if I should try anything else.

Hi, @spandoescode .
I am sorry for late reply. I wasn't able to reproduce the error you mentioned, but I have tested your code does not harm linking/converting process with the same comment.

Still, I see old commits (e.g., d3243d9) in your PR.
If you want to remove those commits, you can branch again from the latest code and only add one commits on top of that.
But this PR will not harm the functionality as it is I think.

@tushar-krishna @srinivas212 , this PR looks good to me. Could you please check? Thank you!

Test result on my Envs.

...
[2025-05-15 22:40:10,759] trace_link.py:49 [INFO]: Linking process successful. Output file is available at /home/un/Project/trace/resnet-8gpu/test_rank_0.json.
[2025-05-15 22:40:10,759] trace_link.py:50 [INFO]: Please run the chakra_converter for further postprocessing.

Start Converting trace: 0
INFO [05/15/2025 10:40:38 PM] Conversion successful. Output file is available at /home/un/Project/trace/resnet-8gpu/test_rank.0.et.
``

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0