vine: more detailed cache-invalid messages by JinZhou5042 · Pull Request #4147 · cooperative-computing-lab/cctools · GitHub

vine: more detailed cache-invalid messages #4147


Merged

Conversation

JinZhou5042
Member

Proposed Changes

Give an overall description of the changes, along with the context and motivation.
Mention relevant issues and pull requests as needed.

Merge Checklist

The following items must be completed before PRs can be merged.
Check these off to verify you have completed all steps.

  • make test Run local tests prior to pushing.
  • make format Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python)
  • make lint Run lint on source code prior to pushing.
  • Manual Update: Update the manual to reflect user-visible changes.
  • Type Labels: Select a github label for the type: bugfix, enhancement, etc.
  • Product Labels: Select a github label for the product: TaskVine, Makeflow, etc.
  • PR RTM: Mark your PR as ready to merge.

@JinZhou5042 JinZhou5042 reopened this Apr 26, 2025
@JinZhou5042 JinZhou5042 self-assigned this Apr 26, 2025
Member
@btovar btovar left a comment

The PR seems to modify more than just error messages. Could it be cleaned up to include only the changes to the messages?

@btovar btovar added this to the 7.15.4 milestone May 2, 2025
@JinZhou5042
Member Author

These are the main things in this pr:

  • added an extra error_message parameter to vine_transfer_get_any, vine_transfer_request_any, and vine_transfer_get_dir_internal to propagate error messages across different layers of the transfer stack.

  • before assigning a new error message, we check whether it is NULL to ensure that we preserve the first error in a chain of failures.

  • if the transfer times out, we now measure and report how long the connection attempt lasted.
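The first two points can be sketched as follows. This is a minimal illustration of the pattern, not the code from this PR: `set_error_once`, `fetch_one`, and `fetch_any` are hypothetical stand-ins for the real transfer functions, and the message strings are invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: record a message only if none has been stored yet,
 * so the first failure in a chain is the one that gets reported. */
void set_error_once(char **error_message, const char *msg)
{
	assert(msg != NULL);
	if (error_message && *error_message == NULL) {
		size_t n = strlen(msg) + 1;
		*error_message = malloc(n);
		if (*error_message) {
			memcpy(*error_message, msg, n);
		}
	}
}

/* Lowest layer (stand-in): always fails here and records the root cause. */
int fetch_one(const char *source, char **error_message)
{
	char buf[256];
	snprintf(buf, sizeof(buf), "could not connect to %s", source);
	set_error_once(error_message, buf);
	return 0;
}

/* Upper layer (stand-in): passes the same pointer down, then tries to add
 * a more generic message, which is ignored because one is already set. */
int fetch_any(const char *source, char **error_message)
{
	if (!fetch_one(source, error_message)) {
		set_error_once(error_message, "transfer failed");
		return 0;
	}
	return 1;
}
```

A caller would start with `char *err = NULL;`, call `fetch_any("worker-a:9123", &err)`, report `err` if the call fails, and then `free(err)`. Because each layer only writes when the pointer is still NULL, the most specific (earliest) message survives.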

What should we improve?

@btovar
Member
btovar commented May 2, 2025

Got it! For some reason I thought that the time measurement of the worker link_connect was doing something more complicated. Is this pr ready to review?

@JinZhou5042
Member Author

Not yet. I'll try killing workers and transfers periodically in a large workflow to see whether it really returns helpful error messages.

As for the time measurement, on my end the time(0) + 300 argument almost always results in 130 appearing in the error message; sometimes it's 128 or 129, sometimes 131 or 132, but mostly 130. I haven't figured out the reason for this yet.
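For reference, the measurement being described can be sketched like this. It is a simplified illustration under stated assumptions: `timed_connect`, `connect_fn`, and `always_fail` are hypothetical names, and the connect call is assumed to take an absolute stop time such as `time(0) + 300`. The stop time is only an upper bound, so an attempt that fails before the deadline reports an elapsed time shorter than the timeout.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical signature for a connect call that takes an absolute deadline. */
typedef int (*connect_fn)(const char *host, int port, time_t stoptime);

/* Try to connect with a relative timeout (in seconds). On failure, write a
 * message saying how long the attempt actually lasted. */
int timed_connect(connect_fn do_connect, const char *host, int port,
		int timeout, char *msg, size_t msg_len)
{
	assert(msg != NULL && msg_len > 0);

	time_t start = time(0);
	int ok = do_connect(host, port, start + timeout);

	if (!ok) {
		/* Elapsed wall-clock time; may be well below the timeout. */
		long elapsed = (long)(time(0) - start);
		snprintf(msg, msg_len,
				"failed to connect to %s:%d after %ld of %d seconds",
				host, port, elapsed, timeout);
	}
	return ok;
}

/* Stand-in connect function that fails immediately. */
int always_fail(const char *host, int port, time_t stoptime)
{
	(void)host;
	(void)port;
	(void)stoptime;
	return 0;
}
```

Calling `timed_connect(always_fail, "10.0.0.1", 9123, 300, msg, sizeof(msg))` fills `msg` with the measured duration next to the requested timeout, which makes it easy to see when the attempt ended before the deadline.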

@JinZhou5042 JinZhou5042 marked this pull request as ready for review May 27, 2025 16:25
@JinZhou5042 JinZhou5042 requested a review from btovar May 27, 2025 16:25
@btovar
Member
btovar commented May 28, 2025

ready to merge?

@JinZhou5042
Member Author

RTM

@btovar btovar merged commit b03ecd0 into cooperative-computing-lab:master May 29, 2025
10 checks passed
@JinZhou5042 JinZhou5042 deleted the cache_invalid_message branch May 29, 2025 19:20
2 participants