HLD for SmartSwitch DPU graceful shutdown by rameshraghupathy · Pull Request #1991 · sonic-net/SONiC · GitHub

HLD for SmartSwitch DPU graceful shutdown #1991

Open: 25 commits to merge into base branch `master`

@rameshraghupathy (Contributor) commented May 13, 2025:

HLD for SmartSwitch DPU graceful shutdown

Related PRs:
sonic-net/sonic-platform-common#567
sonic-net/sonic-host-services#255

@mssonicbld (Collaborator):

/azp run

No pipelines are associated with this pull request.


### 1. Host-side Named Pipe

A named pipe (e.g., `/var/run/gnoi_reboot.pipe`) is created on the host. It acts as a one-way communication channel from PMON to the host.
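
As a minimal sketch of the one-way channel described above, the following uses a POSIX FIFO with a writer standing in for PMON and a reader standing in for the host. The function names, the temporary pipe path (used in place of `/var/run/gnoi_reboot.pipe`), and the `reboot DPU0` message format are assumptions made for illustration; the HLD excerpt does not specify them.

```python
import os
import tempfile
import threading


def host_reader(pipe_path, received):
    # Host side: opening a FIFO for reading blocks until a writer opens it.
    with open(pipe_path, "r") as pipe:
        for line in pipe:
            received.append(line.strip())


def pmon_writer(pipe_path, message):
    # PMON side: write one newline-terminated request, then close the pipe.
    with open(pipe_path, "w") as pipe:
        pipe.write(message + "\n")


def demo():
    # A temporary path stands in for the HLD's /var/run/gnoi_reboot.pipe.
    workdir = tempfile.mkdtemp()
    pipe_path = os.path.join(workdir, "gnoi_reboot.pipe")
    os.mkfifo(pipe_path, 0o600)

    received = []
    reader = threading.Thread(target=host_reader, args=(pipe_path, received))
    reader.start()

    # Hypothetical message format; the real request layout is not given here.
    pmon_writer(pipe_path, "reboot DPU0")

    # The reader sees end-of-file once the writer closes, so the join returns.
    reader.join(timeout=5)
    os.remove(pipe_path)
    os.rmdir(workdir)
    return received
```

Data flows in one direction only: the writer cannot read a reply over the same FIFO, so any acknowledgement would need a second channel.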

Review comment (Contributor):

What is the benefit of using pipes instead of using the DB for the communication channel between daemons? SONiC has a well-tested publisher/subscriber implementation used throughout the system.

@rameshraghupathy (Contributor Author) replied:

@oleksandrivantsiv We will be using Redis; the benefits are captured in the document.

@vvolam (Contributor) left a comment:

Other than these comments, LGTM.

@vvolam (Contributor) previously approved these changes May 22, 2025:

LGTM. Thanks!

@vvolam (Contributor) left a comment:

@rameshraghupathy As discussed in the community meeting call, could you update the HLD with the state transition flag? The command should fail if the flag is already set. Thanks!


<p align="center"><img src="./images/reboot-interoperability.svg"></p>

The diagram above illustrates two scenarios in which `module_base.py` and `smartswitch_reboot_helper` might both attempt to initiate a shutdown and reboot at the same time. Because both paths go through the shared Redis table `GNOI_REBOOT_REQUEST`, only the first trigger is honored; later attempts made while a reboot is in progress are ignored.
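
The first-trigger-wins behavior can be sketched as an atomic set-if-absent on a shared table. The class below is an in-memory stand-in for `GNOI_REBOOT_REQUEST`, not the SONiC database API; `RebootRequestTable`, `try_claim`, `clear`, `owner`, and the `DPU0` key are hypothetical names chosen for illustration.

```python
import threading


class RebootRequestTable:
    """In-memory stand-in for the shared GNOI_REBOOT_REQUEST table (assumption)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def try_claim(self, dpu, requester):
        # Atomic set-if-absent: only the first requester for a DPU wins.
        with self._lock:
            if dpu in self._entries:
                return False
            self._entries[dpu] = requester
            return True

    def clear(self, dpu):
        # Called once the reboot completes so later requests can proceed.
        with self._lock:
            self._entries.pop(dpu, None)

    def owner(self, dpu):
        # Report which requester currently holds the reboot for this DPU.
        with self._lock:
            return self._entries.get(dpu)


table = RebootRequestTable()
# Both paths race to request a reboot of the same DPU.
first = table.try_claim("DPU0", "module_base.py")
second = table.try_claim("DPU0", "smartswitch_reboot_helper")
```

Here the second caller gets `False` and can skip its own reboot sequence. With a real Redis backend, the same check-and-set would need to be atomic on the server side, for example through a set-if-not-exists command such as `HSETNX`, rather than a separate read followed by a write.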

@vvolam (Contributor) commented May 29, 2025:

This is one approach. But the reboot script could also do this by reading the transition_in_progress flag, right? Which one is simpler and better?

@rameshraghupathy (Contributor Author) replied:

@vvolam Actually, it is not just that. More details on how this will be handled are provided in the last statement of the scenarios section. Let me rephrase it and update you.
