Description
Details
Status | Assigned | Task
---|---|---
Open | None | T31744 FlaggedRev installation (deployment) requests (tracking)
Stalled | None | T143886 Activating Flagged revisions on ar.wikinews
Stalled | None | T204354 Flagged Revisions for Vietnamese Wikipedia
Stalled | None | T205145 Deploy FlaggedRevs on bn.wikibooks
Stalled | None | T221933 Enable Flagged Revisions (for trial run purpose) at the Chinese Wikipedia
Open | None | T185664 Code stewardship review: FlaggedRevs
Resolved | Ladsgroup | T277883 Drop all low-use and unused features of FlaggedRevs to make it more maintainable
Resolved | Ladsgroup | T300774 Drop fr_img_* columns
Open | None | T291916 Tracking task for Bullseye migrations in production
Resolved | Marostegui | T298585 Upgrade WMF database-and-backup-related hosts to bullseye
Resolved | Marostegui | T300473 Upgrade s5 to Bullseye
Resolved | Marostegui | T303798 Switchover s5 master db1130 -> db1100
Event Timeline
Change 758288 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] s5 codfw hosts: Disable notifications
Change 758288 merged by Marostegui:
[operations/puppet@production] s5 codfw hosts: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2137.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2128.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2137.codfw.wmnet with OS bullseye completed:
- db2137 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201310654_marostegui_32737_db2137.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
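Each reimage report in this timeline has the same shape: a `- host (STATUS)` line followed by one `- step` line per action. A minimal sketch of how such a report could be parsed for later comparison follows. It assumes the flattened text layout shown here (no indentation under the host line); the names `ReimageReport` and `parse_reports` are hypothetical and not part of the cookbook or Spicerack.

```python
import re
from dataclasses import dataclass, field

# Header line of a report, e.g. "- db2137 (WARN)".
HOST_LINE = re.compile(r"^- (?P<host>\w+) \((?P<status>[A-Z]+)\)$")


@dataclass
class ReimageReport:
    host: str
    status: str
    steps: list[str] = field(default_factory=list)


def parse_reports(text: str) -> list[ReimageReport]:
    """Group "- step" lines under the preceding "- host (STATUS)" line."""
    reports: list[ReimageReport] = []
    for raw in text.splitlines():
        line = raw.strip()
        match = HOST_LINE.match(line)
        if match:
            reports.append(ReimageReport(match["host"], match["status"]))
        elif line.startswith("- ") and reports:
            reports[-1].steps.append(line[2:])
        # Any other line (e.g. "Cookbook ... completed:") is ignored.
    return reports


# Shortened excerpt of the db2137 report above.
sample = """\
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2137.codfw.wmnet with OS bullseye completed:
- db2137 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
"""

reports = parse_reports(sample)
for report in reports:
    print(report.host, report.status, len(report.steps))
```

Grouping by the host line rather than by the "Cookbook ... completed:" line keeps the parser independent of the exact wording of the bot comment, which differs between "completed" and "executed with errors" in this timeline.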
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2113.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2128.codfw.wmnet with OS bullseye completed:
- db2128 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201310655_marostegui_477_db2128.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2111.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2075.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2113.codfw.wmnet with OS bullseye completed:
- db2113 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201310729_marostegui_23914_db2113.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2111.codfw.wmnet with OS bullseye completed:
- db2111 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201310732_marostegui_24247_db2111.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Change 758419 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] db2123: Disable notifications
Change 758419 merged by Marostegui:
[operations/puppet@production] db2123: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2123.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2075.codfw.wmnet with OS bullseye completed:
- db2075 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201310739_marostegui_26861_db2075.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2123.codfw.wmnet with OS bullseye completed:
- db2123 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201310809_marostegui_4894_db2123.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Change 758465 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] db1154: Disable notifications
Change 758465 merged by Marostegui:
[operations/puppet@production] db1154: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1154.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1154.eqiad.wmnet with OS bullseye completed:
- db1154 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201311306_marostegui_12113_db1154.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Change 758719 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] db1110: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-02-01T06:21:11Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1110 for reimage T300473', diff saved to https://phabricator.wikimedia.org/P19716 and previous config saved to /var/cache/conftool/dbconfig/20220201-062111-marostegui.json
Change 758719 merged by Marostegui:
[operations/puppet@production] db1110: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1110.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1110.eqiad.wmnet with OS bullseye completed:
- db1110 (FAIL)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202010624_marostegui_28187_db1110.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
- Failed to get Netbox script results, try manually: https://netbox.wikimedia.org/api/extras/job-results/2404824/
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1110.eqiad.wmnet with OS bullseye executed with errors:
- db1110 (FAIL)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202010624_marostegui_28187_db1110.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
- Failed to get Netbox script results, try manually: https://netbox.wikimedia.org/api/extras/job-results/2404824/
- The reimage failed, see the cookbook logs for the details
Change 758782 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] db1100: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-02-01T08:10:51Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1100 for reimage T300473', diff saved to https://phabricator.wikimedia.org/P19747 and previous config saved to /var/cache/conftool/dbconfig/20220201-081050-marostegui.json
Change 758782 merged by Marostegui:
[operations/puppet@production] db1100: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1100.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1100.eqiad.wmnet with OS bullseye executed with errors:
- db1100 (FAIL)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- The reimage failed, see the cookbook logs for the details
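The failed db1100 report above is shorter than the reports of the runs that completed. One way to see where it stopped is to compare its steps against the sequence reported by those other runs. The sketch below does that; `EXPECTED_STEPS` is copied from the completed reports in this timeline (it is not an official list from the cookbook), the warning line about Icinga status is left out on purpose, and `first_missing_step` is a hypothetical helper.

```python
# Step sequence as reported by the completed runs in this timeline.
# Entries are matched as prefixes because some lines carry a per-host
# suffix (for example the log path after "logged in").
EXPECTED_STEPS = [
    "Downtimed on Icinga",
    "Disabled Puppet",
    "Removed from Puppet and PuppetDB if present",
    "Deleted any existing Puppet certificate",
    "Removed from Debmonitor if present",
    "Forced PXE for next reboot",
    "Host rebooted via IPMI",
    "Host up (Debian installer)",
    "Host up (new fresh bullseye OS)",
    "Generated Puppet certificate",
    "Signed new Puppet certificate",
    "Run Puppet in NOOP mode to populate exported resources in PuppetDB",
    "Found Nagios_host resource for this host in PuppetDB",
    "Downtimed the new host on Icinga",
    "First Puppet run completed and logged in",
    "Checked BIOS boot parameters are back to normal",
    "Rebooted",
    "Automatic Puppet run was successful",
    "Forced a re-check of all Icinga services for the host",
    "Updated Netbox data from PuppetDB",
]


def first_missing_step(observed, expected=EXPECTED_STEPS):
    """Return the first expected step with no matching observed line, or None."""
    for step in expected:
        if not any(line.startswith(step) for line in observed):
            return step
    return None


# Steps from the failed db1100 report above.
failed_db1100 = [
    "Downtimed on Icinga",
    "Disabled Puppet",
    "Removed from Puppet and PuppetDB if present",
    "Deleted any existing Puppet certificate",
    "Removed from Debmonitor if present",
    "Forced PXE for next reboot",
    "Host rebooted via IPMI",
    "Host up (Debian installer)",
    "Host up (new fresh bullseye OS)",
    "Generated Puppet certificate",
    "Signed new Puppet certificate",
    "Run Puppet in NOOP mode to populate exported resources in PuppetDB",
    "Found Nagios_host resource for this host in PuppetDB",
    "The reimage failed, see the cookbook logs for the details",
]

print(first_missing_step(failed_db1100))
```

For the failed db1100 run this reports `Downtimed the new host on Icinga` as the first step that does not appear. That only narrows down where the run stopped; the actual cause still has to be read from the cookbook logs.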
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1100.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1100.eqiad.wmnet with OS bullseye completed:
- db1100 (WARN)
- Downtimed on Icinga
- Unable to disable Puppet, the host may have been unreachable
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202010833_marostegui_21361_db1100.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
db1144 is not a backup source according to Puppet; it is only a core multi-instance host. I upgraded it to bullseye as part of T302950.
Change 770875 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] db1166: Disable notifications
Change 770875 merged by Marostegui:
[operations/puppet@production] db1161: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-03-15T08:05:05Z] <marostegui> dbmaint on s5@eqiad T300473
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1161.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1161.eqiad.wmnet with OS bullseye completed:
- db1161 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203150813_marostegui_1181908_db1161.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Change 775722 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] db1130: Disable notifications
Change 775722 merged by Marostegui:
[operations/puppet@production] db1130: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1130.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1130.eqiad.wmnet with OS bullseye completed:
- db1130 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204040511_marostegui_2744843_db1130.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB