jbond (John Bond)
Staff Site Reliability Engineer

User Details

User Since
Jan 7 2019, 1:06 PM (308 w, 3 d)
Availability
Available
IRC Nick
jbond
LDAP User
Jbond
MediaWiki User
Unknown

Recent Activity

Jun 17 2024

jbond added a comment to T367554: Cloud VPS "sso" project Buster deprecation.

Hi all, I wanted to say that the sso project is used so that users have an SSO testing infrastructure to use in Cloud Services. Originally this was also used to provide SSO to production-like services in Cloud Services; however, this latter functionality has been moved.

Jun 17 2024, 4:07 PM · Cloud-VPS (Debian Buster Deprecation)

May 13 2024

jbond added a comment to T364492: Ownership confusion on cloud-local puppet servers.

Puppet 7 has some new ownership constraints which means that we can no longer investigate these repos as root, for example:

FYI, this is an artifact of the new version of git, not Puppet. You will need something like the following on cloud standalone puppet masters

May 13 2024, 2:43 PM · cloud-services-team (FY2024/2025-Q1-Q2), Patch-For-Review, Puppet-Infrastructure

Jan 5 2024

jbond added a comment to T352974: puppet7 on cumin breaks database connections.

it appears that most of our hosts are still using /etc/ssl/certs/Puppet_Internal_CA.pem and should be migrated to use /etc/ssl/certs/wmf-ca-certificates.crt

This is likely the issue: something, somewhere is likely still using Puppet_Internal_CA.pem, /var/lib/puppet/ssl/ca/ca.pem or $facts['puppet_config']['localcacert'] directly

Jan 5 2024, 3:58 PM · DBA, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
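
To make the search described in the T352974 comment above concrete, here is a minimal sketch that walks a directory tree and reports lines that still mention one of the legacy CA locations. The function name and the choice of a plain substring match are assumptions made for illustration; this is not an existing WMF tool, and a real audit would likely also need to look at Puppet code and running service configuration.

```python
from pathlib import Path

# Legacy CA references named in the comment; the intended replacement is
# /etc/ssl/certs/wmf-ca-certificates.crt. "localcacert" is the Puppet setting
# that points at the agent's copy of the CA certificate.
LEGACY_CA_MARKERS = (
    "Puppet_Internal_CA.pem",
    "/var/lib/puppet/ssl/ca/ca.pem",
    "localcacert",
)


def find_legacy_ca_references(root):
    """Return (path, line_number, marker) tuples for every hit under root.

    Hypothetical helper: matching is a case-insensitive substring test,
    so it can report false positives (for example, comments).
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        for line_number, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            for marker in LEGACY_CA_MARKERS:
                if marker.lower() in lowered:
                    hits.append((str(path), line_number, marker))
    return hits
```

Calling `find_legacy_ca_references("/etc")` on a host would list candidate files to migrate; each hit still needs a manual check before changing anything.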

Nov 29 2023

jbond created P53949 (An Untitled Masterwork).
Nov 29 2023, 1:34 PM

Nov 28 2023

jbond added a comment to T352156: PHP Warning: geoip_country_code_by_name(): Required database not available at /usr/share/GeoIP/GeoIP.dat..

@jbond @Muehlenhoff It appears that we are missing GeoIP.dat, GeoIPRegion.dat, and GeoIPCity.dat from the new puppetmasters. Would it be alright if we just copied them over from the old puppetmasters for now?

Nov 28 2023, 12:22 PM · MW-1.42-notes (1.42.0-wmf.9; 2023-12-12), LandingCheck, Patch-For-Review, MediaWiki-extensions-WikimediaEvents, Release-Engineering-Team, MW-on-K8s, serviceops, Wikimedia-production-error

Nov 27 2023

jbond added a project to T350565: Switch conftool to use the version 3 etcd datastore: conftool.
Nov 27 2023, 4:15 PM · conftool, Data-Persistence, Traffic, serviceops
jbond added a project to T341442: Add a cookbook to safely deploy puppet changes: Puppet-Core.
Nov 27 2023, 3:49 PM · User-Elukey, Puppet-Core, Infrastructure-Foundations, SRE-tools, Spicerack, SRE
jbond added a comment to T351950: taavi's netbox-next account is stuck.

I'll leave this to @SLyngshede-WMF as I'm guessing they have been experimenting with migrating Netbox to OIDC (T308002: Move Netbox authentication to python-social-auth)

Nov 27 2023, 12:34 PM · Infrastructure-Foundations, netbox

Nov 24 2023

jbond updated subscribers of T265633: Allow running PCC with different states of the private repo for prod/change catalog.

@hashar how do we get the updated commit-validator into the CI images used by puppet? I.e. something similar to https://gerrit.wikimedia.org/r/c/integration/config/+/971546/2/dockerfiles/commit-message-validator/Dockerfile.template . I have a test commit which should pass once upgraded

Nov 24 2023, 3:20 PM · Patch-For-Review, Infrastructure-Foundations, User-jbond, Puppet CI
jbond added a comment to T320636: smart-data-dump fails occasionally due to facter timeouts.

Maybe it is related to the puppet 7 upgrade in some way?

Although the timing could suggest this is caused by puppet 7, I would suggest caution for the following reasons:

  • ~50% of servers are now running puppet7
  • As the migration is active, we can't be sure these servers were running puppet7 when the timeouts happened
  • Most importantly: this script calls raid.rb directly, so puppet is not involved. The only library used is facter (and even then only a very limited amount), which has not been upgraded as part of the puppet migration
Nov 24 2023, 10:24 AM · Puppet (Puppet 7.0), SRE Observability (FY2022/2023-Q2), Observability-Alerting

Nov 22 2023

jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 22 2023, 4:25 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 22 2023, 4:06 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 22 2023, 3:45 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate".

I have refreshed the patches for T347565: Switch rsyslog to use the new PKI infrastructure, which among other things updates central auth to use a pki.discovery.wmnet issued cert and updates blackbox::check::tcp to use the same certs. I'm running PCC now and will fix up any issues

Nov 22 2023, 12:46 PM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE
jbond added a comment to T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate".

@fgiunchedi I have created a CR to use pki.discovery.wmnet to request a puppet agent certificate instead of using expose_puppet_certs. This should work around the issue

Nov 22 2023, 11:46 AM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE

Nov 21 2023

jbond triaged T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate" as Medium priority.
Nov 21 2023, 4:53 PM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE
jbond added a comment to T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate".

@fgiunchedi I have created a CR to use pki.discovery.wmnet to request a puppet agent certificate instead of using expose_puppet_certs. This should work around the issue

Nov 21 2023, 4:50 PM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 21 2023, 3:55 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 21 2023, 3:15 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 21 2023, 1:24 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T324623: Switch rsyslog from gtls to ossl as Resolved.

All systems have now been migrated to ossl

Nov 21 2023, 1:24 PM · User-MoritzMuehlenhoff, Cloud-VPS, cloud-services-team, Patch-For-Review, SRE, observability, User-dcaro
jbond edited projects for T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate", added: Observability-Logging; removed Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations.
Nov 21 2023, 1:23 PM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE
jbond added a comment to T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate".

@fgiunchedi Everything is using openssl now, do you still see the errors?

Nov 21 2023, 1:23 PM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE
jbond closed T324623: Switch rsyslog from gtls to ossl, a subtask of T127717: Move Cloud VPS auth.logs to central logging, as Resolved.
Nov 21 2023, 1:23 PM · Cloud-VPS, cloud-services-team, User-dcaro, Sustainability (Incident Followup)
jbond closed T324623: Switch rsyslog from gtls to ossl, a subtask of T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration, as Resolved.
Nov 21 2023, 1:23 PM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 21 2023, 12:32 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond reopened T350809: Sporadic puppet failures as "Open".
Nov 21 2023, 11:19 AM · Patch-For-Review, Puppet-Infrastructure, Infrastructure-Foundations
jbond added a comment to T350809: Sporadic puppet failures.

Looking at Puppetboard, we are still having issues when we do a puppet-merge. The following are times in UTC when a puppet-merge was occurring; at each of these times we have 8-10 puppet failures

Nov 21 2023, 11:19 AM · Patch-For-Review, Puppet-Infrastructure, Infrastructure-Foundations
jbond closed T351653: thanos internal TLS failure after puppet 7 update as Resolved.

I have rolled out a new wmf-certificates package which I believe has fixed this error. All swift services on thanos-fe1001 are now started. Tentatively closing, but please reopen if I missed something

Nov 21 2023, 11:09 AM · SRE, Infrastructure-Foundations, Puppet (Puppet 7.0), SRE-swift-storage
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 21 2023, 9:18 AM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE

Nov 20 2023

jbond added a comment to T351653: thanos internal TLS failure after puppet 7 update.

@MatthewVernon this is almost certainly something using the puppet CA directly instead of using /etc/ssl/certs/wmf-ca-certificates.crt. I need to investigate a bit more why openssl is failing. specifically

Nov 20 2023, 5:29 PM · SRE, Infrastructure-Foundations, Puppet (Puppet 7.0), SRE-swift-storage
jbond closed T195981: the package resource should mark packages as manually installed as Resolved.

This has been fixed upstream; we should get the benefit when we upgrade to puppet7

Nov 20 2023, 4:21 PM · Puppet-Core, Infrastructure-Foundations, User-jbond
jbond closed T195981: the package resource should mark packages as manually installed, a subtask of T265138: Work required to prepare for puppet 7, as Resolved.
Nov 20 2023, 4:21 PM · Infrastructure-Foundations, Patch-For-Review, User-jbond, SRE, Puppet
jbond closed T290065: Look at pam-duress authentication module as Declined.
Nov 20 2023, 4:20 PM · User-jbond, Infrastructure-Foundations, Security
jbond closed T283771: Allow idrac ftp fetching of firmware updates (either to existing ftp or new solution) as Resolved.

@RobH closing this as we now have the upgrade-firmware cookbook but please reopen if needed

Nov 20 2023, 4:12 PM · Infrastructure-Foundations, SRE-tools, SRE, DC-Ops
jbond closed T281369: Additional CFSSL tasks as Resolved.
Nov 20 2023, 4:11 PM · Infrastructure-Foundations, User-jbond, CFSSL-PKI, SRE
jbond closed T287751: [Cloud VPS alert][puppet-dev] Puppet failure on pdev-pdb.puppet-dev.eqiad1.wikimedia.cloud (172.16.6.86) as Resolved.

this has since been fixed

Nov 20 2023, 4:09 PM · cloud-services-team, Cloud-VPS
jbond placed T286905: Add logout.d script for Gerrit up for grabs.
Nov 20 2023, 4:09 PM · Patch-Needs-Improvement, Release-Engineering-Team (Radar), Gerrit, Infrastructure-Foundations, User-jbond, CAS-SSO, SRE
jbond reassigned T299839: Clarify whether members of ldap/nda should be added to #WMF-NDA from jbond to MoritzMuehlenhoff.
Nov 20 2023, 4:05 PM · Infrastructure-Foundations, WMF-NDA-Requests
jbond closed T314136: decomission puppetmaster[12]00[12] and replace them with puppetmaster[12]00[45] as Resolved.
Nov 20 2023, 3:56 PM · Puppet-Infrastructure, Patch-For-Review, Infrastructure-Foundations
jbond closed T324229: Fix autorestart and debclient dependency as Declined.

not enough information

Nov 20 2023, 3:54 PM · SRE-tools, Infrastructure-Foundations
jbond closed T333135: Offboard nfraison as Resolved.
Nov 20 2023, 3:46 PM · SRE, Infrastructure-Foundations, Infrastructure Security
jbond added a comment to T324623: Switch rsyslog from gtls to ossl.

Reading the task, it seems like the last blocker was to "wait out buster" (T324623#8449852). However, as we have now deployed this to buster (T324623#9334403), it seems like we can move ahead. Are there any concerns with making this change? It seems fairly simple

Nov 20 2023, 12:36 PM · User-MoritzMuehlenhoff, Cloud-VPS, cloud-services-team, Patch-For-Review, SRE, observability, User-dcaro
jbond added a comment to T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate".

The software in this case is prometheus blackbox exporter @jbond. AFAICT ossl doesn't suffer from this problem though I might be wrong as I've only glanced at the issue!

Nov 20 2023, 12:33 PM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE
jbond closed T351634: create systemd timer to clean up failed pcc jobs as Resolved.

Timer has now been deployed

Nov 20 2023, 12:20 PM · Jenkins, Infrastructure-Foundations, Continuous-Integration-Infrastructure, Puppet CI
jbond closed T351634: create systemd timer to clean up failed pcc jobs, a subtask of T336350: PCC: worker out of disk space, as Resolved.
Nov 20 2023, 12:20 PM · Jenkins, Infrastructure-Foundations, Continuous-Integration-Infrastructure, Puppet CI
jbond renamed T351634: create systemd timer to clean up failed pcc jobs from create systemd timer toi clean up failed pcc jobs to create systemd timer to clean up failed pcc jobs.
Nov 20 2023, 12:19 PM · Jenkins, Infrastructure-Foundations, Continuous-Integration-Infrastructure, Puppet CI
jbond created T351634: create systemd timer to clean up failed pcc jobs.
Nov 20 2023, 11:55 AM · Jenkins, Infrastructure-Foundations, Continuous-Integration-Infrastructure, Puppet CI
jbond added a subtask for T336350: PCC: worker out of disk space: T348974: update PCC to handle cancled jobs gracefully.
Nov 20 2023, 11:47 AM · Jenkins, Infrastructure-Foundations, Continuous-Integration-Infrastructure, Puppet CI
jbond added a parent task for T348974: update PCC to handle cancled jobs gracefully: T336350: PCC: worker out of disk space.
Nov 20 2023, 11:47 AM · Infrastructure-Foundations, Puppet CI
jbond added a comment to T351624: Probes for centrallog hosts fail to validate with "x509: issuer name does not match subject from issuing certificate".

@fgiunchedi what is the probing software? We do have a bit of a workaround for this which may work here as well. Also, if it is the issue you mention, I'm not sure that switching to ossl will help. However, I suspect T347565: Switch rsyslog to use the new PKI infrastructure would

Nov 20 2023, 10:50 AM · Patch-For-Review, User-fgiunchedi, Observability-Logging, SRE

Nov 17 2023

jbond added a comment to T351354: Service implementation for cloudelastic1007-1010.

@bking in order for me to investigate further I need either a broken host to investigate or a way to replicate the issue.

Nov 17 2023, 2:39 PM · Data-Platform-SRE (2024.02.12 - 2024.03.03)
jbond added a comment to P53543 (An Untitled Masterwork).

output

Agent                                             |CA Server                                         |
====================================================================================================
compile1001.puppet-dev.eqiad1.wikimedia.cloud     |pm7.puppet-dev.eqiad1.wikimedia.cloud             |
db7.puppet-dev.eqiad1.wikimedia.cloud             |pm7.puppet-dev.eqiad1.wikimedia.cloud             |
puppetboard1001.puppet-dev.eqiad1.wikimedia.cloud |pm7.puppet-dev.eqiad1.wikimedia.cloud             |
puppetdb1002.puppet-dev.eqiad1.wikimedia.cloud    |pm7.puppet-dev.eqiad1.wikimedia.cloud             |
acme-chief1001.puppet-dev.eqiad1.wikimedia.cloud  |puppetmaster.cloudinfra.wmflabs.org               |
project-pm.puppet-dev.eqiad1.wikimedia.cloud      |puppetmaster.cloudinfra.wmflabs.org               |
pupptserver7.puppet-dev.eqiad1.wikimedia.cloud    |puppetmaster.cloudinfra.wmflabs.org               |
agent7.puppet-dev.eqiad1.wikimedia.cloud          |puppetserver1001.eqiad.wmnet                      |
Nov 17 2023, 1:03 PM
jbond created P53543 (An Untitled Masterwork).
Nov 17 2023, 1:02 PM
jbond added a comment to T351354: Service implementation for cloudelastic1007-1010.

@bking I took a look at cloudelastic1010 as I had thought this was in some broken state from the reimage cookbook. However, from the puppet certs I can see it's been around since Nov 9 07:30:40 2023 GMT and has had puppet disabled for the last 36 hours.

Nov 17 2023, 9:58 AM · Data-Platform-SRE (2024.02.12 - 2024.03.03)

Nov 16 2023

jbond edited P53517 (An Untitled Masterwork).
Nov 16 2023, 1:40 PM
jbond created P53517 (An Untitled Masterwork).
Nov 16 2023, 1:38 PM
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 16 2023, 1:35 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T350786: No entries at all in beta-logs.wmcloud.org since 2023-11-06 Z 12:15:39.

I have logged into logging-logstash-02.logging.eqiad1.wikimedia.cloud and run systemctl restart logstash.service; hopefully that has fixed this

Nov 16 2023, 1:32 PM · Quality-and-Test-Engineering-Team, SRE Observability (FY2023/2024-Q2), Observability-Logging, Release-Engineering-Team, Beta-Cluster-Infrastructure
jbond added a comment to T274593: Logstash beta is not getting any events.
Nov 16 2023, 1:32 PM · observability, User-DannyS712, Beta-Cluster-Infrastructure, Wikimedia-Logstash
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 16 2023, 1:15 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 16 2023, 12:17 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE

Nov 15 2023

jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 15 2023, 7:05 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 15 2023, 6:36 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T350688: PKI: configure a check for ocsp.

Looking at the OCSP file using the following command suggests that something with ocsprefresh is not working correctly, as the response is from the kafka CA

Nov 15 2023, 6:12 PM · Patch-For-Review, Infrastructure-Foundations, CFSSL-PKI
jbond added a comment to T350688: PKI: configure a check for ocsp.

The following command can be used to check

Nov 15 2023, 3:54 PM · Patch-For-Review, Infrastructure-Foundations, CFSSL-PKI
jbond closed T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration as Resolved.

I have rolled out a change so that buster machines use openssl, which seems to have fixed the issue. Please reopen if you see other problems

Nov 15 2023, 1:54 PM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration, a subtask of T349619: Migrate roles to puppet7, as Resolved.
Nov 15 2023, 1:54 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration.

I have tested using openssl and that works, so I'll prepare a patch to switch all buster to openssl

Nov 15 2023, 12:08 PM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration.

Well, I have updated apt1001 to 8.2102.0-2~deb10u1 and I still see the problem, so that would suggest it's not an issue with rsyslog :/. Perhaps a different option would be to pressure T347565; however, I fear we may hit the same issue

Nov 15 2023, 10:48 AM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE

Nov 14 2023

jbond updated the task description for T349619: Migrate roles to puppet7.
Nov 14 2023, 5:55 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T346216: PCC should escape HTML parameters (XSS) as Resolved.

This is actually live now, so I'll assume taavi was right and see if the error comes again

Nov 14 2023, 4:31 PM · SecTeam-Processed, Infrastructure-Foundations, Puppet CI, Vuln-XSS, Security
jbond closed T300048: Postgres puppet modules use MD5 for users by default as Resolved.

Going to close this as I think it's resolved, but please reopen if not

Nov 14 2023, 4:25 PM · PostgreSQL, Puppet-Infrastructure, Maps, netbox, Infrastructure-Foundations, SRE
jbond updated subscribers of T345830: Puppetserver first run errors.

@jhathaway I suspect you have already fixed these with your dcl work, are you able to confirm/update?

Nov 14 2023, 4:18 PM · Infrastructure-Foundations, Puppet-Infrastructure
jbond updated the task description for T345830: Puppetserver first run errors.
Nov 14 2023, 4:17 PM · Infrastructure-Foundations, Puppet-Infrastructure
jbond added a comment to T350809: Sporadic puppet failures.

set priority to high as this is causing issues

Nov 14 2023, 4:17 PM · Patch-For-Review, Puppet-Infrastructure, Infrastructure-Foundations
jbond raised the priority of T350809: Sporadic puppet failures from Medium to High.
Nov 14 2023, 4:16 PM · Patch-For-Review, Puppet-Infrastructure, Infrastructure-Foundations
jbond removed a project from T350809: Sporadic puppet failures: Puppet CI.
Nov 14 2023, 4:16 PM · Patch-For-Review, Puppet-Infrastructure, Infrastructure-Foundations
jbond triaged T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration as High priority.
Nov 14 2023, 4:13 PM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond triaged T351104: Issues which should be fixed by puppet7 upgrade as Medium priority.
Nov 14 2023, 4:13 PM · Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T341056: volatile: We need to configure the volatile endpoint on puppetserveres as Resolved.

volatile is now synced to all puppetservers, and agents using puppet7 can fetch data correctly

Nov 14 2023, 2:47 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T341056: volatile: We need to configure the volatile endpoint on puppetserveres, a subtask of T330490: Next steps for Puppet 7, as Resolved.
Nov 14 2023, 2:46 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T347565: Switch rsyslog to use the new PKI infrastructure.
Nov 14 2023, 2:46 PM · Observability-Logging, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated the task description for T340741: expose_puppet_certs: Services will need to trust the new ca.
Nov 14 2023, 1:44 PM · serviceops-radar, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond updated subscribers of T351094: nftables ignores drange filter for IPv6 if drange only has IPv4 addresses.

However, the IPv6 rule should not be there, right now it's incorrectly allowing v6 traffic to all addresses on port 3306.

It seems from some of the test cases that this may have been intentional. However, I agree with you that the current behaviour seems undesirable. I created a CR; let's see what @MoritzMuehlenhoff says

Nov 14 2023, 1:41 PM · Infrastructure-Foundations, Data-Services, cloud-services-team
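
The behaviour the reporter of T351094 expects can be sketched as follows: emit a rule only for the address families that actually appear in the destination range, so an IPv4-only drange never produces an unfiltered IPv6 rule. This is an illustration only, not the Puppet code or the CR mentioned above; the function name and the exact rule text are assumptions.

```python
import ipaddress


def rules_for_drange(drange, port):
    """Build one illustrative nft rule string per address family present in drange."""
    by_family = {"ip": [], "ip6": []}
    for entry in drange:
        # strict=False lets host addresses with a prefix length pass through
        network = ipaddress.ip_network(entry, strict=False)
        family = "ip" if network.version == 4 else "ip6"
        by_family[family].append(str(network))

    rules = []
    for family, addresses in by_family.items():
        if not addresses:
            # No addresses of this family: emit no rule at all, rather than
            # a rule without a daddr filter that would match every address.
            continue
        joined = ", ".join(addresses)
        rules.append(f"{family} daddr {{ {joined} }} tcp dport {port} accept")
    return rules
```

With an IPv4-only input such as `rules_for_drange(["10.64.0.10"], 3306)`, the result contains a single `ip daddr` rule and no `ip6` rule.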
jbond added a comment to T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration.

edit: or possibly this one https://github.com/rsyslog/rsyslog/issues/4035

OK, I don't think it's this as we still have SSL_set_verify_depth(pThis->ssl, 4); in the buster packages

Nov 14 2023, 12:11 PM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T350809: Sporadic puppet failures.

@jhathaway thanks for investigating. By the sounds of it, we could probably have a bit of a win if we:

  • set environment_timeout = unlimited
  • update puppet-merge to do a systemctl reload puppetserver after g10k
Nov 14 2023, 12:00 PM · Patch-For-Review, Puppet-Infrastructure, Infrastructure-Foundations
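
The two suggestions in the T350809 comment above fit together: with environment_timeout = unlimited, the server keeps compiled environments cached until it is told to refresh, so the merge step has to trigger that refresh itself once new code is on disk. A minimal sketch of that ordering is below. The function name and the injectable `run` parameter are assumptions for illustration; this is not the actual puppet-merge implementation.

```python
import subprocess

# Assumption: the refresh is done through systemd, as suggested in the comment.
RELOAD_CMD = ["systemctl", "reload", "puppetserver"]


def deploy_then_reload(deploy_cmd, run=subprocess.run):
    """Run the environment deploy command, then reload puppetserver.

    check=True makes a failed deploy raise CalledProcessError, so the
    reload is skipped and the server keeps serving what it has cached.
    """
    run(deploy_cmd, check=True)
    run(RELOAD_CMD, check=True)
```

Here `deploy_cmd` stands for whatever g10k invocation the merge tooling already uses; passing a fake `run` makes the ordering easy to check without touching a real server.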
jbond added a comment to T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration.

edit: or possibly this one https://github.com/rsyslog/rsyslog/issues/4035

OK, I don't think it's this as we still have SSL_set_verify_depth(pThis->ssl, 4); in the buster packages

Nov 14 2023, 10:56 AM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration.

Feels like this could be related to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=887637 (https://github.com/rsyslog/rsyslog/issues/2762). That issue is about the server not sending the intermediate, but I wonder if the same issue means the client doesn't read the sent intermediate

Nov 14 2023, 10:38 AM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration.

from a very simple test this appears to only affect buster

Nov 14 2023, 10:21 AM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond added a comment to T351181: syslog tls clients failing to connect to centrallog2002 post puppet7 migration.

Some additional information

Nov 14 2023, 10:10 AM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE

Nov 13 2023

jbond changed the status of T340741: expose_puppet_certs: Services will need to trust the new ca, a subtask of T330490: Next steps for Puppet 7, from Open to In Progress.
Nov 13 2023, 4:24 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond changed the status of T340741: expose_puppet_certs: Services will need to trust the new ca from Open to In Progress.
Nov 13 2023, 4:24 PM · serviceops-radar, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T349915: find solution for acmechief in puppet7 as Resolved.

This is in place now; use the hiera key during migration

Nov 13 2023, 4:15 PM · Traffic, Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T349915: find solution for acmechief in puppet7, a subtask of T330490: Next steps for Puppet 7, as Resolved.
Nov 13 2023, 4:15 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T340739: Create cookbook to migrate servers from the puppetmasters to puppetservers as Resolved.

this is complete

Nov 13 2023, 4:15 PM · Patch-For-Review, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T349291: Fix outstanding puppet 7 issues as Resolved.

These issues are all resolved

Nov 13 2023, 4:14 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T349291: Fix outstanding puppet 7 issues, a subtask of T330490: Next steps for Puppet 7, as Resolved.
Nov 13 2023, 4:14 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jbond closed T340739: Create cookbook to migrate servers from the puppetmasters to puppetservers, a subtask of T330490: Next steps for Puppet 7, as Resolved.
Nov 13 2023, 4:13 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
jcrespo awarded T347390: Create backups for puppetservers an Orange Medal token.
Nov 13 2023, 4:08 PM · Patch-For-Review, Data-Persistence-Backup, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE