Deploy "add a link" to 8th round of wikis
Closed, Resolved (Public)

Description

  • Training models
    • Fulah Wikipedia ff
    • Finnish Wikipedia fi
    • Võro Wikipedia fiu-vro
    • Fijian Wikipedia fj
    • Faroese Wikipedia fo
    • Arpitan Wikipedia frp
    • Northern Frisian Wikipedia frr
    • Friulian Wikipedia fur
    • Western Frisian Wikipedia fy see T308133#8459395
    • Irish Wikipedia ga
    • Gagauz Wikipedia gag
    • Gan Chinese Wikipedia gan see T308133#8469595
    • Guianan Creole Wikipedia gcr
    • Scottish Gaelic Wikipedia gd
    • Galician Wikipedia gl
    • Gilaki Wikipedia glk
    • Guarani Wikipedia gn
    • Goan Konkani Wikipedia gom
    • Gorontalo Wikipedia gor
    • Gothic Wikipedia got
    • Gujarati Wikipedia gu
    • Manx Wikipedia gv
  • Models verification
  • Publish Datasets
  • Populate the excluded section titles
  • Deploy back-end
  • Check how the model works on the wikis
  • In Search, use hasrecommendation:link to find articles
  • Test them on https://api.wikimedia.org/service/linkrecommendation/apidocs/#/default/get_v1_linkrecommendations__project___domain___page_title_
  • Inform communities
  • Deploy front-end, April 12 (except for gorwiki, which will be deployed within T308134 on May 17th)
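
For the "Test them on" step, here is a minimal shell sketch that builds the request URL for one article. The path layout (/v1/linkrecommendations/{project}/{domain}/{page_title}) is inferred from the apidocs anchor linked above and is an assumption, as are the function name and the example page title "Helsinki".

```shell
# Assumed base URL, taken from the apidocs link in the task description.
BASE_URL="https://api.wikimedia.org/service/linkrecommendation"

# Hypothetical helper: build the link recommendation URL for one article.
# $1 = project (e.g. wikipedia), $2 = domain (language code), $3 = page title
linkrecommendation_url() {
    printf '%s/v1/linkrecommendations/%s/%s/%s\n' "$BASE_URL" "$1" "$2" "$3"
}

# "Helsinki" is only an illustrative page title.
linkrecommendation_url wikipedia fi "Helsinki"

# To actually query the service (needs network access):
# curl -s "$(linkrecommendation_url wikipedia fi Helsinki)"
```

Page titles containing spaces or special characters would need URL-encoding before being passed in; the helper does not do that.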

Event Timeline

21/22 models were trained successfully in the 8th round of wikis.

The Western Frisian Wikipedia (fywiki) returned a UnicodeEncodeError, which is being investigated in T325521.

Model evaluation has been completed for the models whose training pipeline ran successfully. Below are the backtesting results:

Wiki          Precision@0.5   Recall@0.5
ffwiki        0.92            0.86
fiwiki        0.71            0.27
fiu_vrowiki   0.89            0.58
fjwiki        0.79            0.67
fowiki        0.86            0.60
frpwiki       0.83            0.45
frrwiki       0.93            0.72
furwiki       0.80            0.46
gawiki        0.80            0.44
gagwiki       0.88            0.50
ganwiki       0.67            0.01
gcrwiki       0.95            0.84
gdwiki        0.82            0.51
glwiki        0.83            0.42
glkwiki       0.97            0.74
gnwiki        0.81            0.43
gomwiki       0.82            0.38
gorwiki       0.99            0.97
gotwiki       0.91            0.71
guwiki        0.84            0.32
gvwiki        0.82            0.52

CCing @MGerlach, in case he would like to add comments on the backtesting evaluation.

The conclusion from the backtesting results is that most of the languages look fine, except:

  • ganwiki has a low precision (0.67) and very low recall (0.01).
  • fiwiki's precision (0.71) is slightly lower than the recommended one (0.75).

Talked to @MGerlach about these results and agreed: not to deploy ganwiki; and deploy fiwiki since its precision is not too low and the recall is good.
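
For reference, the precision check described above can be sketched in shell. The numbers are copied from the backtesting results in this task and 0.75 is the recommended precision mentioned above; the function names are made up for illustration and this is not part of the actual evaluation pipeline.

```shell
# Backtesting results from this task, one line per wiki: wiki precision recall.
backtesting_results() {
    cat <<'EOF'
ffwiki 0.92 0.86
fiwiki 0.71 0.27
fiu_vrowiki 0.89 0.58
fjwiki 0.79 0.67
fowiki 0.86 0.60
frpwiki 0.83 0.45
frrwiki 0.93 0.72
furwiki 0.80 0.46
gawiki 0.80 0.44
gagwiki 0.88 0.50
ganwiki 0.67 0.01
gcrwiki 0.95 0.84
gdwiki 0.82 0.51
glwiki 0.83 0.42
glkwiki 0.97 0.74
gnwiki 0.81 0.43
gomwiki 0.82 0.38
gorwiki 0.99 0.97
gotwiki 0.91 0.71
guwiki 0.84 0.32
gvwiki 0.82 0.52
EOF
}

# Hypothetical helper: print the rows whose precision is below the threshold.
# $1 = precision threshold (defaults to the recommended 0.75)
flag_low_precision() {
    awk -v t="${1:-0.75}" '$2 + 0 < t + 0 { print $1, $2, $3 }'
}

backtesting_results | flag_low_precision 0.75
```

This only flags candidates for review; whether to deploy a flagged wiki was still decided by hand, as with fiwiki and ganwiki here.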

@kostajh, we published datasets for all 20/22 models that passed the evaluation in this round.

Sgs moved this task from Triaged to Sprint 0 (Growth Team) on the Growth-Team board.
Sgs edited projects, added Growth-Team (Sprint 0 (Growth Team)); removed Growth-Team.
Sgs added a subscriber: kevinbazira.
Sgs changed the task status from Open to In Progress. Feb 24 2023, 11:58 AM

I ran this script to add the link-recommendation task type and populate the excluded sections:

PHAB=T304551
for WIKI in ffwiki fiwiki fiu_vrowiki fjwiki fowiki frpwiki frrwiki furwiki gawiki gagwiki ganwiki gcrwiki gdwiki glwiki glkwiki gnwiki gomwiki gorwiki gotwiki guwiki gvwiki; do
    # Look up the wiki's canonical server so the verification URLs can be printed below.
    ORIGIN=$(mwscript getConfiguration.php "$WIKI" --settings 'wgCanonicalServer' --format json | jq --raw-output '.wgCanonicalServer')
    # Add the link-recommendation task type to MediaWiki:NewcomerTasks.json.
    mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php "$WIKI" \
            --page MediaWiki:NewcomerTasks.json \
            --create-only \
            --json \
            --summary "Growth features configuration boilerplate ([[phab:$PHAB]])" \
            link-recommendation \
            '{ "type": "link-recommendation", "group": "easy" }'
    # Collect this wiki's section titles with probability > 0.25, deduplicate them,
    # and store them as the excluded sections. "$(cat)" passes the piped JSON as the value argument.
    jq "select(.wiki==\"$WIKI\" and .probability > 0.25) | .section" wiki_sections.jsonl \
        | jq --slurp --compact-output "unique" \
        | mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php "$WIKI" \
            --page MediaWiki:NewcomerTasks.json \
            --json \
            --summary "machine-generated configuration for excluding sections from link recommendations ([[phab:$PHAB]]), feel free to improve" \
            link-recommendation.excludedSections \
            "$(cat)"
    echo "$ORIGIN/wiki/MediaWiki:NewcomerTasks.json"
    echo "$ORIGIN/w/index.php?title=MediaWiki:NewcomerTasks.json&diff=next"
    echo "Press <Enter> to continue"
    read # give time for manual verification
done

I checked the configuration and it seemed to be correctly updated in all wikis, aside from the wrong PHAB ticket :(. The only things worth mentioning are fjwiki, gcrwiki and gotwiki, which didn't get any excluded sections.

Change 892364 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: Enable link recommendation for 8th round wikis

https://gerrit.wikimedia.org/r/892364

Change 892363 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: Enable backend of link recommendation for 7,8,9th round wikis

https://gerrit.wikimedia.org/r/892363

Change 892364 abandoned by Sergio Gimeno:

[operations/mediawiki-config@master] GrowthExperiments: Enable link recommendation for 8th round wikis

Reason:

squashed in I81293b799ec5afe62a19ac2d79e0434047cf1be2

https://gerrit.wikimedia.org/r/892364

Change 892363 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: Enable backend of link recommendation for 7, 8, 9th round wikis

https://gerrit.wikimedia.org/r/892363

Mentioned in SAL (#wikimedia-operations) [2023-03-15T20:13:23Z] <samtar@deploy2002> Started scap: Backport for [[gerrit:899673|GrowthExperiments: enable frontend of link recommendation for 6th round wikis (T304550)]], [[gerrit:892363|GrowthExperiments: Enable backend of link recommendation for 7, 8, 9th round wikis (T304551 T308133 T308134)]]

Mentioned in SAL (#wikimedia-operations) [2023-03-15T20:14:55Z] <samtar@deploy2002> sgimeno and samtar: Backport for [[gerrit:899673|GrowthExperiments: enable frontend of link recommendation for 6th round wikis (T304550)]], [[gerrit:892363|GrowthExperiments: Enable backend of link recommendation for 7, 8, 9th round wikis (T304551 T308133 T308134)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-03-15T20:23:36Z] <samtar@deploy2002> Finished scap: Backport for [[gerrit:899673|GrowthExperiments: enable frontend of link recommendation for 6th round wikis (T304550)]], [[gerrit:892363|GrowthExperiments: Enable backend of link recommendation for 7, 8, 9th round wikis (T304551 T308133 T308134)]] (duration: 10m 12s)

Sgs updated the task description. (Show Details)
Sgs subscribed.
  • ff.wp returns "There were no results matching the query."
  • fiu-vro.wp returns "There were no results matching the query."
  • fj.wp returns "There were no results matching the query."
  • fo.wp returns "There were no results matching the query."
  • frp.wp returns "There were no results matching the query."
  • frr.wp returns "There were no results matching the query."
  • fur.wp returns "There were no results matching the query."
  • ga.wp returns "There were no results matching the query."
  • gag.wp returns "There were no results matching the query."
  • gcr.wp returns "There were no results matching the query."
  • gd.wp returns "There were no results matching the query."
  • gl.wp returns "There were no results matching the query."
  • glk.wp returns "There were no results matching the query."
  • gn.wp returns "There were no results matching the query."
  • gom.wp returns "There were no results matching the query."
  • gor.wp returns "There were no results matching the query."
  • got.wp returns "There were no results matching the query."
  • gu.wp returns "There were no results matching the query."
  • gv.wp returns "There were no results matching the query."

Finnish (fi.wp) works.

Change 905193 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: add link backend amends

https://gerrit.wikimedia.org/r/905193

  • gor.wp returns "There were no results matching the query."

I forgot to enable the backend for gorwiki in 892363

Finnish (fi.wp) works.

And I wrongly enabled fiwiki and ganwiki, which were meant to be excluded per T308133#8469595.

I'm amending these mistakes today and we can look into gorwiki in 24-48h

The rest of the models are working fine now. I can't tell why they took more time to produce results than expected. Our suspicion is that it is due to services being temporarily unavailable and the maintenance script failing to load the datasets into the database. Also, there have been some infrastructure changes (eqiad/codfw) that could have affected the pipeline. We'll keep monitoring this and open separate tasks for the required fixes.

Change 905193 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: add link backend amends

https://gerrit.wikimedia.org/r/905193

Mentioned in SAL (#wikimedia-operations) [2023-04-03T13:28:08Z] <taavi@deploy2002> Started scap: Backport for [[gerrit:905193|GrowthExperiments: add link backend amends (T308133)]]

Mentioned in SAL (#wikimedia-operations) [2023-04-03T13:29:28Z] <taavi@deploy2002> sgimeno and taavi: Backport for [[gerrit:905193|GrowthExperiments: add link backend amends (T308133)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-04-03T13:35:23Z] <taavi@deploy2002> Finished scap: Backport for [[gerrit:905193|GrowthExperiments: add link backend amends (T308133)]] (duration: 07m 15s)

I'm amending these mistakes today and we can look into gorwiki in 24-48h

I'll check everything on Wednesday.

If everything goes well, can we imagine deploying these wikis on Wednesday April 12 (next week)?

The rest of the models are working fine now. I can't tell why they took more time to produce results than expected. Our suspicion is that it is due to services being temporarily unavailable and the maintenance script failing to load the datasets into the database. Also, there have been some infrastructure changes (eqiad/codfw) that could have affected the pipeline. We'll keep monitoring this and open separate tasks for the required fixes.

Good to know! :)

Finnish (fi.wp) works.

And I wrongly enabled fiwiki and ganwiki, which were meant to be excluded per T308133#8469595.

Per this comment, fiwiki is supposed to be deployed, even if its precision is low:

Talked to @MGerlach about these results and agreed: not to deploy ganwiki; and deploy fiwiki since its precision is not too low and the recall is good.

Do you mean fywiki?

Do you mean fywiki?

No, I actually misread Kevin's comment and disabled fiwiki. I've scheduled a deploy to re-enable it today at 15h UTC+2. Sorry for the confusion.

If everything goes well, can we imagine deploying these wikis on Wednesday April 12 (next week)?

Sure!

Change 905950 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable add link frontend and backend

https://gerrit.wikimedia.org/r/905950

Change 905950 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable add link backend in wiki rounds (8,9th)

https://gerrit.wikimedia.org/r/905950

Mentioned in SAL (#wikimedia-operations) [2023-04-05T13:08:58Z] <lucaswerkmeister-wmde@deploy2002> Started scap: Backport for [[gerrit:905950|GrowthExperiments: enable add link backend in wiki rounds (8,9th) (T308133 T308134)]]

Mentioned in SAL (#wikimedia-operations) [2023-04-05T13:10:28Z] <lucaswerkmeister-wmde@deploy2002> lucaswerkmeister-wmde and sgimeno: Backport for [[gerrit:905950|GrowthExperiments: enable add link backend in wiki rounds (8,9th) (T308133 T308134)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-04-05T13:16:58Z] <lucaswerkmeister-wmde@deploy2002> Finished scap: Backport for [[gerrit:905950|GrowthExperiments: enable add link backend in wiki rounds (8,9th) (T308133 T308134)]] (duration: 08m 00s)

Do you mean fywiki?

No, I actually misread Kevin's comment and disabled fiwiki. I've scheduled a deploy to re-enable it today at 15h UTC+2. Sorry for the confusion.

It shouldn't be deployed now, as it was not announced to communities.

If everything goes well, can we imagine deploying these wikis on Wednesday April 12 (next week)?

Sure!

I tested the models, and they are all working except:

  • gor.wp returns "There were no results matching the query."

We can proceed, except for gor. I announced the deployment in Tech News for next week (April 12).

Change 907899 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] GrowthExperiments: enable add link frontend in 7,8th round wikis

https://gerrit.wikimedia.org/r/907899

Sgs removed Sgs as the assignee of this task. Apr 12 2023, 9:52 AM
Sgs added a subscriber: Tgr.

I tested the models, and they are all working except:

  • gor.wp returns "There were no results matching the query."

I'm still investigating this; the configuration is in place and the datasets seem correctly published, but there are several failures from the load-datasets.py script. These seem to be intentionally filtered out from our logstash dashboard; is the read-only mode exception expected? Support on how to debug this issue further is welcome, cc @kevinbazira @kostajh @Tgr

As a side note, one thing that could be worth adding to the exception logging is the wiki id.

Change 907899 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: enable add link frontend in 7,8th round wikis

https://gerrit.wikimedia.org/r/907899

Mentioned in SAL (#wikimedia-operations) [2023-04-12T13:07:11Z] <lucaswerkmeister-wmde@deploy2002> Started scap: Backport for [[gerrit:907899|GrowthExperiments: enable add link frontend in 7,8th round wikis (T304551 T308133)]]

Mentioned in SAL (#wikimedia-operations) [2023-04-12T13:08:33Z] <lucaswerkmeister-wmde@deploy2002> sgimeno and lucaswerkmeister-wmde: Backport for [[gerrit:907899|GrowthExperiments: enable add link frontend in 7,8th round wikis (T304551 T308133)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-04-12T13:20:42Z] <lucaswerkmeister-wmde@deploy2002> Finished scap: Backport for [[gerrit:907899|GrowthExperiments: enable add link frontend in 7,8th round wikis (T304551 T308133)]] (duration: 13m 30s)

I'm still investigating this; the configuration is in place and the datasets seem correctly published, but there are several failures from the load-datasets.py script. These seem to be intentionally filtered out from our logstash dashboard; is the read-only mode exception expected? Support on how to debug this issue further is welcome, cc @kevinbazira @kostajh @Tgr

I don't remember why we have that filter, but the errors are quite regular (there are two of them near :00 every hour), so presumably the cronjob is running on the wrong container.

tgr@mwmaint2002:~$ ack gorwiki /var/log/mediawiki/mediawiki_job_growthexperiments-refreshLinkRecommendations-s3/syslog.log

gives a bunch of "topic exhausted, 500 tasks still needed" messages. The link recommendation service does seem to be working:

Apr 16 15:18:34 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s3[12719]: gorwiki:      checking candidate Wikipedia... number of good links too small (1)
Apr 16 15:18:35 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s3[12719]: gorwiki:      checking candidate Wadala... All of the links in the recommendation have been pruned
Apr 16 15:18:35 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s3[12719]: gorwiki:      checking candidate Sulawesi... link recommendation already stored

so the maintenance script is getting recommendations from the service, just not good enough ones. Probably just a small wiki with not enough articles?
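
To get an overview instead of reading the log line by line, the outcomes can be counted per wiki. This is a rough sketch that assumes the "checking candidate <title>... <outcome>" line format shown above; the function name is made up, and the sample lines are just the three from the log excerpt with the job name shortened.

```shell
# Hypothetical helper: count candidate outcomes for one wiki.
# Reads log lines on stdin; $1 = wiki id (e.g. gorwiki).
# Prints "<count><TAB><outcome>" lines.
summarize_candidates() {
    awk -v prefix="$1:" '
        index($0, prefix) && /checking candidate/ {
            reason = $0
            # Strip everything up to and including "<title>... "
            sub(/^.*checking candidate .*\.\.\. /, "", reason)
            count[reason]++
        }
        END { for (r in count) printf "%d\t%s\n", count[r], r }
    ' | sort
}

# Sample input: the three log lines quoted in this comment.
summarize_candidates gorwiki <<'EOF'
Apr 16 15:18:34 mwmaint2002 job[12719]: gorwiki:      checking candidate Wikipedia... number of good links too small (1)
Apr 16 15:18:35 mwmaint2002 job[12719]: gorwiki:      checking candidate Wadala... All of the links in the recommendation have been pruned
Apr 16 15:18:35 mwmaint2002 job[12719]: gorwiki:      checking candidate Sulawesi... link recommendation already stored
EOF
```

On the real host the input would be the syslog.log file from the ack command above rather than a here-document.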

As a side note, one thing that could be worth adding to the exception logging is the wiki id.

Would be nice, although it wouldn't matter for the read-only issue, since the tables for all wikis are on the same server. Plus, these errors all occur for the checksum table, which is not wiki-specific. (Probably we just touch that one first.)

I'm still investigating this; the configuration is in place and the datasets seem correctly published, but there are several failures from the load-datasets.py script. These seem to be intentionally filtered out from our logstash dashboard; is the read-only mode exception expected? Support on how to debug this issue further is welcome, cc @kevinbazira @kostajh @Tgr

I don't remember why we have that filter, but the errors are quite regular (there are two of them near :00 every hour), so presumably the cronjob is running on the wrong container.

@JMeybohm helped me set up the cron job in rDEPLOYCHARTS13f33ca63a76: linkrecommendation: Cron job to load datasets.

The idea was to seamlessly handle data center switchover. The high-level concept is that if codfw becomes the active data center, then the cron job for the codfw deployment of linkrecommendation handles imports of data; when eqiad is the active data center, then the eqiad deployment of linkrecommendation app should handle it. The setup is:

  • a cron job to periodically import datasets exists and is active for both eqiad and codfw deployments of the linkrecommendation app
  • at any given time, one of the data centers is in read-only mode. Normally, that's codfw. So the cron job running for eqiad is periodically updating/importing new datasets to the linkrecommendation database table on the m2 server
  • at any given time, the cron job for the non-active data center is expected to return the "The MariaDB server is running with the --read-only option so it cannot execute this statement" error message.

What seems off here is that the read-only error is appearing for codfw, when that is the active datacenter. @Ladsgroup @Marostegui is it possible that the m2 server is only in eqiad, and not codfw? If so, then we should only define the cron job for the eqiad deployment of the linkrecommendation kubernetes app.

We don't switch over misc databases: https://orchestrator.wikimedia.org/web/cluster/alias/m2 The replica exists but doesn't get read/write traffic. So anytime you're making a change, it's going to make a cross-DC connection across the United States (with 72ms round-trip latency).

We don't switch over misc databases: https://orchestrator.wikimedia.org/web/cluster/alias/m2 The replica exists but doesn't get read/write traffic. So anytime you're making a change, it's going to make a cross-DC connection across the United States (with 72ms round-trip latency).

Ok, I filed T334928: linkrecommendation: Cron job should only run with eqiad deployment as a follow-up.
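
The intended behaviour for the follow-up could be sketched like this: only the deployment in the data center that can write to m2 runs the import. This is an illustration only; in practice the change would presumably live in the deployment configuration rather than in a shell guard, and the variable and function names here are hypothetical.

```shell
# Assumption from the discussion above: misc databases are not switched over,
# so only the eqiad deployment can write to m2.
WRITABLE_DATACENTER="eqiad"

# Hypothetical guard: succeed only in the data center that can write.
# $1 = data center this deployment runs in
should_load_datasets() {
    [ "$1" = "$WRITABLE_DATACENTER" ]
}

for dc in eqiad codfw; do
    if should_load_datasets "$dc"; then
        echo "$dc: run load-datasets.py"
    else
        echo "$dc: skip (would hit the read-only error)"
    fi
done
```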

Just to be sure: has anything of what you discussed impacted the deployment? :)

so the maintenance script is getting recommendations from the service, just not good enough ones. Probably just a small wiki with not enough articles?

gorwiki now shows ~350 recommendations, which seems fine for a small wiki. The fact that no recommendations were shown at the time we tested could be due to some flakiness in the script that we haven't identified yet. I'll take on T334928 before the next round of scaling (10th); hopefully that helps isolate the problem.

Just to be sure: has anything of what you discussed impacted the deployment? :)

No, the deployment was done normally and all wikis from 8th round except gorwiki have the frontend enabled.

Since gorwiki is working now, we could enable it within the 9th round, see T308134#8840607.

Sgs changed the task status from In Progress to Open. May 10 2023, 2:11 PM
Sgs updated the task description. (Show Details)
Sgs moved this task from In Progress to QA on the Growth-Team (Sprint 0 (Growth Team)) board.
Etonkovidova subscribed.

Checked guwiki, galwiki, gotwiki, and fjwiki: "add a link" works as expected (checked with a new account and also by enabling Homepage for an old account).

guwiki shows edits tagged with the "Suggested: add links" tag in Special:RecentChanges.