Description
Details
Subject | Repo | Branch | Lines +/- | |
---|---|---|---|---|
Remove usage of MEDIAWIKI_JOB_RUNNER constant | mediawiki/extensions/CirrusSearch | master | +6 -4 |
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Resolved | • Pchelolo | T157088 [EPIC] Develop a JobQueue backend based on EventBus | |||
Open | daniel | T175146 JobQueue: Unify JobRunner entry points | |||
Resolved | • Clarakosi | T244826 Create rest endpoint for executing jobs instead of /rpc/RunSingleJob | |||
Resolved | • Clarakosi | T244828 Investigate alternatives to MEDIAWIKI_JOB_RUNNER global | |||
Resolved | EBernhardson | T247129 Remove usage of MEDIAWIKI_JOB_RUNNER from CirrusSearch extension |
Event Timeline
Based on the provided ticket, no replacement will be offered. I think it's still important to be able to differentiate web requests from job runner requests. As a relatively common example, a count of web requests that invoked cirrussearch isn't possible from these logs if job runner logs report the same execution context as web requests. For reference these are not standard operational logs that only go to logstash, these are logs that feed our analytics and ML pipelines.
I think the keyword here is "web request". Define web requests?
Do you consider some or all of rest.php, thumb.php, api.php, or load.php to be web requests? Under the current scheme, these were "web" except for api.php.
In the above proposal, "web" would be limited to "index.php", "api" remain the same, and the others under "misc". Would that work for you? If not, is there another way this can be grouped that would work for you?
Are these Hadoop queries automated or is it by hand? In my strawmen proposal, the two user-related strings "web" and "api" for the same log field would continue to exist.
If you're currently filtering out "cli", one would filter out "misc" instead - or simply look only for "web"/"api".
Detecting job runner explicitly will no longer be possible as jobs would no longer have their own entry point. They can run under any of cli, rpc/…php, or (soon) via w/rest.php/…/job/execute. For CirrusSearch, should rest.php get its own string label, or fall under one of api or misc or web. If under api or web beware that this would include jobs in the future. If placed under misc or under its own label, beware that endpoints mainly existing via api today, might in the future (also) have similar endpoints under rest.php.
Definitions are of course hard, but in the end what the analytics this feeds cares about is executions that directly return results to a user. I imagine only index.php would be fine, it looks like in the years since this code was written and new defines have been added to index.php that we can key off of.
In the above proposal, "web" would be limited to "index.php", "api" remain the same, and the others under "misc". Would that work for you? If not, is there another way this can be grouped that would work for you?
Are these Hadoop queries automated or is it by hand? In my strawmen proposal, the two user-related strings "web" and "api" for the same log field would continue to exist.
If you're currently filtering out "cli", one would filter out "misc" instead - or simply look only for "web"/"api".
Detecting job runner explicitly will no longer be possible as jobs would no longer have their own entry point. They can run under any of cli, rpc/…php, or (soon) via w/rest.php/…/job/execute. For CirrusSearch, should rest.php get its own string label, or fall under one of api or misc or web. If under api or web beware that this would include jobs in the future. If placed under misc or under its own label, beware that endpoints mainly existing via api today, might in the future (also) have similar endpoints under rest.php.
As you've surmised we don't actually care that they are job runner, we simply need job runner to be in some easily filtered class of things. I will work out switching this over to detect index.php directly and double check that none of the outputs change in surprising ways.
Change 579305 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/extensions/CirrusSearch@master] Remove usage of MEDIAWIKI_JOB_RUNNER constant
Change 579305 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Remove usage of MEDIAWIKI_JOB_RUNNER constant