Add FactGrid to WDQS allowlist
Open, Medium, Public

Description

FactGrid is a Wikibase instance that has been collecting historical and research data since 2018. We requested to be added to the federation allowlist in 2022, but AFAICT this never happened (I guess the request came right at the unfortunate time when the previous process had been abandoned but nobody had yet been told to create a Phabricator task instead).

Event Timeline

It would be cool if we could come up with some sort of criteria for which Wikibases we want to allow query federation with.
I imagine most important Wikibases are now listed in wikibase.world, and we could likely just automatically generate a suggested allowlist for use in query services from some defaults plus the instances listed there?

e.g. https://wikibase.world/wiki/Item:Q30
states 1 million pages, etc.
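
A minimal sketch of what such generation could look like, assuming the instance records have already been exported from wikibase.world into plain Python objects. The `Instance` fields, the example endpoints, the page threshold, and the licence check are all hypothetical placeholders for illustration, not an existing API or an agreed policy:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Instance:
    """Hypothetical record for one Wikibase listed on wikibase.world."""
    name: str
    sparql_endpoint: str
    page_count: int
    licence: Optional[str]  # None when no licence is recorded


# Assumed values for illustration only; real criteria would need to be agreed on.
DEFAULT_ENDPOINTS = ["https://default.example/sparql"]
ACCEPTED_LICENCES = {"CC0-1.0"}
MIN_PAGES = 1_000_000


def suggest_allowlist(
    instances: Iterable[Instance],
    defaults: Iterable[str] = DEFAULT_ENDPOINTS,
    min_pages: int = MIN_PAGES,
    licences: Set[str] = ACCEPTED_LICENCES,
) -> List[str]:
    """Return the default endpoints plus every listed instance meeting the criteria."""
    suggested = list(defaults)
    for inst in instances:
        if inst.page_count < min_pages:
            continue  # too small under the assumed size criterion
        if inst.licence not in licences:
            continue  # missing or non-accepted licence
        if inst.sparql_endpoint not in suggested:
            suggested.append(inst.sparql_endpoint)  # keep order, avoid duplicates
    return suggested


if __name__ == "__main__":
    listed = [
        Instance("Big open base", "https://big.example/sparql", 2_000_000, "CC0-1.0"),
        Instance("Small base", "https://small.example/sparql", 5_000, "CC0-1.0"),
        Instance("Unlicensed base", "https://nolicence.example/sparql", 3_000_000, None),
    ]
    for endpoint in suggest_allowlist(listed):
        print(endpoint)
```

The output would only be a suggestion: someone would still need to review it before anything lands in the actual allowlist.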

I think the license was historically a bigger issue than the number of pages.

In general, the contents of the allowlist for Wikidata is one of those topics that currently falls between the cracks: it is neither clearly a Wikidata community decision nor purely an engineering team decision. There seems to have been a community discussion here about 4 years ago, which doesn't look fully concluded to me. It also references T265290 as the ticket to determine this process.

@Addshore
Is there some kind of curation/governance happening in wikibase.world?
Example: if I create a Cloud instance, dump 2 million random pages into it and add it to wikibase.world, will the community there notice that this is not a real data set and shouldn't be endorsed?

@Tarrow
It feels like this discussion will eventually merge with the 'community-centered discussions around data governance policies'. Both are about endorsing certain kinds of data sets in the ecosystem, and it seems like there will have to be a single decision-making process in the end.

Governance, no, not really. It's primarily there to enable discoverability and an overview of the ecosystem. It's not trying to tell you anything about the quality of data within a Wikibase (at least not yet), etc.

Number of pages was an example, but adding licence information to wikibase.world likely also makes sense.
There is also information showing which Wikibases link to each other (connected data), which could be used to determine some form of value of the data held within, too, for example.