Hi. I'd like to create a catalog at Mix'n'Match for CONOR ID (P1280). Main page is , but I found an alternative one with a better interface, so here is a list of personal names and corporate names to import.
I've created catalog 948 myself (I hope Magnus doesn't mind). As P1280 is an authority control for people, I've included only the personal names. @Sporti, you may ask to create a property for corporations. PS: Your tool is great, Magnus.
Thanks @Gerwoman; however, you have imported people from only the first page of almost 600.
Fixed.
Shouldn't there be over 1170000 entries? Now it's at ~900000.
Yes, you are right. Perhaps Magnus can help us.
Also, Property talk:P1280 states it should include people and organizations.
I ran the same scraper again, and now it's 1150316; I suspect some pages don't load properly at random. I'm running it yet again to get them all.
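For illustration only, here is a minimal sketch of the "re-run until everything has loaded" idea, retrying just the pages that failed instead of the whole scrape. This is not how the Mix'n'Match scraper is implemented; `scrape_with_retries` and the `fetch_page` callable are hypothetical names, and the caller is assumed to supply a function that raises an exception when a page does not load.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple


def scrape_with_retries(
    fetch_page: Callable[[int], List[str]],
    page_numbers: Iterable[int],
    max_rounds: int = 3,
) -> Tuple[Dict[int, List[str]], Set[int]]:
    """Fetch every page, retrying only the pages that failed in earlier rounds.

    fetch_page is a hypothetical callable supplied by the caller; it returns
    the entries found on one page and raises an exception if the page fails.
    Returns (entries per page, page numbers that never loaded).
    """
    results: Dict[int, List[str]] = {}
    pending: Set[int] = set(page_numbers)
    for _ in range(max_rounds):
        if not pending:
            break
        failed: Set[int] = set()
        for page in sorted(pending):
            try:
                results[page] = fetch_page(page)
            except Exception:
                # Remember the page so the next round can try it again.
                failed.add(page)
        pending = failed
    return results, pending
```

The second return value lists the pages that still failed after `max_rounds` attempts, so it is at least visible where the gaps are.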
Now, I can have only one scraper per catalog. To get organizations as well, I could either:
- change the scraper to just find everything, but then it won't add a type (since I don't know the type) for new entries
- make a new catalog for organizations. That would be "cleaner", but then we have two catalogs for the property, plus "missing" sync reports between MnM and Wikidata
First, CONOR ID (P1280) would need to accept organisations too, as per the request at Wikidata:Property proposal/Archive/21#P1280; right now it's only set for people, and organizations were somehow forgotten. I think a new catalog would be better for organizations; it is also about 100 times smaller.
Changed CONOR ID (P1280) to allow organisations, so you can upload the corporate catalog as well.
Also, could a bot connect people based on their bibliography, for example, and an existing AC like VIAF?
@Gerwoman: Would you make another catalog for corporate names?
Created CONOR2
Thanks, but it's ~10700 out of 13700 entries, so some didn't get imported, just as before.
We need the help of Magnus, as before :)
@Magnus Manske: Can you run the scraper again, and maybe the first one as well? It is still missing ~7500 entries.
Fiddled with the scraper, now 12456 entries. Not sure which ones are missing.
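If a full list of IDs from the source site were available, one way to see which entries are missing would be a plain set difference against the IDs already in the catalog. This is only a sketch under that assumption; `missing_ids` is a hypothetical helper and the IDs in the usage example are made-up placeholders, not real CONOR IDs.

```python
from typing import Iterable, List


def missing_ids(source_ids: Iterable[str], imported_ids: Iterable[str]) -> List[str]:
    """Return the IDs present in the source list but absent from the catalog."""
    return sorted(set(source_ids) - set(imported_ids))


# Placeholder IDs for illustration only.
source = ["1001", "1002", "1003", "1004"]
imported = ["1001", "1003"]
print(missing_ids(source, imported))
```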