Topic on User talk:Magnus Manske/Structured Discussions Archive 1

Add database of CONOR to Mix'n'Match

Sporti (talkcontribs)

Hi. I'd like to create a catalog at Mix'n'Match for CONOR ID (P1280). Main page is , but I found an alternative one with a better interface, so here is a list of personal names and corporate names to import.

Gerwoman (talkcontribs)

I've created catalog 948 myself (hope Magnus doesn't mind). As P1280 is an authority control for people, I've included only the personal names. @Sporti, you may ask for a property to be created for corporations. PS: Your tool is great, Magnus.

Sporti (talkcontribs)

Thanks, @Gerwoman; however, you have imported people from only the first of almost 600 pages.

Sporti (talkcontribs)
Magnus Manske (talkcontribs)

I ran the same scraper again, and now it's 1150316; I suspect some pages randomly fail to load properly. I'm running it yet again to get them all.
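The "run it again until the pages load" approach can be sketched as a retry loop. This is only an illustration, not the actual Mix'n'Match scraper: `fetch_page` is a hypothetical callable that returns `(entry_id, name)` pairs for one listing page and raises `OSError` when the page fails to load, and the flaky fetcher below merely simulates random load failures in a repeatable way.

```python
def scrape_all(fetch_page, page_numbers, max_rounds=5):
    """Collect entries from every page, retrying only the pages that failed.

    fetch_page is a hypothetical callable: it returns an iterable of
    (entry_id, name) pairs for a page, or raises OSError on a failed load.
    Returns (entries, pending), where pending lists pages that never loaded.
    """
    entries = {}
    pending = list(page_numbers)
    for _ in range(max_rounds):
        if not pending:
            break
        still_failing = []
        for n in pending:
            try:
                page_entries = fetch_page(n)
            except OSError:
                still_failing.append(n)  # try this page again next round
                continue
            for entry_id, name in page_entries:
                entries[entry_id] = name  # keyed by ID, so reruns don't duplicate
        pending = still_failing
    return entries, pending


def make_flaky_fetcher():
    """Simulated source: odd-numbered pages fail on their first attempt."""
    attempts = {}

    def fetch_page(n):
        attempts[n] = attempts.get(n, 0) + 1
        if n % 2 == 1 and attempts[n] == 1:
            raise OSError(f"page {n} did not load")
        return [(f"{n}-{i}", f"Name {n}-{i}") for i in range(3)]

    return fetch_page


entries, pending = scrape_all(make_flaky_fetcher(), range(1, 5))
```

Keeping the list of pages that still fail after the last round makes it possible to tell an incomplete import apart from a finished one.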

Now, I can have only one scraper per catalog. To get organizations as well, I could:

  • change the scraper to just find everything, but then it won't add a type for new entries (since I don't know the type)
  • make a new catalog for organizations. That would be "cleaner", but then we would have two catalogs for the property, plus "missing" sync reports between MnM and Wikidata
Sporti (talkcontribs)
Sporti (talkcontribs)

Changed CONOR ID (P1280) to allow organisations, so you can upload the corporate catalog as well.

Also, could a bot connect people based on their bibliography, for example and existing AC like VIAF?

Sporti (talkcontribs)

@Gerwoman: Would you make another catalog for corporate names?

Gerwoman (talkcontribs)
Sporti (talkcontribs)

Thanks, but it's ~10700 out of 13700 entries, so some didn't get imported, just as before.

Gerwoman (talkcontribs)

We need the help of Magnus, as before :)

Sporti (talkcontribs)

@Magnus Manske: Can you run the scraper again, and maybe the first one as well? It is still missing ~7500 entries.

Magnus Manske (talkcontribs)

Fiddled with the scraper, now 12456 entries. Not sure which ones are missing.
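One way to find out which entries are missing is to compare the IDs listed at the source with the IDs already in the catalog. A minimal sketch, assuming both ID lists can be exported somehow; `missing_ids` and the sample IDs are made up for illustration and are not part of Mix'n'Match:

```python
def missing_ids(source_ids, catalog_ids):
    """Return the IDs present at the source but absent from the catalog."""
    return sorted(set(source_ids) - set(catalog_ids))


# Made-up sample IDs, for illustration only.
source = ["1001", "1002", "1003", "1004"]
catalog = ["1001", "1003"]
gaps = missing_ids(source, catalog)
```

The resulting list could then be fed back to the scraper, so that only the gaps are fetched again.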