Rdmpage
I am Roderic D. M. Page (Q7356570). You can find me on Twitter as @rdmpage, and my blog iPhylo lists my current projects.
Things to fix
Duplicate ISBNs
From https://www.wikidata.org/wiki/User:Maxlath/P212_unique_value_constraint_violations_by_user
- 978-0-444-98921-5: Q99641558 Q99654548 Q99654549 Q99654550 Q99657116 Q99657118
- 978-0-7923-7048-2: Q101572043 Q101572047 Q101572051 Q99653208
- 978-3-030-31166-7: Q101630225 Q101630228 Q101630232 Q101630235
- 978-3-642-16410-1: Q101572050 Q101572053 Q101572057 Q99642093
- 978-90-474-2775-9: Q99641698 Q99654800 Q99657236 Q99657237
- 978-90-481-5001-4: Q99962247 Q99962254
- 978-90-481-5848-5: Q99966123 Q99975333
- 978-90-481-9961-7: Q101572059 Q101572063 Q101572064 Q101572065 Q101572069 Q99657252
- 978-94-009-8634-3: Q101630321 Q99572619
- 978-94-009-8643-5: Q101572092 Q101572094 Q101572095
- 978-94-010-3887-4: Q99975741 Q99975758
- 978-94-010-5568-0: Q99952950 Q99952954 Q99966555 Q99967736 Q99967740 Q99968202
- 978-94-010-7579-4: Q101572072 Q101572076 Q101572077 Q99641573
- 978-94-010-8294-5: Q101572081 Q101572083 Q101572084 Q101572088
- 978-94-015-2062-1: Q99961998 Q99973900
- 978-981-10-6982-6: Q101630343 Q101630346 Q101630347
- Ichneutica sistens (Q104214244) uses first valid description (Q1361864) for the original name but applies it to a more recent combination. The original name is Eumichtis sistens. Note also that the original publication for that name New species, &c., of heterocerous Lepidoptera from Canterbury, New Zealand, collected by Mr. R. W. Fereday (Q104214297) cites a series of page ranges, so it is a set of articles, not just one. Need to explore how to fix this, and how to make sure that the interpretation of first valid description (Q1361864) is clear.
Taxa versus species
Norops duellmani (Q6450757) and Anolis duellmani (Q2814307) are the same taxon. In this example different wikis link to different names, and sometimes the page names don't match.
Essays
Wikidata
How to add references Help:Sources
Place where WikiCite-related stuff gets discussed. Wikidata talk:WikiProject Source MetaData
Deletions
See https://phabricator.wikimedia.org/T291659#7402884 (statement) and this query:
SELECT DISTINCT ?item ?itemLabel WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "www.wikidata.org".
    bd:serviceParam wikibase:api "Generator".
    bd:serviceParam mwapi:generator "links".
    bd:serviceParam mwapi:titles "Wikidata:Requests for deletions".
    bd:serviceParam mwapi:gpllimit "max".
    bd:serviceParam mwapi:gplnamespace "0".
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item wdt:P6944 ?id .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Property proposals
Ones I've made or have been involved in.
Wikidata:Property_proposal/Authority_control#National_Diet_Library_Persistent_ID
Wikidata:Property_proposal/Institut_de_recherche_pour_le_développement_(IRD)_identifier
Data quality
SPARQL
See User:Succu/SPARQL for lots of relevant examples.
Deprecation
See Help:Deprecation. One example of using this would be to correct dates of articles where CrossRef has got the date wrong (e.g., Wiley metadata).
- cannot be confirmed by other sources (Q25895909)
- not been able to confirm this claim (Q21655367)
- conflation (Q14946528)
- incorrect value (Q41755623)
- item/value with less precision and/or accuracy (Q42727519)
Withdrawn identifiers
E.g., if an ISSN has been cancelled, use reason for deprecated rank (P2241) with the value withdrawn identifier value (Q21441764).
Deprecated identifiers
See Haplostoma humesi, New Species (Copepoda: Cyclopoida: Ascidicolidae), Associated with a Compound Ascidian (Aplidium Sp.) from Madagascar (Q104118218) for an example with two DOIs.
Duplicates
Trying to clean up Zootaxa. First example: The type specimens of Tachinidae (Diptera) housed in the Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”, Buenos Aires (Q29469527) and The type specimens of Tachinidae (Diptera) housed in the Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”, Buenos Aires (Q35800184).
The problem is that Revision of the neotropical Exoristini (Diptera, Tachinidae): the status of the genera Epiplagiops and Tetragrapha (Q35950285) cites both, but they are not merged in that item's list of cited works.
- I got bored and manually fixed the duplication.
Another test case: Revision of Zorion Pascoe (Coleoptera: Cerambycidae), an endemic genus of New Zealand (Q79429489) and Revision of Zorion Pascoe (Coleoptera: Cerambycidae), an endemic genus of New Zealand (Q28939048), where one has CrossRef DOI and is cited by A checklist of New Zealand Cerambycidae (Insecta: Coleoptera), excluding Lamiinae (Q56166058), the other has Zenodo DOI, has authors instead of author strings, and is linked to a taxon Zorion taranakiensis (Q14848194).
- Merged
Zootaxa
Notes on de-duplicating Zootaxa. The counts here come from a local database I made, and will be out of date, especially as Succu is working through duplicates manually. "x" is not entered yet.
Year | No. articles in WD at start | Duplicates | No. in CrossRef | Query | Notes |
---|---|---|---|---|---|
2001 | 17 | 0 | 20 | https://w.wiki/WJ8 | |
2002 | 105 | 0 | 107 | https://w.wiki/WJD | |
2003 | 258 | 0 | 269 | https://w.wiki/XMP | |
2004 | 261 | 0 | 388 | https://w.wiki/XA6 | |
2005 | 424 | 1 | 583 | https://w.wiki/XAQ | Duplication is ZENODO DOI record Q79429489 of record Q28939048 |
2006 | 10 | 0 | 851 | https://w.wiki/XAY | Q88363212 has only a Zenodo DOI |
2007 | 167 | 0 | 1067 | https://w.wiki/XDb | |
2008 | 22 | 0 | 1111 | https://w.wiki/XE2 | |
2009 | 19 | 0 | 1466 | https://w.wiki/XHJ | |
2010 | 59 | 0 | 1416 | https://w.wiki/XJV | |
2011 | 47 | 0 | 1650 | https://w.wiki/XMW | |
2012 | 43 | 0 | 1893 | https://w.wiki/XMZ | Q29470959 has two PMIDs both valid but clearly duplicates |
2013 | 2144 | 1237 | 2123 | https://w.wiki/XMb | Q29468684 had duplicate PMIDs, one wrong. Q30860840 has a DOI but CrossRef metadata is wrong (title is that of preceding article). A number of 2013 articles have DOIs that don't resolve and hence DOI field in Wikidata is not populated. |
2014 | 2017 | 3 | 2027 | https://w.wiki/XMc | |
2015 | 2338 | 0 | 2341 | https://w.wiki/XMd | |
2016 | 2315 | 5 | 2334 | https://w.wiki/XMe | Q28821788 has two PMIDs both valid but clearly duplicates |
2017 | 2241 | 9 | 1859 | https://w.wiki/XMf | Note that Wikidata has more than CrossRef, check what happened here. |
2018 | 2231 | 814 | 2322 | https://w.wiki/XMg | |
2019 | 2408 | 9 | 2505 | https://w.wiki/XMh | |
2020 | 1242 | 0 | 1106 | https://w.wiki/XMj |
References
To add references for a statement in Wikidata using Quickstatements:
Q36504420 P21 Q6581072 S248 Q28948401
Note "S" instead of "P" for "stated in" property. To see the result in Wikidata see Andreja Kofol-Seliger (Q36504420).
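A line like the one above can also be generated programmatically. The following is a minimal Python sketch, assuming the tab-separated QuickStatements V1 text format; the helper name `qs_statement_with_source` is hypothetical and not part of any tool.

```python
def qs_statement_with_source(item, prop, value, source_prop, source_value):
    """Return one tab-separated QuickStatements line with a single reference.

    The reference property is written with an "S" prefix instead of "P",
    as in the example above (P248 "stated in" becomes S248).
    Assumption: the output targets the tab-separated V1 text format.
    """
    if not source_prop.startswith("P"):
        raise ValueError("source property must look like 'P248'")
    source_column = "S" + source_prop[1:]
    return "\t".join([item, prop, value, source_column, source_value])


line = qs_statement_with_source("Q36504420", "P21", "Q6581072", "P248", "Q28948401")
print(line.replace("\t", " "))
```

Printing with spaces instead of tabs reproduces the example line shown above.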
Examples
Homonyms
Homonyms can be linked to replacement names, and to each other.
Authority control
Authority control
Geography
Wikidata:List of properties/geography
Can store GeoJSON in Wikicommons, e.g. Commons:Data:BioStor/95684.map. Need to create page manually then add data (create page via a red link).
Can link to Wikidata using geoshape (P3896)
Need to figure out how to retrieve GeoJSON for use in applications.
See also Wikidata:Property_proposal/distribution_map_of_taxon
Taxonomic properties
See also Template:Taxonomy_properties, Wikidata:WikiProject Taxonomy, and Wikidata:WikiProject Taxonomy/Tutorial.
problems
Wikidata conflates names and taxa; see wikipedia:User:Peter_coxhead/Wikidata_issues and Wikidata:Property_proposal/taxon_synonym_string. Is it possible to resolve this?
For a summary of properties see Template:Taxa Versus Names.
properties
Properties of a taxon (Q16521)
taxon name (P225) taxon rank (P105) parent taxon (P171) taxon common name (P1843)
taxon synonym (P1420) taxon range map image (P181) hybrid of (P1531)
Examples: Synalpheus pinkfloydi (Q29367343)
geography
type locality
type locality (biology) (P5304)
Locality must be a Wikidata item. Qualifiers include object named as (P1932), point in time (P585), and coordinate location (P625) (e.g., Solanum aspersum (Q1305990)).
We can create a map of type localities: Try it!
nomenclature
- replaced synonym (for nom. nov.) (P694)
- year of publication of scientific name for taxon (P574)
- has basionym (P566)
- original combination (P1403)
- subject has role (P2868) (protonym)
- taxonomic type (P427)
- species nova (Q27652812) (has main subject (P921))
- P5326 Search
nomenclature entities
protonym (Q14192851) basionym (Q810198)
taxon identifiers
ITIS TSN (P815) Encyclopedia of Life ID (P830) NCBI taxonomy ID (P685) GBIF taxon ID (P846) IPNI plant ID (P961) Plant List ID (Royal Botanic Gardens, Kew) (P1070) IUCN taxon ID (P627) Tropicos ID (P960) WCSPF ID (P3591) Avibase taxon ID (P2026) MSW ID (P959) BOLD Systems taxon ID (P3606) MycoBank taxon name ID (P962) Index Fungorum taxon ID (P1391)
page level link between name and BHL
Use BHL page ID (P687) as a reference for a taxon name (P225), e.g. https://www.wikidata.org/wiki/Q11577#P225
genome
sequenced genome URL (P6800) for link to genome, e.g. Asian tiger mosquito (Q477918) has https://metazoa.ensembl.org/Aedes_albopictus
Taxonomic examples
User:Achim_Raschka/Erstbeschreibungen has a list of new species descriptions, see also query that generates a timeline: [1]
Nice examples for literature mapping
- http://localhost/~rpage/ipni-names/index.php?q=Calanthe
- http://localhost/~rpage/ipni-names/index.php?q=Schismatoglottis
- http://ispecies.org/?q=Yoania
- http://ispecies.org/?q=Crepidium
Interesting cases
- Trimeresurus xiangchengensis (Q3020693) taxon name is Protobothrops xiangchengensis but label is Trimeresurus xiangchengensis.
Names that aren't taxa
There are cases where names have an entry but they are not instances of taxa, e.g., Satsuma chalybeia (Q25351070), which is described as a "name that may not be used" and is an instance of unavailable combination (Q17487588), synonym (Q1040689), and Satsuma Murray (1874) non Adams (1868) (Q25661771) (!). Looks like an attempt to separate names from taxa...
What this also means is that we can't rely on simply searching for things that are instances of taxon (Q16521) when looking to match names.
Traits
JSTOR
JSTOR content on Internet Archive https://archive.org/details/jstor_ejc
Major journal projects
https://www.wanfangdata.com.cn/sns/perio/xbzwxb/?tabId=article&publishYear=2020&issueNum=12&isSync=0&page=2 and CNKI. DOIs for some articles but they don't seem to be resolving. For example https://d.wanfangdata.com.cn/periodical/xbzwxb202101001 has DOI http://dx.chinadoi.cn/10.7606/j.issn.1000-4025.2021.01.0001 which doesn't resolve.
Years | Volumes | Source | Pagination? | Notes | Status |
---|---|---|---|---|---|
1981-2021 | | CNKI | Yes | | adding |
1999-2009 | | Wanfang | Yes | DOIs 10.3321 ISTIC | Added |
2012-2013 | | Wanfang | Yes | DOIs 10.3969 ISTIC | Added |
2014-2016 | | Wanfang | Yes | DOIs 10.7606 ISTIC | Not added yet |
Several name changes, multiple data sources, Chinese and English, lack of pagination data in some cases, mixed DOI agencies, not all DOIs resolve. PDFs available. Oh the fun we will have...
Can use Internet Archive to extract page numbers e.g., SOME NEW TAXA OF OLEACEAE FROM TIBET,CHINA (Q106563578) from https://archive.org/download/plantdiversity-0253-2700-32749/plantdiversity-0253-2700-32749_page_numbers.json
One challenge is the overlap between Wanfang volumes 21-26 and journal.kib.ac.cn. Wanfang has pages and DOIs, journal.kib.ac.cn doesn't have either, but does have PDFs. So we need to somehow crosslink these :(
Years | Volumes | Source | Pagination? | Notes | Status |
---|---|---|---|---|---|
1999-2010 | 21-32 | Wanfang Data | Yes | DOIs, mostly 10.3969 but some 10.3724 | Added using scraped data, then map to journal.kib.ac.cn URLs, some manual editing of DOIs |
2011-2015 | 33-37 | Wanfang Data | Yes | PLANT DIVERSITY AND RESOURCES, DOIs 10.3724 | |
1979-2020 | 1-42 | http://journal.kib.ac.cn | Mostly (2005 onwards, before that, no) | All called Plant Diversity, some DOIs 10.1016, 10.7677, 10.3724, every article has a URL | Use to add Plant Diversity And Resources (2011-2015), and enhance Wanfang data 1999-2010. |
1979-2015? | Example | CNKI | No, nor does it have volumes | http://www.cnki.com.cn/Journal/A-A6-YOKE-1979.htm | Incomplete coverage, use to enhance by add CNKI ids where possible. |
2016-2020 | 38-42 | Elsevier | Yes | Plant Diversity, DOIs 10.1016 | Add from CrossRef |
Prefix | Who | Agency | Link |
---|---|---|---|
10.7677 | wanfangdata | ISTIC | https://doi.org/10.7677 |
10.3969 | wanfangdata | ISTIC | https://doi.org/10.3969 |
10.1016 | crossref | crossref | https://doi.org/10.1016 |
10.3724 | Crossref - Science Press | crossref | https://doi.org/10.3724 |
Lots of articles already, some with CNKI DOIs. Need to explore further, and add PDFs and other links.
It looks like the links I had in BioNames are now gone, so need to remap. Stevenliuyi did some amazing work adding articles with CNKI DOIs, but these lack volume numbers and pagination, so need to add those. I'd also like to archive and link to the PDFs. Rdmpage (talk) 11:50, 21 July 2021 (UTC)
Note that DOI dates are often incorrect: they look to be the dates the article went online, not when it was published!
CNKI has issued DOIs for the complete journal (as far as I can determine). The journal also has a home page http://gswxb.cnjournals.cn/gswxben/home that has PDFs and also includes DOIs. The HTML has citation_ tags but they are poorly implemented, with author footnotes included, no DOI or pagination, etc.
OK, more complicated. CNKI has complete journal, but DOIs have been issued by different sources.
Years | Volumes | Journal | DOIs | Notes |
---|---|---|---|---|
1953-1998 | 1 - | Acta Palaeontologica Sinica | CNKI | DOI 10.19800/j.cnki.aps... |
1999-2009? | | Acta Palaeontologica Sinica | ISTIC | Wanfang DOI 10.3969/J.ISSN.0001-6616.2009.04.003 |
2010- | 49- | Acta Palaeontologica Sinica | CNKI | DOI 10.19800/j.cnki.aps... |
Name changes, ISSN changes, etc. Also in CiNii.
Years | ISSN | Source | DOIs | Notes |
---|---|---|---|---|
1932-2001 | 0001-6799,2189-7050 | J-Stage | 10.18942/bunruichiri... JaLC | Acta phytotaxonomica et geobotanica / 植物分類, 地理 Acta Phytotaxonomica et Geobotanica (Q100375972) |
2001- | 1346-7565,2189-7042 | J-Stage | 10.18942/apg... JaLC | Acta Phytotaxonomica et Geobotanica (APG) / 植物分類,地理 Acta Phytotaxonomica et Geobotanica (Q5656888) |
1979-2020 | 2189-7034,1346-6852 | J-Stage | | 分類 : bunrui : 日本植物分類学会誌 / Bunrui Bunrui (Q40186046) |
Acta Phytotaxonomica Sinica (Q5656885) and Journal of Systematics and Evolution (Q15733644): multiple web sites, multiple DOIs, etc. See Acta Phytotaxonomica Sinica: A Bibliographic Summary of Published Volumes (Q28955370) for some earlier history. See http://gb.oversea.cnki.net/kcms/detail/detail.aspx?dbCode=cjfd&QueryID=4&CurRec=1&filename=ZWFX200801001&dbname=CJFD2008 for title change.
Years | Volumes | Journal | DOIs | Notes |
---|---|---|---|---|
2009- | 47- | Journal of Systematics and Evolution | CrossRef | DOI 10.1111/ ISSN 1674-4918 |
2008-2009 | 46 | Journal of Systematics and Evolution | ? | DOI 10.3724/SP.J... ISSN |
2005-2007 | 43-45 | Acta Phytotaxonomica Sinica | CrossRef | DOI 10.1360 ISSN 0529-1526 |
1951-2004 | 1-42 | Acta Phytotaxonomica Sinica |
ISSN 0001-7302, needs lots of work, became Current Zoology (Q15749150). Local database (publications) has CNKI URLs such as http://www.cnki.com.cn/Article/CJFDTOTAL-BEAR200806023.htm which now break, can be rewritten as https://oversea.cnki.net/kcms/detail/detail.aspx?dbcode=CJFD&filename=BEAR200806023, which also has page information.
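The URL rewrite described above can be sketched in Python. This is an illustration only: the function name `rewrite_cnki_url` is hypothetical, and the pattern is inferred from the single example URL given here, so other CNKI URL shapes may need different handling.

```python
import re

# Pattern for the old-style article URLs quoted above, e.g.
# http://www.cnki.com.cn/Article/CJFDTOTAL-BEAR200806023.htm
# (assumption: the part after "CJFDTOTAL-" is the CNKI filename).
OLD_CNKI = re.compile(
    r"^https?://www\.cnki\.com\.cn/Article/CJFDTOTAL-([A-Za-z0-9]+)\.htm$"
)


def rewrite_cnki_url(old_url):
    """Map an old cnki.com.cn article URL to the oversea.cnki.net form.

    Returns None when the URL does not match the expected pattern, so
    unexpected URLs can be reviewed by hand rather than silently rewritten.
    """
    match = OLD_CNKI.match(old_url.strip())
    if match is None:
        return None
    filename = match.group(1)
    return ("https://oversea.cnki.net/kcms/detail/detail.aspx"
            "?dbcode=CJFD&filename=" + filename)


print(rewrite_cnki_url("http://www.cnki.com.cn/Article/CJFDTOTAL-BEAR200806023.htm"))
```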
Also links (but no DOIs) in Wanfang https://d.wanfangdata.com.cn/periodical/dwxb200801015 Also links in CQVIP e.g. http://www.cqvip.com/qk/94056x/200801/26635405.html
Name changes, ISSN changes, etc. Acta Zootaxonomica Sinica (Q15761826) and Zoological Systematics (Q21386166). Lack of pagination data. Lots of articles already added by Stevenliuyi with the CNKI CJFD journal article ID (P6769) identifier, but these include old articles in Acta Zootaxonomica Sinica (Q15761826) linked to the new name for the journal, Zoological Systematics (Q21386166).
I've started to move the pre-2014 articles to Acta Zootaxonomica Sinica (Q15761826) and to add the Wanfang DOIs. Pagination will have to be added later.
2020-11-14 Danger Will Robinson! CNKI and Wanfang article numbering is different, so can't rely on simply mapping Wanfang URL to CNKI even though they look very similar :(. Will need to be cleverer about the mapping...
Years | ISSN | Source | DOIs | Notes |
---|---|---|---|---|
1964- | 1000-0739 | CNKI | | Acta Zootaxonomica Sinica (Q15761826) |
1998-2013 | 1000-0739 | WanFang | 10.3969/j.issn.1000-0739... | Not all have DOIs |
2014- | 2095-6827 | http://www.zootax.com.cn | 10.11865/zs. | DOIs don't seem to resolve? |
This journal has suffered from having BHL treat issues as titles, so that Wikidata has separate items for articles that have poor metadata and aren't linked to the journal. Seems to be restricted to pre-1923 content, will need to check. Given that BHL continues to do this, will need to spend some time linking to newer BHL and IA content...
- I've linked most (all?) stray pre-1923 articles to Handles and the journal itself. 2020-12-18
Many articles with PMID but no DOI (and also badly translated English titles). See https://w.wiki/55Ld for query for PMID but no DOI. I have matched PMIDs to records locally, need to move this to Wikidata.
Article A revision of the genus Heliophanus C. L. Koch, 1833 (Aranei: Salticidae) (Q60864671) has a bad Internet Archive ID (P724) as it contains []. Need to fix.
We have lots of BioOne DOIs for articles before 2000, which is when BioOne first has content for this journal. Prior to 2000 all DOIs that resolve to content are JSTOR DOIs. Looks like BioOne fed CrossRef a bunch of DOIs for which it never had content, hence CrossRef metadata has lots of duplicate DOIs. Some of these are in Wikidata... what a mess.
- I've added JSTOR DOIs to those pre-2000 articles that only had "fake" BioOne DOIs, and added all JSTOR-DOI content prior to 2000 Rdmpage (talk) 13:46, 4 April 2022 (UTC)
Note that we also have some BHL DOIs for this journal, and also there are a small number of articles in Horizon.
There is overlap between Persee and lasef.org. CrossRef from 2018 points to lasef.org; the landing page is the volume, not the individual article.
So we have Persee and lasef.org overlap 2008-2016, 2017 is just lasef.org, then 2018 - CrossRef
Years | Source | DOIs | Notes |
---|---|---|---|
1896-2016 | Persee https://www.persee.fr/collection/bsef | Some articles from 1896 have DOIs 10.3406/bsef. | Post 2016 are embargoed so no metadata, all PDFs have a captcha in front of them. |
2008- | https://lasef.org/ | | PDFs freely available (most recent under embargo) |
2018- | CrossRef | 10.32475/bsef. |
Some have DOIs e.g. Five new species of the genus Trichotichnus from Taiwan (Coleoptera, Carabidae, Harpalini) (Q111523062) 10.20643/00001606 which leads to a repository https://omnh.repo.nii.ac.jp
Need to sort out ISSNs and history
Also bad things have happened involving mismatch between metadata (titles and pages) and Internet Archive PDFs. Need to review all articles and sort this out :(
For example Cloning and Gene Expression of 3-Hydroxy-3-Methylglutaryl-CoA Synthase Gene( AsHMGS ) from Aquilaria sinensis (Lour.) Gilg (Q96107615) title doesn't match first line or pagination, and IA PDF is for different article.
Note that there are duplicate DOIs, e.g. 10.7525/j.issn.1673-5102.2007.05.002 appears in the metadata for two articles!
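A quick way to surface such cases is to group records by DOI and report any DOI attached to more than one record. The sketch below assumes the metadata is available as (record id, DOI) pairs; the record ids in the example are placeholders, not real identifiers.

```python
from collections import defaultdict


def find_shared_dois(records):
    """Return {doi: [record ids]} for every DOI used by more than one record.

    `records` is an iterable of (record_id, doi) pairs. DOIs are compared
    case-insensitively after trimming whitespace; empty DOIs are skipped.
    """
    by_doi = defaultdict(list)
    for record_id, doi in records:
        if doi:
            by_doi[doi.strip().lower()].append(record_id)
    return {doi: ids for doi, ids in by_doi.items() if len(ids) > 1}


# Placeholder record ids; only the DOI is taken from the note above.
sample = [
    ("article-A", "10.7525/j.issn.1673-5102.2007.05.002"),
    ("article-B", "10.7525/j.issn.1673-5102.2007.05.002"),
    ("article-C", None),
]
shared = find_shared_dois(sample)
print(shared)
```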
Wrong or inconsistent metadata in Wikidata
- Bad
- Fixed/OK
Q96107547 Q96107548 Q96107551 Q96107552 Q96107555 Q96107556 Q96107557 Q96107559 Q96107568 Q96107570 Q96107571 Q96107573 Q96107574 Q96107576 Q96107578 Q96107580 Q96107581 Q96107582 Q96107590 Q96107592 Q96107593 Q96107594 Q96107597 Q96107599 Q96107601 Q96107606 Q96107607 Q96107611 Q96107613 Q96107615 Q96107626 Q96107710 Q96107886 Q96107896 Q96107978 Q96108014 Q96108078 Q96108079 Q96108091 Q96108099 Q96108284 Q96108340 Q96108438 Q96108512 Q96108519 Q96108655 Q96108753 Q96108770 Q96108786 Q96108839 Q96108857 Q96108909 Q96108977 Q96109101
Wrong PDF in IA
For some articles the contents of the PDF in IA are wrong, e.g. Q96107529 Q96107530 Q96107532 Q96107535 Q96107536 Q96107528
I've made all these items in IA dark.
Wrong IA metadata
I've deleted DOIs when I've found a mismatch between metadata and DOI.
Bulletin of the United States National Museum
Bit of a mess with multiple items of different type, see discussions:
- https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Periodicals/Archive_2#Bulletin_of_the_United_States_National_Museum
- https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Periodicals/Archive_3#Bulletin_of_the_United_States_National_Museum_(take_2)
- https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Taxonomy/Archive/2020/10#Bulletin_of_the_United_States_National_Museum
Based on BHL, ISSN and SuDoc
Name | ISSN | BHL | Sudoc | Years |
---|---|---|---|---|
Bulletin - or / United States National Museum | 0362-9236 | 169510 | 037366866 | 1907-1971 |
Bulletin of the United States National Museum | 0096-2961 | 169509 | 036653233 | 1875-1905 |
Note also on BHL "No. 1-16 issued also in: Smithsonian miscellaneous collections, v. 13, 23-24."
Name | Description | Years | ISSN | BHL | Notes |
---|---|---|---|---|---|
Bulletin - United States National Museum (Q21385133) | version of volume 1 of a journal | 1877-1971 | 0362-9236 | BHL:169509 (1875-1905) | 118 publications linked to this item 1889-1969 https://w.wiki/4PrV |
Bulletin of the United States National Museum (Q21385329) | scientific journal (1875–1905) | 1875–1905 | 0096-2961 | | Nine publications linked to this item 1886-1970 https://w.wiki/4PrY |
Bulletin of the United States National Museum (Q56634273) | volume 1 of a publication | | | | has edition or translation (P747) Bulletin - United States National Museum (Q21385133), no external identifiers |
Bulletin of the United States National Museum (Q56633969) | publicaton of the United States Museum | | | | has part(s) (P527) Bulletin of the United States National Museum (Q56634273), no external identifiers but linked to https://commons.wikimedia.org/wiki/Category:Bulletins_of_the_United_States_National_Museum which spans 1879-1971 |
Looks like eLibrary.ru (Q4037789) may have made some mistakes linking translations together, may have to check against Springer website for Entomological Review (Q47161189). Rdmpage (talk) 17:11, 29 January 2023 (UTC)
Some confusion between Japanese Journal of Ichthyology (Q21385442) and Ichthyological Research (Q15760079) (partly caused by Springer). These are two separate journals. Metadata for Japanese Journal of Ichthyology (Q21385442) retrieved via DOIs often lacks titles, will need to add manually. Also need to add titles in multiple languages via harvesting web site.
Online but will require some effort to get list of articles http://zgnydxxb.ijournals.cn/zgnydxxb/ch/index.aspx
Three articles so far, note that Japanese Species of Parmelia Ach. (sens. str.), Parmeliaceae (Q59123248) is a composite of several articles with the same title (i.e., it is a series of papers).
Journal of The Asiatic Society of Bengal (and Materials for a Flora of the Malayan Peninsula)
This journal contains Materials for a Flora of the Malayan Peninsula, which has also been issued as a separate reprint.
There is a scanned archive at South Asia Archive (Q104412505) http://www.southasiaarchive.com that is behind a paywall, there are also freely accessible PDFs at https://pahar.in/journals/.
Items about the journal include Systematic notes on Asian birds. 51. Dates of avian names introduced in early volumes of the Journal of the Asiatic Society of Bengal (Q89633372)
Items about specific articles include XIV.—Notes on the œconomy of the Paussidæ, extracted from Capt. W. J. E. Boyes' Paper, published in the Journal of the Asiatic Society of Bengal (No. 138.—N. S. No. 54) (Q99848588).
Items about Materials for a Flora of the Malayan Peninsula: Materials for a Flora of the Malayan Peninsula (Q13554331) (generic item for work) and Materials for a flora of the Malayan Peninsula (Q51383079) (BHL copy).
Articles mentioning it: Materials for a Flora of the Malayan Peninsula (Q64285277) and Materials for a Flora of the Malayan Peninsula (Q64276618) (in Nature).
Multiple DOIs, duplication, redirects, etc. Plus we have items from PubMed that don't have DOIs. In short, a clusterfuck.
Suggested work flow:
- Map DOIs 10.3852 to Wikidata, use php doi_to_doi.php to get redirect DOIs, map those to Wikidata, then see if we need to add extra DOIs to those items (i.e., 10.3852 and 10.1080) https://w.wiki/53Nn
- Map PMIDs to Wikidata for references that lack DOIs in Wikidata https://w.wiki/53Nk then add those DOIs to Wikidata
- Map JSTOR to Wikidata, add JSTOR ids if missing from Wikidata
- Merge records with different DOIs and update corresponding Wikidata items
- Add any missing records to Wikidata
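The redirect-based steps in this workflow can be sketched as a small planning function. This is not the doi_to_doi.php script mentioned above, just a hedged Python illustration: it assumes the redirects are already available as an old-DOI to new-DOI mapping and that a DOI to Wikidata item lookup exists. The function name, DOIs, and item ids in the example are all made up.

```python
def plan_doi_work(redirects, doi_to_qid):
    """Sort redirect pairs into three work lists.

    redirects:  {old_doi: new_doi}, e.g. a 10.3852 DOI and its 10.1080 target
    doi_to_qid: {doi: item id} for DOIs already present in Wikidata

    Returns (merges, additions, missing):
      merges    - (item, item) pairs where both DOIs have their own item
      additions - (item, doi) pairs where an item lacks one of the two DOIs
      missing   - (old_doi, new_doi) pairs with no item at all
    """
    merges, additions, missing = [], [], []
    for old, new in sorted(redirects.items()):
        old_q = doi_to_qid.get(old)
        new_q = doi_to_qid.get(new)
        if old_q and new_q and old_q != new_q:
            merges.append((old_q, new_q))
        elif old_q and not new_q:
            additions.append((old_q, new))
        elif new_q and not old_q:
            additions.append((new_q, old))
        elif not old_q and not new_q:
            missing.append((old, new))
        # If both DOIs already point at the same item there is nothing to do.
    return merges, additions, missing


# Made-up DOIs and placeholder item ids, for illustration only.
redirects = {
    "10.3852/x1": "10.1080/y1",
    "10.3852/x2": "10.1080/y2",
    "10.3852/x3": "10.1080/y3",
}
doi_to_qid = {"10.3852/x1": "QA", "10.1080/y1": "QB", "10.3852/x2": "QC"}
merges, additions, missing = plan_doi_work(redirects, doi_to_qid)
print(merges, additions, missing)
```

Keeping the three lists separate matches the order of work in the list above: merge duplicates first, then add second DOIs to existing items, then create items for anything still missing.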
Issues
Using redirection to find the "other DOI" for 10.3852 uncovered 25 items that are duplicates, i.e., both DOIs have a Wikidata item:
Q33292380 | Q54802190 |
Q31120184 | Q105485138 |
Q31042045 | Q63854883 |
Q39437650 | Q60401823 |
Q34491252 | Q110615535 |
Q58803921 | Q81300049 |
Q28295469 | Q56485068 |
Q33309119 | Q110827263 |
Q31139254 | Q57254285 |
Q56915900 | Q31111921 |
Q60449006 | Q51709251 |
Q28301216 | Q59678579 |
Q31130104 | Q56931476 |
Q22255400 | Q34626370 |
Q56145227 | Q31111919 |
Q28276701 | Q60895113 |
Q58045983 | Q39098297 |
Q64385287 | Q31081978 |
Q51119147 | Q57448972 |
Q111373678 | Q51188492 |
Q111262623 | Q80758057 |
Q31120961 | Q59205306 |
Q58656219 | Q82464853 |
Q56951379 | Q34399871 |
Years | Source | DOIs | Notes |
---|---|---|---|
1909-2004 | JSTOR | 10.2307 | Resolve to JSTOR |
2005-2016 | Now on T&F | 10.3852 | DOIs like 10.3852/mycologia seem to be redirects to 10.1080 |
1909-2022 | T&F | 10.1080 | Complete coverage, so many duplicates of JSTOR content |
1945-2020 | PubMed | Some missing DOIs | Sporadic, many lacking DOIs |
These duplicates have now been merged.
Then looked at items with PMID but no DOI in time period 2005-2016, these seem to be articles with no DOI at all.
Next issue is to identify those records that have only one DOI in Wikidata. These correspond to item with 10.3852 DOIs, a Wikidata item, and no Wikidata item for the 10.1080 DOI. I have added the missing DOIs to these records (they now have 2 DOIs).
This leaves us with DOIs for this time period that have no Wikidata item at all. These will be added.
DOIs from 2017 onwards are just T&F so these can be added directly.
Records for 1909-2014 will have to be merged where they have two DOIs, and we also need to check for PubMed-only records, of which there are a number. PMIDs will link both DOIs, so both DOIs will need to be added, then the missing DOI for records with one DOI, then the missing records (and extra DOIs).
Details
Progress: volume 96
-- Look at a volume
SELECT guid, title, volume, spage, epage, doi, wikidata, pmid, pii FROM publications_tmp WHERE issn='0027-5514' AND volume=96 ORDER BY CAST(spage AS SIGNED);
-- Add DOIs to records with PMID but no DOI
SELECT CONCAT(pii, CHAR(9), 'P356', CHAR(9), '"', doi, '"') FROM publications_tmp WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pmid IS NOT NULL ORDER BY pii;
-- Get wikidata for DOIs with no PMID
SELECT guid FROM publications_tmp WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pii IS NULL;
-- Get any articles with newer DOIs that we should add
SELECT * FROM publications_tmp WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pii IS NULL AND doi LIKE '10.1080%';
PubMed errors
There is a problem with volumes 60 and 61 in PubMed: some articles are assigned to the wrong volume :( Fixed Rdmpage (talk) 08:46, 15 April 2022 (UTC)
CNKI, see also https://manu40.magtech.com.cn/Jwxb
History of the journal in https://manu40.magtech.com.cn/Jwxb/CN/abstract/abstract2906.shtml https://doi.org/10.13346/j.mycosystema.2011.01.014 = https://www.cnki.net/kcms/doi/10.13346/j.mycosystema.2011.01.014.html
Years | Journal | ISSN | Volumes | Source | Pagination? | Notes | Status |
---|---|---|---|---|---|---|---|
1982-1996 | Acta Mycologica Sinica | 0256-1883 | 1-15 | magtech and CNKI | Yes | DOIs 10.13346 | Not added yet |
1997-2003 | Mycosystema | 1007-3515 | 16-22 | magtech | no | no DOIs(?) | Not added yet |
2004-2013 | Mycosystema | 1672-6472 | 23- | magtech and CNKI | Yes | DOIs 10.13346 | Not added yet |
Muséum National d'Histoire Naturelle journals
See 1802–2018: 220 ans d'histoire des périodiques au Muséum (Q93462644) and Timeline of the scientific publications of The Museum for details.
Records of the Auckland Museum
Special:Contributions/Prosperosity has made some edits to this journal and merged the older records I added (JSTOR) to newer ones they created. I will need to update my local mapping between JSTOR ids and Wikidata to accommodate this.
- Done
Russian Journal of Genetics / Genetic
Wikidata has these journals confused. For example Gene diversity for haptoglobin and transferrin classical markers among Hindu and Muslim populations of Aligarh City, India. (Q48605442) is stated as being published in Russian Journal of Genetics (Q15753063) (the English language journal) with a link to PubMed, but PubMed says it is published in Genetika which is the Russian journal (which lacks a Wikidata item). Russian Journal of Genetics (Q15753063) has three ISSNs, 1022-7954, 1608-3369, 0016-6758, the last one (0016-6758) is for Genetika (Moskva) https://portal.issn.org/resource/issn/0016-6758. Need to unpack this journal and link articles to correct journal. Note that the English language articles will (mostly? all?) be translations of the articles in Genetika.
For the case of the article "Gene diversity for haptoglobin and transferri..." the Wikidata record is from GENETIKA but Wikidata links this to Russian Journal of Genetics. A bit of a mess :(.
Journal | DOI | PMID | URL | Pages |
---|---|---|---|---|
Russian Journal of Genetics | 10.1134/s1022795411060044 | | https://link.springer.com/article/10.1134%2FS1022795411060044 | 47(6): 744-748 |
GENETIKA ГЕНЕТИКА | - | 21866866 | https://elibrary.ru/item.asp?id=16455347 | 47(6): 842-846 |
Some articles with DOIs, PDFs available, website is a bit sluggish. Has English and Chinese metadata.
There is a big gap in DOI coverage where we have JSTOR ids but no DOIs. These DOIs exist for Wiley content, so need to match DOIs to existing Wikidata records (from JSTOR).
Volumes to do are 57-61, see https://w.wiki/3dCH
Journal has several names, not sure of the timing of each, e.g. Transactions of the Lepidopterological Society of Japan, Tyô to Ga, Lepidoptera Science.
My first import generated a number of duplicates as I didn't check that DOIs were unique before adding them (doh!). This resulted in 1348 duplicates, which I am merging.
I also want to link these articles to their CiNii identifiers.
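The duplicate import described above could be avoided with a uniqueness check before adding anything. Below is a minimal Python sketch, assuming the candidate DOIs and the DOIs already in Wikidata are both available as plain lists; the function name and the DOIs in the example are invented for illustration. DOIs are compared case-insensitively, since DOI names are not case sensitive.

```python
def split_new_and_existing(candidate_dois, existing_dois):
    """Separate candidate DOIs into those safe to add and those already seen.

    A candidate counts as "already seen" if it matches an existing DOI or an
    earlier candidate in the same batch, ignoring case and surrounding spaces.
    """
    existing = {doi.strip().upper() for doi in existing_dois}
    seen_in_batch = set()
    new, already = [], []
    for doi in candidate_dois:
        key = doi.strip().upper()
        if key in existing or key in seen_in_batch:
            already.append(doi)
        else:
            seen_in_batch.add(key)
            new.append(doi)
    return new, already


# Invented DOIs for illustration.
new, already = split_new_and_existing(
    ["10.1000/abc", "10.1000/ABC", "10.1000/def"],
    ["10.1000/DEF"],
)
print(new, already)
```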
Volume 42 (2008) onwards has CrossRef DOIs, although there are issues with their resolution. Prior to 2008 lots of articles online, not all with volumes, etc.
Many articles already in Wikidata, mix of DOIs (not all work), also some coverage in Wanfang and CNKI. Wikidata coverage is based primarily on PubMed. Note that there are at least three different DOI agencies, and some overlap in DOIs and/or agencies
Two ISSNs (and two Wikidata items) Zoological Research (Q15766889) 0254-5853 and Zoological Research (Q27714095) 2095-8137
According to NLM "Began with Volume 37, issue 5 (18 September 2016)." https://locatorplus.gov/cgi-bin/Pwebrecon.cgi?v1=5&ti=1,5&SC=Title&SA=Zoological%20research&PID=e3kbuLJdLaFS0PmiKWaW_rCl&SEQ=20210406112547&SID=2 but this is not entirely clear from the journal itself. For example, the cover for "Volume 35 Issue 5 18 September 2014" (3516) says ISSN 2095-8137 whereas "Volume 35 Issue 3 18 May 2014" has ISSN 0254-5853. 2014 also seems to be the year that the DOIs have mixed ISSNs. What a mess. Cover of vol 35 issue 4 18 July 2014 has ISSN 2095-8137 and also DOIs with that ISSN, so I think that is the issue when the ISSN changed.
Note that CrossRef and Wanfang DOIs overlap in volume 29! Also looks like not all Crossref DOIs work :(
Added most of 0254-5853 vols 1-34, 35; still need to add Chinese titles to articles after vol 29, as many titles are PubMed English translations. Will need to import the Chinese titles and store them as multilingual titles, as the current code uses the DOI as GUID and hence misses the Chinese titles. Rdmpage (talk) 13:47, 2 May 2022 (UTC)
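One way to stop the DOI-as-GUID approach from dropping the Chinese titles is to key titles by DOI and language together. This is only a sketch under assumptions: the record fields (`doi`, `language`, `title`) and the function name are hypothetical, not the current import code.

```python
def merge_titles(records):
    """Group titles by DOI, then by language code, so a second-language title is kept."""
    merged = {}
    for record in records:
        titles = merged.setdefault(record["doi"].lower(), {})
        # Keep the first title seen for each language rather than overwriting it.
        titles.setdefault(record["language"], record["title"])
    return merged
```

Each DOI then maps to something like `{"en": "...", "zh": "..."}`, which lines up with adding one monolingual title (P1476) value per language.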
Years | Volumes | DOIs | ISSN | Notes |
---|---|---|---|---|
2017- | 38- | CrossRef | 2095-8137 | DOI 10.24272/... |
2014-2017 | 35-38 | CNKI | 2095-8137 | DOI 10.13918/j.issn.2095-8137 |
2013-2014 | 34-35 | ? | 0254-5853 | DOI 10.11813/j.issn.0254-5853 (broken) |
2004-2021 | 25-42 | ? | ? | Bioline http://www.bioline.org.br/toc?id=zr |
2008-2013 | 29-34 | CrossRef | 0254-5853 | DOI 10.3724/... |
1999-2008 | 20-29 | ISTIC | 0254-5853 | DOI 10.3321/j.issn: Wanfang |
1980-1998 | 1-19 | | 0254-5853 | No DOIs |
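
Because the DOI prefix changes with each era, it can be used to guess which agency and ISSN a given DOI belongs to. A minimal sketch, with the prefixes copied from the table above; the names `ZOOL_RES_DOI_PREFIXES` and `classify_doi` are hypothetical, and "?" is kept where the agency is still unknown.

```python
# DOI prefix -> (agency, ISSN), copied from the table of Zoological Research DOIs.
ZOOL_RES_DOI_PREFIXES = {
    "10.24272/": ("CrossRef", "2095-8137"),
    "10.13918/j.issn.2095-8137": ("CNKI", "2095-8137"),
    "10.11813/j.issn.0254-5853": ("?", "0254-5853"),  # these DOIs are broken
    "10.3724/": ("CrossRef", "0254-5853"),
    "10.3321/j.issn": ("ISTIC", "0254-5853"),  # Wanfang
}


def classify_doi(doi):
    """Return (agency, issn) for a known prefix, or None if the DOI matches no row."""
    doi = doi.strip().lower()
    for prefix, info in ZOOL_RES_DOI_PREFIXES.items():
        if doi.startswith(prefix):
            return info
    return None
```

This only looks at the prefix, so it cannot by itself untangle volume 29, where CrossRef and Wanfang DOIs overlap.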
Multilingual titles
If you have a Chinese title (e.g., "西北植物学报") and a transliteration (e.g., "Xibei zhiwu xuebao") then you can connect the two using Hanyu Pinyin transliteration (P1721). See Acta botanica Boreali-Occidentalia Sinica (Q27721266).
Quote from Contributions to the botanical journal Sunyatsenia from 1930 to 1948 (Q28944969)
"Names of Chinese botanists follow the convention of placing the family name first followed by given names; names of Westerners follow the western convention of placing the family name last. Chinese botanists mostly followed the Wade-Giles system of Romanization when transliterating their name; and the current pinyin system was initiated late in the careers of most of the early Chinese botanists around 1950s. They were not required to adopt the pinyin system if they had actively published and were known under a different transliteration of their name."
Wade-Giles is Wade-Giles (Q208442). This means we may need to take some care in handling Chinese names for older literature.
Titles with HTML markup
title (P1476) shouldn't have any markup, but you can add the qualifier title in HTML (P6833) to the title to include the markup. For example, see Sur le genre Trypanoxyuris (Oxyuridae, Nematoda) IV. Sous-genre Trypanoxyuris parasite de Primates Cebidae et Atelidae (suite) Étude morphologique de Trypanoxyuris callicebi n. sp. (Q64173850).
Full text
Note document file on Wikimedia Commons (P996), e.g. for A new cryptic species of Anolis lizard from northwestern South America (Iguanidae, Dactyloinae) (Q58700998), which essentially embeds a PDF in Wikidata!
Checksums
Maybe add checksum (P4092) as a property to a publication, as a way to link (indirectly) to content, see also https://hash-archive.org and https://bentrask.com/?q=hash://sha256/98493caa8b37eaa26343bbf73f232597a3ccda20498563327a4c3713821df892 by Ben Trask (Q63232898).
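A minimal sketch of computing such a checksum for a PDF, assuming SHA-256 and the `hash://sha256/` URI form seen in the link above. The function names are hypothetical.

```python
import hashlib


def sha256_hash_uri(data):
    """Return a hash://sha256/ URI for the given bytes."""
    return "hash://sha256/" + hashlib.sha256(data).hexdigest()


def sha256_file_uri(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large PDF need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "hash://sha256/" + digest.hexdigest()
```

The hex digest (without the `hash://sha256/` part) is what would go in checksum (P4092), presumably with a qualifier saying which hash function was used.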
Author matching and related issues
Matches without series ordinal
Note that User:EvaSeidlmayer has added author (P50) to lots of references without adding series ordinal (P1545), and leaving author name string (P2093) in place, so we have two entries for the same author, one as a thing and one as a string (see e.g., North American distribution ofEleocharis mamillata(Cyperaceae) and confusion withE. macrostachyaandE. palustris (Q100395512) ).
Redirects
Wikispecies
Sometimes we have authors (or other entities) that have two Wikidata items (e.g., two links to Wikispecies) when there is really only one entity (e.g., one person). An example is Eduardo Flórez Daza (Q21392863) and Eduardo Flórez Daza (Q56650857). These are the same person, and the Wikispecies entry for Eduardo Flórez Daza is a redirect to Álvaro Eduardo Flórez-Daza. The convention for this seems to be:
- Delete data for the redirect item (but add missing values to the item that will be pointed to).
- Keep the instance of the redirect item (e.g., instance of (P31) human (Q5))
- Add subject has role (P2868) Wikimedia redirect (Q21528878)
In this case Eduardo Flórez Daza (Q21392863) is now mostly empty and data on this person can be found at Eduardo Flórez Daza (Q56650857).
Authors
When merging authors, e.g. John William Thieret (Q102229589) with John William Thieret (Q21390395), the expectation is that a bot will update every link to Q102229589 to point to Q21390395. This process seems to take a long time. Telegram chat suggests 8 days https://t.me/c/1497612692/4509. Q102229589 was made a redirect on 2023-07-19, and the links were updated on 2023-07-27 by User:KrBot (eight days later).
Bibliographic relationships
Reviews
Could use for reviews of books, etc.
Translations
Two New Species of the Weevil Genus Mecysmoderes Schoenherr, 1837 (Coleoptera, Curculionidae: Ceutorhynchinae) from Vietnam (Q99837830) in Entomological Review (Q47161189) is the English language version of ДВА НОВЫХ ВИДА ДОЛГОНОСИКОВ РОДА MECYSMODERES SCHOENHERR, 1837 (COLEOPTERA, CURCULIONIDAE: CEUTORHYNCHINAE) ИЗ ВЬЕТНАМА (Q99838137) in Entomologicheskoe Obozrenie (Q4532102). How do we represent this relationship?
OK, we can use edition or translation of (P629) and its inverse has edition or translation (P747) to link the two works together. Maybe should also make the translated article an instance of version, edition or translation (Q3331189).
Errata
The article The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99931585) has an erratum Erratum to: The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99888822). To connect an article to its errata we use corrigendum / erratum (P2507) as a property of the original article, hence we have Q99931585 -- P2507 --> Q99888822
Note that there are bots that automatically add instance of (P31) erratum (Q1348305) to errata (see history of Erratum to: The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99888822) ).
Note also that User:Trilotat has some useful queries to find corrections and the things they correct https://www.wikidata.org/wiki/User:Trilotat/SPARQL#Corrections,_errata_and_corrigenda
PDFs
To add a PDF for an article use full work available at URL (P953), add file format (P2701) Portable Document Format (Q42332) as a qualifier to say that it is a PDF, and add archive URL (P1065) with a link to the URL in the Wayback machine if it has been archived there. See Trithecoides, a new subgenus of Culicoides (Diptera: Ceratopogonidae) (Q89666437) for an example.
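The statement and its qualifiers can be written as a single QuickStatements line. A minimal sketch, assuming the tab-separated V1 syntax in which qualifier property/value pairs follow the main statement and URL values are given as quoted strings; the helper names are hypothetical.

```python
def quote(text):
    """Wrap a string or URL value in double quotes for QuickStatements."""
    return '"' + text + '"'


def pdf_statement(item, pdf_url, archive_url=None):
    """Build one tab-separated line adding a PDF link (P953) to an item.

    Qualifiers: file format (P2701) = Portable Document Format (Q42332) and,
    if the PDF has been archived, archive URL (P1065).
    """
    parts = [item, "P953", quote(pdf_url), "P2701", "Q42332"]
    if archive_url:
        parts += ["P1065", quote(archive_url)]
    return "\t".join(parts)
```

Calling `pdf_statement` with an item id, the PDF URL and (optionally) its Wayback Machine URL gives a line that can be pasted into QuickStatements.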
Books
One model is that the "book" is a written work (Q47461344) which has basic information (title, author) and OCLC work ID (P5331) as an identifier (for example). Then we have editions, i.e. version, edition or translation (Q3331189), that have ISBNs (e.g., ISBN-10 (P957)), Google Books ID (P675), etc. Works are linked to their editions by has edition or translation (P747), and editions are linked back to their work by edition or translation of (P629). The Wikidata:WikiProject Books wants every book to have both written work (Q47461344) and at least one version, edition or translation (Q3331189), which seems redundant in many cases. For now I use Google Books to add books and by default make them written work (Q47461344). I follow Wikidata:WikiProject Books if there are multiple editions that seem important (e.g., they are cited).
Wikisource
See for example The Afghan War (Q19077572).
A version, edition or translation (Q3331189) has document file on Wikimedia Commons (P996), linking to a file on Commons, and Wikisource index page URL (P1957) which is the link to the Wikisource page for the transcription of the book.
Chapters
A chapter (Q1980247) is part of (P361) a book, and the book should list each chapter as has part(s) (P527), see for example The Canterbury Tales (Q191663).
Citations
In Ridleyandra merohmerea (Gesneriaceae), a new species from Kelantan, Peninsular Malaysia (Q42258926) I explored adding citations without DOIs as strings using unknown (Q24238356). See also proposal by GerardM for a citation string Wikidata:Property proposal/cites work string.
On the basis of this (unsuccessful) proposal GerardM has been exploring adding citations to cites work (P2860) using placeholder for "somevalue" (Q53569537), see for example Can trophic rewilding reduce the impact of fire in a more flammable world? (Q57805204).
Given that QuickStatements struggles with placeholder for "somevalue" (Q53569537) we will need to look at using the API to edit these statements directly (using unique statement ids). Need to be able to:
- add a citation that lacks an item (with qualifiers)
- retrieve details of citation that lacks item so we can try and add or match it
- update citation that lacks an item with corresponding item
API experiments Q102901875 and Q102902439.
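The three operations listed above map fairly directly onto the Wikibase API modules `wbcreateclaim`, `wbsetqualifier`, `wbgetclaims` and `wbsetclaimvalue`. The sketch below only builds the request parameters; it assumes they would be POSTed to https://www.wikidata.org/w/api.php by a logged-in session holding a CSRF token. The function names are hypothetical, and this is not necessarily how the experiments above were done.

```python
import json


def create_unknown_citation(work_qid, token):
    """wbcreateclaim: add cites work (P2860) with an unknown value ("somevalue")."""
    return {"action": "wbcreateclaim", "entity": work_qid, "property": "P2860",
            "snaktype": "somevalue", "token": token, "format": "json"}


def add_stated_as(statement_id, citation_text, token):
    """wbsetqualifier: attach the raw citation string as stated as (P1932)."""
    return {"action": "wbsetqualifier", "claim": statement_id, "property": "P1932",
            "snaktype": "value", "value": json.dumps(citation_text),
            "token": token, "format": "json"}


def list_citations(work_qid):
    """wbgetclaims: fetch every cites work statement, including unknown-value ones."""
    return {"action": "wbgetclaims", "entity": work_qid, "property": "P2860",
            "format": "json"}


def resolve_citation(statement_id, cited_qid, token):
    """wbsetclaimvalue: replace the unknown value with the matched item."""
    value = {"entity-type": "item", "numeric-id": int(cited_qid[1:])}
    return {"action": "wbsetclaimvalue", "claim": statement_id, "snaktype": "value",
            "value": json.dumps(value), "token": token, "format": "json"}
```

The statement id returned when the claim is created is what ties the later qualifier and value updates to the same citation, so it needs to be kept alongside the unmatched citation string.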
Unstructured citations
Querying for unstructured citations:

    select * where {
      # Ridleyandra merohmerea ...
      VALUES ?work { wd:Q42258926 } .
      # Outsized effect of predation...
      # VALUES ?work { wd:Q102058694 } .

      # Get cited works
      ?work p:P2860 ?statement .
      ?statement ps:P2860 ?cites .

      # stated as
      OPTIONAL { ?statement pq:P1932 ?unstructured . }
      # series ordinal
      OPTIONAL { ?statement pq:P1545 ?position . }
      # title
      OPTIONAL { ?statement pq:P1476 ?title . }
      # author name string
      OPTIONAL { ?statement pq:P2093 ?authors . }
      # publication date
      OPTIONAL { ?statement pq:P577 ?date . }
      # DOI
      OPTIONAL { ?statement pq:P356 ?doi . }
      # URL
      OPTIONAL { ?statement pq:P953 ?url . }

      FILTER (!isIRI(?cites))
    }
    ORDER BY (xsd:integer(?position))
Bibliographic identifiers (and proposals)
(see also Template:Bibliographic_properties)
CNKI CJFD journal article ID (P6769)
https://www.wikidata.org/wiki/Wikidata:Property_proposal/National_Diet_Library_Persistent_ID
BHL
Bibliographic licenses including text mining
Could add information on licensing when adding works via CrossRef, would need to create items for each license, see e.g. https://www.wikidata.org/wiki/Help:Copyrights#RightsStatements for links to various licenses that could be used as templates. For example,
- instance of (P31) license (Q79719)
- facet of (P1269) copyright (Q1297822)
- subclass of (P279) copyrighted (Q50423863)
- described at URL (P973) url to license
- exact match (P2888) url to license
Examples
Bibliographic harvesting, RSS
RSS feed
OAI
URL (P2699) OAI endpoint, qualifier protocol (P2700) Open Archives Initiative Protocol for Metadata Harvesting (Q2430433)
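Once a journal has its OAI endpoint recorded this way, harvesting is a matter of building standard OAI-PMH requests against that URL. A minimal sketch, assuming the endpoint supports the `oai_dc` metadata format (which OAI-PMH requires of every repository); the function name is hypothetical.

```python
from urllib.parse import urlencode


def oai_list_records_url(endpoint, metadata_prefix="oai_dc", set_spec=None,
                         resumption_token=None):
    """Build an OAI-PMH ListRecords request URL for a journal's OAI endpoint."""
    if resumption_token:
        # resumptionToken is an exclusive argument: send it alone with the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return endpoint + "?" + urlencode(params)
```

The first request uses `metadataPrefix`; if the response contains a `resumptionToken`, the next request passes only that token, and so on until no token is returned.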
Publishing engines
software engine (P408) Open Journal Systems (Q1710177)
Engines for taxonomy journals https://w.wiki/3fgN
    select * {
      ?journal wdt:P31 wd:Q5633421 .
      ?journal wdt:P1476 ?title .
      ?journal wdt:P408 ?engine .
      ?engine rdfs:label ?label .
      FILTER(LANG(?title) = "en")
      FILTER(LANG(?label) = "en")
      ?article schema:about ?journal .
      FILTER(regex(str(?article), "species.wikimedia.org"))
      ?journal wdt:P495 ?country .
      ?country wdt:P625 ?coordinates .
    }
    LIMIT 10
Things to fix
New species and new records of ant-eating spiders from Mediterranean Europe (Araneae: Zodariidae) (Q104465474) has the same DOI cited many times, but this is an error as each reference is different! So we have massive duplication of works cited.
Other
(see also Template:Bibliographic_properties and Wikidata:WikiProject_Books)
IUCN conservation status (P141) BHL page ID (P687)
zoological specimen (Q2114846)