This user uses Scholia.
This user uses QuickStatements.
This user loves Wikidata.
This user has made over 1,114,594 contributions to Wikidata.

I am Roderic D. M. Page (Q7356570), you can find me on Twitter as @rdmpage, and I have a blog iPhylo which lists my current projects.

Things to fix


Duplicate ISBNs


From https://www.wikidata.org/wiki/User:Maxlath/P212_unique_value_constraint_violations_by_user



Taxa versus species


Norops duellmani (Q6450757) and Anolis duellmani (Q2814307) are the same taxon. In this example different wikis link to different names, and sometimes the page names don't match.

Essays


User:Rdmpage/Referencing taxon names

Wikidata


How to add references Help:Sources

Wikidata:Notability

Place where WikiCite-related stuff gets discussed. Wikidata talk:WikiProject Source MetaData

Deletions


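See https://phabricator.wikimedia.org/T291659#7402884 and the following query, which lists items linked from Wikidata:Requests for deletions that have a P6944 statement: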

SELECT DISTINCT ?item ?itemLabel WHERE {
  # Items linked from the "Requests for deletions" page (main namespace only)
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "links" .
    bd:serviceParam mwapi:titles "Wikidata:Requests for deletions" .
    bd:serviceParam mwapi:gpllimit "max" .
    bd:serviceParam mwapi:gplnamespace "0" .
    ?item wikibase:apiOutputItem mwapi:title .
  }
  # Keep only the items that have a P6944 statement
  ?item wdt:P6944 ?id .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
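
The same pattern should work for other pages and properties. A minimal Python sketch that only builds the query text, assuming the MWAPI parameters above stay the same; the function name `linked_items_query` and its arguments are made up for illustration and are not part of any library:

```python
def linked_items_query(page_title: str, property_id: str, language: str = "en") -> str:
    """Build a SPARQL query for items linked from a wiki page that have a given property."""
    # Escape characters that would end the SPARQL string literal early.
    safe_title = page_title.replace("\\", "\\\\").replace('"', '\\"')
    return f"""SELECT DISTINCT ?item ?itemLabel WHERE {{
  SERVICE wikibase:mwapi {{
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "links" .
    bd:serviceParam mwapi:titles "{safe_title}" .
    bd:serviceParam mwapi:gpllimit "max" .
    bd:serviceParam mwapi:gplnamespace "0" .
    ?item wikibase:apiOutputItem mwapi:title .
  }}
  ?item wdt:{property_id} ?id .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}
}}"""


query = linked_items_query("Wikidata:Requests for deletions", "P6944")
print(query)
```

The printed text can be pasted into the Wikidata Query Service.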

Property proposals


Ones I've made or have been involved in.

Wikidata:Property_proposal/Authority_control#National_Diet_Library_Persistent_ID

Wikidata:Property_proposal/Institut_de_recherche_pour_le_développement_(IRD)_identifier

Wikidata:Property_proposal/Index_to_Organism_Names_ID

Data quality


See Wikidata:WikiProject_Data_Quality

SPARQL


See User:Succu/SPARQL for lots of relevant examples.

Deprecation


See Help:Deprecation. One example of using this would be to correct dates of articles where CrossRef has got the date wrong (e.g., Wiley metadata).

Withdrawn identifiers


E.g., if an ISSN has been cancelled, deprecate the statement and add the qualifier reason for deprecated rank (P2241) with the value withdrawn identifier value (Q21441764).

Deprecated identifiers


See Haplostoma humesi, New Species (Copepoda: Cyclopoida: Ascidicolidae), Associated with a Compound Ascidian (Aplidium Sp.) from Madagascar (Q104118218) for an example with two DOIs.

Duplicates


Trying to clean up Zootaxa. First example: The type specimens of Tachinidae (Diptera) housed in the Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”, Buenos Aires (Q29469527) and The type specimens of Tachinidae (Diptera) housed in the Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”, Buenos Aires (Q35800184).

The problem is that Revision of the neotropical Exoristini (Diptera, Tachinidae): the status of the genera Epiplagiops and Tetragrapha (Q35950285) cites both, but they are not merged in that item's list of cited works.

I got bored and manually fixed the duplication.


Another test case: Revision of Zorion Pascoe (Coleoptera: Cerambycidae), an endemic genus of New Zealand (Q79429489) and Revision of Zorion Pascoe (Coleoptera: Cerambycidae), an endemic genus of New Zealand (Q28939048), where one has CrossRef DOI and is cited by A checklist of New Zealand Cerambycidae (Insecta: Coleoptera), excluding Lamiinae (Q56166058), the other has Zenodo DOI, has authors instead of author strings, and is linked to a taxon Zorion taranakiensis (Q14848194).

Merged

Zootaxa


Notes on de-duplicating Zootaxa. The counts here come from a local database I made, and will be out of date, especially as Succu is working through duplicates manually. "x" means not entered yet.

Year | No. articles in WD at start | Duplicates | No. in CrossRef | Query | Notes
2001 | 17 | 0 | 20 | https://w.wiki/WJ8
2002 | 105 | 0 | 107 | https://w.wiki/WJD
2003 | 258 | 0 | 269 | https://w.wiki/XMP
2004 | 261 | 0 | 388 | https://w.wiki/XA6
2005 | 424 | 1 | 583 | https://w.wiki/XAQ | Duplication is ZENODO DOI record Q79429489 of record Q28939048
2006 | 10 | 0 | 851 | https://w.wiki/XAY | Q88363212 has only a Zenodo DOI
2007 | 167 | 0 | 1067 | https://w.wiki/XDb
2008 | 22 | 0 | 1111 | https://w.wiki/XE2
2009 | 19 | 0 | 1466 | https://w.wiki/XHJ
2010 | 59 | 0 | 1416 | https://w.wiki/XJV
2011 | 47 | 0 | 1650 | https://w.wiki/XMW
2012 | 43 | 0 | 1893 | https://w.wiki/XMZ | Q29470959 has two PMIDs both valid but clearly duplicates
2013 | 2144 | 1237 | 2123 | https://w.wiki/XMb | Q29468684 had duplicate PMIDs, one wrong. Q30860840 has a DOI but CrossRef metadata is wrong (title is that of preceding article). A number of 2013 articles have DOIs that don't resolve and hence DOI field in Wikidata is not populated.
2014 | 2017 | 3 | 2027 | https://w.wiki/XMc
2015 | 2338 | 0 | 2341 | https://w.wiki/XMd
2016 | 2315 | 5 | 2334 | https://w.wiki/XMe | Q28821788 has two PMIDs both valid but clearly duplicates
2017 | 2241 | 9 | 1859 | https://w.wiki/XMf | Note that Wikidata has more than CrossRef, check what happened here.
2018 | 2231 | 814 | 2322 | https://w.wiki/XMg
2019 | 2408 | 9 | 2505 | https://w.wiki/XMh
2020 | 1242 | 0 | 1106 | https://w.wiki/XMj

References


To add references for a statement in Wikidata using Quickstatements:

Q36504420 P21 Q6581072 S248 Q28948401

Note the "S" instead of "P" for the "stated in" property in the reference (S248 rather than P248). To see the result in Wikidata, see Andreja Kofol-Seliger (Q36504420).
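
A small sketch of generating such lines in bulk. It assumes the tab-separated QuickStatements v1 syntax; the helper name `quickstatements_line` is hypothetical:

```python
def quickstatements_line(item, prop, value, source_prop=None, source_value=None):
    """Return one tab-separated QuickStatements (v1 syntax) command."""
    fields = [item, prop, value]
    if source_prop and source_value:
        # A reference uses the same property number with "S" in place of "P".
        fields += ["S" + source_prop.lstrip("P"), source_value]
    return "\t".join(fields)


line = quickstatements_line("Q36504420", "P21", "Q6581072", "P248", "Q28948401")
print(line)
```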

Examples



User:Achim_Raschka/MSW-Cetacea

Homonyms


Homonyms can be linked to replacement names, and to each other.

Authority control

Authority control

Geography


Wikidata:List of properties/geography

GeoJSON can be stored in Wikimedia Commons, e.g. Commons:Data:BioStor/95684.map. The page has to be created manually (via a red link) before the data can be added.

Can link to Wikidata using geoshape (P3896).

Need to figure out how to retrieve GeoJSON for use in applications.
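
One possible approach, sketched here and not yet tested against Commons: request the page with `action=raw` and read the GeoJSON from the JSON wrapper. The assumption that the GeoJSON sits under a `data` key, and both helper names, are mine:

```python
import json
from urllib.parse import quote


def commons_raw_url(data_page: str) -> str:
    """URL that should return the raw JSON of a Commons Data: page."""
    return ("https://commons.wikimedia.org/w/index.php?title="
            + quote(data_page, safe=":/") + "&action=raw")


def extract_geojson(raw_text: str) -> dict:
    """Pull the GeoJSON object out of the page's JSON wrapper."""
    page = json.loads(raw_text)
    # Assumption: map data pages keep the GeoJSON under the "data" key.
    return page["data"]


url = commons_raw_url("Data:BioStor/95684.map")
print(url)
```

Fetching `url` (for example with `urllib.request.urlopen`) and passing the decoded response text to `extract_geojson` would then give something an application can use directly.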

See also Wikidata:Property_proposal/distribution_map_of_taxon

Taxonomic properties


See also Template:Taxonomy_properties, Wikidata:WikiProject Taxonomy, and Wikidata:WikiProject Taxonomy/Tutorial.

problems


Wikidata conflates names and taxa; see wikipedia:User:Peter_coxhead/Wikidata_issues and Wikidata:Property_proposal/taxon_synonym_string. Is it possible to resolve this?

For a summary of properties see Template:Taxa Versus Names.

properties


Properties of a taxon (Q16521):

taxon name (P225); taxon rank (P105); parent taxon (P171); taxon common name (P1843)

taxon synonym (P1420); taxon range map image (P181); hybrid of (P1531)

taxon author (P405)

Examples: Synalpheus pinkfloydi (Q29367343)

geography



type locality


type locality (biology) (P5304)

Locality must be a Wikidata item. Qualifiers include object named as (P1932), point in time (P585), and coordinate location (P625) (e.g., Solanum aspersum (Q1305990)).

We can create a map of type localities with a SPARQL query.
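
The original query link did not survive on this page, so the following is a reconstruction and not the original: a sketch that builds a map query from type locality (biology) (P5304) statements carrying a coordinate location (P625) qualifier. The function name and the `limit` parameter are illustrative:

```python
def type_locality_map_query(limit: int = 1000) -> str:
    """Build a Wikidata Query Service map query for type localities with coordinates."""
    return f"""#defaultView:Map
SELECT ?taxon ?taxonLabel ?locality ?localityLabel ?coord WHERE {{
  ?taxon p:P5304 ?statement .
  ?statement ps:P5304 ?locality .
  # The coordinates are stored as a qualifier on the type locality statement.
  ?statement pq:P625 ?coord .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {int(limit)}"""


map_query = type_locality_map_query(500)
print(map_query)
```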

nomenclature


nomenclature entities


protonym (Q14192851); basionym (Q810198)

taxon identifiers


ITIS TSN (P815); Encyclopedia of Life ID (P830); NCBI taxonomy ID (P685); GBIF taxon ID (P846); IPNI plant ID (P961); Plant List ID (Royal Botanic Gardens, Kew) (P1070); IUCN taxon ID (P627); Tropicos ID (P960); WCSPF ID (P3591); Avibase taxon ID (P2026); MSW ID (P959); BOLD Systems taxon ID (P3606); MycoBank taxon name ID (P962); Index Fungorum taxon ID (P1391)



Use BHL page ID (P687) as a reference for a taxon name (P225), e.g. https://www.wikidata.org/wiki/Q11577#P225

genome

sequenced genome URL (P6800) for link to genome, e.g. Asian tiger mosquito (Q477918) has https://metazoa.ensembl.org/Aedes_albopictus

Taxonomic examples


User:Achim_Raschka/Erstbeschreibungen has a list of new species descriptions, see also query that generates a timeline: [1]


Nice examples for literature mapping


Interesting cases


Names that aren't taxa


There are cases where names have an entry but are not instances of taxa, e.g., Satsuma chalybeia (Q25351070), which is described as a "name that may not be used" and is an instance of unavailable combination (Q17487588), synonym (Q1040689), and Satsuma Murray (1874) non Adams (1868) (Q25661771)(!). Looks like an attempt to separate names from taxa...

What this also means is that we can't rely on simply searching for things that are instances of taxon (Q16521) when looking to match names.

Traits


JSTOR


JSTOR content on Internet Archive https://archive.org/details/jstor_ejc

Major journal projects


https://www.wanfangdata.com.cn/sns/perio/xbzwxb/?tabId=article&publishYear=2020&issueNum=12&isSync=0&page=2 and CNKI. DOIs for some articles but they don't seem to be resolving. For example https://d.wanfangdata.com.cn/periodical/xbzwxb202101001 has DOI http://dx.chinadoi.cn/10.7606/j.issn.1000-4025.2021.01.0001 which doesn't resolve.

Acta botanica Boreali-Occidentalia Sinica
Years | Volumes | Source | Pagination? | Notes | Status
1981-2021 |  | CNKI | Yes |  | adding
1999-2009 |  | Wanfang | Yes | DOIs 10.3321 ISTIC | Added
2012-2013 |  | Wanfang | Yes | DOIs 10.3969 ISTIC | Added
2014-2016 |  | Wanfang | Yes | DOIs 10.7606 ISTIC | Not added yet

Several name changes, multiple data sources, Chinese and English, lack of pagination data in some cases, mixed DOI agencies, not all DOIs resolve. PDFs available. Oh the fun we will have...

Can use Internet Archive to extract page numbers e.g., SOME NEW TAXA OF OLEACEAE FROM TIBET,CHINA (Q106563578) from https://archive.org/download/plantdiversity-0253-2700-32749/plantdiversity-0253-2700-32749_page_numbers.json

One challenge is the overlap between Wanfang volumes 21-26 and journal.kib.ac.cn. Wanfang has pages and DOIs, journal.kib.ac.cn doesn't have either, but does have PDFs. So we need to somehow crosslink these :(

Acta Botanica Yunnanica
Years Volumes Source Pagination? Notes Status
1999-2010 21-32 Wanfang Data Yes DOIs, mostly 10.3969 but some 10.3724 Added using scraped data, then map to journal.kib.ac.cn URLs, some manual editing of DOIs
2011-2015 33-37 Wanfang Data Yes PLANT DIVERSITY AND RESOURCES, DOIs 10.3724
1979-2020 1-42 http://journal.kib.ac.cn Mostly (2005 onwards, before that, no) All called Plant Diversity, some DOIs 10.1016, 10.7677, 10.3724, every article has a URL Use to add Plant Diversity And Resources (2011-2015), and enhance Wanfang data 1999-2010.
1979-2015? Example CNKI No, nor does it have volumes http://www.cnki.com.cn/Journal/A-A6-YOKE-1979.htm Incomplete coverage, use to enhance by add CNKI ids where possible.
2016-2020 38-42 Elsevier Yes Plant Diversity, DOIs 10.1016 Add from CrossRef
DOI prefixes
Prefix | Who | Agency | Link
10.7677 | Wanfang Data | ISTIC | https://doi.org/10.7677
10.3969 | Wanfang Data | ISTIC | https://doi.org/10.3969
10.1016 | CrossRef | CrossRef | https://doi.org/10.1016
10.3724 | CrossRef - Science Press | CrossRef | https://doi.org/10.3724

Lots of articles already, some with CNKI DOIs. Need to explore further, and add PDFs and other links.

It looks like the links I had in BioNames are now gone, so need to remap. Stevenliuyi did some amazing work adding articles with CNKI DOIs, but these lack volume numbers and pagination, so need to add those. I'd also like to archive and link to the PDFs. Rdmpage (talk) 11:50, 21 July 2021 (UTC)

Note that DOI dates are often incorrect: they look to be the dates the article went online, not when it was published!


CNKI has issued DOIs for the complete journal (as far as I can determine). The journal also has a home page http://gswxb.cnjournals.cn/gswxben/home that has PDFs and also includes DOIs. The HTML has citation_ tags, but they are poorly implemented: author footnotes are included, there is no DOI or pagination, etc.

OK, more complicated. CNKI has complete journal, but DOIs have been issued by different sources.

Acta Palaeontologica Sinica
Years | Volumes | Journal | DOIs | Notes
1953-1998 | 1- | Acta Palaeontologica Sinica | CNKI | DOI 10.19800/j.cnki.aps...
1999-2009? |  | Acta Palaeontologica Sinica | ISTIC Wanfang | DOI 10.3969/J.ISSN.0001-6616.2009.04.003
2010- | 49- | Acta Palaeontologica Sinica | CNKI | DOI 10.19800/j.cnki.aps...

Name changes, ISSN changes, etc. Also in CiNii.

Acta phytotaxonomica et geobotanica
Years ISSN Source DOIs Notes
1932-2001 0001-6799,2189-7050 J-Stage 10.18942/bunruichiri... JaLC Acta phytotaxonomica et geobotanica / 植物分類, 地理 Acta Phytotaxonomica et Geobotanica (Q100375972)
2001- 1346-7565,2189-7042 J-Stage 10.18942/apg... JaLC Acta Phytotaxonomica et Geobotanica (APG) / 植物分類,地理 Acta Phytotaxonomica et Geobotanica (Q5656888)
1979-2020 2189-7034,1346-6852 J-Stage 分類 : bunrui : 日本植物分類学会誌 / Bunrui Bunrui (Q40186046)

Acta Phytotaxonomica Sinica (Q5656885) and Journal of Systematics and Evolution (Q15733644): multiple web sites, multiple DOIs, etc. See Acta Phytotaxonomica Sinica: A Bibliographic Summary of Published Volumes (Q28955370) for some earlier history. See http://gb.oversea.cnki.net/kcms/detail/detail.aspx?dbCode=cjfd&QueryID=4&CurRec=1&filename=ZWFX200801001&dbname=CJFD2008 for title change.

Acta Phytotaxonomica Sinica
Years Volumes Journal DOIs Notes
2009- 47- Journal of Systematics and Evolution CrossRef DOI 10.1111/ ISSN 1674-4918
2008-2009 46 Journal of Systematics and Evolution ? DOI 10.3724/SP.J... ISSN
2005-2007 43-45 Acta Phytotaxonomica Sinica CrossRef DOI 10.1360 ISSN 0529-1526
1951-2004 1-42 Acta Phytotaxonomica Sinica


ISSN 0001-7302, needs lots of work, became Current Zoology (Q15749150). Local database (publications) has CNKI URLs such as http://www.cnki.com.cn/Article/CJFDTOTAL-BEAR200806023.htm which now break, can be rewritten as https://oversea.cnki.net/kcms/detail/detail.aspx?dbcode=CJFD&filename=BEAR200806023, which also has page information.
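
A sketch of that rewrite for the local database. It generalises from the single example above, so the assumption that the database code is whatever precedes "TOTAL-" in the old URL needs checking against more records; `rewrite_cnki_url` is a made-up name:

```python
import re

# Old-style URL: http://www.cnki.com.cn/Article/<DB>TOTAL-<FILENAME>.htm
OLD_CNKI = re.compile(
    r"^https?://www\.cnki\.com\.cn/Article/(?P<db>[A-Z]+)TOTAL-(?P<filename>[A-Za-z0-9]+)\.htm$"
)


def rewrite_cnki_url(old_url):
    """Return the oversea.cnki.net form of an old CNKI article URL, or None if it doesn't match."""
    match = OLD_CNKI.match(old_url)
    if match is None:
        return None
    return ("https://oversea.cnki.net/kcms/detail/detail.aspx"
            f"?dbcode={match.group('db')}&filename={match.group('filename')}")


new_url = rewrite_cnki_url("http://www.cnki.com.cn/Article/CJFDTOTAL-BEAR200806023.htm")
print(new_url)
```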

Also links (but no DOIs) in Wanfang https://d.wanfangdata.com.cn/periodical/dwxb200801015 Also links in CQVIP e.g. http://www.cqvip.com/qk/94056x/200801/26635405.html

Name changes, ISSN changes, etc. Acta Zootaxonomica Sinica (Q15761826) and Zoological Systematics (Q21386166). Lack of pagination data. Lots of articles already added by Stevenliuyi with CNKI CJFD journal article ID (P6769) identifier, but these include old articles in Acta Zootaxonomica Sinica (Q15761826) linked to the new name for the journal, Zoological Systematics (Q21386166).

I've started to move the pre-2014 articles to Acta Zootaxonomica Sinica (Q15761826) and to add the Wanfang DOIs. Pagination will have to be added later.

2020-11-14 Danger Will Robinson! CNKI and Wanfang article numbering is different, so can't rely on simply mapping Wanfang URL to CNKI even though they look very similar :(. Will need to be cleverer about the mapping...

Acta Zootaxonomica Sinica
Years | ISSN | Source | DOIs | Notes
1964- | 1000-0739 | CNKI |  | Acta Zootaxonomica Sinica (Q15761826)
1998-2013 | 1000-0739 | Wanfang | 10.3969/j.issn.1000-0739... | Not all have DOIs
2014- | 2095-6827 | http://www.zootax.com.cn | 10.11865/zs. | DOIs don't seem to resolve?

This journal has suffered from having BHL treat issues as titles, so that Wikidata has separate items for articles that have poor metadata and aren't linked to the journal. Seems to be restricted to pre-1923 content, will need to check. Given that BHL continues to do this, will need to spend some time linking to newer BHL and IA content...

  • I've linked most (all?) stray pre-1923 articles to Handles and the journal itself. 2020-12-18

Many articles with PMID but no DOI (and also badly translated English titles). See https://w.wiki/55Ld for query for PMID but no DOI. I have matched PMIDs to records locally, need to move this to Wikidata.

Article A revision of the genus Heliophanus C. L. Koch, 1833 (Aranei: Salticidae) (Q60864671) has a bad Internet Archive ID (P724) as it contains []. Need to fix.

We have lots of BioOne DOIs for articles before 2000, which is when BioOne first has content for this journal. Prior to 2000 all DOIs that resolve to content are JSTOR DOIs. Looks like BioOne fed CrossRef a bunch of DOIs for which it never had content, hence CrossRef metadata has lots of duplicate DOIs. Some of these are in Wikidata... what a mess.

I've added JSTOR DOIs to those pre-2000 articles that only had "fake" BioOne DOIs, and added all JSTOR-DOI content prior to 2000 Rdmpage (talk) 13:46, 4 April 2022 (UTC)

Note that we also have some BHL DOIs for this journal, and also there are a small number of articles in Horizon.

There is overlap between Persee and lasef.org. CrossRef from 2018 points to lasef.org, and the landing page is the volume, not the individual article.

So we have Persee and lasef.org overlap 2008-2016, 2017 is just lasef.org, then 2018 - CrossRef

Bulletin de la Société Entomologique de France
Years Source DOIs Notes
1896-2016 Persee https://www.persee.fr/collection/bsef Some articles from 1896 have DOIs 10.3406/bsef. Post 2016 are embargoed so no metadata, all PDFs have a captcha in front of them.
2008- https://lasef.org/ PDFs freely available (most recent under embargo)
2018- CrossRef 10.32475/bsef.


Some have DOIs e.g. Five new species of the genus Trichotichnus from Taiwan (Coleoptera, Carabidae, Harpalini) (Q111523062) 10.20643/00001606 which leads to a repository https://omnh.repo.nii.ac.jp

Need to sort out ISSNs and history

Also bad things have happened involving mismatch between metadata (titles and pages) and Internet Archive PDFs. Need to review all articles and sort this out :(

For example Cloning and Gene Expression of 3-Hydroxy-3-Methylglutaryl-CoA Synthase Gene( AsHMGS ) from Aquilaria sinensis (Lour.) Gilg (Q96107615) title doesn't match first line or pagination, and IA PDF is for different article.

Note that there are duplicate DOIs, e.g. 10.7525/j.issn.1673-5102.2007.05.002 appears in the metadata for two articles!

Wrong or inconsistent metadata in Wikidata

  • Bad
  • Fixed/OK

Q96107547 Q96107548 Q96107551 Q96107552 Q96107555 Q96107556 Q96107557 Q96107559 Q96107568 Q96107570 Q96107571 Q96107573 Q96107574 Q96107576 Q96107578 Q96107580 Q96107581 Q96107582 Q96107590 Q96107592 Q96107593 Q96107594 Q96107597 Q96107599 Q96107601 Q96107606 Q96107607 Q96107611 Q96107613 Q96107615 Q96107626 Q96107710 Q96107886 Q96107896 Q96107978 Q96108014 Q96108078 Q96108079 Q96108091 Q96108099 Q96108284 Q96108340 Q96108438 Q96108512 Q96108519 Q96108655 Q96108753 Q96108770 Q96108786 Q96108839 Q96108857 Q96108909 Q96108977 Q96109101

Wrong PDF in IA


For some articles the contents of the PDF in IA are wrong, e.g. Q96107529 Q96107530 Q96107532 Q96107535 Q96107536 Q96107528

I've made all these items in IA dark.

Wrong IA metadata


I've deleted DOIs when I've found a mismatch between metadata and DOI.


Bulletin of the United States National Museum


Bit of a mess with multiple items of different type, see discussion https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Periodicals/Archive_2#Bulletin_of_the_United_States_National_Museum https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Periodicals/Archive_3#Bulletin_of_the_United_States_National_Museum_(take_2) https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Taxonomy/Archive/2020/10#Bulletin_of_the_United_States_National_Museum


Based on BHL, ISSN and SuDoc

Name | ISSN | BHL | Sudoc | Years
Bulletin - or / United States National Museum | 0362-9236 | 169510 | 037366866 | 1907-1971
Bulletin of the United States National Museum | 0096-2961 | 169509 | 036653233 | 1875-1905

Note also on BHL "No. 1-16 issued also in: Smithsonian miscellaneous collections, v. 13, 23-24."

Bulletin of the United States National Museum
Name Description Years ISSN BHL Notes
Bulletin - United States National Museum (Q21385133) version of volume 1 of a journal 1877-1971 0362-9236 BHL:169509 (1875-1905)

118 publications linked to this item 1889-1969 https://w.wiki/4PrV

Bulletin of the United States National Museum (Q21385329) scientific journal (1875–1905) 1875–1905 0096-2961

Nine publications linked to this item 1886-1970 https://w.wiki/4PrY

Bulletin of the United States National Museum (Q56634273) volume 1 of a publication

has edition or translation (P747) Bulletin - United States National Museum (Q21385133) , no external identifiers

Bulletin of the United States National Museum (Q56633969) publication of the United States Museum

has part(s) (P527) Bulletin of the United States National Museum (Q56634273) no external identifiers but linked to https://commons.wikimedia.org/wiki/Category:Bulletins_of_the_United_States_National_Museum which spans 1879-1971

Looks like eLibrary.ru (Q4037789) may have made some mistakes linking translations together, may have to check against Springer website for Entomological Review (Q47161189). Rdmpage (talk) 17:11, 29 January 2023 (UTC)

Some confusion between Japanese Journal of Ichthyology (Q21385442) and Ichthyological Research (Q15760079) (partly caused by Springer). These are two separate journals. Metadata for Japanese Journal of Ichthyology (Q21385442) retrieved via DOIs often lacks titles, will need to add manually. Also need to add titles in multiple languages via harvesting web site.

Online but will require some effort to get list of articles http://zgnydxxb.ijournals.cn/zgnydxxb/ch/index.aspx

Three articles so far, note that Japanese Species of Parmelia Ach. (sens. str.), Parmeliaceae (Q59123248) is a composite of several articles with the same title (i.e., it is a series of papers).

Journal of The Asiatic Society of Bengal (and Materials for a Flora of the Malayan Peninsula)


This journal contains Materials for a Flora of the Malayan Peninsula, which has also been issued as a separate reprint.

There is a scanned archive at South Asia Archive (Q104412505) http://www.southasiaarchive.com that is behind a paywall, there are also freely accessible PDFs at https://pahar.in/journals/.

Journal of The Asiatic Society of Bengal
Name Years ISSN BHL Notes
Journal of the Asiatic Society of Bengal. Part 2. Natural History (Q2840714) 1871(?)-1936 https://www.biodiversitylibrary.org/bibliography/51678 Treated as a separate publication by Wikidata and IPNI. Wikidata says it "replaced" Journal of the Asiatic Society of Bengal (Q16584125), whereas it was a separate part ("2") that continued on after the journal Journal of the Asiatic Society of Bengal (Q16584125) turned into journal and proceedings.
Journal of the Asiatic Society of Bengal (Q16584125) 1832-1905 (1936 for part 2) 0368-1068
Proceedings of the Asiatic Society of Bengal (Q41298163) 1865-1904 0369-8416 https://www.biodiversitylibrary.org/bibliography/9578 Proceedings of the Asiatic Society of Bengal
Journal and proceedings of the Asiatic Society of Bengal (Q51496795) 1905- 0368-3451 https://www.biodiversitylibrary.org/bibliography/47024 new ser., v.1 onwards, note that the Proceedings are at the end of the volume, and some of the archive indexes the proceedings (e.g., http://www.southasiaarchive.com/Content/sarf.120250/227126/003 ). southasiaarchive has new volume numbering from 1906
Journal of the Asiatic Society 1935 to 1950's(?) https://catalog.hathitrust.org/Record/000500734 Volume change in southasiaarchive for 1935
Journal of the Asiatic Society (Q27716010) Vol. 1, no. 1 (1959)- 0368-3303
Materials for a Flora of the Malayan Peninsula Reprint, see https://biostor.org/reference/184053 and https://www.nparks.gov.sg/sbg/research/publications/gardens-bulletin-singapore/-/media/sbg/gardens-bulletin/4-4-36-2-02-y1983-v36p2-gbs-pg-177.pdf for details and advice on how to cite. https://www.biodiversitylibrary.org/bibliography/10805 There are various versions in BHL https://www.biodiversitylibrary.org/search?searchTerm=Materials+for+a+Flora+of+the+Malayan+Peninsula&stype=F#/titles

Items about the journal include Systematic notes on Asian birds. 51. Dates of avian names introduced in early volumes of the Journal of the Asiatic Society of Bengal (Q89633372)

Items about specific articles include XIV.—Notes on the œconomy of the Paussidæ, extracted from Capt. W. J. E. Boyes' Paper, published in the Journal of the Asiatic Society of Bengal (No. 138.—N. S. No. 54) (Q99848588).

Items about Materials for a Flora of the Malayan Peninsula: Materials for a Flora of the Malayan Peninsula (Q13554331) (generic item for work) and Materials for a flora of the Malayan Peninsula (Q51383079) (BHL copy). Articles mentioning it: Materials for a Flora of the Malayan Peninsula (Q64285277) and Materials for a Flora of the Malayan Peninsula (Q64276618) (in Nature).

Multiple DOIs, duplication, redirects, etc. Plus we have items from PubMed that don't have DOIs. In short, a clusterfuck.

Suggested work flow:

Map DOIs 10.3852 to Wikidata, use php doi_to_doi.php to get redirect DOIs, map those to Wikidata, then see if we need to add extra DOIs to those items (i.e., 10.3852 and 10.1080) https://w.wiki/53Nn
Map PMIDs to Wikidata for references that lack DOIs in Wikidata https://w.wiki/53Nk then add those DOIs to Wikidata
Map JSTOR to Wikidata, add JSTOR ids if missing from Wikidata
Merge records with different DOIs and update corresponding Wikidata items
Add any missing records to Wikidata
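
The first step of this work flow boils down to sorting each pair of DOIs (old 10.3852, new 10.1080) by what Wikidata already has. A minimal sketch, assuming the redirect pairs and a DOI-to-item lookup with lowercased keys have already been built; the function name, the DOIs, and the item ids below are all placeholders:

```python
def classify_doi_pairs(pairs, doi_to_item):
    """Sort (old_doi, new_doi) redirect pairs by what Wikidata already holds."""
    result = {"duplicates": [], "add_new_doi": [], "add_old_doi": [], "missing": []}
    for old_doi, new_doi in pairs:
        old_item = doi_to_item.get(old_doi.lower())
        new_item = doi_to_item.get(new_doi.lower())
        if old_item and new_item:
            if old_item != new_item:
                # Two items for one article: candidates for merging.
                result["duplicates"].append((old_item, new_item))
            # Otherwise one item already carries both DOIs: nothing to do.
        elif old_item:
            result["add_new_doi"].append((old_item, new_doi))
        elif new_item:
            result["add_old_doi"].append((new_item, old_doi))
        else:
            result["missing"].append((old_doi, new_doi))
    return result


# Placeholder data, not real DOIs or items.
pairs = [("10.3852/a", "10.1080/a"), ("10.3852/b", "10.1080/b"), ("10.3852/c", "10.1080/c")]
doi_to_item = {"10.3852/a": "Q-A1", "10.1080/a": "Q-A2", "10.3852/b": "Q-B"}
report = classify_doi_pairs(pairs, doi_to_item)
print(report)
```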

Issues


Using redirection to find the "other DOI" for 10.3852 uncovered 25 items that are duplicates, i.e., both DOIs have a Wikidata item:

Duplicates
Q33292380 Q54802190
Q31120184 Q105485138
Q31042045 Q63854883
Q39437650 Q60401823
Q34491252 Q110615535
Q58803921 Q81300049
Q28295469 Q56485068
Q33309119 Q110827263
Q31139254 Q57254285
Q56915900 Q31111921
Q60449006 Q51709251
Q28301216 Q59678579
Q31130104 Q56931476
Q22255400 Q34626370
Q56145227 Q31111919
Q28276701 Q60895113
Q58045983 Q39098297
Q64385287 Q31081978
Q51119147 Q57448972
Q111373678 Q51188492
Q111262623 Q80758057
Q31120961 Q59205306
Q58656219 Q82464853
Q56951379 Q34399871
Mycologia
Years | Source | DOIs | Notes
1909-2004 | JSTOR | 10.2307 | Resolve to JSTOR
2005-2016 | Now on T&F | 10.3852 | DOIs like 10.3852/mycologia seem to be redirects to 10.1080
1909-2022 | T&F | 10.1080 | Complete coverage, so many duplicates of JSTOR content
1945-2020 | PubMed | Some missing DOIs | Sporadic, many lacking DOIs


These duplicates have now been merged.

Then looked at items with PMID but no DOI in time period 2005-2016, these seem to be articles with no DOI at all.

Next issue is to identify those records that have only one DOI in Wikidata. These correspond to items with a 10.3852 DOI, a Wikidata item, and no Wikidata item for the 10.1080 DOI. I have added the missing DOIs to these records (they now have 2 DOIs).

This leaves us with DOIs for this time period that have no Wikidata item at all. These will be added.

DOIs from 2017 onwards are just T&F so these can be added directly.

Records for 1909-2014 will have to be merged where they have two DOIs, and we also need to check for PubMed-only records, of which there are a number. PMIDs will link both DOIs, so will need both DOIs added, then add the missing DOI for records with one DOI, then add missing records (and extra DOIs).

Details


Progress: volume 96

-- Look at a volume
SELECT guid, title, volume, spage, epage, doi, wikidata, pmid, pii
FROM publications_tmp
WHERE issn='0027-5514' AND volume=96
ORDER BY CAST(spage AS SIGNED);

-- add DOIs to records with PMID but no DOI
SELECT CONCAT(pii, char(9), 'P356', char(9), '"', doi, '"')
FROM publications_tmp
WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pmid IS NOT NULL
ORDER BY pii;

-- Get wikidata for DOIs with no PMID
SELECT guid
FROM publications_tmp
WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pii IS NULL;

-- Get any articles with newer DOIs that we should add
SELECT *
FROM publications_tmp
WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pii IS NULL AND doi LIKE '10.1080%';

PubMed errors


There is a problem with volumes 60 and 61 in PubMed, some articles are assigned to the wrong volume :( Fixed Rdmpage (talk) 08:46, 15 April 2022 (UTC)

CNKI, see also https://manu40.magtech.com.cn/Jwxb

History of the journal in https://manu40.magtech.com.cn/Jwxb/CN/abstract/abstract2906.shtml https://doi.org/10.13346/j.mycosystema.2011.01.014 = https://www.cnki.net/kcms/doi/10.13346/j.mycosystema.2011.01.014.html

Mycosystema
Years | Journal | ISSN | Volumes | Source | Pagination? | Notes | Status
1982-1996 | Acta Mycologica Sinica | 0256-1883 | 1-15 | magtech and CNKI | Yes | DOIs 10.13346 | Not added yet
1997-2003 | Mycosystema | 1007-3515 | 16-22 | magtech | no | no DOIs(?) | Not added yet
2004-2013 | Mycosystema | 1672-6472 | 23- | magtech and CNKI | Yes | DOIs 10.13346 | Not added yet

Muséum National d'Histoire Naturelle journals


See 1802–2018: 220 ans d'histoire des périodiques au Muséum (Q93462644) and Timeline of the scientific publications of The Museum for details.

Records of the Auckland Museum


Special:Contributions/Prosperosity has made some edits to this journal and merged the older records I added (JSTOR) to newer ones they created. I will need to update my local mapping between JSTOR ids and Wikidata to accommodate this.

  • Done

Russian Journal of Genetics / Genetika


Wikidata has these journals confused. For example Gene diversity for haptoglobin and transferrin classical markers among Hindu and Muslim populations of Aligarh City, India. (Q48605442) is stated as being published in Russian Journal of Genetics (Q15753063) (the English language journal) with a link to PubMed, but PubMed says it is published in Genetika which is the Russian journal (which lacks a Wikidata item). Russian Journal of Genetics (Q15753063) has three ISSNs, 1022-7954, 1608-3369, 0016-6758, the last one (0016-6758) is for Genetika (Moskva) https://portal.issn.org/resource/issn/0016-6758. Need to unpack this journal and link articles to correct journal. Note that the English language articles will (mostly? all?) be translations of the articles in Genetika.

For the case of the article "Gene diversity for haptoglobin and transferri..." the Wikidata record is from GENETIKA but Wikidata links this to Russian Journal of Genetics. A bit of a mess :(.

Gene diversity for haptoglobin and transferrin classical markers among Hindu and Muslim populations of Aligarh City, India (Q48605442).
Journal | DOI | PMID | URL | pages
Russian Journal of Genetics | 10.1134/s1022795411060044 |  | https://link.springer.com/article/10.1134%2FS1022795411060044 | 47(6): 744-748
GENETIKA ГЕНЕТИКА | - | 21866866 | https://elibrary.ru/item.asp?id=16455347 | 47(6): 842-846

Some articles with DOIs, PDFs available, website is a bit sluggish. Has English and Chinese metadata.

There is a big gap in DOI coverage where we have JSTOR ids but no DOIs. These DOIs exist for Wiley content, so need to match DOIs to existing Wikidata records (from JSTOR).

Volumes to do are 57-61, see https://w.wiki/3dCH

Journal has several names, not sure of the timing of each, e.g. Transactions of the Lepidopterological Society of Japan, Tyô to Ga, Lepidoptera Science.

My first import generated a number of duplicates as I didn't check that DOIs were unique before adding them (doh!). This resulted in 1348 duplicates, which I am merging.
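
The check I should have run first is simple enough. A sketch, assuming the DOIs already in Wikidata have been exported to a list (for example from a query); DOIs are compared case-insensitively, and the function name and sample DOIs are placeholders:

```python
def split_new_and_existing(candidate_dois, existing_dois):
    """Separate DOIs that are safe to import from ones that would create duplicates."""
    existing = {doi.strip().upper() for doi in existing_dois}
    seen = set()
    new, skipped = [], []
    for doi in candidate_dois:
        key = doi.strip().upper()
        if key in existing or key in seen:
            # Already in Wikidata, or repeated within the import batch itself.
            skipped.append(doi)
        else:
            new.append(doi)
            seen.add(key)
    return new, skipped


# Placeholder DOIs for illustration only.
new, skipped = split_new_and_existing(
    ["10.1234/abc", "10.1234/def", "10.1234/DEF"],
    ["10.1234/ABC"],
)
print(new, skipped)
```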

I also want to link these articles to their CiNii identifiers.


Volume 42 (2008) onwards has CrossRef DOIs, although there are issues with their resolution. Prior to 2008 lots of articles online, not all with volumes, etc.

Many articles already in Wikidata, with a mix of DOIs (not all work); also some coverage in Wanfang and CNKI. Wikidata coverage is based primarily on PubMed. Note that there are at least three different DOI agencies, and some overlap in DOIs and/or agencies.

Two ISSNs (and two Wikidata items) Zoological Research (Q15766889) 0254-5853 and Zoological Research (Q27714095) 2095-8137

According to NLM "Began with Volume 37, issue 5 (18 September 2016)." https://locatorplus.gov/cgi-bin/Pwebrecon.cgi?v1=5&ti=1,5&SC=Title&SA=Zoological%20research&PID=e3kbuLJdLaFS0PmiKWaW_rCl&SEQ=20210406112547&SID=2 but this is not entirely clear from the journal itself. For example, the cover for "Volume 35 Issue 5 18 September 2014" (3516) says ISSN 2095-8137 whereas "Volume 35 Issue 3 18 May 2014" has ISSN 0254-5853. 2014 also seems to be the year that the DOIs have mixed ISSNs. What a mess. Cover of vol 35 issue 4 18 July 2014 has ISSN 2095-8137 and also DOIs with that ISSN, so I think that is the issue when the ISSN changed.

Note that CrossRef and Wanfang DOIs overlap in volume 29! Also looks like not all Crossref DOIs work :(

Added most of 0254-5853 vols 1-34, 35. Still need to add Chinese titles for articles after vol 29, as many of the existing titles are PubMed English translations. Will need to import the Chinese titles and store them as multilingual titles, because the current code uses the DOI as GUID and hence misses the Chinese titles. Rdmpage (talk) 13:47, 2 May 2022 (UTC)

Zoological Research

Years     | Volumes | DOIs     | ISSN      | Notes
2017-     | 38-     | CrossRef | 2095-8137 | DOI 10.24272/...
2014-2017 | 35-38   | CNKI     | 2095-8137 | DOI 10.13918/j.issn.2095-8137
2013-2014 | 34-35   | ?        | 0254-5853 | DOI 10.11813/j.issn.0254-5853 (broken)
2004-2021 | 25-42   | ?        | ?         | Bioline http://www.bioline.org.br/toc?id=zr
2008-2013 | 29-34   | CrossRef | 0254-5853 | DOI 10.3724/...
1999-2008 | 20-29   | ISTIC    | 0254-5853 | DOI 10.3321/j.issn: Wanfang
1980-1998 | 1-19    |          | 0254-5853 | No DOIs
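
Because the agencies overlap, it can help to label each DOI by its prefix before deciding how to resolve or import it. The sketch below simply encodes the prefixes listed in the Zoological Research table; the function name is hypothetical, "?" means the table does not identify the agency, and the DOI suffixes in the usage note are made up for illustration.

```python
# DOI prefix -> registration agency, copied from the Zoological Research table.
# "?" marks a prefix whose agency the table does not identify.
AGENCY_BY_PREFIX = {
    "10.24272": "CrossRef",
    "10.13918": "CNKI",
    "10.11813": "?",
    "10.3724": "CrossRef",
    "10.3321": "ISTIC",
}


def doi_agency(doi):
    """Return the agency recorded for this DOI's prefix, or "unknown"."""
    prefix = doi.strip().split("/", 1)[0]
    return AGENCY_BY_PREFIX.get(prefix, "unknown")
```

For example, `doi_agency("10.3321/j.issn:0254-5853.0000.00.000")` gives "ISTIC", while a DOI from another publisher gives "unknown".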

Multilingual titles

If you have a Chinese title (e.g., "西北植物学报") and a transliteration (e.g., "Xibei zhiwu xuebao") then you can connect the two using Hanyu Pinyin transliteration (P1721). See Acta botanica Boreali-Occidentalia Sinica (Q27721266).

Quote from Contributions to the botanical journal Sunyatsenia from 1930 to 1948 (Q28944969)

"Names of Chinese botanists follow the convention of placing the family name first followed by given names; names of Westerners follow the western convention of placing the family name last. Chinese botanists mostly followed the Wade-Giles system of Romanization when transliterating their name; and the current pinyin system was initiated late in the careers of most of the early Chinese botanists around 1950s. They were not required to adopt the pinyin system if they had actively published and were known under a different transliteration of their name."

Wade-Giles is Wade-Giles (Q208442). This means we may need to take some care in handling Chinese names for older literature.

Titles with HTML markup

title (P1476) shouldn't have any markup, but you can add title in HTML (P6833) as a qualifier on the title statement to include the markup. For example, see Sur le genre Trypanoxyuris (Oxyuridae, Nematoda) IV. Sous-genre Trypanoxyuris parasite de Primates Cebidae et Atelidae (suite) Étude morphologique de Trypanoxyuris callicebi n. sp. (Q64173850).
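
A minimal sketch of splitting a marked-up title into the two values, using only the Python standard library. The function name and the returned dict shape are hypothetical; the sketch assumes the source metadata supplies the title as an HTML fragment.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes and ignore all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def split_title(html_title):
    """Return the plain title for P1476 and the original markup for P6833."""
    parser = _TextOnly()
    parser.feed(html_title)
    parser.close()
    # Collapse runs of whitespace left behind by removed tags.
    plain = " ".join("".join(parser.parts).split())
    return {"P1476": plain, "P6833": html_title}
```

If the plain and HTML versions come out identical there is no markup to preserve, so the qualifier can be skipped.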

Full text

Note document file on Wikimedia Commons (P996) e.g. for A new cryptic species of Anolis lizard from northwestern South America (Iguanidae, Dactyloinae) (Q58700998) which essentially embeds a PDF in Wikidata!

Checksums

Maybe add checksum (P4092) as a property to a publication, as a way to link (indirectly) to content, see also https://hash-archive.org and https://bentrask.com/?q=hash://sha256/98493caa8b37eaa26343bbf73f232597a3ccda20498563327a4c3713821df892 by Ben Trask (Q63232898).
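
A minimal sketch of computing such a checksum with Python's `hashlib`, assuming SHA-256 (the algorithm used in the hash URI above). The function names are hypothetical; the `hash://sha256/` form just mirrors the bentrask.com link.

```python
import hashlib


def sha256_of_chunks(chunks):
    """Hash an iterable of byte strings without joining them in memory."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file (e.g. a downloaded PDF) in 1 MiB chunks."""
    with open(path, "rb") as handle:
        return sha256_of_chunks(iter(lambda: handle.read(chunk_size), b""))


def hash_uri(hex_digest):
    """Format a hex digest as a hash:// URI."""
    return "hash://sha256/" + hex_digest
```

The hex digest is the value that would go into checksum (P4092); the URI form is only useful for lookups in hash-based archives.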

Matches without series ordinal

Note that User:EvaSeidlmayer has added author (P50) to lots of references without adding series ordinal (P1545), and has left author name string (P2093) in place, so we have two entries for the same author, one as a thing and one as a string (see e.g., North American distribution of Eleocharis mamillata (Cyperaceae) and confusion with E. macrostachya and E. palustris (Q100395512)).
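
One possible way to find these doubled-up authors is to compare the labels of the author (P50) items with the author name string (P2093) values on the same reference. The sketch below uses a deliberately crude name key (accents, punctuation and case removed), so it will miss initials versus full names; the function names, input shapes and the item ID in the usage note are all hypothetical.

```python
import unicodedata


def name_key(name):
    """Crude comparison key: drop accents, turn punctuation into spaces, fold case."""
    decomposed = unicodedata.normalize("NFKD", name)
    chars = [
        c if c.isalpha() else " "
        for c in decomposed
        if not unicodedata.combining(c)
    ]
    return " ".join("".join(chars).casefold().split())


def find_redundant_name_strings(author_items, name_strings):
    """Match author items to name strings that look like the same person.

    author_items: {item id: label} for the P50 values.
    name_strings: {series ordinal: string} for the P2093 values.
    Returns a list of (item id, series ordinal, name string).
    """
    by_key = {name_key(label): qid for qid, label in author_items.items()}
    matches = []
    for ordinal, text in name_strings.items():
        qid = by_key.get(name_key(text))
        if qid is not None:
            matches.append((qid, ordinal, text))
    return matches
```

Each match says which name string could be removed and which series ordinal should be copied onto the corresponding author (P50) statement.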


Redirects

Wikispecies

Sometimes we have authors (or other entities) that have two Wikidata items (e.g., two links to Wikispecies) when there is really only one entity (e.g., one person). An example is Eduardo Flórez Daza (Q21392863) and Eduardo Flórez Daza (Q56650857). These are the same person, and the Wikispecies entry for Eduardo Flórez Daza is a redirect to Álvaro Eduardo Flórez-Daza. The convention for this seems to be:

In this case Eduardo Flórez Daza (Q21392863) is now mostly empty and data on this person can be found at Eduardo Flórez Daza (Q56650857).

Authors

When merging authors, e.g. John William Thieret (Q102229589) with John William Thieret (Q21390395), the expectation is that a bot will update every link to Q102229589 to point to Q21390395. This process seems to take a long time. Telegram chat suggests 8 days https://t.me/c/1497612692/4509. Q102229589 was made a redirect on 2023-07-19, and the links were updated on 2023-07-27 by User:KrBot (eight days later).

Bibliographic relationships

Reviews

Could use for reviews of books, etc.

Translations

Two New Species of the Weevil Genus Mecysmoderes Schoenherr, 1837 (Coleoptera, Curculionidae: Ceutorhynchinae) from Vietnam (Q99837830) in Entomological Review (Q47161189) is the English language version of ДВА НОВЫХ ВИДА ДОЛГОНОСИКОВ РОДА MECYSMODERES SCHOENHERR, 1837 (COLEOPTERA, CURCULIONIDAE: CEUTORHYNCHINAE) ИЗ ВЬЕТНАМА (Q99838137) in Entomologicheskoe Obozrenie (Q4532102). How do we represent this relationship?

OK, we can use edition or translation of (P629) and its inverse has edition or translation (P747) to link the two works together. Maybe we should also make the translated article an instance of version, edition or translation (Q3331189).
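
Since the link has to be made in both directions, a small helper can emit the pair of statements together. This is a sketch that writes QuickStatements V1 lines (tab-separated item, property, value); the function name is hypothetical, and which article counts as the original is an assumption (here the Russian article is treated as the work and the English one as its translation).

```python
import re

QID = re.compile(r"^Q[1-9][0-9]*$")


def edition_link_commands(edition_qid, work_qid):
    """Return the two QuickStatements lines linking an edition and its work."""
    for qid in (edition_qid, work_qid):
        if not QID.match(qid):
            raise ValueError(f"not an item ID: {qid!r}")
    return [
        f"{edition_qid}\tP629\t{work_qid}",  # edition or translation of
        f"{work_qid}\tP747\t{edition_qid}",  # has edition or translation
    ]
```

For the Mecysmoderes pair this would be `edition_link_commands("Q99837830", "Q99838137")`.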


Errata

The article The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99931585) has an erratum Erratum to: The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99888822). To connect an article to its errata we use corrigendum / erratum (P2507) as a property of the original article, hence we have Q99931585 -- P2507 --> Q99888822

Note that there are bots that automatically add instance of (P31) erratum (Q1348305) to errata (see the history of Erratum to: The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99888822)).

Note also that User:Trilotat has some useful queries to find corrections and the things they correct https://www.wikidata.org/wiki/User:Trilotat/SPARQL#Corrections,_errata_and_corrigenda

PDFs

To add a PDF for an article use full work available at URL (P953), add file format (P2701) Portable Document Format (Q42332) as a qualifier to say that it is a PDF, and add archive URL (P1065) with a link to the URL in the Wayback machine if it has been archived there. See Trithecoides, a new subgenus of Culicoides (Diptera: Ceratopogonidae) (Q89666437) for an example.
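
The same statement can be written as a single QuickStatements V1 line, with the qualifiers following the main value. A minimal sketch, assuming tab-separated V1 syntax with URLs passed as double-quoted strings; the function name and the example.org URL in the usage note are hypothetical.

```python
def pdf_statement(article_qid, pdf_url, archive_url=None):
    """Build one QuickStatements V1 line for a PDF link with its qualifiers."""
    for url in (pdf_url, archive_url):
        if url and ('"' in url or "\t" in url):
            raise ValueError(f"URL cannot be quoted safely: {url!r}")
    fields = [
        article_qid,
        "P953", f'"{pdf_url}"',  # full work available at URL
        "P2701", "Q42332",       # qualifier: file format = Portable Document Format
    ]
    if archive_url:
        fields += ["P1065", f'"{archive_url}"']  # qualifier: archive URL
    return "\t".join(fields)
```

Calling `pdf_statement("Q89666437", "https://example.org/paper.pdf")` gives a line with just the file format qualifier; pass `archive_url` only when a Wayback Machine copy is known to exist.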

Books

One model is that the "book" is a written work (Q47461344), which has basic information (title, author) and an identifier such as OCLC work ID (P5331). Then we have editions, which are version, edition or translation (Q3331189) and carry identifiers such as ISBNs (e.g., ISBN-10 (P957)), Google Books ID (P675), etc. Works link to their editions with has edition or translation (P747), and editions link back to the work with edition or translation of (P629). The Wikidata:WikiProject Books wants every book to have both a written work (Q47461344) and at least one version, edition or translation (Q3331189), which seems redundant in many cases. For now I use Google Books to add books and by default make them written work (Q47461344). I follow Wikidata:WikiProject Books if there are multiple editions that seem important (e.g., they are cited).

Wikisource

See for example The Afghan War (Q19077572).

A version, edition or translation (Q3331189) has document file on Wikimedia Commons (P996), linking to a file on Commons, and Wikisource index page URL (P1957) which is the link to the Wikisource page for the transcription of the book.

Chapters

A chapter (Q1980247) is part of (P361) a book, and the book should list each chapter as has part(s) (P527), see for example The Canterbury Tales (Q191663)


Citations

In Ridleyandra merohmerea (Gesneriaceae), a new species from Kelantan, Peninsular Malaysia (Q42258926) I explored adding citations without DOIs as strings using unknown (Q24238356). See also proposal by GerardM for a citation string Wikidata:Property proposal/cites work string.

On the basis of this (unsuccessful) proposal GerardM has been exploring adding citations to cites work (P2860) using placeholder for "somevalue" (Q53569537), see for example Can trophic rewilding reduce the impact of fire in a more flammable world? (Q57805204).

Given that QuickStatements struggles with placeholder for "somevalue" (Q53569537), we will need to look at using the API to edit these statements directly (using unique statement IDs). Need to be able to:

  • add a citation that lacks an item (with qualifiers)
  • retrieve details of citation that lacks item so we can try and add or match it
  • update citation that lacks an item with corresponding item
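
The three operations above can be sketched as plain manipulations of the Wikibase claim JSON, leaving the actual API calls out. This is only a sketch under assumptions: the helper names are hypothetical, the qualifiers shown (P1932 and P1545) are the ones used in the unstructured-citation examples on this page, and the expectation is that new claims would be sent with something like `wbeditentity` and updated ones with `wbsetclaim` (which identifies the statement by its `id`).

```python
import copy


def string_snak(prop, value):
    """A snak holding a plain string value."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"value": value, "type": "string"},
    }


def unmatched_citation_claim(stated_as, ordinal=None):
    """Add: a cites work (P2860) claim with an unknown value plus qualifiers."""
    qualifiers = {"P1932": [string_snak("P1932", stated_as)]}
    if ordinal is not None:
        qualifiers["P1545"] = [string_snak("P1545", str(ordinal))]
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {"snaktype": "somevalue", "property": "P2860"},
        "qualifiers": qualifiers,
    }


def unmatched_citations(claims):
    """Retrieve: the P2860 claims that still lack an item."""
    return [c for c in claims if c["mainsnak"]["snaktype"] == "somevalue"]


def resolve_citation(claim, cited_qid):
    """Update: copy of the claim pointing at cited_qid; id and qualifiers are kept."""
    updated = copy.deepcopy(claim)
    updated["mainsnak"] = {
        "snaktype": "value",
        "property": "P2860",
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {
                "entity-type": "item",
                "numeric-id": int(cited_qid[1:]),
                "id": cited_qid,
            },
        },
    }
    return updated
```

Keeping the qualifiers on the resolved claim preserves the original citation string, which makes it possible to check the match later.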

API experiments Q102901875 and Q102902439.


Unstructured citations

Querying for unstructured citations:

select * where {
  # Ridleyandra merohmerea ...
  VALUES ?work { wd:Q42258926 } .
  
  # Outsized effect of predation...
  # VALUES ?work { wd:Q102058694 } .

  # Get cited works  
   ?work p:P2860 ?statement . 
   ?statement ps:P2860 ?cites . 
  
   # stated as
   OPTIONAL { ?statement pq:P1932 ?unstructured . }
 
   # series ordinal
   OPTIONAL { ?statement pq:P1545 ?position . }

   # title
   OPTIONAL { ?statement pq:P1476 ?title . }

   # author name string
   OPTIONAL { ?statement pq:P2093 ?authors . }

   # publication date
   OPTIONAL { ?statement pq:P577 ?date . }

   # DOI
   OPTIONAL { ?statement pq:P356 ?doi . }

    # URL
   OPTIONAL { ?statement pq:P953 ?url . }

  FILTER (!isIRI(?cites))
}
ORDER BY (xsd:integer(?position))

https://w.wiki/qD9

Bibliographic identifiers (and proposals)

(see also Template:Bibliographic_properties )

Handle ID (P1184)

Zenodo ID (P4901)

CNKI CJFD journal article ID (P6769)

WoRMS source ID (P6678)

https://www.wikidata.org/wiki/Wikidata:Property_proposal/National_Diet_Library_Persistent_ID

Wikidata:WikiProject BHL

Bibliographic licenses including text mining

Could add information on licensing when adding works via CrossRef, would need to create items for each license, see e.g. https://www.wikidata.org/wiki/Help:Copyrights#RightsStatements for links to various licenses that could be used as templates. For example,

See also https://www.crossref.org/documentation/retrieve-metadata/rest-api/text-and-data-mining-for-researchers/

Examples

Bibliographic harvesting, RSS

RSS feed

web feed URL (P1019)


URL (P2699) OAI endpoint, qualifier protocol (P2700) Open Archives Initiative Protocol for Metadata Harvesting (Q2430433)
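
Once a journal has an OAI endpoint recorded this way, harvesting starts from a `ListRecords` request. A minimal sketch of building the request URLs with the standard library; the function names and the example endpoint in the usage note are hypothetical, while `verb`, `metadataPrefix`, `set`, `from` and `resumptionToken` are standard OAI-PMH request parameters.

```python
from urllib.parse import urlencode


def oai_list_records_url(endpoint, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """URL for the first page of records from an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date  # e.g. "2021-01-01" for incremental harvests
    return endpoint + "?" + urlencode(params)


def oai_resume_url(endpoint, token):
    """URL for the next page; a resumption token replaces all other arguments."""
    return endpoint + "?" + urlencode({"verb": "ListRecords", "resumptionToken": token})
```

For example, `oai_list_records_url("https://journal.example.org/oai", from_date="2021-01-01")` asks for Dublin Core records changed since that date.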


Publishing engines

software engine (P408) Open Journal Systems (Q1710177) 

https://w.wiki/3fgD


Engines for taxonomy journals https://w.wiki/3fgN

select * {
  ?journal wdt:P31 wd:Q5633421 .
  ?journal wdt:P1476 ?title .
  ?journal wdt:P408 ?engine .
  ?engine rdfs:label ?label .
  FILTER(LANG(?title) = "en")
  FILTER(LANG(?label) = "en")
  ?article schema:about ?journal .
  FILTER(regex(str(?article), "species.wikimedia.org"))
  ?journal wdt:P495 ?country .
  ?country wdt:P625 ?coordinates .
}
LIMIT 10

Things to fix

New species and new records of ant-eating spiders from Mediterranean Europe (Araneae: Zodariidae) (Q104465474) has the same DOI cited many times, but this is an error as each reference is different! So we have massive duplication of works cited.

Other

(see also Template:Bibliographic_properties and Wikidata:WikiProject_Books)

IUCN conservation status (P141)

BHL page ID (P687)

zoological specimen (Q2114846)

