Description
Hi Pano,
Long time no see;-)
I have an issue regarding the evaluation of an XPath expression.
Here is the data (test.xml):
<root>
<record><header status=""><identifier>oai:boris.unibe.ch:144298</identifier><datestamp>2020-11-19T07:21:51Z</datestamp><setSpec>7375626A656374733D646463393030:646463393330</setSpec><setSpec>6469766973696F6E733D4443443541343432424434424531374445303430354338323739304334444532:4443443541343432433137424531374445303430354338323739304334444532:4443443541343432424630334531374445303430354338323739304334444532:4443443541343432433239314531374445303430354338323739304334444532</setSpec><setSpec>74797065733D64617461736574</setSpec></header><metadata>
<record schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
<leader>03144nam a2200517 u 4500</leader>
<controlfield tag="001">144298</controlfield>
<controlfield tag="005">20201119072151.0</controlfield>
<controlfield tag="006">m o d </controlfield>
<controlfield tag="007">cr |n |||a||||</controlfield>
<controlfield tag="008">200525s2020 |||m o d ger d</controlfield>
<datafield tag="024" ind1="7" ind2=" ">
<subfield code="a">10.7892/boris.144298</subfield>
<subfield code="2">doi</subfield>
</datafield>
<datafield tag="035" ind1=" " ind2=" ">
<subfield code="a">(BORIS)144298</subfield>
</datafield>
<datafield tag="040" ind1=" " ind2=" ">
<subfield code="a">CH-ZuSLS UBE</subfield>
<subfield code="b">ger</subfield>
<subfield code="e">rda</subfield>
</datafield>
<datafield tag="041" ind1=" " ind2=" ">
<subfield code="a">deu</subfield>
</datafield>
<datafield tag="100" ind1="1" ind2=" ">
<subfield code="a">Ginella, Francesca<name>Francesca Ginella</name><firstname>Francesca</firstname><lastname>Ginella</lastname></subfield>
<subfield code="e">VerfasserIn</subfield>
<subfield code="4">aut</subfield>
</datafield>
<datafield tag="245" ind1="1" ind2="0">
<subfield code="a">Anhang 1</subfield>
<subfield code="b">Materialbasis zu den Grosstierknochen von Seedorf, Lobsigesee</subfield>
<subfield code="h">[electronic resource] /</subfield>
<subfield code="c">Francesca Ginella, Jörg Schibler</subfield>
</datafield>
<datafield tag="264" ind1=" " ind2="1">
<subfield code="c">2020<normalized>2020-01-01</normalized></subfield>
</datafield>
<datafield tag="336" ind1=" " ind2=" ">
<subfield code="a">Text</subfield>
<subfield code="b">txt</subfield>
<subfield code="2">rdacontent</subfield>
</datafield>
<datafield tag="337" ind1=" " ind2=" ">
<subfield code="a">Computermedien</subfield>
<subfield code="b">c</subfield>
<subfield code="2">rdamedia</subfield>
</datafield>
<datafield tag="338" ind1=" " ind2=" ">
<subfield code="a">Online-Ressource</subfield>
<subfield code="b">cr</subfield>
<subfield code="2">rdacarrier</subfield>
</datafield>
<datafield tag="341" ind1=" " ind2=" ">
<subfield code="a">textual</subfield>
</datafield>
<datafield tag="347" ind1=" " ind2=" ">
<subfield code="a">Textdatei</subfield>
<subfield code="b">Spreadsheet</subfield>
<subfield code="c">7kB</subfield>
</datafield>
<datafield tag="506" ind1="0" ind2=" ">
<subfield code="a">Open Access</subfield>
<subfield code="f">Unrestricted online access</subfield>
<subfield code="f">License : Creative Commons: Namensnennung (CC-BY)</subfield>
<subfield code="q">CH-000038-5</subfield>
<subfield code="2">star</subfield>
<subfield code="3">Spreadsheet : https://boris.unibe.ch/144298/1/Lobsigesee_11_Grosstierknochen_Anhang_1_Materialbasis.csv</subfield>
</datafield>
<datafield tag="520" ind1="3" ind2=" ">
<subfield code="a">Datensatz zur Materialbasis der Grosstierknochen aus der neolithischen Moorsiedlung Seedorf, Lobsigesee (BE). Konkordanzliste zwischen Fundnummer, Positionsnummer und Auswertungseinheit/Phase und der Anzahl untersuchter Knochen, in: Ginella, F. und Schibler, J. 2020. Grosstierknochen, in: Heitz, C., Abseits der grossen Seen: Archäologie und Erhaltung der neolithischen Unesco-Welterbe-Fundstelle Seedorf, Lobsigesee. Mit Beiträgen von J. Affolter, C. Brombacher, F. Ginella, R. Haab, H. Hüster Plogmann, R. Krebs, L. Matile, Ph. Rentzel, J. Schibler und A. Hafner. Hefte zur Archäologie im Kanton Bern 7. Archäologischer Dienst des Kantons Bern: Bern 2020, 208–256.</subfield>
</datafield>
<datafield tag="546" ind1=" " ind2=" ">
<subfield code="a">deu</subfield>
</datafield>
<datafield tag="653" ind1="0" ind2=" ">
<subfield code="a">Archäologie</subfield>
</datafield>
<datafield tag="653" ind1="0" ind2=" ">
<subfield code="a">Archäozoologie</subfield>
</datafield>
<datafield tag="653" ind1="0" ind2=" ">
<subfield code="a">Neolithikum</subfield>
</datafield>
<datafield tag="653" ind1="0" ind2=" ">
<subfield code="a">Grosstierknochen</subfield>
</datafield>
<datafield tag="653" ind1="0" ind2=" ">
<subfield code="a">Seeufersiedlung</subfield>
</datafield>
<datafield tag="653" ind1="0" ind2=" ">
<subfield code="a">Feuchtbodenarchäologie</subfield>
</datafield>
<datafield tag="655" ind1=" " ind2="7">
<subfield code="a">Datensatz</subfield>
<subfield code="2">local</subfield>
</datafield>
<datafield tag="655" ind1=" " ind2="7">
<subfield code="a">Online-Ressource</subfield>
<subfield code="2">gnd-carrier</subfield>
</datafield>
<datafield tag="700" ind1="1" ind2=" ">
<subfield code="a">Schibler, Jörg<name>Jörg Schibler</name><firstname>Jörg</firstname><lastname>Schibler</lastname></subfield>
<subfield code="e">VerfasserIn</subfield>
<subfield code="4">aut</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="a">boris.unibe.ch</subfield>
<subfield code="n">Universitätsbibliothek Universität Bern/UB</subfield>
<subfield code="u">https://boris.unibe.ch/144298/</subfield>
<subfield code="z">Volltext</subfield>
<subfield code="z">Zugriff via BORIS</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2="A">
<subfield code="a">boris.unibe.ch</subfield>
<subfield code="n">Universitätsbibliothek Universität Bern/UB</subfield>
<subfield code="u">https://boris.unibe.ch/144298/1/Lobsigesee_11_Grosstierknochen_Anhang_1_Materialbasis.csv</subfield>
<subfield code="q">spreadsheet</subfield>
<subfield code="y">Electronic resource (Spreadsheet)</subfield>
<subfield code="s">7kB</subfield>
<subfield code="z">Online-Zugriff via World Wide Web für Volltext</subfield>
<subfield code="z">Kostenfrei</subfield>
<subfield code="z">Andere</subfield>
</datafield>
<datafield tag="856" ind1="7" ind2=" ">
<subfield code="a">boris.unibe.ch</subfield>
<subfield code="2">doi</subfield>
<subfield code="n">Universitätsbibliothek Universität Bern/UB</subfield>
<subfield code="u">https://doi.org/10.7892/boris.144298</subfield>
<subfield code="z">Zugriff via BORIS</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">930 History of ancient world (to ca. 499)</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">ddc930</subfield>
<subfield code="9">eng</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">930 Geschichte des Altertums (bis ca. 499), Archäologie</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">ddc930</subfield>
<subfield code="9">ger</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">930 Histoire du monde antique (jusque vers 499)</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">ddc930</subfield>
<subfield code="9">fr</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">900 History</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">ddc900</subfield>
<subfield code="9">eng</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">900 Geschichte</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">ddc900</subfield>
<subfield code="9">ger</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">900 Histoire</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">ddc900</subfield>
<subfield code="9">fr</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">Dewey Decimal Classification</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">dewey</subfield>
<subfield code="9">eng</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">Dewey Decimal Classification</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">dewey</subfield>
<subfield code="9">ger</subfield>
</datafield>
<datafield tag="909" ind1="u" ind2="y">
<subfield code="a">Dewey Decimal Classification</subfield>
<subfield code="2">UBE</subfield>
<subfield code="8">dewey</subfield>
<subfield code="9">fr</subfield>
</datafield>
</record></metadata><about/></record>
</root>
And here is the mapping:
PREFIX rr: <http://www.w3.org/ns/r2rml#>
PREFIX rml: <http://semweb.mmlab.be/ns/rml#>
PREFIX ql: <http://semweb.mmlab.be/ns/ql#>
PREFIX carml: <http://carml.taxonic.com/carml/>
PREFIX schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
@base <http://example.com/ns#>.
<#LogicalSourceArticle> a rml:BaseSource ;
#rml:source [ a carml:Stream ; ] ;
rml:source "preprocessed/boris_dataset/test.xml" ;
rml:referenceFormulation ql:XPath;
rml:iterator "root/record/metadata/record" .
<#LogicalSourceAuthor> a rml:BaseSource ;
#rml:source [ a carml:Stream ; ] ;
rml:source "preprocessed/boris_dataset/test.xml" ;
rml:referenceFormulation ql:XPath;
rml:iterator "root/record/metadata/record/datafield[@tag='100' or @tag='700'][subfield[@code='4' and contains(text(),'aut')]]" .
<#LogicalSourceDataDownload> a rml:BaseSource ;
#rml:source [ a carml:Stream ; ] ;
rml:source "preprocessed/boris_dataset/test.xml" ;
rml:referenceFormulation ql:XPath;
rml:iterator "root/record/metadata/record/datafield[@tag='856' and @ind1='4' and @ind2='A']" .
<#ArticleMapping>
a rr:TriplesMap;
rml:logicalSource <#LogicalSourceArticle> ;
rr:subjectMap [
rr:template "https://data.connectome.ch/boris/dataset/{controlfield[@tag='001']}" ;
rr:class schema:Dataset ;
] ;
rr:predicateObjectMap [
rr:predicate schema:name ;
rr:objectMap [
rml:reference "datafield[@tag='245']/subfield[@code='a']" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:alternateName ;
rr:objectMap [
rml:reference "datafield[@tag='245']/subfield[@code='b']" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:identifier ;
rr:objectMap [
rml:reference "controlfield[@tag='001']" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:sameAs ;
rr:objectMap [
rr:template "https://doi.org/{datafield[@tag='024']/subfield[@code='a']}" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:abstract;
rr:objectMap [
rml:reference "datafield[@tag='520']/subfield[@code='a']";
]
] ;
rr:predicateObjectMap [
rr:predicate schema:author;
rr:objectMap [
rr:template "https://data.connectome.ch/boris/person/{datafield[@tag='100' or @tag='700'][subfield[@code='4' and contains(text(),'aut')]]/subfield[@code='a']/name}";
]
] ;
rr:predicateObjectMap [
rr:predicate schema:distribution;
rr:objectMap [
rr:template "https://data.connectome.ch/boris/datadownload/{datafield[@tag='856' and @ind1='4' and @ind2='A']/subfield[@code='u']}";
]
] ;
rr:predicateObjectMap [
rr:predicate schema:datePublished;
rr:objectMap [
rml:reference "datafield[@tag='264']/subfield[@code='c']/normalized" ;
rr:datatype xsd:date
]
] ;
rr:predicateObjectMap [
rr:predicate schema:keywords ;
rr:objectMap [
rml:reference "datafield[@tag='653']/subfield[@code='a']" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:conditionsOfAccess ;
rr:objectMap [
rml:reference "datafield[@tag='506'][1]/subfield[@code='f'][2]" ; # match only second subfield "f" if present
];
] .
<#AuthorMapping> a rr:TriplesMap ;
rml:logicalSource <#LogicalSourceAuthor> ;
rr:subjectMap [
rr:template "https://data.connectome.ch/boris/person/{subfield[@code='a']/name}" ;
rr:class schema:Person ;
] ;
rr:predicateObjectMap [
rr:predicate schema:name ;
rr:objectMap [
rml:reference "subfield[@code='a']/name" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:familyName ;
rr:objectMap [
rml:reference "subfield[@code='a']/lastname" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:givenName ;
rr:objectMap [
rml:reference "subfield[@code='a']/firstname" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:sameAs ;
rr:objectMap [
rml:reference "subfield[@code='0']/orcid" ;
rr:termType rr:IRI
];
] .
<#DataDownloadMapping> a rr:TriplesMap ;
rml:logicalSource <#LogicalSourceDataDownload> ;
rr:subjectMap [
rr:template "https://data.connectome.ch/boris/datadownload/{subfield[@code='u']}" ;
rr:class schema:DataDownload ;
] ;
rr:predicateObjectMap [
rr:predicate schema:name ;
rr:objectMap [
rml:reference "subfield[@code='q']" ;
];
] ;
rr:predicateObjectMap [
rr:predicate schema:author;
rr:objectMap [
rr:template "https://data.connectome.ch/boris/person/{//datafield[@tag='100' or @tag='700'][subfield[@code='4' and contains(text(),'aut')]]/subfield[@code='a']/name}"; # get parent node
]
] ;
rr:predicateObjectMap [
rr:predicate schema:contentUrl ;
rr:objectMap [
rml:reference "subfield[@code='u']" ;
rr:termType rr:IRI
]
] .
I tried this with rmlmapper (6.2.1) and carml (0.4.7).
Diff of the two outputs (ignoring the URI/IRI difference, see kg-construct/rml-questions#28):
<https://data.connectome.ch/boris/datadownload/https%3A%2F%2Fboris.unibe.ch%2F144298%2F1%2FLobsigesee_11_Grosstierknochen_Anhang_1_Materialbasis.csv> ns1:author <https://data.connectome.ch/boris/person/Francesca%20Ginella>, <https://data.connectome.ch/boris/person/J%C3%B6rg%20Schibler> .
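For reference, the IRIs in this triple are what I would expect from expanding the rr:template values with percent-encoding. Here is a minimal Python sketch of that expansion, assuming plain UTF-8 percent-encoding of everything except unreserved ASCII characters. This matches the rmlmapper output shown above, but it is not necessarily how either engine implements it. `expand_template` is a hypothetical helper, not part of rmlmapper or CARML.

```python
from urllib.parse import quote


def expand_template(template: str, value: str) -> str:
    """Replace the single {...} placeholder with the percent-encoded value.

    Hypothetical helper for illustration only: it assumes exactly one
    placeholder and encodes every character except unreserved ASCII.
    """
    start, end = template.index("{"), template.rindex("}")
    return template[:start] + quote(value, safe="") + template[end + 1:]


person = expand_template(
    "https://data.connectome.ch/boris/person/{subfield[@code='a']/name}",
    "Jörg Schibler",
)
download = expand_template(
    "https://data.connectome.ch/boris/datadownload/{subfield[@code='u']}",
    "https://boris.unibe.ch/144298/1/"
    "Lobsigesee_11_Grosstierknochen_Anhang_1_Materialbasis.csv",
)

print(person)
print(download)
```

The subject and object IRIs themselves therefore look plausible to me; the open question is only why the triple is missing from one of the outputs.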
CARML does not write out the schema:author property for the DataDownload, while rmlmapper does. I suspect the problem is related to the XPath expression that is meant to reach the parent element:
{//datafield[@tag='100' or @tag='700'][subfield[@code='4' and contains(text(),'aut')]]/subfield[@code='a']/name}
What I want to express is to go one level up from datafield to the element root/record/metadata/record and then select the indicated datafield for the author info. Is there a bug in the expression that would explain the different behaviour?
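To make the question concrete: as far as I understand XPath 1.0, a leading `//` abbreviates `/descendant-or-self::node()/` and therefore starts at the root of the document that contains the context node, whereas "one level up" would be written `..`. Below is a minimal Python sketch (standard library only, on a trimmed copy of test.xml) comparing three readings. It is only an illustration. `author_names` is a hypothetical helper, and because ElementTree supports neither `or` nor `contains()`, those conditions are emulated in Python. The "detached fragment" case is my assumption about how an engine might evaluate the iterator node, not a statement about what CARML actually does.

```python
import xml.etree.ElementTree as ET

XML = """<root><record><metadata><record>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Ginella, Francesca<name>Francesca Ginella</name></subfield>
    <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
    <subfield code="a">Schibler, Jörg<name>Jörg Schibler</name></subfield>
    <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2="A">
    <subfield code="u">https://boris.unibe.ch/144298/1/Lobsigesee_11_Grosstierknochen_Anhang_1_Materialbasis.csv</subfield>
  </datafield>
</record></metadata><about/></record></root>"""

DOWNLOAD = "datafield[@tag='856'][@ind1='4'][@ind2='A']"


def author_names(context, prefix):
    """Return <name> texts of 100/700 datafields whose subfield 4 contains 'aut'.

    `prefix` is the path leading up to the author datafields, evaluated
    relative to `context`.
    """
    names = []
    for tag in ("100", "700"):  # ElementTree has no 'or': query each tag
        for df in context.findall(f"{prefix}datafield[@tag='{tag}']"):
            roles = [sf.text or "" for sf in df.findall("subfield[@code='4']")]
            if any("aut" in role for role in roles):  # emulates contains()
                names += [n.text for n in df.findall("subfield[@code='a']/name")]
    return names


root = ET.fromstring(XML)
download = root.find(f".//{DOWNLOAD}")

# Reading 1: '//' searches the whole document the iterator node lives in.
whole_document = author_names(root, ".//")

# Reading 2 (assumption): the iterator node is handed over as a detached
# fragment, so a descendant search only sees the 856 datafield's own children.
detached = ET.fromstring(ET.tostring(download))
detached_fragment = author_names(detached, ".//")

# Reading 3: step to the parent record with '..'. ElementTree cannot climb
# above the element a query starts from, so the path is evaluated from root.
via_parent = author_names(root, f".//{DOWNLOAD}/../")

print(whole_document, detached_fragment, via_parent)
```

If an engine evaluates references against each iterator node in isolation, `//datafield[...]` has nothing to match, which would be consistent with the missing schema:author triples. Whether `../datafield[...]` works in that case depends on whether the parent is still reachable from the iterator node.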
Thanks and enjoy your weekend!
Tobias