as_rdf needs to escape certain characters · Issue #37 · ropensci/rdflib · GitHub
as_rdf needs to escape certain characters #37
Open
@josephguillaume

Description

One of as_rdf.data.frame, write_nquads, normalize_table or poor_mans_nquads needs to escape certain characters. Otherwise rdf_parse, and therefore as_rdf, either returns an error or returns no content.

The characters that need escaping include at least double quotes in string literals and spaces in predicates.
Examples follow below. Each of these cases has an obvious fix, but it's not clear to me what a generally applicable solution would require, or how to avoid regressions.

A multi-word predicate silently fails to return any triples unless it is URL-encoded:

df <- data.frame(1,1)
names(df) <- c("id","multiple word predicate")
g <- as_rdf(df,prefix = "http://example.org#",key_column = "id")
g
# Total of 0 triples, stored in hashes

ntab<-rdflib:::normalize_table(df,key_column = "id")
ntab$predicate<-sapply(ntab$predicate,URLencode)
rdflib:::poor_mans_nquads(ntab,"temp.nquads",prefix="http://example.org#")
g<-rdf_parse("temp.nquads",format="nquads")
g
# Total of 1 triples, stored in hashes
#-------------------------------
#  <http://example.org#1> <http://example.org#multiple%20word%20predicate> "1"^^<http://www.w3.org/2001/XMLSchema#decimal> .

# As a workaround, a user can URLencode the data.frame names
df <- data.frame(1,1)
names(df) <- c("id","multiple word predicate")
names(df) <- sapply(names(df),URLencode)
g <- as_rdf(df,prefix = "http://example.org#",key_column = "id")
g
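
The workaround above can be wrapped in a small helper. This is only a sketch: `encode_predicates` is a hypothetical name, not an rdflib function, and it assumes that column names are appended verbatim to `prefix` to form predicate IRIs, as the output above suggests. It leaves the key column name untouched so that `key_column` still matches.

```r
# Hypothetical helper (not part of rdflib): percent-encode every column name
# except the key column, so each name can be appended to the prefix as an IRI.
encode_predicates <- function(df, key_column) {
  is_key <- names(df) == key_column
  names(df)[!is_key] <- vapply(
    names(df)[!is_key],
    utils::URLencode,          # base R; encodes spaces as %20
    character(1),
    USE.NAMES = FALSE
  )
  df
}

df <- data.frame(1, 1)
names(df) <- c("id", "multiple word predicate")
df <- encode_predicates(df, key_column = "id")
names(df)
# [1] "id"                          "multiple%20word%20predicate"
```

The resulting data.frame can then be passed to as_rdf as in the workaround. Note that URLencode with its default `repeated = FALSE` leaves a name alone if it already appears to be percent-encoded, which may or may not be the behaviour wanted inside the package.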

A string containing double quotes silently fails to return any triples unless each quote is preceded by a backslash escape character (and the backslash itself needs to be escaped in R):

df <- data.frame(1,'string with "quotes"')
names(df) <- c("id","predicate")
g <- as_rdf(df,prefix = "http://example.org#",key_column = "id")
g
# Total of 0 triples, stored in hashes

ntab<-rdflib:::normalize_table(df,key_column = "id")
ntab$object<-gsub('"','\\"',ntab$object,fixed=TRUE)
rdflib:::poor_mans_nquads(ntab,"temp.nquads",prefix="http://example.org#")
g<-rdf_parse("temp.nquads",format="nquads")
g
# Total of 1 triples, stored in hashes
# -------------------------------
#  <http://example.org#1> <http://example.org#predicate> "string with "quotes""^^<http://www.w3.org/2001/XMLSchema#string> .


# As a workaround, a user can replace quotes within the relevant columns of the data.frame
df <- data.frame(1,'string with "quotes"')
names(df) <- c("id","predicate")
df$predicate <- gsub('"','\\"',df$predicate,fixed=TRUE)
g <- as_rdf(df,prefix = "http://example.org#",key_column = "id")
g
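
A slightly more general version of this workaround might escape every character that cannot appear raw in an N-Quads string literal, not only double quotes. The sketch below is an assumption about what a fix could look like, not rdflib's implementation: `escape_literal` is a hypothetical helper, and it covers backslash, double quote, line feed and carriage return. The backslash is handled first so that the escapes added afterwards are not doubled.

```r
# Hypothetical helper (not part of rdflib): escape characters that may not
# appear unescaped inside an N-Quads string literal.
escape_literal <- function(x) {
  # With fixed = TRUE both pattern and replacement are taken literally.
  x <- gsub("\\", "\\\\", x, fixed = TRUE)  # backslash      -> \\
  x <- gsub('"',  '\\"',  x, fixed = TRUE)  # double quote   -> \"
  x <- gsub("\n", "\\n",  x, fixed = TRUE)  # line feed      -> \n
  x <- gsub("\r", "\\r",  x, fixed = TRUE)  # carriage return -> \r
  x
}

df <- data.frame(1, 'string with "quotes"', stringsAsFactors = FALSE)
names(df) <- c("id", "predicate")
df$predicate <- escape_literal(df$predicate)
writeLines(df$predicate)
# string with \"quotes\"
```

As with the quote-only workaround, the escaped column can then be passed to as_rdf. Applying this inside the package would need care to avoid escaping values that users have already escaped by hand.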
