-
Notifications
You must be signed in to change notification settings - Fork 124
fix(gen-jsonld): handle multiple contexts #2602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
5154af3
to
930bf9c
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2602 +/- ##
=======================================
Coverage 82.95% 82.95%
=======================================
Files 127 127
Lines 13856 13856
Branches 3863 3863
=======================================
Hits 11494 11494
Misses 1722 1722
Partials 640 640 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
My changes have a bigger impact on the generated JSON-LD than expected... Before my changes calling JSON-LD generation from the CLI ( With my changes the |
e5439ac
to
811f2d2
Compare
The changes in the context for the test schema simple_uri_test.yaml can be seen here. |
4fe1f18
to
9a724c8
Compare
@Silvanoc I don't have a full root cause, but for now: https://github.com/linkml/linkml/blob/fix-json-ld-context-typing/linkml/generators/jsonldgen.py#L165 https://github.com/linkml/linkml/blob/fix-json-ld-context-typing/linkml/generators/jsonldcontextgen.py#L234 This causes some odd handling of model / self.model in end_schema: I tried getting equivalent behavior using command-line options to generate jsonld-context, but because of the differences above I haven't succeeded. Will probably have to look at this at a later time. Given that the JSON-LD may not parse in either form, do you want to break this work out into a separate issue? |
Also these lines account for the difference in JSON-LD generator adds @base after the rest of the context is generated here: The default context generator adds it here: |
@Silvanoc Below are collapsed draft patches that pass arguments into ContextGenerator serialize() the way jsonld-context does. They also move base processing down into ContextGenerator. We should loop in a core dev to check. expand/collapse Patch for JSON-LD generator handling of context base and serialize parametersdiff --git a/linkml/generators/jsonldgen.py b/linkml/generators/jsonldgen.py
index aa5f9b2b..5e3ecf87 100644
--- a/linkml/generators/jsonldgen.py
+++ b/linkml/generators/jsonldgen.py
diff --git a/linkml/generators/jsonldgen.py b/linkml/generators/jsonldgen.py
index aa5f9b2b..bece1738 100644
--- a/linkml/generators/jsonldgen.py
+++ b/linkml/generators/jsonldgen.py
@@ -162,7 +162,10 @@ class JSONLDGenerator(Generator):
# model_context = self.schema.source_file.replace('.yaml', '.prefixes.context.jsonld')
# context = [METAMODEL_CONTEXT_URI, f'file://./{model_context}']
# TODO: The _visit function above alters the schema in situ
- add_prefixes = ContextGenerator(self.original_schema, model=False, metadata=False).serialize()
+ add_prefixes = ContextGenerator(
+ self.original_schema, base=base_prefix, model=False, metadata=False, prefixes=False
+ ).serialize(
+ base=base_prefix, model=False, metadata=False, prefixes=False)
add_prefixes_json = loads(add_prefixes)
context = [METAMODEL_CONTEXT_URI, add_prefixes_json["@context"]]
elif isinstance(context, str): # Some of the older code doesn't do multiple contexts
@@ -181,8 +184,6 @@ class JSONLDGenerator(Generator):
if self.format == "jsonld":
self.schema["@context"] = context[0] if len(context) == 1 and not base_prefix else context
- if base_prefix:
- self.schema["@context"].append({"@base": base_prefix})
# json_obj["@id"] = self.schema.id
out = str(as_json(self.schema, indent=" ")) + "\n"
self.schema = self.original_schema The change in JSON-LD context generator sets base regardless of the model check. expand/collapse Patch for JSON-LD CONTEXT handling of context basediff --git a/linkml/generators/jsonldcontextgen.py b/linkml/generators/jsonldcontextgen.py
index da2d9d92..43e12f3c 100644
--- a/linkml/generators/jsonldcontextgen.py
+++ b/linkml/generators/jsonldcontextgen.py
@@ -98,12 +98,6 @@ class ContextGenerator(Generator):
comments.source = self.schema.source_file
context.comments = comments
context_content = {"xsd": "http://www.w3.org/2001/XMLSchema#"}
- if base:
- base = str(base)
- if "://" not in base:
- self.context_body["@base"] = os.path.relpath(base, os.path.dirname(self.schema.source_file))
- else:
- self.context_body["@base"] = base
if prefixes:
for prefix in sorted(self.emit_prefixes):
url = str(self.namespaces[prefix])
@@ -120,6 +114,12 @@ class ContextGenerator(Generator):
context_content[k] = v
for k, v in self.slot_class_maps.items():
context_content[k] = v
+ if base:
+ base = str(base)
+ if "://" not in base:
+ context_content["@base"] = os.path.relpath(base, os.path.dirname(self.schema.source_file))
+ else:
+ context_content["@base"] = base
context["@context"] = context_content
if output:
with open(output, "w", encoding="UTF-8") as outf: This results in the model being excluded in both commands below and the base being included in both cases:
In jsonld-gen the paths to the imported model contexts are also added to the context, an additional difference. It would change the type of the returned context json to be union(dict,list) serialized to a string? The script below generates parsable JSON-LD with data that conforms to the model. expand/collapse simple script to generate json-ld and context, read it and write as turtleimport os
import json
from rdflib import Graph
# Generate the JSON-LD
os.chdir('tests/test_scripts/input')
os.system("linkml generate jsonld simple_uri_test.yaml > simple_uri_test.out.jsonld")
# also generate the .context.jsonld for the included LinkML .yaml, doesn't seem to happen automatically
os.system("linkml generate jsonld-context simple_slots.yaml > simple_slots.context.jsonld")
os.chdir('includes')
os.system("linkml generate jsonld-context simple_types.yaml > simple_types.context.jsonld")
os.chdir('..')
# parse and reserialize the resulting RDF/JSON-LD as RDF/Turtle
# the results look plausible, but I haven't gone over them in detail
g = Graph().parse("simple_uri_test.out.jsonld", format='json-ld')
print(g.serialize(format='turtle'))
print("\n\n---------------------\n\n")
# Generate the JSON-LD Context directly and read it back in
ctx_file = "simple_uri_test.out.context.jsonld"
os.system(f"linkml generate jsonld-context --no-metadata --mergeimports --flatprefixes simple_uri_test.yaml > {ctx_file}")
with open(ctx_file) as f:
ctx = json.load(f)
# Create a data object and make it into JSON-LD by attaching the context
data = {
"@context": ctx,
"model": {
"state": "discombobulated"
}
}
# Parse the data object as RDF/JSON-LD and re-serialize it as RDF/Turtle
# the triple [] simple_uri_test:model [ foo:state "discombobulated" ] . corresponds to the data.
g2 = Graph().parse(data=data, format='json-ld')
print(g2.serialize(format='turtle')) The generated RDF from the example script is as follows, which is what I would expect: @prefix foo: <http://samples.r.us/foo#> .
@prefix simple_uri_test: <https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/> .
[] simple_uri_test:model [ foo:state "discombobulated" ] . |
@tfliss I don't get what you mean with following sentence:
Do you want to have an issue for this PR? I mean there is not much in here, so I don't know what you would want to break into a separate issue... |
@tfliss the context after applying your patch is closer to the one being currently generated, than that generated applying the patch of this PR. You can see them below: "@context" from current "main"[
"https://w3id.org/linkml/extensions.context.jsonld",
"simple_slots.context.jsonld",
"includes/simple_types.context.jsonld",
"https://w3id.org/linkml/types.context.jsonld",
{
"@base": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/"
}
] "@context" from current "fix-json-ld-context-typing"[
"https://w3id.org/linkml/meta.context.jsonld",
{
"xsd": "http://www.w3.org/2001/XMLSchema#",
"bar": {
"@id": "http://samples.r.us/bar_",
"@prefix": true
},
"foo": "http://samples.r.us/foo#",
"linkml": "https://w3id.org/linkml/",
"simple_uri_test": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/",
"@vocab": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/",
"tag": {
"@type": "xsd:anyURI",
"@id": "linkml:tag"
},
"value": {
"@type": "linkml:Any",
"@id": "linkml:value"
},
"extensions": {
"@type": "linkml:Extension",
"@id": "linkml:extensions"
},
"state": {
"@id": "foo:state"
},
"AnyValue": {
"@id": "linkml:Any"
},
"Extensible": {
"@id": "linkml:Extensible"
},
"Extension": {
"@id": "linkml:Extension"
},
"Model": {
"@id": "foo:model"
}
},
"https://w3id.org/linkml/extensions.context.jsonld",
"simple_slots.context.jsonld",
"includes/simple_types.context.jsonld",
"https://w3id.org/linkml/types.context.jsonld",
{
"@base": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/"
}
] "@context" from Tim's patch on "fix-json-ld-context-typing"[
"https://w3id.org/linkml/meta.context.jsonld",
{
"xsd": "http://www.w3.org/2001/XMLSchema#",
"@base": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/"
},
"https://w3id.org/linkml/extensions.context.jsonld",
"simple_slots.context.jsonld",
"includes/simple_types.context.jsonld",
"https://w3id.org/linkml/types.context.jsonld"
] Or possibly easier to view are the Difference "@context" between "main" and Tim's patch on "fix-json-ld-context-typing"--- ../main/main.context.ld.json 2025-04-11 14:06:34
+++ ./fix-json-ld-context-typing.context.ld.json 2025-04-11 14:06:16
@@ -1,9 +1,11 @@
[
+ "https://w3id.org/linkml/meta.context.jsonld",
+ {
+ "xsd": "http://www.w3.org/2001/XMLSchema#",
+ "@base": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/"
+ },
"https://w3id.org/linkml/extensions.context.jsonld",
"simple_slots.context.jsonld",
"includes/simple_types.context.jsonld",
- "https://w3id.org/linkml/types.context.jsonld",
- {
- "@base": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/"
- }
+ "https://w3id.org/linkml/types.context.jsonld"
] Difference "@context" between "main" and "fix-json-ld-context-typing"--- ../main/main.context.ld.json 2025-04-11 14:06:34
+++ ./tim.context.ld.json 2025-04-11 14:12:13
@@ -1,4 +1,43 @@
[
+ "https://w3id.org/linkml/meta.context.jsonld",
+ {
+ "xsd": "http://www.w3.org/2001/XMLSchema#",
+ "bar": {
+ "@id": "http://samples.r.us/bar_",
+ "@prefix": true
+ },
+ "foo": "http://samples.r.us/foo#",
+ "linkml": "https://w3id.org/linkml/",
+ "simple_uri_test": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/",
+ "@vocab": "https://raw.githubusercontent.com/linkml/linkml/main/tests/test_scripts/input/simple_uri_test/",
+ "tag": {
+ "@type": "xsd:anyURI",
+ "@id": "linkml:tag"
+ },
+ "value": {
+ "@type": "linkml:Any",
+ "@id": "linkml:value"
+ },
+ "extensions": {
+ "@type": "linkml:Extension",
+ "@id": "linkml:extensions"
+ },
+ "state": {
+ "@id": "foo:state"
+ },
+ "AnyValue": {
+ "@id": "linkml:Any"
+ },
+ "Extensible": {
+ "@id": "linkml:Extensible"
+ },
+ "Extension": {
+ "@id": "linkml:Extension"
+ },
+ "Model": {
+ "@id": "foo:model"
+ }
+ },
"https://w3id.org/linkml/extensions.context.jsonld",
"simple_slots.context.jsonld",
"includes/simple_types.context.jsonld", I lack the level of JSON-LD expertise to assess which result is correct. What I can tell is that in neither |
@Silvanoc Thanks for the great summary and output diffs above. The elements in your branch with type and id entries are due to the model=False argument being changed to True by the time it got to the ContextGenerator logic. The entries with prefix are due to the prefix=False not being passed in and defaulting to True.
One thing I realized is it is not necessary to move the code block in the context generator. While I'd still recommend a core dev review I'm otherwise confident enough today to recommend applying the patches to this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just finishing my review. This is a great improvement, thanks for talking through the context behaviors with me, I understand it a lot better now.
At this point I'd recommend:
- applying the patches we discussed
- update the snapshot as appropriate to the patched output
- add a multi-context test parameter case to /test_scripts/test_gen_jsonld.py#test_metamodel_valid_calls()
504e311
to
f3fafd9
Compare
6b1bd2f
to
8dc37f4
Compare
@amc-corey-cox Can you clarify the procedures/expectations of approval? Does merging to main wait until all of the issues under the parent #2624 are approved? On its own this PR still includes the model info in the snapshot JSON-LD despite the model=False flag being set. Once you confirm that's ok, I'll approve. |
@tfliss @amc-corey-cox I'm trying to improve test coverage and observing some strange things. Therefore moving to "draft" to block merging until I have a clearer understanding. |
8dc37f4
to
2af9549
Compare
2af9549
to
181f138
Compare
CLI supports multiple context files (what returns a tuple), but code mostly expecting a single context (and therefore only a string). IDEs and linters recognize the inconsistency thanks to typing. Signed-off-by: Silvano Cirujano Cuesta <silvano.cirujano-cuesta@siemens.com>
Signed-off-by: Silvano Cirujano Cuesta <silvano.cirujano-cuesta@siemens.com>
Validate that appending multiple files on JSON-LD generation works as expected. Signed-off-by: Silvano Cirujano Cuesta <silvano.cirujano-cuesta@siemens.com>
Signed-off-by: Silvano Cirujano Cuesta <silvano.cirujano-cuesta@siemens.com>
181f138
to
b6410c0
Compare
CLI supports multiple context files (what returns a tuple), but code mostly expecting a single context (and therefore only a string).
IDEs and linters recognize the inconsistency thanks to typing.
The context can get extended , therefore converting the tuple into a list.
[Update]: trying to analyze some unexpected changes in the resulting JSON-LD I think I have uncovered an unknown issue:
linkml generate jsonldcontext
, [...]Pending the confirmation of an expert, I am the impression that the patch is correct and fixing both the types inconsistencies and the context generation.
I wonder what's the reason for the lack of code-sharing for
@context
generation between jsonldgen.py and jsonldcontextgen.py.Closes #2625