Elements of parent classes are inheriting namespaces from their children (and sometimes they don't) · Issue #254 · dapper91/pydantic-xml · GitHub
Elements of parent classes are inheriting namespaces from their children (and sometimes they don't) #254
Open
@jwfraustro

Hello,

I'm sorry that this is a long write-up, but I've tried to be as thorough as possible. The tl;dr is in the Summary section at the end.

It has been well-documented that sub-models don't inherit their namespaces from their parents and must be explicitly defined: #221

It has also been noted that child elements of a parent do inherit the namespaces of their parents. #197

However, what doesn't seem to have been touched on is that subclasses propagate their namespaces to the elements declared on their parent classes. This has led to one problem, and then to a secondary problem as a consequence.

The Setup

Consider a FooSchema.xsd schema that defines two types under its default namespace: a BaseClass, and a MiddleClass that inherits from it and adds an additional element:

from pydantic_xml import BaseXmlModel, element

# Prefix-to-URI mapping for the Foo schema
FOO_NSMAP = {
    "foo": "FooSchema.xsd",
}

class BaseClass(BaseXmlModel, nsmap=FOO_NSMAP, ns="foo"):
    """Base class of the FooSchema"""

    base_element: str = element(tag="base_element")

class MiddleClass(BaseClass, nsmap=FOO_NSMAP, ns="foo"):
    """An inherited class of BaseClass in the FooSchema"""

    middle_element: str = element(tag="middle_element")

middle_class = MiddleClass(
    base_element="base",
    middle_element="middle",
)

print(middle_class.to_xml())
#<foo:MiddleClass xmlns:foo="FooSchema.xsd">
#    <foo:base_element>base</foo:base_element>
#    <foo:middle_element>middle</foo:middle_element>
#</foo:MiddleClass>

# Consuming it with a default namespace works too!
middle_class = MiddleClass.from_xml("""
<MiddleClass xmlns="FooSchema.xsd">
    <base_element>base</base_element>
    <middle_element>middle</middle_element>
</MiddleClass>
""")

This all makes perfect sense: both classes and their elements are under the FooSchema.xsd namespace, and we know that base_element and middle_element inherit that namespace from their classes, as mentioned in #197 above.
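The prefixed and default-namespace forms are interchangeable because a namespace-aware parser expands every name to a (namespace URI, local name) pair. The following is a minimal sketch of that equivalence. It uses only the standard library's xml.etree.ElementTree rather than pydantic-xml, so it says nothing about pydantic-xml's internals, and the helper name expanded_names is made up for illustration:

```python
import xml.etree.ElementTree as ET

# The serialized form from above, using the "foo" prefix.
PREFIXED = """
<foo:MiddleClass xmlns:foo="FooSchema.xsd">
    <foo:base_element>base</foo:base_element>
    <foo:middle_element>middle</foo:middle_element>
</foo:MiddleClass>
"""

# The consumed form from above, using a default namespace.
DEFAULT_NS = """
<MiddleClass xmlns="FooSchema.xsd">
    <base_element>base</base_element>
    <middle_element>middle</middle_element>
</MiddleClass>
"""


def expanded_names(xml_text):
    """Return every element tag in Clark notation: {namespace-uri}local-name."""
    root = ET.fromstring(xml_text.strip())
    return [el.tag for el in root.iter()]


prefixed_names = expanded_names(PREFIXED)
default_names = expanded_names(DEFAULT_NS)

# Both documents expand to the same qualified names.
print(prefixed_names == default_names)
print(prefixed_names)
```

Only the URI matters for identity; the prefix is a document-local alias.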

Adding An Inherited Class with a Different Namespace

Now, consider a BarSchema.xsd schema that has a type that inherits from MiddleClass in FooSchema.xsd, but does so under a different prefix:

BAR_NSMAP = {"foo": "FooSchema.xsd", "bar": "BarSchema.xsd"}

class BarTopClass(MiddleClass, nsmap=BAR_NSMAP, ns="bar"):
    """A class under BarSchema that inherits from FooSchema"""

    top_element: str = element(tag="top_element")

bar_top_class = BarTopClass(
    base_element="base",
    middle_element="middle",
    top_element="top",
)
print(bar_top_class.to_xml())

Now, printing this with to_xml(), we might expect:

Expectation

The elements declared under FooSchema would retain their inherited foo: prefix, just like before, and BarTopClass and its own element would get the bar: prefix:

<bar:BarTopClass xmlns:foo="FooSchema.xsd" xmlns:bar="BarSchema.xsd">
    <foo:base_element>base</foo:base_element>
    <foo:middle_element>middle</foo:middle_element>
    <bar:top_element>top</bar:top_element>
</bar:BarTopClass>

The Problem

The Actual Result

But what actually happens is more confusing. The namespaces of the parent classes' elements are overwritten by the namespace of the descendant class, which is patently incorrect:

<bar:BarTopClass xmlns:foo="FooSchema.xsd" xmlns:bar="BarSchema.xsd">
    <bar:base_element>base</bar:base_element>
    <bar:middle_element>middle</bar:middle_element>
    <bar:top_element>top</bar:top_element>
</bar:BarTopClass>

This should probably not happen, and seems very unexpected.
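To pin down exactly which elements moved, the expected and actual documents can be compared by namespace URI instead of by eye. This is a rough diagnostic sketch using only the standard library; the helper names split_tag and namespaces_by_element are made up for illustration and are not part of pydantic-xml:

```python
import xml.etree.ElementTree as ET

EXPECTED = """
<bar:BarTopClass xmlns:foo="FooSchema.xsd" xmlns:bar="BarSchema.xsd">
    <foo:base_element>base</foo:base_element>
    <foo:middle_element>middle</foo:middle_element>
    <bar:top_element>top</bar:top_element>
</bar:BarTopClass>
"""

ACTUAL = """
<bar:BarTopClass xmlns:foo="FooSchema.xsd" xmlns:bar="BarSchema.xsd">
    <bar:base_element>base</bar:base_element>
    <bar:middle_element>middle</bar:middle_element>
    <bar:top_element>top</bar:top_element>
</bar:BarTopClass>
"""


def split_tag(tag):
    """Split a Clark-notation tag into (namespace_uri, local_name)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag  # element is in no namespace


def namespaces_by_element(xml_text):
    """Map each direct child's local name to its namespace URI."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for child in root:
        uri, local = split_tag(child.tag)
        result[local] = uri
    return result


expected_ns = namespaces_by_element(EXPECTED)
actual_ns = namespaces_by_element(ACTUAL)

# Elements whose namespace differs between the two documents.
mismatched = sorted(
    name for name in expected_ns if expected_ns[name] != actual_ns.get(name)
)
print(mismatched)
```

For the two documents above, the mismatched elements are exactly the ones declared on the parent classes; top_element is in BarSchema.xsd in both.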

An attempted solution

Given what is mentioned in #197, one way we might try to fix this is to explicitly declare the namespace on the elements of the parent classes with ns="foo", like so:

# FooSchema

class BaseClass(BaseXmlModel, nsmap=FOO_NSMAP, ns="foo"):
    """Base class of the FooSchema"""

    base_element: str = element(tag="base_element", ns="foo")

class MiddleClass(BaseClass, nsmap=FOO_NSMAP, ns="foo"):
    """An inherited class of BaseClass in the FooSchema"""

    middle_element: str = element(tag="middle_element", ns="foo")

And this does work! Trying print(bar_top_class.to_xml()) again, we now get:

<bar:BarTopClass xmlns:foo="FooSchema.xsd" xmlns:bar="BarSchema.xsd">
    <foo:base_element>base</foo:base_element>
    <foo:middle_element>middle</foo:middle_element>
    <bar:top_element>top</bar:top_element>
</bar:BarTopClass>

until...

Problem 2 - A second inherited class

For whatever reason, the writers of BazSchema.xsd have decided to do something silly and import MiddleClass and its schema FooSchema.xsd under a different name:

BAZ_NSMAP = {"footoo": "FooSchema.xsd", "baz": "BazSchema.xsd"}

class BazTopClass(MiddleClass, nsmap=BAZ_NSMAP, ns="baz"):
    """A class under baz schema that also inherits from foo schema, but named it differently"""

    top_baz_element: str = element(tag="top_baz_element")


baz_top_class = BazTopClass(
    top_baz_element="baz_top",
    middle_element="middle",
    base_element="base",
)

print(baz_top_class.to_xml())

Now, print(baz_top_class.to_xml()) gives us:

<baz:BazTopClass xmlns:footoo="FooSchema.xsd" xmlns:baz="BazSchema.xsd">
    <base_element>base</base_element>
    <middle_element>middle</middle_element>
    <baz:top_baz_element>baz_top</baz:top_baz_element>
</baz:BazTopClass>

Now it gives no namespace at all? Surely, we can consume the XML if it's fully qualified though:

BazTopClass.from_xml("""
<baz:BazTopClass xmlns:footoo="FooSchema.xsd" xmlns:baz="BazSchema.xsd">
    <footoo:base_element>base</footoo:base_element>
    <footoo:middle_element>middle</footoo:middle_element>
    <baz:top_baz_element>baz_top</baz:top_baz_element>
</baz:BazTopClass>
""")

pydantic_core._pydantic_core.ValidationError: 2 validation errors for BazTopClass
base_element
  [line 2]: Field required [type=missing, input_value={'top_baz_element': 'baz_top'}, input_type=dict]
middle_element
  [line 2]: Field required [type=missing, input_value={'top_baz_element': 'baz_top'}, input_type=dict]
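At the XML level, the document being consumed here is unambiguous: footoo is just another alias for FooSchema.xsd, so the inherited elements have the same qualified names they would have under the foo prefix. The unprefixed output from to_xml(), by contrast, puts them in no namespace at all. A minimal standard-library sketch of both points (it does not use pydantic-xml, and child_tags is a made-up helper name):

```python
import xml.etree.ElementTree as ET

# Fully qualified input using the "footoo" prefix from BAZ_NSMAP.
FOOTOO_QUALIFIED = """
<baz:BazTopClass xmlns:footoo="FooSchema.xsd" xmlns:baz="BazSchema.xsd">
    <footoo:base_element>base</footoo:base_element>
    <footoo:middle_element>middle</footoo:middle_element>
    <baz:top_baz_element>baz_top</baz:top_baz_element>
</baz:BazTopClass>
"""

# The same document, but binding FooSchema.xsd to "foo" instead.
FOO_QUALIFIED = FOOTOO_QUALIFIED.replace("footoo", "foo")

# What to_xml() produced above: the inherited elements carry no prefix.
UNQUALIFIED = """
<baz:BazTopClass xmlns:footoo="FooSchema.xsd" xmlns:baz="BazSchema.xsd">
    <base_element>base</base_element>
    <middle_element>middle</middle_element>
    <baz:top_baz_element>baz_top</baz:top_baz_element>
</baz:BazTopClass>
"""


def child_tags(xml_text):
    """Return the direct children's tags in Clark notation."""
    root = ET.fromstring(xml_text.strip())
    return [child.tag for child in root]


footoo_tags = child_tags(FOOTOO_QUALIFIED)
foo_tags = child_tags(FOO_QUALIFIED)
unqualified_tags = child_tags(UNQUALIFIED)

print(footoo_tags == foo_tags)          # the prefix spelling is irrelevant
print(footoo_tags == unqualified_tags)  # dropping the prefix changes the names
```

This suggests the lookup is keyed on the prefix string "foo" rather than on the URI it maps to, although that is a guess from the observed behavior and not something confirmed against the library's source.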

Okay, well, if we used ns="foo" on those parent classes, maybe we can use that namespace:

baz_top_class = BazTopClass.from_xml("""
<baz:BazTopClass xmlns:footoo="FooSchema.xsd" xmlns:baz="BazSchema.xsd">
    <foo:middle_element>middle</foo:middle_element>
    <foo:base_element>base</foo:base_element>
    <baz:top_baz_element>baz_top</baz:top_baz_element>
</baz:BazTopClass>
""")

lxml.etree.XMLSyntaxError: Namespace prefix foo on middle_element is not defined, line 3, column 24
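This failure happens in the XML parser, before any model validation runs: a prefix that the document never declares is a well-formedness error in any namespace-aware parser. The sketch below reproduces the same class of error with the standard library's xml.etree.ElementTree (not lxml, so the exception type and message differ), and shows that declaring the prefix in the document makes it parse. The helper name parses is made up for illustration:

```python
import xml.etree.ElementTree as ET

# "foo" is used on two elements but never bound with xmlns:foo.
UNDECLARED_PREFIX = """
<baz:BazTopClass xmlns:footoo="FooSchema.xsd" xmlns:baz="BazSchema.xsd">
    <foo:middle_element>middle</foo:middle_element>
    <foo:base_element>base</foo:base_element>
    <baz:top_baz_element>baz_top</baz:top_baz_element>
</baz:BazTopClass>
"""

# Same document, but the root now binds FooSchema.xsd to "foo".
DECLARED_PREFIX = UNDECLARED_PREFIX.replace("xmlns:footoo=", "xmlns:foo=")


def parses(xml_text):
    """Return True if the text is well-formed, namespace-valid XML."""
    try:
        ET.fromstring(xml_text.strip())
    except ET.ParseError:
        return False
    return True


undeclared_ok = parses(UNDECLARED_PREFIX)
declared_ok = parses(DECLARED_PREFIX)
print(undeclared_ok, declared_ok)
```

So the prefix used in the model's ns= argument cannot simply be reused in a document that binds the same URI to a different prefix.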

That error, unfortunately, makes sense. As a shot in the dark, I tried redefining those parent classes explicitly with the new BAZ_NSMAP prefixes:

class BazBaseClass(BaseClass, nsmap=BAZ_NSMAP, ns="footoo"):
    pass

class BazMiddleClass(MiddleClass, nsmap=BAZ_NSMAP, ns="footoo"):
    pass

class BazTopClass(BazMiddleClass, nsmap=BAZ_NSMAP, ns="baz", search_mode="unordered"):
    """A class under baz schema that also inherits from foo schema, but named it differently"""

    top_baz_element: str = element(tag="top_baz_element", ns="baz")

But that, unfortunately, still gives us the same output:

<baz:BazTopClass xmlns:footoo="FooSchema.xsd" xmlns:baz="BazSchema.xsd">
    <base_element>base</base_element>
    <middle_element>middle</middle_element>
    <baz:top_baz_element>baz_top</baz:top_baz_element>
</baz:BazTopClass>

In fact, the only way I can seem to get it to work is to redefine every class that Baz inherits from, and every single one of their elements, under the new namespace prefix.

Summary

  1. Descendant classes overwrite the namespaces of their parent classes' elements unless the namespace is explicitly declared on every element of every parent class. This includes every class in the middle, all the way up to the root class.

The apparent fix of explicitly declaring the namespace prefix on those elements leads to:

  2. Any class that inherits from the parents under a different prefix for the same namespace loses that prefix entirely, and the only solution seems to be to redefine all parent classes.

This has some knock-on effects, as you might imagine. It also complicates scenarios like new schema versions, and schemas that define their default namespace as "" while inheriting schemas define it under some prefix.

Problem 1 seems to be the real issue here, and it's hard to tell how closely the two are linked.

Thanks for taking the time to look at this.
