huge_tree option for XML parser · Issue #247 · dapper91/pydantic-xml · GitHub
huge_tree option for XML parser #247
Open
@JordanBarnartt

Description

While parsing very large XML documents with pydantic-xml, I received errors like: lxml.etree.XMLSyntaxError: CData section too big found, line 66036, column 194.

According to https://lxml.de/apidoc/lxml.etree.html#lxml.etree.XMLParser, the size limit behind this error can be lifted by passing huge_tree=True to the parser. However, there does not appear to be a way to enable this option for the parser that pydantic-xml creates.
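For reference, here is a minimal sketch of the behaviour outside pydantic-xml. It assumes the default lxml parser rejects a single CDATA section of roughly 10 MB or more; the 11 MB payload and the exact threshold are assumptions and may vary with the libxml2 version, so the script only reports what the default parser does rather than relying on it.

```python
from lxml import etree

# Build a document whose single CDATA section is about 11 MB.
# The size is an assumption chosen to exceed the default parser limit.
payload = "x" * 11_000_000
source = f"<doc><![CDATA[{payload}]]></doc>".encode()


def try_parse(parser):
    """Return the parsed root element, or the XMLSyntaxError that was raised."""
    try:
        return etree.fromstring(source, parser)
    except etree.XMLSyntaxError as exc:
        return exc


# Default parser: expected to fail on a payload this large (version dependent).
default_result = try_parse(etree.XMLParser())
# huge_tree=True lifts the restriction, so the same document parses.
huge_result = try_parse(etree.XMLParser(huge_tree=True))

print("default parser:", type(default_result).__name__)
print("huge_tree parser text length:", len(huge_result.text))
```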

I was able to work around the problem by monkey-patching from_xml like so:

from pydantic_xml.model import BaseXmlModel
from lxml import etree

def _from_xml(cls, source, context=None):
    """
    Deserializes an xml string to an object of `cls` type.

    :param source: xml string
    :param context: pydantic validation context
    :return: deserialized object
    """

    # huge_tree=True disables the parser's size restrictions on the document
    parser = etree.XMLParser(huge_tree=True)
    return cls.from_xml_tree(etree.fromstring(source, parser), context=context)


# Replace the stock classmethod so every model picks up the patched parser
BaseXmlModel.from_xml = classmethod(_from_xml)

It would be nice if huge_tree were exposed as part of the from_xml interface. Is this change desirable? If so, I'd be happy to write a PR, perhaps with something more general that passes arbitrary keyword arguments through to the XMLParser constructor.
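As a rough illustration of the more general idea, the parser options could simply be forwarded as keyword arguments. The helper name parse_xml below is hypothetical and is not part of pydantic-xml; it only builds the lxml tree, which can then be handed to the existing from_xml_tree method.

```python
from typing import Any, Union

from lxml import etree


def parse_xml(source: Union[str, bytes], **parser_kwargs: Any) -> etree._Element:
    """Parse `source` with an lxml XMLParser built from `parser_kwargs`.

    Hypothetical helper for illustration; any keyword accepted by
    lxml.etree.XMLParser (huge_tree, remove_blank_text, ...) is forwarded.
    """
    parser = etree.XMLParser(**parser_kwargs)
    return etree.fromstring(source, parser)


# Forward two XMLParser options without touching pydantic-xml itself.
root = parse_xml("<doc><title>hi</title></doc>", huge_tree=True, remove_blank_text=True)
print(root.tag, root.find("title").text)

# A model (MyModel is a placeholder name) could then consume the tree directly:
#     MyModel.from_xml_tree(root, context=None)
```

An interface change along these lines would keep from_xml unchanged for existing callers, since the extra arguments would be optional.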

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request), v2 (Version 2 related)