-
Notifications
You must be signed in to change notification settings - Fork 283
replace test XML diffing: yaxmldiff instead of pyutilib #495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #495 +/- ##
==========================================
+ Coverage 95.40% 95.71% +0.30%
==========================================
Files 20 20
Lines 2481 2472 -9
Branches 428 424 -4
==========================================
- Hits 2367 2366 -1
+ Misses 50 46 -4
+ Partials 64 60 -4
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGFM. The suggestions are because this part shall never be reached in the CI..
Pyutilib is a wonderful collection of Python helpers developed by the same people who created gcovr. But as gcovr is converging towards more standard Python practices, we've only been using it for XML diffing. It turns out that XML diffing is a very popular problem, but most libraries do something that's too clever for our tests or don't produce readable output. The `lxml.doctestcompare` module is very good but annotates differences *inline*, e.g. produces this when comparing `<a/>` with `<b/>`: Expected: <a></a> Got: <b></b> Diff: <a (got: b)></a (got: b)> In large documents, this can be hard to spot. There are additional issues, such as special `<any/>` tags, `any=""` attributes, and `...` ellipsis that are useful for doctests, but not useful for comparing two actual XML documents. Another tool is xmldiff which is also very good, except that it outputs diff as instructions that are more machine-readable than human-accessible. For the same problem, it will output: [rename, /a[1], b] So because I'm dissatisfied with available tools, I've written Yet Another XML Differ (yaxmldiff). It uses lxml and tries to approximate a unified diff: - <a/> + <b/> On larger problems, it will try to collapse irrelevant sub-trees and attributes. For example, here is a diff between the gcc-5 and gcc-8 reference files for the simple1 test: <coverage branch-rate="0.5" complexity="0.0" - branches-covered="4" + branches-covered="2" - branches-valid="8" + branches-valid="4" - line-rate="0.8" + line-rate="0.7777777777777778" - lines-covered="8" + lines-covered="7" - lines-valid="10" + lines-valid="9" - timestamp="" + timestamp="1601411485" - version="" + version="gcovr 5.0" > <sources> ... </sources> <packages> <package branch-rate="0.5" complexity="0.0" name="" - line-rate="0.8" + line-rate="0.7777777777777778" > <classes> <class branch-rate="0.5" complexity="0.0" filename="..." name="..." - line-rate="0.8" + line-rate="0.7777777777777778" > <methods/> <lines> <line .../> <line ...> ... </line> <line .../> <line .../> <line .../> <line branch="..." hits="1" - number="17" + number="14" /> <line ...> ... </line> <line branch="..." hits="0" - number="22" + number="19" /> <line .../> - <line ...>...</line> </lines> </class> </classes> </package> </packages> </coverage> This tool did spot a problem in the existing reference files. In the GCC-5 reference for the "shadow" test, Pyutilib considered the following two attributes equivalent, likely because it tries to compare things that look like numbers as numbers: condition-coverage="50% (1/2)" condition-coverage="50% (2/4)" The reference file was updated manually. This also hints at the drawback of yaxmldiff: it treats everything as text, and cannot apply rounding like the Pyutilib functions. Links: * yaxmldiff on PyPI: <https://pypi.org/project/yaxmldiff/> * yaxmldiff on GitHub: <https://github.com/latk/yaxmldiff.py>
The driver for the end-to-end integration tests contains some checks that can't fire in a successful run, so ignore their coverage to massage our codecov figures. Co-authored-by: Michael Förderer <40258682+Spacetown@users.noreply.github.com>
Pyutilib is a wonderful collection of Python helpers developed by the same people who created gcovr. But as gcovr is converging towards more standard Python practices, we've only been using it for XML diffing.
It turns out that XML diffing is a very popular problem, but most libraries do something that's too clever for our tests or don't produce readable output.
The
lxml.doctestcompare
module is very good but annotates differences inline, e.g. produces this when comparing<a/>
with<b/>
:In large documents, this can be hard to spot.
There are additional issues, such as special
<any/>
tags,any=""
attributes, and...
ellipsis that are useful for doctests, but not useful for comparing two actual XML documents.Another tool is xmldiff which is also very good, except that it outputs diff as instructions that are more machine-readable than human-accessible. For the same problem, it will output:
So because I'm dissatisfied with available tools, I've written Yet Another XML Differ (yaxmldiff). It uses lxml and tries to approximate a unified diff:
On larger problems, it will try to collapse irrelevant sub-trees and attributes. For example, here is a diff between the gcc-5 and gcc-8 reference files for the simple1 test:
This tool did spot a problem in the existing reference files. In the GCC-5 reference for the "shadow" test, Pyutilib considered the following two attributes equivalent, likely because it tries to compare things that look like numbers as numbers:
The reference file was updated manually.
This also hints at the drawback of yaxmldiff: it treats everything as text, and cannot apply rounding like the Pyutilib functions.
Links:
This PR contains two commits: one for the actual change, one to re-format the test_gcovr.py file. The reformatting will be redundant once #493 is merged.