8000 replace test XML diffing: yaxmldiff instead of pyutilib by latk · Pull Request #495 · gcovr/gcovr · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

replace test XML diffing: yaxmldiff instead of pyutilib #495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 13, 2021

Conversation

latk
Copy link
Member
@latk latk commented Jun 13, 2021

Pyutilib is a wonderful collection of Python helpers developed by the same people who created gcovr. But as gcovr is converging towards more standard Python practices, we've only been using it for XML diffing.

It turns out that XML diffing is a very popular problem, but most libraries do something that's too clever for our tests or don't produce readable output.

The lxml.doctestcompare module is very good but annotates differences inline, e.g. produces this when comparing <a/> with <b/>:

Expected:
  <a></a>

Got:
  <b></b>

Diff:
  <a (got: b)></a (got: b)>

In large documents, this can be hard to spot.

There are additional issues, such as special <any/> tags, any="" attributes, and ... ellipsis that are useful for doctests, but not useful for comparing two actual XML documents.

Another tool is xmldiff which is also very good, except that it outputs diff as instructions that are more machine-readable than human-accessible. For the same problem, it will output:

[rename, /a[1], b]

So because I'm dissatisfied with available tools, I've written Yet Another XML Differ (yaxmldiff). It uses lxml and tries to approximate a unified diff:

- <a/>
+ <b/>

On larger problems, it will try to collapse irrelevant sub-trees and attributes. For example, here is a diff between the gcc-5 and gcc-8 reference files for the simple1 test:

  <coverage branch-rate="0.5" complexity="0.0"
-   branches-covered="4"
+   branches-covered="2"
-   branches-valid="8"
+   branches-valid="4"
-   line-rate="0.8"
+   line-rate="0.7777777777777778"
-   lines-covered="8"
+   lines-covered="7"
-   lines-valid="10"
+   lines-valid="9"
-   timestamp=""
+   timestamp="1601411485"
-   version=""
+   version="gcovr 5.0"
  >
    <sources>
    ...
    </sources>
    <packages>
      <package branch-rate="0.5" complexity="0.0" name=""
-       line-rate="0.8"
+       line-rate="0.7777777777777778"
      >
        <classes>
          <class branch-rate="0.5" complexity="0.0" filename="..." name="..."
-           line-rate="0.8"
+           line-rate="0.7777777777777778"
          >
            <methods/>
            <lines>
              <line .../>
              <line ...>
              ...
              </line>
              <line .../>
              <line .../>
              <line .../>
              <line branch="..." hits="1"
-               number="17"
+               number="14"
              />
              <line ...>
              ...
              </line>
              <line branch="..." hits="0"
-               number="22"
+               number="19"
             />
              <line .../>
-             <line ...>...</line>
            </lines>
          </class>
        </classes>
      </package>
    </packages>
  </coverage>

This tool did spot a problem in the existing reference files. In the GCC-5 reference for the "shadow" test, Pyutilib considered the following two attributes equivalent, likely because it tries to compare things that look like numbers as numbers:

condition-coverage="50% (1/2)"
condition-coverage="50% (2/4)"

The reference file was updated manually.

This also hints at the drawback of yaxmldiff: it treats everything as text, and cannot apply rounding like the Pyutilib functions.

Links:

This PR contains two commits: one for the actual change, one to re-format the test_gcovr.py file. The reformatting will be redundant once #493 is merged.

@latk latk added Type: Enhancement QA related to testing, build infrastructure, etc dependencies Pull requests that update a dependency file labels Jun 13, 2021
@latk latk added this to the UpcomingRelease milestone Jun 13, 2021
@codecov
Copy link
codecov bot commented Jun 13, 2021

Codecov Report

Merging #495 (5d69cfb) into master (07d419e) will increase coverage by 0.30%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #495      +/-   ##
==========================================
+ Coverage   95.40%   95.71%   +0.30%     
==========================================
  Files          20       20              
  Lines        2481     2472       -9     
  Branches      428      424       -4     
==========================================
- Hits         2367     2366       -1     
+ Misses         50       46       -4     
+ Partials       64       60       -4     
Flag Coverage Δ
ubuntu-18.04 95.10% <100.00%> (+0.30%) ⬆️
windows-2019 95.26% <100.00%> (+0.30%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
gcovr/tests/test_gcovr.py 98.01% <100.00%> (+4.88%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 07d419e...5d69cfb. Read the comment docs.

Copy link
Member
@Spacetown Spacetown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGFM. The suggestions are because this part shall never be reached in the CI..

latk and others added 3 commits June 13, 2021 23:10
Pyutilib is a wonderful collection of Python helpers developed by the same
people who created gcovr. But as gcovr is converging towards more standard
Python practices, we've only been using it for XML diffing.

It turns out that XML diffing is a very popular problem, but most libraries do
something that's too clever for our tests or don't produce readable output.

The `lxml.doctestcompare` module is very good but annotates
differences *inline*, e.g. produces this when comparing `<a/>` with `<b/>`:

    Expected:
      <a></a>

    Got:
      <b></b>

    Diff:
      <a (got: b)></a (got: b)>

In large documents, this can be hard to spot.

There are additional issues, such as special `<any/>` tags, `any=""` attributes,
and `...` ellipsis that are useful for doctests, but not useful for comparing
two actual XML documents.

Another tool is xmldiff which is also very good, except that it outputs diff as
instructions that are more machine-readable than human-accessible. For the same
problem, it will output:

    [rename, /a[1], b]

So because I'm dissatisfied with available tools, I've written Yet Another XML
Differ (yaxmldiff). It uses lxml and tries to approximate a unified diff:

    - <a/>
    + <b/>

On larger problems, it will try to collapse irrelevant sub-trees and attributes.
For example, here is a diff between the gcc-5 and gcc-8 reference files for the
simple1 test:

    <coverage branch-rate="0.5" complexity="0.0"
  -   branches-covered="4"
  +   branches-covered="2"
  -   branches-valid="8"
  +   branches-valid="4"
  -   line-rate="0.8"
  +   line-rate="0.7777777777777778"
  -   lines-covered="8"
  +   lines-covered="7"
  -   lines-valid="10"
  +   lines-valid="9"
  -   timestamp=""
  +   timestamp="1601411485"
  -   version=""
  +   version="gcovr 5.0"
    >
      <sources>
      ...
      </sources>
      <packages>
        <package branch-rate="0.5" complexity="0.0" name=""
  -       line-rate="0.8"
  +       line-rate="0.7777777777777778"
        >
          <classes>
            <class branch-rate="0.5" complexity="0.0" filename="..." name="..."
  -           line-rate="0.8"
  +           line-rate="0.7777777777777778"
            >
              <methods/>
              <lines>
                <line .../>
                <line ...>
                ...
                </line>
                <line .../>
                <line .../>
                <line .../>
                <line branch="..." hits="1"
  -               number="17"
  +               number="14"
                />
                <line ...>
                ...
                </line>
                <line branch="..." hits="0"
  -               number="22"
  +               number="19"
                />
                <line .../>
  -             <line ...>...</line>
              </lines>
            </class>
          </classes>
        </package>
      </packages>
    </coverage>

This tool did spot a problem in the existing reference files. In the GCC-5
reference for the "shadow" test, Pyutilib considered the following two
attributes equivalent, likely because it tries to compare things that look like
numbers as numbers:

    condition-coverage="50% (1/2)"
    condition-coverage="50% (2/4)"

The reference file was updated manually.

This also hints at the drawback of yaxmldiff: it treats everything as text, and
cannot apply rounding like the Pyutilib functions.

Links:

* yaxmldiff on PyPI: <https://pypi.org/project/yaxmldiff/>
* yaxmldiff on GitHub: <https://github.com/latk/yaxmldiff.py>
The driver for the end-to-end integration tests contains some checks that can't fire in a successful run, so ignore their coverage to massage our codecov figures.

Co-authored-by: Michael Förderer <40258682+Spacetown@users.noreply.github.com>
@latk latk merged commit 84143ba into gcovr:master Jun 13, 2021
@latk latk deleted the xml-testing branch June 13, 2021 21:21
@Spacetown Spacetown added the Internal change Internal change not visible to user label Dec 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file Internal change Internal change not visible to user QA related to testing, build infrastructure, etc Type: Enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0