-
Notifications
You must be signed in to change notification settings - Fork 283
Type checking + refactoring for the coverage data model #600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #600 +/- ##
==========================================
- Coverage 95.14% 95.11% -0.04%
==========================================
Files 23 24 +1
Lines 3355 3313 -42
Branches 616 617 +1
==========================================
- Hits 3192 3151 -41
- Misses 90 92 +2
+ Partials 73 70 -3
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGFM, only some questions.
In an earlier design, functions were accounted for on a per-line basis. This commit removes remnants of that design.
This assists with type checking throughout all coverage handling functions, especially within the writers, and will make future refactoring safer. To avoid cyclic dependencies between modules, the "gcovr.coverage" module must now be a leaf module. To achieve this, some functions were moved from "utils" to "coverage".
Previously, coverage data was updated in-place with coverage.update(new_coverage) However, this has some problems: * It cannot deal properly with the DecisionCoverage cases, since the type checker (correctly!) refuses to allow updating of different types. * Updating in-place is mutable state, and that is conceptually complicated. * It is increasingly clear that merging coverage is not just a convenience function, but that it is business logic in its own right. In particular, issue gcovr#586 suggests that in the future, merging will have to depend on command line options. This should not be done directly in "gcovr.coverage" since that must be a leaf module without further dependencies. So now a more functional interface is implemented: from .merging import merge_file coverage = merge_file(coverage, new_coverage) This might reuse the contents of the input objects, so that the argument objects must not be used afterwards. To support this change, some of the coverage processing functions in "gcovr.writer.json" and "gcovr.__main__" were also updated to return coverage instead of updating it in-place. While re-implementing branch coverage merging, it became clear that the tri-state booleans None/False/True for the "fallthrough" and "throw" fields was ill-defined. Whereas the gcov-parser would only produce None/True, the JSON reader/writer coerced these to a False/True bool. Here, None and False mean "unknown". But the old branch coverage update() function only checked whether the value was None, meaning that coverage data loaded via "--add-tracefile" could not be merged successfully. This was fixed by keeping these fields as ordinary booleans: True if known, False if unknown.
Various functions return `(total, covered, percentage)` tuples, which makes type-checking difficult and leads to error-prone code, due to the need to unpack the tuple into correct variables. The new classes provide a more convenient interface. This commit starts the migration in the coverage data model, but doesn't yet switch the writers over entirely - they mostly use the `stat.to_tuple` compatibility layer.
This unearthed two bugs, which are kept for now. They should be fixed in a separate PR for better visibility. Function coverage in Coberatura reports --------------------------------------- The Cobertura writer does its own coverage data aggregation. When calculating package-level function coverage, it would update the package-level metric by *overwriting* it with the coverage from the last file, instead of *adding* the counts. Branch and decision coverage aggregation ---------------------------------------- For calculating line coverage, only lines that are explicitly covered or uncovered are considered. For calculating other kinds of coverage metrics (in particular branch coverage), all lines are considered regardless of whether that line would be considered for line coverage. Changing this would lead to substantially different coverage metrics, so it is kept for now.
This means that the HTML writer can use the shared SummarizedStats class instead of aggregating coverage itself.
I've rebased and worked through the review remarks. I thought about replacing the CovData type alias with a proper class, but found myself moving in the other direction – moving more behaviour away from the coverage data model and into the new The This discovered some additional problems, in particular with the decision analysis: I think it can resurrect coverage information for lines that were previously excluded. |
The methods is_conditional() is_switch() is_uncheckable() used to indicate the type of a DecisionCoverage object. However, that information is already provided via its class that can be check with `isinstance()`.
There used to be uncovered_lines_str() and uncovered_branches_str() methods on the LineCoverage class. But since this only relates to output formatting, it should not be part of the core coverage data model.
The code used to have about 8 levels of indentation... Now, the largest function roughly fits on one screen. Some unused variables could be removed.
These methods *look* like they are ordinary accessors, but actually create the coverage object if it doesn't already exist. This causes problems. (A) This behavior was not expected in some parts of the code, especially the decision-analysis. (B) The default LineCoverage object does not behave like a neutral element for the merge_line() operation due to issues with the "noncode" flag. This commit DOESN'T FIX those issues, but makes them more visible. Mutation of the coverage data is now done almost entirely via the "gcovr.merging" module. To assist with this, the "gcovr.merging" module now contains "insert()" functions which add a single element into the next-larger structure, for example "insert_function_coverage()" inserts FunctionCoverage into FileCoverage. In order to ensure bug-compatibility, Uses of the "line()" method are removed with the more clearly named "get_or_create_line_coverage()" function. Merging/updating function coverage is special in that it will only merge data when the functions are defined on the same line, yet the gcov-parser will update the line number if the function is redefined. To account for this (and future developments), all "insert_*()" and "merge_*()" functions now take an optional "MergeOption" parameter.
functions now return coverage data instead of updating data structures in place. Using dict literals is preferred to make type checking more feasible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGFM
As a “dividend” from dropping support for Python 3.6, various improvements have become possible. This PR focuses on adding more type annotations for the
gcovr.coverage
data model, externalizing the merging logic, and replacing the increasingly large tuples fromget_global_stats()
and so on with more useful classes. In doing so, a number of bugs were found, but not necessarily fixed as part of this PR.For reviewing this PR, moving one commit at a time is recommended. The commit messages provide further explanation. Here, I summarize the central changes and consequences.
gcovr.coverage
must be a leaf modulePython doesn't handle cyclic dependencies very well. Since the coverage data model now provides types that are used in type annotations throughout the code, the
gcovr.coverage
module now cannot have dependencies on other gcovr modules.In particular, the
CovData
type alias describes the top-level filename → FileCoverage dictionary.As part of adding type annotations, some dead code regarding function coverage could be removed.
New
CoverageStat
,DecisionCoverageStat
,SummarizedStats
typesThe functions like
file.branch_coverage()
now return aCoverageStat(covered, total)
object. This has apercent
property that replaces the need to callcalculate_coverage()
. In case a default value other than None is desired, thepercent_or(default)
method can be used.Since decision coverage has an additional
uncheckable
count, a separate but otherwise equivalentDecisionCoverageStat
type is needed.The
SummarizedStats
type aggregates line, branch, function, and decision coverage for a file or for the entire project, depending on which constructor function is used. This replaces theget_global_stats()
andsummarize_file_coverage()
functions.As a benefit of this, the different writers can now share common logic for aggregating coverage statistics, instead of doing everything themselves. Updating the writers uncovered some potential bugs, which are described below but not yet fixed.
Merging coverage data with
merged = merge_file(left, right)
instead ofleft.update(right)
Previously, coverage data structures were merged with the
update()
method. This had a number of issues:DecisionCoverage
types reasonably.gcovr.coverage
module (as it must now be a leaf module).To support this more “functional” style, some functions in other parts (
__main__
andgcov
) were updated to return CovData instead of updating it in-place.To insert individual elements, the merging module also provides functions like
insert_line_coverage(file, line)
.Auditing the merging logic found a bug in the interplay of
--add-tracefile
+ branch coverage and in the decision coverage analysis.Fixed bug: correctly merge branch coverage data from
--add-tracefile
In BranchCoverage, the
fallthrough
andthrow
flags used to be tri-state booleans None/False/True whereNone
meant “unknown” andTrue
meant “known and set”. But the JSON writer coerced those to boolean False/True. This caused a problem because the old merging logic only checked whether the flag was None, and if so choose the other value. This meant the order of updates would become relevant when merging data loaded via--add-tracefile
. In pseudocode:The correct result in either case would be
throw=True
because it should override the “unknown” state.Now the flags use False for “unknown” and True for “known and set”, which allows the JSON output to remain stable.
Open bug: package-level function coverage in Cobertura is entirely wrong
In pseudocode, the problem is that the Cobertura writer does its own coverage aggregation, but just overwrites instead of adding up function coverage stats:
Open bug: branch coverage also counts noncode lines
For calculating line coverage, only the following lines are counted:
Notably, this excludes
noncode
lines.The
branch_coverage()
method has no such check, and will also consider branches on noncode lines!This shouldn't matter since the gcovr_parser tries to check whether a line should contain branches, but there seems to be some discrepancy. Adding the noncode check in the coverage data model would cause a lot of test failures. Further investigation is needed to understand what's going on.
Open bug: decision analysis can resurrect excluded lines
The decision analysis accesses coverage data via the
file.line(lineno)
method (before this PR) /get_or_create_line_coverage(file, lineno)
(after this PR). But if no LineCoverage data structure already exists for that line, it will be created. The default LineCoverage also sets noncode=False, so that the newly-created lines will be counted as uncovered.I haven't created a testcase to check this, but fixing this issue and moving to read-only access seems to break the tests, so correct behavior should be determined in a separate issue.
Surprising discrepancy: null coverage in JSON-Summary writer
A never-ending problem is the default value if a coverage percentage is NaN. In the JSON-Summary writer, the project-summary and file-summary percentage fields seem to have different defaults. Whereas the project-summary defaults to 0%, the file-summary renders as null. I've kept this behaviour, for now, but it would probably make sense to figure out an unified approach eventually.
This kind of discrepancy was much easier to find when using the unified
CoverageStat.percent
method and when using a type checker such as MyPy.Obstacles to more type checking
At this point, more linting and more type-checking is not directly possible because there are still tons of spurious errors. In particular, heterogeneous dicts are tricky, e.g. if one dict entry contains a list and another a string. Our HTML writer is full of those. Eventually, these dicts can be replaced with dataclasses, but that is out of scope for this PR. Another problem is the Options type in the main function, since some fields change their types during execution (e.g. compiling filters). I do not intend to add MyPy to the CI process anytime soon, but I would like fewer type errors in order to have a better development experience, especially for people using IDEs or LSP-enabled editors like VS Code.