Unittester failures summary #16833

hmeriann · 2025-03-25T20:14:38Z

Currently, when a unit test fails in CI, the actual failure is buried under mountains of output from successful runs.

We should print the output of all failing tests again at the end of the unittester run, so that failing tests are easier to find when looking through the CI logs.

This PR adds a Failures Summary, printed by default at the very end of the logs.
Passing the --summarize-failures flag to ./build/release/test/unittest should enable the summary output at the end of the test run. The Python script that runs unit tests one by one (scripts/run_tests_one_by_one.py) should print the summary by default.

The Python script captures the full failure message, whereas unittest reports only the failing test file and the line number of the failing statement.

The output should look like this for unittest:

=====================================================
================  FAILURES  SUMMARY  ================
=====================================================

1: test/sql/aggregate/aggregates/histogram_table_function.test: 20
2: test/sql/copy/csv/test_date.test: 11

And for scripts/run_tests_one_by_one.py:

=====================================================
================  FAILURES  SUMMARY  ================
=====================================================

1: ['test/sql/aggregate/aggregates/histogram_table_function.test']
FAILED WITH ERROR:
 ================================================================================
Wrong result in query! (test/sql/aggregate/aggregates/histogram_table_function.test:20)!
================================================================================
SELECT * FROM histogram_values(integers, i, bin_count := 2);
================================================================================
Mismatch on row 1, column count(index 2)
1 <> 1q
================================================================================
Expected result:
================================================================================
60      1q
80      0
100     1
================================================================================
Actual result:
================================================================================
60      1
80      0
100     1

2: ['test/sql/copy/csv/test_date.test']
FAILED WITH ERROR:
 ================================================================================
Wrong result in query! (test/sql/copy/csv/test_date.test:11)!
================================================================================
COPY date_test FROM 'data/csv/test/date.csv';
================================================================================
Mismatch on row 1, column Count(index 1)
1 <> 1q
================================================================================
Expected result:
================================================================================
1q
================================================================================
Actual result:
================================================================================
1
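The collect-then-print idea behind the Python output above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/run_tests_one_by_one.py: the function names (run_one, run_all, format_summary, demo_command) are hypothetical, and demo_command is a stand-in that launches a tiny Python child process instead of the real unittest binary.

```python
import subprocess
import sys


def run_one(command):
    """Run one test command; return (ok, combined stdout and stderr)."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def run_all(tests, make_command):
    """Run every test and remember the failures instead of only printing them."""
    failures = []
    for name in tests:
        ok, output = run_one(make_command(name))
        if not ok:
            failures.append((name, output))
    return failures


def format_summary(failures):
    """Build a summary block modeled on the layout shown above."""
    bar = "=" * 53
    header = "  FAILURES  SUMMARY  ".center(53, "=")
    lines = ["", bar, header, bar, ""]
    for index, (name, error_text) in enumerate(failures, start=1):
        lines.append(f"{index}: {name}")
        lines.append("FAILED WITH ERROR:")
        lines.append(error_text.rstrip())
        lines.append("")
    return "\n".join(lines)


def demo_command(name):
    """Stand-in command (assumption): exits non-zero when the name contains 'fail'."""
    exit_code = 1 if "fail" in name else 0
    script = f"import sys; print('output of {name}'); sys.exit({exit_code})"
    return [sys.executable, "-c", script]
```

Calling print(format_summary(run_all(names, demo_command))) once after the loop puts every failure at the end of the log, which is the property the PR is after.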

Mytherin · 2025-03-26T08:16:31Z

Thanks for the PR! LGTM - the CI failure is unrelated.

Can we start enabling these flags on the various CI runs as well?

Tmonster

Looks good! Just the one question

test/sqlite/result_helper.cpp

hmeriann · 2025-03-26T14:14:41Z

Made the summary look the same as the summary produced by the Python runner. For now this only covers CompareValues(); I will add more.

hmeriann · 2025-03-26T16:51:57Z

I may need to add some test cases to check that all failures end up in the summary.

hmeriann · 2025-03-26T17:00:34Z

There is the SQLLogicTestLogger, which is used to output all unittest test results. I've added a SQLLogicTestLogger::AddToSummary(string log_message) method that writes each failure it finds to a file.

I've also modified the return types of most of the SQLLogicTestLogger methods so that the strings they return can be collected into a log_message variable.

I think the filename should be defined somewhere other than the SQLLogicTestLogger::AddToSummary(string log_message) method, maybe globally, like the duckdb_unittest_tempdir name.
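The pattern described here (logger methods return their text, the caller gathers it into log_message, and one method appends it to a summary file) can be sketched in Python for brevity. The real code is C++ in SQLLogicTestLogger; everything below, including the class name SummaryLogger, its methods, and the file name failures_summary.txt, is a hypothetical illustration rather than the PR's implementation.

```python
import os
import tempfile


def default_summary_path():
    """Assumption: the file name is defined in one shared place, not inside the logger."""
    return os.path.join(tempfile.gettempdir(), "failures_summary.txt")


class SummaryLogger:
    """Hypothetical stand-in for a logger whose methods return what they print."""

    def __init__(self, summary_path):
        self.summary_path = summary_path

    def header(self, title):
        line = "=" * 80
        return f"{line}\n{title}\n{line}\n"

    def sql(self, query):
        return f"{query}\n"

    def add_to_summary(self, log_message):
        # Append, so that several failures accumulate in the same file.
        with open(self.summary_path, "a", encoding="utf-8") as handle:
            handle.write(log_message + "\n")

    def wrong_result(self, file_name, line_number, query):
        # Collect the returned pieces into one message, print it, then store it.
        log_message = self.header(f"Wrong result in query! ({file_name}:{line_number})!")
        log_message += self.sql(query)
        print(log_message, end="")
        self.add_to_summary(log_message)
        return log_message
```

Passing the path into the constructor is one way to address the concern about where the filename lives: the logger no longer decides it, and tests can point it at a scratch directory.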

hmeriann · 2025-03-27T12:24:41Z

We still will have to scroll in case of long logs like this

Tmonster · 2025-03-27T12:29:56Z

We still will have to scroll in case of long logs like this

I was going to say yea. I think if you summarize everything, the message gets a bit too long. The reason we want this is just to see which files have failing test cases. It's much nicer to look at the files, then run the tests locally to find out exactly what has failed. If you also want the failure details in the summary, then I suggest having two flags, --summarize-failures and --summarize-failures-verbose, with verbose also printing the expected results, error messages, etc.
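The two-flag suggestion can be illustrated with a small argparse sketch. This is only an illustration of the proposal: the unittest binary is C++ and parses its own arguments, the verbose flag exists only as a suggestion in this thread, and build_parser and render_summary are hypothetical names.

```python
import argparse


def build_parser():
    """Parser with a brief and a verbose summary switch (sketch of the proposal)."""
    parser = argparse.ArgumentParser(description="Sketch of the two proposed summary flags")
    parser.add_argument("--summarize-failures", action="store_true",
                        help="print failing test files and line numbers at the end")
    parser.add_argument("--summarize-failures-verbose", action="store_true",
                        help="also repeat the full error message for each failure")
    return parser


def render_summary(failures, verbose):
    """failures is a list of (test_file, line_number, details) tuples."""
    lines = []
    for index, (test_file, line_number, details) in enumerate(failures, start=1):
        lines.append(f"{index}: {test_file}: {line_number}")
        if verbose:
            lines.append(details)
    return "\n".join(lines)
```

With verbose off, the output stays at one line per failing file, which is the short form argued for in this comment.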

Mytherin · 2025-03-27T13:03:25Z

I think the current approach is fine - showing the error again is helpful as well. This is useful if we want to e.g. check if an error is expected. Sometimes errors are also hard to reproduce locally.

The HTTPFS example is a bit excessive because there are many errors and the errors are in EXPLAIN statements - but that's not going to look great anyway. For smaller errors this looks great to me - e.g. https://github.com/duckdb/duckdb/actions/runs/14105375963/job/39511652190?pr=16833

.github/workflows/Main.yml

Mytherin

Can we pick this up again? After digging through some more logs this is still very much a needed change. I've left some more comments:

test/sqlite/sqllogic_test_logger.cpp

test/helpers/test_helpers.cpp

test/sqlite/sqllogic_test_logger.cpp

Mytherin

Thanks for the fixes! Looks good. Some more comments below. I also noticed that we are not reporting timeout failures in run_tests_one_by_one.py, see e.g. the "Release Assertions with Clang" run. Can we report those as well?

[2573/4670]: test/sql/join/asof/test_asof_join_missing.test_slowStill running...
Still running...
Still running...
(TIMED OUT)
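One way a runner could fold timeouts into the same failure list is sketched below. It relies on the standard subprocess.run timeout parameter and the subprocess.TimeoutExpired exception; the function names and the status strings are assumptions for illustration and are not taken from run_tests_one_by_one.py.

```python
import subprocess
import sys


def python_command(code):
    """Stand-in for a real test command: run a snippet in a child Python process."""
    return [sys.executable, "-c", code]


def run_with_timeout(command, timeout_seconds):
    """Return (status, output); status is 'ok', 'failed', or 'timed out'."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True,
                              timeout=timeout_seconds)
    except subprocess.TimeoutExpired as error:
        # Output captured before the timeout may be None, bytes, or str.
        partial = error.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return "timed out", partial
    if proc.returncode != 0:
        return "failed", proc.stdout + proc.stderr
    return "ok", proc.stdout


def collect_failures(named_commands, timeout_seconds):
    """Record timeouts next to ordinary failures so both reach the summary."""
    failures = []
    for name, command in named_commands:
        status, output = run_with_timeout(command, timeout_seconds)
        if status != "ok":
            failures.append((name, status, output))
    return failures
```

Keeping the status alongside the test name lets the summary distinguish a test that timed out from one that produced a wrong result.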

test/include/test_helpers.hpp

test/sqlite/sqllogic_test_runner.cpp

test/sqlite/sqllogic_test_logger.cpp

test/unittest.cpp

Mytherin · 2025-06-17T08:12:20Z

Thanks!

hmeriann requested a review from Tmonster March 25, 2025 20:14

hmeriann changed the title from "Unittester summary" to "Unittester failures summary" Mar 26, 2025

Tmonster reviewed Mar 26, 2025

test/sqlite/result_helper.cpp (outdated)

duckdb-draftbot marked this pull request as draft March 26, 2025 14:13

hmeriann marked this pull request as ready for review March 26, 2025 16:52

duckdb-draftbot marked this pull request as draft March 27, 2025 09:19

hmeriann marked this pull request as ready for review March 27, 2025 09:31

duckdb-draftbot marked this pull request as draft March 27, 2025 10:54

hmeriann marked this pull request as ready for review March 27, 2025 10:55

hmeriann requested a review from Tmonster March 27, 2025 10:55

duckdb-draftbot marked this pull request as draft March 27, 2025 11:23

hmeriann marked this pull request as ready for review March 27, 2025 11:23

Mytherin reviewed Mar 27, 2025

.github/workflows/Main.yml (outdated)

hmeriann added 10 commits March 28, 2025 14:45

Add new --summarize-failures argument to the unittester. It will allo…

9d40f62

…w the unittester to print the failed test names + failed line as a run summary

add summary to py script

2cce404

outputs one line

dd21dba

after merge

c95af4e

print failures summary

e5591d1

format-fix

749de56

format-fix

538bb04

summary format changes

d291ef9

output test name without brackets

ab700a5

collect the error output into the log_message variable and then save …

2c61b64

…to the test file changed returning types for some of the logger methods

hmeriann marked this pull request as ready for review May 30, 2025 12:13

hmeriann requested a review from Tmonster May 30, 2025 14:00

hmeriann added Ready For Review and removed Ready For Review labels May 30, 2025

Merge remote-tracking branch 'origin/main' into unittester-summary

bbe87f3

duckdb-draftbot marked this pull request as draft June 2, 2025 17:09

hmeriann marked this pull request as ready for review June 3, 2025 07:14

compare inputs to strings

f621268

hmeriann added the Ready For Review label Jun 3, 2025

Mytherin reviewed Jun 11, 2025

test/sqlite/sqllogic_test_logger.cpp

test/helpers/test_helpers.cpp (outdated)

test/sqlite/sqllogic_test_logger.cpp (outdated)

test/sqlite/sqllogic_test_logger.cpp (outdated)

hmeriann added 5 commits June 11, 2025 16:33

cleaning up

6c90c9c

Format UnexpectedFailure message like all other failures

042a995

ff

da446ef

SafeAppend() => AppendFailure()

2481656

remove commented code

f4fd435

duckdb-draftbot marked this pull request as draft June 11, 2025 17:17

hmeriann added 3 commits June 12, 2025 09:20

Merge branch 'main' into unittester-summary

10fc1f0

no duplicating headers

ff188e0

add SUMMARIZE_FAILURES as step's env

d85d3dd

hmeriann marked this pull request as ready for review June 12, 2025 15:17

hmeriann requested a review from Mytherin June 13, 2025 07:51

Mytherin reviewed Jun 13, 2025

duckdb-draftbot marked this pull request as draft June 13, 2025 11:34

hmeriann added 3 commits June 15, 2025 09:56

Clean up and organise the Failures Summary static variables into a class

4a5f30c

strategy fail-fast applies to matrix, but we don't have a matrix in C…

a860301

…ode Coverage job

FailureSeummary as a class

46df878

hmeriann force-pushed the unittester-summary branch from 2d3d0f7 to 46df878 Compare June 15, 2025 07:56

hmeriann marked this pull request as ready for review June 15, 2025 07:57

Mytherin merged commit 48bcb44 into duckdb:main Jun 17, 2025
66 of 71 checks passed
