This issue describes improvements in the algorithm Allure 3 uses to read the input data. The currently implemented algorithm is the following:
1. If the directory is an XCTest result bundle, pass it to the xcresult reader and finish.
2. Otherwise, for each file in the directory, do:
   1. For each reader in order [allure1, allure2, cucumberjson, junitxml], do:
      1. Pass the file to the reader.
      2. If the reader returns true, go to the next file in step 2.
      3. Otherwise, go to the next reader in step 2.1.
   2. Treat the file as an attachment and go to the next file in step 2.
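The loop above can be sketched as follows. This is only an illustration of the control flow: the reader predicates are stand-ins, not the real Allure 3 implementations.

```typescript
// Hypothetical sketch of the current reading loop. The naming-convention
// checks below are illustrative stand-ins for the real readers.
type Reader = (file: string) => boolean; // returns true if the file was consumed

const readers: [string, Reader][] = [
  ["allure1", (f) => f.endsWith("-testsuite.xml")],
  ["allure2", (f) => f.endsWith("-result.json") || f.includes("-attachment")],
  ["cucumberjson", (f) => f.endsWith(".json")],
  ["junitxml", (f) => f.endsWith(".xml")],
];

function readDirectory(files: string[]): Map<string, string> {
  const consumedBy = new Map<string, string>();
  for (const file of files) {                 // step 2: for each file
    let consumed = false;
    for (const [name, reader] of readers) {   // step 2.1: try readers in order
      if (reader(file)) {
        consumedBy.set(file, name);
        consumed = true;
        break;                                // consumed: go to the next file
      }
    }
    // step 2.2: no reader consumed the file, so treat it as an attachment
    if (!consumed) consumedBy.set(file, "attachment");
  }
  return consumedBy;
}
```

Note how the fallback in the last step means any unrecognized file silently becomes an attachment, which is the root of Issue 1 below.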
This algorithm is simple and well-balanced for the currently implemented set of features. However, some potential issues may become severe in the future.
The issues
Allure 3 requires all input files to be put in a single directory. For example, if you have a Cucumber JSON test result file and two Allure result files, you should merge them into a single directory before running Allure 3:
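A merged layout might look like this (file names are illustrative):

```
merged-results/
├── cucumber-report.json      Cucumber JSON test results
├── 1fc3-result.json          Allure result
└── 9ab2-result.json          Allure result
```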
That is not how users typically structure their test results. Usually, the results of each type go separately.
Also, all files are treated equally and go through the same reading process, which may lead to issues described in the sections below.
Issue 1: attachments misidentification
Generally, a file in the input directory is either an attachment or a test result file (a file with metadata about one or more test results). We rely on a naming convention checked by the allure2 reader: if a file name contains -attachment (optionally followed by an extension), the reader considers it an attachment, and the rest of the readers won't be run on that file.
Otherwise, all readers will be executed. If none consume the file, it will be treated as an attachment.
In some cases, those rules aren't enough.
Attachments not following the naming convention
Since we neither document nor enforce the attachment naming convention, it's possible some 3rd party integrations won't follow it. For example, the junitxml reader will happily consume an attachment named results.xml with the following content:
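For illustration, here is a hypothetical attachment that happens to be valid JUnit XML:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- An attachment produced during a test run; not a report of this run. -->
<testsuite name="data captured during the test" tests="1">
  <testcase name="looks like a real test case to the junitxml reader"/>
</testsuite>
```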
That will manifest as a test result with a missing attachment.
The issue will become more frequent as more readers with external attachments are added. That's because attachments added from, say, JUnit XML are much less likely to follow the Allure naming conventions.
All of the above applies when the attachment content matches a format some reader accepts. Our plan to support archives (.zip, .tar.gz, etc.) complicates things further. For example, we won't be able to distinguish between a .zip attachment and an edge case of an Allure results archive that doesn't contain result files.
Test result files matching the naming convention
In this case, a valid test result file that some reader could've consumed gets consumed by allure2 as an attachment instead.
For example, a Cucumber JSON file named test-results-attachment.json will result in an empty report. Such a file can be an email attachment, or -attachment can be part of the system-under-test name (imagine something like test-results.smart-attachment.cucumber.json).
Issue 2: negative performance impact
The currently implemented readers filter their input files:
allure1: by its naming convention;
allure2: by its naming conventions;
cucumberjson: by the .json extension;
junitxml: by the .xml extension.
The whole sequence works well because the filters mostly don't overlap.
It doesn't scale well, though:
In the future, we may implement readers for other JSON- and XML-based test result formats (NUnit XML, xUnit XML, TRX, etc.), which may lead to parsing some files more than once.
Once we implement features like Support attachments via stdout in the JUnit XML reader #80, readers that filter files only by extension will start parsing attachments with the same extensions. That will hurt performance if such attachments are large or numerous.
The above point becomes even more severe once we support archives, as archives have to be decompressed first.
In short, if we keep the current approach, we will apply more readers than we should.
Issue 3: live updates are more challenging
The current implementation of the watch mode works by detecting new files in the directory and feeding them to the readers. That works well for Allure results but allows results in other formats to be loaded only once per session (unless the user uses a new file name each time).
While detecting existing files that have changed and are ready to be reread is a more complex task than detecting new files, we might still want to support this in the future. In that case, it will be necessary to distinguish between watch modes:
incremental mode at the directory level (detect new files in a directory)
incremental mode at the file level (detect new content added to an existing file)
batch mode at the file level (reread a file once it's updated)
batch mode at the directory level (reread a directory once its files are updated; tricky but might be doable)
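The four modes above decompose into a strategy and a scope, which can be encoded with a small type. This matches the "strategy, scope-scoped" notation used later in this document; the names are illustrative, not an actual Allure 3 API.

```typescript
// Illustrative encoding of the four watch modes discussed above.
type Strategy = "incremental" | "batched";
type Scope = "file" | "directory";

interface WatchMode {
  strategy: Strategy; // incremental: react to new data; batched: reread on update
  scope: Scope;       // file: a single file; directory: all files in a directory
}

// Renders a mode in the "strategy, scope-scoped" notation.
function describeMode(mode: WatchMode): string {
  return `${mode.strategy}, ${mode.scope}-scoped`;
}
```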
With the current implementation, the only way to do that is to either watch all files without exception or track which reader has consumed each file and switch to the batch mode accordingly. Both options make the watch mode more complex.
Issue 4: can't go beyond the file system
We would like to unlock the ability to implement loading data from sources other than the local file system. A couple of examples:
A URL of a file resource.
Base64-encoded data.
While not a top priority, it would be convenient if the reading pipeline and the CLI supported such options. An obvious choice is to accept URIs in addition to local paths (which could be shorthands for file: URIs).
Proposal
It's more natural for users to organize test results produced by different means separately. The proposal is to make allure generate accept an arbitrary number of URIs, each representing a source of test results data:
allure generate <source URI 1> <source URI 2> ... <source URI N>
Such usage will allow users to select the sources they want in the report in an arbitrary combination.
Note
We should also accept relative and absolute paths (strings like ./target/test-results/allure-results) and treat them as file: URIs. We may support wildcards in such shortcuts to make them less verbose, so the above command could also be written as allure generate ./target/test-results/*.
Each URI represents a source of test results. We can now execute different reading strategies based on the source type.
A new reading pipeline
A new format of arguments implies a new reading process in place of the current one. The intention is to feed the right readers with the correct input.
Let's call such input for a reader a reading unit. I'll use this term in this text to highlight that it doesn't have to be a file. A reading unit is a piece of data that encodes test result information (test results, containers, embedded attachments, attachment links, and other metadata) in an arbitrary combination in some format.
The process won't collect file attachments referenced by test result files anymore (because such attachments shouldn't be listed in source URIs). For example, that includes attachments referenced by a JUnit XML file (see #80). Visiting such attachments should become the job of readers, which means the process should set up an implementation of the visit function that suits the reading unit.
The process must also configure the watching by setting a supported mode and notifying the watcher of what to do if an update is detected.
With all that in mind, let's define the process as a multi-staged pipeline:
Stage 1: detect the type of a source URI
Stage 2: based on the URI type, prepare the following:
the reading units
the sequence of readers (possibly on a per-unit basis)
the attachment resolver
the watch mode
Stage 3: pass the reading units to the correct readers
Stage 4: complete the watcher configuration for the source
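One possible shape for the per-category configuration produced by stage 2 is sketched below. All names here are assumptions for illustration, not the actual Allure 3 API; reading units are simplified to URI strings.

```typescript
// Hypothetical per-category handler covering the four pipeline stages.
interface SourceHandler {
  matches(uri: string): boolean;        // stage 1: does this category apply?
  readingUnits(uri: string): string[];  // stage 2: what to read (here: URIs)
  readersFor(unit: string): string[];   // stage 3: readers per reading unit
  watchMode: string;                    // stage 4: how to watch this source
}

// Example: a handler for single test result files, choosing readers by extension.
const testResultFile: SourceHandler = {
  matches: (uri) => /\.(xml|json|trx|jsonl)$/.test(uri),
  readingUnits: (uri) => [uri],
  readersFor: (unit) => (unit.endsWith(".xml") ? ["junitxml"] : ["cucumberjson"]),
  watchMode: "batched, file-scoped",
};
```

The "Source categories" section below describes how such aspects would differ per category.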
Source categories
A source URI can be categorized without looking at the content of the reading units. Examples of categories are:
Test result files
Directories
macOS bundles
Archives
URLs
Base64-encoded data
etc.
Each category then defines the following aspects of the reading pipeline:
How to get the reading units?
What readers to use?
How to resolve external attachments?
What mode to use to watch the results?
What to do when the watcher detects a change?
Let's briefly describe those characteristics for some categories.
Test result files
A test result file URI points to a file in the file system. Examples include .xml, .json, .trx, and .jsonl (see #79) files. Those URIs never point to attachments, which prevents some issues described at the beginning of this document.
The characteristics of test result file sources:
reading unit: the file's content
reader: based on the extension
attachment resolution: relative to the file's directory
watch mode: batched, file-scoped; .jsonl files might support incremental, file-scoped
watcher's action: feed the read content to the reader
Directories
A directory URI points to a directory that may contain multiple files. We may further classify directories by checking their file names against well-known naming conventions to narrow the number of readers to try.
In most cases, such a directory would be an Allure results directory. However, it could also be a directory containing unmerged JUnit XML results or results in some other format.
The characteristics of directory sources:
reading units: the content of the files inside the directory
readers: allure1, allure2, or junitxml, depending on the naming convention of the contained files
attachment resolution: relative to the directory
watch mode: incremental, directory-scoped for the allure1 and allure2 readers; batched, directory-scoped for junitxml
watcher's action: feed a new (or, for junitxml, updated) file to the reader
XCTest result bundles
XCTest result bundles are a special case of directories that should be handled separately. They require special tooling, which is only available on macOS. They're also poorly compatible with the watch mode because a bundle is a single unit, even though its files might be updated at different moments.
We have a function that takes a path to a bundle and a visitor object. This function can be viewed as a reader in all regards.
reading unit: the path to the bundle (it's read via xcresulttool, which requires a path)
reader: xcresult
attachment resolution: not needed (no external files are referenced)
watch mode: none (or batched, directory-scoped, but it's tricky)
watcher's action: run the reader after the bundle is fully updated
Archives
Files in an archive need to be decompressed first. How to treat them is yet to be decided. One option is to consider all top-level items as separate sources and pass them through the reading pipeline (possibly with some limitations).
Another option is to associate the entire archive with some source type. How to figure out which type to use is an open question. It could be based on naming conventions or require some user input.
reading units: determined by the content of the archive
readers: determined by the content of the archive
attachment resolution: relative to the archive root or a directory inside it
watch mode: batched, file-scoped
watcher's action: decompress the content and apply the readers again
Note
Since we haven't implemented archive sources yet, it's reasonable to extract this section into a new issue and implement it separately.