decodecorpus #1664

ephiepark · 2019-06-27T22:10:45Z

The decoder was enforcing a constraint stricter than the spec: it required the sequences section of a compressed block to be larger than the minimum the spec actually allows. When a block that uses RLE mode for all three fields (literal length, match length, and offset) is followed by a block that uses repeat mode for all three, the second block's sequences section can be as small as 2 bytes (1 byte for the number of sequences and 1 byte holding the compression modes of the three fields).
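To make the 2-byte case concrete, here is a minimal sketch of parsing a sequences section header, based on my reading of the Zstandard format spec (Number_of_Sequences followed by the symbol compression modes byte). It is not the code in the zstd decoder; `SeqHeader` and `parseSeqHeader` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Symbol compression modes, numbered as in the Zstandard format spec. */
enum { MODE_PREDEFINED = 0, MODE_RLE = 1, MODE_FSE = 2, MODE_REPEAT = 3 };

/* Hypothetical struct for this sketch; not a zstd type. */
typedef struct {
    unsigned nbSeq;
    unsigned llMode;   /* literal lengths mode */
    unsigned ofMode;   /* offsets mode */
    unsigned mlMode;   /* match lengths mode */
} SeqHeader;

/* Parses Number_of_Sequences and the modes byte.
 * Returns the number of header bytes consumed, or 0 if the input is
 * truncated or the reserved bits are set. */
static size_t parseSeqHeader(const uint8_t* src, size_t size, SeqHeader* out)
{
    size_t pos = 0;
    unsigned b0;
    assert(src != NULL && out != NULL);
    if (size < 1) return 0;

    b0 = src[pos++];
    if (b0 < 128) {                       /* 1-byte count */
        out->nbSeq = b0;
    } else if (b0 < 255) {                /* 2-byte count */
        if (size < 2) return 0;
        out->nbSeq = ((b0 - 128) << 8) + src[pos++];
    } else {                              /* 3-byte count */
        if (size < 3) return 0;
        out->nbSeq = (unsigned)src[1] + ((unsigned)src[2] << 8) + 0x7F00;
        pos = 3;
    }

    out->llMode = out->ofMode = out->mlMode = MODE_PREDEFINED;
    if (out->nbSeq == 0) return pos;      /* no sequences: section ends here */

    if (pos >= size) return 0;            /* modes byte is missing */
    {   unsigned const modes = src[pos++];
        if (modes & 3) return 0;          /* low 2 bits are reserved */
        out->llMode = (modes >> 6) & 3;
        out->ofMode = (modes >> 4) & 3;
        out->mlMode = (modes >> 2) & 3;
    }
    return pos;
}
```

With the bytes `{ 0x01, 0xFC }` this sketch reports one sequence, repeat mode for all three fields, and 2 bytes consumed. Since repeat mode carries no table description of its own, nothing else in the header has to follow, which is why a fixed larger minimum size is too strict.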

This pull request includes 2 changes:

Fix the decoder code to remove the constraint.
Update the decodecorpus code to generate repeat mode after RLE mode.

Run ./decodecorpus -ptestfiles -otestfiles -n1 -s1236947365 -v to generate an input that has an issue before the fix.

Decodecorpus

terrelln

Looks reasonable to me, but I'll let Yann take a look.

How probable is it that decodecorpus generates a block that triggers this bug? Starting on a random seed, approximately how long does it take to generate an offending block?

Add test case for short bistream

ephiepark · 2019-06-28T01:12:55Z

@terrelln
It takes a few minutes. I would say less than 10 minutes. There are cases where the sequences section is 4 bytes, which does cause issues. But I don't think I have ever seen a sequences section of <= 3 bytes so far. I added a unit test to cover that, though.

Cyan4973 · 2019-06-28T17:13:29Z

A targeted test case for an extreme outlier,
and a generic test case generator which reaches the target issue in a reasonable time frame,
it's the right mix I believe.

Cyan4973 · 2019-06-28T17:17:31Z

tests/zstreamtest.c

+            CHECK_Z( ZSTD_decompressStream(zds, &outBuff, &inBuff) );
+        }
+
+        {   XXH64_state_t xxhStateIn, xxhStateOut;


nit:
if all you want to do is compare decompressed and decodedBuffer content and ensure they are equal, you could simply memcmp() them.
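A minimal sketch of the `memcmp()` approach suggested here, assuming both buffers and their sizes are at hand. `buffersEqual` is a hypothetical helper, not something in zstreamtest.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 when both buffers have the same length
 * and identical bytes, 0 otherwise. */
static int buffersEqual(const void* a, size_t aSize, const void* b, size_t bSize)
{
    assert(a != NULL && b != NULL);
    if (aSize != bSize) return 0;     /* different lengths can never match */
    return memcmp(a, b, aSize) == 0;  /* byte-by-byte comparison */
}
```

Comparing the sizes first matters: `memcmp()` on the shorter length alone would accept a truncated output as equal.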

Cyan4973 · 2019-06-28T17:19:10Z

tests/zstreamtest.c

+        outBuff.pos = 0;
+
+        while (inBuff.pos < inBuff.size) {
+            CHECK_Z( ZSTD_decompressStream(zds, &outBuff, &inBuff) );


nit: this check could be slightly stronger than CHECK_Z():
presuming the frame is complete (is it?), you want a return value of 0, to indicate that the decoder has reached the end of the frame. You also want to ensure it has consumed the entire input (inBuff.pos == inBuff.size).

Yes, it's a complete frame.
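The stronger check discussed above boils down to two conditions on the state after the last `ZSTD_decompressStream()` call. A minimal sketch, with the condition factored into a hypothetical helper (`frameFullyDecoded` is not part of zstd or zstreamtest.c) so that it does not depend on the library:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for this sketch.
 * lastRet : value returned by the last ZSTD_decompressStream() call
 *           (0 means the frame is completely decoded and flushed)
 * inPos   : inBuff.pos after that call
 * inSize  : inBuff.size
 * Returns 1 only if the frame ended AND all input was consumed. */
static int frameFullyDecoded(size_t lastRet, size_t inPos, size_t inSize)
{
    assert(inPos <= inSize);
    return lastRet == 0 && inPos == inSize;
}
```

In the test, this would be evaluated once after the decompression loop, passing the last return value together with `inBuff.pos` and `inBuff.size`. A non-zero return that is not an error only says the decoder expects more input, so checking for errors alone would miss a frame that never terminates.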

Cyan4973 · 2019-06-28T17:19:58Z

Looks good !
Just some minor comments on the unit test case.

reflect code review comments

ephiepark and others added 3 commits June 26, 2019 16:39

enable repeat mode on rle

734eff7

Fix a constraint stricter than the spec

c7c1ba3

Merge pull request #5 from ephiepark/decodecorpus

36d0bc2

Decodecorpus

facebook-github-bot added the CLA Signed label Jun 27, 2019

terrelln reviewed Jun 27, 2019

ephiepark and others added 2 commits June 27, 2019 17:37

Add test case for short bistream

01e8384

Merge pull request #6 from ephiepark/decodecorpus

9f4d71f

Add test case for short bistream

Cyan4973 reviewed Jun 28, 2019

ephiepark and others added 2 commits July 1, 2019 10:17

reflect code review comments

2830952

Merge pull request #7 from ephiepark/decodecorpus

01f5b5d

reflect code review comments

Cyan4973 approved these changes Jul 1, 2019

Cyan4973 merged commit 4d611ca into facebook:dev Jul 1, 2019

felixhandte mentioned this pull request Jul 19, 2019

Merge v1.4.1 to Master #1691

Merged
