8000 Fix panic when applying v2 updates in yrs by aligning Skip encoding with yjs by catssay · Pull Request #545 · y-crdt/y-crdt · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix panic when applying v2 updates in yrs by aligning Skip encoding with yjs #545

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

catssay
Copy link
@catssay catssay commented Apr 25, 2025

Hey, thanks for the great work on Yjs — we’ve been able to add real-time collaboration to our product because of it!

We’ve recently been trying to migrate our Yjs server over to yrs. During testing, we hit a panic when loading updates into YDoc. After digging in, it looks like yrs isn’t able to decode updates created with encode_state_as_update_v2.

I spent a few hours comparing things between Yjs and Yrs and found that the encoding of Skip differs. After tweaking the code to match Yjs’s behavior, the panic was gone.

I’ve attached the update file that was causing the issue (encoded withencode_state_as_update_v1 and write it with std::fs::write), in case it helps.

And maybe we can calcuate item's length directly, since offset and len share the same semantics (due to item.len = offset). The change leads to a significantly performance improvement for large content. In our case the original version took around 60 seconds to run apply_update, while the modified version completes it in just 4 seconds. But I'm not 100% sure if the change is fully aligned with the intended design.

And fair warning — I’m pretty new to Rust and still learning my way around Yjs/Yrs, I don't known how to write the tests, so let me know if anything looks off.

// src/structs/Skip.js
export class Skip extends AbstractStruct {
    //...
    write (encoder, offset) {
        encoder.writeInfo(structSkipRefNumber)
        // write as VarUint because Skips can't make use of predictable length-encoding
        encoding.writeVarUint(encoder.restEncoder, this.length - offset)
      }
    // ...
 }

@catssay
Copy link
Author
catssay commented Apr 25, 2025

Hey, thanks for the great work on Yjs — we’ve been able to add real-time collaboration to our product because of it!

We’ve recently been trying to migrate our Yjs server over to yrs. During testing, we hit a panic when loading updates into YDoc. After digging in, it looks like yrs isn’t able to decode updates created with encode_state_as_update_v2.

I spent a few hours comparing things between Yjs and Yrs and found that the encoding of Skip differs. After tweaking the code to match Yjs’s behavior, the panic was gone.

I’ve attached the update file that was causing the issue (encoded withencode_state_as_update_v1 and write it with std::fs::write), in case it helps.

Also, fair warning — I’m pretty new to Rust and still learning my way around Yjs/Yrs, I don't known how to write the tests, so let me know if anything looks off.

The binary is not full matched, there is still some bytes not matched, but encoding and decoding with v2 are fine.

After further debugging, it turns out that keys in hasmap not in order, that's ok.

@catssay catssay changed the title align Skip encoding with Yjs format Fix panic when applying V2 updates in Yrs by aligning Skip encoding with Yjs Apr 25, 2025
@catssay catssay changed the title Fix panic when applying V2 updates in Yrs by aligning Skip encoding with Yjs Fix panic when applying v2 updates in yrs by aligning Skip encoding with yjs Apr 25, 2025
@Horusiath
Copy link
Collaborator

Thanks for your fix. Are you able to provide a test case that shows, it indeed solves the issue?

< 8000 /div>
@catssay
Copy link
Author
catssay commented Apr 28, 2025

Thanks for your fix. Are you able to provide a test case that shows, it indeed solves the issue?

Hey @Horusiath , thanks for your reply.

I'm not very familiar with Yjs and Yrs yet, so I'm not sure how to properly reproduce the case with code tests. But I dumped a v1 update that can consistently cause decode_v2 panic after encoding it with encode_v2.

The update is a bit large. Hopefully, I'll figure out how to reproduce it with a smaller update later and with code tests.

state_v1.bin.zip

reproduce with:

use std::fs;
use yrs::{
    updates::{decoder::Decode, encoder::Encode},
    Update,
};

fn main() {
    let v1_update = fs::read("./state_v1.bin").unwrap();
    let v1_decoded = Update::decode_v1(&v1_update).unwrap();
    let v2_update = Update::encode_v2(&v1_decoded);
    let _ = Update::decode_v2(&v2_update).unwrap();
}

will output:

thread 'main' panicked at src/main.rs:11:43:
called `Result::unwrap()` on an `Err` value: UnexpectedValue
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Copy link
Collaborator
@Horusiath Horusiath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like your fix is ok, but apparently this fix seems to be a one liner.

@@ -944,7 +944,8 @@ impl BlockCarrier {
}
BlockCarrier::Skip(x) => {
encoder.write_info(BLOCK_SKIP_REF_NUMBER);
encoder.write_len(x.len - offset);
// write as VarUint because Skips can't make use of predictable length-encoding
encoder.write_var(x.len - offset);
Copy link
Collaborator
@Horusiath Horusiath May 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested your example, and as it turns out this line change alone is enough to make it pass - the change above doesn't seem to affect results. I've also prepared artificial test with skip blocks inside and it also passed with only this line present.

This test is basically enough to reproduce the issue:

    #[test]
    fn update_v2_with_skips() {
        let u1 = Update {
            blocks: UpdateBlocks {
                clients: HashMap::from_iter([(
                    1,
                    VecDeque::from_iter([
                        BlockCarrier::Item(
                            Item::new(
                                ID::new(1, 0),
                                None,
                                None,
                                None,
                                None,
                                TypePtr::Named("test".into()),
                                None,
                                ItemContent::String("hello".into()),
                            )
                            .unwrap(),
                        ),
                        BlockCarrier::Skip(BlockRange::new(ID::new(1, 5), 3)),
                        BlockCarrier::Item(
                            Item::new(
                                ID::new(1, 8),
                                None,
                                Some(ID::new(1, 7)),
                                None,
                                None,
                                TypePtr::Unknown,
                                None,
                                ItemContent::String("world".into()),
                            )
                            .unwrap(),
                        ),
                    ]),
                )]),
            },
            delete_set: DeleteSet::default(),
        };
        let encoded = u1.encode_v2();
        let u2 = Update::decode_v2(&encoded).unwrap();
        assert_eq!(u1, u2);
    }

let content = item.content.splice(offset as usize, encoding).unwrap();
item.len = offset;
let mut new = Box::new(Item {
id: ID::new(client, clock + offset),
len: content.len(OffsetKind::Utf16),
len: len - offset,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this change is necessary whatsoever.

Copy link
@okxiaoliang4 okxiaoliang4 May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this change is necessary whatsoever.

@Horusiath This change is for performance. Avoid multiple calculations of string length to affect efficiency.

In one of our data update sets

macmini m4 32gb

before:
apply_update 86s.
after:
apply_update 55ms.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be happy to add this change then. Can you rebase your changes please?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. @catssay will handle this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Horusiath and @okxiaoliang4 for your reply, I've rebased changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0