-
Notifications
You must be signed in to change notification settings - Fork 96
Fix panic when applying v2 updates in yrs by aligning Skip encoding with yjs #545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The binary is not full matched, there is still some bytes not matched, but encoding and decoding with v2 are fine. After further debugging, it turns out that keys in hasmap not in order, that's ok. |
Thanks for your fix. Are you able to provide a test case that shows, it indeed solves the issue? |
Hey @Horusiath , thanks for your reply. I'm not very familiar with Yjs and Yrs yet, so I'm not sure how to properly reproduce the case with code tests. But I dumped a v1 update that can consistently cause decode_v2 panic after encoding it with encode_v2. The update is a bit large. Hopefully, I'll figure out how to reproduce it with a smaller update later and with code tests. reproduce with: use std::fs;
use yrs::{
updates::{decoder::Decode, encoder::Encode},
Update,
};
fn main() {
let v1_update = fs::read("./state_v1.bin").unwrap();
let v1_decoded = Update::decode_v1(&v1_update).unwrap();
let v2_update = Update::encode_v2(&v1_decoded);
let _ = Update::decode_v2(&v2_update).unwrap();
} will output: thread 'main' panicked at src/main.rs:11:43:
called `Result::unwrap()` on an `Err` value: UnexpectedValue
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like your fix is ok, but apparently this fix seems to be a one liner.
yrs/src/update.rs
Outdated
@@ -944,7 +944,8 @@ impl BlockCarrier { | |||
} | |||
BlockCarrier::Skip(x) => { | |||
encoder.write_info(BLOCK_SKIP_REF_NUMBER); | |||
encoder.write_len(x.len - offset); | |||
// write as VarUint because Skips can't make use of predictable length-encoding | |||
encoder.write_var(x.len - offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested your example, and as it turns out this line change alone is enough to make it pass - the change above doesn't seem to affect results. I've also prepared artificial test with skip blocks inside and it also passed with only this line present.
This test is basically enough to reproduce the issue:
#[test]
fn update_v2_with_skips() {
let u1 = Update {
blocks: UpdateBlocks {
clients: HashMap::from_iter([(
1,
VecDeque::from_iter([
BlockCarrier::Item(
Item::new(
ID::new(1, 0),
None,
None,
None,
None,
TypePtr::Named("test".into()),
None,
ItemContent::String("hello".into()),
)
.unwrap(),
),
BlockCarrier::Skip(BlockRange::new(ID::new(1, 5), 3)),
BlockCarrier::Item(
Item::new(
ID::new(1, 8),
None,
Some(ID::new(1, 7)),
None,
None,
TypePtr::Unknown,
None,
ItemContent::String("world".into()),
)
.unwrap(),
),
]),
)]),
},
delete_set: DeleteSet::default(),
};
let encoded = u1.encode_v2();
let u2 = Update::decode_v2(&encoded).unwrap();
assert_eq!(u1, u2);
}
let content = item.content.splice(offset as usize, encoding).unwrap(); | ||
item.len = offset; | ||
let mut new = Box::new(Item { | ||
id: ID::new(client, clock + offset), | ||
len: content.len(OffsetKind::Utf16), | ||
len: len - offset, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this change is necessary whatsoever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this change is necessary whatsoever.
@Horusiath This change is for performance. Avoid multiple calculations of string length to affect efficiency.
In one of our data update sets
macmini m4 32gb
before:
apply_update 86s.
after:
apply_update 55ms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be happy to add this change then. Can you rebase your changes please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. @catssay will handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Horusiath and @okxiaoliang4 for your reply, I've rebased changes.
Hey, thanks for the great work on Yjs — we’ve been able to add real-time collaboration to our product because of it!
We’ve recently been trying to migrate our Yjs server over to yrs. During testing, we hit a panic when loading updates into YDoc. After digging in, it looks like yrs isn’t able to decode updates created with encode_state_as_update_v2.
I spent a few hours comparing things between Yjs and Yrs and found that the encoding of Skip differs. After tweaking the code to match Yjs’s behavior, the panic was gone.
I’ve attached the update file that was causing the issue (encoded with
encode_state_as_update_v1
and write it withstd::fs::write
), in case it helps.And maybe we can calcuate item's length directly, since
offset
andlen
share the same semantics (due toitem.len = offset
). The change leads to a significantly performance improvement for large content. In our case the original version took around 60 seconds to runapply_update
, while the modified version completes it in just 4 seconds. But I'm not 100% sure if the change is fully aligned with the intended design.And fair warning — I’m pretty new to Rust and still learning my way around Yjs/Yrs, I don't known how to write the tests, so let me know if anything looks off.