-
Notifications
You must be signed in to change notification settings - Fork 831
Implement FromStr for OutPoint #177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7da94d8
to
d016269
Compare
src/blockdata/transaction.rs
Outdated
type Err = ParseOutPointError; | ||
|
||
fn from_str(s: &str) -> Result<Self, Self::Err> { | ||
let split: Vec<&str> = s.split(":").collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid this allocation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Fixed!
d016269
to
f2cebee
Compare
|
||
fn from_str(s: &str) -> Result<Self, Self::Err> { | ||
let find = s.find(':'); | ||
if find == None || find != s.rfind(':') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creative :).
Can we add a check that the total length does not exceed 75 (64 bytes txid plus one :
plus 10 bytes decimal number) before calling either find
? Then the total performance is bounded even for giant pathological inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the actual max limit of the index? Probably u32, but bounded by the 4 MW block limit and min output size :D Messy. 10 digits seems reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in extra commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 digits covers the whole u32
range. I don't think we need to be more clever than that.
src/blockdata/transaction.rs
Outdated
let colon = find.unwrap(); | ||
Ok(OutPoint { | ||
txid: Sha256dHash::from_hex(&s[..colon]).map_err(ParseOutPointError::Txid)?, | ||
vout: s[colon + 1..].parse().map_err(ParseOutPointError::Vout)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really prefer that we not accept leading 0s here, but it seems like we'd have to write our own decimal parser to get that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be difficult to just check [colon + 1..]
for leading zeros and return error if detected, then parse normally.
fn ensure_no_leading_zeros(s: &str) -> Result<&str> {
//
}
ensure_no_loading_zeros(s[colon + 1..])?.parse().map_err(ParseOutPointError::Vout)?
Out of curiosity, what is the rationale here, as opposed to the usual "be flexible in what you accept, and strict in what you produce"? Limiting the exploitation surface in case of something else went wrong elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main rationale is that it makes it much easier to write a fuzzer that tries to round-trip things. (OutPoint
s don't quite round-trip because we parse mixed-case hashes while we only serialize in lowercase, but I'm willing to work around that.)
This isn't so important for outpoints, but for many things in Bitcoin, unambiguous encodings are really important. In particular, anything that might be encoded in a transaction witness (or otherwise in an unsigned part of a transaction) could be modified on the wire if its data admits multiple encodings. This could change the length of the transaction, lowering its feerate, and therefore its priority when being mined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, from a software review point of view, I've found a lot of bugs in things (e.g. PSBT :)) by writing a round-trip fuzzer and making sure that every failure to roundtrip had a good reason behind it. This gets easier the fewer ambiguities there are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in extra commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really put it in a standalone function as I described in a comment abvoe. This way it's self-documenting, and you can write a quick
#[test]
fn leading_zero_detection() {
... // couple of asserts proving it works
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem I see with that is that it could be misinterpreted as also checking for a number. I mean because we do uint parsing after, arbitrary non-single-character strings will pass. I think it's strange that has_leading_zeroes("lol")
returns false while for "0lol"
it returns true. You understand my concern?
There is an additional test case in test_outpoint
if that's your concern..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Then just make fn parse_with_no_leading_zeros(&str) -> Result<u64>
, I guess. Parsing a number without accepting leader zeroes is just general enough that it can be named, factored out, increasing clarity and improving testing (you can test a smaller, self-contained function).
Edit: fn parse_with_no_leading_zeros<T>(&str) -> Result<T>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function also needs to reject +
; probably it should just enforce that the string is nonempty and its first character is a digit.
If this can go in and needs a squash, let me know. |
No need to squash. But it's still not working :). It turns out that For reference the fuzz function I'm using is
|
4083906
to
fefd5d4
Compare
Please review. I created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
descriptor: Accept a generic Secp256k1 ctx instead of just `SignOnly`
I'm sorry for the many lines of error code :)