8000 Implement FromStr for OutPoint by stevenroose · Pull Request #177 · rust-bitcoin/rust-bitcoin · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Implement FromStr for OutPoint #177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 18, 2018

Conversation

stevenroose
Copy link
Collaborator

I'm sorry for the many lines of error code :)

type Err = ParseOutPointError;

fn from_str(s: &str) -> Result<Self, Self::Err> {
let split: Vec<&str> = s.split(":").collect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid this allocation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Fixed!


fn from_str(s: &str) -> Result<Self, Self::Err> {
let find = s.find(':');
if find == None || find != s.rfind(':') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creative :).

Can we add a check that the total length does not exceed 75 (64 bytes txid plus one : plus 10 bytes decimal number) before calling either find? Then the total performance is bounded even for giant pathological inputs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the actual max limit of the index? Probably u32, but bounded by the 4 MW block limit and min output size :D Messy. 10 digits seems reasonable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in extra commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10 digits covers the whole u32 range. I don't think we need to be more clever than that.

let colon = find.unwrap();
Ok(OutPoint {
txid: Sha256dHash::from_hex(&s[..colon]).map_err(ParseOutPointError::Txid)?,
vout: s[colon + 1..].parse().map_err(ParseOutPointError::Vout)?,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really prefer that we not accept leading 0s here, but it seems like we'd have to write our own decimal parser to get that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be difficult to just check [colon + 1..] for leading zeros and return error if detected, then parse normally.

fn ensure_no_leading_zeros(s: &str) -> Result<&str> {
   //
}

ensure_no_loading_zeros(s[colon + 1..])?.parse().map_err(ParseOutPointError::Vout)?

Out of curiosity, what is the rationale here, as opposed to the usual "be flexible in what you accept, and strict in what you produce"? Limiting the exploitation surface in case of something else went wrong elsewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main rationale is that it makes it much easier to write a fuzzer that tries to round-trip things. (OutPoints don't quite round-trip because we parse mixed-case hashes while we only serialize in lowercase, but I'm willing to work around that.)

This isn't so important for outpoints, but for many things in Bitcoin, unambiguous encodings are really important. In particular, anything that might be encoded in a transaction witness (or otherwise in an unsigned part of a transaction) could be modified on the wire if its data admits multiple encodings. This could change the length of the transaction, lowering its feerate, and therefore its priority when being mined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, from a software review point of view, I've found a lot of bugs in things (e.g. PSBT :)) by writing a round-trip fuzzer and making sure that every failure to roundtrip had a good reason behind it. This gets easier the fewer ambiguities there are.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in extra commit.

Copy link
Contributor
@dpc dpc Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really put it in a standalone function as I described in a comment abvoe. This way it's self-documenting, and you can write a quick

#[test]
fn leading_zero_detection() {
... // couple of asserts proving it works
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I see with that is that it could be misinterpreted as also checking for a number. I mean because we do uint parsing after, arbitrary non-single-character strings will pass. I think it's strange that has_leading_zeroes("lol") returns false while for "0lol" it returns true. You understand my concern?

There is an additional test case in test_outpoint if that's your concern..

Copy link
Contributor
@dpc dpc Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Then just make fn parse_with_no_leading_zeros(&str) -> Result<u64>, I guess. Parsing a number without accepting leader zeroes is just general enough that it can be named, factored out, increasing clarity and improving testing (you can test a smaller, self-contained function).

Edit: fn parse_with_no_leading_zeros<T>(&str) -> Result<T>?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function also needs to reject +; probably it should just enforce that the string is nonempty and its first character is a digit.

@stevenroose
Copy link
Collaborator Author

If this can go in and needs a squash, let me know.

@apoelstra
Copy link
Member

No need to squash. But it's still not working :). It turns out that u32::parse() will also accept leading + signs, which you also need to eliminate.

For reference the fuzz function I'm using is

use bitcoin::blockdata::transaction::OutPoint;
use bitcoin::consensus::encode;

use std::str::FromStr;

fn do_test(data: &[u8]) {
    let lowercase: Vec<u8> = data.iter().map(|c| match *c {
        b'A' => b'a',
        b'B' => b'b',
        b'C' => b'c',
        b'D' => b'd',
        b'E' => b'e',
        b'F' => b'f',
        x => x
    }).collect();
    let data_str = match String::from_utf8(lowercase) {
        Err(_) => return,
        Ok(s) => s,
    };
    match OutPoint::from_str(&data_str) {
        Ok(op) => {
            assert_eq!(op.to_string().as_bytes(), data_str.as_bytes());
        }
        Err(_) => {
            // If we can't deserialize as a string, try consensus deserializing
            let res: Result<OutPoint, _> = encode::deserialize(data);
            if let Ok(deser) = res {
                let ser = encode::serialize(&deser);
                assert_eq!(ser, data);
                let string = deser.to_string();
                match OutPoint::from_str(&string) {
                    Ok(destring) => assert_eq!(destring, deser),
                    Err(_) => panic!()
                }
            }
        }
    }
}

@stevenroose
Copy link
Collaborator Author

Please review. I created parse_vout and added more tests.

Copy link
Member
@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@apoelstra apoelstra merged commit 0764673 into rust-bitcoin:master Oct 18, 2018
yancyribbens pushed a commit to yancyribbens/rust-bitcoin that referenced this pull request Mar 23, 2024
descriptor: Accept a generic Secp256k1 ctx instead of just `SignOnly`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0