8000 Start crate smashing by tcharding · Pull Request #751 · rust-bitcoin/rust-bitcoin · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Start crate smashing #751

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Conversation

tcharding
Copy link
Member
@tcharding tcharding commented Jan 4, 2022

I'm expecting this to take a few iterations. Not sure if this is wanted still as part of 0.28 release. I've taken it off draft because its ready for review although clearly not ready for merge. I need input from others for this one, both opinions and also some technical things, primarily I cannot work out how to re-export bitcoin_hashes from bitocoin-str and get macros to work in Rust 1.29 (that should be the only CI failure), it builds locally with newer versions of Rust.

Make an attempt to start the crate smashing described in #550. This is a long thread and its not super easy to tease out what has consensus and what does not but this PR will try to do that.

The stated aims of this work are:

  • Reduce LOC count and compile times for downstream projects that do not need all of the code in rust-bitcoin.
  • Assist in stabilising the codebase by having conservative code in separate crates from frequently changing code (includes assisting downstream users, and the wider Rust Bitcoin ecosystem, in being able to upgrade, easily and rapidly, the more stable crates).
  • Make the codebase more closely follow principles of Clean Architecture (orphan rule aside).

At a high level this PR does:

  • Create a workspace, move all the code int a crate called bitcoin
  • Extract out a second crate called bitcoin-str (also referred to as bitcoin-encoding in the issue thread)

I've tried to only do changes required to make code build when extracting code into bitcoin-str.

bitcoin-str includes:

  • Extracted base58 module
  • Extracted endian module
  • Re-export of extern bech32 crate
  • Re-export of extern base64 crate
  • Re-export of extern bitcoin_hashes crate as hashes
  • Attempted to re-export hashes::hex::* from module bitcoin_str::hex for ergonomics

Questions/TODOs

@DON-MAC-256
Copy link

Any reason to switch to bitcoin-str from bitcoin-encoding, which seems to have rough consensus in #550?

@Kixunil
Copy link
Collaborator
Kixunil commented Jan 4, 2022

Awesome, I just hit issues with bitcoin being a single crate (also related to no_std)

@DON-MAC-256 @apoelstra said he likes bitcoin-str more and nobody objected.

@Kixunil
Copy link
Collaborator
Kixunil commented Jan 5, 2022

Thinking about this again wouldn't it be easier to start with separating-out primitives?

@tcharding
Copy link
Member Author
tcharding commented Jan 5, 2022

We have to start at the bottom, right?

@Kixunil
Copy link
Collaborator
Kixunil commented Jan 5, 2022

Not sure what you consider bottom but thinking about it again:

  • isolated components (types/functions with no deps and nothing dependent) should be easiest but I don't know if we have many/any of those
  • separating entities should be most useful and reasonably easy. E.g. I need Amount a separate crate without the annoying no-std feature yesterday. :)
  • things that have dependencies are probably impossible to separate before their dependencies because they need to reference their dependencies and if they reference them within bitcoin crate they can't be re-exported in bitcoin crate (circular dependency)

How I would approach it:

  1. Copy util::amount to a separate crate.
  2. Delete util::amount from rust-bitcoin
  3. Fix all errors by referencing Amount from the new crate instead
  4. reexport Amount in rust-bitcoin for compatibility
  5. go to 0 but with other type which doesn't depend on anything in rust-bitcoin

@tcharding tcharding force-pushed the crate-smashing branch 2 times, most recently from 9c3e016 to 040598f Compare January 6, 2022 01:14
@tcharding
Copy link
Member Author
tcharding commented Jan 6, 2022

Not sure what you consider bottom

I took the bottom to be bitcoin-str (excluding external crates bitcoin_hashes, bech32, and base64). This is based primarily on the following to comments

Did I misunderstand the thread? Do you still have some concern on the ordering of primitives vs encoding (bitcoin-str)?

@Kixunil
Copy link
Collaborator
Kixunil commented Jan 6, 2022

Actually, you're right. bitcoin-str must go first because we can't do Clean Architecture in Rust perfectly.

The `base58` module should not be encapsulating a secp256k1 error, these
things are un-related. Removing the error variant is an API breaking
change but is necessary in preparation for splitting the `base58` module
out into the about-to-be-created `bitcoin-str` crate.
@tcharding tcharding force-pushed the crate-smashing branch 2 times, most recently from f7f2f65 to 320cc45 Compare January 11, 2022 02:58
Add a workspace to the top level directory.

Create a directory `bitcoin` and move into it the following as is with
no code changes:

- src
- Cargo.toml
- test_data
- examples

Add a workspace to the repository root directory. Add the newly created
`bitcoin` crate to the workspace.

Fix `contrib/test.sh`, fuzz, embedded, and CI so everything still passes.
@tcharding tcharding marked this pull request as ready for review January 11, 2022 03:13
Create a new crate `bitcoin-str` and add it to the newly created
workspace. Extract `base58.rs` and `endian.rs` modules out of `bitcoin`
and into the new crate. Depend upon and re-export `bitcoin_hashes` and
`base64-compat` from the new crate. Fix manifest files as required.

Re-exports in `bitcoin` of all types and functions from `bitcoin-str`
allow for minimal changes to the code base.
@Kixunil Kixunil added this to the 0.30.0 milestone Jan 11, 2022
@Kixunil
Copy link
Collaborator
Kixunil commented Jan 11, 2022

@tcharding IDK if you noticed that this was originally planned for version 0.30. Maybe it'll cause many rebases of this. However I would agree with change to get this in 0.29

@tcharding
Copy link
Member Author

Oh I did not realize that, I thought it was wanted for 0.28. I'll convert back to draft and put this on ice. Thanks for pointing that out.

@tcharding tcharding marked this pull request as draft January 11, 2022 20:45
@@ -0,0 +1,41 @@
[package]
name = "bitcoin-str"
version = "0.27.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
version = "0.27.0"
version = "0.1.0"

Comment on lines +1 to +3
[workspace]
members = ["bitcoin", "bitcoin-str"]
exclude = ["embedded", "fuzz"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can reduce the diff here by not moving the main crate into a subdirectory.

Regular crate manifests can also have a [workspace] table, meaning you can introduce bitcoin-str and move all the stuff in without touching anything in bitcoin.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh cool, I did not know that. Cheers man!


[dependencies]
bech32 = { version = "0.8.1", default-features = false }
bitcoin_hashes = { version = "0.10.0", default-features = false }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the hashes as well or are we just taking hex from this? Would be nice to move hex into bitcoin-str and have bitcoin-hashes depend on bitcoin-str. AFAIK, hex only lives in bitcoin-hashes because bitcoin-hashes was already a leaf dependency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concept ACK that dependency inversion.

Yes, the hex stuff was put into bitcoin-hashes purely because it was the smallest crate in the ecosystem. (I might've put hex into bech32 if it'd been around at the time :P.)

Copy link
Member Author
@tcharding tcharding Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are three functions in the base58 module that depend on hashing for checksum'ing, e.g.

/// Obtain a string with the base58check encoding of a slice
/// (Tack the first 4 256-digits of the object's Bitcoin hash onto the end.)
pub fn check_encode_slice(data: &[u8]) -> String {
    let checksum = sha256d::Hash::hash(data);
    encode_iter(
        data.iter()
            .cloned()
            .chain(checksum[0..4].iter().cloned())
    )
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping to convey that with the comment in bitcoin-str/src/lib.rs

pub mod hex {
    //! Re-exports hex de/serialization code from `bitcoin_hashes`.
    //!
    // There is a chicken and the egg problem with putting _all_ the en/decoding code in a single crate.
    // Hashing cannot be separated totally from en/decoding because base58 -> hashing -> hex.

Or is there a way around this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we don't use the hex crate?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading your argument I no longer feel comfortable with accepting option 3.

I believe we should really revive it. But could the crate be without "bitcoin" in its name or docs? I'd like it being neutral general purpose 1.36-compatible crate. hex-compat similarly to base64-compat sounds sensible.

Copy link
Collaborator
@dr-orlovsky dr-orlovsky Jan 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already amplify_num crate I made for this purpose, when work on bitcoin_hex has stalled. I will be ok with moving it to this org and even renaming it. Additionally to hex it has big and small integers (u1-u7, u256 taken from rust-bitcoin ecosystem and u512-u1024). u256 as multiple improvements over type in rust-bitcoin, providing arithmetics, overflow wraping/checkings, octal/hex etc formating – but the code base is still quite small.

Currently it is a part of the repo managed by me and @Kixunil: https://github.com/LNP-BP/rust-amplify

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unconvinced that "changes will happen" to hex encoding code which require synchronized updates, but concept ACK having a separate hex-compat crate.

Agree that it shouldn't have bitcoin anywhere in the name.

I'm also OK in concept with talking to the hex maintainer but I have no interest in doing this myself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unconvinced that "changes will happen" to hex encoding code which require synchronized updates, but concept ACK having a separate hex-compat crate.

That was my thinking as well. Very likely, the hex-encoding needs within bitcoin-hashes are not going to change any time soon and hence we might be able to reduce it down to 2-3 private functions that have exactly the signatures we need because the don't need to account for any other usage.

Contrary to that, a public hex module within bitcoin-str will likely need to offer much more flexible APIs, see more usage and thus require more maintenance.

That was the thinking when I proposed option 3 above. I am not fussed about extracting a dedicated crate though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've mentioned there that I wish for MSRV to be no newer than 1.41.1 and noticed they also have 1.0 milestone which is very nice. I can see that the code being short and unlikely to change can be a good argument for duplication being fine but still prefer a separate crate. It could be a fork of hex with changes to make it compilable on 1.41.1.

@tcharding tcharding closed this Jul 11, 2022
@tcharding tcharding deleted the crate-smashing branch July 11, 2022 01:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0