8000 Error if body not fully downloaded by schneems · Pull Request #947 · heroku/libcnb.rs · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Error if body not fully downloaded #947

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- libcnb:
- Order of automatically applied environment variables by libcnb, such as `PATH=<layer>/bin`, now matches the upstream CNB lifecycle. ([#938](https://github.com/heroku/libcnb.rs/pull/938))

- libherokubuildpack:
- The `download_file` function now errors if bytes received does not match the content length header. ([#947](https://github.com/heroku/libcnb.rs/pull/947))

## [0.29.0] - 2025-05-02

Expand Down
3 changes: 2 additions & 1 deletion libherokubuildpack/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ workspace = true

[features]
default = ["command", "download", "digest", "error", "inventory", "log", "inventory-semver", "inventory-sha2", "tar", "toml", "fs", "write"]
download = ["dep:ureq", "dep:thiserror"]
download = ["dep:ureq", "dep:thiserror", "dep:http"]
digest = ["dep:sha2"]
error = ["log", "dep:libcnb"]
inventory = ["dep:hex", "dep:serde", "dep:thiserror", "dep:toml"]
Expand All @@ -41,6 +41,7 @@ crossbeam-utils = { version = "0.8.21", optional = true }
# As such we have to use the next best alternate backend, which is `zlib`.
flate2 = { version = "1.1.1", default-features = false, features = ["zlib"], optional = true }
hex = { version = "0.4.3", optional = true }
http = { version = "1", optional = true }
libcnb = { workspace = true, optional = true }
pathdiff = { version = "0.2.3", optional = true }
semver = { version = "1.0.26", features = ["serde"], optional = true }
Expand Down
57 changes: 53 additions & 4 deletions libherokubuildpack/src/download.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
use http::{HeaderName, header::CONTENT_LENGTH};
use std::num::ParseIntError;
use std::{fs, io};

#[derive(thiserror::Error, Debug)]
Expand All @@ -8,6 +10,35 @@ pub enum DownloadError {

#[error("I/O error while downloading file: {0}")]
IoError(#[from] std::io::Error),

#[error("Missing required header: `{0}`")]
MissingRequiredHeader(HeaderName),

#[error("Failed to convert header value for `{0}` into a string")]
HeaderEncodingError(HeaderName),

#[error("Cannot parse into an integer: {0}")]
CannotParseInteger(ParseIntError),

#[error("Expected `{expected}` bytes received `{received}`")]
UnexpectedBytes {
expected: u64,
received: u64,
path_deleted: Result<(), std::io::Error>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got a little clever with this representation. I'm not sure if this is a net good or bad idea (to return a Result here).

My logic for this is: It seems unlikely that the delete would fail if we were able to create and write to the file, however, it's still possible so we need to represent it. It seems wrong to return an DownloadError::IoError as that explicitly says it's for io errors during download (as opposed to cleanup). We could add another variant just for this case, but it would end up having almost the same information.

The primary error here is the bytes not written, whether or not the disk could be cleaned is a secondary concern.

I'm open to alternative suggestions.

},
}

impl DownloadError {
/// Do not suggest a retry if the failure is due to disk error or if the path could not be cleaned
pub fn retry_suggested(&self) -> bool {
!matches!(
self,
DownloadError::UnexpectedBytes {
path_deleted: Err(_),
..
} | DownloadError::IoError(_)
)
}
}

/// Downloads a file via HTTP(S) to a local path
Expand All @@ -32,9 +63,27 @@ pub fn download_file(
destination: impl AsRef<std::path::Path>,
) -> Result<(), DownloadError> {
let response = ureq::get(uri.as_ref()).call().map_err(Box::new)?;
let mut reader = response.into_body().into_reader();
let mut file = fs::File::create(destination.as_ref())?;
io::copy(&mut reader, &mut file)?;
let expected: u64 = response
.headers()
.get(CONTENT_LENGTH)
.ok_or_else(|| DownloadError::MissingRequiredHeader(CONTENT_LENGTH))?
.to_str()
.map_err(|_| DownloadError::HeaderEncodingError(CONTENT_LENGTH))?
.parse()
.map_err(DownloadError::CannotParseInteger)?;
Comment on lines +68 to +73
Copy link
Contributor
@runesoerensen runesoerensen May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to change this function to require a Content-Length header (e.g. F82A not support Transfer-Encoding: chunked)?

Seems like this might be a bit surprising for a generic download_file function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it'd make sense to use the content_length function on the body instead, and only compare the written bytes if it contains a value.

That would also eliminate the need to manually get and parse the header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% overlooked that, thanks. Great suggestion. I'm wondering if ureq BodyReader is able to somehow fail the io::copy if the connection is closed unexpectedly and we're waiting for missing packets. I'm going to make a small POC ruby server that demonstrates the behavior and compare to curl.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that the behavior is documented in the changelog: https://github.com/algesten/ureq/blob/main/CHANGELOG.md#140:

Error if body length is less than content-length.

I also experimentally confirm that ureq does the correct thing in the chunked and fixed content-length case (errors the from the IO reader), see below. So, it looks like this case is handled and we don't need to explicitly check. Closing my PR.

Fixed length

$ PORT=9292 MODE=fixed ruby server.rb
Chunked server listening on port 9292
Client connected
Real length: `24`, claimed length: 34
Sent fixed body with mismatched Content-Length
$ cd ureureq_it_u_buy_it && cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/ureq_it_u_buy_it`
Downloaded file size: `24` bytes

thread 'main' panicked at src/main.rs:33:12:
called `Result::unwrap()` on an `Err` value: IoError(Custom { kind: UnexpectedEof, error: "Peer disconnected" })
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Chunked

$ PORT=9292 MODE=chunked ruby server.rb
Chunked server listening on port 9292
Client connected
Sent chunk 1
Sent chunk 2
Server exiting ungracefully...
$ cd ureureq_it_u_buy_it && cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/ureq_it_u_buy_it`

thread 'main' panicked at src/main.rs:31:12:
called `Result::unwrap()` on an `Err` value: IoError(Custom { kind: UnexpectedEof, error: "Peer disconnected" })
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


Ok(())
let mut file = fs::File::create(destination.as_ref())?;
let received = io::copy(&mut response.into_body().into_reader(), &mut file)?;
// Ensure file is closed
drop(file);
if received == expected {
Ok(())
} else {
let path_deleted = fs::remove_file(&destination);
Err(DownloadError::UnexpectedBytes {
expected,
received,
path_deleted,
})
}
}
0