8000 Fix synchronous pool shutdowns being arbitrarly delayed by Popax21 · Pull Request #1041 · lettre/lettre · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix synchronous pool shutdowns being arbitrarly delayed #1041

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 7, 2025

Conversation

Popax21
Copy link
Contributor
@Popax21 Popax21 commented Feb 5, 2025

Previously, the connection pool thread did not drop its upgraded Arc pool reference while sleeping until the next idle duration check. This causes a drop of the SmtpTransport to not shut down any connections until said thread wakes up again (since it still holds a reference to the pool), which can take up to 60s with default settings. In practice, this means that connections will most likely not be properly closed before the program exists, (since the SmtpTransport is most likely dropped when the program shuts down) which violates the SMTP specification which states that:

The sender MUST NOT intentionally close the channel until it sends a QUIT command, and it SHOULD wait until it receives the reply (even if there was an error response to a command).

Ideally, lettre would offer an API which allows a user of the crate to properly shut down their transport, which allows them to ensure that all connections are closed before the program exists. This is especially relevant for AsyncSmtpTransport: it currently executes all cleanup work on a background task spawned using the executor, with no way to block on said task; as such the only thing a user can do to (try to) ensure that all connections are closed when the program shuts down is to insert an arbitrary delay into the shutdown sequence, which is a hack at best. However, I've decided to limit the scope of this PR to just fixing the more "obvious" flaw with the synchronous pooling implementation, leaving such an API up for future implementation.

Previously, the connection pool thread did not drop its upgraded `Arc` pool reference while sleeping until the next idle duration check. This causes a drop of the `SmtpTransport` to not shut down any connections until said thread wakes up again (since it still holds a reference to the pool), which can take up to 60s with default settings. In practice, this means that connections will most likely not be properly closed before the program exists, (since the `SmtpTransport` is most likely dropped when the program shuts down) which violates the SMTP specification which states that:
> The sender MUST NOT intentionally close the channel until it sends a QUIT command, and it SHOULD wait until it receives the reply (even if there was an error response to a command).
Copy link
Member
@paolobarbolini paolobarbolini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Thanks for the PR.

👍 for having a proper method of shutting down the SMTP transport that waits for it to QUIT and disconnect

@paolobarbolini paolobarbolini merged commit 795beda into lettre:master Feb 7, 2025
6 checks passed
@Popax21
Copy link
Contributor Author
Popax21 commented Feb 7, 2025

👍 for having a proper method of shutting down the SMTP transport that waits for it to QUIT and disconnect

Is this something you would like to work on yourself, or would you accept a PR for this? I imagine that the cleanest way to implement this is to add a (async) fn shutdown(self) -> Result<()> method to the (Async)Transport traits, potentially with a default implementation as to not break any existing traits impls by external crates (dyn-object compatibility isn't a concern since both traits already have associated types). If a PR is the preferred option, and this API surface is something you agree with, I would be willing to implement this functionality myself in an additional PR. :)

@Popax21 Popax21 deleted the patch-1 branch February 7, 2025 22:26
@paolobarbolini
Copy link
Member

Is this something you would like to work on yourself, or would you accept a PR for this? I imagine that the cleanest way to implement this is to add a (async) fn shutdown(self) -> Result<()> method to the (Async)Transport traits, potentially with a default implementation as to not break any existing traits impls by external crates (dyn-object compatibility isn't a concern since both traits already have associated types). If a PR is the preferred option, and this API surface is something you agree with, I would be willing to implement this functionality myself in an additional PR. :)

I'd love to have a PR about this. I currently have limited time to give to the project so everything is appreciated. As for the implementation I agree with your proposal. Do we want to have send_raw return an error when called on a transport that's shutting down?

@Popax21
Copy link
Contributor Author
Popax21 commented Feb 13, 2025

Do we want to have send_raw return an error when called on a transport that's shutting down?

That shouldn't be an issue if the shutdown method takes ownership of the transport like I originally proposed - the lifecycle of the conceptual transport is automatically tied to the lifecycle of the transport object. This is also why I had a brief remark about dyn-safety; this transfer of ownership would have made the trait itself no longer dyn-safe, but as mentioned above this isn't an issue since it hasn't been before this either way.

As a brief side remark, thanks for taking the time to maintain this crate! I'm currently still on vacation, but I'll try to get a PR sent your way implementing this sometime next week.

@paolobarbolini
Copy link
Member
paolobarbolini commented Feb 13, 2025

That shouldn't be an issue if the shutdown method takes ownership of the transport like I originally proposed

This doesn't work because transports can be cloned, but all clones reference the same underlying connection pool, kind of like having an Arc. See AsyncSmtpTransport the same as reqwest::Client.

@Popax21
Copy link
Contributor Author
Popax21 commented Feb 13, 2025

That shouldn't be an issue if the shutdown method takes ownership of the transport like I originally proposed

This doesn't work because transports can be cloned, but all clones reference the same underlying connection pool, kind of like having an Arc. See AsyncSmtpTransport the same as reqwest::Client.

Ah, seems like I missed that impl (all my programs have their transports inside mutex-like primitives to prevent the connection pool from growing too big). In that case having send_raw error out is probably the right thing to do, just like you described earlier (the alternative would be to silently ignore the message, but that sounds like a bad idea for various reasons).

@Popax21
Copy link
Contributor Author
Popax21 commented Feb 13, 2025

(the alternative would be to silently ignore the message, but that sounds like a bad idea for various reasons).

One more alternative implementation I just thought of: instead of having send_raw fail after the transport has been shut down, one could instead have shutdown fail if there are other remaining references to the transport. This might be cleaner, since it also enforces a contract on any client code to stop using the transport after the shutdown, however such an design would be rather vulnerable to spurious failures if a transport reference were to leak. I think both designs could work (personally I have a slight preference towards this one, since it wouldn't require any additional synchronization primitives to coordinate the shutdown), but in the end the call is up to you to make.

@paolobarbolini paolobarbolini mentioned this pull request Feb 17, 2025
@paolobarbolini
Copy link
Member

One more alternative implementation I just thought of: instead of having send_raw fail after the transport has been shut down, one could instead have shutdown fail if there are other remaining references to the transport. This might be cleaner, since it also enforces a contract on any client code to stop using the transport after the shutdown, however such an design would be rather vulnerable to spurious failures if a transport reference were to leak. I think both designs could work (personally I have a slight preference towards this one, since it wouldn't require any additional synchronization primitives to coordinate the shutdown), but in the end the call is up to you to make.

I feel like this alternative approach may be more idiomatic in a way, but also I'm not fully convinced by it. It'd be nice to have to handle an Arc::make_mut style error in just one place instead of the equivalent of Weak::upgrade everywhere, but I also see the rest of the ecosystem use the first method (for example sqlx connection pools and tokio channels), that it's nice to experience the "hacks" you can sometimes with avoiding having to wrap things in Option because the type already sort of does it internally.

Also now that I think of it I've long had the idea of doing something in the style of warp::filters::BoxedFilter to fix #938, but I'm not sure that adding a method that takes ownership of the transport would work with that system.

@Popax21
Copy link
Contributor Author
Popax21 commented Feb 17, 2025

I've now opened PR #1045 using the design we discussed here ((async) fn shutdown(&self) -> Result<()>, with send_raw returning an error after shutdown).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0