x/net/http2: flow control desync when receiving data for canceled stream · Issue #56558 · golang/go · GitHub
More Web Proxy on the site http://driver.im/
You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
What did you do?
We have a proxy server in Go that proxies over HTTP/2 to a third-party server. We're seeing that the third-party server sometimes runs out of flow control tokens and can't send anymore data back to the client. There's no clear reproduction path, but it seems to happen when the proxy is under heavy load and seeing an increase in canceled requests, due to higher latency.
Looking for potential causes in the http2 package, I noticed that in the case where the transport reads a data frame for a stream that was already canceled, it returns the flow control tokens, but it also increments the flow control counter for the connection:
As far as I can tell, the tokens haven't been subtracted at this point, so we shouldn't be adding to the window? I ran the test case below, with some logging added to the lines linked above, e.g.:
cc.inflow.add(int32(f.Length))
cc.logf("http2: Transport received DATA frame for canceled stream (window: %d)", cc.inflow.available())
cc.inflow.add is increasing the flow control window, and the subsequent lines send the increase to the server in a WINDOW_UPDATE frame. There might be a flow control bug here somewhere, but I don't think these lines are it unless I'm misreading something.
The reason those lines seemed wrong to me is that in the general case, when data is received for an open stream, the window is reduced first (both stream and connection):
if cs.inflow.available() >= int32(f.Length) {
cs.inflow.take(int32(f.Length))
}
It then refunds for padding right away, and later increases the window again when the caller reads the body:
if v := cc.inflow.available(); v < transportDefaultConnFlow/2 {
connAdd = transportDefaultConnFlow - v
cc.inflow.add(connAdd)
}
In the case of a canceled stream, it looks like the window is only ever increased.
Ah, you're right. That does look wrong; we're returning flow control tokens to cs.inflow that were never taken. (I thought we consumed the connection-level flow earlier, but we don't; connection and stream flow are consumed at the same time.) Over time, that'll lead to the transport thinking the server has more tokens than it does, and it'll stop sending window updates.
dmitshur
added
NeedsFix
The path to resolution is known, but the work has not been done.
and removed
NeedsInvestigation
Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
labels
Jan 6, 2023
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
We have a proxy server in Go that proxies over HTTP/2 to a third-party server. We're seeing that the third-party server sometimes runs out of flow control tokens and can't send anymore data back to the client. There's no clear reproduction path, but it seems to happen when the proxy is under heavy load and seeing an increase in canceled requests, due to higher latency.
Looking for potential causes in the http2 package, I noticed that in the case where the transport reads a data frame for a stream that was already canceled, it returns the flow control tokens, but it also increments the flow control counter for the connection:
https://github.com/golang/net/blob/a1278a7f7ee0c218caeda793b867e0568bbe1e77/http2/transport.go#L2557-L2559
As far as I can tell, the tokens haven't been subtracted at this point, so we shouldn't be adding to the window? I ran the test case below, with some logging added to the lines linked above, e.g.:
What did you expect to see?
I'd expect flow control to be returned to the server, but the transport's counter shouldn't increase in this case.
What did you see instead?
The flow control counter keeps increasing. I think will lead to flow control desync between client and server.
The text was updated successfully, but these errors were encountered: