blocksync: respect the `maxDiffBetweenCurrentAndReceivedBlockHeight` limit #2465

melekes · 2024-02-28T10:13:30Z

Report

Osmosis team: "My understanding of this value is that this implies we should only be fetching blocks 100 blocks in advance from where we currently are. I added prints and saw that even when this value was set as low as 10, we were still fetching blocks 2000+ blocks out while being starved of blocks 2 away from where we currently were

And I might be wrong about this second part, but looking at where this value is used https://github.com/osmosis-labs/cometbft/blob/490dd752eb8a1a934542f70fdc8960594e06973b/blocksync/pool.go#L264-L268, it seems like because we are fetching blocks very far out, we just error and disregard them, so its completely wasted cycles AFAICT."

Preliminary discussion

@jmalicevic: "it seems that maxDiffBetweenCurrentAndReceivedBlockHeight is only checked when we receive blocks, it should probably also be checked when we request blocks. Otherwise it is not the peers fault we get blocks too far ahead. Perhaps avoid creating requesters too far ahead in makeRequestersRoutine() by adapting when then pool.makeNextRequester is called could help."

@cason: "Yes, but only for blocks for which we don't have a tracked request.
It is still to be understood if we don't have a tracked request because it is too "old" (i.e., we already committed this block) or too "new" (i.e., too far in the future). I still can't see how the latter could occur, as: (i) a correct peer only sends us a block upon receiving our request; (ii) we track all requests we send."

@jmalicevic: "So effectively this variable is never checked unless it is an error (we don’t have a requester for it ) or we removed it at some point. But we are never enforcing not requesting blocks too far ahead with this variable it seems"

…imit Before this change, the number of requesters was unlimited! Now it's constant = 100. Closes #2465

melekes · 2024-02-28T14:04:25Z

Based on my analysis, we respect the limit of max N of requesters (1 requester per block; 600 blocks downloaded in parallel). maxDiffBetweenCurrentAndReceivedBlockHeight is only applied to blocks which we haven't requested from peer (to display a different error).

Lowering the max N of blocks to 100 seems prudent (especially for big blocks). For smaller blocks, it would be great to benchmark this somehow (ie. given the network setup of N (max in + out) peers, how many blocks we should request to saturate the network).

ValarDragon · 2024-02-29T16:12:51Z

I think its ok to go to higher block bounds FWIW. Osmosis right now is getting to 3.6 BPS, and CPU side seems like in our next state break we should be at 5BPS if theres no p2p issues. (And AFAIU Osmosis has one of the most expensive state machines in cosmos rn)

100 blocks fetched in advance is just 20 seconds of syncing then! Would rather have a greater buffer if possible

amin1419 · 2024-02-29T20:39:17Z

fixed my any issue

…cks to {peersCount * 20} (#2467) where 20 is the new max requests per peer. Previously, the number of requesters (and consequently the number of requested blocks) was 600 and did not depend on the number of peers. There's no point in creating more than that since N of requests per peer is capped and we know the number of peers. Closes #2465 Fixes #1283 [api-breaking] --- #### PR checklist - [x] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

…cks to {peersCount * 20} (#2467) where 20 is the new max requests per peer. Previously, the number of requesters (and consequently the number of requested blocks) was 600 and did not depend on the number of peers. There's no point in creating more than that since N of requests per peer is capped and we know the number of peers. Closes #2465 Fixes #1283 [api-breaking] --- #### PR checklist - [x] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec (cherry picked from commit d58acd8) # Conflicts: # internal/blocksync/reactor.go

melekes added bug Something isn't working block-sync needs-triage This issue/PR has not yet been triaged by the team. labels Feb 28, 2024

melekes added a commit that referenced this issue Feb 28, 2024

fix(blocksync): respect maxDiffBetweenCurrentAndReceivedBlockHeight l…

91a648a

…imit Before this change, the number of requesters was unlimited! Now it's constant = 100. Closes #2465

melekes mentioned this issue Feb 28, 2024

feat(blocksync)!: set the max number of (concurrently) downloaded blocks to {peersCount * 20} #2467

Merged

4 tasks

melekes self-assigned this Feb 29, 2024

melekes added this to CometBFT Feb 29, 2024

github-project-automation bot moved this to Todo in CometBFT Feb 29, 2024

melekes moved this from Todo to Ready for Review in CometBFT Feb 29, 2024

melekes closed this as completed in #2467 Mar 11, 2024

github-project-automation bot moved this from Ready for Review to Done in CometBFT Mar 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

blocksync: respect the `maxDiffBetweenCurrentAndReceivedBlockHeight` limit #2465

blocksync: respect the `maxDiffBetweenCurrentAndReceivedBlockHeight` limit #2465

Uh oh!

Uh oh!

Uh oh!

blocksync: respect the maxDiffBetweenCurrentAndReceivedBlockHeight limit #2465

blocksync: respect the maxDiffBetweenCurrentAndReceivedBlockHeight limit #2465

Comments

Report

Preliminary discussion

Uh oh!

Uh oh!

Uh oh!

blocksync: respect the `maxDiffBetweenCurrentAndReceivedBlockHeight` limit #2465

blocksync: respect the `maxDiffBetweenCurrentAndReceivedBlockHeight` limit #2465