8000 Untrusted Validator Squelching - Extends squelching to limit untrusted validator message propagation by Tapanito · Pull Request #5399 · XRPLF/rippled · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Untrusted Validator Squelching - Extends squelching to limit untrusted validator message propagation #5399

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 32 commits into
base: develop
Choose a base branch
from

Conversation

Tapanito
Copy link
Collaborator
@Tapanito Tapanito commented Apr 11, 2025

This feature improves network efficiency by limiting message propagation from untrusted validators.

Squelching currently reduces the volume of duplicate messages from validators but does not address the volume of unique messages from untrusted validators, who may not contribute meaningfully to network progress.

This change introduces a bounded number of slots for untrusted validators, selected based on unique message frequency and quantity. Once selected, their duplicate messages are subject to standard squelching logic, thereby reducing overall message overhead without impacting trusted validator performance.

High Level Overview of Change

Context of Change

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (non-breaking change that only restructures code)
  • Performance (increase or change in throughput and/or latency)
  • Tests (you added tests for code that already exists, or your new feature included in this PR)
  • Documentation update
  • Chore (no impact to binary, e.g. .gitignore, formatting, dropping support for older tooling)
  • Release

API Impact

  • Public API: New feature (new methods and/or new fields)
  • Public API: Breaking change (in general, breaking changes should only impact the next api_version)
  • libxrpl change (any change that may affect libxrpl or dependents of libxrpl)
  • Peer protocol change (must be backward compatible or bump the peer protocol version)

@Tapanito Tapanito force-pushed the tapanito/feature/enhanced-squelching branch from 901d5ce to bd11905 Compare May 2, 2025 09:03
@Tapanito Tapanito changed the title DRAFT: enhanced squelching algorithm to reduce untrusted validator traffic Extend squelching to limit untrusted validator message propagation May 2, 2025
@Tapanito Tapanito marked this pull request as ready for review May 2, 2025 09:22
@Tapanito Tapanito requested review from vlntb, bthomee and gregtatcam May 2, 2025 09:22
@Tapanito Tapanito force-pushed the tapanito/feature/enhanced-squelching branch from ed26e8a to 57bc3ea Compare May 2, 2025 09:50
Copy link
codecov bot commented May 2, 2025

Codecov Report

Attention: Patch coverage is 68.89849% with 144 lines in your changes missing coverage. Please review.

Project coverage is 79.0%. Comparing base (e18f27f) to head (14ba721).

Files with missing lines Patch % Lines
src/xrpld/overlay/detail/Slot.cpp 81.6% 63 Missing ⚠️
src/xrpld/overlay/detail/OverlayImpl.cpp 2.9% 34 Missing ⚠️
src/xrpld/overlay/Slot.h 48.7% 20 Missing ⚠️
src/xrpld/overlay/detail/PeerImp.cpp 4.8% 20 Missing ⚠️
src/xrpld/overlay/detail/Squelch.cpp 70.0% 6 Missing ⚠️
src/xrpld/overlay/detail/PeerImp.h 0.0% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##           develop   #5399     +/-   ##
=========================================
- Coverage     79.1%   79.0%   -0.0%     
=========================================
  Files          816     818      +2     
  Lines        71605   71774    +169     
  Branches      8237    8274     +37     
=========================================
+ Hits         56625   56733    +108     
- Misses       14980   15041     +61     
Files with missing lines Coverage Δ
src/xrpld/core/Config.h 85.7% <ø> (ø)
src/xrpld/core/detail/Config.cpp 75.9% <100.0%> (+0.1%) ⬆️
src/xrpld/overlay/Squelch.h 100.0% <100.0%> (+30.4%) ⬆️
src/xrpld/overlay/detail/OverlayImpl.h 38.4% <ø> (ø)
src/xrpld/overlay/detail/PeerImp.h 13.0% <0.0%> (ø)
src/xrpld/overlay/detail/Squelch.cpp 70.0% <70.0%> (ø)
src/xrpld/overlay/Slot.h 68.8% <48.7%> (-16.0%) ⬇️
src/xrpld/overlay/detail/PeerImp.cpp 3.7% <4.8%> (-<0.1%) ⬇️
src/xrpld/overlay/detail/OverlayImpl.cpp 34.0% <2.9%> (-0.7%) ⬇️
src/xrpld/overlay/detail/Slot.cpp 81.6% <81.6%> (ø)

... and 3 files with indirect coverage changes

Impacted file tree graph

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Tapanito Tapanito mentioned this pull request May 2, 2025
13 tasks
@Tapanito Tapanito force-pushed the tapanito/feature/enhanced-squelching branch 2 times, most recently from 1d36ebc to c689fbd Compare May 7, 2025 13:16
@Tapanito Tapanito changed the title Extend squelching to limit untrusted validator message propagation Untrusted Validator Squelching - Extends squelching to limit untrusted validator message propagation May 8, 2025
@Tapanito Tapanito force-pushed the tapanito/feature/enhanced-squelching branch 3 times, most recently from 3dd5f5c to 894706f Compare May 13, 2025 08:40
@Tapanito Tapanito force-pushed the tapanito/feature/enhanced-squelching branch 2 times, most recently from b3b8d25 to 5ec71e7 Compare May 21, 2025 14:16
@bthomee bthomee added this to the 2.6.0 (Q3 2025) milestone May 27, 2025
Tapanito added 9 commits May 28, 2025 15:57
This feature improves network efficiency by limiting message propagation from untrusted validators.

Squelching currently reduces the volume of duplicate messages from validators but does not address the volume of unique messages from untrusted validators, who may not contribute meaningfully to network progress.

This change introduces a bounded number of slots for untrusted validators, selected based on message frequency. Once selected, their duplicate messages are subject to standard squelching logic, thereby reducing overall message overhead without impacting trusted validator performance.
@Tapanito Tapanito force-pushed the tapanito/feature/enhanced-squelching branch from 5ec71e7 to a038b70 Compare May 28, 2025 14:37
@Tapanito
Copy link
Collaborator Author
Tapanito commented Jun 23, 2025 via email

@Tapanito Tapanito force-pushed the tapanito/feature/enhanced-squelching branch from deb5377 to 69446cb Compare June 26, 2025 07:57
@Tapanito Tapanito requested a review from vlntb June 26, 2025 14:08
if (peer.count == (MAX_MESSAGE_THRESHOLD + 1))
++reachedThreshold_;

if (now - lastSelected_ > 2 * MAX_UNSQUELCH_EXPIRE_DEFAULT)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is defensive block of code. But if it ever happens then we stuck in this state since lastSelected_ is updated once the peers are selected below. This should never really happen since a slot is deleted if now - lastSelected_ > MAX_UNSQUELCH_EXPIRE_DEFAULT). My question is if lastSelected_ should be set to now if above is true or should we make this an error and delete this slot?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more. F438

I think the logic should be adjusted in the first place. Instead of basing on the time the slot was last selected, we should base it on the time the slot was last updated.
Let's take an example of a slot for a trusted validator whose messages I receive only from a single peer, other peers always timeout. Such a slot will continuously oscillate between being inserted and deleted, as it will never have enough peers to start the selection process.

What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically the time of the most recent message? I think we just should delete the slot in this case since. This scenario can't happen.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can happen. Imagine we have a slot for untrusted validator that stops sending traffic. We will very quickly learn that it's offline, but it'll take an hour until the slot is deleted. Though, and I already started working on this, for untrusted validators, if they fall under the threshold of minimum peers, they slot will be deleted and the validator squelched.

* @param duration Squelch duration in seconds
*/
virtual void
squelchAll(PublicKey const& validator, std::uint32_t duration) = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for untrusted validator only, right? Perhaps squelchAllUntrusted()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one way to avoid this sort of circular call that you mentioned is for this method to take register callback? Another option is for squelchAll to return the set of squelched peers. But then it's double iteration.

squelchAll(PublicKey const&, std::uint32_t, std::function<void(Peer::id_t)>)

And then

void
OverlayImpl::squelchAll(PublicKey const& validator, uint32_t squelchDuration, std::function<void(Peer::id_t)>&& register)
{
    for_each([&](std::shared_ptr<PeerImp>&& p) {
        p->send(makeSquelchMessage(validator, true, squelchDuration));
        register(p->id());
    });
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is much cleaner, that way OverlayImpl doesn't need to know about the internals of Slot.h! Thanks Greg, the callback is a great idea.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

beast::expire(peersWithMessage_, reduce_relay::IDLED);

if (key.isNonZero())
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified. The only thing is you can't log if it's new or not. Not sure if this is important.

    auto const res = peersWithMessage_[key].insert(id);
    JLOG(journal_.trace())
        << "addPeerMessage: added " << res.second << " " << to_string(key) 
        << " " << id;
    return res.second;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be, I purposefully haven't touched the original squelching code. There's a fair bit of refactoring that can be done around it, I'm not sure whether it should be done with this PR, or separetely. What do you think?

TrafficCount::category::validation_untrusted,
Message::messageSize(*m));

overlay_.updateValidatorSlot(key, val->getSignerPublic(), id_);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we doing this for Validation only and not ProposeSet messages?

Copy link
Collaborator Author
@Tapanito Tapanito Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, untrusted proposal messages are not relayed, thus they are less likely to be delivered by peers.

Edit:

Since we might not receive all proposals, they do not provide sufficient information to to determine squelching rules.

if (now - it->second.lastMessage > IDLED)
return {};

it->second.peers.insert(peer);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should move these updates to if (...) on line 526. Can just add 1 to constants. There is no need to update if it's going to be erased on line 531.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

peers is a set, I will not know it's true size until the value is insert in line 522, thus just adding one in the if (...) check is not right. While I understand where you're coming from regarding the redundant update in case we delete the entry, I don't think it's such a computationally heavy operation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. It could be a duplicate peer.

B83A

Tapanito added 6 commits June 30, 2025 14:26
If the validator was idle for a short period, reset it's progress. However, if the validator was idle for a long time, delete and squelch it.
Similarly, if the validator sent a lot of unique messages, but failed to reach peering constraints, squelch it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0