-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Untrusted Validator Squelching - Extends squelching to limit untrusted validator message propagation #5399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
901d5ce
to
bd11905
Compare
ed26e8a
to
57bc3ea
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #5399 +/- ##
=========================================
- Coverage 79.1% 79.0% -0.0%
=========================================
Files 816 818 +2
Lines 71605 71774 +169
Branches 8237 8274 +37
=========================================
+ Hits 56625 56733 +108
- Misses 14980 15041 +61
🚀 New features to boost your workflow:
|
1d36ebc
to
c689fbd
Compare
3dd5f5c
to
894706f
Compare
b3b8d25
to
5ec71e7
Compare
…ate messages from trusted validators
This feature improves network efficiency by limiting message propagation from untrusted validators. Squelching currently reduces the volume of duplicate messages from validators but does not address the volume of unique messages from untrusted validators, who may not contribute meaningfully to network progress. This change introduces a bounded number of slots for untrusted validators, selected based on message frequency. Once selected, their duplicate messages are subject to standard squelching logic, thereby reducing overall message overhead without impacting trusted validator performance.
5ec71e7
to
a038b70
Compare
@bthomee, it would require rewriting a good chunk of existing base
squelching tests, I think that's out of scope. However, I'm happy to
explore this prospect separately.
…On Mon, Jun 23, 2025, 20:13 Bart ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/test/overlay/clock.h
<#5399 (comment)>:
> +#define RIPPLE_TEST_OVERLAY_CLOCK_H_INCLUDED
+
+#include <xrpl/basics/random.h>
+
+#include <chrono>
+#include <cstdint>
+#include <ratio>
+
+namespace ripple {
+
+namespace test {
+
+using namespace std::chrono;
+
+/** Manually advanced clock. */
+class ManualClock
@Tapanito <https://github.com/Tapanito> Adding an extra class that
duplicates a lot of existing code/functionality leads to code bloat.
Passing an instance around shouldn't be a huge refactor.
—
Reply to this email directly, view it on GitHub
<#5399 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABMDKU2YWCR3NW4WIEAH2R33FA7VJAVCNFSM6AAAAAB25WMZ7CVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDSNJRGA2DAMZZGE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
deb5377
to
69446cb
Compare
src/xrpld/overlay/detail/Slot.cpp
Outdated
if (peer.count == (MAX_MESSAGE_THRESHOLD + 1)) | ||
++reachedThreshold_; | ||
|
||
if (now - lastSelected_ > 2 * MAX_UNSQUELCH_EXPIRE_DEFAULT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is defensive block of code. But if it ever happens then we stuck in this state since lastSelected_
is updated once the peers are selected below. This should never really happen since a slot is deleted if now - lastSelected_ > MAX_UNSQUELCH_EXPIRE_DEFAULT)
. My question is if lastSelected_
should be set to now
if above is true or should we make this an error and delete this slot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more. F438
I think the logic should be adjusted in the first place. Instead of basing on the time the slot was last selected, we should base it on the time the slot was last updated.
Let's take an example of a slot for a trusted validator whose messages I receive only from a single peer, other peers always timeout. Such a slot will continuously oscillate between being inserted and deleted, as it will never have enough peers to start the selection process.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So basically the time of the most recent message? I think we just should delete the slot in this case since. This scenario can't happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it can happen. Imagine we have a slot for untrusted validator that stops sending traffic. We will very quickly learn that it's offline, but it'll take an hour until the slot is deleted. Though, and I already started working on this, for untrusted validators, if they fall under the threshold of minimum peers, they slot will be deleted and the validator squelched.
src/xrpld/overlay/Slot.h
Outdated
* @param duration Squelch duration in seconds | ||
*/ | ||
virtual void | ||
squelchAll(PublicKey const& validator, std::uint32_t duration) = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for untrusted validator only, right? Perhaps squelchAllUntrusted()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe one way to avoid this sort of circular call that you mentioned is for this method to take register
callback? Another option is for squelchAll
to return the set of squelched peers. But then it's double iteration.
squelchAll(PublicKey const&, std::uint32_t, std::function<void(Peer::id_t)>)
And then
void
OverlayImpl::squelchAll(PublicKey const& validator, uint32_t squelchDuration, std::function<void(Peer::id_t)>&& register)
{
for_each([&](std::shared_ptr<PeerImp>&& p) {
p->send(makeSquelchMessage(validator, true, squelchDuration));
register(p->id());
});
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is much cleaner, that way OverlayImpl doesn't need to know about the internals of Slot.h
! Thanks Greg, the callback is a great idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
beast::expire(peersWithMessage_, reduce_relay::IDLED); | ||
|
||
if (key.isNonZero()) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified. The only thing is you can't log if it's new or not. Not sure if this is important.
auto const res = peersWithMessage_[key].insert(id);
JLOG(journal_.trace())
<< "addPeerMessage: added " << res.second << " " << to_string(key)
<< " " << id;
return res.second;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be, I purposefully haven't touched the original squelching code. There's a fair bit of refactoring that can be done around it, I'm not sure whether it should be done with this PR, or separetely. What do you think?
src/xrpld/overlay/detail/PeerImp.cpp
Outdated
TrafficCount::category::validation_untrusted, | ||
Message::messageSize(*m)); | ||
|
||
overlay_.updateValidatorSlot(key, val->getSignerPublic(), id_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we doing this for Validation only and not ProposeSet messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, untrusted proposal messages are not relayed, thus they are less likely to be delivered by peers.
Edit:
Since we might not receive all proposals, they do not provide sufficient information to to determine squelching rules.
if (now - it->second.lastMessage > IDLED) | ||
return {}; | ||
|
||
it->second.peers.insert(peer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should move these updates to if (...)
on line 526. Can just add 1 to constants. There is no need to update if it's going to be erased on line 531.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
peers
is a set, I will not know it's true size until the value is insert in line 522
, thus just adding one in the if (...)
check is not right. While I understand where you're coming from regarding the redundant update in case we delete the entry, I don't think it's such a computationally heavy operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. It could be a duplicate peer.
If the validator was idle for a short period, reset it's progress. However, if the validator was idle for a long time, delete and squelch it. Similarly, if the validator sent a lot of unique messages, but failed to reach peering constraints, squelch it.
This feature improves network efficiency by limiting message propagation from untrusted validators.
Squelching currently reduces the volume of duplicate messages from validators but does not address the volume of unique messages from untrusted validators, who may not contribute meaningfully to network progress.
This change introduces a bounded number of slots for untrusted validators, selected based on unique message frequency and quantity. Once selected, their duplicate messages are subject to standard squelching logic, thereby reducing overall message overhead without impacting trusted validator performance.
High Level Overview of Change
Context of Change
Type of Change
.gitignore
, formatting, dropping support for older tooling)API Impact
libxrpl
change (any change that may affectlibxrpl
or dependents oflibxrpl
)