Use Validator List (VL) cache files in more scenarios #5323
base: develop
Conversation
If any `[validator_list_keys]` are not available after all `[validator_list_sites]` have had a chance to be queried, then fall back to loading cache files. Currently, cache files are only used if no sites are defined, or if the request to one of them has an error. That does not cover cases where not enough sites are defined, or where a site returns an invalid VL (or something else entirely).

Resolves #5320
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff            @@
##           develop   #5323    +/-  ##
=======================================
  Coverage     79.1%   79.1%
=======================================
  Files          816     816
  Lines        71622   71632     +10
  Branches      8237    8236      -1
=======================================
+ Hits         56644   56652      +8
- Misses       14978   14980      +2
I updated my config. Notice that it includes the keys for both Ripple and XRPLF, but the URL for Ripple is commented out, and the URL for XRPLF is duplicated.
Note that there are two sites listed here: XRPLF, and the local cache.
Note that both lists are available, and that both indicate the same
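For reference, a hypothetical rippled.cfg excerpt matching the description above; the URLs are the well-known publisher sites, but the keys shown are placeholders, not the values actually used in this test:

```ini
# Hypothetical excerpt only: keys are placeholders, not real publisher keys.
[validator_list_sites]
https://vl.xrplf.org
https://vl.xrplf.org
# https://vl.ripple.com

[validator_list_keys]
ED0000000000000000000000000000000000000000000000000000000000000001
ED0000000000000000000000000000000000000000000000000000000000000002
```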
Merged upstream/develop into …dator-cache:
* chore: Move "assert" and "werr" flags from "actions/build" (5325)
* Log detailed correlated consensus data together (5302)
What is the scenario being tested here? Ripple's publisher list was cached at some previous point in time. Presently, that publisher list is unavailable (because its URL is commented out in the config file), so your change falls back to the cache to load the ValidatorList. Is this an accurate description of the PR? In what other situations will the cache be used?
Merged upstream/develop into …dator-cache:
* Set version to 2.4.0
* Set version to 2.4.0-rc4
* chore: Update XRPL Foundation Validator List URL (5326)
It's demonstrating two things:
Yep.
"Currently, cache files are only used if no sites are defined, or the request to one of them has an error." |
Merged upstream/develop into …dator-cache:
* refactor: Remove unused and add missing includes (5293)
            return site.loadedResource->uri == uri;
        });
    if (!found)
        sites_.emplace_back(uri);
Do we want to use a `set` or an `unordered_set` instead?
Possibly. The main reason I didn't is that there are several places that access individual items in `sites_` using the vector index (i.e. `operator[]`). See, for example, `ValidatorSite::makeRequest`.

It could be rewritten to use a `set`-based class and some other lookup method, but as I mentioned in the PR description, I decided it wasn't worth it because:

> `ValidatorSite::load` checks for duplicate URIs. Since it's a `vector`, it's not very efficient, but this function is only called at startup and via `missingSites`, so it won't be called often, and not in any critical path.

This saved some effort (e.g. I am lazy), and kept the PR small and easier to reason about.
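For illustration, a minimal compilable sketch of that vector-based duplicate check; the `Site` type and `load` signature here are simplified stand-ins, not the real rippled declarations:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Simplified stand-in for rippled's Site; only the URI matters here.
struct Site
{
    std::string uri;
};

class ValidatorSiteSketch
{
    std::vector<Site> sites_;

public:
    void
    load(std::vector<std::string> const& siteURIs)
    {
        for (auto const& uri : siteURIs)
        {
            // Linear scan: O(n) per URI, but load() only runs at startup
            // and from missingSite(), so it is never on a critical path.
            bool const found = std::any_of(
                sites_.begin(), sites_.end(), [&uri](Site const& site) {
                    return site.uri == uri;
                });
            if (!found)
                sites_.emplace_back(Site{uri});
        }
    }
};
```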
@@ -210,6 +215,17 @@ ValidatorSite::setTimer(
    std::lock_guard<std::mutex> const& site_lock,
    std::lock_guard<std::mutex> const& state_lock)
{
    if (!sites_.empty() &&  //
Would it be better if we call `onError` when we can't verify the VL, or when there's something wrong with the JSON, so that we could catch more cases and potentially log something?
> Would it be better if we call `onError` when we can't verify the VL, or when there's something wrong with the JSON, so that we could catch more cases and potentially log something?
`onError` is a lambda defined in `ValidatorSite::onSiteFetch`, and it is already called if there's something wrong with the VL, because `ValidatorSite::parseJsonResponse` throws exceptions, which are handled by calling `onError`.

The point of the new check here in `ValidatorSite::setTimer` in particular is to handle the case where there were no problems with the VL downloads from `[validator_list_sites]`, but we still don't have VLs for all the keys defined in `[validator_list_keys]`. That is possible because `[...sites]` and `[...keys]` are completely independent.

I could configure the sites list to request from multiple URLs that mirror the same VL. Someone could set up an aggregator that returns multiple VLs from different publishers, and I could use that. I could define a key whose publisher doesn't have a site publishing their VL at all, and instead relies on propagation through the P2P network. I simulated that last example above, where I commented out one of the URLs in the config file.
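To make that check concrete, here is a rough, self-contained sketch of the kind of condition added before scheduling the next fetch; the type, field, and function names are simplified assumptions, not the actual diff:

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for rippled's Site; only the field relevant here.
struct Site
{
    std::string uri;
    std::optional<int> lastRefreshStatus;  // set after a success, error, or timeout
};

// Sketch of the new decision in setTimer: once every configured site has had
// its chance (i.e. all have a lastRefreshStatus), trigger missingSite(),
// which maps still-unavailable [validator_list_keys] to cache files and adds
// those files to sites_ so they are fetched next.
bool
allSitesHaveBeenTried(std::vector<Site> const& sites)
{
    return !sites.empty() &&
        std::all_of(sites.begin(), sites.end(), [](Site const& site) {
            return site.lastRefreshStatus.has_value();
        });
}
```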
High Level Overview of Change
If any `[validator_list_keys]` are not available after all `[validator_list_sites]` have had a chance to be queried, then fall back to loading cache files. Currently, cache files are only used if no sites are defined, or if the request to one of them has an error. That does not cover cases where not enough sites are defined, or where a site returns an invalid VL (or something else entirely).

Context of Change
Validator list cache files are only used if the request to a site fails. That doesn't cover enough possible cases.
Type of Change
API Impact
None
Before / After
Makes the following changes:
- If a request times out (`ValidatorSite::onRequestTimeout`), and there is not yet a `lastRefreshStatus`, it sets one indicating the timeout.
- `ValidatorSite::setTimer`, which determines which request to send next, calls `missingSite` if all of the sites have a `lastRefreshStatus`.
- `missingSite` is unchanged. It calls `ValidatorList::loadLists`, which returns the cache file names for all lists (`[validator_list_keys]`) which are unavailable and for which a cache file exists (see the sketch after this list). Those file names are then passed to `ValidatorSite::load`, which adds them to the `sites_` list. Because those "sites" are new, they will be tried next.
- `ValidatorSite::load` checks for duplicate URIs. Since it's a `vector`, it's not very efficient, but this function is only called at startup and via `missingSites`, so it won't be called often, and not in any critical path.
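To make that fallback flow concrete, here is a rough, self-contained sketch under simplified assumptions; the struct, field names, and file path below are hypothetical stand-ins, not the actual rippled types:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the per-publisher state tracked by ValidatorList.
struct PublisherListState
{
    std::string pubKey;            // one entry per [validator_list_keys] key
    bool available = false;        // do we currently hold a usable VL for it?
    std::string cacheFile;         // e.g. "<database_path>/cache.<key>" (illustrative)
    bool cacheFileExists = false;  // does that file exist on disk?
};

// Sketch of what loadLists() contributes to the fallback: the cache file
// names for every key that is still unavailable but has a cached copy.
std::vector<std::string>
cacheFilesForMissingKeys(std::vector<PublisherListState> const& lists)
{
    std::vector<std::string> files;
    for (auto const& list : lists)
        if (!list.available && list.cacheFileExists)
            files.push_back(list.cacheFile);
    return files;
}
```

`missingSite` would then hand those file names to `ValidatorSite::load`, so the cached copies are queued and fetched like any other configured site.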
Test Plan
Reproduce the scenario from #5320. Verify that the UNL becomes available within a few seconds of startup.