8000 Use Validator List (VL) cache files in more scenarios by ximinez · Pull Request #5323 · XRPLF/rippled · GitHub

Use Validator List (VL) cache files in more scenarios #5323

Open · wants to merge 30 commits into develop

Conversation

@ximinez (Collaborator) commented Feb 26, 2025

High Level Overview of Change

  • If any [validator_list_keys] are not available after all [validator_list_sites] have had a chance to be queried, then fall back to loading cache files. Currently, cache files are only used if no sites are defined, or the request to one of them has an error. It does not include cases where not enough sites are defined, or if a site returns an invalid VL (or something else entirely).
  • Resolves VL cache files are only used if an endpoint is unreachable #5320

Context of Change

Validator list cache files are currently only used if the request to a site fails, which does not cover all of the cases where a cached list is needed.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

API Impact

None

Before / After

Makes the following changes:

  1. If a validator site request times out (ValidatorSite::onRequestTimeout) and there is not yet a lastRefreshStatus, it sets one indicating the timeout.
  2. In ValidatorSite::setTimer, which determines which request to send next, if all of the sites have a lastRefreshStatus, it calls missingSite (see the sketch after this list).
    • missingSite is unchanged. It calls ValidatorList::loadLists, which returns the cache file names for every list ([validator_list_keys]) that is unavailable and for which a cache file exists. Those file names are then passed to ValidatorSite::load, which adds them to the sites_ list. Because those "sites" are new, they will be tried next.
  3. ValidatorSite::load checks for duplicate URIs. Since sites_ is a vector, the check is not very efficient, but this function is only called at startup and via missingSite, so it won't run often, and never in a critical path.
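
A minimal sketch of steps 1 and 2, using assumed stand-in types rather than the actual rippled classes (Site, onRequestTimeout, and maybeFallBackToCache here are simplifications for illustration only):

#include <algorithm>
#include <functional>
#include <optional>
#include <string>
#include <vector>

struct Site
{
    std::string uri;
    // Unset until the site has been fetched, has errored, or has timed out.
    std::optional<std::string> lastRefreshStatus;
};

// Step 1: a timed-out request records a status, so the site now counts
// as having had its chance.
void
onRequestTimeout(Site& site)
{
    if (!site.lastRefreshStatus)
        site.lastRefreshStatus = "timeout";
}

// Step 2: once every configured site has reported something, invoke the
// cache-file fallback. The real missingSite() (via
// ValidatorList::loadLists) works out which keys still lack a VL and
// appends file:// "sites" for their cache files.
void
maybeFallBackToCache(
    std::vector<Site> const& sites,
    std::function<void()> const& missingSite)
{
    bool const allReported = !sites.empty() &&
        std::all_of(sites.begin(), sites.end(), [](Site const& s) {
            return s.lastRefreshStatus.has_value();
        });
    if (allReported)
        missingSite();
}

The key behavioral difference is that the fallback now triggers once every site has reported some status, not only when a request errors out.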

Test Plan

Reproduce the scenario from #5320. Verify that the UNL becomes available within a few seconds of startup.

codecov bot commented Feb 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.1%. Comparing base (9874d47) to head (9e92413).

Additional details and impacted files


@@           Coverage Diff           @@
##           develop   #5323   +/-   ##
=======================================
  Coverage     79.1%   79.1%           
=======================================
  Files          816     816           
  Lines        71622   71632   +10     
  Branches      8237    8236    -1     
=======================================
+ Hits         56644   56652    +8     
- Misses       14978   14980    +2     
Files with missing lines | Coverage Δ
src/xrpld/app/misc/detail/ValidatorSite.cpp | 93.9% <100.0%> (+1.4%) ⬆️

... and 5 files with indirect coverage changes


@ximinez (Collaborator, Author) commented Feb 26, 2025

I updated my validators.txt to include:

[validator_list_sites]
# https://vl.ripple.com/
https://vl.xrplf.org/
https://vl.xrplf.org/

[validator_list_keys]
# Ripple
ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734
# XRPLF
ED45D1840EE724BE327ABE9146503D5848EFD5F38B6D5FEDE71E80ACCE5E6E738B

Notice that it includes the keys for both Ripple and XRPLF, but the URL for Ripple is commented out, and the URL for XRPLF is duplicated.

$ rippled -q validator_list_sites
{
   "result" : {
      "status" : "success",
      "validator_sites" : [
         {
            "last_refresh_status" : "same_sequence",
            "last_refresh_time" : "2025-Feb-26 22:40:44.2977547 UTC",
            "next_refresh_time" : "2025-Feb-26 22:45:43.7251469 UTC",
            "refresh_interval_min" : 5,
            "uri" : "https://vl.xrplf.org/"
         },
         {
            "last_refresh_status" : "accepted",
            "last_refresh_time" : "2025-Feb-26 22:25:44.4012575 UTC",
            "next_refresh_time" : "2025-Feb-27 22:25:44.4017257 UTC",
            "refresh_interval_min" : 1440,
            "uri" : "file://[...]/cache.ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734"
         }
      ]
   }
}

Note that there are two sites listed here: XRPLF, and the local cache.

$ rippled -q validators
{
   "result" : {
      "publisher_lists" : [
         {
            "available" : true,
            "expiration" : "2026-Jan-17 10:09:21.0000000 UTC",
            "list" : [
               [...]
            ],
            "pubkey_publisher" : "ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734",
            "seq" : 81,
            "uri" : "file://[...]/cache.ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734",
            "version" : 2
         },
         {
            "available" : true,
            "expiration" : "2026-Jan-18 00:00:00.0000000 UTC",
            "list" : [
               [...]
            ],
            "pubkey_publisher" : "ED45D1840EE724BE327ABE9146503D5848EFD5F38B6D5FEDE71E80ACCE5E6E738B",
            "seq" : 2025011701,
            "uri" : "https://vl.xrplf.org/",
            "version" : 1
         },
[...]

Note that both lists are available, and that each reports the same uri shown in validator_list_sites (which will not always be the case).

@ximinez ximinez requested review from vlntb and ckeshava February 28, 2025 18:43
…dator-cache

* upstream/develop:
  chore: Move "assert" and "werr" flags from "actions/build" (5325)
  Log detailed correlated consensus data together (5302)
@ckeshava (Collaborator) commented Mar 7, 2025

Notice that it includes the keys for both Ripple and XRPLF, but the URL for Ripple is commented out, and the URL for XRPLF is duplicated.

What is the scenario that is tested here? Ripple's publisher list was cached at a previous point in time. Presently, that publisher's site is unavailable (its URL is commented out in the config file). With your changes, the code falls back to the cache to load the ValidatorList. Is this an accurate description of the PR?

In what other situations are you going to be using the cache?

…dator-cache

* upstream/develop:
  Set version to 2.4.0
  Set version to 2.4.0-rc4
  chore: Update XRPL Foundation Validator List URL (5326)
@ximinez (Collaborator, Author) commented Mar 11, 2025

What is the scenario that is tested here?

It's demonstrating two things:

  1. That the [validator_list_sites] and [validator_list_keys] are independent. This was already true.
  2. That if all the sites included in [validator_list_sites] are queried and any of the [validator_list_keys] still do not have a valid UNL available, the cache file will be loaded. As noted in the PR description, the original behavior will not load the cache in that scenario. "Currently, cache files are only used if no sites are defined, or the request to one of them has an error. It does not include cases where not enough sites are defined, or if a site returns an invalid VL (or something else entirely)."

Ripple's publisher list was cached at a previous point in time. Presently, that publisher's site is unavailable (its URL is commented out in the config file). With your changes, the code falls back to the cache to load the ValidatorList. Is this an accurate description of the PR?

Yep.

In what other situations are you going to be using the cache?

"Currently, cache files are only used if no sites are defined, or the request to one of them has an error."

ximinez added 21 commits March 11, 2025 19:52
…dator-cache

* upstream/develop:
  refactor: Remove unused and add missing includes (5293)
@ximinez ximinez requested a review from a team as a code owner May 14, 2025 14:07
@bthomee bthomee requested review from a1q123456 and removed request for vlntb June 6, 2025 15:12
    return site.loadedResource->uri == uri;
});
if (!found)
    sites_.emplace_back(uri);
@a1q123456 (Collaborator) commented Jun 10, 2025

Do we want to use a set or an unordered_set instead?

@ximinez (Collaborator, Author) replied:

Possibly. The main reason I didn't is that there are several places that access individual items in sites_ using the vector index (i.e. operator[]). See, for example, ValidatorSite::makeRequest.

It could be rewritten to use a set-based class and some other lookup method, but as I mentioned in the PR description, I decided it wasn't worth it because:

ValidatorSite::load checks for duplicate URIs. Since it's a vector, it's not very efficient, but this function is only called at startup and via missingSites, so it won't be called often, and not in any critical path.

This saved some effort (i.e., I am lazy) and kept the PR small and easier to reason about.
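
For illustration, the vector-based duplicate check amounts to something like the following sketch (simplified types; the real code compares site.loadedResource->uri, as shown in the diff above):

#include <algorithm>
#include <string>
#include <vector>

struct Site
{
    std::string uri;
    explicit Site(std::string u) : uri(std::move(u))
    {
    }
};

// Linear scan: O(n) per insertion, but load() only runs at startup and
// from missingSite(), so n stays small and this is never on a hot path.
void
addSiteIfNew(std::vector<Site>& sites, std::string const& uri)
{
    bool const found = std::any_of(
        sites.begin(), sites.end(), [&uri](Site const& site) {
            return site.uri == uri;
        });
    if (!found)
        sites.emplace_back(uri);
}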

@@ -210,6 +215,17 @@ ValidatorSite::setTimer(
std::lock_guard<std::mutex> const& site_lock,
std::lock_guard<std::mutex> const& state_lock)
{
if (!sites_.empty() && //
@a1q123456 (Collaborator) commented Jun 10, 2025

Would it be better if we call onError when we can't verify the VL or when there's something wrong with the json so that we could catch more cases and potentially log something?

@ximinez (Collaborator, Author) replied:

Would it be better if we call onError when we can't verify the VL or when there's something wrong with the json so that we could catch more cases and potentially log something?

onError is a lambda defined in ValidatorSite::onSiteFetch, and it is called if there's something wrong with the VL, because ValidatorSite::parseJsonResponse throws exceptions, which are handled by calling onError.

The point of the new check here in ValidatorSite::setTimer in particular is to handle the case where there were no problems with the VL downloads from [validator_list_sites], but we still don't have VLs for all the keys defined in [validator_list_keys].

That can be possible because [...sites] and [...keys] are completely independent.

I could configure the sites list to request from multiple URLs that mirror the same VL. Someone could set up an aggregator that returns multiple VLs from different publishers, and I could use that. I could define a key whose publisher doesn't have a site configured at all, relying instead on propagation through the P2P network. I simulated that last case above by commenting out one of the URLs in the config file.
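
As a purely hypothetical config illustrating those cases (the aggregator URL is invented for the example; the keys are the Ripple and XRPLF publisher keys from the validators.txt shown earlier):

[validator_list_sites]
# A single aggregator URL could serve VLs from several publishers.
https://vl.example-aggregator.test/

[validator_list_keys]
# Ripple: expected from the aggregator above.
ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734
# XRPLF: no dedicated site configured; relies on a cache file or on
# propagation through the P2P network.
ED45D1840EE724BE327ABE9146503D5848EFD5F38B6D5FEDE71E80ACCE5E6E738B

With this PR, once the aggregator has been queried, any key that still lacks a valid VL falls back to its cache file.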
