Use Validator List (VL) cache files in more scenarios #5323
base: develop
Conversation
If any `[validator_list_keys]` are not available after all `[validator_list_sites]` have had a chance to be queried, then fall back to loading cache files. Currently, cache files are only used if no sites are defined, or if the request to one of them has an error. That does not cover cases where not enough sites are defined, or where a site returns an invalid VL (or something else entirely).

Resolves #5320
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff            @@
##           develop   #5323    +/-  ##
=======================================
  Coverage     79.1%   79.1%
=======================================
  Files          816     816
  Lines        71622   71632     +10
  Branches      8237    8236      -1
=======================================
+ Hits         56644   56652      +8
- Misses       14978   14980      +2
I updated my config. Notice that it includes the keys for both Ripple and XRPLF, but the URL for Ripple is commented out, and the URL for XRPLF is duplicated.
Note that there are two sites listed here: XRPLF, and the local cache.
Note that both lists are available, and that both indicate the same
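For reference, a hypothetical rippled.cfg excerpt matching the description above; the URLs are the well-known publisher sites, but the keys shown are placeholders, not the values actually used in this test:

```ini
# Hypothetical excerpt only: keys are placeholders, not real publisher keys.
[validator_list_sites]
https://vl.xrplf.org
https://vl.xrplf.org
# https://vl.ripple.com

[validator_list_keys]
ED0000000000000000000000000000000000000000000000000000000000000001
ED0000000000000000000000000000000000000000000000000000000000000002
```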
Merged upstream/develop into …dator-cache:
* chore: Move "assert" and "werr" flags from "actions/build" (5325)
* Log detailed correlated consensus data together (5302)
What is the scenario being tested here? Ripple's publisher list was cached at some previous point in time. Presently, that publisher list is unavailable (because its URL is commented out in the config file), so your change falls back to the cache to load the ValidatorList. Is this an accurate description of the PR? In what other situations will the cache be used?
Merged upstream/develop into …dator-cache:
* Set version to 2.4.0
* Set version to 2.4.0-rc4
* chore: Update XRPL Foundation Validator List URL (5326)
It's demonstrating two things:
Yep.
"Currently, cache files are only used if no sites are defined, or the request to one of them has an error." |
Merged upstream/develop into …dator-cache:
* refactor: Remove unused and add missing includes (5293)
            return site.loadedResource->uri == uri;
        });
    if (!found)
        sites_.emplace_back(uri);
Do we want to use a `set` or an `unordered_set` instead?
Possibly. The main reason I didn't is that there are several places that access individual items in `sites_` using the vector index (i.e. `operator[]`). See, for example, `ValidatorSite::makeRequest`.

It could be rewritten to use a `set`-based class and some other lookup method, but as I mentioned in the PR description, I decided it wasn't worth it because:

> `ValidatorSite::load` checks for duplicate URIs. Since it's a `vector`, it's not very efficient, but this function is only called at startup and via `missingSites`, so it won't be called often, and not in any critical path.

This saved some effort (e.g. I am lazy), and kept the PR small and easier to reason about.
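For illustration, a minimal compilable sketch of that vector-based duplicate check; the `Site` type and `load` signature here are simplified stand-ins, not the real rippled declarations:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Simplified stand-in for rippled's Site; only the URI matters here.
struct Site
{
    std::string uri;
};

class ValidatorSiteSketch
{
    std::vector<Site> sites_;

public:
    void
    load(std::vector<std::string> const& siteURIs)
    {
        for (auto const& uri : siteURIs)
        {
            // Linear scan: O(n) per URI, but load() only runs at startup
            // and from missingSite(), so it is never on a critical path.
            bool const found = std::any_of(
                sites_.begin(), sites_.end(), [&uri](Site const& site) {
                    return site.uri == uri;
                });
            if (!found)
                sites_.emplace_back(Site{uri});
        }
    }
};
```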
@@ -210,6 +215,17 @@ ValidatorSite::setTimer(
    std::lock_guard<std::mutex> const& site_lock,
    std::lock_guard<std::mutex> const& state_lock)
{
    if (!sites_.empty() &&  //
Would it be better if we call `onError` when we can't verify the VL, or when there's something wrong with the JSON, so that we could catch more cases and potentially log something?
> Would it be better if we call `onError` when we can't verify the VL, or when there's something wrong with the JSON, so that we could catch more cases and potentially log something?
`onError` is a lambda defined in `ValidatorSite::onSiteFetch`, and it is already called if there's something wrong with the VL, because `ValidatorSite::parseJsonResponse` throws exceptions, which are handled by calling `onError`.

The point of the new check here in `ValidatorSite::setTimer` in particular is to handle the case where there were no problems with the VL downloads from `[validator_list_sites]`, but we still don't have VLs for all the keys defined in `[validator_list_keys]`. That is possible because `[...sites]` and `[...keys]` are completely independent.

I could configure the sites list to request from multiple URLs that mirror the same VL. Someone could set up an aggregator that returns multiple VLs from different publishers, and I could use that. I could define a key whose publisher doesn't have a site publishing their VL at all, and instead relies on propagation through the P2P network. I simulated that last example above, where I commented out one of the URLs in the config file.
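To make that check concrete, here is a rough, self-contained sketch of the kind of condition added before scheduling the next fetch; the type, field, and function names are simplified assumptions, not the actual diff:

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for rippled's Site; only the field relevant here.
struct Site
{
    std::string uri;
    std::optional<int> lastRefreshStatus;  // set after a success, error, or timeout
};

// Sketch of the new decision in setTimer: once every configured site has had
// its chance (i.e. all have a lastRefreshStatus), trigger missingSite(),
// which maps still-unavailable [validator_list_keys] to cache files and adds
// those files to sites_ so they are fetched next.
bool
allSitesHaveBeenTried(std::vector<Site> const& sites)
{
    return !sites.empty() &&
        std::all_of(sites.begin(), sites.end(), [](Site const& site) {
            return site.lastRefreshStatus.has_value();
        });
}
```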
High Level Overview of Change
If any `[validator_list_keys]` are not available after all `[validator_list_sites]` have had a chance to be queried, then fall back to loading cache files. Currently, cache files are only used if no sites are defined, or if the request to one of them has an error. That does not cover cases where not enough sites are defined, or where a site returns an invalid VL (or something else entirely).

Context of Change
Validator list cache files are only used if the request to a site fails. That doesn't cover enough possible cases.
Type of Change
API Impact
None
Before / After
Makes the following changes:
- If a request times out (`ValidatorSite::onRequestTimeout`), and there is not yet a `lastRefreshStatus`, it sets one indicating the timeout.
- `ValidatorSite::setTimer`, which determines which request to send next, calls `missingSite` if all of the sites have a `lastRefreshStatus`.
- `missingSite` is unchanged. It calls `ValidatorList::loadLists`, which returns the cache file names for all lists (`[validator_list_keys]`) which are unavailable and for which a cache file exists (see the sketch after this list). Those file names are then passed to `ValidatorSite::load`, which adds them to the `sites_` list. Because those "sites" are new, they will be tried next.
- `ValidatorSite::load` checks for duplicate URIs. Since it's a `vector`, it's not very efficient, but this function is only called at startup and via `missingSites`, so it won't be called often, and not in any critical path.
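To make that fallback flow concrete, here is a rough, self-contained sketch under simplified assumptions; the struct, field names, and file path below are hypothetical stand-ins, not the actual rippled types:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the per-publisher state tracked by ValidatorList.
struct PublisherListState
{
    std::string pubKey;            // one entry per [validator_list_keys] key
    bool available = false;        // do we currently hold a usable VL for it?
    std::string cacheFile;         // e.g. "<database_path>/cache.<key>" (illustrative)
    bool cacheFileExists = false;  // does that file exist on disk?
};

// Sketch of what loadLists() contributes to the fallback: the cache file
// names for every key that is still unavailable but has a cached copy.
std::vector<std::string>
cacheFilesForMissingKeys(std::vector<PublisherListState> const& lists)
{
    std::vector<std::string> files;
    for (auto const& list : lists)
        if (!list.available && list.cacheFileExists)
            files.push_back(list.cacheFile);
    return files;
}
```

`missingSite` would then hand those file names to `ValidatorSite::load`, so the cached copies are queued and fetched like any other configured site.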
Test Plan
Reproduce the scenario from #5320. Verify that the UNL becomes available within a few seconds of startup.