Early logging improvements #30386

ajtowns · 2024-07-03T16:33:00Z

In order to cope gracefully with Log*() calls that are invoked prior to logging being fully configured (indicated by calling StartLogging() we buffer early log messages in m_msgs_before_open. This has a couple of minor issues:

if there are many such log messages the buffer can become arbitrarily large; this can be a problem for users of libkernel that might not wish to worry about logging at all, and as a result never invoke StartLogging()
early log messages are formatted before the formatting options are configured, leading to inconsistent output

Fix those issues by buffering the log info prior to formatting it, and setting a limit on the size of the buffer (dropping the oldest lines, and reporting the number of lines skipped).

Also adds some thread safety annotations, and the ability to invoke LogInstance().DisableLogging() if you want to disable logging entirely, for a minor efficiency improvement.

DrahtBot · 2024-07-03T16:33:03Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	maflcko, ryanofsky, TheCharlatan

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#30343 (wallet, logging: Replace WalletLogPrintf() with LogInfo() by ryanofsky)
#30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)
#29798 (Logging cleanup by vasild)
#29256 (Improve new LogDebug/Trace/Info/Warning/Error Macros by ryanofsky)
#26697 (logging: use bitset for categories by LarryRuane)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

ajtowns · 2024-07-03T17:25:00Z

Easiest way to test is to add some log lines into init.cpp InitLogging(), eg:

    for (int i = 0; i < 10000; ++i) {
        LogInfo("log spam %d...\n", i);
    }
    LogWarning("awoogah\n");
    LogError("internal bug not detected\nhello, world!\nwas that fun?\n");

Running bitcoind -regtest -logtimemicros should give a decent indication of the differences.

If you want to modify bitcoin-chainstate to see what the logs would look like, remove the DisableLogging call and add:

LogInstance().m_print_to_console = true;
LogInstance().StartLogging()

somewhere.

DrahtBot · 2024-07-03T17:58:37Z

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

_{Debug: https://github.com/bitcoin/bitcoin/runs/27006383028}

ryanofsky

Code review ACK 0456828. Left some suggestions, but they are not important

src/bitcoin-chainstate.cpp

src/logging.cpp

ryanofsky

Code review ACK f3632d5. I left some suggestions, but feel free to ignore them. Overall I think this PR makes early logging behavior a lot better by making the log format consistent, and it also makes nice API improvements like avoiding runaway memory usage, and switching to std::string_view.

ryanofsky · 2024-07-18T14:12:18Z

src/logging.h

+         * Mostly intended for libbitcoin-kernel apps that don't want any logging.
+         * Should be used instead of StartLogging().
+         */
+        void DisableLogging() EXCLUSIVE_LOCKS_REQUIRED(!m_cs);


In commit "logging: Add DisableLogging()" (0b1960f)

I think it's a good idea to add this DisableLogging() method for clarity, even though in practice it's equivalent to just calling StartLogging() without setting any logging options.

But I think in the future it would be nice to simplify the log interface and replace StartLogging() DisableLogging() and DisconnectTestLogger() methods with a single SetOptions() method:

struct BufferOptions { size_t max_bytes{1000000}; }; struct OutputOptions { fs::path print_to_file{}; bool print_to_console{false}; bool log_timestamps{true}; bool log_time_micros{false}; bool log_threadnames{false}; bool log_sourcelocations{false}; bool always_print_category_level{false}; }; using Options = std::variant<BufferOptions, OutputOptions>; Options m_options{}; void SetOptions(Options options);

The default behavior, as currently, would be to buffer log messages, but it would be easy to set default output options to disable logging, or set custom output options to enable it.

I'd like to change Enabled() and WillLogCategoryLevel() functions to be lockless so that calls to Log*() don't block when they have nothing to do. I think the above changes would mesh well with that.

ryanofsky · 2024-07-18T14:34:02Z

src/logging.cpp

-            if (m_msgs_before_open.empty()) {
-                m_cur_buffer_memusage = 0;
-                break;
+        if (!starts_new_line && !m_msgs_before_open.empty()) {


In commit "logging: Apply formatting to early log messages" (a67baea)

I think it might make sense to explicitly handle the case where starts_new_line is false and m_msgs_before_open is empty. Right now the code will cut off the first part of the log line but include the rest of the message, which could lead to confusing or misleading output. Better options might be to add a ... [truncated] prefix to the line, or to discard the whole message.

Added a [...] marker, since it was easy.

In order to trigger it you'd need to have 1MB of log message on a single, incomplete line, which seems a bit ridiculous. Long-term, I think it would make more sense to stop intentionally doing incomplete lines (since that introduces a race condition if other threads are also logging), and just add a newline if it's missing... shrug

Long-term, I think it would make more sense to stop intentionally doing incomplete lines (since that introduces a race condition if other threads are also logging), and just add a newline if it's missing... shrug

Fixed in #30485 (but not yet removing the m_started_new_line to avoid a merge conflict here). I'll do it in a follow-up, after this one is merged, or if I get the OK to do it before this one is merged.

In any case, according to the coverage report, all of this is dead code, so shrug . Shouldn't matter much either way.

Long-term, I think it would make more sense to stop intentionally doing incomplete lines (since that introduces a race condition if other threads are also logging), and just add a newline if it's missing... shrug

Fixed in #30485 (but not yet removing the m_started_new_line to avoid a merge conflict here). I'll do it in a follow-up

Removed the dead m_started_new_line in #30929

src/logging.cpp

maflcko

ACK f3632d5 🚲

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: ACK f3632d53a5be7efe61be8181ff481d30f7313bd4 🚲
Npt3Z0ZvZ7d1vkjP+I6WoSxmYlnR8J954hzJEEVA5Y1ogrEp3GDcMINyxhP04yvRf/5iWhedn+y9a9jxwtHmAw==

maflcko · 2024-07-18T12:39:55Z

src/logging.cpp

+    {
+        StdLockGuard scoped_lock(m_cs);
+        assert(m_buffering);
+        assert(m_print_callbacks.empty());


q in 0b1960f: Any reason to assert this but not the that the other m_print_to_* fields are false? What is the difference of a developer writing PushBackCallback(...); DisableLogging() to crash the program vs a developer writing m_print_to_console=true;DisableLogging() to achieve the same?

The other things can be safely set to false, so we can be sure they're false without an assertion. But if you've set a callback you've also gotten an iterator that you might intend to pass to DeleteCallback later, which would then crash if we had cleared m_print_callbacks here.

Fair enough, but I am mostly thinking what would be the use case for possibly confusing code like m_print_to_console=true;DisableLogging()? Given that this function also assert(m_buffering), the code can only be called when logging was never active, I'd still say it makes sense to at least Assume(!m_print_to*).

src/logging.h

src/logging.cpp

Log messages created prior to StartLogging() being called go into a buffer. Enforce a limit on the size of this buffer.

The formatting of log messages isn't defined until StartLogging() is called; so can't be correctly applied to early log messages from prior to that call. Instead of saving the output log message, save the inputs to the logging invocation (including time, mocktime and thread name), and format those inputs into a log message when StartLogging() is called.

DrahtBot · 2024-07-19T04:50:00Z

🚧 At least one of the CI tasks failed.
_{Debug: https://github.com/bitcoin/bitcoin/runs/27646993578}

Hints

Make sure to run all tests locally, according to the documentation.

The failure may happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

src/logging.cpp

maflcko · 2024-07-19T05:36:30Z

lgtm, after a CI fix.

re-ACK 1cd8386 🏡

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: re-ACK 1cd8386a18db0ced6da0c5f42e952e81fa02867c  🏡
8ysDmnHpYk3ulyRBsxRoXiqBgPnKiwl54SZM1p6EqcYvQ0Pl39Qmk3M0fuBdFyBflutT4xSMMavOyAe9wX4FCQ==

maflcko · 2024-07-19T07:38:05Z

re-ACK b4dd7ab 🕴

Show signature

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: re-ACK b4dd7ab43e8cfc2c171f67588e4e1ec2705393c2 🕴
jM/rPHAaIwZvm8W2wJGcZTGYV/m+3AI+tcbtSXChUzTig/qQoVfzFh/aNMuy08785DW+EVKwt5nlRn3W2mrOAw==

ryanofsky

Code review ACK b4dd7ab

Main change since last review is fixing an escaping bug #30386 (comment), but also includes a few other review suggestions

src/logging.cpp

TheCharlatan

Nice, ACK b4dd7ab

TheCharlatan · 2024-07-24T10:20:06Z

src/logging.cpp

+    m_cur_buffer_memusage = 0;
+    m_buffer_lines_discarded = 0;
+    m_msgs_before_open.clear();
+


Nit: Unneeded newline.

TheCharlatan · 2024-07-24T11:46:47Z

src/logging.cpp

@@ -94,6 +99,11 @@ void BCLog::Logger::DisconnectTestLogger()
    if (m_fileout != nullptr) fclose(m_fileout);
    m_fileout = nullptr;
    m_print_callbacks.clear();
+    m_max_buffer_memusage = DEFAULT_MAX_LOG_BUFFER;


Nit: Why is this being set again? I'm guessing it is not declared as const to future-proof it against introducing an option controlling its size, but it is not clear why disconnecting should clear such a value again.

This is for tests only, where each unit test uses the same global logger, so resetting the global state before each unit test is useful.

maflcko · 2024-07-26T08:25:42Z

Should be good to merge with 3 acks, or is it waiting on more review, or something else?

fa18fc7 log: Remove NOLINT(bitcoin-unterminated-logprintf) (MarcoFalke) Pull request description: `NOLINT(bitcoin-unterminated-logprintf)` is used to document a missing trailing `\n` char in the format string. This has many issues: * It is just documentation, assuming that a trailing `\n` ends up in the formatted string. It is not enforced at compile-time, so it is brittle. * If the newline was truly missing and `NOLINT(bitcoin-unterminated-logprintf)` were used to document a "continued" line, the log stream would be racy/corrupt, because any other thread may inject a log message in the meantime. * If the newline was accidentally missing, nothing is there to correct the mistake. * The intention of all code is to always end a log line with a new line. However, historic code exists to deal with the case where the new line was missing (`m_started_new_line`). This is problematic, because the presumed dead code has to be maintained (#30386 (comment)). Fix almost all issues by removing the `NOLINT(bitcoin-unterminated-logprintf)`, ensuring that a new line is always present. A follow-up will remove the dead logging code. ACKs for top commit: TheCharlatan: ACK fa18fc7 ryanofsky: Code review ACK fa18fc7 Tree-SHA512: bf8a83723cca84e21187658edc19612da79c34f7ef2e1f6e9353e7ba70e4ecc0a878a2ae32290045fb90cba9a44451e35341a36ef2ec1169d13592393aa4a8ca

fa2b7d8 Remove redundant unterminated-logprintf tidy check (MarcoFalke) bbbb2e4 log: Enforce trailing newline, Remove redundant m_started_new_line (MarcoFalke) Pull request description: There are many problems around missing a trailing newline while logging: * All log lines are currently terminated by a trailing newline. This means any runtime code trying to handle a "missing" newline is currently dead code. * Leaving a line unterminated is racy and can cause content corruption by mixing log lines from different sources. * It requires extra code like `m_started_new_line` to keep track of, which is annoying and pointless to maintain, because it is currently dead code, see #30386 (comment). * It requires a standalone `unterminated-logprintf` clang-tidy plugin, which is unmaintained (no one updated it for the new log function names), probably harder to maintain than normal C++ code (because it requires clang AST matcher knowledge), brittle (it can fail to detect issues at any time, if it goes out-of-sync, or be explicitly disabled via `NOLINT`), and annoying for devs (it is slow and intricate to run locally and thus only effectively run on CI or via the CI scripts). Fix all issues by enforcing the trailing newline in logs directly in the code. Then remove all the other stuff. This refactor does not change behavior. ACKs for top commit: stickies-v: re-ACK fa2b7d8 achow101: ACK fa2b7d8 ryanofsky: Code review ACK fa2b7d8. Just comment and test cleanup since last review Tree-SHA512: 10ed420f6c2fdb0f491d6c880be8dd2e8beef628f510adebadf4c3849d9f5e28906519d5cbaeb295f4c7c1b07c4c88a9905b3cfe30fee3a2c91ac9fd24ae6755

logging: Add thread safety annotations

6bbc2dd

ajtowns force-pushed the 202407-early-log branch from fabbde6 to 64725d9 Compare July 3, 2024 16:39

ajtowns mentioned this pull request Jul 3, 2024

RFC: Instanced logs #30338

Closed

ajtowns mentioned this pull request Jul 3, 2024

202406 early logs ajtowns/bitcoin#9

Closed

ajtowns force-pushed the 202407-early-log branch from 64725d9 to 0456828 Compare July 3, 2024 17:58

DrahtBot added the CI failed label Jul 3, 2024

DrahtBot mentioned this pull request Jul 4, 2024

Logging cleanup #29798

Merged

maflcko removed the CI failed label Jul 4, 2024

This was referenced Jul 4, 2024

logging: use bitset for categories #26697

Closed

kernel, logging: Pass Logger instances to kernel objects #30342

Draft

ryanofsky approved these changes Jul 10, 2024

View reviewed changes

src/bitcoin-chainstate.cpp Show resolved Hide resolved

src/logging.cpp Outdated Show resolved Hide resolved

src/logging.cpp Show resolved Hide resolved

src/logging.cpp Outdated Show resolved Hide resolved

logging: Add DisableLogging()

0b1960f

ajtowns force-pushed the 202407-early-log branch from 0456828 to f3632d5 Compare July 15, 2024 04:33

This was referenced Jul 16, 2024

log, refactor: Allow log macros to accept context arguments #29256

Open

wallet, logging: Replace WalletLogPrintf() with LogInfo() #30343

Draft

ryanofsky approved these changes Jul 18, 2024

View reviewed changes

maflcko approved these changes Jul 18, 2024

View reviewed changes

ajtowns added 2 commits July 19, 2024 12:41

logging: Limit early logging buffer

6cf9b34

Log messages created prior to StartLogging() being called go into a buffer. Enforce a limit on the size of this buffer.

ajtowns force-pushed the 202407-early-log branch from f3632d5 to 1cd8386 Compare July 19, 2024 03:35

DrahtBot added the CI failed label Jul 19, 2024

maflcko reviewed Jul 19, 2024

View reviewed changes

src/logging.cpp Outdated Show resolved Hide resolved

maflcko reviewed Jul 19, 2024

View reviewed changes

src/logging.cpp Outdated Show resolved Hide resolved

DrahtBot requested a review from ryanofsky July 19, 2024 05:36

logging: use std::string_view

b4dd7ab

ajtowns force-pushed the 202407-early-log branch from 1cd8386 to b4dd7ab Compare July 19, 2024 05:44

DrahtBot removed the CI failed label Jul 19, 2024

maflcko mentioned this pull request Jul 19, 2024

log: Remove NOLINT(bitcoin-unterminated-logprintf) #30485

Merged

ryanofsky approved these changes Jul 22, 2024

View reviewed changes

src/logging.cpp Show resolved Hide resolved

ryanofsky mentioned this pull request Jul 23, 2024

fix: Make TxidFromString() respect string_view length #30436

Merged

TheCharlatan approved these changes Jul 24, 2024

View reviewed changes

ryanofsky merged commit 30cef53 into bitcoin:master Jul 26, 2024
16 checks passed

maflcko mentioned this pull request Sep 19, 2024

log: Enforce trailing newline #30929

Merged

Early logging improvements #30386

Early logging improvements #30386

Uh oh!

Conversation

Uh oh!

Uh oh!

Code Coverage

Reviews

Conflicts

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh! 8000

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!