feat: TruncateFront & TruncateBack support truncating all entries #31

RangerCD · 2025-02-17T07:21:16Z

Currently, it seems like there is no way to truncate entire log, which causes the duplicate transaction issue (#20).

This PR provides the ability of truncating all entries, by extending the index range 1 step further than the last(TruncateFront) or 1 step before the first(TruncateBack) entry.

For example, if the log has 3 entries[10, 11, 12]:

TruncateFront(13) will truncate all entries, and left firstIndex as 13 and lastIndex as 12.
TruncateBack(9) will truncate all entries, and left firstIndex as 10 and lastIndex as 9.

Why this feature is needed

While I was working on a project that uses this package as WAL implementation, I've met the situation that described in #20. I can't simply mark all jobs as done if the program exits unexpectedly, and next time it starts, it will duplicate the last job. So I need a way to remove this last job from wal once it's done.

With this PR, I can call TruncateFront(index + 1) when I've popped indexth job from wal, and let the program wait for incoming index + 1th job.

What's the difference between #29 and this PR

Thanks to @DStrand1 for providing a solution. However, #29 can't fulfill my requirements, by providing a Clear function.

Complexity of usage

In my use case, I'm running a long-lived goroutine as a consumer(Read), and a gin server as producers(Write).

With Clear from #29, after the consumer finishes all jobs, it will first detect whether wal is empty (by either calling Read(index + 1) with ErrNotFound returned, or checking a seperate counter), and then call Clear to remove all entries. The problem is that, these two steps are not atomic, so I have to add another lock to protect them which is complicated.

With this PR, both TruncateFront and TruncateBack hold internal lock, so they are atomic. And caller don't need to care about how many jobs left in wal, just call Read and TruncateFront in a loop until Read returns ErrNotFound, then wait for incoming jobs.

Corruption-safty

Clear function in #29 isn't corruption-safe, like #29 (comment) pointed out.

This PR modifies TruncateFront and TruncateBack, following the original corruption-safe mechanism.

Index continuity

#29 clears all entries, also resets firstIndex and lastIndex to 1 and 0. Both Write and Read have to reset its own index counter after Clear.

This PR keeps firstIndex and lastIndex continuous, Write and Read can keep using previous index counter.

tidwall · 2025-03-16T14:22:30Z

This looks good on the surface.
We'll need some tests to ensure correctness.

RangerCD · 2025-03-16T14:38:38Z

This looks good on the surface. We'll need some tests to ensure correctness.

I'll add some tests, might also adjust current test cases if needed.

RangerCD · 2025-03-18T02:25:15Z

While I was writing tests, a question raised that needs your help to answer.

With this change, the WAL may be in a state without any entries during use. As a result, the behavior of FirstIndex and LastIndex may need further clarification or modification.

In the following two comments, these two functions are expected to return 0 when the log is empty.

wal/wal.go

Lines 484 to 486 in 56126d5

    
           // FirstIndex returns the index of the first entry in the log. Returns zero 
        
           // when log has no entries. 
        
           func (l *Log) FirstIndex() (index uint64, err error) {

wal/wal.go

Lines 502 to 504 in 56126d5

    
           // LastIndex returns the index of the last entry in the log. Returns zero when 
        
           // log has no entries. 
        
           func (l *Log) LastIndex() (index uint64, err error) {

However, if the WAL is in a continuous read/write state, returning 0 might cause some confusion for users.

Let's temporarily ignore the special handling of empty entries in these two functions and consider the following usage scenario:

There are two Go routines—one continuously appends entries to the WAL, while the other continuously retrieves entries from the WAL. The code is as follows:

func main() {
	notify := make(chan struct{}, 1)
	l, _ := wal.Open("wal", nil)

	go func() {
		for {
			time.Sleep(1 * time.Second)
			idx, _ := l.LastIndex()
			idx++
			l.Write(idx, []byte(strconv.FormatUint(idx, 10)+"hello world"))
			select {
			case notify <- struct{}{}:
			default:
			}
		}
	}()

	go func() {
		for {
			select {
			case <-notify:
			}
			for {
				idx, _ := l.FirstIndex()
				data, err := l.Read(idx)
				if errors.Is(err, wal.ErrNotFound) {
					break
				}
				fmt.Println(string(data))
				l.TruncateFront(idx + 1)
			}
		}
	}()

	select {}
}

Relatively speaking, this is a straightforward implementation of the producer-consumer pattern, where each Go routine only needs to focus on one end of the queue.

In this case, FirstIndex can be understood as the "Next index to read," while LastIndex represents the "Last index that has been written."

Now, returning to the special handling of empty entries in these functions, the above code is actually just one step away from running perfectly:

Since FirstIndex will initially return 0, we need to replace idx, _ := l.FirstIndex() with:

idx, _ := l.FirstIndex()
if idx == 0 {
	idx = 1
}

This isn't particularly complicated, but it may confuse users until they carefully read the source code of FirstIndex to figure out why they can't read the first entry, it's because FirstIndex doesn't only just return the first index to read. And that's what I had experienced a month ago.

Currently, the implementation of these two functions only includes special handling for one specific case: when l.lastIndex == 0, FirstIndex returns 0 instead of the actual l.firstIndex.

For LastIndex, the if branch does not actually change its behavior, because when l.lastIndex == 0, returning 0 and returning l.lastIndex are effectively the same.

Disregarding historical reasons, a feasible implementation would be to return an additional boolean value to indicate whether the WAL is empty.

So the key question that needs to be clarified is:

whether to keep the current behavior and write test cases around it
or introduce a breaking change, either by removing the special handling or by returning an additional value.

feat: TruncateFront & TruncateBack support truncating all entries

6e623cd

imih mentioned this pull request Apr 25, 2025

feat: recover corrupted tail automatically #22

Open

This was referenced Jun 26, 2025

fix(disk-buffer): Remove empty buffer on close influxdata/telegraf#17247

Merged

fix(disk-buffer): Correctly reset the mask after adding to an empty buffer influxdata/telegraf#17240

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: TruncateFront & TruncateBack support truncating all entries #31

feat: TruncateFront & TruncateBack support truncating all entries #31

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

feat: TruncateFront & TruncateBack support truncating all entries #31

Are you sure you want to change the base?

feat: TruncateFront & TruncateBack support truncating all entries #31

Uh oh!

Conversation

Uh oh!

Why this feature is needed

What's the difference between #29 and this PR

Complexity of usage

Corruption-safty

Index continuity

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!