Promql: Prevent extrapolation of NH below zero #16192

gen1321 · 2025-03-08T18:08:03Z

Problem statement

When calculating rates for Native Histograms (NH), we need to prevent extrapolation below zero to ensure consistent results with classic histograms.

What we do now

Zero-point from count
- Find when a linear back-extrapolation of count would hit 0.
- If that moment lies inside the query range, we clamp the lower boundary to it.
Clamp buckets
- Check whether a bucket would extrapolate below zero, or whether we clamped the total count; if so, reduce the slope.
Adjust count
- Bucket tweaks change the total count, so we add the difference back to the total count.
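
For illustration, the zero-point step can be sketched roughly as below. This is a simplified, standalone sketch and not the code in this PR: the function name `clampDurationToStart` and its parameters are hypothetical, and it assumes counter resets have already been handled so that the count delta is non-negative.

```go
package main

import "fmt"

// clampDurationToStart is a hypothetical helper, not part of promql/functions.go.
// firstCount:      total count of the first histogram sample in the range
// countDelta:      increase of the count between first and last sample
// sampledInterval: seconds between the first and the last sample
// durationToStart: seconds between the range start and the first sample
// It returns the (possibly shortened) durationToStart and whether it was clamped.
func clampDurationToStart(firstCount, countDelta, sampledInterval, durationToStart float64) (float64, bool) {
	if countDelta <= 0 || firstCount < 0 {
		// No positive slope, or an invalid count: nothing to clamp.
		return durationToStart, false
	}
	// Time before the first sample at which a straight line through the
	// first and last sample would reach a count of zero.
	durationToZero := sampledInterval * (firstCount / countDelta)
	if durationToZero < durationToStart {
		return durationToZero, true
	}
	return durationToStart, false
}

func main() {
	// Count went from 5 to 25 within 60s; the range starts 30s before the first sample.
	d, clamped := clampDurationToStart(5, 20, 60, 30)
	fmt.Println(d, clamped) // 15 true
}
```

In the example the count would reach zero 15s before the first sample, so extrapolating the full 30s back would produce a negative count; the lower boundary is moved to the 15s mark instead.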

gen1321 · 2025-03-08T18:11:08Z

@beorn7 I have some tests failing, and before I go fix them, can you please confirm that this is conceptually the right approach? Thanks!

beorn7 · 2025-03-12T19:36:01Z

This is on my review list. Apologies for the delay. I'm still trying to get to this this week.

beorn7 · 2025-03-13T14:50:56Z

promql/functions.go

+	// Histogram total count clamping logic
+	// Using the first sample's count as the master metric, we ensure that
+	// if extrapolation would take the count below zero, we clamp the extrapolation.
+	if isCounter && resultHistogram != nil && resultHistogram.Count > 0 {


We also need to check samples.Histograms[0].H.Count >= 0.

beorn7 · 2025-03-13T14:52:52Z

promql/functions.go

 	extrapolateToInterval += durationToStart

+	// Histogram bucket clamping logic
+	if isCounter && resultHistogram != nil && resultHistogram.Count > 0 && resultHistogram.Sum > 0 {


Suggested change

-	if isCounter && resultHistogram != nil && resultHistogram.Count > 0 && resultHistogram.Sum > 0 {
+	if isCounter && resultHistogram != nil && resultHistogram.Count > 0 {

Sum can be negative. That's fine.

beorn7 · 2025-03-13T15:21:50Z

promql/functions.go

+		var totalSumDelta float64
+		for i, bucketRate := range resultHistogram.PositiveBuckets {
+			// Calculate this bucket's proportion of the total rate
+			bucketProportion := bucketRate / resultHistogram.Sum


Why Sum? If at all, it should be Count.
But I don't understand the logic of the "proportion of the rate" in the first place.

Maybe I haven't wrapped my head sufficiently far around it. But broadly, I think we have to do something different depending on whether we have clamped the durationToStart above.

If durationToZero > durationToStart, we only have to manipulate the buckets if they would be extrapolated below zero at durationToStart.

If durationToZero == durationToStart, we have to manipulate all the buckets.

After the manipulation, the manipulated buckets (only) should extrapolate to exactly zero at durationToStart.

I can try to cobble together the equation, but maybe this is already enough to put you on track (or put me on track if I have missed something).
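
One possible reading of the two cases above, as a rough sketch rather than the equation itself: the names below are hypothetical, and the sketch assumes each bucket's delta is non-negative after reset handling. A manipulated bucket that reaches exactly zero at durationToStart must have grown by exactly its first-sample value before the first sample.

```go
package main

import "fmt"

// backwardIncrease is a hypothetical helper. It returns how much a single
// bucket is assumed to have grown between the range start and the first sample.
// firstBucket:  the bucket's value in the first sample
// bucketDelta:  the bucket's increase between first and last sample
// countClamped: true if durationToStart was shortened to durationToZero
func backwardIncrease(firstBucket, bucketDelta, sampledInterval, durationToStart float64, countClamped bool) float64 {
	if countClamped {
		// The total count is zero at the range start, so every bucket must be
		// zero there as well: each bucket grows by exactly its first value.
		return firstBucket
	}
	// Plain linear back-extrapolation along the observed slope.
	linear := bucketDelta * durationToStart / sampledInterval
	if linear > firstBucket {
		// The bucket would dip below zero; cap the growth so that it
		// reaches exactly zero at the range start.
		return firstBucket
	}
	return linear
}

func main() {
	fmt.Println(backwardIncrease(50, 20, 60, 30, false)) // untouched bucket
	fmt.Println(backwardIncrease(1, 20, 60, 30, false))  // bucket capped at its first value
	fmt.Println(backwardIncrease(3, 0, 60, 15, true))    // count was clamped
}
```

With this reading, summing the per-bucket values in the clamped case gives the first sample's count, which is what the count's own back-extrapolation yields when it reaches zero at durationToStart.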

Yes, I think you are totally right. I missed that we can actually make the histogram schemas match (in hindsight it's obvious), and that is why I did this whole thing in the first place.

So we can actually match the resultHistogram rate with the first histogram, and then do all the calculations easily.

The only downside, I think, is that we will lose some precision in the case where resultHistogram.Schema > firstHistogram.Schema.

Please let me know if this is a problem, or if it is incorrect and I am missing something :)

Not quite sure.

Making all the schemas match is part of the normal rate calculation anyway. First we find the largest common schema, merge buckets in higher-res histograms to match that schema, and only then we start to do the math on all those histograms that now have the same schema.

The extrapolation logic only happens in that second part, so different schemas should not be an issue.
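
To make the schema step concrete, here is a minimal sketch. It uses a plain map from bucket index to count, which is an assumption for illustration only (the real histogram types use spans and bucket slices), and all function names are hypothetical. It relies on the standard exponential schema, where reducing the schema by one merges pairs of adjacent buckets.

```go
package main

import "fmt"

// commonSchema picks the lower of two schema numbers, i.e. the highest
// resolution that both histograms can be converted to.
func commonSchema(a, b int32) int32 {
	if a < b {
		return a
	}
	return b
}

// reduceIndex maps a bucket index from originSchema to the lower-resolution
// targetSchema. Resolution can only be reduced, never increased.
func reduceIndex(idx, originSchema, targetSchema int32) (int32, error) {
	if targetSchema > originSchema {
		return 0, fmt.Errorf("cannot increase resolution from schema %d to %d", originSchema, targetSchema)
	}
	// The arithmetic right shift floors, which also handles indices <= 0.
	return ((idx - 1) >> (originSchema - targetSchema)) + 1, nil
}

// mergeBuckets sums the counts of all buckets that fall into the same
// target bucket.
func mergeBuckets(buckets map[int32]float64, originSchema, targetSchema int32) (map[int32]float64, error) {
	merged := make(map[int32]float64, len(buckets))
	for idx, count := range buckets {
		target, err := reduceIndex(idx, originSchema, targetSchema)
		if err != nil {
			return nil, err
		}
		merged[target] += count
	}
	return merged, nil
}

func main() {
	target := commonSchema(1, 0)
	merged, err := mergeBuckets(map[int32]float64{1: 1, 2: 2, 3: 4, 4: 8}, 1, target)
	fmt.Println(merged, err)
}
```

Only after every histogram in the range has been brought to the common schema does the per-bucket math, including any clamping, take place.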

Signed-off-by: Boris Beginin <gen3212@gmail.com>

beorn7 self-requested a review March 11, 2025 10:50

beorn7 requested changes Mar 13, 2025

rishabhkumar92 mentioned this pull request Apr 10, 2025

Bug: Native histogram operations like histogram_count emitting negative values grafana/mimir#11148

Open

gen1321 force-pushed the promql/prevent-negative-extrapolation-for-native-histograms branch 4 times, most recently from a02f905 to c73bb76 Compare May 16, 2025 08:20

gen1321 requested a review from beorn7 May 16, 2025 08:30

gen1321 marked this pull request as ready for review May 16, 2025 08:30

gen1321 requested a review from roidelapluie as a code owner May 16, 2025 08:30

Promql: Prevent extrapolation below zero for native histograms

ea7cdef

Signed-off-by: Boris Beginin <gen3212@gmail.com>

gen1321 force-pushed the promql/prevent-negative-extrapolation-for-native-histograms branch from c73bb76 to ea7cdef Compare May 16, 2025 08:33
