Support Banner/Scroll effects’ fadeawaywidth/height gradient #666

astiob · 2022-11-17T02:08:40Z

Guide to reviewing the commits:

The first commit lays down some foundation and implements an integer-only gradient. If you’re not interested in the integer-only gradient specifically, feel free to skip the gradient arithmetic in your initial review.
The third commit implements a mathematically sound floating-point gradient, the “sanest” variant. This can/will be squashed into the first commit if we settle on a floating-point variant, completely replacing the integer-only code.
The fourth commit is a small but icky patch to get us closer to DirectShow VSFilter if we think that’s useful. We can decide to drop this.

There are several bikesheds to be coloured here:

Should the gradient be implemented as the splitting of ASS_Images with tweaked ASS_Image alphas (as the proposed code does) or by pre-multiplying the backing bitmap(s)?
- VSFilter does the latter. But this is a scrolling event, so (unless it’s scrolling reaaaally slowly, which may be used to emulate a stationary gradient) this can’t be cached and the whole bitmap needs to be redone on each frame. Plus, we’d need either to allocate and fill a separate bitmap for the gradient to use mul_bitmaps on it (VSFilter does this) or introduce a special-purpose multiplier routine for this, which we’d probably like to have an assembler-language implementation for (which someone would have to create).
- Splitting ASS_Images lets us forego any bitmap multiplications and allows bitmap reuse (though reuse can still be limited due to subpixel shifts during the scrolling), but it requires more ASS_Image allocations and breaks those API consumers that use ASS_Images to estimate event bounding boxes.
How stable should the output be on changing display resolution, and how close should it be to VSFilter’s?

I initially offer three variants in this PR. One is a stupid simple implementation, much of the same kind of stupid as in VSFilter and matching it somewhat but not fully, and technically moving by sub-pixel amounts as display resolution is changed. Another is a mathematically ideal variant that to me is the most intuitive thing an author would expect from fadeawayheight/width without knowing how any implementation actually does it. The third is a mathematically exact replica of traditional VSFilter’s gradient position, but I didn’t replicate the bad rounding of values (which in some sense shifts the gradient further), so I’m not entirely sure how this even helps. I can add that if desired, though.

I haven’t had much luck imagining situations where a carefully crafted script might be significantly affected by these differences. But I haven’t tried looking for real examples, either. Theoretically, a script might use these effects to achieve a colour gradient.

For reference, VSFilter’s rounding errors on values:
1. It computes the gradient step as floor(2¹⁴ / gradientWidthOrHeightInDisplayPixels) and repeatedly adds this (starting from fully transparent) for the leading gradient or subtracts this (starting from fully opaque) for the trailing gradient, accumulating/multiplying the error caused by the floor. As a result, the leading gradient never quite smoothly reaches 1, and the trailing gradient never quite smoothly reaches 0; and the higher the resolution, the worse it may get.
2. It computes the final opacity as floor(opacity × gradient / 2¹⁴). And the opacities are in the range 0–64, so even this floor alone is worth up to ¹⁄₆₄ ≈ ⁴⁄₂₅₅. The leading gradient in particular might thus appear to originate another few pixels beyond the designated point.
I’ve noticed the rounding of \clip(,,,) coordinates is different among the various renderers. Do we or do we not want to align this to any particular VSFilter flavour, or perhaps independently improve our own?

TheOneric · 2022-11-17T17:27:53Z

It appears this introduces (or triggers an existing) issue with state persisting beyond its intended lifetime. Here’s a sample:

[Script Info]
ScriptType: v4.00+
ScaledBorderAndShadow: yes
YCbCr Matrix: None
PlayResX: 1920
PlayResY: 1080

[Aegisub Project Garbage]
Video File: ?dummy:25.000000:39000:1920:1080:47:191:225:

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,DejaVu Sans,48,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,0,0,2,10,10,10,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.00,0:00:04.99,Default,,0,0,0,Banner;9;0;300,{\4c&HFF0000\1a&HFE\xshad-1250}{\p1}m 0 0 l 600 0 600 300 0 300
Dialogue: 0,0:00:05.00,0:00:08.00,Default,,0,0,0,,AAA{\1c&HFF0000}{\p1}m 0 0 l 600 0 600 300 0 300{\p0}AAAA

The second event doesn't render if the first one was displayed before it. If playback is started after the first event or the first even is commented the second event shows up normally.

astiob · 2022-11-17T18:09:53Z

Thanks for testing! I had only tried single frames, as I don’t (yet) have an easily testable video playback setup. Try with the new code, please.

I’ve added a second condition to apply_scroll_fade. It appears we reset evt_type but not any of the scroll_... properties in init_render_context, which makes sense but I didn’t pay attention to.

TheOneric · 2022-11-17T18:20:13Z

Try with the new code, please.

Both events show up now.

TheOneric

An intial look at the floating-point version and \clip rounding. I didn’t think about the rounding error emulation yet.

If there’s no visible compatibility problems, I think the floating-point variant is good
Similarly with \clip rounding. Since *VSFilters apparently already differ and if we don't expect visible issues from not faithfully emulating one flavour, the approach here seems fine to me
Being mostly stable across different rendering resolutions is imo desireable

I find the code around multiple ASS_Images with borrowed buffers a bit confusing, but tbf that’s due to preëxisting API and lack of documentation so not directly relevant to this PR. I particularly dislike how the private my_draw_bitmap function returns ASS_ImagePriv *s as ASS_Image *s and the result is then cast back to ASS_ImagePriv * in various places; as an internal-only API it could/should imho just directly return ASS_ImagePriv *. I’m also not sure what ASS_ImagePriv->buffer even is.

libass/ass_render.c

TheOneric · 2022-11-19T04:29:48Z

libass/ass_render.c

+                    ASS_Image *nimg = ass_image_borrow_copy(cur);
+                    nimg->dst_x = x + render_priv->settings.left_margin;
+                    cur->w = nimg->dst_x - cur->dst_x;
+                    nimg->w -= cur->w;
+                    nimg->bitmap += cur->w;
+                    cur->next = nimg;
+                    cur = nimg;


I’m not sure it’s actually worthwhile, but in theory you could add a new IMAGE_TYPE_GRADIENT for ASS_Image and use it here, so consumers who want to calculate bounding boxes will know to merge succesive neighbouring gradient image of the same colour sans alpha.

That would erase the distinction between shadows, borders and main images for events affected by gradients, defeating the point of image->type. We could, of course, make a whole parallel set of three new values… But it would still be impossible in some cases to tell where one gradient event ends and another gradient event starts.

By the way, why do the IMAGE_TYPE_... constants not start with ASS_? Sigh.

By the way, why do the IMAGE_TYPE_... constants not start with ASS_? Sigh.

Btw, all macros and enums in ass_types.h but the JUSTIFY ones also lack an ASS_ prefix.

astiob · 2022-11-19T21:38:00Z

I’m also not sure what ASS_ImagePriv->buffer even is.

Currently, an ASS_Image references a source bitmap: either a borrowed cache entry (in which case source is set and reference-counted while buffer is null) or a directly owned bitmap buffer (in which case source is null but buffer is this very buffer). The latter is used for BorderStyle=4 in add_background and for blended vector clips in blend_vector_clip; the former is used for everything else.

This adds a third mode, 8000 where the ASS_Image references a direct bitmap buffer that is borrowed from another ASS_Image that owns it. In this case, both source and buffer are null, and the owning image is in the same image list, so both images will be freed by ass_frame_unref at the same time.

MrSmile

I feel that multiplication would be faster than ton of separate images, but that depends on ASS_Image treatment by consumer. It probably shouldn't matter for regular usage, but we all know about insane full-color drawings that sometimes appear in subs...

I think you should create struct FadeParams and encapsulate alpha calculation inside it:

int event_start = cur->dst_y - render_priv->settings.top_margin;
int event_end = event_start + cur->h;

FadeParams fade = fade_init(state->scroll_y0 * state->screen_scale_y,
                            state->scroll_y1 * state->screen_scale_y,
                            fade_height, cur->color);

int a = fade_alpha(&fade, event_start);
cur->color = (cur->color & ~0xFF) | a;
for (int y = event_start + 1; y < event_end; y++) {
    int prev_a = a;
    a = fade_alpha(&fade, y);
    if (a == prev_a)
        continue;

    ASS_Image *nimg = ass_image_borrow_copy(cur);
    nimg->dst_y = y + render_priv->settings.top_margin;
    cur->h = nimg->dst_y - cur->dst_y;
    nimg->h -= cur->h;
    nimg->bitmap += cur->h * cur->stride;
    cur->next = nimg;
    cur = nimg;

    cur->color = (cur->color & ~0xFF) | a;
}

It would be less copy-paste and all rounding variants would be neatly localized in FadeParams and its functions.

Use lround as opposed to lrint to ensure smoother animations

For all sane use cases lround = lrint, so I don't understand what are you meaning here. Personally I prefer lrint as it can be optimized slightly better than lround.

astiob · 2022-11-20T21:54:50Z

For all sane use cases lround = lrint, so I don't understand what are you meaning here.

I, in turn, don’t understand what you mean by sane use cases. I gave an example of what I mean in the same commit message. Consider a 720p video/script on a 1080p screen: the scaling factor is exactly 1.5. Whole pixels in the source turn into 1.5 display pixels each. lrint rounds successive 1px steps—1.5px steps after scaling—to intervals of 1px, 2px, 2px, 1px, 1px, 2px, 2px, 1px, 1px, 2px, 2px… lround, in contrast, gives 1px, 2px, 1px, 2px, 1px, 2px, 1px, 2px… which is much smoother. In fact, this applies not only to animations (as I wrote in the commit message) where this stepping happens over time, but also to those same insane typeset signs™ where it happens within the span of a single frame, with a separate event for each pixel.

TheOneric

I’m very slightly in favour of emulating VSFilter’s start position (even without emulating rounding errors), but I’m also fine with leaving it out unless we actually get bug reports about it.

The cache-reuse argument for using multiple ASS_Images with a shared cached buffer makes sense to me. Though I can also see how if the events are short-lived, letting API-users blend more, mostly 1px-wide/high images may end up being a bit worse than us premultiplying with alpha bitmap and our mul_bitmaps assembly-variant (or theoretical special-purpose assembly). If this becomes a point of contention, we could create artificial samples for both cases and actually measure the perf diff.

For \clip and Scroll effects, scaled coordinates may not be integers. Round each coordinate to the nearest integer, not towards zero. Use lround as opposed to lrint to ensure smoother animations in case of e. g. a 1px animation step at a 1.5x scaling factor: lround gives successive steps of 2px, 1px, 2, 1, 2, 1, 2, 1, 2... whereas lrint would give 2, 1, 1, 2, 2, 1, 1, 2, 2... (The same reasoning has previously been applied to \kf coordinates in render_and_combine_glyphs.) For what it's worth, for Scroll effects, this exactly matches all VSFilter variants and setups except specific rare cases on MPC-HC's and MPC-BE's internal subtitle renderers where PlayRes differs from display resolution (or video storage resolution if the renderer is built as a standalone DirectShow filter) and MPC ends up off by 1px after double-rounding scaled Scroll boundaries. However, for rectangular \clip, all VSFilters truncate the scaled coordinates, and XySubFilter truncates them twice: at video-storage *and* display resolution. Furthermore, they compute \clip\t(\clip) relative to the truncated scaled starting coordinates, whereas libass currently computes it relative to the truncated PlayRes starting coordinates. Events that have \clip are additionally clipped to the video area, but the old code used floating-point arithmetic and rounded the result also towards zero. For practical values of PlayRes and display size on practical machines with a 53-bit double-precision float mantissa, this arithmetic is exact and no loss of precision occurs, although with a theoretical huge resolution this could lead to the rightmost pixel column or bottommost pixel row of the video area to be falsely clipped off (if PlayRes * display res / PlayRes < display res). But at any rate, this arithmetic isn't necessary at all: just use the known corresponding integer values.

This should work and match VSFilter when rendering to target display resolution equal to layout (video storage) reslution, except that our visual quality is better due to its limited value range (0-64 for both base bitmap and gradient bitmap before they're multiplied together) and more abundant rounding error. When rendering to a different target display resolution, the location of our fade bands as implemented in this commit will not exactly match DirectVobSub's, and the greater the scaling factor, the bigger the difference will be. XySubFilter also exhibits this problem, although our locations don't exactly match XySubFilter's, either, except when PlayRes == LayoutRes. The gradient is implemented as multiple ASS_Images to avoid spending time and memory on a bitmap multiplication on every frame, as the event keeps scrolling and the multiplied bitmap would be uncacheable. However, this does increase the workload of the consumer and may also harm applications that use ASS_Images to estimate event bounding boxes.

Compared to before, this shifts the gradient by half a pixel up/left when rendering at PlayRes, but this new position is exactly reproduced when rendering at any other resolution. The linear gradient now covers exactly the area from the left edge of the video to the right edge of the fadeawaywidthth PlayRes-pixel (exactly fadeawaywidth PlayRes-pixels in total), similarly on the right, and similarly from the top edge of the y0th PlayRes-pixel for exactly fadeawayheight PlayRes-pixels down, and from the bottom edge of y1st PlayRes-pixel for exactly fadeawayheight PlayRes-pixels up. I think this is the most intuitive arrangement. The gradient is sampled at the centre of each display-resolution pixel. This does NOT match any VSFilter variant in any configuration. However, for a fixed video viewed on ever-denser displays, I think MPC-HC/BE ISR's gradient (and for PlayRes == video res, also XySubFilter's) tends to this same result in its limit as the display resolution grows to infinity... or rather, would tend if you could ignore its rounding error.

This still doesn't emulate the rounding error on its output *values*, though.

astiob · 2023-02-09T14:37:21Z

Pushed the non-fadeaway commits to master for now, with an expanded example regarding lrint vs lround.

astiob added enhancement compatibility labels Nov 17, 2022

astiob force-pushed the fadeaway branch from 57f740f to c27622a Compare November 17, 2022 18:07

TheOneric reviewed Nov 19, 2022

View reviewed changes

MrSmile reviewed Nov 20, 2022

View reviewed changes

TheOneric approved these changes Nov 23, 2022

View reviewed changes

astiob added 5 commits February 9, 2023 16:29

ass_layout_res: ensure it is at least 1x1

545bc09

Emulate DirectVobSub's Banner/Scroll effect fadeaway position exactly

b739e45

This still doesn't emulate the rounding error on its output *values*, though.

astiob force-pushed the fadeaway branch from c27622a to b739e45 Compare February 9, 2023 14:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support Banner/Scroll effects’ fadeawaywidth/height gradient #666

Support Banner/Scroll effects’ fadeawaywidth/height gradient #666

Support Banner/Scroll effects’ fadeawaywidth/height gradient #666

Are you sure you want to change the base?

Support Banner/Scroll effects’ fadeawaywidth/height gradient #666

Conversation

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment