When the page loads, it currently goes through these stages:
FOUC
Based on previous efforts with RCFilters/Watchlist and VisualEditor, that first flash would be less jarring if the footer were generally below the fold.
This can be done by reserving an estimate of the needed space ahead of time with a placeholder element (rendered server-side) that has its dimensions set in CSS (guarded by .client-js so as to not render it below the fallback message).
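A minimal sketch of such a placeholder rule. The class name `.ext-example-placeholder` and the 400px estimate are hypothetical, chosen for illustration only; `.client-js` is the class the text refers to.

```css
/* Hypothetical class name for the server-rendered placeholder element. */
/* Hidden by default, so clients without JavaScript do not get an empty
   box below the fallback message. */
.ext-example-placeholder {
	display: none;
}

/* Reserve the estimated space only when JavaScript is available. */
.client-js .ext-example-placeholder {
	display: block;
	/* Assumed estimate of the final UI height; should be tuned against the real interface. */
	min-height: 400px;
}
```

Using `min-height` rather than `height` is a design choice here: if the estimate is too low, the real UI can still grow past it instead of overflowing.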
The introduction paragraph can probably be rendered server-side as well using $this->msg(…)->escaped(), and it might make sense to display it regardless of support mode (e.g. render the no-js message below it).
Perceived performance
With the whitespace reserved ahead of time, there is still the issue that there is no indication to the user that something is happening. Perhaps making the placeholder grey (instead of white) could make that experience more pleasing, especially given that the actual UI also uses that color prominently.
It might also make sense to display a temporary text message in that space (e.g. "Loading an image for review"). One could utilise CSS3 transition-delay to make that message appear only if loading takes more than 0.9s (or something like that). This reduces the number of transitions for the fast case, while in the slow case ensuring that loading is acknowledged within a second.
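A rough sketch of the delayed message. It uses `animation-delay` in place of `transition-delay`, because a transition only runs when a property value changes after the first render, whereas an animation starts on its own as soon as the element is rendered. The class name and keyframe name are hypothetical, and 0.9s is simply the value suggested above.

```css
/* Hypothetical class for the temporary "Loading…" text inside the placeholder. */
.client-js .ext-example-placeholder-message {
	/* Invisible at first, so the fast case never shows the message. */
	opacity: 0;
	/* name | duration | timing function | delay | fill mode */
	animation: ext-example-fade-in 0.2s ease-in 0.9s forwards;
}

@keyframes ext-example-fade-in {
	to {
		opacity: 1;
	}
}
```

With `forwards`, the message stays visible once the animation ends, until the real UI replaces the placeholder.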