8000 add ensureDecompiled method to decompile all the tables irrespective of lazy attribute by anthrotype · Pull Request #2551 · fonttools/fonttools · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

add ensureDecompiled method to decompile all the tables irrespective of lazy attribute #2551

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 18, 2022

Conversation

anthrotype
Copy link
Member
@anthrotype anthrotype commented Mar 17, 2022

While attempting to tackle glyph reordering in #2060, I needed a way to say "decompile everything in the font". The lazy=False unfortunately does not exactly do what it says to be doing, and that's confusing for people who are not familiar with the fonttools internals (some colleagues did stumble on this). Not only one has to set lazy=False, but one also has to iterate over all the tables and actually decompile them using silly loop like:

for tag in font.keys():
    _ = font[tag]

A quick recap about TTFont.lazy attribute. It has three states: it defaults to None, and can be optionally be set to True or False.
When it is set to True, the font keeps a reference to the original input file (inside its SFTNReader); when it's None (or False), it copies the raw bytes of the input files to in-memory stream (allows overwriting input, or reading from unseekable streams). A TTFont decompiles a given table only when requested (in __getitem__) -- confusingly even when lazy=False.
Then, some tables take into account the lazy attribute to decide whether to load or not all their data upfront. In particular:

  1. the glyf table decompile all the glyphs if lazy is False; it keeps them in compact (raw data) form if lazy is True or None (default), so they get expanded upon getting them.
  2. All tables that are generated through the otData.py machinery are loaded lazily when lazy=True (BaseTable.__getattr__ triggeres decompiling upon getting any attribute); they are fully decompiled if lazy is None (default) or False.
  3. cmap completely ignores lazy attribute, and always loads its subtables lazily.

What this PR proposes to change is:

  1. have TTFont(path, lazy=False) immediately decompile all the tables' data, everything, like it says on the tin, so one doesn't need to then also decompile individual tables; (decided to reverted this change)
  2. add an ensureDecompiled method to TTFont, cmap, glyf and the otTables that allows to un-lazify them even when a font had been loaded with lazy=True or None; they get the same as opening it with lazy=False.
  3. fix the fact that cmap doesn't respect lazy; make it load everything when lazy=False.

previously cmap was completely ignoring lazy attribute, always loading lazily
to 'expand' all the lazy glyphs
can be useful to traverse a tree of otTables
to unlazify a whole tree of otTables, recursively
Make lazy=False actually do what it says, 'load everything eagerly'. It feels weird that one has to, not only say, open with lazy=False, but also have to load each tables individually... Didn't I say don't be lazy?!

Also it can be useful to get to a eager, non-lazy font whether or not it was originally loaded lazily, so I added an ensureDecompiled method that decompiles all the tables and calls ensureDecompiled for those (e.g. cmap, glyf and otData-driven tables like GSUB, GPOS, etc.) that respect the lazy attribute.
@anthrotype
Copy link
Member Author

i'm gonna add tests but first want to gather feedback

@justvanrossum
Copy link
Collaborator

There's a lot of code out there that doesn't do lazy=True, and I'm afraid this is a huge potential performance regression for those cases.

What about the oposite: have a method that explicitly deompiles all?

TTFont was always about lazy loading, the lazy flag is somewhat conflated with "don't buffer the entire stream" and "decompile OTL tables lazily".

@anthrotype
Copy link
Member Author

There's a lot of code out there that doesn't do lazy=True

wait, but lazy=False is opt-in, i'm not changing the default behavior of lazy=None..

@anthrotype
Copy link
Member Author
anthrotype commented Mar 17, 2022

If you grep the code, right now the only difference between lazy=None (default) and lazy=False is in glyf table, which explicitly checks if ttFont.lazy is False. The otTables only do if lazy or if not lazy so there the distinction between None and False doesn't hold. It's either True or it isn't. So currently setting lazy=False has only the effect of loading all the glyphs at once when decompiling the glyf table. That's it. Other than that, it behaves exactly the same as not setting lazy attribute at all.

I would argue that giving lazy=False a new meaning of load everything eagerly upon opening the font can be useful and won't disrupt much existing workflows. If one bothered to set lazy=False to begin with, they definitely are willing to pay the performance hit for whatever reason.

Anyway, if lazy=False as "load everything upfront" turns out to be controvertial, I'm willing to only add the new ensureDecompiled method and keep the current behavior unchanged.

one has to call TTFont.ensureDecompiled to load everything
@anthrotype
Copy link
Member Author

I removed the auto-load-all-tables-if-lazy-is-false. One has to call TTFont.ensureDecompiled to achieve that.

@anthrotype anthrotype changed the title make lazy=False immediately decompile the whole font add ensureDecompiled method to decompile all the tables irrespective of lazy attribute Mar 17, 2022
@justvanrossum
Copy link
Collaborator

(Not a big fan of the "return self for expression convenience" pattern. JS does this all over the place, but eg. Python's list.append(item) does not, as doing so might suggest to the naive reader that a new object is returned.)

@anthrotype
Copy link
Member Author

"return self for expression convenience" pattern

hm ok, I reverted

@justvanrossum
Copy link
Collaborator

I see you've already added the ensureDecompiled() method: I think that is the cleaner solution, so yay.

I find using None as a third flag for anything but b/w compatibility workarounds too unclear. (And: I totally forgot that the lazy flag actually has three states...)

makes it more useful for constructing generic traversals of trees of otTables
@anthrotype
Copy link
Member Author

@justvanrossum the last commit ab8fc32 is technically not required for this PR but it can be handy knowing the name and index of a subtable when traversing a tree of otTables. I can make it into its own PR if you prefer.

@rsheeter
Copy link
Collaborator

I'm enthused about lazy=False actually decompiling (and not being default). I loaded with lazy=False, checked isLoaded on the TTFont's keys, and had a wtf moment when it reported nothing was loaded.

@anthrotype anthrotype merged commit d716977 into main Mar 18, 2022
@anthrotype anthrotype deleted the unlazy branch March 18, 2022 10:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0