Making fontmake faster #1095

behdad · 2017-11-06T19:13:31Z

We want to drastically speed up fontmake: googlefonts/fontmake#367

Other than switching booleanOperations to something vector-based and fast (typemytype/booleanOperations#40), the other big offenders are basically 1. MutatorMath / fontMath, and general defcon / read/write UFO. For example, when James replaces defcon use with tiny objects, he saw about 20% speedup:

googlefonts/ufo2ft@2e65007

Cosimo has also reports that using lxml module instead of xml can significantly speed up UFO reading.

Since https://github.com/unified-font-object/ufoLib is essentially a reference implementation, I like to propose keeping it that way, with validators and converters, but add a trimmed-down version in fontTools.ufoLib that is optimized for speed. I suggest the API for this be strictly a subset of the upstream ufoLib, such that users can use one or the other based on their need.

I also like to add fs module support to it, as we need that for build-system experimentations. Which brings us to the topic of the UFO4 branch: https://github.com/unified-font-object/ufoLib/tree/ufo4 In general, I suggest we avoid these big revolutionary major version bumps that take forever to roll out and cause dependecy hell and instead work on an evolutionary UFO that changes slowly over time in master branch.

Please discuss.

@typesupply @typemytype @anthrotype @justvanrossum @LettError @jenskutilek @madig @brawer

The text was updated successfully, but these errors were encountered:

anthrotype · 2017-11-06T19:17:20Z

Just a small clarification: lxml is faster at writing rather than reading than ElementTree, because the writer is written in C (libxml2)

benkiel · 2017-11-06T19:53:00Z

Would it make more sense that fontmake has a trimmed down implementation (like your tiny objects example), than putting it into fontTools? My strong concern here is that by putting it into fontTools, that the ufoLib implementation becomes sidelined. It's important for interoperability to have design tools use the same, more strictly checked, ufoLib. I assume that this isn't as much of a concern with fontmake (and, if it knows that things are getting strictly checked in design tools, it'll help it make correct assumptions about data), so it seems better put there than fontTools.

behdad · 2017-11-06T19:58:30Z

Would it make more sense that fontmake has a trimmed down implementation (like your tiny objects example), than putting it into fontTools?

Well, then all kinds of scripts would have to import fontmake. We don't want fontmake to be a library, and we know we want to rewrite it at some point. This is NOT something only for fontmake. Any kind of batch processing can use a faster ufoLib / Defcon, which are designed more towards the font-editor usecase.

My strong concern here is that by putting it into fontTools, that the ufoLib implementation becomes sidelined. It's important for interoperability to have design tools use the same, more strictly checked, ufoLib. I assume that this isn't as much of a concern with fontmake (and, if it knows that things are getting strictly checked in design tools, it'll help it make correct assumptions about data), so it seems better put there than fontTools.

That would essentially make this yet another module, and I don't think @anthrotype is rushing to volunteer maintaining that :).

I also think fontTools really should be able to open UFOs directly. See #814 When we do that, we probably want to rewrite ufo2ft, properly, and put it in fontTools as well. Plus going from binaries to UFO. At that point, UFO can become yet another input/output format for ttx, etc.

typemytype · 2017-11-06T21:17:28Z

why keep both? if the only difference between a slimmed down version and ufoLib are converters (between different ufo versions) and validators, then I would propose to make ufoLib more modular so users can add an request whatever they prefer: converting, validation, speed, ...

All the speed improvements you are proposing are also welcome in ufoLib (fe lxml if available) and everyone would then benefit from your time and resources instead of only fontmake...

And this has already been researched see unified-font-object/ufoLib#28 and unified-font-object/ufoLib#27

typesupply · 2017-11-06T23:24:57Z

@behdad Thanks for bringing this up. This is a very important discussion and I'm glad that it has started. Here are my (slightly disjointed) thoughts...

Since https://github.com/unified-font-object/ufoLib is essentially a reference implementation, I like to propose keeping it that way, with validators and converters, but add a trimmed-down version in fontTools.ufoLib that is optimized for speed. I suggest the API for this be strictly a subset of the upstream ufoLib, such that users can use one or the other based on their need.

From my perspective, I'd be very happy if these optimizations went into ufoLib. I'm really wary of dividing the finite resources we have as a type tool developer community on duplicate code bases. We just have so much code to maintain as is that I'd hate to see us split ourselves even more just for some optimizations. If fontTools doesn't want to have a dependency on ufoLib, I think I'd be okay with ufoLib moving under the fontTools umbrella. (This is something that I just though of. I'd need to give it a bit more thought.)

I've been thinking about removing the vast majority of the validation code from ufoLib. I wrote that stuff in parallel with the UFO 3 spec as a check and balance on my thoughts in each. The validator is VERY zealous about balking at even the tiniest thing. That was fine for my theoretical work, but it practice it is having some very unintended consequences. I'm starting to think that it should only do basic sanity checking so that it's not propagating bogus data into callers. That will remove a huge chunk of code and possibly speed things up.

I also like to add fs module support to it, as we need that for build-system experimentations. Which brings us to the topic of the UFO4 branch: https://github.com/unified-font-object/ufoLib/tree/ufo4 In general, I suggest we avoid these big revolutionary major version bumps that take forever to roll out and cause dependecy hell and instead work on an evolutionary UFO that changes slowly over time in master branch.

I agree. As far as UFO spec changes go, yeah, that's been the post UFO 3 plan. UFO 4 isn't a huge change. The only reason that it's called "4" instead of "3.1" is that the formatVersion has been defined as an int since UFO 1. We tried to write UFO 3 so that it can grow much more organically. The hard part is GLIF since it has defined element names. Introducing new elements leads to nasty backwards compatibility issues.

As far as that UFO 4 branch goes... My goal was to get that written and merge it into master quickly. I never meant for it to linger for so long. I felt like it was done, but it got stuck in the approval process and I never had time to keep working towards getting it merged. I was really happy with it and seriously considered merging it into master without approval, but I thought I'd get in trouble for doing that. 😄

Cosimo has also reports that using lxml module instead of xml can significantly speed up UFO reading.

I looked at LXML in the past for reading, but didn't see any dramatic speed differences (as @anthrotype notes). I never looked at it for writing because for my use the vast majority of UFO writes are atomic instead of everything at once. If you want ufoLib to use LXML if available, I'm totally fine with that.

For example, when James replaces defcon use with tiny objects, he saw about 20% speedup:

Yeah. defcon creates a deep tree of objects. I made it more shallow for quicker loading in the last major revision, but as soon as the data is touched the object tree is expanded. If the goal is to create UFOs quickly without much manipulation before save, defcon is probably not the best fit. FWIW, fontMath has a tiny object structure and it's a huge speed improvement. It's even shallower than what you have in James' sketch. Contour objects are just tuples from what I remember. It may be a good thing to look at when working out a set of hyper fast objects.

typesupply · 2017-11-07T01:52:16Z

This brings up an interesting issue: Point Pens. These are deeply linked to ufoLib and vice-versa. As such, the base implementation of AbstractPointPen and the other basic bits and bobs would need to go to FontTools. I'm fine with that as well.

justvanrossum · 2017-11-07T13:14:19Z

I also feel that maintaining two implementations of ufoLib is quite wasteful. +1 on integration with fontTools, +1 on making it more versatile and fast, +1 in factoring out ufo validation, +1 on fs support, etc etc.

Probably much of https://github.com/robofab-developers/fontPens should then also migrate to fonttools.

behdad · 2017-11-07T14:32:53Z

Ok, thank you all. How about:

Let's try to get the fs work Tal did merged: Add ZIP support to UFO. unified-font-object/ufoLib#26
Remove any low-hanging old cruft,
Move it as is to fontTools.ufoLib
Start adding experimental tiny objects and port fontmake to it and rinse and repeat on speeding up.

For pens, I think we can move them to fontTools one by one as we need to. That's what we have been doing anyway.

anthrotype · 2017-11-07T14:36:15Z

ok, and then let's also be brave and break the taboo that says fonttools can't have dependencies :)
I want to require pyfilesystem2.

behdad · 2017-11-07T14:53:18Z

ok, and then let's also be brave and break the taboo that says fonttools can't have dependencies :)
I want to require pyfilesystem2.

As long as it's pip-installable, I don't mind. Also, since we are in Python, lazy-enough dependencies are fine.

We still need to figure-out about circular dependencies for certain functionalities, like a module in fonttools importing glyphsLib.

anthrotype · 2017-11-07T15:02:26Z

it certainly is pip-installable (https://pypi.python.org/pypi/fs)

since we are in Python, lazy-enough dependencies are fine

sure, though we already have a long list of optional and un-declared dependencies in various fonttools modules (see README) and I'm sure it's very confusing for users and downstream maintainers to keep track of all of them, given that our setup.py says we have no dependencies at all. I think some linux distros still think we require numpy...
I think the decision to require fs module in setup.py depends on how "core" this fontTools.ufoLib will be: i.e. if we deem it extra like the pyftinspect script, or if we place it on the same standing as ttLib or varLib.

I just don't want to reinvent the wheel and have to maintain it too.

madig · 2017-11-07T21:47:47Z

Start adding experimental tiny objects and port fontmake to it and rinse and repeat on speeding up.

So a separate implementation within the reference implementation? :)

typesupply · 2017-11-07T22:19:18Z

So a separate implementation within the reference implementation? :)

I wouldn't take the "reference implementation" designation on ufoLib too seriously. When I wrote that bit of documentation (and the strict validation code that is mentioned above) I was just coming off of writing a bunch of very official code for the WOFF specification. I was still speaking RFC 2119 (W3C dialect) and that leaked into the UFO specification. It sounds much more Very Official than it actually is.

typesupply · 2017-11-07T22:21:49Z

Start adding experimental tiny objects and port fontmake to it and rinse and repeat on speeding up

If you need help with this, please let me know. I've written a lot of implementations of these structures over the years and I'd be happy to pass along info about some of the things that I tried and learned.

behdad · 2017-11-08T23:41:17Z

Start adding experimental tiny objects and port fontmake to it and rinse and repeat on speeding up.
So a separate implementation within the reference implementation? :)

Oh those will be defcon replacement objects, not ufoLibs. ufoLibs looks pretty lean, I don't think we'll be rewriting a lot of it, just cleaning up / improving.

behdad · 2017-11-14T22:04:57Z

@adrientetar's thoughts: https://gist.github.com/adrientetar/2c515f2e304bfc57c9d37c7e39b586c7

behdad · 2017-11-15T02:42:47Z

@adrientetar's thoughts: https://gist.github.com/adrientetar/2c515f2e304bfc57c9d37c7e39b586c7

FWIW, I agree with most of the points raised. Nice writeup Adrien!

typemytype · 2017-11-15T08:03:31Z

I dumbly tested the first statement of Ardien:

there is no with ... except ... statement, oke it can be embedded into a `try ... with ... except...
my test shows that is either quicker with os.path.exist as pre test when none of the paths exists
my test shows that is equally fast with os.path.exists as are test when there is a mix of existing and non existing paths
my test shows that is equally fast with os.path.exists as are test when all paths exists
over test: the old implementation is faster
weirdly its 4-5 times faster in python2???

please correct me if Im wrong https://gist.github.com/typemytype/eb69d0a4b6acd92241af9d97e0bb3baa

you can test this side by side in the new DrawBot with py3 support!!!

justvanrossum · 2017-11-15T08:17:30Z

The difference between py2 and py3 becomes less if you open the files in binary mode (avoids unicode conversion). Still, py3 is about twice as slow in the "all exists" case.

justvanrossum · 2017-11-15T08:31:04Z

Re: @adrientetar's thoughts:

it seems most overhead is in defcon, not ufoLib
then why propose to rewrite ufoLib from scratch?
on the one hand complaining about defcon's overhead
on the other hand proposing a complex COW system that will use more memory
what is the main goal of the operation? To make fontmake faster? To cater better to Adrien's taste in object des 8000 ign for his font editor? Those two things should not interfere with each other. I'm ok with fonttools making font building workflows more efficient, I'm not ok with fonttools catering to specific needs of client software. That's why defcon is separate and things like it should be separate [from fonttools].

justvanrossum · 2017-11-15T08:36:31Z

So +1 to:

a super minimal lean object model [in fonttools] so fontmake-like things don't need to depend on defcon

And -1 to:

a new object model designed for font editors [in fonttools] (even if it would be much better than defcon)

adrientetar · 2017-11-15T09:59:46Z

The idea is to have something like TTFont but for UFO objects. The one thing that really needs support is custom classes, such that an app backend can be written of top of the fontTools objects. (And as I say it doesn't have to be dynamic like in defcon, so it's zero-cost.)

on the other hand proposing a complex COW system that will use more memory

I'm not sure what you mean by that. Hash tables and trees have more overhead but they make certain operations faster. Once structural optimizations are exhausted speedups are often a matter of speed/memory tradeoff.

Anyway it's just an idea, I haven't explored whether it makes sense to use here. It's always good to think outside of well-worn paths.

then why propose to rewrite ufoLib from scratch?

I think it can be made a lot more compact, that plus the fact that Cosimo wants to use a different XML lib makes it basically a rewrite. Anyway it's playing semantics, the end results matters not the means (it's up to the one writing it).

my test shows that is equally fast with os.path.exists as are test when all paths exists

That's surprising since pre-check amounts to extra syscalls.

Anyway, note my correctness argument. I think it's more important than speed.

adrientetar · 2017-11-15T10:26:08Z

Added a section about "Single file UFO".

justvanrossum · 2017-11-15T11:16:03Z

The idea is to have something like TTFont but for UFO objects.

That is not a bad idea per se.

The one thing that really needs support is custom classes, such that an app backend can be written of top of the fontTools objects. (And as I say it doesn't have to be dynamic like in defcon, so it's zero-cost.)

This makes no sense to me at all. How can fonttools even pretend to know what font editors may need in terms of data structures?

Anyway. Perhaps it's more useful to try to define a standard minimal object API. "Objects conforming to x, y, and z will interoperate nicely with tools a, b and c".

anthrotype · 2017-11-15T11:20:59Z

Thanks Adrien!

I agree with many of the points you raised, especially the need for a "fast path" between reader and object initialization and avoid unnecessary copies after the initial read.

The goal is to make building or batch processing fonts faster, without unnecessary features that are more meant for interactive editors.

What I would really love to see is a UFO reader/writer that's as fast as if it were written in pure C -- and we can achieve that speed with Cython, without giving up on all the niceties that Python offers. Even simply cythonizing a pure python code as is into an equivalent C code (i.e. one that calls the very same C API that the Python interpreter uses) gives you from 20 to 50 % for free, by just removing the interpreter step. Then as you add static typing to variables the gains can boost to 10 times and more depending on the task at hand.

Lxml is also written in Cython, thus provides a Cython/C interface which could be called from other Cython modules, which makes building parsers/writers built on top of it even faster than it already is from pure Python.

I would like this fast UFO reader/writer to be able to deserialize/serialize to/from primitive built-in data types like dict, lists and tuples. These have the advantage of being compatible with other serialization libraries like json, yaml, pickle, etc. Then from this unstructured data I would like to be able to build classes each representing specific data structures with certain combinations of fields and valid values.

Now.. there is a very cool library which I'd love to use in one of my projects but I'm wary of proposing it because of the usual "dependency scare". It's called attrs, and it allows to declaratively define the attributes of a class and to remove much of the boilerplate involved in defining classes. It's hugely popular, well maintained and used my lots of projects -- so much so that there's now a PEP modelled after it, but that will only be available for future python versions, while attrs work with any python 2 or 3.

I haven't given as much thought as Adrien yet, but you can see that what I have in mind goes more in the direction of a from-scratch rewrite rather than an incremental optimization.

I'm not totally convinced about the idea of immutable data structures, but it may be just me not being much familiar with functional programming techniques, so I'd willing to learn more on that.

anthrotype · 2017-11-15T11:24:10Z

define a standard minimal object API. "Objects conforming to x, y, and z will interoperate nicely with tools a, b and c".

that is basically fontParts

justvanrossum · 2017-11-15T11:25:23Z

that is basically fontParts

Yes, but no. fontParts is a large API that is for convenient scripting in (perhaps) a font editor. I'm rather thinking of something small, with only simple operations and attributes.

justvanrossum · 2017-11-15T11:30:43Z

Perhaps even as small as:

glyph.width
glyph.height
glyph.verticalOriginY
glyph.draw(pen)
glyph.drawPoints(pointPen)
glyph.getPen()
glyph.getPointPen()
glyph.lib
glyph.bounds

anthrotype · 2017-11-15T11:38:33Z

yeah, I meant a subset of fontParts tailored for command-line batch processing tools.

justvanrossum · 2017-11-15T11:42:48Z

Yes, but really a very small, well documented subset. I think that can be extremely useful.

justvanrossum · 2021-04-23T05:59:50Z

Meh. Running it on full fontmake is like 40% slower...

Could it be that some C extension is wreaking havoc? I understand that the use of C extensions can be (very) slow in PyPy. Could it be lxml? Just a wild guess.

Re: fonttools-rs: that's some awesome results already! Looking forward to these new building blocks.

benkiel · 2021-04-23T15:08:50Z

@simoncozens @behdad It would be worth thinking about how fonttools and fonttools-rs would relate. Would fonttools-rs become fonttools? If so, what effort would be spent documenting how to incorporate the rust version into python workflows. Would fonttools-rs become a dependency of fonttools and handle some things, but not all things? If so, what should the name of fonttools-rs be, as having two things called fonttools is super confusing for folks that don't track everything.

I know this is an experiment at this point, and I don't want to slow that down. But, thinking about this now would be helpful. The last thing that one wants is to have two projects doing similar things that then don't get attention because attention is split between the two.

This is basically a reminder that there are lots of folks who use fonttools who are primarily designers. They dabble in code to get things done, but it can be mysterious and confusing at times to them. That isn't a group to be dismissed, but embraced when thinking about shifts. How can a change be made seamless for them? Doing so has the side benefit of making it even easier for those who are in Vi/Emacs all day every day.

behdad · 2021-04-23T15:15:12Z

Could it be that some C extension is wreaking havoc? I understand that the use of C extensions can be (very) slow in PyPy. Could it be lxml? Just a wild guess.

Could it be. Cosimo suggested the same. I read the following article recently so was hoping that it wouldn't be that bad:
https://morepypy.blogspot.com/2018/09/inside-cpyext-why-emulating-cpython-c.html?m=1

simoncozens · 2021-04-23T15:29:48Z

It would be worth thinking about how fonttools and fonttools-rs would relate. Would fonttools-rs become fonttools? If so, what effort would be spent documenting how to incorporate the rust version into python workflows. Would fonttools-rs become a dependency of fonttools and handle some things, but not all things?

OK, here's kinda my vision for fonttools-in-rust:

FontTools-in-Python is really important as a way for designers to manipulate font files in custom ways. As you mention, designers know Python, they want to write simple scripts, and they can. That's great, and I don't want to change that. So fontTools is definitely part of the future.

However, one major use case of fontTools is not custom, designer-generated, code - it's basically running fontmake, which is what this issue is about. What I'd like to do is:

in the short-term, see if any parts of the fontmake process (or any other commonly repeated tasks) can be rewritten in Rust to make them faster. e.g. renaming glyphs to production, IUP gvar, remove overlaps, etc. Once those pipeline utilities are in place, we can get fontmake to call them (as external, command-line utilities, not as a Python/Rust binding) instead of implementing those bit of the process itself.
in the long term, turn the whole of fontmake into Rust.

Of course to do that, we need a Rust library which reads, manipulates and writes OT files, which is basically "fonttools equivalent" - but I'm doing this not because I want to replace fonttools per se, but because I want to replace fontmake.

justvanrossum · 2021-04-23T15:53:02Z

Running time fontmake -m myfont.designspace -o variable on a ~1000 glyph, 15 master designspace gives me the following:

CPython 3.8

real	0m31.321s
user	0m30.656s
sys	0m0.621s

PyPy 3.7

real	0m15.511s
user	0m14.796s
sys	0m0.701s

This is the set of dependencies that I installed manually for pypy (I used pip install --no-deps XXX until fontmake worked):

attrs==20.3.0
cu2qu==1.6.7
fontMath==0.6.0
fontmake==2.4.0
fonttools==4.22.0
fs==2.4.13
pip==21.0.1
pytz==2021.1
setuptools==47.1.0
six==1.15.0
typing_extensions==3.7.4.3
ufo2ft==2.21.0
ufoLib2==0.11.1

behdad · 2021-04-23T16:26:39Z

I reran on NotoSans-MM, this time from designspace.

CPython 3.9.4:

real	0m51.467s
user	0m49.850s
sys	0m1.547s

and PyPy ~3.7:

real	1m7.141s
user	1m4.308s
sys	0m2.501s

I did notice however that the step that says:

INFO:ufo2ft:Pre-processing glyphs

takes ~25s in CPython and ~50s in PyPy. I looked into that step, it looks like that's actually when the UFO glyphs are loaded. So yes, this might be lxml PyPy overhead.

justvanrossum · 2021-04-23T18:05:34Z

FWIW, I'm getting 40% to 50% speedup when running fontmake -g NotoSans-MM.glyphs -o variable with pypy.

(I can't get skia-pathops to work in pypy, probably something messed up in my py2 (!) environment, so I haven't been able to compare static font building with remove overlaps.)

Adding lxml to my env indeed makes the whole thing a lot slower (close to CPython speed, but still a little faster), even when running from .glyphs.

I wonder how much the JIT warmup costs. If it's significant, a longer running fontmake server process could save more.

justvanrossum · 2021-04-23T18:10:48Z

FWIW2: I'm running a big CJK pipeline (of which fontmake is only a part): it takes about 19 minutes with CPython, and a little over 5 minutes with pypy3.

twardoch · 2021-04-25T02:14:16Z

Probably moving processing of multiple fonts into Py would help. In cases like Noto, many fonts are built: many sources are read, and many output formats are built. In the current setup, for each font a new Py process gets started, so with PyPy, the JIT optimization gets lost. If multiple files are processed in the same process, then only the 1st font would be slower and then the JIT-compiled code would run faster on subsequent fonts.

twardoch · 2021-04-25T02:18:12Z

There could be a YAML file that holds the tasks for multiple fonts. So all args for all conversions would be in one YAML, fontmake would be called with the YAML and then all would work in t same run.

anthrotype · 2021-04-26T11:42:18Z

I can't get skia-pathops to work in pypy, probably something messed up in my py2 (!) environment,

we don't distribute wheels for pypy3 (last time I checked there was no portable wheel tag that worked, maybe things have changed since), so it's probably trying to build skia-pathops from source, and skia requires python2 to be on $PATH (it's gn build tool only works with python2.7), maybe you don't have that on your local machine.

justvanrossum · 2021-04-26T11:49:39Z

so it's probably trying to build skia-pathops from source, and skia requires python2

Yes. It was some SSL related thing that I didn't care to debug. For a CI project elswhere I built Linux wheels: https://github.com/justvanrossum/skia-pathops-pypy-wheels/releases/tag/v0.1.0 Perhaps I should try to add macOS to the matrix there. But similar to lxml, it's disproportionally expensive to run pathops under pypy, although it was still a net win in my case.

anthrotype · 2021-04-26T12:04:26Z

still a net win in my case

yeah, skia-pathops in general is much faster than booleanOperations (pyclipper).

Another thing you can try is forcing to install the pure-python version of cu2qu. Right now when you pip install cu2qu, since there is no pypy binary wheel on PyPI, it attemps to build one from the .zip source, bootstrapping cython and building the extension module from source. But you can tell it not to use Cython by exporting the environment variable CU2QU_WITH_CYTHON=0 before (re)installing cu2qu. I noticed that static font builds of NotoSans went down from 37 to 32 seconds after I instaled cu2qu in pure-python mode; the variable font build also got faster by 4.4 seconds with pure cu2qu

anthrotype · 2021-04-26T12:05:34Z

(Maybe I should detect in cu2qu's setup.py if we are building cu2qu on pypy, and make it default to the pure-python version)

anthrotype · 2021-04-26T12:09:43Z

note, if you installed cu2qu normally once, pip may have built and cached a binary wheel, so the next time you pip install cu2qu it will attempt to use that. You can force it to build from source (so that it reads the CU2QU_WITH_CYTHON=0) by passing the pip install option --no-binary=cu2qu:

$ CU2QU_WITH_CYTHON=0 pip install --no-binary=cu2qu cu2qu

justvanrossum · 2021-04-26T12:10:06Z

yeah, skia-pathops in general is much faster than booleanOperations (pyclipper).

No, I was already using pathops, I meant the switch to pypy :)

(Maybe I should detect in cu2qu's setup.py if we are building cu2qu on pypy, and make it default to the pure-python version)

I tried to remove cu2qu's *.so by renaming it but I couldn't measure any difference. I possibly messed up something.

anthrotype · 2021-04-26T12:12:41Z

I tried to remove cu2qu's *.so by renaming it but I couldn't measure any difference.

strange, I can see reproducible 10 to 12% speed gains when switching to pure-python cu2qu

madig · 2021-04-26T14:17:27Z

still a net win in my case

yeah, skia-pathops in general is much faster than booleanOperations (pyclipper).

Last time I checked, there was a huge timesink somewhere in bOps, where it converts curves to something pyclipper can handle I think. It shows up as a huge bar in a profiler. bOps can probably be sped up significantly if that bottleneck is removed.

justvanrossum · 2021-04-27T21:34:26Z

strange, I can see reproducible 10 to 12% speed gains when switching to pure-python cu2qu

I must've been doing something wrong before. I'm seeing good improvements with that now, too. Thanks for the instructions how to configure that!

simoncozens · 2021-05-04T17:40:56Z

I have now got gvar generation working (very crudely) in fonttools-rs.

fontmake -m NotoSans/NotoSans-MM.designspace -o variable  66.60s use
CEB7
r 2.03s system 95% cpu 1:11.82 total
../target/release/fonticulus NotoSans-MM.designspace NotoSans.ttf  4.19s user 1.12s system 390% cpu 1.361 total

Still some bugs to fix, notably applying the avar maps, and it doesn't perform any IUP or shared point optimisations on the gvar table. Can potentially be optimised further by processing the glyphs in parallel, but it looks like UFO loading is the slowest part of the operation.

chrissimpkins · 2021-05-04T21:13:21Z

https://github.com/facebookincubator/cinder

"Instagram's internal performance-oriented production version of CPython 3.8"

behdad · 2021-05-04T21:24:47Z

https://github.com/facebookincubator/cinder

"Instagram's internal performance-oriented production version of CPython 3.8"

Oh VERY interesting!

simoncozens · 2021-05-13T12:56:03Z

Still some bugs to fix, notably applying the avar maps, and it doesn't perform any IUP or shared point optimisations on the gvar table. Can potentially be optimised further by processing the glyphs in parallel, but it looks like UFO loading is the slowest part of the operation.

I fixed a bunch of gvar bugs, applied avar maps, and implemented IUP optimisation. I also changed a .iter into .par_iter, and now the UFO loading is done in parallel.

If anyone's still keeping score, I just built an 81 master, 1049 glyph font in 3.10 seconds.

As far as fontmake is concerned, there are now a bunch of utilities in fonttools-cli which could be used to speed up fontmake/ufo2ft's post processing:

ttf-flatten-components replaces the FlattenComponents filter
ttf-optimize-gvar performs IUP optimisation on an unoptimized gvar table
ttf-rename-glyphs quickly renames glyphs to production names

madig · 2021-05-25T22:09:35Z

I tried to get more UFO loading performance out of Rust again: https://github.com/madig/readwrite-ufo-glif (experimental ufoLib2 PR at fonttools/ufoLib2#147)

It uses a variation of what Simon tried before, loading glyphs or entire layers in norad, serializing the data to Python dictionaries and passing them back to Python land, where ufoLib2 re-instantiates them directly, without going through ufoLib or pens.

It works and speeds up the process, but there remain frictional losses. I described my findings in madig/readwrite-ufo-glif#2.

Cosimo suggested trying to reduce the cross-over performance tax by circumventing PyO3 and serializing the data to JSON or pickle it or do something else where you don't need to go through Python's facilities on the Rust side. I'll see.

yisibl · 2021-11-01T04:32:36Z

be-fonts/better-ergonomics-spec#2

behdad · 2022-03-26T00:10:23Z

I have now got gvar generation working (very crudely) in fonttools-rs.

How does this work?

simoncozens · 2022-03-26T06:42:31Z

Same way as anywhere else: convert curves to quadratic, compute the deltas, stuff them in a gvar table.

behdad · 2022-07-28T23:23:47Z

Closing this sentiissue.

simoncozens mentioned this issue Jun 8, 2022

Add first draft googlefonts/oxidize#23

Merged

behdad closed this as completed Jul 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Making fontmake faster #1095

Making fontmake faster #1095

Making fontmake faster #1095

Making fontmake faster #1095

Comments