-
Notifications
You must be signed in to change notification settings - Fork 475
Making fontmake faster #1095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Just a small clarification: lxml is faster at writing rather than reading than ElementTree, because the writer is written in C (libxml2) |
Would it make more sense that fontmake has a trimmed down implementation (like your tiny objects example), than putting it into fontTools? My strong concern here is that by putting it into fontTools, that the ufoLib implementation becomes sidelined. It's important for interoperability to have design tools use the same, more strictly checked, ufoLib. I assume that this isn't as much of a concern with fontmake (and, if it knows that things are getting strictly checked in design tools, it'll help it make correct assumptions about data), so it seems better put there than fontTools. |
Well, then all kinds of scripts would have to import fontmake. We don't want fontmake to be a library, and we know we want to rewrite it at some point. This is NOT something only for fontmake. Any kind of batch processing can use a faster ufoLib / Defcon, which are designed more towards the font-editor usecase.
That would essentially make this yet another module, and I don't think @anthrotype is rushing to volunteer maintaining that :). I also think fontTools really should be able to open UFOs directly. See #814 When we do that, we probably want to rewrite ufo2ft, properly, and put it in fontTools as well. Plus going from binaries to UFO. At that point, UFO can become yet another input/output format for ttx, etc. |
why keep both? if the only difference between a slimmed down version and ufoLib are converters (between different ufo versions) and validators, then I would propose to make ufoLib more modular so users can add an request whatever they prefer: converting, validation, speed, ... All the speed improvements you are proposing are also welcome in ufoLib (fe lxml if available) and everyone would then benefit from your time and resources instead of only And this has already been researched see unified-font-object/ufoLib#28 and unified-font-object/ufoLib#27 |
@behdad Thanks for bringing this up. This is a very important discussion and I'm glad that it has started. Here are my (slightly disjointed) thoughts...
From my perspective, I'd be very happy if these optimizations went into ufoLib. I'm really wary of dividing the finite resources we have as a type tool developer community on duplicate code bases. We just have so much code to maintain as is that I'd hate to see us split ourselves even more just for some optimizations. If fontTools doesn't want to have a dependency on ufoLib, I think I'd be okay with ufoLib moving under the fontTools umbrella. (This is something that I just though of. I'd need to give it a bit more thought.) I've been thinking about removing the vast majority of the validation code from ufoLib. I wrote that stuff in parallel with the UFO 3 spec as a check and balance on my thoughts in each. The validator is VERY zealous about balking at even the tiniest thing. That was fine for my theoretical work, but it practice it is having some very unintended consequences. I'm starting to think that it should only do basic sanity checking so that it's not propagating bogus data into callers. That will remove a huge chunk of code and possibly speed things up.
I agree. As far as UFO spec changes go, yeah, that's been the post UFO 3 plan. UFO 4 isn't a huge change. The only reason that it's called "4" instead of "3.1" is that the formatVersion has been defined as an int since UFO 1. We tried to write UFO 3 so that it can grow much more organically. The hard part is GLIF since it has defined element names. Introducing new elements leads to nasty backwards compatibility issues. As far as that UFO 4 branch goes... My goal was to get that written and merge it into master quickly. I never meant for it to linger for so long. I felt like it was done, but it got stuck in the approval process and I never had time to keep working towards getting it merged. I was really happy with it and seriously considered merging it into master without approval, but I thought I'd get in trouble for doing that. 😄
I looked at LXML in the past for reading, but didn't see any dramatic speed differences (as @anthrotype notes). I never looked at it for writing because for my use the vast majority of UFO writes are atomic instead of everything at once. If you want ufoLib to use LXML if available, I'm totally fine with that.
Yeah. defcon creates a deep tree of objects. I made it more shallow for quicker loading in the last major revision, but as soon as the data is touched the object tree is expanded. If the goal is to create UFOs quickly without much manipulation before save, defcon is probably not the best fit. FWIW, fontMath has a tiny object structure and it's a huge speed improvement. It's even shallower than what you have in James' sketch. Contour objects are just tuples from what I remember. It may be a good thing to look at when working out a set of hyper fast objects. |
This brings up an interesting issue: Point Pens. These are deeply linked to ufoLib and vice-versa. As such, the base implementation of AbstractPointPen and the other basic bits and bobs would need to go to FontTools. I'm fine with that as well. |
I also feel that maintaining two implementations of ufoLib is quite wasteful. +1 on integration with fontTools, +1 on making it more versatile and fast, +1 in factoring out ufo validation, +1 on Probably much of https://github.com/robofab-developers/fontPens should then also migrate to fonttools. |
Ok, thank you all. How about:
For pens, I think we can move them to fontTools one by one as we need to. That's what we have been doing anyway. |
ok, and then let's also be brave and break the taboo that says fonttools can't have dependencies :) |
As long as it's pip-installable, I don't mind. Also, since we are in Python, lazy-enough dependencies are fine. We still need to figure-out about circular dependencies for certain functionalities, like a module in fonttools importing glyphsLib. |
it certainly is pip-installable (https://pypi.python.org/pypi/fs)
sure, though we already have a long list of optional and un-declared dependencies in various fonttools modules (see README) and I'm sure it's very confusing for users and downstream maintainers to keep track of all of them, given that our setup.py says we have no dependencies at all. I think some linux distros still think we require numpy... I just don't want to reinvent the wheel and have to maintain it too. |
So a separate implementation within the reference implementation? :) |
I wouldn't take the "reference implementation" designation on ufoLib too seriously. When I wrote that bit of documentation (and the strict validation code that is mentioned above) I was just coming off of writing a bunch of very official code for the WOFF specification. I was still speaking RFC 2119 (W3C dialect) and that leaked into the UFO specification. It sounds much more |
If you need help with this, please let me know. I've written a lot of implementations of these structures over the years and I'd be happy to pass along info about some of the things that I tried and learned. |
Oh those will be defcon replacement objects, not ufoLibs. ufoLibs looks pretty lean, I don't think we'll be rewriting a lot of it, just cleaning up / improving. |
FWIW, I agree with most of the points raised. Nice writeup Adrien! |
I dumbly tested the first statement of Ardien:
please correct me if Im wrong https://gist.github.com/typemytype/eb69d0a4b6acd92241af9d97e0bb3baa you can test this side by side in the new DrawBot with py3 support!!! |
The difference between py2 and py3 becomes less if you open the files in binary mode (avoids unicode conversion). Still, py3 is about twice as slow in the "all exists" case. |
Re: @adrientetar's thoughts:
|
So +1 to:
And -1 to:
|
The idea is to have something like TTFont but for UFO objects. The one thing that really needs support is custom classes, such that an app backend can be written of top of the fontTools objects. (And as I say it doesn't have to be dynamic like in defcon, so it's zero-cost.)
I'm not sure what you mean by that. Hash tables and trees have more overhead but they make certain operations faster. Once structural optimizations are exhausted speedups are often a matter of speed/memory tradeoff. Anyway it's just an idea, I haven't explored whether it makes sense to use here. It's always good to think outside of well-worn paths.
I think it can be made a lot more compact, that plus the fact that Cosimo wants to use a different XML lib makes it basically a rewrite. Anyway it's playing semantics, the end results matters not the means (it's up to the one writing it).
That's surprising since pre-check amounts to extra syscalls. Anyway, note my correctness argument. I think it's more important than speed. |
Added a section about "Single file UFO". |
That is not a bad idea per se.
This makes no sense to me at all. How can fonttools even pretend to know what font editors may need in terms of data structures? Anyway. Perhaps it's more useful to try to define a standard minimal object API. "Objects conforming to x, y, and z will interoperate nicely with tools a, b and c". |
Thanks Adrien! I agree with many of the points you raised, especially the need for a "fast path" between reader and object initialization and avoid unnecessary copies after the initial read. The goal is to make building or batch processing fonts faster, without unnecessary features that are more meant for interactive editors. What I would really love to see is a UFO reader/writer that's as fast as if it were written in pure C -- and we can achieve that speed with Cython, without giving up on all the niceties that Python offers. Even simply cythonizing a pure python code as is into an equivalent C code (i.e. one that calls the very same C API that the Python interpreter uses) gives you from 20 to 50 % for free, by just removing the interpreter step. Then as you add static typing to variables the gains can boost to 10 times and more depending on the task at hand. Lxml is also written in Cython, thus provides a Cython/C interface which could be called from other Cython modules, which makes building parsers/writers built on top of it even faster than it already is from pure Python. I would like this fast UFO reader/writer to be able to deserialize/serialize to/from primitive built-in data types like dict, lists and tuples. These have the advantage of being compatible with other serialization libraries like json, yaml, pickle, etc. Then from this unstructured data I would like to be able to build classes each representing specific data structures with certain combinations of fields and valid values. Now.. there is a very cool library which I'd love to use in one of my projects but I'm wary of proposing it because of the usual "dependency scare". It's called attrs, and it allows to declaratively define the attributes of a class and to remove much of the boilerplate involved in defining classes. It's hugely popular, well maintained and used my lots of projects -- so much so that there's now a PEP modelled after it, but that will only be available for future python versions, while attrs work with any python 2 or 3. I haven't given as much thought as Adrien yet, but you can see that what I have in mind goes more in the direction of a from-scratch rewrite rather than an incremental optimization. I'm not totally convinced about the idea of immutable data structures, but it may be just me not being much familiar with functional programming techniques, so I'd willing to learn more on that. |
that is basically fontParts |
Yes, but no. fontParts is a large API that is for convenient scripting in (perhaps) a font editor. I'm rather thinking of something small, with only simple operations and attributes. |
Perhaps even as small as: glyph.width
glyph.height
glyph.verticalOriginY
glyph.draw(pen)
glyph.drawPoints(pointPen)
glyph.getPen()
glyph.getPointPen()
glyph.lib
glyph.bounds |
yeah, I meant a subset of fontParts tailored for command-line batch processing tools. |
Yes, but really a very small, well documented subset. I think that can be extremely useful. |
Could it be that some C extension is wreaking havoc? I understand that the use of C extensions can be (very) slow in PyPy. Could it be lxml? Just a wild guess. Re: fonttools-rs: that's some awesome results already! Looking forward to these new building blocks. |
@simoncozens @behdad It would be worth thinking about how fonttools and fonttools-rs would relate. Would fonttools-rs become fonttools? If so, what effort would be spent documenting how to incorporate the rust version into python workflows. Would fonttools-rs become a dependency of fonttools and handle some things, but not all things? If so, what should the name of fonttools-rs be, as having two things called fonttools is super confusing for folks that don't track everything. I know this is an experiment at this point, and I don't want to slow that down. But, thinking about this now would be helpful. The last thing that one wants is to have two projects doing similar things that then don't get attention because attention is split between the two. This is basically a reminder that there are lots of folks who use fonttools who are primarily designers. They dabble in code to get things done, but it can be mysterious and confusing at times to them. That isn't a group to be dismissed, but embraced when thinking about shifts. How can a change be made seamless for them? Doing so has the side benefit of making it even easier for those who are in Vi/Emacs all day every day. |
Could it be. Cosimo suggested the same. I read the following article recently so was hoping that it wouldn't be that bad: |
OK, here's kinda my vision for fonttools-in-rust: FontTools-in-Python is really important as a way for designers to manipulate font files in custom ways. As you mention, designers know Python, they want to write simple scripts, and they can. That's great, and I don't want to change that. So fontTools is definitely part of the future. However, one major use case of fontTools is not custom, designer-generated, code - it's basically running fontmake, which is what this issue is about. What I'd like to do is:
Of course to do that, we need a Rust library which reads, manipulates and writes OT files, which is basically "fonttools equivalent" - but I'm doing this not because I want to replace fonttools per se, but because I want to replace fontmake. |
Running CPython 3.8
PyPy 3.7
This is the set of dependencies that I installed manually for pypy (I used
|
I reran on NotoSans-MM, this time from designspace. CPython 3.9.4:
and PyPy ~3.7:
I did notice however that the step that says:
takes ~25s in CPython and ~50s in PyPy. I looked into that step, it looks like that's actually when the UFO glyphs are loaded. So yes, this might be lxml PyPy overhead. |
FWIW, I'm getting 40% to 50% speedup when running (I can't get skia-pathops to work in pypy, probably something messed up in my py2 (!) environment, so I haven't been able to compare static font building with remove overlaps.) Adding lxml to my env indeed makes the whole thing a lot slower (close to CPython speed, but still a little faster), even when running from .glyphs. I wonder how much the JIT warmup costs. If it's significant, a longer running fontmake server process could save more. |
FWIW2: I'm running a big CJK pipeline (of which fontmake is only a part): it takes about 19 minutes with CPython, and a little over 5 minutes with pypy3. |
Probably moving processing of multiple fonts into Py would help. In cases like Noto, many fonts are built: many sources are read, and many output formats are built. In the current setup, for each font a new Py process gets started, so with PyPy, the JIT optimization gets lost. If multiple files are processed in the same process, then only the 1st font would be slower and then the JIT-compiled code would run faster on subsequent fonts. |
There could be a YAML file that holds the tasks for multiple fonts. So all args for all conversions would be in one YAML, fontmake would be called with the YAML and then all would work in t same run. |
we don't distribute wheels for pypy3 (last time I checked there was no portable wheel tag that worked, maybe things have changed since), so it's probably trying to build skia-pathops from source, and skia requires python2 to be on $PATH (it's |
Yes. It was some SSL related thing that I didn't care to debug. For a CI project elswhere I built Linux wheels: https://github.com/justvanrossum/skia-pathops-pypy-wheels/releases/tag/v0.1.0 Perhaps I should try to add macOS to the matrix there. But similar to lxml, it's disproportionally expensive to run pathops under pypy, although it was still a net win in my case. |
yeah, skia-pathops in general is much faster than booleanOperations (pyclipper). Another thing you can try is forcing to install the pure-python version of cu2qu. Right now when you |
(Maybe I should detect in cu2qu's setup.py if we are building cu2qu on pypy, and make it default to the pure-python version) |
note, if you installed cu2qu normally once, pip may have built and cached a binary wheel, so the next time you pip install cu2qu it will attempt to use that. You can force it to build from source (so that it reads the CU2QU_WITH_CYTHON=0) by passing the pip install option
|
No, I was already using pathops, I meant the switch to pypy :)
I tried to remove cu2qu's *.so by renaming it but I couldn't measure any difference. I possibly messed up something. |
strange, I can see reproducible 10 to 12% speed gains when switching to pure-python cu2qu |
Last time I checked, there was a huge timesink somewhere in bOps, where it converts curves to something pyclipper can handle I think. It shows up as a huge bar in a profiler. bOps can probably be sped up significantly if that bottleneck is removed. |
I must've been doing something wrong before. I'm seeing good improvements with that now, too. Thanks for the instructions how to configure that! |
I have now got gvar generation working (very crudely) in fonttools-rs.
Still some bugs to fix, notably applying the avar maps, and it doesn't perform any IUP or shared point optimisations on the gvar table. Can potentially be optimised further by processing the glyphs in parallel, but it looks like UFO loading is the slowest part of the operation. |
https://github.com/facebookincubator/cinder "Instagram's internal performance-oriented production version of CPython 3.8" |
Oh VERY interesting! |
I fixed a bunch of gvar bugs, applied avar maps, and implemented IUP optimisation. I also changed a If anyone's still keeping score, I just built an 81 master, 1049 glyph font in 3.10 seconds. As far as fontmake is concerned, there are now a bunch of utilities in fonttools-cli which could be used to speed up fontmake/ufo2ft's post processing:
|
I tried to get more UFO loading performance out of Rust again: https://github.com/madig/readwrite-ufo-glif (experimental ufoLib2 PR at fonttools/ufoLib2#147) It uses a variation of what Simon tried before, loading glyphs or entire layers in norad, serializing the data to Python dictionaries and passing them back to Python land, where ufoLib2 re-instantiates them directly, without going through ufoLib or pens. It works and speeds up the process, but there remain frictional losses. I described my findings in madig/readwrite-ufo-glif#2. Cosimo suggested trying to reduce the cross-over performance tax by circumventing PyO3 and serializing the data to JSON or pickle it or do something else where you don't need to go through Python's facilities on the Rust side. I'll see. |
How does this work? |
Same way as anywhere else: convert curves to quadratic, compute the deltas, stuff them in a gvar table. |
Closing this sentiissue. |
We want to drastically speed up fontmake: googlefonts/fontmake#367
Other than switching booleanOperations to something vector-based and fast (typemytype/booleanOperations#40), the other big offenders are basically 1. MutatorMath / fontMath, and general defcon / read/write UFO. For example, when James replaces defcon use with tiny objects, he saw about 20% speedup:
googlefonts/ufo2ft@2e65007
Cosimo has also reports that using
lxml
module instead ofxml
can significantly speed up UFO reading.Since https://github.com/unified-font-object/ufoLib is essentially a reference implementation, I like to propose keeping it that way, with validators and converters, but add a trimmed-down version in fontTools.ufoLib that is optimized for speed. I suggest the API for this be strictly a subset of the upstream ufoLib, such that users can use one or the other based on their need.
I also like to add
fs
module support to it, as we need that for build-system experimentations. Which brings us to the topic of the UFO4 branch: https://github.com/unified-font-object/ufoLib/tree/ufo4 In general, I suggest we avoid these big revolutionary major version bumps that take forever to roll out and cause dependecy hell and instead work on an evolutionary UFO that changes slowly over time in master branch.Please discuss.
@typesupply @typemytype @anthrotype @justvanrossum @LettError @jenskutilek @madig @brawer
The text was updated successfully, but these errors were encountered: