Emojis
Hello!
The latest PDFWriter release (4.5.12) includes support for fonts that contain emojis. Notable examples of emoji fonts are Windows' Segoe UI Emoji and Google's Noto font. This means that writing text that includes emojis will result in lovely colorful emojis, rather than black and white representations.
Emojis are very common in our daily communication and writeups, so it makes sense that they’ll also be present in PDF files. However, they do present a technical difficulty in being generally different from the rest of the text: they have color, normally more than one, and some have fancy gradients. In a way, emojis are less like text and more like images. If you are familiar with the SVG file format…they are way closer to that than to regular text. From a usage point of view, however, they are an integral part of text input, and so they are definitely text. For instance, when copying text with emojis from one application to another, you expect the text to include the emojis and show them properly, no exceptions. You also expect to be able to type emojis, not to have to upload an image every time you want to use one. So…also text.
If you try to write text with emojis and a relevant font using a previous version of Hummus, they will come out black and white. This is because colorful emojis need special support, which is what you get with this release. The following discusses what it means to have emojis in fonts, what PDF can do for you in general in this area, and how PDFWriter deals with this.
Fonts and Emojis
Before discussing how emojis are represented in fonts, let’s talk about text in general. Emojis are part of text strings and each has its own Unicode value. For example, the smile emoji ☺ has the value U+263A. The vomiting emoji 🤮 has the value U+1F92E. So they are just like regular text, and this is what makes it possible to do things like copy and paste, save in text files, and everything else you can do with text.
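To make the "emojis are just Unicode values" point concrete, here is a minimal sketch of turning a UTF-8 string into code points. It is an illustration only, not PDFWriter code: the function name decodeUtf8 is made up for this example, and it assumes well-formed input without doing any validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode a UTF-8 string into Unicode code points.
// Sketch only: assumes well-formed input and does no validation.
std::vector<uint32_t> decodeUtf8(const std::string& text) {
    std::vector<uint32_t> codePoints;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char lead = static_cast<unsigned char>(text[i]);
        uint32_t value;
        std::size_t extra;  // number of continuation bytes after the lead byte
        if (lead < 0x80)      { value = lead;        extra = 0; }
        else if (lead < 0xE0) { value = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { value = lead & 0x0F; extra = 2; }
        else                  { value = lead & 0x07; extra = 3; }
        assert(i + extra < text.size());  // precondition: sequence is complete
        for (std::size_t k = 1; k <= extra; ++k) {
            // Each continuation byte contributes its low 6 bits.
            value = (value << 6) |
                    (static_cast<unsigned char>(text[i + k]) & 0x3F);
        }
        codePoints.push_back(value);
        i += extra + 1;
    }
    return codePoints;
}
```

Feeding it the bytes E2 98 BA yields 0x263A, and F0 9F A4 AE yields 0x1F92E, the two values mentioned above. An emoji outside the basic plane simply takes four bytes instead of one to three.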
When it comes to representing the shape of text in applications, in print or in PDF, that is done with fonts. A “glyph” is the name we use for the shape of a character (or a combination of characters, to be exact, but the point is that it’s a single drawing). To be able to display emojis we therefore need to have glyphs for them.
Normally a glyph is just about shape. So font formats like TrueType (ttf files) or OpenType (otf files) contain drawing operations for things like curves and lines, and take great care to do this in a compact manner that still provides enough expressive power to show any shape an artist can draw. When used in PDF, fonts are largely embedded, with some additions and conversion from their original format (depending on what types of fonts PDF supports vs. what the application that outputs to PDF supports), so the same goes for PDF. This focus on shape is what allows you to apply color to text independently: the glyph provides the shape and the text is painted in whatever color you pick. Italic and bold styles are simply other fonts containing those versions of the same family of glyphs (e.g. Times New Roman, Times New Roman Italic etc.).
Emojis, as opposed to regular text, don’t only have shape. They also come in specific colors, most of the time more than one, and therefore there’s no way to apply a color to them. They also don’t tend to come in regular, bold and italic versions. A bit more rigid technically…but they can express much more. Interesting 🤔.
TrueType/OpenType emoji special
To allow defining emojis there are extensions to the TrueType/OpenType format. You can read about them on Wikipedia. I’ll summarize by saying that there are 3 approaches:
- Expressing emojis as combinations of TrueType/OpenType glyphs, with the addition of colors, gradient colors and combos. In effect, this attempts to implement SVG using TrueType for drawing the shapes. This form is defined using the COLR table (everything in TrueType is tables 🤪).
- Expressing emojis as images. This is done with either the CBDT or the sbix table.
- Expressing emojis as SVG graphics. This is done with the SVG table.
The extra data required to draw specific glyphs as colorful emojis is found in those tables. At this point there’s no one standard format supported in all applications and by all fonts (politics). Fonts are normally made with a specific method in mind, with a fallback to default black and white drawing using the regular glyf or CFF tables.
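Since each approach lives in its own table, you can tell which one a font uses by looking at its table directory. The following is a rough sketch under some assumptions: it handles a single-font file (not a .ttc collection), it only looks at the table tags, and the name findColorTables is made up for this example. In practice a library such as FreeType does this work for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Read a big-endian 16-bit value from the font data.
static uint16_t readU16(const std::vector<uint8_t>& d, std::size_t pos) {
    assert(pos + 1 < d.size());
    return static_cast<uint16_t>((d[pos] << 8) | d[pos + 1]);
}

// Return the tags of color-related tables found in the table directory of a
// single-font TrueType/OpenType file.
std::vector<std::string> findColorTables(const std::vector<uint8_t>& font) {
    static const char* const kColorTags[] = {"COLR", "CBDT", "sbix", "SVG "};
    std::vector<std::string> found;
    if (font.size() < 12) return found;  // too short to hold the header
    const std::size_t numTables = readU16(font, 4);
    for (std::size_t t = 0; t < numTables; ++t) {
        // 16-byte table records follow the 12-byte header; the tag is the
        // first 4 bytes of each record.
        const std::size_t record = 12 + 16 * t;
        if (record + 16 > font.size()) break;
        const std::string tag(
            reinterpret_cast<const char*>(font.data() + record), 4);
        for (const char* colorTag : kColorTags) {
            if (tag == colorTag) found.push_back(tag);
        }
    }
    return found;
}
```

Note that the SVG tag is padded with a trailing space, because table tags are always four bytes long.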
Hummus and Emoji fonts
My interest in this started with an issue on GitHub. I got a note that Segoe UI Emoji appears black and white. Well, why wouldn’t it? I then found out about colorful emojis in fonts, and in particular about the COLR table. Segoe only uses the very simple version 0 of COLR, which just defines an emoji as a list of pairs of glyphs and colors. To draw the emoji you iterate the list, drawing each glyph in its particular color, one on top of the other, to get the final glyph. I implemented this part. Thanks go to FreeType for implementing the parsing of the COLR table so I didn’t have to, and could focus on translating this to PDF commands and structures. Thank you, FreeType. You can see a usage example in the test file PDFWriterTesting\ColorEmojiColr.cpp. Here’s how it looks in the PDF file:
I then figured I probably want to continue this and implement COLR v1 as well. This is the complete version of COLR, which goes beyond solid colors and layers of glyphs: you can now use gradient colors and combine glyphs through multiple transformations. There are special blend modes too. You can read all about it in the specs at Microsoft or Google. The Google one has a good link to some sample COLR v1 fonts, including a version of Noto that defines its emojis using COLR v1. Noto is way more interesting than Segoe, containing some emojis with awesome gradients. You can see an example of using this font in PDFWriterTesting\ColorEmojiColrV1.cpp. That results in this:
I stopped here and did not continue to SVG and the image-based tables. I figured that’s good enough for now.
Implementation details
Emojis are available when you draw text using a content context WriteText command. When encountering a glyph that has references in the COLR table, the regular drawing of text is suspended and the glyph is drawn individually using either its COLR v1 definition or its COLR v0 one. This means that text containing only regular characters will continue to be written as it is now (probably using a single PDF Tj operator).
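The "suspend regular drawing for color glyphs" idea can be sketched as splitting the glyph sequence into runs. This is only an illustration of the approach, not the actual PDFWriter code: GlyphRun, splitIntoRuns and the hasColorDefinition predicate (standing in for a COLR table lookup) are all hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

struct GlyphRun {
    bool isColor;                    // true: a single color glyph on its own
    std::vector<uint16_t> glyphIds;  // glyphs covered by this run
};

// Split a glyph sequence so that plain glyphs are batched together, while
// every color glyph becomes its own run. `hasColorDefinition` is a
// hypothetical predicate standing in for a COLR table lookup.
std::vector<GlyphRun> splitIntoRuns(
    const std::vector<uint16_t>& glyphs,
    const std::function<bool(uint16_t)>& hasColorDefinition) {
    assert(static_cast<bool>(hasColorDefinition));  // predicate is required
    std::vector<GlyphRun> runs;
    for (uint16_t glyph : glyphs) {
        const bool color = hasColorDefinition(glyph);
        // Start a new run for every color glyph, and for the first plain
        // glyph that follows a color glyph (or starts the text).
        if (color || runs.empty() || runs.back().isColor) {
            runs.push_back(GlyphRun{color, {}});
        }
        runs.back().glyphIds.push_back(glyph);
    }
    return runs;
}
```

With a split like this, text that has no color glyphs ends up as exactly one plain run, which matches the point above that regular text keeps being written the way it always was.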
The classes LayeredGlyphsDrawingContext and PaintedGlyphsDrawingContext implement COLR v0 and COLR v1 drawing (respectively). The COLR v1 format contains multiple paint operators and most of them are supported (see below for limitations).
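To give a feel for how simple COLR v0 is, here is a sketch of resolving its layers into paint steps. It is not the code of LayeredGlyphsDrawingContext: the types and the resolveLayers function are made up for this example. The one detail taken from the COLR format itself is that a palette index of 0xFFFF means "use the current text color".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Color { double r, g, b; };

struct Layer {
    uint16_t glyphId;       // plain outline to fill
    uint16_t paletteIndex;  // index into the palette; 0xFFFF = text color
};

struct PaintStep {
    uint16_t glyphId;
    Color fill;
};

// Resolve COLR v0 layers into paint steps, bottom layer first. Each step
// fills one plain glyph outline with one solid color, painted over the
// previous steps.
std::vector<PaintStep> resolveLayers(const std::vector<Layer>& layers,
                                     const std::vector<Color>& palette,
                                     const Color& textColor) {
    std::vector<PaintStep> steps;
    for (const Layer& layer : layers) {
        if (layer.paletteIndex == 0xFFFF) {
            steps.push_back(PaintStep{layer.glyphId, textColor});
        } else {
            assert(layer.paletteIndex < palette.size());
            steps.push_back(
                PaintStep{layer.glyphId, palette[layer.paletteIndex]});
        }
    }
    return steps;
}
```

Drawing the emoji then comes down to walking the steps in order, setting the fill color and drawing the glyph for each one.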
The drawing of gradients borrows heavily from a similar implementation in Google’s Skia library. Specifically the SkGradientShader interface, its PDF output and the code of SkPDFGradientShader. It’s thanks to that that we got them, because I’m not an expert on gradients, at all. It’s really helpful that they’ve got this lovely fiddle that both shows what comes out and can emit PDF. Truly remarkable work. No wonder it’s used everywhere haha.
I must say it’s almost a bit of a wonder how closely the definition of gradients in COLR resembles what’s available in Skia shaders…but then I’m guessing the connection is that both are inspired by, or attempting to implement, SVG.
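For readers who, like me, are not gradient experts: at its core a gradient is a list of color stops, and the color at any position is an interpolation between the two neighbouring stops. The sketch below shows just that idea. It is not Skia’s or PDFWriter’s code, the names (ColorStop, colorAt, mix) are made up, and it only covers the case where positions outside the stops are clamped to the nearest stop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Rgba { double r, g, b, a; };

struct ColorStop {
    double offset;  // position along the gradient, usually within 0..1
    Rgba color;
};

// Linear interpolation between two values.
static double mix(double from, double to, double t) {
    return from + (to - from) * t;
}

// Evaluate a gradient at position t. Positions before the first stop or
// after the last one are clamped to that stop's color. Stops must be
// sorted by offset.
Rgba colorAt(const std::vector<ColorStop>& stops, double t) {
    assert(!stops.empty());
    if (t <= stops.front().offset) return stops.front().color;
    if (t >= stops.back().offset) return stops.back().color;
    std::size_t i = 1;
    while (stops[i].offset < t) ++i;  // first stop at or after t
    const ColorStop& lo = stops[i - 1];
    const ColorStop& hi = stops[i];
    const double local = (t - lo.offset) / (hi.offset - lo.offset);
    return Rgba{mix(lo.color.r, hi.color.r, local),
                mix(lo.color.g, hi.color.g, local),
                mix(lo.color.b, hi.color.b, local),
                mix(lo.color.a, hi.color.a, local)};
}
```

Linear, radial and sweep gradients differ in how a point on the page is mapped to the position t, not in how the color is picked once t is known.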
Also, as stated earlier, I used FreeType to parse the COLR v1 and COLR v0 table information, and they did top-notch work as well…I didn’t have to parse a thing myself.
Limitations and caveats
Both because of time limitations (meaning, until I started getting tired) and because of implementation details, there are some issues with this implementation. They mostly mean that some emojis might appear incorrect compared to how you know the font should look. Here’s a list:
- Implementation is done on WriteText only. Other text operators still only support regular chars and the black and white fallback for emojis (if one exists). So, for instance, Tj does not have emoji support. The reason is mostly that using Tj and other operators entails being able to rely on text state elements (like the matrix set in Tm), while drawing an emoji goes well outside of simple text drawing and uses more complex graphic elements such as cm matrices, PDF patterns and transparency.
- I left out implementing blend modes via the PaintComposite operator for now. It’s just more work that I’ll do based on general (and my own) interest. I would appreciate a sample of a glyph using something that’s not just srcOver (which I did implement).
- For sweep gradients I only did 0…360, so you can’t choose a different (that is, smaller) angle range. Mostly because the Skia PDF output seems to have the same limitation…and I’ll need more time to figure this out, and figured it can wait.
- For gradients (with or without alpha channels) I’m using patterns. This is how one would draw a gradient, nothing exceptional about that. The problem is that patterns ignore the current matrix set by previous operators or by containing elements (say, if the pattern is used in a form XObject) and deal directly with the default matrix of the page. While I could take care of providing a matrix that includes the WriteText position (translation) and whatever matrices exist within the emoji glyph drawing, any previously applied matrix is ignored. This may result in the glyph appearing without color if there was a shading and you defined an external matrix. It also means that forms including such text can only be reused under the same transformation matrix. This includes not being able to show properly when simply placed (without scaling or rotating) in another location. This is a truly annoying limitation of patterns…but that’s what it is. If you are using WriteText as is, without setting a matrix beforehand, it should be OK. If you did set up a matrix beforehand, the problem could be solved by providing this matrix to WriteText, so that the pattern definition uses it in addition to its internal matrix computation…but I’m not allowing for that for now. If you are interested, let me know and I’ll find a way to pass this matrix through.
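The pattern matrix problem from the last item is easier to see with numbers. The sketch below uses the PDF matrix convention [a b c d e f] and shows where the glyph is drawn versus where a pattern lands when the outer matrix is left out. All names here (Matrix, concat, patternMatrix and so on) are made up for the illustration, and the values are arbitrary; this is not how PDFWriter exposes any of it.

```cpp
#include <cassert>

// PDF-style affine matrix [a b c d e f]:
// x' = a*x + c*y + e,  y' = b*x + d*y + f
struct Matrix { double a, b, c, d, e, f; };
struct Point { double x, y; };

// Combine two matrices: apply `first`, then `second`.
Matrix concat(const Matrix& first, const Matrix& second) {
    return Matrix{
        first.a * second.a + first.b * second.c,
        first.a * second.b + first.b * second.d,
        first.c * second.a + first.d * second.c,
        first.c * second.b + first.d * second.d,
        first.e * second.a + first.f * second.c + second.e,
        first.e * second.b + first.f * second.d + second.f};
}

Point apply(const Matrix& m, const Point& p) {
    return Point{m.a * p.x + m.c * p.y + m.e,
                 m.b * p.x + m.d * p.y + m.f};
}

Matrix translation(double tx, double ty) {
    return Matrix{1, 0, 0, 1, tx, ty};
}

Matrix scaling(double s) {
    assert(s > 0);  // a zero or negative scale makes no sense for a font size
    return Matrix{s, 0, 0, s, 0, 0};
}

// Matrix a pattern needs in order to line up with the glyph it fills:
// glyph space -> text position -> whatever matrix was set before the text.
Matrix patternMatrix(const Matrix& glyph, const Matrix& textPosition,
                     const Matrix& outer) {
    return concat(concat(glyph, textPosition), outer);
}
```

Take a glyph scaled by 0.5, text placed at (100, 200) and an outer translation of (50, 0). The glyph point (10, 0) is drawn at (155, 200), but a pattern built without the outer matrix puts its matching point at (105, 200), 50 units off. That offset is exactly the "previously applied matrix is ignored" problem.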
Oh and one more thing
One side effect of this is that there’s now a class that can provide gradient patterns for you to use. ShadingWriter provides an interface for creating radial, linear and sweep RGBA shading patterns that you can use independently of text creation. Initialize it with an ObjectsContext and a DocumentContext, which you can get from PDFWriter, and it then provides functions to create each gradient from the parameters you give it. I didn’t bother much with documentation (as always). If you are interested and can’t understand how to use it, let me know and I’ll provide a write-up and examples.
That’s all for now.
XOXO,
Gal.