Is it possible to add support for CJK fonts · Issue #16 · garrettj403/SciencePlots · GitHub

Is it possible to add support for CJK fonts #16

Closed
Alraemon opened this issue Jul 16, 2020 · 9 comments · Fixed by #29
Comments

@Alraemon

Currently, legends and labels written in CJK characters are not displayed properly. Can this be fixed, or is there a way to override something so that they are displayed?

@garrettj403
Owner

I don't have any experience with CJK fonts, but I think it probably has something to do with the LaTeX font rendering.

Does it work properly if you disable LaTeX?

plt.style.use(['science', 'no-latex'])

@Alraemon
Author

Unfortunately, that does not help. With LaTeX enabled, I notice it uses pdfTeX, which supports CJK rendering poorly and is hard to tweak. Using no-latex does not fix the issue directly either, but some tweaks in the .mplstyle file could help. I will try to fix it and make a PR later. Hopefully I can also fix the LaTeX rendering issue. :D

@garrettj403
Owner

Hi @Alraemon

Did you have any luck getting this to work?

@Hsins
Contributor
Hsins commented Nov 3, 2020

Hi @garrettj403

matplotlib has no font fallback implementation, so it cannot properly display text that mixes CJK and Latin characters, even when the LaTeX rendering engine is disabled. For mixed text, we can only use a single font that supports both CJK and Latin characters (check this issue in matplotlib for more information).

There would be two possible ways to use CJK fonts for plotting:

  1. Disable the LaTeX rendering engine and choose a font that supports both CJK and Latin characters.
  2. Use pgf rendering but switch the TeX engine to XeLaTeX or LuaLaTeX. This is more complex and requires some infrastructure to be set up.
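
The first option can be sketched roughly as follows. This is only a minimal sketch, not what SciencePlots itself does: the helper names `pick_installed_font` and `use_single_cjk_font` are hypothetical, and the candidate font names are assumptions about what might be installed on a given machine.

```python
def pick_installed_font(candidates, installed_names):
    """Return the first candidate that is installed, or None if none is."""
    installed = set(installed_names)
    for name in candidates:
        if name in installed:
            return name
    return None


def use_single_cjk_font(candidates):
    """Disable LaTeX and use one installed font for all text (hypothetical helper)."""
    # Imported here so the pure helper above can be used without matplotlib.
    import matplotlib.pyplot as plt
    from matplotlib import font_manager

    # Names of all fonts that matplotlib's font manager has discovered.
    installed = {entry.name for entry in font_manager.fontManager.ttflist}
    font = pick_installed_font(candidates, installed)
    if font is None:
        raise RuntimeError(f"none of {candidates!r} is installed")
    # One font is applied to every character, so it must cover CJK and Latin.
    plt.rcParams.update({"text.usetex": False, "font.family": font})
    return font
```

For instance, calling `use_single_cjk_font(["Noto Serif CJK TC", "SimHei"])` after `plt.style.use(['science', 'no-latex'])` would select whichever of those fonts is found first, assuming at least one of them is installed.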

To mix CJK and Latin characters with the XeLaTeX or LuaLaTeX engine, we need to use the fontspec or babelfont packages (check this question on stackexchange):

There is an example in the Colab jupyter notebook

import matplotlib

matplotlib.use('pgf')  # switch backend to pgf
matplotlib.rcParams.update({
    "pgf.texsystem": "xelatex",  # compile with XeLaTeX instead of pdfLaTeX
    "text.usetex": True,         # render text through TeX
    "pgf.rcfonts": False,        # turn off default matplotlib font properties
    # recent matplotlib versions expect the preamble as a single string
    "pgf.preamble": "\n".join([
        r'\usepackage{fontspec, xeCJK}',
        r'\setmainfont{Latin Modern Roman}',  # Latin (roman) font
        r'\setCJKmainfont{SimHei}',           # CJK main font
        r'\setCJKsansfont{SimHei}',           # CJK sans-serif font
        r'\newCJKfontfamily{\Song}{SimSun}',  # \Song switches to SimSun
    ]),
})

The script above exports sample.pdf to the working folder. Here is a screenshot of the figure in that file:

image

@garrettj403 garrettj403 linked a pull request Nov 4, 2020 that will close this issue
@garrettj403 garrettj403 reopened this Nov 4, 2020
@garrettj403
Owner
garrettj403 commented Nov 5, 2020

Update

@Hsins made a pull request to add support for CJK fonts (see #29). The new cjk-fonts style requires some additional font packages. Instructions on how to install these packages can be found in the FAQ section of the README.

Example 14 provides an example of this style:

The style works well with SC/TC/JP characters, but there are some errors with Korean characters (see discussion in #29).

I reopened this issue in case anyone requires Korean characters.

@Hsins
Contributor
Hsins commented Nov 5, 2020

Hi @garrettj403

I have tried again to deal with the Korean characters. It seems that errors occur whenever Korean characters appear in the text, no matter which of the Noto Serif CJK TC, Noto Serif CJK JP, Noto Serif CJK KR or Noto Serif CJK SC fonts is used. I will try to find another open-source Korean font to apply.


By the way, I must point out that the font list below is meaningless in most cases (sorry that I pushed it in previous commits), because matplotlib lacks a font fallback mechanism.

font.serif : Noto Serif CJK TC, Noto Serif CJK SC, Noto Serif CJK JP, Noto Serif CJK KR, Times New Roman

matplotlib simply takes the first valid font in that list (valid meaning it can be found on the given path) and applies it to all characters in the string. So the cjk-fonts style only uses the Noto Serif CJK TC font for rendering if it can be found, and the fonts listed after it are ignored.
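
The behaviour described above can be illustrated with a toy model. This is not matplotlib's actual font lookup code: the function `render_plan`, the helper `is_cjk`, and the coverage table `HAS_CJK_GLYPHS` are all hypothetical and exist only to show why the order of the list matters when there is no per-character fallback.

```python
def is_cjk(ch):
    """Rough check covering common CJK ideograph, kana and Hangul blocks."""
    code = ord(ch)
    return (
        0x4E00 <= code <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= code <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= code <= 0xD7A3   # Hangul syllables
    )


# Hypothetical coverage table: does the font contain CJK glyphs?
HAS_CJK_GLYPHS = {"Noto Serif CJK TC": True, "Times New Roman": False}


def render_plan(text, font_list, installed):
    """Model 'first found font wins': return (font, characters without glyphs)."""
    font = next((name for name in font_list if name in installed), None)
    if font is None:
        return None, list(text)
    if HAS_CJK_GLYPHS.get(font, False):
        return font, []
    # The chosen font is used for the whole string; later fonts are never tried.
    return font, [ch for ch in text if is_cjk(ch)]
```

In this model, putting Times New Roman first leaves every CJK character without a glyph even though a CJK font is installed and listed second, while putting the CJK font first covers the whole string.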

Moreover, there are some differences in those Noto Fonts even though each font supports the charsets in other CJK languages:

Ref: https://ibe.tw/noto-font/

For these two reasons, users who write Traditional Chinese can just use the font list below (and the same goes for users of the other CJK languages, with the matching font).

font.serif : Noto Serif CJK TC, Times New Roman

Would you mind separating the cjk-fonts style into cjk-tc-fonts, cjk-sc-fonts, cjk-jp-fonts and cjk-kr-fonts styles?

@garrettj403
Owner

Okay, I split cjk-fonts into separate files in commit 4204854.

Note: When I run example 14, I get the following error message:

'NotoSerifCJKtc-Regular.otf' can not be subsetted into a Type 3 font. The entire font will be embedded in the output.

This means that the PDF files end up being ~20 MB. Do you have the same issue?

@Hsins
Contributor
Hsins commented Nov 6, 2020

I have the same issue, but it is not something we can solve ourselves.

By default, matplotlib uses Type 3 fonts when producing PDF documents, to reduce the file size. (The disadvantage of a Type 3 font is that it cannot be edited by a PDF editor such as Adobe Acrobat Pro. The “advantage” is that the produced PDF is small.)

But it fails to subset the CJK fonts we use. As a result, the entire fonts are embedded in the output file, and the sad news is that all CJK fonts are HUGE. This is probably the same as what happens when we need to use font type 42, which also makes the PDF file large.


Anyone who wants to reduce the size of the PDF file can follow the solutions here: use Ghostscript to convert the PDF file to a PostScript file, then convert it back.

$ pdf2ps file.pdf file.ps
$ ps2pdf -dPDFSETTINGS=/prepress file.ps file-optimized.pdf

The option -dPDFSETTINGS defines the quality of the produced PDF. Possible options and explanations are listed below:

  • -dPDFSETTINGS=/screen: screen-view-only quality, 72 dpi images
  • -dPDFSETTINGS=/ebook: low quality, 150 dpi images
  • -dPDFSETTINGS=/printer: high quality, 300 dpi images
  • -dPDFSETTINGS=/prepress: high quality, color preserving, 300 dpi images
  • -dPDFSETTINGS=/default: almost identical to /screen
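
The two commands above could be wrapped in a small script so that they run right after a figure is saved. This is a minimal sketch under a few assumptions: the `pdf2ps` and `ps2pdf` executables from Ghostscript are on the PATH, and the helper names `build_commands` and `shrink_pdf` are hypothetical.

```python
import os
import subprocess
import tempfile

# Presets listed above for -dPDFSETTINGS.
QUALITY_PRESETS = ("screen", "ebook", "printer", "prepress", "default")


def build_commands(src_pdf, tmp_ps, dst_pdf, quality="prepress"):
    """Return the pdf2ps and ps2pdf command lines as argument lists."""
    if quality not in QUALITY_PRESETS:
        raise ValueError(f"unknown quality preset: {quality!r}")
    return [
        ["pdf2ps", src_pdf, tmp_ps],
        ["ps2pdf", f"-dPDFSETTINGS=/{quality}", tmp_ps, dst_pdf],
    ]


def shrink_pdf(src_pdf, dst_pdf, quality="prepress"):
    """Round-trip a PDF through PostScript to reduce its size."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # The intermediate PostScript file is removed with the directory.
        tmp_ps = os.path.join(tmp_dir, "figure.ps")
        for cmd in build_commands(src_pdf, tmp_ps, dst_pdf, quality):
            subprocess.run(cmd, check=True)
```

With this, `shrink_pdf("file.pdf", "file-optimized.pdf")` would run the same two steps as the shell commands, raising an error if either tool fails.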

@garrettj403
Owner

I created a new release (v1.0.7) which includes limited support for CJK characters. There is still the issue that CJK pdf figures are very large but you can follow the instructions above (from @Hsins) to reduce the size of those figures. I don't think there is anything we can do to fix this.

Note:

  • Korean characters seem to work now (see Fig. 14d)
  • For me, everything works on both macOS and Windows (using Windows Subsystem for Linux)

I'll close this issue for now, but feel free to reopen if there are any more problems or ways to improve CJK support.
