Description
A string in which formatted non-alphanumeric content is immediately followed by an alphanumeric character is tokenized as plain text nodes, instead of a series of format nodes as expected.
The problem also reproduces on the CommonMark online demo: paste **@**A there and compare the result with **@** A.
Example:
All of these samples are tokenized as expected:
**@**@ => formatted non-alphanumeric + non-alphanumeric
@**@** => non-alphanumeric + formatted non-alphanumeric
@**A** => non-alphanumeric + formatted alphanumeric
**A** @ => formatted alphanumeric + space + non-alphanumeric
This sample is tokenized into text nodes and the formatting is not parsed: **@**A (formatted non-alphanumeric + alphanumeric)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text>**</text>
    <text>@</text>
    <text>**</text>
    <text>A</text>
  </paragraph>
</document>
Adding a space between the formatted non-alphanumeric content and the alphanumeric character changes the result. Compare the tokenization for the string **@** A:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <strong>
      <text>@</text>
    </strong>
    <text> A</text>
  </paragraph>
</document>
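For reference, the difference between these samples seems to line up with the left-flanking / right-flanking delimiter run rules in the CommonMark spec, which may explain why the online demo behaves the same way. Below is a minimal Python sketch of that check, not the code of this parser or of the reference implementation. The helper names (`is_whitespace`, `is_punctuation`, `flanking`) are hypothetical, and the punctuation test (Unicode categories P and S) is an approximation of the spec's definition.

```python
import unicodedata


def is_whitespace(ch):
    # Assumption: the start or end of the line (represented here as "")
    # counts as whitespace, as the CommonMark spec describes.
    return ch == "" or ch.isspace()


def is_punctuation(ch):
    # Approximation: treat Unicode categories P* and S* as punctuation.
    return ch != "" and unicodedata.category(ch).startswith(("P", "S"))


def flanking(text, start, end):
    """Return (left_flanking, right_flanking) for the delimiter run text[start:end]."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    left = (not is_whitespace(after)) and (
        not is_punctuation(after)
        or is_whitespace(before)
        or is_punctuation(before)
    )
    right = (not is_whitespace(before)) and (
        not is_punctuation(before)
        or is_whitespace(after)
        or is_punctuation(after)
    )
    return left, right


# Check the second "**" run (indices 3..5) in each sample.
for sample in ("**@**A", "**@** A", "**@**@"):
    print(repr(sample), flanking(sample, 3, 5))
# '**@**A' (True, False)   -> not right-flanking, so it cannot close
# '**@** A' (False, True)  -> right-flanking, so it can close
# '**@**@' (True, True)    -> right-flanking, so it can close
```

Under this reading, the second ** in **@**A is preceded by punctuation and followed by a letter, so it is not right-flanking and cannot close the strong emphasis. Whether that outcome is desirable is a separate question from whether it matches the spec.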