This application displays text in multiple languages (English, Ukrainian, Spanish) and color-codes each word based on its Toki Pona definition. Tooltips show the Toki Pona words, their semantic primal color components, and English translations.
- Background Color: Never use any backgrounds except #ffffff (white). All UI components must adhere to this. This is crucial for the color-intensive text display.
This project displays a given text, with words colored based on their corresponding Toki Pona definitions. The coloring follows a specific set of rules:
- Input Text Processing: The input English text is split into words. Punctuation attached to words is removed for the purpose of finding definitions, but the original word (with punctuation) is preserved for display.
- Toki Pona Definition Search:
  - For each English word of length `N` (after splitting from the main text, but before cleaning punctuation for lookup), we aim to find the "most precise" Toki Pona (TP) definition.
  - The maximum number of TP words allowed in this definition is `ceil(N / 2)`. For example:
    - A 1-letter English word: max 1 TP word.
    - A 2-letter English word: max 1 TP word.
    - A 3-letter English word: max 2 TP words.
    - A 4-letter English word: max 2 TP words.
    - A 5-letter English word: max 3 TP words.
  - The search for definitions primarily uses `compounds.txt`. This file contains common TP words and compound phrases with their English translations and frequency scores. An English word is looked up in these translations.
  - If multiple TP definitions from `compounds.txt` match an English word and satisfy the length criteria, the one with the highest frequency score is chosen as the "most precise."
  - The official `dictionary.yml` can also be a source for definitions, though the current implementation prioritizes `compounds.txt` for phrase matching.
  - Important: The English definitions (`en` field) in `data.ts` are not used for finding the TP definition for coloring; `compounds.txt` and `dictionary.yml` are the sources.
- Special Particles `pi` and `e`:
  - When a TP definition is found (e.g., "pali pi kama sona"), the particles `pi` (of) and `e` (direct object marker) are stripped out from the definition before its length is calculated.
  - These particles also do not count towards the `ceil(N / 2)` length limit.
  - For example, "experiment" can be defined as "pali pi kama sona". After stripping `pi`, it becomes "pali kama sona", which has 3 TP words. This definition would be valid for an English word "experiment" (10 letters, `ceil(10/2) = 5`, so max 5 TP words).
- Applying Colors to English Word Letters:
  - Once the best TP definition (e.g., `[tp_word1, tp_word2, tp_word3]`) is determined:
    - The first letter of the (usually two-letter) segment of the original English word is colored using `color1` of `tp_word1` (from `data.ts`).
    - The second letter of that English word segment is colored using `color2` of `tp_word1`.
    - This pattern continues: the next two letters of the English word (forming the second segment) are colored by `tp_word2`'s `color1` (for the third English letter) and `color2` (for the fourth English letter), and so on.
  - If the TP definition is shorter than the number of 2-letter segments in the English word (i.e., `definition_length < ceil(N / 2)`), any remaining letters of the English word are colored white.
  - If an English word has an odd number of letters, the last single letter forms a segment. It is colored with `color1` of the corresponding TP word, or white if the definition is exhausted.
  - If no suitable TP definition is found for an English word, the entire word is displayed in white.
  - Spaces between words are preserved and colored black.
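The definition-selection rules above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Candidate` shape, the function names, and the ASCII-only punctuation rule are all assumptions, and it presumes the `compounds.txt` entries matching a word have already been parsed into candidates.

```typescript
// Hypothetical shape for one compounds.txt match; the real parser may differ.
interface Candidate {
  tp: string;        // TP phrase, e.g. "pali pi kama sona"
  frequency: number; // frequency score of the matched English translation
}

// Particles that are stripped and do not count toward the length limit.
const PARTICLES: string[] = ["pi", "e"];

// Strip leading/trailing punctuation for lookup only (ASCII-only assumption);
// the caller keeps the original word for display.
function cleanForLookup(word: string): string {
  return word.replace(/^[^A-Za-z0-9]+|[^A-Za-z0-9]+$/g, "");
}

// Maximum number of TP words allowed for an English word of length n.
function maxTpWords(n: number): number {
  return Math.ceil(n / 2);
}

// Split a TP phrase into words and drop the particles `pi` and `e`.
function stripParticles(tpPhrase: string): string[] {
  return tpPhrase
    .split(/\s+/)
    .filter((w) => w.length > 0 && PARTICLES.indexOf(w) === -1);
}

// Among candidates within the length limit, pick the highest frequency score.
// N is the length of the raw word, before punctuation is cleaned.
function pickDefinition(rawWord: string, candidates: Candidate[]): string[] | null {
  const limit = maxTpWords(rawWord.length);
  let best: Candidate | null = null;
  for (const c of candidates) {
    const words = stripParticles(c.tp);
    if (words.length === 0 || words.length > limit) continue;
    if (best === null || c.frequency > best.frequency) best = c;
  }
  return best === null ? null : stripParticles(best.tp);
}
```

For "experiment" with a single candidate `{ tp: "pali pi kama sona", frequency: 47 }`, `pickDefinition` returns `["pali", "kama", "sona"]`; for a 2-letter word the same candidate is rejected because the limit is 1.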
Let's say the English word is "experiment" (10 letters). `N = 10`. Max TP definition length = `ceil(10 / 2) = 5` words.

- We search for "experiment" in `compounds.txt`.
  - We find: `pali pi kama sona: [experiment 47, ...]`
- The TP phrase is "pali pi kama sona".
  - Stripping `pi`: "pali kama sona".
  - Length of this definition: 3 words (`pali`, `kama`, `sona`). This is `<= 5`, so it's a valid definition.
  - Let's assume this is the most precise definition found.
- Colors are defined in `data.ts` for `pali`, `kama`, `sona`:
  - `pali`: { color1: "yellow", color2: "yellow" }
  - `kama`: { color1: "yellow", color2: "red" }
  - `sona`: { color1: "gray", color2: "yellow" }
- Applying colors to "experiment":
  - `e` (letter 1) uses `pali`'s `color1`: `yellow`.
  - `x` (letter 2) uses `pali`'s `color2`: `yellow`.
  - `p` (letter 3) uses `kama`'s `color1`: `yellow`.
  - `e` (letter 4) uses `kama`'s `color2`: `red`.
  - `r` (letter 5) uses `sona`'s `color1`: `gray`.
  - `i` (letter 6) uses `sona`'s `color2`: `yellow`.
  - `m` (letter 7): The TP definition (`pali`, `kama`, `sona`) is now exhausted. This letter is colored white.
  - `e` (letter 8): Also colored white.
  - `n` (letter 9): Also colored white.
  - `t` (letter 10): Also colored white.
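The walkthrough above can be reproduced with a short sketch. The `TP_COLORS` table below only mirrors the three entries quoted in this example; the real `data.ts` structure, the `colorWord` name, and the choice to fall back to white for a TP word missing from the table are assumptions made for illustration.

```typescript
interface TpColors {
  color1: string;
  color2: string;
}

interface ColoredLetter {
  letter: string;
  color: string;
}

// Hypothetical excerpt with the values quoted in the example above.
const TP_COLORS: { [tp: string]: TpColors } = {
  pali: { color1: "yellow", color2: "yellow" },
  kama: { color1: "yellow", color2: "red" },
  sona: { color1: "gray", color2: "yellow" },
};

// Color each letter of `word` using the stripped TP definition.
// Letters at even index use color1, odd index use color2 of the
// TP word owning that two-letter segment; leftovers are white.
function colorWord(word: string, definition: string[]): ColoredLetter[] {
  const out: ColoredLetter[] = [];
  for (let i = 0; i < word.length; i++) {
    const tp = definition[Math.floor(i / 2)];
    const entry = tp !== undefined ? TP_COLORS[tp] : undefined;
    let color = "white"; // definition exhausted, or no definition at all
    if (entry !== undefined) {
      color = i % 2 === 0 ? entry.color1 : entry.color2;
    }
    out.push({ letter: word.charAt(i), color: color });
  }
  return out;
}
```

Calling `colorWord("experiment", ["pali", "kama", "sona"])` yields yellow, yellow, yellow, red, gray, yellow for the first six letters and white for the remaining four, matching the list above. An empty definition colors the whole word white.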
GitHub Markdown does not support coloring individual letters within a word using standard syntax. Instead, we list each letter with its associated TP word and color, and then give an approximation of the visual output.
`experiment` ->

- `e` (pali: yellow) `x` (pali: yellow)
- `p` (kama: yellow) `e` (kama: red)
- `r` (sona: gray) `i` (sona: yellow)
- `m` (white) `e` (white) `n` (white) `t` (white)
If we were to use simple font colors (this is an approximation of the per-letter coloring):

experiment (Note: This simplified Markdown example just applies some of the distinct colors.)

Or, with a black background (simulated via a table cell for Markdown, if the viewer supports HTML):

experiment (Colors are hex approximations.)

(The actual app applies a specific color to each character directly via the CSS `color` style on a black page background.)
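As a rough illustration of per-character CSS coloring, the sketch below turns a list of colored letters into one `<span>` per character with an inline `color` style. The `Glyph` type and the `renderWord` and `escapeHtml` helpers are hypothetical; the app may well build DOM nodes or components rather than an HTML string, and the color values are assumed to be trusted CSS color names or hex codes.

```typescript
interface Glyph {
  letter: string;
  color: string; // assumed to be a trusted CSS color value
}

// Escape characters that are special in HTML text and attributes.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Emit one <span> per character, each carrying its own inline color.
function renderWord(glyphs: Glyph[]): string {
  return glyphs
    .map((g) => '<span style="color: ' + g.color + '">' + escapeHtml(g.letter) + "</span>")
    .join("");
}
```

Keeping the color on each character's own element means the result does not depend on inherited styles, which matters when adjacent letters of the same word use different colors.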