Computer Science > Computation and Language

arXiv:2404.14192 (cs)

[Submitted on 22 Apr 2024 (v1), last revised 25 May 2024 (this version, v3)]

Title: Swap distance minimization beyond entropy minimization in word order variation

Authors: Víctor Franco-Sánchez, Arnau Martí-Llobet, Ramon Ferrer-i-Cancho

Abstract: Here we consider the problem of all the possible orders of a linguistic structure formed by $n$ elements, for instance, subject, direct object and verb ($n=3$) or subject, direct object, indirect object and verb ($n=4$). We investigate whether the frequency of the $n!$ possible orders is constrained by two principles. First, entropy minimization, a principle that has been suggested to shape natural communication systems at distinct levels of organization. Second, swap distance minimization, namely a preference for word orders that require fewer swaps of adjacent elements to be produced from a source order. Here we present average swap distance, a novel score for research on swap distance minimization, and investigate the theoretical distribution of that score for any $n$: its minimum and maximum values and its expected value in die rolling experiments or when the word order frequencies are shuffled. We investigate whether entropy and average swap distance are significantly small in distinct linguistic structures with $n=3$ or $n=4$, in agreement with the corresponding minimization principles. We find strong evidence of entropy minimization and swap distance minimization with respect to a die rolling experiment. The evidence of these two forces with respect to a Pólya urn process is strong for $n=4$ but weaker for $n=3$. We still find evidence of swap distance minimization when word order frequencies are shuffled, indicating that swap distance minimization effects are beyond pressure to minimize word order entropy.
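
The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and it does not implement their average swap distance score, whose exact definition is given in the paper rather than in this abstract. The function names `swap_distance` and `entropy` are hypothetical, and the counts used below are toy values, not data from the paper. The sketch relies on one standard fact: the minimum number of adjacent swaps needed to turn one order into another equals the number of element pairs whose relative order differs between the two (the Kendall tau distance).

```python
from itertools import combinations
from math import log2


def swap_distance(source, target):
    """Minimum number of adjacent swaps that turn `source` into `target`.

    Both arguments are sequences over the same distinct elements,
    e.g. "SOV" and "VOS" for subject, object and verb.
    """
    # Relabel each element of `source` by its position in `target`.
    rank = {element: i for i, element in enumerate(target)}
    relabeled = [rank[element] for element in source]
    # Count inversions: pairs whose relative order differs between the two orders.
    return sum(
        1
        for i, j in combinations(range(len(relabeled)), 2)
        if relabeled[i] > relabeled[j]
    )


def entropy(counts):
    """Shannon entropy (in bits) of the distribution given by raw counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


# Toy usage for n = 3 (illustrative values only).
d_adjacent = swap_distance("SOV", "SVO")  # one adjacent swap apart
d_reverse = swap_distance("SOV", "VOS")   # full reversal, n(n-1)/2 swaps
h_uniform = entropy([1, 1, 1, 1, 1, 1])   # all 3! orders equally frequent
h_single = entropy([5, 0, 0, 0, 0, 0])    # a single order is ever used
```

With $n=3$, the reversal of an order is $n(n-1)/2 = 3$ swaps away, a uniform distribution over the $3! = 6$ orders has entropy $\log_2 6$ bits, and a distribution concentrated on one order has entropy 0.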

Comments:	Little language and maths errors corrected; substantial changes and corrections in Section 4.2
Subjects:	Computation and Language (cs.CL); Physics and Society (physics.soc-ph)
Cite as:	arXiv:2404.14192 [cs.CL]
	(or arXiv:2404.14192v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2404.14192

Submission history

From: Ramon Ferrer-i-Cancho
[v1] Mon, 22 Apr 2024 14:01:09 UTC (234 KB)
[v2] Sun, 28 Apr 2024 17:38:51 UTC (235 KB)
[v3] Sat, 25 May 2024 14:10:05 UTC (249 KB)
