Data coverage by constituent count n

Pegah Faghiri, Kim Gerdes, Sylvain Kahane (2026). Verifying the Menzerath-Altmann law in the verbal domain in 180 languages. UDW26 @ LREC 2026.

For each constituent count n, the bars show how many languages have at least 100 verbal configurations of that size (the threshold used in the paper, §4). This shows why the LMAL and RMAL samples (124 and 103 languages, respectively) are smaller than the bilateral MAL sample (131 languages): each one-sided sample is constrained by the rarer high-n dependents on its side.
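
The bar heights described above can be reproduced with a simple count. The following is a minimal sketch, not the paper's code: the function name `coverage_by_n`, the nested-dictionary layout of the input, and the toy numbers are all assumptions made for illustration. Only the threshold of 100 and the name MIN_COUNT come from this page.

```python
from collections import Counter

# Threshold stated in the text (paper, §4); the constant name matches the chart title.
MIN_COUNT = 100


def coverage_by_n(counts, min_count=MIN_COUNT):
    """Return {n: number of languages with at least min_count configurations of size n}.

    `counts` is a hypothetical layout: {language: {n: number of verbal configurations}}.
    """
    coverage = Counter()
    for per_n in counts.values():
        for n, count in per_n.items():
            if count >= min_count:
                coverage[n] += 1
    return dict(sorted(coverage.items()))


# Made-up toy counts, for illustration only (not data from the paper).
toy_counts = {
    "lang_a": {1: 500, 2: 300, 3: 120, 4: 40},
    "lang_b": {1: 900, 2: 100, 3: 99},
    "lang_c": {1: 80, 2: 150},
}

coverage = coverage_by_n(toy_counts)
print(coverage)  # {1: 2, 2: 3, 3: 1}
```

In this toy input no language reaches the threshold at n=4, so that bucket is absent from the result, which mirrors how the bars shrink as n grows.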

Number of languages reaching MIN_COUNT threshold per n

Click any bar to list the languages contributing to it. Each language name links to its sample-sentence page for that bucket (e.g. Abaza at n=5 links to examples/abq/samples/mal_n5.html).