Pegah Faghiri, Kim Gerdes, Sylvain Kahane (2026). Verifying the Menzerath-Altmann law in the verbal domain in 180 languages. UDW26 @ LREC 2026.
This page mirrors the analysis notebook in order: Markdown commentary (including tables) followed by any plots saved by the corresponding code cells.
This section computes the primary MAL measure based on total number of dependents.
MAL_n = Average constituent size when a verb has total n dependents (left + right combined = n).
For n=4, this includes ALL configurations:
- VXXXX (4 right, 0 left) → bilateral_L0_R4_*
- XVXXX (1 left, 3 right) → bilateral_L1_R3_*
- XXVXX (2 left, 2 right) → bilateral_L2_R2_*
- XXXVX (3 left, 1 right) → bilateral_L3_R1_*
- XXXXV (4 left, 0 right) → bilateral_L4_R0_*
Note: Uses bilateral keys in the data. Also computes directional MAL (MAL_right_n, MAL_left_n) for use in Section 3.
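The pooling over configurations described above can be sketched as follows. This is a minimal illustration, not the notebook's code: it assumes a hypothetical dictionary mapping each bilateral key to a list of constituent sizes, the key suffix (shown as `*` above) is a placeholder, and the function names `bilateral_prefixes` and `mal_n` are invented for the example. The average is taken as a geometric mean, following the description given later on this page; swapping in an arithmetic mean changes one line.

```python
from statistics import geometric_mean


def bilateral_prefixes(n):
    """Key prefixes for every (left, right) split with left + right == n."""
    return [f"bilateral_L{left}_R{n - left}_" for left in range(n + 1)]


def mal_n(sizes_by_key, n):
    """Pool constituent sizes over all configurations with n dependents.

    sizes_by_key is a hypothetical dict: bilateral key -> list of sizes.
    Returns None when no configuration with n dependents is present.
    """
    prefixes = tuple(bilateral_prefixes(n))
    pooled = [
        size
        for key, sizes in sizes_by_key.items()
        if key.startswith(prefixes)
        for size in sizes
    ]
    return geometric_mean(pooled) if pooled else None


# Toy data: the "_a" suffix is a placeholder, not the real key format.
toy = {
    "bilateral_L0_R2_a": [2, 8],
    "bilateral_L1_R1_a": [4],
    "bilateral_L0_R3_a": [1, 1, 1],
}

print(bilateral_prefixes(4))  # the five configurations listed for n=4
print(mal_n(toy, 2))          # pools L0_R2 and L1_R1 only
```

The trailing underscore in each prefix keeps `R1` from matching `R10` when larger dependent counts occur.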
The MAL Effect Score quantifies how well a language follows the Menzerath-Altmann Law.
If MAL holds, then as n (number of dependents) increases, the average constituent size (MAL_n) should decrease. A language with a high MAL effect score will show a decreasing trend: MAL_1 > MAL_2 > MAL_3 > MAL_4.
We fit a linear regression: $\text{MAL}_n = \alpha + \beta \cdot n$
| Metric | Formula | Interpretation |
|---|---|---|
| Slope (β) | From regression | Negative = sizes decrease with n (MAL holds) |
| Normalized Slope | $\beta / \text{MAL}_1$ | Scale-independent version |
| MAL effect score | $-\beta / \text{MAL}_1$ | Higher = stronger MAL effect |
| Spearman ρ | Rank correlation of n vs MAL_n | More negative = stronger monotonic decrease |
| Decrease Ratio | $\text{MAL}_1 / \text{MAL}_{\max}$ | > 1 means sizes decreased overall |
If a language has MAL_1=3.5, MAL_2=2.8, MAL_3=2.2, MAL_4=1.9:
- Slope β = -0.54 (negative → good)
- Normalized slope ≈ -0.15
- MAL effect score ≈ 0.15 (positive → follows MAL)
- Spearman ρ = -1.0 (perfect monotonic decrease)
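The slope-based metrics in the table can be reproduced for the values 3.5, 2.8, 2.2, 1.9 with a few lines of standard-library Python. This is a sketch under stated assumptions: the function names are illustrative, and $\text{MAL}_{\max}$ is assumed to mean MAL at the largest available n.

```python
def ols_slope(ns, values):
    """Ordinary least-squares slope of values regressed on ns."""
    mean_n = sum(ns) / len(ns)
    mean_v = sum(values) / len(values)
    sxy = sum((n - mean_n) * (v - mean_v) for n, v in zip(ns, values))
    sxx = sum((n - mean_n) ** 2 for n in ns)
    return sxy / sxx


def mal_metrics(mal_by_n):
    """mal_by_n: dict mapping n -> MAL_n (must contain n=1)."""
    ns = sorted(mal_by_n)
    values = [mal_by_n[n] for n in ns]
    beta = ols_slope(ns, values)
    mal_1 = mal_by_n[1]
    mal_max = mal_by_n[ns[-1]]  # assumption: MAL at the largest n
    return {
        "slope": beta,
        "normalized_slope": beta / mal_1,
        "effect_score": -beta / mal_1,
        "decrease_ratio": mal_1 / mal_max,
    }


metrics = mal_metrics({1: 3.5, 2: 2.8, 3: 2.2, 4: 1.9})
print({name: round(value, 3) for name, value in metrics.items()})
# slope -0.54, normalized_slope -0.154, effect_score 0.154, decrease_ratio 1.842
```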
Investigate relationships between MAL effect score and word order typology.
(Visualization cells follow - MAL_n curves are part of Section 2.3)
The plot above shows the Menzerath-Altmann Law (MAL) across languages. Each line represents a language, with the x-axis showing the total number of dependents (n) and the y-axis showing the average constituent size (MAL_n).
Key observations:
Strong MAL effect for n=1 to 3: The mean curve (black line) shows a clear downward trend from n=1 to n=3, confirming MAL — as the number of dependents increases, the average size of each constituent decreases.
Upturn at higher n values: The mean curve rises from n=3 to n=6. This is likely a sampling artifact: at higher n values, only languages with sufficient data (≥10 occurrences) contribute to the mean. These tend to be languages with larger corpora or more complex verbal constructions, which may have systematically different properties than the full set of languages.
Survivor bias: As n increases, fewer languages have enough data to be included. The languages that "survive" to n=5 or n=6 are not a random sample — they may represent specific typological profiles or simply better-resourced languages.
Cross-linguistic pattern (n=1 to 3): The consistent downward pattern across diverse language families (indicated by colors) for low n values suggests that MAL is a broad cross-linguistic tendency, plausibly driven by cognitive processing constraints and communicative efficiency.
Variation around the mean: The spread of individual language curves indicates cross-linguistic variation in the strength of the MAL effect, which may correlate with typological features such as head-directionality (VO vs OV).
Plot the relationship between MAL effect scores and Spearman correlation coefficients across languages.
What this plot shows: This scatter plot compares two metrics of MAL adherence: the MAL effect score ($-\beta / \text{MAL}_1$) and the Spearman rank correlation ρ between n and MAL_n.
Expected relationship: Since both metrics measure the same underlying phenomenon (constituent size decreasing with valency), we expect a strong negative correlation: languages with a more negative Spearman ρ should have higher MAL effect scores.
Key observations:
- A tight linear relationship validates that both metrics capture the same phenomenon consistently
- Outliers may indicate languages where the MAL relationship is non-monotonic or non-linear
- The regression line slope indicates how the two metrics scale relative to each other
Ceiling/Floor Effects and Discriminative Power:
A notable pattern in this plot is that languages with extreme Spearman values (approaching -1 or +1) show considerable vertical spread in their MAL effect scores. This is a ceiling effect (or a floor effect at -1): once the rank correlation saturates, it can no longer distinguish languages that differ in how steeply constituent size declines.
For example, two languages might both have Spearman ρ = -1 (perfect negative monotonic relationship), but one might show a steep decline in constituent size (high MAL effect score) while another shows only a gradual decline (lower MAL effect score). The Spearman correlation conflates these cases; the MAL effect score differentiates them.
Implication for metric choice:
This suggests that the MAL effect score is the more informative metric for cross-linguistic comparison, particularly when:
1. Many languages cluster at extreme Spearman values
2. The research question concerns the strength of the MAL effect, not just its presence
3. Fine-grained distinctions between languages with similar monotonic patterns are needed
The Spearman correlation remains useful as a robustness check (confirming monotonicity) and for cases where the relationship may be non-linear, but the MAL effect score provides greater discriminative resolution across the full range of MAL behavior.
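The saturation argument can be checked on two invented curves. In this sketch both "languages" are hypothetical, the helper names are illustrative, and the Spearman formula used is the simple tie-free version, which is enough for strictly monotonic toy data.

```python
def spearman_rho(xs, ys):
    """Spearman rank correlation for data without ties."""
    def ranks(seq):
        order = sorted(range(len(seq)), key=lambda i: seq[i])
        result = [0] * len(seq)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    rank_x, rank_y = ranks(xs), ranks(ys)
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def effect_score(ns, values):
    """MAL effect score: -slope / MAL_1, with an OLS slope."""
    mean_n = sum(ns) / len(ns)
    mean_v = sum(values) / len(values)
    sxy = sum((n - mean_n) * (v - mean_v) for n, v in zip(ns, values))
    sxx = sum((n - mean_n) ** 2 for n in ns)
    return -(sxy / sxx) / values[0]


ns = [1, 2, 3, 4]
steep = [3.5, 2.8, 2.2, 1.9]        # hypothetical language, strong decline
shallow = [2.0, 1.98, 1.96, 1.94]   # hypothetical language, weak decline

for name, curve in [("steep", steep), ("shallow", shallow)]:
    print(name, spearman_rho(ns, curve), round(effect_score(ns, curve), 3))
# Both curves give rho = -1.0; the effect scores are about 0.154 and 0.01.
```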
This section analyzes MAL separately for left-side and right-side dependents.
What these plots show:
- Left panel (MAL_right_n): Average constituent size when a head has n right-side dependents (with any number of left dependents). A decreasing curve means constituents shrink as more right dependents are added.
- Right panel (MAL_left_n): Average constituent size when a head has n left-side dependents (with any number of right dependents).
Key observations:
- Both curves typically show negative slopes (decreasing size with more dependents), confirming MAL operates on both sides of the head
- The steepness of each curve reflects how strongly that side exhibits MAL
- Languages where the right curve is steeper than the left → stronger MAL effect on the right (typical for VO languages)
- Languages where the left curve is steeper than the right → stronger MAL effect on the left (typical for OV languages)
- The black mean line shows the cross-linguistic average trend
| Measure | Description | Linguistic Interpretation |
|---|---|---|
| Directional Slope Ratio | slope_right / slope_left | Values >1 indicate stronger MAL on right side; <1 indicates left-side dominance |
| Directional R² Comparison | Compare R² of right vs left linear fits | Which side shows more consistent MAL behavior? |
| Crossover Point | At what n does left MAL = right MAL? | Identifies asymmetry threshold |
| Directional MAL Difference | MAL_right_n - MAL_left_n for each n | How size differs by direction at each dependent count |
| Interaction Effect | Size when n_left × n_right jointly considered | Does having dependents on both sides amplify/dampen MAL? |
| Positional Decay | How does MAL vary by position within left/right dependents? | First dependent vs. second vs. third on each side |
| Head-Dependent Asymmetry | Compare dependent size reduction left vs right of their own heads | Recursive MAL within the dependency tree |
| Directional Compliance Rate | % of languages with slope < 0 for each direction | Is MAL more universal on one side? |
| Variance by Direction | Std of constituent sizes at each n per direction | Which direction has more stable MAL effect? |
This section analyzes the stepwise dynamics of MAL: how constituent size changes from one dependent count to the next.
Analyzes which specific transitions follow MAL for each language:
| Measure | Description |
|---|---|
| Step Compliance | For each transition (1→2, 2→3, 3→4): is MAL_{n} > MAL_{n+1}? |
| Compliance Category | Fully compliant / First-step only / Partial / Anti-MAL |
| Compliance Count | Number of decreasing transitions (0 to 3 for n=1..4) |
| Weighted Score | Early transitions weighted more: w₁(1→2) + w₂(2→3) + w₃(3→4) |
Categories:
- 🟢 Fully MAL-conformant: All steps decrease (MAL₁ > MAL₂ > MAL₃ > MAL₄)
- 🟢 First-step compliant: At least MAL₁ > MAL₂ (most important linguistically)
- 🟡 Partial: Some steps decrease, but not monotonic
- 🔴 Anti-MAL: Sizes increase with n (MAL₁ < MAL₄)
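The step measures and categories can be sketched as a small classifier. Two things here are assumptions rather than facts from the notebook: the weights (0.5, 0.3, 0.2) are hypothetical placeholders for w₁, w₂, w₃, and the order in which overlapping categories are tested (Fully, then Anti-MAL, then First-step, then Partial) is one plausible reading of the definitions.

```python
def step_compliance(mal, weights=(0.5, 0.3, 0.2)):
    """Classify a curve [MAL_1, MAL_2, MAL_3, MAL_4].

    weights are hypothetical; the page only says early steps weigh more.
    """
    steps = [a > b for a, b in zip(mal, mal[1:])]  # True = decreasing step
    count = sum(steps)
    weighted = sum(w for w, ok in zip(weights, steps) if ok)

    # Assumed precedence when categories overlap.
    if all(steps):
        category = "Fully"
    elif mal[0] < mal[-1]:
        category = "Anti"
    elif steps[0]:
        category = "First"
    else:
        category = "Partial"
    return {"steps": steps, "count": count,
            "weighted": weighted, "category": category}


print(step_compliance([2.50, 1.80, 1.45, 1.20]))  # all three steps decrease
print(step_compliance([2.0, 1.5, 1.6, 1.7]))      # only 1→2 decreases
print(step_compliance([1.5, 1.6, 1.7, 1.8]))      # sizes grow with n
```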
What are MAL values?
MAL_n represents the average constituent size (in words) when a verb has exactly n total dependents. It's computed as the geometric mean of all constituent sizes across all verb configurations with n dependents.
What is "dependent count" (n)?
The total number of dependents of a verb, regardless of whether they appear on the left or right side. This is the sum of left + right dependents.
How to interpret the heatmap:
| Reading | Interpretation |
|---|---|
| Color intensity | Darker red = larger constituent sizes; lighter yellow = smaller sizes |
| ↓ arrows | Size decreased from previous n → MAL-conformant transition |
| ↑ arrows | Size increased from previous n → anti-MAL transition |
| Row order | Languages sorted by MAL effect score (most compliant at top) |
| Category labels | Right side shows compliance category (Fully, First, Partial, Anti) |
Example reading: If a row shows 2.50 → ↓1.80 → ↓1.45 → ↓1.20, this language is fully MAL-conformant because constituent sizes consistently decrease as the number of dependents increases (2.50 > 1.80 > 1.45 > 1.20).
The heatmap displays languages sorted by MAL effect score (highest at top). Since only the top 60 languages are shown, these are predominantly the most MAL-conformant ones, so most are labeled "Fully" (fully MAL-conformant). Languages with partial or anti-MAL patterns appear further down the list and may not be visible in this truncated view.
What does being at the top mean?
Yakut, at the top of the list, is the language with the strongest MAL effect among those shown. Its values:
- MAL_1 ≈ 1.37 → MAL_2 ≈ 1.35 → MAL_3 ≈ 1.15 → MAL_4 ≈ (lower)
This means:
- When a Yakut verb has 1 dependent, its average constituent size is ~1.37 words
- When it has 4 dependents, constituents shrink to share the limited space
- All transitions show ↓ (decreasing), so Yakut is fully MAL-conformant
Why do some languages have larger MAL_1 values?
Languages like Galician (MAL_1 ≈ 5.82) or Catalan (MAL_1 ≈ 4.96) have larger baseline constituent sizes. This doesn't mean they're "less compliant": they still show consistent decreases. The MAL effect score is normalized by MAL_1, so languages with different absolute sizes can be fairly compared.
Key insight: Nearly all languages in this truncated view (the top 60 by MAL effect score) follow MAL, with constituent sizes consistently shrinking as the number of dependents increases. This is consistent with MAL holding across diverse language families, although the view excludes the least conformant languages by construction.
Reading the Bar Plot:
- Each bar shows the mean MAL effect score for all languages in that family
- Error bars represent ±1 standard deviation within each family
- Numbers in parentheses indicate sample size (number of languages/treebanks)
- Higher (more positive) values = stronger MAL effect (more dependents → smaller constituent size)
Statistical Significance:
Global MAL Effect: The one-sample t-test determines whether languages overall exhibit MAL behavior (mean MAL effect score ≠ 0). A significant result confirms MAL is a genuine cross-linguistic tendency.
Between-Family Differences: The Kruskal-Wallis test (a non-parametric analogue of one-way ANOVA) tests whether families differ significantly in their MAL effect score. If significant (p < 0.05), family membership matters for MAL strength.
Per-Family Deviations: Individual t-tests identify which families deviate significantly from the global mean:
- Families with a significantly higher MAL effect score show stronger head-planning effects
- Families with a significantly lower MAL effect score may have other structural factors counteracting MAL
Caveats:
- Sample sizes vary greatly across families (some have 50+ treebanks, others only 3-5)
- Small families have less statistical power and larger confidence intervals
- Family groupings may obscure within-family diversity
Significance stars: * p < 0.05, ** p < 0.01, *** p < 0.001
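The quantities behind the bar plot and the global test can be sketched with the standard library alone. Everything in this example is illustrative: the family labels and scores are made up, the sample standard deviation is an assumed choice for the error bars, and only the t statistic is computed (a p-value would need a t distribution, for example from SciPy).

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Hypothetical (family, MAL effect score) pairs, not real results.
scores = [("Family A", 0.10), ("Family A", 0.20), ("Family B", 0.30)]


def family_summary(rows):
    """Mean, sample SD, and size per family (SD is None for singletons)."""
    by_family = defaultdict(list)
    for family, score in rows:
        by_family[family].append(score)
    return {
        family: {
            "mean": mean(values),
            "sd": stdev(values) if len(values) > 1 else None,
            "n": len(values),
        }
        for family, values in by_family.items()
    }


def one_sample_t(values, mu=0.0):
    """t statistic for H0: mean == mu, with len(values) - 1 degrees of freedom."""
    return (mean(values) - mu) / (stdev(values) / sqrt(len(values)))


summary = family_summary(scores)
t_stat = one_sample_t([score for _, score in scores])
print(summary)
print(round(t_stat, 3))
```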
Question: Does MAL effect score differ between left-side and right-side dependents? Does this correlate with head-directionality?
We compute:
- MAL Asymmetry = MAL_compliance_right - MAL_compliance_left
- Positive asymmetry → stronger MAL effect on right-side dependents
- Negative asymmetry → stronger MAL effect on left-side dependents
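A minimal sketch of the asymmetry computation follows. It assumes that the directional scores are the same normalized-slope effect score, applied separately to MAL_right_n and MAL_left_n; the two input curves are invented for illustration.

```python
def effect_score(mal_by_n):
    """-slope / MAL_1 for a dict n -> MAL_n (OLS slope)."""
    ns = sorted(mal_by_n)
    values = [mal_by_n[n] for n in ns]
    mean_n = sum(ns) / len(ns)
    mean_v = sum(values) / len(values)
    sxy = sum((n - mean_n) * (v - mean_v) for n, v in zip(ns, values))
    sxx = sum((n - mean_n) ** 2 for n in ns)
    return -(sxy / sxx) / mal_by_n[1]


# Hypothetical directional curves for one language.
mal_right = {1: 3.0, 2: 2.4, 3: 2.0}
mal_left = {1: 2.0, 2: 1.9, 3: 1.8}

score_right = effect_score(mal_right)
score_left = effect_score(mal_left)
asymmetry = score_right - score_left  # > 0: stronger MAL on the right

print(round(score_right, 3), round(score_left, 3), round(asymmetry, 3))
```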
Hypothesis: VO (head-initial) languages may show stronger right-side MAL, while OV (head-final) languages show stronger left-side MAL.
Left Panel: Left vs Right MAL effect score Scatter
- Each point represents a language
- Above the diagonal: Languages where MAL is stronger on the right side (right dependents shrink more as their count increases)
- Below the diagonal: Languages where MAL is stronger on the left side
- Languages near the diagonal have symmetric MAL across both sides
Middle Panel: Asymmetry by Language Family
- Bars to the right of zero: Families with stronger right-side MAL
- Bars to the left of zero: Families with stronger left-side MAL
- Error bars show within-family variation
Right Panel: Asymmetry vs VO Score
- Tests whether word order predicts which side shows stronger MAL
- Hypothesis: VO (head-initial) languages may show stronger right-side MAL because right dependents are more common
- Positive correlation would support this hypothesis
- Weak/no correlation suggests MAL asymmetry is independent of basic word order
Key Questions Answered:
1. Is MAL universal across both directions, or does one side dominate?
2. Do language families show consistent directional biases?
3. Does head-directionality predict MAL asymmetry?
Question: Beyond binary compliance (yes/no decrease), how steep is the MAL curve?
We compute:
- Early decay rate = (MAL_1 - MAL_2) / MAL_1 — relative drop in first step
- Late decay rate = (MAL_3 - MAL_4) / MAL_3 — relative drop in last step
- Total decay = (MAL_1 - MAL_max) / MAL_1 — overall shrinkage
This reveals whether languages have "front-loaded" decay (big drop early, then flat) or "gradual" decay (consistent shrinkage throughout).
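The three decay rates can be computed directly from a four-point curve. In this sketch MAL_max is assumed to be MAL_4, and the labelling rule with its tolerance of 0.05 is a hypothetical threshold chosen for illustration; the notebook's actual cut-offs are not stated on this page.

```python
def decay_rates(mal):
    """mal: [MAL_1, MAL_2, MAL_3, MAL_4]; MAL_max is assumed to be MAL_4."""
    m1, m2, m3, m4 = mal
    return {
        "early": (m1 - m2) / m1,
        "late": (m3 - m4) / m3,
        "total": (m1 - m4) / m1,
    }


def decay_pattern(rates, tolerance=0.05):
    """Label a curve by comparing early and late decay (hypothetical rule)."""
    difference = rates["early"] - rates["late"]
    if difference > tolerance:
        return "Front-loaded"
    if difference < -tolerance:
        return "Back-loaded"
    return "Gradual"


rates = decay_rates([3.5, 2.8, 2.2, 1.9])
print({name: round(value, 3) for name, value in rates.items()})
# early 0.2, late 0.136, total 0.457
print(decay_pattern(rates))
```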
Top-Left: Distribution of Decay Rates by Transition
- Histograms show how decay rates are distributed across languages
- Positive values = constituent size decreased (MAL-conformant)
- Negative values = constituent size increased (anti-MAL)
- Compare the distributions: Is early decay (blue) typically larger than late decay (red)?
Top-Right: Early vs Late Decay Scatter
- Each point is a language
- Below diagonal: "Front-loaded" languages, where most size reduction happens early (1→2)
- Above diagonal: "Back-loaded" languages, where most reduction happens late (3→4)
- Near diagonal: "Gradual" languages, with consistent decay throughout
Bottom-Left: Decay Rates by Family
- Grouped bars show mean decay at each transition for each family
- Families with tall blue bars but short red bars have front-loaded MAL
- Families with similar bar heights have gradual MAL
Bottom-Right: Decay Pattern Distribution (Pie Chart)
- Shows the proportion of languages in each decay category
- Front-loaded: Big drop from n=1→2, then flattens
- Gradual: Consistent shrinkage at each step
- Back-loaded: Small early drop, bigger late drop
- Unknown: Insufficient data or mixed patterns
Key Insight: If most languages are "front-loaded," it suggests that the constraint to shrink constituents is strongest when going from 1 to 2 dependents—the first "competition" for space around the verb.
Question: What are the different "shapes" of MAL curves across languages? Can we identify clusters or typological patterns?
We:
1. Normalize MAL values: MAL_n_norm = MAL_n / MAL_1 (so all curves start at 1.0)
2. Visualize trajectories using parallel coordinates plots
3. Cluster languages by trajectory shape to identify "types"
This reveals whether languages have similar MAL "signatures" across families.
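The normalize-then-cluster steps can be sketched without external libraries. The k-means below is a deliberately tiny stand-in written for this example (Lloyd's algorithm with the first k points as initial centroids), not the clustering routine the notebook uses, and the four trajectories are hypothetical.

```python
def normalize(trajectory):
    """Divide by MAL_1 so every curve starts at 1.0."""
    return [value / trajectory[0] for value in trajectory]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Minimal Lloyd's algorithm; centroids start at the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each point.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical MAL_1..MAL_4 curves for four languages.
raw = [
    [3.5, 2.8, 2.2, 1.9],
    [2.0, 1.98, 1.96, 1.94],
    [4.0, 3.1, 2.5, 2.1],
    [1.5, 1.5, 1.45, 1.45],
]
normalized = [normalize(curve) for curve in raw]
labels, centroids = kmeans(normalized, k=2)
print(labels)  # steep curves share one label, flat curves the other
```

Normalizing first matters: without it, clusters would mostly reflect baseline constituent size rather than trajectory shape.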
Top-Left: All Normalized Trajectories by Family
- Each thin line is a language's MAL trajectory, normalized so MAL_1 = 1.0
- Lines going down = constituent sizes shrink as n increases (MAL-conformant)
- Lines staying flat = weak or no MAL effect
- Lines going up = anti-MAL (rare)
- The black line shows the global mean trajectory
Top-Right: Trajectory Clusters
- K-means clustering identifies distinct trajectory "shapes"
- Each cluster's mean trajectory is shown as a thick line
- Clusters might include:
  - Steep decline: Drops rapidly to ~0.5-0.6 by n=4
  - Gradual decline: Steady decrease to ~0.7-0.8
  - Flat/Weak MAL: Stays near 1.0 (little shrinkage)
  - Early steep: Big drop at n=2, then flattens
Bottom-Left: Mean Trajectory by Language Family
- Each family's average trajectory
- Families with lower endpoints (right side) have stronger MAL
- Families with steeper slopes show more dramatic shrinkage
- Compare families: Do related languages behave similarly?
Bottom-Right: Cluster Distribution by Family
- Shows which trajectory types are common in each family
- Families dominated by one color have consistent MAL behavior
- Mixed-color families have diverse MAL patterns
Key Questions Answered:
1. Are there typologically distinct MAL "signatures"?
2. Do language families cluster together in trajectory space?
3. What is the typical amount of constituent shrinkage (e.g., 20%, 40%)?
Question: Is MAL a genuine cross-linguistic universal, or could the observed effects arise by chance?
For each language, we test whether the MAL effect is statistically significant using:
1. Permutation test: Shuffle the MAL_n values and check if observed slope is extreme
2. One-sample t-test: Is the slope significantly different from 0?
3. Bootstrap confidence intervals: 95% CI for the slope
A language shows significant MAL if:
- The slope is negative AND statistically significant (p < 0.05)
We report:
- % of languages with significant MAL effect
- % by language family
- Effect sizes (Cohen's d)
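The permutation test in step 1 can be sketched as an exact test when the curve is short enough to enumerate every ordering. The sketch assumes a one-sided test (how often a shuffled curve has a slope at least as negative as the observed one) and four MAL_n values per language; the notebook may use random shuffles, a two-sided criterion, or more points.

```python
from itertools import permutations


def ols_slope(ns, values):
    mean_n = sum(ns) / len(ns)
    mean_v = sum(values) / len(values)
    sxy = sum((n - mean_n) * (v - mean_v) for n, v in zip(ns, values))
    sxx = sum((n - mean_n) ** 2 for n in ns)
    return sxy / sxx


def permutation_p_value(values):
    """Exact one-sided p-value: share of orderings with slope <= observed."""
    ns = list(range(1, len(values) + 1))
    observed = ols_slope(ns, values)
    slopes = [ols_slope(ns, perm) for perm in permutations(values)]
    # Small epsilon guards against floating-point noise in the comparison.
    hits = sum(s <= observed + 1e-12 for s in slopes)
    return observed, hits / len(slopes)


observed, p_value = permutation_p_value([3.5, 2.8, 2.2, 1.9])
print(round(observed, 2), round(p_value, 4))  # -0.54 0.0417
```

With four distinct values there are only 24 orderings, so the smallest attainable one-sided p-value is 1/24 ≈ 0.042. Under these assumptions, only a strictly decreasing curve can fall below 0.05.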
Results Summary (n=165 languages):
| Category | Count | Percentage |
|---|---|---|
| Negative slope (MAL direction) | 115 | 69.7% |
| Significant slope (p<0.05) | 63 | 38.2% |
| Significant MAL effect | 51 | 30.9% |
| Significant anti-MAL | 12 | 7.3% |
Key Findings:
~70% of languages show MAL direction: The majority of languages exhibit the expected pattern where constituent size decreases as the number of dependents increases.
~31% show statistically significant MAL: About one-third of languages have a MAL effect strong enough to be statistically significant via permutation test. This is far above the 5% expected by chance (binomial test p < 0.001).
Mean effect size = -0.66: This is a medium-to-large effect (Cohen's d), indicating that MAL is not just statistically significant but also substantively meaningful.
Anti-MAL is rare (~7%): Only 12 languages show significant effects in the opposite direction, suggesting MAL violations are uncommon.
Conclusion: MAL is a genuine cross-linguistic tendency. While not universal in the strict sense (not all languages show significant effects), the pattern is far more prevalent than chance would predict, with 70% showing the expected direction and 31% reaching statistical significance.
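The comparison against the 5% chance level can be reproduced from the counts in the results table (51 significant languages out of 165). The sketch computes the upper binomial tail directly; it assumes the per-language tests are independent, which is a simplification given that related languages are not independent samples.

```python
from math import comb


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


n_languages = 165
n_significant = 51

share = n_significant / n_languages
tail = binomial_upper_tail(n_significant, n_languages, 0.05)

print(f"{share:.1%}")   # 30.9%
print(tail < 0.001)     # True
```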
Pie Chart Interpretation:
- Left pie: Shows the split between languages with significant MAL (green, ~31%) vs. not significant (red, ~69%)
- Right pie: Breaks down by direction: green for significant MAL, red for significant anti-MAL, gray for non-significant
Generate an HTML report combining all the plots and markdown text from this notebook, and add it to the index.html.
This cell generates the UD_maps.html file containing two interactive maps:
1. Map 1: Languages by family (equal-sized dots, colored by language group)
2. Map 2: Languages by corpus size (dot size proportional to token count)
The info box below each map displays statistics when hovering over a language dot.
Built from 08_menzerath_altmann_analysis.ipynb: 32 markdown cells, 13 plots embedded.
8. Menzerath-Altmann Law (MAL) Analysis
Summary: Comprehensive analysis of Menzerath-Altmann Law across 200+ languages from Universal Dependencies, examining how constituent size decreases as the number of dependents increases.
Theoretical Background
Menzerath-Altmann Law (MAL) states that the larger a linguistic construct, the smaller its constituents tend to be. In verb-centered analysis:
- As the number of verbal dependents (n) increases, the average constituent size should decrease
- A language with high MAL effect score shows a clear decreasing trend in constituent size as n grows
Notebook Structure
Key Measures Computed
Prerequisites
- 01_data_preparation_and_validation.ipynb → metadata.pkl
- 04_data_processing.ipynb → all_langs_position2sizes.pkl, all_langs_position2num.pkl
- 05_comparative_visualization.ipynb → vo_vs_hi_scores.csv
Inputs / Outputs
Inputs:
- data/metadata.pkl
- data/all_langs_position2sizes.pkl, data/all_langs_position2num.pkl
- data/vo_vs_hi_scores.csv (for VO correlation)
Outputs:
- data/lang2MAL_full.pkl: Full MAL data per language
- data/mal_compliance_scores.csv: MAL effect scores
- data/mal_asymmetry.csv: Directional asymmetry scores
- data/mal_decay_rates.csv: Decay pattern analysis
- data/mal_trajectories.csv: Trajectory cluster assignments
- plots/mal_*.png: Visualizations
Runtime: ~2-3 minutes