A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers Article - April 2020

One-Soon Her, Marc Tang

One-Soon Her, Marc Tang, "A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers", Journal of Quantitative Linguistics, April 2020, pp. 93-113. ISSN 0929-6174

Abstract

Previous studies demonstrate that morphosyntactic plural markers and the structure of numeral systems each have strong predictive power with regard to the use of sortal classifiers in languages. We use these two factors as explanatory variables to train a random forest classifier and evaluate their predictive accuracy, with the presence or absence of sortal classifiers as the response variable. Our results show that these two factors yield excellent discrimination performance, even when sortal classifiers are taken into account as an areal feature. However, the correlation between morphosyntactic plural markers and multiplicative bases is weaker than the correlation between sortal classifiers and plural markers plus multiplicative bases. We are thus able to provide novel insights into probabilistic universals on sortal classifiers, and we suggest an innovative cross-disciplinary approach to testing the effect of implicational universals with computational methods.
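The setup described in the abstract (two binary explanatory variables, one binary response, an ensemble of trees) can be illustrated with a minimal sketch. This is not the authors' code or data: the language sample is synthetic, the rule generating the labels is an arbitrary toy assumption, and the function names (`fit_tree`, `predict`, `make_toy_data`) are hypothetical. The ensemble is also a simplified stand-in for a random forest: it uses bootstrap resampling and majority voting, but with only two predictors it omits the random feature selection that a full random forest performs at each split.

```python
import random
from collections import Counter


def fit_tree(rows, rng):
    """Fit one fully grown tree on a bootstrap sample.

    With two binary predictors, a fully grown tree reduces to a
    majority vote inside each of the four (plural, base) cells.
    """
    sample = [rng.choice(rows) for _ in rows]  # bootstrap resample
    votes = {}
    for plural, base, label in sample:
        votes.setdefault((plural, base), Counter())[label] += 1
    # Fallback label for a cell that the bootstrap sample happened to miss.
    fallback = Counter(label for _, _, label in sample).most_common(1)[0][0]
    leaves = {cell: c.most_common(1)[0][0] for cell, c in votes.items()}
    return leaves, fallback


def predict(forest, plural, base):
    """Majority vote over all trees in the ensemble."""
    ballots = Counter(
        leaves.get((plural, base), fallback) for leaves, fallback in forest
    )
    return ballots.most_common(1)[0][0]


def make_toy_data(n, rng, noise=0.1):
    """Synthetic 'languages': (has_plural_marker, has_mult_base, has_classifier).

    ASSUMPTION: the labelling rule below is invented for illustration
    and is not taken from the paper's dataset or results.
    """
    rows = []
    for _ in range(n):
        plural = rng.randint(0, 1)
        base = rng.randint(0, 1)
        label = int(base == 1 and plural == 0)  # arbitrary toy rule
        if rng.random() < noise:
            label = 1 - label  # flip some labels to mimic exceptions
        rows.append((plural, base, label))
    return rows


rng = random.Random(0)
train = make_toy_data(400, rng)
test = make_toy_data(200, rng)
forest = [fit_tree(train, rng) for _ in range(25)]
accuracy = sum(predict(forest, p, b) == y for p, b, y in test) / len(test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

Evaluating on a held-out sample, as done here, mirrors the idea of measuring predictive accuracy rather than fit; a real replication would use the typological data and a standard random forest implementation.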

See the full record on HAL
