"Modelling language and text data: Problems, results and challenges"

07.06.2019 12:30 - 13:30

Emmerich Kelih, Professor of Slavonic Studies

The lecture will present and discuss some core problems of the quantitative analysis of language and text data. Statistical methods are becoming increasingly popular in many areas of linguistics and text analysis. However, in modelling linguistic data some specificities have to be taken into consideration.

First, when analysing and modelling language and text data, heterogeneity, the frequent absence of normally distributed data, and the problem of representativity demand special reflection and attention.
In the second part of the talk, a brief overview of functional (Menzerath’s law), distributional (Zipf’s law) and developmental (Piotrowski law) laws in linguistics will be given, where mostly power models and/or probability distributions come into play.
In the final part of the talk, selected case studies from quantitative linguistics (among others, word length studies) will be presented, which should give some representative insights into the power of self-regulation in linguistics and related fields.

Data Science @ Uni Vienna

SR Geschichte 1

Main Building, 1st floor, Universitätsring 1, 1010 Wien