Using ancestral state reconstruction methods for onomasiological reconstruction in multilingual word lists
Open Access

Journal: Language Dynamics and Change

Current efforts in computational historical linguistics are predominantly concerned with phylogenetic inference. Methods for ancestral state reconstruction have only been applied sporadically. In contrast to phylogenetic algorithms, automatic reconstruction methods presuppose phylogenetic information in order to explain what has evolved when and where. Here we report a pilot study exploring how well automatic methods for ancestral state reconstruction perform in the task of onomasiological reconstruction in multilingual word lists, where algorithms are used to infer how the words evolved along a given phylogeny, and reconstruct which cognate classes were used to express a given meaning in the ancestral languages. Comparing three different methods, Maximum Parsimony, Minimal Lateral Networks, and Maximum Likelihood on three different test sets (Indo-European, Austronesian, Chinese) using binary and multi-state coding of the data as well as single and sampled phylogenies, we find that Maximum Likelihood largely outperforms the other methods. At the same time, however, the general performance was disappointingly low, ranging between 0.66 (Chinese) and 0.79 (Austronesian) for the F-Scores. A closer linguistic evaluation of the reconstructions proposed by the best method and the reconstructions given in the gold standards revealed that the majority of the cases where the algorithms failed can be attributed to problems of independent semantic shift (homoplasy), to morphological processes in lexical change, and to wrong reconstructions in the independently created test sets that we employed.
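To make the task concrete, the following is a minimal sketch of parsimony-based ancestral state reconstruction with multi-state coding, followed by a set-based F-score. It uses the textbook bottom-up pass of Fitch's unweighted parsimony, which is not necessarily the parsimony variant evaluated in the study. The tree, the language names (A to D), the cognate class labels (c1, c2), and the function names `fitch_root_states` and `f_score` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name (str) or a pair of subtrees (binary tree).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_root_states(
    tree: Tree, leaf_states: Dict[str, Set[str]]
) -> Tuple[Set[str], int]:
    """Bottom-up pass of Fitch's unweighted parsimony.

    Returns the candidate state set at the root of `tree` and the
    minimal number of state changes needed to explain the leaves.
    """
    if isinstance(tree, str):
        # Leaf: the observed cognate class(es) of that language.
        return set(leaf_states[tree]), 0
    left, right = tree
    left_set, left_changes = fitch_root_states(left, leaf_states)
    right_set, right_changes = fitch_root_states(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change.
        return shared, left_changes + right_changes
    # The children disagree: keep the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


def f_score(predicted: Set[str], gold: Set[str]) -> float:
    """Harmonic mean of precision and recall over sets of cognate classes."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: four languages, one meaning, two cognate classes.
toy_tree: Tree = (("A", "B"), ("C", "D"))
toy_cognates = {"A": {"c1"}, "B": {"c1"}, "C": {"c1"}, "D": {"c2"}}

root_states, changes = fitch_root_states(toy_tree, toy_cognates)
score = f_score(root_states, {"c1"})  # compare against an assumed gold standard
```

In this toy case the reconstruction at the root is the single class c1, with one change on the branch leading to language D. The sketch ignores branch lengths, sampled phylogenies, and lateral transfer, all of which the likelihood-based and network-based methods named in the abstract are designed to take into account.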

Affiliations: 1: Eberhard-Karls Universität Tübingen gerhard.jaeger@uni-tuebingen.de ; 2: Max-Planck-Institute for the Science of Human History, Jena list@shh.mpg.de

DOI: 10.1163/22105832-00801002

2018-01-01
2018-09-25
