ResearchPad - evolutionary-linguistics https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks
<![CDATA[Modeling competitive evolution of multiple languages]]> https://www.researchpad.co/article/elastic_article_7854

Increasing evidence demonstrates that in many places language coexistence has become ubiquitous and essential for supporting language and cultural diversity, and is associated with financial and economic benefits. The competitive evolution among multiple languages determines the outcome: coexistence, decline, or extinction. Here, we extend the Abrams-Strogatz model of language competition to multiple languages and then validate it by analyzing the behavioral transitions of language usage over recent decades in Singapore and Hong Kong. In each case, we estimate from data the model parameters that measure each language's utility for its speakers and the strength of two biases: the majority's preference for their language and the minority's aversion to it. The values of these two biases decide which language grows fastest in the competition and what the stable state of the system would be. We also study the system's convergence time to stable states and discover the existence of tipping points with multiple attractors. Moreover, a critical slowdown of convergence to the stable fractions of language users appears near the tipping points and peaks at them, signaling when the system approaches them. Our analysis furthers our understanding of the evolution of various languages and the role of tipping points in behavioral transitions. These insights may help to protect languages from extinction and retain linguistic and cultural diversity.
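As a rough illustration of the kind of dynamics this abstract describes, the sketch below integrates the original two-language Abrams-Strogatz equation, dx/dt = (1 - x) s x^a - x (1 - s) (1 - x)^a, with a simple Euler step. It is not the multi-language extension from the article. The function names, the step size, and the parameter values (s = 0.5, a = 1.31, rate constant set to 1) are assumptions chosen only to show two attractors separated by a tipping point.

```python
def abrams_strogatz_rate(x, s, a):
    """Rate of change of the fraction x speaking language X.

    s is the relative status of X (0..1); a is the exponent controlling
    how strongly speakers are drawn to the larger group.
    """
    gain = (1.0 - x) * s * x ** a          # speakers of Y switching to X
    loss = x * (1.0 - s) * (1.0 - x) ** a  # speakers of X switching to Y
    return gain - loss


def simulate(x0, s=0.5, a=1.31, dt=0.01, steps=20000):
    """Euler-integrate from the initial fraction x0; return the final fraction."""
    x = x0
    for _ in range(steps):
        x += dt * abrams_strogatz_rate(x, s, a)
        x = min(1.0, max(0.0, x))  # keep the fraction inside [0, 1]
    return x


# Illustrative values only: with equal status, x = 0.5 acts as a tipping point.
above = simulate(0.6)  # starts above the tipping point
below = simulate(0.4)  # starts below the tipping point
print(f"start 0.6 -> {above:.3f}, start 0.4 -> {below:.3f}")
```

With these assumed parameters, a start slightly above one half drifts toward full adoption of X and a start slightly below drifts toward its extinction, which is the two-attractor picture in its simplest form.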

]]>
<![CDATA[The natural selection of words: Finding the features of fitness]]> https://www.researchpad.co/article/5c58d625d5eed0c484031768

We introduce a dataset for studying the evolution of words, constructed from WordNet and the Google Books Ngram Corpus. The dataset tracks the evolution of 4,000 synonym sets (synsets), containing 9,000 English words, from 1800 AD to 2000 AD. We present a supervised learning algorithm that is able to predict the future leader of a synset: the word in the synset that will have the highest frequency. The algorithm uses features based on a word’s length, the characters in the word, and the historical frequencies of the word. It can predict change of leadership (including the identity of the new leader) fifty years in the future, with an F-score considerably above random guessing. Analysis of the learned models provides insight into the causes of change in the leader of a synset. The algorithm confirms observations linguists have made, such as the trend to replace the -ise suffix with -ize, the rivalry between the -ity and -ness suffixes, and the struggle between economy (shorter words are easier to remember and to write) and clarity (longer words are more distinctive and less likely to be confused with one another). The results indicate that integration of the Google Books Ngram Corpus with WordNet has significant potential for improving our understanding of how language evolves.
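The notion of a synset "leader" and a change of leadership can be made concrete with a small sketch. The frequencies below are invented toy values, not figures from the Google Books Ngram Corpus, and the helper names `leader` and `leadership_change` are hypothetical. The sketch shows only how a leader is identified and compared across two time slices, not the supervised learning algorithm from the article.

```python
def leader(frequencies):
    """Return the word with the highest frequency in one time slice of a synset."""
    return max(frequencies, key=frequencies.get)


def leadership_change(history, start, end):
    """Compare the leaders at two years; return (old_leader, new_leader, changed)."""
    old = leader(history[start])
    new = leader(history[end])
    return old, new, old != new


# Toy relative frequencies for one synset (made-up numbers for illustration).
history = {
    1900: {"organise": 0.6, "organize": 0.4},
    1950: {"organise": 0.3, "organize": 0.7},
}

print(leadership_change(history, 1900, 1950))
```

A predictor of the kind described would take features available at the start year (word length, characters, past frequencies) and try to guess the second element of this tuple.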

]]>
<![CDATA[Emergence of linguistic conventions in multi-agent reinforcement learning]]> https://www.researchpad.co/article/5c09944fd5eed0c4842ae9e0

Recently, the emergence of signaling conventions, of which language is a prime example, has drawn considerable interdisciplinary interest, ranging from game theory to robotics to evolutionary linguistics. Such a wide spectrum of research rests on very different assumptions and methodologies, but the complexity of the problem precludes formulation of a unifying and commonly accepted explanation. We examine the formation of signaling conventions in the framework of a multi-agent reinforcement learning model. When the network of interactions between agents is a complete graph or a sufficiently dense random graph, a global consensus is typically reached, with the emerging language being a nearly unique object-word mapping or containing some synonyms and homonyms. On finite-dimensional lattices, the model gets trapped in disordered configurations with only a local consensus. Such trapping can be avoided by introducing population renewal, which in the presence of superlinear reinforcement restores ordinary surface-tension-driven coarsening and considerably enhances the formation of efficient signaling.

]]>
<![CDATA[Call combinations in birds and the evolution of compositional syntax]]> https://www.researchpad.co/article/5b8acde940307c144d0de057

Syntax is the set of rules for combining words into phrases, providing the basis for the generative power of linguistic expressions. In human language, the principle of compositionality governs how words are combined into a larger unit, the meaning of which depends on both the meanings of the words and the way in which they are combined. This linguistic capability, i.e., compositional syntax, has long been considered a trait unique to human language. Here, we review recent studies on call combinations in a passerine bird, the Japanese tit (Parus minor), that provide the first firm evidence for compositional syntax in a nonhuman animal. While it has been suggested that the findings of these studies fail to provide evidence for compositionality in Japanese tits, this criticism rests on a misunderstanding of the experimental design, a misrepresentation of the importance of word order in human syntax, and a demand for linguistic capabilities beyond those required by the standard definition of compositionality. We argue that research on avian call combinations has provided the first steps in elucidating how compositional expressions could have emerged in animal communication systems.

]]>
<![CDATA[Compositionality in animals and humans]]> https://www.researchpad.co/article/5b8acde740307c144d0de056

A key step in understanding the evolution of human language involves unravelling the origins of language’s syntactic structure. One approach seeks to reduce the core of syntax in humans to a single principle of recursive combination, merge, for which there is no evidence in other species. We argue for an alternative approach. We review evidence that beneath the staggering complexity of human syntax, there is an extensive layer of nonproductive, nonhierarchical syntax that can be fruitfully compared to animal call combinations. This is the essential groundwork that must be explored and integrated before we can elucidate, with sufficient precision, what exactly made it possible for human language to explode its syntactic capacity, transitioning from simple nonproductive combinations to the unrivalled complexity that we now have.

]]>
<![CDATA[A serial founder effect model of phonemic diversity based on phonemic loss in low-density populations]]> https://www.researchpad.co/article/5b28b428463d7e129299939c

It has been observed that the number of phonemes in languages in use today tends to decrease with increasing distance from Africa. A previous formal model has recently reproduced the observed cline, but under two strong assumptions. Here we tackle the question of whether an alternative explanation for the worldwide phonemic cline is possible, by using alternative assumptions. The answer is affirmative. We show this by formalizing a proposal, following Atkinson, that this pattern may be due to a repeated bottleneck effect and phonemic loss. In our simulations, low-density populations lose phonemes during the Out-of-Africa dispersal of modern humans. Our results reproduce the observed global cline for the number of phonemes. In addition, we also detect a cline of phonemic diversity and reproduce it using our simulation model. We suggest how future work could determine whether the previous model or the new one (or even a combination of them) is valid. Simulations also show that the clines can still be present even 300 kyr after the Out-of-Africa dispersal, which is contrary to some previous claims which were not supported by numerical simulations.

]]>
<![CDATA[Evidence of a Vocalic Proto-System in the Baboon (Papio papio) Suggests Pre-Hominin Speech Precursors]]> https://www.researchpad.co/article/5989da15ab0ee8fa60b7af82

Language is a distinguishing characteristic of our species, and the course of its evolution is one of the hardest problems in science. It has long been generally considered that human speech requires a low larynx, and that the high larynx of nonhuman primates should preclude their producing the vowel systems universally found in human language. Examining the vocalizations through acoustic analyses, tongue anatomy, and modeling of acoustic potential, we found that baboons (Papio papio) produce sounds sharing the F1/F2 formant structure of the human [ɨ æ ɑ ɔ u] vowels, and that, as in humans, those vocalic qualities are organized as a system on two acoustic-anatomic axes. This confirms that hominoids can produce contrasting vowel qualities despite a high larynx. It suggests that spoken languages evolved from ancient articulatory skills already present in our last common ancestor with Cercopithecoidea, about 25 MYA.

]]>
<![CDATA[Long-Range Correlations in Sentence Series from A Story of the Stone]]> https://www.researchpad.co/article/5989da20ab0ee8fa60b7eb4d

A sentence is the natural unit of language. Patterns embedded in series of sentences can be used to model the formation and evolution of languages, and to solve practical problems such as evaluating linguistic ability. In this paper, we apply de-trended fluctuation analysis to detect long-range correlations embedded in sentence series from A Story of the Stone, one of the greatest masterpieces of Chinese literature. We identified a weak long-range correlation, with a Hurst exponent of 0.575±0.002 up to a scale of 10^4. We used a structural stability test to confirm the behavior of the long-range correlation, and found that different parts of the series had almost identical Hurst exponents. We found that noisy records can lead to false results and conclusions, even if the noise covers a limited proportion of the total records (e.g., less than 1%). Thus, the structural stability test is an essential procedure for confirming the existence of long-range correlations, one that has been widely neglected in previous studies. Furthermore, a combination of de-trended fluctuation analysis and diffusion entropy analysis demonstrated that the sentence series was generated by a fractional Brownian motion.
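For readers unfamiliar with de-trended fluctuation analysis, the following is a minimal first-order sketch in plain Python. It is an assumption-laden illustration, not the authors' pipeline. The input is synthetic white noise rather than the sentence series, the window sizes are arbitrary, and the helper names are hypothetical. An uncorrelated series should give an exponent near 0.5, while values above 0.5 (such as the 0.575 reported above) indicate persistent long-range correlation.

```python
import math
import random


def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(series, scales):
    """Estimate the DFA scaling exponent of a numeric series."""
    mean = sum(series) / len(series)
    # Step 1: integrate the mean-centered series into a "profile".
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)
    log_scales, log_flucts = [], []
    for s in scales:
        n_windows = len(profile) // s
        t = list(range(s))
        squared = 0.0
        # Step 2: remove a linear trend from each non-overlapping window.
        for w in range(n_windows):
            segment = profile[w * s:(w + 1) * s]
            slope, intercept = linear_fit(t, segment)
            squared += sum((y - (slope * i + intercept)) ** 2
                           for i, y in zip(t, segment))
        # Step 3: root-mean-square fluctuation at this scale.
        fluctuation = math.sqrt(squared / (n_windows * s))
        log_scales.append(math.log(s))
        log_flucts.append(math.log(fluctuation))
    # Step 4: the exponent is the slope of log F(s) against log s.
    return linear_fit(log_scales, log_flucts)[0]


random.seed(0)  # fixed seed so the run is repeatable
white_noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
alpha = dfa_exponent(white_noise, [8, 16, 32, 64, 128])
print(f"estimated exponent for white noise: {alpha:.2f}")
```

Running the same function on separate halves of a series and comparing the exponents is one simple way to approximate the kind of stability check the abstract argues for.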

]]>
<![CDATA[Sequence Memory Constraints Give Rise to Language-Like Structure through Iterated Learning]]> https://www.researchpad.co/article/5989db53ab0ee8fa60bdcc09

Human language is composed of sequences of reusable elements. The origins of the sequential structure of language are a hotly debated topic in evolutionary linguistics. In this paper, we show that sets of sequences with language-like statistical properties can emerge from a process of cultural evolution under pressure from chunk-based memory constraints. We employ a novel experimental task that is non-linguistic and non-communicative in nature, in which participants are trained on and later asked to recall a set of sequences one by one. Recalled sequences from one participant become training data for the next participant. In this way, we simulate cultural evolution in the laboratory. Our results show a cumulative increase in structure, and by comparing this structure to data from existing linguistic corpora, we demonstrate a close parallel between the sets of sequences that emerge in our experiment and those seen in natural language.

]]>
<![CDATA[Social adaptation in multi-agent model of linguistic categorization is affected by network information flow]]> https://www.researchpad.co/article/5aafc759463d7e7d7e2e876f

This paper explores how the information flow properties of a network affect the formation of categories shared between individuals who communicate through that network. Our work is based on the established multi-agent model of the emergence of linguistic categories grounded in the external environment. We study how network information propagation efficiency and the direction of information flow affect categorization by performing simulations with idealized network topologies that optimize certain network centrality measures. We measure dynamic social adaptation when either the network topology or the environment changes during the experiment and the system has to adapt to new conditions. We find that both a decentralized network topology that is efficient in information propagation and the presence of a central authority (information flow from the center to the peripheries) are beneficial for the formation of global agreement between agents. Systems with a central authority cope well with network topology change, but are less robust in the case of environment change. These findings help to clarify which network properties affect processes of social adaptation. They are important for informing the debate on the advantages and disadvantages of centralized systems.

]]>
<![CDATA[Selective Influences of Precision and Power Grips on Speech Categorization]]> https://www.researchpad.co/article/5989d9f1ab0ee8fa60b6e6b1

Recent studies have shown that articulatory gestures are systematically associated with specific manual grip actions. Here we show that executing such actions can influence performance on a speech-categorization task. Participants watched and/or listened to speech stimuli while executing either a power or a precision grip. Grip performance influenced the syllable categorization by increasing the proportion of responses of the syllable congruent with the executed grip (power grip—[ke] and precision grip—[te]). Two follow-up experiments indicated that the effect was based on action-induced bias in selecting the syllable.

]]>