ResearchPad - syntax https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks <![CDATA[Deep neural model with self-training for scientific keyphrase extraction]]> https://www.researchpad.co/article/elastic_article_14707 Scientific information extraction is a crucial step for understanding scientific publications. In this paper, we focus on scientific keyphrase extraction, which aims to identify keyphrases from scientific articles and classify them into predefined categories. We present a neural network based approach for this task, which employs a bidirectional long short-term memory (LSTM) network to represent the sentences in the article. On top of the bidirectional LSTM layer in our neural model, a conditional random field (CRF) is used to predict the label sequence for the whole sentence. Considering the expense of annotated data for supervised learning methods, we introduce a self-training method into our neural model to leverage unlabeled articles. Experimental results on the ScienceIE corpus and ACL keyphrase corpus show that our neural model achieves promising performance without any hand-designed features or external knowledge resources. Furthermore, it efficiently incorporates the unlabeled data and achieves competitive performance compared with previous state-of-the-art systems.
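To make the idea of a predicted label sequence concrete, here is a minimal sketch of how per-token labels might be turned into typed keyphrases. The abstract does not state the tagging scheme, so the BIO scheme, the category names `Process` and `Material`, and the helper name `bio_to_spans` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (phrase, category) pairs from a BIO-tagged sentence.

    Assumed scheme: "B-X" opens a phrase of category X, "I-X" continues it,
    and "O" marks tokens outside any keyphrase.
    """
    spans = []
    current_tokens: List[str] = []
    current_type = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new phrase starts; close any phrase that is still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes the open phrase.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical sentence and labels, as a sequence labeller might emit them.
tokens = ["We", "train", "a", "conditional", "random", "field",
          "on", "annotated", "articles"]
tags = ["O", "O", "O", "B-Process", "I-Process", "I-Process",
        "O", "B-Material", "I-Material"]
print(bio_to_spans(tokens, tags))
```

In a self-training setup, spans decoded this way from unlabeled articles could be added to the training data when the model is confident, although the confidence criterion is not given in the abstract.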

]]>
<![CDATA[Disentangling sequential from hierarchical learning in Artificial Grammar Learning: Evidence from a modified Simon Task]]> https://www.researchpad.co/article/elastic_article_14558 In this paper we probe the interaction between sequential and hierarchical learning by investigating implicit learning in a group of school-aged children. We administered a serial reaction time task, in the form of a modified Simon Task in which the stimuli were organised following the rules of two distinct artificial grammars, specifically Lindenmayer systems: the Fibonacci grammar (Fib) and the Skip grammar (a modification of the former). The choice of grammars is determined by the goal of this study, which is to investigate how sensitivity to structure emerges in the course of exposure to an input whose surface transitional properties (by hypothesis) bootstrap structure. The studies conducted to date have been mainly designed to investigate low-level superficial regularities, learnable in purely statistical terms, whereas hierarchical learning has not been effectively investigated yet. The possibility to directly pinpoint the interplay between sequential and hierarchical learning is instead at the core of our study: we presented children with two grammars, Fib and Skip, which share the same transitional regularities, thus providing identical opportunities for sequential learning, while crucially differing in their hierarchical structure. More particularly, there are specific points in the sequence (k-points), which, despite giving rise to the same transitional regularities in the two grammars, support hierarchical reconstruction in Fib but not in Skip. In our protocol, children were simply asked to perform a traditional Simon Task, and they were completely unaware of the real purposes of the task. 
Results indicate that sequential learning occurred in both grammars, as shown by the decrease in reaction times throughout the task, while differences were found in the sensitivity to k-points: these, we contend, play a role in hierarchical reconstruction in Fib, whereas they are devoid of structural significance in Skip. More particularly, we found that children were faster in correspondence to k-points in sequences produced by Fib, thus providing an entirely new kind of evidence for the hypothesis that implicit learning involves an early activation of strategies of hierarchical reconstruction, based on a straightforward interplay with the statistically-based computation of transitional regularities on the sequences of symbols.
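As a rough illustration of how a Lindenmayer system produces such sequences, the sketch below applies parallel rewriting rules to every symbol at each step. The rule set shown is the textbook Fibonacci L-system (A becomes AB, B becomes A); the symbol names and the function `generate` are assumptions, and the Skip grammar is not reproduced because its rules are not given in the abstract.

```python
# Textbook Fibonacci L-system; the symbols "A" and "B" are arbitrary labels.
FIB_RULES = {"A": "AB", "B": "A"}


def rewrite(word: str, rules: dict) -> str:
    """Apply the rules to every symbol in parallel (one L-system step)."""
    return "".join(rules[symbol] for symbol in word)


def generate(axiom: str, rules: dict, steps: int) -> str:
    """Return the word obtained after `steps` rewriting steps."""
    word = axiom
    for _ in range(steps):
        word = rewrite(word, rules)
    return word


for n in range(5):
    print(n, generate("A", FIB_RULES, n))
```

Word lengths follow the Fibonacci numbers, and some transitions never occur on the surface (for example, B is never followed by B), which is the kind of transitional regularity a purely sequential learner could pick up.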

]]>
<![CDATA[The Language of Innovation]]> https://www.researchpad.co/article/elastic_article_10245 Predicting innovation is a peculiar problem in data science. Following its definition, an innovation is always a never-seen-before event, leaving no room for traditional supervised learning approaches. Here we propose a strategy to address the problem in the context of innovative patents, by defining innovations as never-seen-before associations of technologies and exploiting self-supervised learning techniques. We think of technological codes present in patents as a vocabulary and the whole technological corpus as written in a specific, evolving language. We leverage such structure with techniques borrowed from Natural Language Processing by embedding technologies in a high-dimensional Euclidean space where relative positions are representative of learned semantics. Proximity in this space is an effective predictor of specific innovation events that outperforms a wide range of standard link-prediction metrics. The success of patented innovations follows complex dynamics characterized by different patterns, which we analyze in detail with specific examples. The methods proposed in this paper provide a completely new way of understanding and forecasting innovation, by tackling it from a revealing perspective and opening interesting scenarios for a number of applications and further analytic approaches.
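The idea of using proximity between embedded technology codes as a predictor can be sketched as follows. The three-dimensional vectors are invented toy values, the codes are used only as labels, and cosine similarity is one possible proximity measure; the abstract does not say which measure or embedding dimension the authors use.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_pairs(embeddings):
    """Rank all pairs of codes from most to least similar."""
    pairs = combinations(sorted(embeddings), 2)
    scored = [((u, v), cosine(embeddings[u], embeddings[v])) for u, v in pairs]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy embeddings; real ones would be learned from patent data.
EMBEDDINGS = {
    "G06F": [0.9, 0.1, 0.0],
    "H04L": [0.8, 0.3, 0.1],
    "A61K": [0.0, 0.2, 0.9],
}

for pair, score in rank_pairs(EMBEDDINGS):
    print(pair, round(score, 3))
```

Under this reading, pairs of codes that have never co-occurred but sit close together in the space would be the candidates for future associations.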

]]>
<![CDATA[Electrophysiological correlates of concept type shifts]]> https://www.researchpad.co/article/5c8823b9d5eed0c484638ef4

A recent semantic theory of nominal concepts by Löbner [1] posits that, due to their inherent uniqueness and relationality properties, noun concepts can be classified into four concept types (CTs): sortal, individual, relational, functional. For sortal nouns the default determination is indefinite (a stone), for individual nouns it is definite (the sun), for relational and functional nouns it is possessive (his ear, his father). Incongruent determination leads to a concept type shift: his father (functional concept: unique, relational) becomes a father (sortal concept: non-unique, non-relational). Behavioral studies on CT shifts have demonstrated a CT congruence effect, with congruent determiners triggering faster lexical decision times on the subsequent noun than incongruent ones [2, 3]. The present ERP study investigated electrophysiological correlates of congruent and incongruent determination in German noun phrases, and specifically, whether the CT congruence effect could be indexed by such classic ERP components as N400, LAN or P600. If incongruent determination affects the lexical retrieval or semantic integration of the noun, it should be reflected in the amplitude of the N400 component. If, however, CT congruence is processed by the same neuronal mechanisms that underlie morphosyntactic processing, incongruent determination should trigger LAN and/or P600. These predictions were tested in two ERP studies. In Experiment 1, participants just listened to noun phrases. In Experiment 2, they performed a well-formedness judgment task. The processing of (in)congruent CTs (his sun vs. the sun) was compared to the processing of morphosyntactic and semantic violations in control conditions. Whereas the control conditions elicited classic electrophysiological violation responses (N400, LAN, and P600), CT-incongruences did not. Instead, they showed novel concept-type specific response patterns. 
The absence of the classic ERP components suggests that CT-incongruent determination is not perceived as a violation of the semantic or morphosyntactic structure of the noun phrase.

]]>
<![CDATA[Physical co-presence intensity: Measuring dynamic face-to-face interaction potential in public space using social media check-in records]]> https://www.researchpad.co/article/5c6b26b0d5eed0c484289ea4

Urban public spaces facilitate social interactions between people, reflecting the shifting functionality of spaces. There is no commonly-held consensus on the quantification methods for the dynamic interplay between spatial geometry, urban movement, and face-to-face encounters. Using anonymized social media check-in records from Shanghai, China, this study proposes pipelines for quantifying physical face-to-face encounter potential patterns through public space networks between local and non-local residents sensed by social media over time from space to space, in which social difference, cognitive cost, and time remoteness are integrated as the physical co-presence intensity index. This illustrates the spatiotemporally different ways in which the built environment binds various groups of space users configurationally via urban streets. The variation in face-to-face interaction patterns captures the fine-resolution patterns of urban flows and a new definition of street hierarchy, illustrating how urban public space systems deliver physical meeting opportunities and shape the spatial rhythms of human behavior from the public to the private. The shifting encounter potentials through streets are recognized as reflections of urban centrality structures with social interactions that are spatiotemporally varying, projected in the configurations of urban forms and functions. The results indicate that the occurrence probability of face-to-face encounters is more geometrically scaled than predicted based on the co-location probability of two people using metric distance alone. By adding temporal and social dimensions to urban morphology studies, and the field of space syntax research in particular, we suggest a new approach of analyzing the temporal urban centrality structures of the physical interaction potentials based on trajectory data, which is sensitive to the transformation of the spatial grid. 
It sheds light on how to adopt urban design as a social instrument to facilitate the dynamically changing social interaction potential in the new data environment, thereby enhancing spatial functionality and social well-being.

]]>
<![CDATA[Switching between reading tasks leads to phase-transitions in reading times in L1 and L2 readers]]> https://www.researchpad.co/article/5c63394bd5eed0c484ae6445

Reading research uses different tasks to investigate different levels of the reading process, such as word recognition, syntactic parsing, or semantic integration. It seems to be tacitly assumed that the underlying cognitive processes that constitute reading are stable across those tasks. However, nothing is known about what happens when readers switch from one reading task to another. The stability assumptions of the reading process suggest that the cognitive system resolves this switching between two tasks quickly. Here, we present an alternative language-game hypothesis (LGH) of reading that begins by treating reading as a softly-assembled process and that assumes, instead of stability, context-sensitive flexibility of the reading process. LGH predicts that switching between two reading tasks leads to longer-lasting, phase-transition-like patterns in the reading process. Using the nonlinear-dynamical tool of recurrence quantification analysis, we test these predictions by examining series of individual word reading times in self-paced reading tasks where native (L1) and second language readers (L2) transition between random-word and ordered-text reading tasks. We find consistent evidence for phase-transitions in the reading times when readers switch from ordered-text to random-word reading, but we find mixed evidence when readers transition from random-word to ordered-text reading. In the latter case, L2 readers show moderately stronger signs of phase-transitions compared to L1 readers, suggesting that familiarity with a language influences whether and how such transitions occur. The results provide evidence for LGH and suggest that the cognitive processes underlying reading are not fully stable across tasks but exhibit soft-assembly in the interaction between task and reader characteristics.
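One of the simplest quantities in recurrence quantification analysis is the recurrence rate: the share of pairs of time points whose values lie within a chosen radius of each other. The sketch below computes it directly on a one-dimensional series. It is a simplification under stated assumptions: full analyses usually apply time-delay embedding first and report further measures, and the radius, the function name, and the reading times are hypothetical.

```python
def recurrence_rate(series, radius):
    """Fraction of ordered pairs (i, j), i != j, with |x_i - x_j| <= radius.

    The main diagonal (i == j) is excluded because every point trivially
    recurs with itself.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    recurrent = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and abs(series[i] - series[j]) <= radius
    )
    return recurrent / (n * (n - 1))


# Hypothetical word reading times in milliseconds.
reading_times = [310, 295, 420, 305, 300, 510, 298]
print(recurrence_rate(reading_times, radius=20))
```

Computing such a measure in a sliding window across the point where the task changes is one way a shift in the dynamics of reading times could be made visible.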

]]>
<![CDATA[Effects of playing position, pitch location, opposition ability and team ability on the technical performance of elite soccer players in different score line states]]> https://www.researchpad.co/article/5c633972d5eed0c484ae67a3

The purpose of this study was to investigate the effects of playing position, pitch location, team ability and opposition ability on technical performance variables (pass, cross, corner, free kick accuracy) of English Premier League soccer players in different score line states. A validated automatic tracking system (Venatrack) was used to code player actions in real time for passing accuracy, cross accuracy, corner accuracy and free kick accuracy. In total 376 of the 380 games played during the 2011–12 English Premier League season were recorded, resulting in activity profiles of 570 players and over 35,000 rows of data. These data were analysed using multi-level modelling. Multi-level regression revealed a “u” shaped association between passing accuracy and goal difference (GD) with greater accuracy occurring at extremes of GD, i.e., when the score was either positive or negative. The same pattern was seen for corner accuracy away from home e.g., corner accuracy was lowest when the score was close with the lowest accuracy at extremes of GD. Although free kicks were not associated with GD, team ability, playing position and pitch location were found to predict accuracy. No temporal variables were found to predict cross accuracy. A number of score line effects were present across the temporal factors which should be considered by coaches and managers when preparing and selecting teams in order to maximise performance. The current study highlighted the need for more sensitive score line definitions in which to consider score line effects.

]]>
<![CDATA[Song variation of the South Eastern Indian Ocean pygmy blue whale population in the Perth Canyon, Western Australia]]> https://www.researchpad.co/article/5c50c450d5eed0c4845e850b

Sea noise collected from 2003 to 2017 in the Perth Canyon, Western Australia, was analysed for variation in the South Eastern Indian Ocean pygmy blue whale song structure. The primary song-types were: P3, a three-unit phrase (I, II and III) repeated with an inter-song interval (ISI) of 170–194 s; P2, a phrase consisting of only units II and III repeated every 84–96 s; and P1, a phrase consisting of only unit II repeated every 45–49 s. The different ISI values were approximate multiples of each other within a season. When data were compared across seasons, the ISI value for each song increased significantly through time (all fits had p << 0.001), at 0.30 s/year (95% CI 0.217–0.383), 0.8 s/year (95% CI 0.655–1.025) and 1.73 s/year (95% CI 1.264–2.196) for the P1, P2 and P3 songs respectively. The proportions of each song-type averaged 21.5, 24.2 and 56% for P1, P2 and P3 occurrence respectively, and these ratios could vary by up to ± 8% (95% CI) amongst years. On some occasions animals changed the P3 ISI to be significantly shorter (120–160 s) or longer (220–280 s). Hybrid song patterns occurred where animals combined multiple phrase types into a repeated song. In recent years whales introduced further complexity by splitting song units. This variability of song-type and proportions implies that abundance measures for this whale subpopulation based on song detection need to factor in trends in song variability to make data comparable between seasons. Further, such variability in song production by a subpopulation of pygmy blue whales raises questions as to the stability of the song types that are used to delineate populations. The high level of song variability may be driven by an increasing number of background whale callers creating ‘noise’ and so forcing animals to alter song in order to ‘stand out’ amongst the crowd.
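To give a sense of scale for the reported trends, the following sketch multiplies each slope by the length of the recording period. It assumes the linear fits hold across the whole 2003 to 2017 span (14 years); the function name is hypothetical and no intercepts are used because the abstract does not report them.

```python
# Reported ISI slopes in seconds per year, taken from the abstract.
ISI_SLOPES = {"P1": 0.30, "P2": 0.8, "P3": 1.73}


def cumulative_drift(slope_s_per_year, start_year=2003, end_year=2017):
    """Total ISI change implied by a linear trend over the given years."""
    return slope_s_per_year * (end_year - start_year)


for song_type, slope in ISI_SLOPES.items():
    print(song_type, round(cumulative_drift(slope), 2), "s")
```

Under that assumption the P3 interval would have lengthened by roughly 24 s over the study period, which is the same size as the whole within-season P3 range of 170 to 194 s. A detector tuned to a fixed interval could therefore drift out of step with the song over the years.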

]]>
<![CDATA[Evaluating probabilistic programming languages for simulating quantum correlations]]> https://www.researchpad.co/article/5c390bbad5eed0c48491e06f

This article explores how probabilistic programming can be used to simulate quantum correlations in an EPR experimental setting. Probabilistic programs are based on standard probability, which cannot produce quantum correlations. In order to address this limitation, a hypergraph formalism was programmed which expresses both the measurement contexts of the EPR experimental design and the associated constraints. Four contemporary open source probabilistic programming frameworks were used to simulate an EPR experiment in order to shed light on their relative effectiveness from both qualitative and quantitative dimensions. We found that all four probabilistic languages successfully simulated quantum correlations. Detailed analysis revealed that no language was clearly superior across all dimensions; however, the comparison does highlight aspects that can be considered when using probabilistic programs to simulate experiments in quantum physics.

]]>
<![CDATA[Beyond opinion classification: Extracting facts, opinions and experiences from health forums]]> https://www.researchpad.co/article/5c3fa56ad5eed0c484ca4115

Introduction

Surveys indicate that patients, particularly those suffering from chronic conditions, strongly benefit from the information found in social networks and online forums. One challenge in accessing online health information is to differentiate between factual and more subjective information. In this work, we evaluate the feasibility of exploiting lexical, syntactic, semantic, network-based and emotional properties of texts to automatically classify patient-generated contents into three types: “experiences”, “facts” and “opinions”, using machine learning algorithms. In this context, our goal is to develop automatic methods that will make online health information more easily accessible and useful for patients, professionals and researchers.

Material and methods

We work with a set of 3000 posts to online health forums on breast cancer, Crohn's disease and different allergies. Each sentence in a post is manually labeled as “experience”, “fact” or “opinion”. Using these data, we train a support vector machine algorithm to perform classification. The results are evaluated in a 10-fold cross validation procedure.
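The 10-fold cross validation procedure can be sketched with a plain index splitter: the data are shuffled once, cut into ten folds, and each fold is held out in turn while the other nine are used for training. The function name, the seed, and the choice of the post as the unit being split are assumptions; the abstract does not say whether folds were formed over posts or sentences, and the classifier itself is omitted.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds)
                 if f != held_out for idx in fold]
        yield train, test


# 3000 items, matching the number of posts mentioned in the abstract.
for fold_number, (train, test) in enumerate(k_fold_indices(3000), start=1):
    print(fold_number, len(train), len(test))
```

Each item appears in exactly one held-out fold, so the accuracy averaged over the ten folds uses every labeled example for testing once.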

Results

Overall, we find that it is possible to predict the type of information contained in a forum post with a very high accuracy (over 80 percent) using simple text representations such as word embeddings and bags of words. We also analyze more complex features such as those based on the network properties, the polarity of words and the verbal tense of the sentences and show that, when combined with the previous ones, they can boost the results.

]]>
<![CDATA[A conceptual space for EEG-based brain-computer interfaces]]> https://www.researchpad.co/article/5c37b798d5eed0c4844905bd

Brain-Computer Interfaces (BCIs) have become increasingly popular in recent years. Researchers use this technology for several types of applications, including attention and workload measures but also the direct control of objects by means of BCIs. In this work we present a first multidimensional feature space for EEG-based BCI applications to help practitioners characterize, compare and design systems that use EEG-based BCIs. Our feature space contains 4 axes and 9 sub-axes and consists of 41 options in total as well as their different combinations. We present the axes of our feature space, position it with respect to the existing BCI and HCI taxonomies, and show how our work integrates and/or complements past work.

]]>
<![CDATA[Call combinations in birds and the evolution of compositional syntax]]> https://www.researchpad.co/article/5b8acde940307c144d0de057

Syntax is the set of rules for combining words into phrases, providing the basis for the generative power of linguistic expressions. In human language, the principle of compositionality governs how words are combined into a larger unit, the meaning of which depends on both the meanings of the words and the way in which they are combined. This linguistic capability, i.e., compositional syntax, has long been considered a trait unique to human language. Here, we review recent studies on call combinations in a passerine bird, the Japanese tit (Parus minor), that provide the first firm evidence for compositional syntax in a nonhuman animal. While it has been suggested that the findings of these studies fail to provide evidence for compositionality in Japanese tits, this criticism is based on misunderstanding of experimental design, misrepresentation of the importance of word order in human syntax, and necessitating linguistic capabilities beyond those given by the standard definition of compositionality. We argue that research on avian call combinations has provided the first steps in elucidating how compositional expressions could have emerged in animal communication systems.

]]>
<![CDATA[Compositionality in animals and humans]]> https://www.researchpad.co/article/5b8acde740307c144d0de056

A key step in understanding the evolution of human language involves unravelling the origins of language’s syntactic structure. One approach seeks to reduce the core of syntax in humans to a single principle of recursive combination, merge, for which there is no evidence in other species. We argue for an alternative approach. We review evidence that beneath the staggering complexity of human syntax, there is an extensive layer of nonproductive, nonhierarchical syntax that can be fruitfully compared to animal call combinations. This is the essential groundwork that must be explored and integrated before we can elucidate, with sufficient precision, what exactly made it possible for human language to explode its syntactic capacity, transitioning from simple nonproductive combinations to the unrivalled complexity that we now have.

]]>
<![CDATA[Deficits in nominal reference identify thought disordered speech in a narrative production task]]> https://www.researchpad.co/article/5b8687d840307c73f6bbfec2

Formal thought disorder (TD) is a neuropathology manifest in formal language dysfunction, but few behavioural linguistic studies exist. These have highlighted problems in the domain of semantics and more specifically of reference. Here we aimed for a more complete and systematic linguistic model of TD, focused on (i) a more in-depth analysis of anomalies of reference as depending on the grammatical construction type in which they occur, and (ii) measures of formal grammatical complexity and errors. Narrative speech obtained from 40 patients with schizophrenia, 20 with TD and 20 without, and from 14 healthy controls matched on pre-morbid IQ, was rated blindly. Results showed that of 10 linguistic variables annotated, 4 showed significant differences between groups, including the two patient groups. These all concerned mis-uses of noun phrases (NPs) for purposes of reference, but showed sensitivity to how NPs were classed: definite and pronominal forms of reference were more affected than indefinite and non-pronominal (lexical) NPs. None of the measures of formal grammatical complexity and errors distinguished groups. We conclude that TD exhibits a specific and differentiated linguistic profile, which can illuminate TD neuro-cognitively and inform future neuroimaging studies, and can have clinical utility as a linguistic biomarker.

]]>
<![CDATA[Background Speech Effects on Sentence Processing during Reading: An Eye Movement Study]]> https://www.researchpad.co/article/5989daa1ab0ee8fa60ba5b06

Effects of background speech on reading were examined by playing aloud different types of background speech while participants read long, syntactically complex and less complex sentences embedded in text. Readers’ eye movement patterns were used to study online sentence comprehension. Effects of background speech were primarily seen in rereading time. In Experiment 1, foreign-language background speech did not disrupt sentence processing. Experiment 2 demonstrated robust disruption in reading as a result of semantically and syntactically anomalous scrambled background speech preserving normal sentence-like intonation. Scrambled speech that was constructed from the to-be-read text did not disrupt reading more than scrambled speech constructed from a different, semantically unrelated text. Experiment 3 showed that scrambled speech exacerbated the syntactic complexity effect more than coherent background speech, which also interfered with reading. Experiment 4 demonstrated that both semantically and syntactically anomalous speech produced no more disruption in reading than semantically anomalous but syntactically correct background speech. The pattern of results is best explained by a semantic account that stresses the importance of similarity in semantic processing, but not similarity in semantic content, between the reading task and background speech.

]]>
<![CDATA[Does Syntactic Alignment Effectively Influence How Speakers Are Perceived by Their Conversation Partner?]]> https://www.researchpad.co/article/5989da27ab0ee8fa60b813a5

The way we talk can influence how we are perceived by others. Whereas previous studies have started to explore the influence of social goals on syntactic alignment, in the current study, we additionally investigated whether syntactic alignment effectively influences conversation partners’ perception of the speaker. To this end, we developed a novel paradigm in which we can measure the effect of social goals on the strength of syntactic alignment for one participant (primed participant), while simultaneously obtaining usable social opinions about them from their conversation partner (the evaluator). In Study 1, participants’ desire to be rated favorably by their partner was manipulated by assigning pairs to a Control (i.e., primed participants did not know they were being evaluated) or Evaluation context (i.e., primed participants knew they were being evaluated). Surprisingly, results showed no significant difference in the strength with which primed participants aligned their syntactic choices with their partners’ choices. In a follow-up study, we used a Directed Evaluation context (i.e., primed participants knew they were being evaluated and were explicitly instructed to make a positive impression). However, again, there was no evidence supporting the hypothesis that participants’ desire to impress their partner influences syntactic alignment. With respect to the influence of syntactic alignment on perceived likeability by the evaluator, a negative relationship was reported in Study 1: the more primed participants aligned their syntactic choices with their partner, the more that partner decreased their likeability rating after the experiment. However, this effect was not replicated in the Directed Evaluation context of Study 2. 
In other words, our results do not support the conclusion that speakers’ desire to be liked affects how much they align their syntactic choices with their partner, nor is there convincing evidence that there is a reliable relationship between syntactic alignment and perceived likeability.

]]>
<![CDATA[Evidence for simultaneous syntactic processing of multiple words during reading]]> https://www.researchpad.co/article/5989db51ab0ee8fa60bdc304

A hotly debated issue in reading research concerns the extent to which readers process parafoveal words, and how parafoveal information might influence foveal word recognition. We investigated syntactic word processing both in sentence reading and in reading isolated foveal words when these were flanked by parafoveal words. In Experiment 1 we found a syntactic parafoveal preview benefit in sentence reading, meaning that fixation durations on target words were decreased when there was a syntactically congruent preview word at the target location (n) during the fixation on the pre-target (n-1). In Experiment 2 we used a flanker paradigm in which participants had to classify foveal target words as either noun or verb, when those targets were flanked by syntactically congruent or incongruent words (stimulus on-time 170 ms). Lower response times and error rates in the congruent condition suggested that higher-order (syntactic) information can be integrated across foveal and parafoveal words. Although higher-order parafoveal-on-foveal effects have been elusive in sentence reading, results from our flanker paradigm show that the reading system can extract higher-order information from multiple words in a single glance. We propose a model of reading to account for the present findings.

]]>
<![CDATA[Does Speaking Two Dialects in Daily Life Affect Executive Functions? An Event-Related Potential Study]]> https://www.researchpad.co/article/5989db25ab0ee8fa60bd028b

Whether using two languages enhances executive functions is a matter of debate. Here, we take a novel perspective to examine the bilingual advantage hypothesis by comparing bi-dialect with mono-dialect speakers’ performance on a non-linguistic task that requires executive control. Two groups of native Chinese speakers, one speaking only the standard Chinese Mandarin and the other also speaking the Southern-Min dialect, which differs from the standard Chinese Mandarin primarily in phonology, performed a classic Flanker task. Behavioural results showed no difference between the two groups, but event-related potentials recorded simultaneously revealed a number of differences, including an earlier P2 effect in the bi-dialect as compared to the mono-dialect group, suggesting that the two groups engage different underlying neural processes. Despite differences in the early ERP component, no between-group differences in the magnitude of the Flanker effects, which is an index of conflict resolution, were observed in the N2 component. Therefore, these findings suggest that speaking two dialects of one language does not enhance executive functions. Implications of the current findings for the bilingual advantage hypothesis are discussed.

]]>
<![CDATA[PepeSearch: Semantic Data for the Masses]]> https://www.researchpad.co/article/5989dac3ab0ee8fa60bb1880

With the emergence of the Web of Data, there is a need for tools for searching and exploring the growing amount of semantic data. Unfortunately, such tools are scarce and typically require knowledge of SPARQL/RDF. We propose here PepeSearch, a portable tool for searching semantic datasets devised for mainstream users. PepeSearch offers a multi-class search form automatically constructed from a SPARQL endpoint. We have tested PepeSearch with 15 participants searching a Linked Open Data version of the Norwegian Register of Business Enterprises for non-trivial challenges. Retrieval performance was encouragingly high and usability ratings were also very positive, thus suggesting that PepeSearch is effective for searching semantic datasets by mainstream users. We also assessed its portability by configuring PepeSearch to query other SPARQL endpoints.

]]>
<![CDATA[Neural Correlates of Contrast and Humor: Processing Common Features of Verbal Irony]]> https://www.researchpad.co/article/5989da7eab0ee8fa60b997db

Irony is a kind of figurative language used by a speaker to say something that contrasts with the context and, to some extent, lends humor to a situation. However, little is known about the brain regions that specifically support the processing of these two common features of irony. The present study had two main aims: (i) to investigate the neural basis of irony processing, by delivering short ironic spoken sentences (and their literal counterparts) to participants undergoing fMRI; and (ii) to assess the neural effect of two irony parameters, obtained from normative studies: degree of contrast and humor appreciation. Results revealed activation of the bilateral inferior frontal gyrus (IFG), posterior part of the left superior temporal gyrus, medial frontal cortex, and left caudate during irony processing, suggesting the involvement of both semantic and theory-of-mind networks. Parametric models showed that contrast was specifically associated with the activation of bilateral frontal and subcortical areas, and that these regions were also sensitive to humor, as shown by a conjunction analysis. Activation of the bilateral IFG is consistent with the literature on humor processing, and reflects incongruity detection/resolution processes. Moreover, the activation of subcortical structures can be related to the reward processing of social events.

]]>