Navigation is a substantial issue in robotics. Simultaneous Localization and Mapping (SLAM) underpins many autonomous navigation applications, particularly in Global Navigation Satellite System (GNSS) denied environments. Many SLAM methods have made substantial contributions to improving accuracy, cost, and efficiency. Still, achieving robust SLAM remains a considerable challenge, and there have been several attempts to find better estimation algorithms for it. In this research, we propose a novel Bayesian filtering based Airborne SLAM structure for the first time in the literature. We also present the mathematical background of the algorithm and the SLAM model of an autonomous aerial vehicle. Simulation results show that the new Airborne SLAM, which uses the exact flow of particles for recursive state estimation, is superior to earlier approaches in terms of accuracy and speed of convergence. Nevertheless, its computational complexity may raise concerns for real-time applications, particularly in high-dimensional state spaces. Even so, in Airborne SLAM it can be preferred in measurement environments that use low-uncertainty sensors, because it gives more successful results by eliminating the degeneracy problem seen in the particle filter structure.

Cystic Fibrosis (CF) is an inherited disease caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) ion channel. Mutations in CFTR cause impaired chloride ion transport in the epithelial tissues of patients, leading to cardiopulmonary decline and pancreatic insufficiency in the most severely affected patients. CFTR is composed of twelve membrane-spanning domains, two nucleotide-binding domains (NBDs), and a regulatory domain. The most common mutation in CFTR is a deletion of phenylalanine at position 508 (ΔF508) in NBD1. Previous research has primarily concentrated on the structure and dynamics of the NBD1 domain; however, numerous pathological mutations have also been found in the lesser-studied NBD2 domain. We have investigated the amino acid co-evolved network of interactions in NBD2, and the changes that occur in that network upon the introduction of CF and CF-related mutations (S1251N(T), S1235R, D1270N, N1303K(T)). Extensive coupling between the α- and β-subdomains was identified with residues in or near the Walker A, Walker B, H-loop and C-loop motifs. Alterations in the predicted residue network varied from moderate for the S1251T perturbation to more severe for N1303T. The S1235R and D1270N networks varied greatly compared to the wildtype, but these CF mutations only affect ion transport preference and do not severely disrupt CFTR function, suggesting dynamic flexibility in the network of interactions in NBD2. Our results also suggest that inappropriate interactions between the β-subdomain and Q-loop could be detrimental. We also identified mutations predicted to stabilize the NBD2 residue network upon introduction of the CF and CF-related mutations, and these predicted mutations are scored as benign by the MUTPRED2 algorithm. Our results suggest the level of disruption of the co-evolution predictions of the amino acid networks in NBD2 does not have a straightforward correlation with the severity of the CF phenotypes observed.

Sulphate attack is one of the most important factors that limit the lifetime of
pure concrete constructions. Harsh environmental conditions have a large impact
on the operational costs of concrete columns or piles dipped into soil. The
results are non-deterministic; therefore, reliability analysis is often used.
The strength characteristics of the substrate around the construction were
modelled as one-dimensional prismatic beams related with random
*p-y* curves. Sulphate deterioration is defined as a set of
random variables coupled with two-dimensional mechanical systems at acceptable
levels. Fick’s second law describes the penetration of sulphate into
pure concrete with explicit numerical solutions for boundary conditions and an
increase in the transition factor under the progress of sulphate ingress. This
process was partially solved via analytical methods for sulphate ion transport
and numerically for a random field. This solves the mechanical task and
determines the system reliability. A numerical example is provided to illustrate
the proposed method to prevent unexpected structural failures during column
service life. The proposed methodology can assist designers and can help to make
decisions on existing foundations to ensure the safety of geotechnical
construction.
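
The diffusion step mentioned above can be illustrated with the classical closed-form solution of Fick’s second law for a semi-infinite medium with constant surface concentration and constant diffusivity, C(x, t) = C_s · erfc(x / (2√(Dt))). This is only a minimal sketch under those simplifying assumptions: it does not model the increase in the transition factor or the random field described in the abstract, and the function name and numerical values are hypothetical.

```python
import math


def sulphate_profile(depth_m, time_s, diffusivity_m2_s, surface_conc):
    """Concentration at a given depth from Fick's second law.

    Assumes a semi-infinite medium, a constant surface concentration
    and a constant diffusion coefficient (illustrative simplifications).
    """
    if time_s <= 0 or diffusivity_m2_s <= 0:
        raise ValueError("time and diffusivity must be positive")
    penetration_scale = 2.0 * math.sqrt(diffusivity_m2_s * time_s)
    return surface_conc * math.erfc(depth_m / penetration_scale)


# Hypothetical inputs: D = 1e-12 m^2/s, one year of exposure, C_s = 1.0
one_year_s = 3.15e7
for depth in (0.0, 0.005, 0.01):
    print(depth, sulphate_profile(depth, one_year_s, 1e-12, 1.0))
```

In a reliability setting, the diffusivity and surface concentration would be sampled as random variables and the profile evaluated for each sample.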

Linear mixed effect models are powerful tools used to account for population structure in genome-wide association studies (GWASs) and estimate the genetic architecture of complex traits. However, fully-specified models are computationally demanding and common simplifications often lead to reduced power or biased inference. We describe `Grid-LMM` (https://github.com/deruncie/GridLMM), an extendable algorithm for repeatedly fitting complex linear models that account for multiple sources of heterogeneity, such as additive and non-additive genetic variance, spatial heterogeneity, and genotype-environment interactions. `Grid-LMM` can compute approximate (yet highly accurate) frequentist test statistics or Bayesian posterior summaries at a genome-wide scale in a fraction of the time compared to existing general-purpose methods. We apply `Grid-LMM` to two types of quantitative genetic analyses. The first is focused on accounting for spatial variability and non-additive genetic variance while scanning for QTL; and the second aims to identify gene expression traits affected by non-additive genetic variation. In both cases, modeling multiple sources of heterogeneity leads to new discoveries.

Hemorrhagic fever with renal syndrome (HFRS) is a zoonosis caused by hantavirus (a member of the Hantaviridae family). A large number of HFRS cases occur in China, especially in Heilongjiang Province, raising great concerns regarding public health. The distribution of these cases across space-time often exhibits highly heterogeneous characteristics. Hence, it is widely recognized that improved mapping of heterogeneous HFRS distributions and quantitative assessment of the space-time disease transition patterns can considerably advance the detection, prevention and control of epidemic outbreaks.

A synthesis of space-time mapping and probabilistic logic is proposed to study the distribution of monthly HFRS population-standardized incidences in Heilongjiang province during the period 2005–2013. We introduce a class-dependent Bayesian maximum entropy (cd-BME) mapping method that divides the original dataset into discrete incidence classes to overcome data heterogeneity and skewness effects, and that produces space-time HFRS incidence estimates together with their estimation accuracy. A ten-fold cross validation analysis is conducted to evaluate the performance of the proposed cd-BME implementation compared to the standard class-independent BME implementation. Incidence maps generated by cd-BME are used to study the spatiotemporal HFRS spread patterns. Further, the spatiotemporal dependence of HFRS incidences is measured in terms of probability logic indicators that link class-dependent HFRS incidences at different space-time points. These indicators convey useful complementary information regarding intraclass and interclass relationships, such as the change in HFRS transition probabilities between different incidence classes with increasing geographical distance and time separation.

Each HFRS class exhibited a distinct space-time variation structure in terms of its varying covariance parameters (shape, sill and correlation ranges). Given the heterogeneous features of the HFRS dataset, the cd-BME implementation demonstrated an improved ability to capture these features compared to the standard implementation (e.g., mean absolute error: 0.19 *vs*. 0.43 cases/10^{5} capita) demonstrating a point outbreak character at high incidence levels and a non-point spread character at low levels. Intraclass HFRS variations were found to be considerably different than interclass HFRS variations. Certain incidence classes occurred frequently near one class but were rarely found adjacent to other classes. Different classes may share common boundaries or they may be surrounded completely by another class. The HFRS class 0–68.5% was the most dominant in the Heilongjiang province (covering more than 2/3 of the total area). The probabilities that certain incidence classes occur next to other classes were used to estimate the transitions between HFRS classes. Moreover, such probabilities described the dependency pattern of the space-time arrangement of HFRS patches occupied by the incidence classes. The HFRS transition probabilities also suggested the presence of both positive and negative relations among the main classes. The HFRS indicator plots offer complementary visualizations of the varying probabilities of transition between incidence classes, and so they describe the dependency pattern of the space-time arrangement of the HFRS patches occupied by the different classes.

The cd-BME method combined with probabilistic logic indicators offers an accurate and informative quantitative representation of the heterogeneous HFRS incidences in the space-time domain, and the results thus obtained can be interpreted readily. The same methodological combination could also be used in the spatiotemporal modeling and prediction of other epidemics under similar circumstances.

Research demonstrates a negative relationship between alcohol use and affect, but the value of deprecation is unknown and thus cannot be included in estimates of the cost of alcohol to society. This paper aims to examine this relationship and develop econometric techniques to value the loss in affect attributable to alcohol consumption.

Cross-sectional (n = 129,437) and longitudinal (n = 11,352) analyses of alcohol consumers in UK Biobank data were undertaken, with depression and neuroticism as proxies of negative affect. The cross-sectional relationships between household income, negative affect and alcohol consumption were analysed using regression models, controlling for confounding variables, and using within-between random models that are robust to unobserved heterogeneity. The differential in household income required to offset alcohol’s detriment to affect was derived.

A consistent relationship between depression and alcohol consumption (β = 0.001, z = 7.64) and neuroticism and alcohol consumption (β = 0.001, z = 9.24) was observed in cross-sectional analyses, replicated in within-between models (depression β = 0.001, z = 2.32; neuroticism β = 0.001, z = 2.33). Significant associations were found between household income and depression (cross sectional β = -0.157, z = -23.86, within-between β = -0.146, z = -9.51) and household income and neuroticism (cross sectional β = -0.166, z = -32.02, within-between β = -0.158, z = -7.44). The value of reducing alcohol consumption by one gram/day was pooled and estimated to be £209.06 (95% CI £171.84 to £246.27).

Professional sporting organisations invest considerable resources collecting and analysing data in order to better understand the factors that influence performance. Recent advances in non-invasive technologies, such as global positioning systems (GPS), mean that large volumes of data are now readily available to coaches and sport scientists. However, analysing such data can be challenging, particularly when sample sizes are small and data sets contain multiple highly correlated variables, as is often the case in a sporting context. Multicollinearity in particular, if not treated appropriately, can be problematic and might lead to erroneous conclusions. In this paper we present a novel ‘leave one variable out’ (LOVO) partial least squares correlation analysis (PLSCA) methodology, designed to overcome the problem of multicollinearity, and show how this can be used to identify the training load (TL) variables that most influence ‘end fitness’ in young rugby league players.

The accumulated TL of sixteen male professional youth rugby league players (17.7 ± 0.9 years) was quantified via GPS, a micro-electrical-mechanical-system (MEMS), and players’ session-rating-of-perceived-exertion (sRPE) over a 6-week pre-season training period. Immediately prior to and following this training period, participants undertook a 30–15 intermittent fitness test (30-15_{IFT}), which was used to determine a player’s ‘starting fitness’ and ‘end fitness’. In total, twelve TL variables were collected, and these, along with ‘starting fitness’ as a covariate, were regressed against ‘end fitness’. However, considerable multicollinearity in the data (VIF >1000 for nine variables) meant that the multiple linear regression (MLR) process was unstable, and so we developed a novel LOVO PLSCA adaptation to quantify the relative importance of the predictor variables and thus minimise multicollinearity issues. As such, the LOVO PLSCA was used as a tool to inform and refine the MLR process.
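
The variance inflation factor (VIF) used above to diagnose multicollinearity can be sketched as follows. For each predictor, the VIF is 1 / (1 − R²), where R² comes from regressing that predictor on all the others. This is a generic illustration on synthetic data, not the authors’ code; the function name and the data are assumptions.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of every column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n_obs, n_vars = X.shape
    vifs = np.empty(n_vars)
    for j in range(n_vars):
        target = X[:, j]
        # Regress column j on the remaining columns plus an intercept.
        design = np.column_stack([np.ones(n_obs), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        # VIF = 1 / (1 - R^2) = SS_tot / SS_res
        vifs[j] = ss_tot / ss_res if ss_res > 0 else np.inf
    return vifs


# Synthetic example: x2 is almost a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)
x3 = rng.normal(size=200)
print(variance_inflation_factors(np.column_stack([x1, x2, x3])))
```

The two near-duplicate columns receive very large VIFs while the unrelated column stays close to 1, which is the pattern that makes ordinary regression coefficients unstable.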

The LOVO PLSCA identified the distance accumulated at very-high speed (>7 m·s^{-1}) as being the most important TL variable to influence improvement in player fitness, with this variable causing the largest decrease in singular value inertia (5.93). When included in a refined linear regression model, this variable, along with ‘starting fitness’ as a covariate, explained 73% of the variance in v30-15_{IFT} ‘end fitness’ (p<0.001) and eliminated completely any multicollinearity issues.

The LOVO PLSCA technique appears to be a useful tool for evaluating the relative importance of predictor variables in data sets that exhibit considerable multicollinearity. When used as a filtering tool, LOVO PLSCA produced an MLR model that demonstrated a significant relationship between ‘end fitness’ and the predictor variable ‘accumulated distance at very-high speed’ when ‘starting fitness’ was included as a covariate. As such, LOVO PLSCA may be a useful tool for sport scientists and coaches seeking to analyse data sets obtained using GPS and MEMS technologies.

In financial economics, a large number of models are developed based on the daily closing price. When using only the daily closing price to model the time series, we may discard valuable intra-daily information, such as maximum and minimum prices. In this study, we propose an interval time series model, including the daily maximum, minimum, and closing prices, and then apply the proposed model to forecast the entire interval. The likelihood function and the corresponding maximum likelihood estimates (MLEs) are obtained using stochastic differential equations and the Girsanov theorem. To capture the heteroscedasticity of volatility, we consider a stochastic volatility model. The efficiency of the proposed estimators is illustrated by a simulation study. Finally, based on real data for the S&P 500 index, the proposed method outperforms several alternatives in terms of forecast accuracy.

Approximation algorithms with linear complexity are required in the treatment of big data; however, present algorithms cannot output the diameter of a set of points with arbitrary accuracy and near-linear complexity. By introducing the partition technique, we obtain a very simple approximation algorithm with arbitrary accuracy *ε* and a complexity of *O*(*N* + *ε*^{−1} log *ε*^{−1}) for the case in which all points are located in a Euclidean plane. The error bounds are proved rigorously and are verified by numerical tests. This complexity is better than that of existing algorithms, and the present algorithm is also very simple to implement in applications.
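
To make the notion of an *ε*-approximate planar diameter concrete, here is a minimal sketch of a different and simpler technique: projecting the points onto a fan of evenly spaced directions. It is not the partition algorithm of the abstract and runs in *O*(*N*/√*ε*) rather than *O*(*N* + *ε*^{−1} log *ε*^{−1}); the function name is hypothetical. The extent of the projections along any direction never exceeds the true diameter *D*, and some sampled direction lies within π/(2*k*) of the diameter direction, so the estimate is at least *D* cos(π/(2*k*)) ≥ (1 − *ε*)*D* for the chosen *k*.

```python
import math


def approx_diameter(points, eps):
    """Return a value in [(1 - eps) * D, D], where D is the true diameter.

    Directional-projection sketch: k directions spaced pi / k apart.
    """
    if not points:
        raise ValueError("need at least one point")
    # cos(x) >= 1 - x^2 / 2, so x = pi / (2k) <= sqrt(2 * eps) suffices.
    k = max(1, math.ceil(math.pi / (2.0 * math.sqrt(2.0 * eps))))
    best = 0.0
    for i in range(k):
        theta = math.pi * i / k
        c, s = math.cos(theta), math.sin(theta)
        projections = [x * c + y * s for x, y in points]
        best = max(best, max(projections) - min(projections))
    return best


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(approx_diameter(square, 0.01), math.sqrt(2.0))
```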

We present an equivalence between stochastic and deterministic variable approaches to represent ranked data and find the expressions obtained to be suggestive of statistical-mechanical meanings. We first reproduce size-rank distributions *N*(*k*) from real data sets by straightforward considerations based on the assumed knowledge of the background probability distribution *P*(*N*) that generates samples of random variable values similar to real data. The choice of different functional expressions for *P*(*N*) (power law, exponential, Gaussian, etc.) leads to different classes of distributions *N*(*k*) for which we find examples in nature. Then we show that all of these types of functions can be alternatively obtained from deterministic dynamical systems. These correspond to one-dimensional nonlinear iterated maps near a tangent bifurcation whose trajectories are proved to be precise analogues of the *N*(*k*). We provide explicit expressions for the maps and their trajectories and find they operate under conditions of vanishing or small Lyapunov exponent, therefore at or near a transition to or out of chaos. We give explicit examples ranging from exponential to logarithmic behavior, including Zipf’s law. Adopting the nonlinear map as the central character of the formalism is a useful viewpoint, as variation of its few parameters, which modify its tangency property, translates into the different classes for *N*(*k*).
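
As a rough illustration of how a map at tangency can produce a Zipf-like decay, the sketch below iterates the standard quadratic normal form *x*_{t+1} = *x*_t − *a* *x*_t². The specific map, the parameter values, and the reading of the iteration index *t* as the rank *k* are assumptions made here for illustration, not expressions taken from the abstract. For this map, the continuous-time approximation d*x*/d*t* = −*a* *x*² gives *x*(*t*) = *x*_0 / (1 + *a* *x*_0 *t*), which decays as 1/(*a* *t*) at large *t*.

```python
def iterate_tangent_map(x0, a, steps):
    """Iterate x -> x - a * x**2, a map tangent to the identity at x = 0."""
    trajectory = [x0]
    for _ in range(steps):
        x = trajectory[-1]
        trajectory.append(x - a * x * x)
    return trajectory


# Hypothetical parameters: x0 = 0.5, a = 1.0
traj = iterate_tangent_map(0.5, 1.0, 10000)
for t in (10, 100, 1000, 10000):
    # t * x_t approaches 1 / a, i.e. x_t behaves like 1 / (a * t)
    print(t, traj[t], t * traj[t])
```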

Integration of genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies is needed to improve our understanding of the biological mechanisms underlying GWAS hits, and our ability to identify therapeutic targets. Gene-level association methods such as PrediXcan can prioritize candidate targets. However, limited eQTL sample sizes and absence of relevant developmental and disease context restrict our ability to detect associations. Here we propose an efficient statistical method (MultiXcan) that leverages the substantial sharing of eQTLs across tissues and contexts to improve our ability to identify potential target genes. MultiXcan integrates evidence across multiple panels using multivariate regression, which naturally takes into account the correlation structure. We apply our method to simulated and real traits from the UK Biobank and show that, in realistic settings, we can detect a larger set of significantly associated genes than using each panel separately. To improve applicability, we developed a summary result-based extension called S-MultiXcan, which we show yields highly concordant results with the individual level version when LD is well matched. Our multivariate model-based approach allowed us to use the individual level results as a gold standard to calibrate and develop a robust implementation of the summary-based extension. Results from our analysis as well as software and necessary resources to apply our method are publicly available.

Modern optical imaging experiments not only measure single-cell and single-molecule dynamics with high precision, but they can also perturb the cellular environment in myriad controlled and novel settings. Techniques such as single-molecule fluorescence in-situ hybridization, microfluidics, and optogenetics have opened the door to a large number of potential experiments, which raises the question of how to choose the best possible experiment. The Fisher information matrix (FIM) estimates how well potential experiments will constrain model parameters and can be used to design optimal experiments. Here, we introduce the finite state projection (FSP) based FIM, which uses the formalism of the chemical master equation to derive and compute the FIM. The FSP-FIM makes no assumptions about the distribution shapes of single-cell data, and it does not require precise measurements of higher order moments of such distributions. We validate the FSP-FIM against well-known Fisher information results for the simple case of constitutive gene expression. We then use numerical simulations to demonstrate the use of the FSP-FIM to optimize the timing of single-cell experiments with more complex, non-Gaussian fluctuations. We validate optimal simulated experiments determined using the FSP-FIM with Monte-Carlo approaches and contrast these to experiment designs chosen by traditional analyses that assume Gaussian fluctuations or use the central limit theorem. By systematically designing experiments to use all of the measurable fluctuations, our method enables a key step to improve co-design of experiments and quantitative models.
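
The constitutive-expression check mentioned above can be sketched in a few lines. At steady state, constitutive expression gives a Poisson copy-number distribution with mean *λ*, whose Fisher information per observed cell is 1/*λ*. The sketch evaluates the generic discrete-distribution formula *I*(*λ*) = Σ_x (∂*p*(*x*)/∂*λ*)² / *p*(*x*) on a truncated state space and compares it with 1/*λ*. It does not solve the chemical master equation and is not the FSP-FIM implementation; the function names, truncation size and step size are assumptions.

```python
import math


def poisson_pmf(lam, n_states):
    """Poisson probabilities for copy numbers 0 .. n_states - 1."""
    pmf = [math.exp(-lam)]
    for x in range(1, n_states):
        pmf.append(pmf[-1] * lam / x)
    return pmf


def fisher_information(lam, n_states=100, step=1e-5):
    """Fisher information about lam from one sample of the distribution.

    Uses I = sum_x (dp/dlam)^2 / p with a central finite difference
    on a truncated state space.
    """
    p = poisson_pmf(lam, n_states)
    p_hi = poisson_pmf(lam + step, n_states)
    p_lo = poisson_pmf(lam - step, n_states)
    info = 0.0
    for px, hi, lo in zip(p, p_hi, p_lo):
        if px > 0.0:
            dp = (hi - lo) / (2.0 * step)
            info += dp * dp / px
    return info


lam = 10.0
print(fisher_information(lam), 1.0 / lam)
```

The same summation applies to any distribution over a finite set of states, which is why no assumption about the distribution shape is needed.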

Two classical multivariate statistical problems, testing of multivariate normality and the *k*-sample problem, are explored by a novel analysis on several resolutions simultaneously. The presented methods do not invert any estimated covariance matrix. Thereby, the methods work in the High Dimension Low Sample Size situation, i.e. when *n* ≤ *p*. The output, a significance map, is produced by doing a one-dimensional test for all possible resolution/position pairs. The significance map shows for which resolution/position pairs the null hypothesis is rejected. For the testing of multinormality, the Anderson-Darling test is utilized to detect potential departures from multinormality at different combinations of resolutions and positions. In the *k*-sample case, it is tested whether *k* data sets can be said to originate from the same unspecified discrete or continuous multivariate distribution. This is done by testing the *k* vectors corresponding to the same resolution/position pair of the *k* different data sets through the *k*-sample Anderson-Darling test. Successful demonstrations of the new methodology on artificial and real data sets are presented, and a feature selection scheme is demonstrated.

We study the optimal interventions of a regulator (a central bank or government) on the illiquidity default contagion process in a large, heterogeneous, unsecured interbank lending market. The regulator has only partial information on the interbank connections and aims to minimize the fraction of final defaults with minimal interventions. We derive the analytical results of the asymptotic optimal intervention policy and the asymptotic magnitude of default contagion in terms of the network characteristics. We extend the results of Amini, Cont and Minca’s work to incorporate interventions and adopt the dynamics of Amini, Minca and Sulem’s model to build heterogeneous networks with degree sequences and initial equity levels drawn from arbitrary distributions. Our results generate insights that the optimal intervention policy is “monotonic” in terms of the intervention cost, the closeness to invulnerability and connectivity. The regulator should prioritize interventions on banks that are systematically important or close to invulnerability. Moreover, the regulator should keep intervening on a bank once having intervened on it. Our simulation results show a good agreement with the theoretical results.

Many real-world systems can be studied in terms of pattern recognition tasks, so that proper use (and understanding) of machine learning methods in practical applications becomes essential. While many classification methods have been proposed, there is no consensus on which methods are more suitable for a given dataset. As a consequence, it is important to comprehensively compare methods in many possible scenarios. In this context, we performed a systematic comparison of 9 well-known clustering methods available in the R language, assuming normally distributed data. In order to account for the many possible variations of data, we considered artificial datasets with several tunable properties (number of classes, separation between classes, etc.). In addition, we also evaluated the sensitivity of the clustering methods with regard to their parameter configuration. The results revealed that, when considering the default configurations of the adopted methods, the spectral approach tended to present particularly good performance. We also found that the default configuration of the adopted implementations was not always accurate. In these cases, a simple approach based on random selection of parameter values proved to be a good alternative to improve the performance. All in all, the reported approach provides guidance for the choice of clustering algorithms.

Online communities, which have become an integral part of the day-to-day life of people and organizations, exhibit much diversity in both size and activity level; some communities grow to a massive scale and thrive, whereas others remain small, and even wither. In spite of the important role of these proliferating communities, there is limited empirical evidence that identifies the dominant factors underlying their dynamics. Using data collected from seven large online platforms, we observe a relationship between online community size and its activity which generally repeats itself across platforms. First, in most platforms, three distinct activity regimes exist: one of low activity and two of high activity. Further, we find a sharp activity phase transition at a critical community size that marks the shift between the first and the second regime in six out of the seven online platforms. Essentially, we argue that it is around this critical size that sustainable interactive communities emerge. The third activity regime occurs above a higher characteristic size in which community activity reaches and remains at a constant and higher level. We find that there is variance in the steepness of the slope of the second regime, which leads to the third regime of saturation, but that the third regime is exhibited in six of the seven online platforms. We propose that the sharp activity phase transition and the regime structure stem from the branching property of online interactions.

In dynamic contrast enhanced (DCE) MRI, separation of signal contributions from perfusion and leakage requires robust estimation of parameters in a pharmacokinetic model. We present and quantify the performance of a method to compute tissue hemodynamic parameters from DCE data using established pharmacokinetic models.

We propose a Bayesian scheme to obtain perfusion metrics from DCE MRI data. Initial performance is assessed through digital phantoms of the extended Tofts model (ETM) and the two-compartment exchange model (2CXM), comparing the Bayesian scheme to the standard Levenberg-Marquardt (LM) algorithm. Digital phantoms are also used to identify limitations in the pharmacokinetic models related to measurement conditions. Using computed maps of the extravascular volume (v_{e}) from 19 glioma patients, we analyze differences in the number of non-physiological high-intensity v_{e} values for both ETM and 2CXM, using a one-tailed paired t-test assuming unequal variance.

The Bayesian parameter estimation scheme demonstrated superior performance over the LM technique in the digital phantom simulations. In addition, we identified limitations in parameter reliability in relation to scan duration for the 2CXM. DCE data for glioma and cervical cancer patients was analyzed with both algorithms and demonstrated improvement in image readability for the Bayesian method. The Bayesian method demonstrated significantly fewer non-physiological high-intensity v_{e} values for the ETM (p<0.0001) and the 2CXM (p<0.0001).

This article explores how probabilistic programming can be used to simulate quantum correlations in an EPR experimental setting. Probabilistic programs are based on standard probability, which cannot produce quantum correlations. To address this limitation, a hypergraph formalism was programmed which expresses both the measurement contexts of the EPR experimental design and the associated constraints. Four contemporary open source probabilistic programming frameworks were used to simulate an EPR experiment in order to shed light on their relative effectiveness from both qualitative and quantitative dimensions. We found that all four probabilistic languages successfully simulated quantum correlations. Detailed analysis revealed that no language was clearly superior across all dimensions; however, the comparison does highlight aspects that can be considered when using probabilistic programs to simulate experiments in quantum physics.

Typical underground water storage facilities consist of reinforced concrete tanks and pipes. Although methods for their analysis are well developed, the use of these methods does not always give unambiguous results, as presented in the paper. An example of an underground tank is considered in which the cylindrical roof collapsed during construction under soil and excavator loads. The causes of failure are investigated with deterministic and stochastic models. In the first step, nonlinear finite element analysis including soil-structure interaction was performed to examine the overall level of structural safety, which was found satisfactory and thus did not explain the collapse. In the second step, an analytical stochastic model was developed and analysed with emphasis on sensitivity. This last analysis explained the collapse as a combination of unfavourable states of the considered variables, and the failure was recognised as a mixed construction-geotechnical-structural problem. The backfill properties and depth played the key role.
