ResearchPad - covariance
https://www.researchpad.co

SimSurvey: An R package for comparing the design and analysis of surveys by simulating spatially-correlated populations
https://www.researchpad.co/article/elastic_article_8465

Populations often show complex spatial and temporal dynamics, creating challenges in designing and implementing effective surveys. Inappropriate sampling designs can potentially lead to both under-sampling (reducing precision) and over-sampling (through the extensive and potentially expensive sampling of correlated metrics). These issues can be difficult to identify and avoid in sample surveys of fish populations, as such surveys tend to be costly and comprise multiple levels of sampling. Population estimates are therefore affected by each level of sampling as well as by the pathway taken to analyze such data. Though simulations are a useful tool for exploring the efficacy of specific sampling strategies and statistical methods, there are few tools that facilitate simulation testing of a range of sampling and analytical pathways for multi-stage survey data. Here we introduce the R package SimSurvey, which has been designed to simplify the process of simulating surveys of age-structured and spatially-distributed populations. The package allows the user to simulate age-structured populations that vary in space and time and to explore the efficacy of a range of built-in or user-defined sampling protocols for recovering the parameters of the known population. SimSurvey also includes a function for estimating the stratified mean and variance of the population from the simulated survey data. We demonstrate the use of this package using a case study and show that it can reveal unexpected sources of bias and be used to explore design-based solutions to such problems. In summary, SimSurvey can serve as a convenient, accessible and flexible platform for simulating a wide range of sampling strategies for fish stocks and other populations that show complex structuring. Various statistical approaches can then be applied to the results to test the efficacy of different analytical approaches.
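To make the simulate-then-survey-then-stratify workflow above concrete, here is a minimal, hypothetical sketch in R. The function names follow the SimSurvey documentation, but the specific arguments shown are illustrative choices and should be checked against the installed version of the package.

```r
## Hypothetical sketch of the simulate -> survey -> stratified-analysis pipeline.
## Function names follow the SimSurvey documentation; arguments are illustrative.
library(SimSurvey)

set.seed(438)
sim <- sim_abundance(ages = 1:20, years = 1:20) |>        # age-structured numbers by age and year
  sim_distribution(grid = make_grid(res = c(10, 10)),     # spread the population over a survey grid
                   ays_covar = sim_ays_covar(phi_age = 0.8, phi_year = 0.1)) |>
  sim_survey(n_sims = 5, q = sim_logistic(k = 2, x0 = 3)) |>  # simulate survey sets and sub-sampling
  run_strat()                                             # stratified mean and variance estimates
```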

Exact flow of particles using for state estimations in unmanned aerial systems' navigation
https://www.researchpad.co/article/Nb8d1b185-24ca-4749-9cc9-bbc7ade34d0a

Navigation is a substantial issue in the field of robotics. Simultaneous Localization and Mapping (SLAM) is a core principle for many autonomous navigation applications, particularly in Global Navigation Satellite System (GNSS) denied environments. Many SLAM methods have made substantial contributions to improving its accuracy, cost, and efficiency, yet achieving robust SLAM remains a considerable challenge, and several attempts have been made to find better estimation algorithms for it. In this research, we propose a novel Bayesian-filtering-based Airborne SLAM structure, presented for the first time in the literature. We also present the mathematical background of the algorithm and the SLAM model of an autonomous aerial vehicle. Simulation results show that the new Airborne SLAM, which uses the exact flow of particles for recursive state estimation, is superior to previously proposed approaches in terms of accuracy and speed of convergence. Nevertheless, its computational complexity may raise real-time application concerns, particularly in high-dimensional state spaces. In Airborne SLAM it is nonetheless preferable in measurement environments with low-uncertainty sensors, because it gives more reliable results by eliminating the degeneracy problem seen in particle filter structures.
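For readers unfamiliar with the "exact flow of particles", the widely cited linear-Gaussian form of the exact particle flow (Daum and Huang) moves each particle through a pseudo-time λ ∈ [0, 1] as the measurement is incorporated. It is shown here only to illustrate the kind of flow the abstract refers to, and is not necessarily the exact formulation used by the authors:

\[
\frac{dx}{d\lambda} \;=\; A(\lambda)\,x + b(\lambda), \qquad
A(\lambda) \;=\; -\tfrac{1}{2}\, P H^{\mathsf T}\!\left(\lambda H P H^{\mathsf T} + R\right)^{-1} H,
\]
\[
b(\lambda) \;=\; (I + 2\lambda A)\left[(I + \lambda A)\, P H^{\mathsf T} R^{-1} z + A\,\bar{x}\right],
\]

where P is the prior covariance, H the measurement matrix, R the measurement-noise covariance, z the measurement and x̄ the prior mean.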

Disease-relevant mutations alter amino acid co-evolution networks in the second nucleotide binding domain of CFTR
https://www.researchpad.co/article/N211c75a7-eaac-4644-b655-cac4e239c2e4

Cystic Fibrosis (CF) is an inherited disease caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) ion channel. Mutations in CFTR cause impaired chloride ion transport in the epithelial tissues of patients, leading to cardiopulmonary decline and pancreatic insufficiency in the most severely affected patients. CFTR is composed of twelve membrane-spanning domains, two nucleotide-binding domains (NBDs), and a regulatory domain. The most common mutation in CFTR is a deletion of phenylalanine at position 508 (ΔF508) in NBD1. Previous research has primarily concentrated on the structure and dynamics of the NBD1 domain; however, numerous pathological mutations have also been found in the lesser-studied NBD2 domain. We have investigated the amino acid co-evolved network of interactions in NBD2, and the changes that occur in that network upon the introduction of CF and CF-related mutations (S1251N(T), S1235R, D1270N, N1303K(T)). Extensive coupling between the α- and β-subdomains was identified, involving residues in or near the Walker A, Walker B, H-loop and C-loop motifs. Alterations in the predicted residue network varied from moderate for the S1251T perturbation to more severe for N1303T. The S1235R and D1270N networks differed greatly from the wildtype, yet these CF mutations only affect ion transport preference and do not severely disrupt CFTR function, suggesting dynamic flexibility in the network of interactions in NBD2. Our results also suggest that inappropriate interactions between the β-subdomain and Q-loop could be detrimental. We also identified mutations predicted to stabilize the NBD2 residue network upon introduction of the CF and CF-related mutations; these predicted mutations are scored as benign by the MutPred2 algorithm. Our results suggest that the level of disruption of the co-evolution predictions of the amino acid networks in NBD2 does not have a straightforward correlation with the severity of the observed CF phenotypes.
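As a generic illustration of the kind of quantity that amino acid co-evolution analyses build on (not the authors' specific network method), the following R sketch computes the mutual information between two columns of a multiple sequence alignment stored as a character matrix `aln` (sequences in rows, positions in columns); `aln`, `i` and `j` are hypothetical inputs.

```r
## Generic illustration only: mutual information between two alignment columns.
## `aln` is a character matrix (sequences x positions); `i`, `j` are column indices.
column_mi <- function(aln, i, j) {
  pij <- table(aln[, i], aln[, j]) / nrow(aln)   # joint amino-acid frequencies
  pi  <- rowSums(pij)
  pj  <- colSums(pij)
  sum(pij * log(pij / outer(pi, pj)), na.rm = TRUE)  # empty cells drop out via na.rm
}
```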

Fast and flexible linear mixed models for genome-wide genetics
https://www.researchpad.co/article/5c6730aed5eed0c484f37eb1

Linear mixed effect models are powerful tools used to account for population structure in genome-wide association studies (GWASs) and estimate the genetic architecture of complex traits. However, fully-specified models are computationally demanding and common simplifications often lead to reduced power or biased inference. We describe Grid-LMM (https://github.com/deruncie/GridLMM), an extendable algorithm for repeatedly fitting complex linear models that account for multiple sources of heterogeneity, such as additive and non-additive genetic variance, spatial heterogeneity, and genotype-environment interactions. Grid-LMM can compute approximate (yet highly accurate) frequentist test statistics or Bayesian posterior summaries at a genome-wide scale in a fraction of the time compared to existing general-purpose methods. We apply Grid-LMM to two types of quantitative genetic analyses. The first is focused on accounting for spatial variability and non-additive genetic variance while scanning for QTL; and the second aims to identify gene expression traits affected by non-additive genetic variation. In both cases, modeling multiple sources of heterogeneity leads to new discoveries.
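A conceptual sketch of the grid idea for a single random effect is shown below; it is not the Grid-LMM package API, but it illustrates how one eigendecomposition of a relatedness matrix K can be reused to profile the likelihood over a grid of variance-component proportions h².

```r
## Conceptual sketch (not the Grid-LMM package API): for one genetic random effect
## with relatedness matrix K, profile the likelihood over a grid of proportions h2,
## reusing a single eigendecomposition of K.
grid_lmm_1re <- function(y, X, K, h2_grid = seq(0, 0.99, by = 0.01)) {
  eig <- eigen(K, symmetric = TRUE)
  Uy  <- drop(crossprod(eig$vectors, y))   # rotate the data once
  UX  <- crossprod(eig$vectors, X)
  n   <- length(y)
  ll  <- sapply(h2_grid, function(h2) {
    d   <- h2 * eig$values + (1 - h2)      # V = h2*K + (1-h2)*I is diagonal after rotation
    fit <- lm.wfit(UX, Uy, w = 1 / d)      # GLS via weighted least squares
    rss <- sum(fit$residuals^2 / d)
    -0.5 * (n * log(rss / n) + sum(log(d)))  # profile log-likelihood (up to constants)
  })
  list(h2 = h2_grid[which.max(ll)], logLik = max(ll))
}
```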

Probabilistic logic analysis of the highly heterogeneous spatiotemporal HFRS incidence distribution in Heilongjiang province (China) during 2005-2013
https://www.researchpad.co/article/5c5ca2d1d5eed0c48441eb6c

Background

Hemorrhagic fever with renal syndrome (HFRS) is a zoonosis caused by hantaviruses (family Hantaviridae). A large number of HFRS cases occur in China, especially in Heilongjiang Province, raising great public health concerns. The distribution of these cases across space-time often exhibits highly heterogeneous characteristics. Hence, it is widely recognized that improved mapping of heterogeneous HFRS distributions and quantitative assessment of space-time disease transition patterns can considerably advance the detection, prevention and control of epidemic outbreaks.

Methods

A synthesis of space-time mapping and probabilistic logic is proposed to study the distribution of monthly HFRS population-standardized incidences in Heilongjiang province during the period 2005–2013. We introduce a class-dependent Bayesian maximum entropy (cd-BME) mapping method that divides the original dataset into discrete incidence classes, thereby overcoming data heterogeneity and skewness effects, and that produces space-time HFRS incidence estimates together with their estimation accuracy. A ten-fold cross-validation analysis is conducted to evaluate the performance of the proposed cd-BME implementation compared to the standard class-independent BME implementation. Incidence maps generated by cd-BME are used to study the spatiotemporal HFRS spread patterns. Further, the spatiotemporal dependence of HFRS incidences is measured in terms of probability logic indicators that link class-dependent HFRS incidences at different space-time points. These indicators convey useful complementary information regarding intraclass and interclass relationships, such as the change in HFRS transition probabilities between different incidence classes with increasing geographical distance and time separation.
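A hypothetical R sketch of the empirical interclass transition probabilities mentioned above: given point coordinates `xy`, incidence classes `cls` and distance-bin edges `breaks` (all illustrative inputs), it estimates the probability of observing class l at lag ≈ h given class k by counting class pairs within each distance bin. This is a generic empirical transiogram, not the cd-BME machinery itself.

```r
## Hypothetical sketch: empirical interclass transition probabilities by distance bin.
transition_probs <- function(xy, cls, breaks) {
  D   <- as.matrix(dist(xy))
  ij  <- which(upper.tri(D), arr.ind = TRUE)           # all point pairs
  bin <- cut(D[ij], breaks = breaks)
  lapply(split(seq_len(nrow(ij)), bin), function(idx) {
    pairs <- rbind(cbind(cls[ij[idx, 1]], cls[ij[idx, 2]]),
                   cbind(cls[ij[idx, 2]], cls[ij[idx, 1]]))  # symmetrise the pair set
    prop.table(table(from = pairs[, 1], to = pairs[, 2]), margin = 1)  # P(to | from) at this lag
  })
}
```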

Results

Each HFRS class exhibited a distinct space-time variation structure in terms of its varying covariance parameters (shape, sill and correlation ranges). Given the heterogeneous features of the HFRS dataset, the cd-BME implementation captured these features better than the standard implementation (e.g., mean absolute error: 0.19 vs. 0.43 cases/10⁵ capita), demonstrating a point-outbreak character at high incidence levels and a non-point spread character at low levels. Intraclass HFRS variations were found to be considerably different from interclass HFRS variations. Certain incidence classes occurred frequently near one class but were rarely found adjacent to other classes. Different classes may share common boundaries or be surrounded completely by another class. The HFRS class 0–68.5% was the most dominant in Heilongjiang province (covering more than 2/3 of the total area). The probabilities that certain incidence classes occur next to other classes were used to estimate the transitions between HFRS classes; such probabilities describe the dependency pattern of the space-time arrangement of the HFRS patches occupied by the incidence classes. The HFRS transition probabilities also suggested the presence of both positive and negative relations among the main classes. The HFRS indicator plots offer complementary visualizations of the varying probabilities of transition between incidence classes.

Conclusions

The cd-BME method combined with probabilistic logic indicators offers an accurate and informative quantitative representation of the heterogeneous HFRS incidences in the space-time domain, and the results thus obtained can be interpreted readily. The same methodological combination could also be used in the spatiotemporal modeling and prediction of other epidemics under similar circumstances.

The costs of negative affect attributable to alcohol consumption in later life: A within-between random longitudinal econometric model using UK Biobank
https://www.researchpad.co/article/5c6dca2fd5eed0c48452a89b

Aims

Research demonstrates a negative relationship between alcohol use and affect, but the value of this loss in affect is unknown and thus cannot be included in estimates of the cost of alcohol to society. This paper aims to examine this relationship and to develop econometric techniques to value the loss in affect attributable to alcohol consumption.

Methods

Cross-sectional (n = 129,437) and longitudinal (n = 11,352) analyses of alcohol consumers in UK Biobank data were undertaken, with depression and neuroticism as proxies for negative affect. The cross-sectional relationships between household income, negative affect and alcohol consumption were analysed using regression models, controlling for confounding variables, and using within-between random effects models that are robust to unobserved heterogeneity. The differential in household income required to offset alcohol’s detriment to affect was derived.
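The within-between (Mundlak-style) decomposition used in such models can be sketched with lme4 as follows; the data frame `d` and the variable names (id, affect, alcohol_gpd, log_income, age, sex) are hypothetical stand-ins for the UK Biobank fields, so this shows the model structure rather than the authors' exact specification.

```r
## Hedged sketch of a within-between random-effects model with lme4.
library(lme4)
library(dplyr)

d <- d |>
  group_by(id) |>
  mutate(alcohol_between = mean(alcohol_gpd),
         alcohol_within  = alcohol_gpd - alcohol_between,
         income_between  = mean(log_income),
         income_within   = log_income - income_between) |>
  ungroup()

m <- lmer(affect ~ alcohol_within + alcohol_between +
            income_within + income_between + age + sex + (1 | id), data = d)

## Compensating income differential: the income change that offsets the affect
## cost of one extra gram/day, based on the within-person coefficients.
b <- fixef(m)
offset_income <- -b["alcohol_within"] / b["income_within"]
```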

Results

A consistent relationship between depression and alcohol consumption (β = 0.001, z = 7.64) and between neuroticism and alcohol consumption (β = 0.001, z = 9.24) was observed in cross-sectional analyses and replicated in within-between models (depression β = 0.001, z = 2.32; neuroticism β = 0.001, z = 2.33). Significant associations were found between household income and depression (cross-sectional β = -0.157, z = -23.86; within-between β = -0.146, z = -9.51) and between household income and neuroticism (cross-sectional β = -0.166, z = -32.02; within-between β = -0.158, z = -7.44). The value of reducing alcohol consumption by one gram/day was pooled and estimated to be £209.06 (95% CI £171.84 to £246.27).

Conclusions

There was a robust relationship between alcohol consumption and negative affect. Econometric methods can value the intangible effects of alcohol use and may, therefore, facilitate the fiscal determination of benefit.

Overcoming the problem of multicollinearity in sports performance data: A novel application of partial least squares correlation analysis
https://www.researchpad.co/article/5c6f1492d5eed0c48467a325

Objectives

Professional sporting organisations invest considerable resources collecting and analysing data in order to better understand the factors that influence performance. Recent advances in non-invasive technologies, such as global positioning systems (GPS), mean that large volumes of data are now readily available to coaches and sport scientists. However, analysing such data can be challenging, particularly when sample sizes are small and data sets contain multiple highly correlated variables, as is often the case in a sporting context. Multicollinearity in particular, if not treated appropriately, can be problematic and might lead to erroneous conclusions. In this paper we present a novel ‘leave one variable out’ (LOVO) partial least squares correlation analysis (PLSCA) methodology, designed to overcome the problem of multicollinearity, and show how this can be used to identify the training load (TL) variables that most influence ‘end fitness’ in young rugby league players.

Methods

The accumulated TL of sixteen male professional youth rugby league players (17.7 ± 0.9 years) was quantified via GPS, a micro-electrical-mechanical system (MEMS), and players’ session rating of perceived exertion (sRPE) over a 6-week pre-season training period. Immediately prior to and following this training period, participants undertook a 30–15 intermittent fitness test (30-15IFT), which was used to determine a player’s ‘starting fitness’ and ‘end fitness’. In total, twelve TL variables were collected, and these, along with ‘starting fitness’ as a covariate, were regressed against ‘end fitness’. However, considerable multicollinearity in the data (VIF >1000 for nine variables) meant that the multiple linear regression (MLR) process was unstable, and so we developed a novel LOVO PLSCA adaptation to quantify the relative importance of the predictor variables and thus minimise multicollinearity issues. As such, the LOVO PLSCA was used as a tool to inform and refine the MLR process.
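A minimal reconstruction of the 'leave one variable out' idea, assuming the description above: each predictor is ranked by how much the total singular-value inertia of the predictor-outcome cross-correlation matrix drops when that predictor is left out. This is a generic sketch, not the authors' code.

```r
## Minimal LOVO PLSCA sketch: rank predictors by the drop in singular-value inertia
## of the predictor-outcome cross-correlation matrix when each is left out.
lovo_plsca <- function(X, Y) {
  X <- scale(X)
  Y <- scale(as.matrix(Y))
  inertia <- function(A, B) sum(svd(cor(A, B))$d^2)
  full <- inertia(X, Y)
  drop_j <- sapply(seq_len(ncol(X)),
                   function(j) full - inertia(X[, -j, drop = FALSE], Y))
  sort(setNames(drop_j, colnames(X)), decreasing = TRUE)  # largest drop = most influential
}
```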

Results

The LOVO PLSCA identified the distance accumulated at very-high speed (>7 m·s⁻¹) as the most important TL variable influencing improvement in player fitness, with this variable causing the largest decrease in singular value inertia (5.93). When included in a refined linear regression model, this variable, along with ‘starting fitness’ as a covariate, explained 73% of the variance in v30-15IFT ‘end fitness’ (p<0.001) and completely eliminated any multicollinearity issues.

Conclusions

The LOVO PLSCA technique appears to be a useful tool for evaluating the relative importance of predictor variables in data sets that exhibit considerable multicollinearity. When used as a filtering tool, LOVO PLSCA produced an MLR model that demonstrated a significant relationship between ‘end fitness’ and the predictor variable ‘accumulated distance at very-high speed’ when ‘starting fitness’ was included as a covariate. As such, LOVO PLSCA may be a useful tool for sport scientists and coaches seeking to analyse data sets obtained using GPS and MEMS technologies.

Modeling financial interval time series
https://www.researchpad.co/article/5c6f148ad5eed0c48467a270

In financial economics, a large number of models are developed based on the daily closing price. When using only the daily closing price to model the time series, we may discard valuable intra-daily information, such as the maximum and minimum prices. In this study, we propose an interval time series model that includes the daily maximum, minimum, and closing prices, and we then apply the proposed model to forecast the entire interval. The likelihood function and the corresponding maximum likelihood estimates (MLEs) are obtained via a stochastic differential equation and the Girsanov theorem. To capture the heteroscedasticity of volatility, we consider a stochastic volatility model. The efficiency of the proposed estimators is illustrated by a simulation study. Finally, based on real data for the S&P 500 index, the proposed method outperforms several alternatives in terms of forecast accuracy.

Integrating predicted transcriptome from multiple tissues improves association detection
https://www.researchpad.co/article/5c50c43bd5eed0c4845e8359

Integration of genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies is needed to improve our understanding of the biological mechanisms underlying GWAS hits, and our ability to identify therapeutic targets. Gene-level association methods such as PrediXcan can prioritize candidate targets. However, limited eQTL sample sizes and absence of relevant developmental and disease context restrict our ability to detect associations. Here we propose an efficient statistical method (MultiXcan) that leverages the substantial sharing of eQTLs across tissues and contexts to improve our ability to identify potential target genes. MultiXcan integrates evidence across multiple panels using multivariate regression, which naturally takes into account the correlation structure. We apply our method to simulated and real traits from the UK Biobank and show that, in realistic settings, we can detect a larger set of significantly associated genes than using each panel separately. To improve applicability, we developed a summary result-based extension called S-MultiXcan, which we show yields highly concordant results with the individual level version when LD is well matched. Our multivariate model-based approach allowed us to use the individual level results as a gold standard to calibrate and develop a robust implementation of the summary-based extension. Results from our analysis as well as software and necessary resources to apply our method are publicly available.
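Conceptually, the multivariate step can be sketched as regressing the trait on the predicted expression of a gene across tissues, using principal components of the predicted-expression matrix to absorb cross-tissue correlation, and testing the tissue effects jointly. The R sketch below is a simplified stand-in for the MultiXcan software; `y`, `E` and the 95% variance cut-off are illustrative.

```r
## Simplified conceptual stand-in for the multivariate step (not the MultiXcan software):
## regress trait y on the predicted expression of one gene across tissues (columns of E),
## using principal components of E to absorb cross-tissue correlation, then test jointly.
multi_tissue_assoc <- function(y, E, var_keep = 0.95) {
  pc <- prcomp(scale(E))
  k  <- which(cumsum(pc$sdev^2) / sum(pc$sdev^2) >= var_keep)[1]
  fit <- lm(y ~ pc$x[, 1:k, drop = FALSE])
  anova(lm(y ~ 1), fit)   # joint F-test across the retained components
}
```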

The finite state projection based Fisher information matrix approach to estimate information and optimize single-cell experiments
https://www.researchpad.co/article/5c478c61d5eed0c484bd1f74

Modern optical imaging experiments not only measure single-cell and single-molecule dynamics with high precision, but they can also perturb the cellular environment in myriad controlled and novel settings. Techniques, such as single-molecule fluorescence in-situ hybridization, microfluidics, and optogenetics, have opened the door to a large number of potential experiments, which begs the question of how to choose the best possible experiment. The Fisher information matrix (FIM) estimates how well potential experiments will constrain model parameters and can be used to design optimal experiments. Here, we introduce the finite state projection (FSP) based FIM, which uses the formalism of the chemical master equation to derive and compute the FIM. The FSP-FIM makes no assumptions about the distribution shapes of single-cell data, and it does not require precise measurements of higher order moments of such distributions. We validate the FSP-FIM against well-known Fisher information results for the simple case of constitutive gene expression. We then use numerical simulations to demonstrate the use of the FSP-FIM to optimize the timing of single-cell experiments with more complex, non-Gaussian fluctuations. We validate optimal simulated experiments determined using the FSP-FIM with Monte-Carlo approaches and contrast these to experiment designs chosen by traditional analyses that assume Gaussian fluctuations or use the central limit theorem. By systematically designing experiments to use all of the measurable fluctuations, our method enables a key step to improve co-design of experiments and quantitative models.
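For orientation, when N_c cells are measured independently at time t and each cell's state x follows the FSP-truncated distribution p(x, t; θ), the Fisher information matrix takes the standard form for a discrete distribution; this is a textbook identity shown for context, not the paper's full derivation:

\[
\mathrm{FIM}_{ij}(t) \;=\; N_c \sum_{x \,\in\, \Omega_{\mathrm{FSP}}}
\frac{1}{p(x,t;\theta)}\,
\frac{\partial p(x,t;\theta)}{\partial \theta_i}\,
\frac{\partial p(x,t;\theta)}{\partial \theta_j},
\]

where Ω_FSP is the truncated state space and p solves the FSP-approximated chemical master equation.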

A novel scale-space approach for multinormality testing and the k-sample problem in the high dimension low sample size scenario
https://www.researchpad.co/article/5c50c44ed5eed0c4845e84bb

Two classical multivariate statistical problems, testing of multivariate normality and the k-sample problem, are explored by a novel analysis on several resolutions simultaneously. The presented methods do not invert any estimated covariance matrix. The methods therefore work in the High Dimension Low Sample Size situation, i.e. when n ≪ p. The output, a significance map, is produced by performing a one-dimensional test for all possible resolution/position pairs. The significance map shows for which resolution/position pairs the null hypothesis is rejected. For the testing of multinormality, the Anderson-Darling test is utilized to detect potential departures from multinormality at different combinations of resolutions and positions. In the k-sample case, it is tested whether k data sets can be said to originate from the same unspecified discrete or continuous multivariate distribution. This is done by testing the k vectors corresponding to the same resolution/position pair of the k different data sets through the k-sample Anderson-Darling test. Successful demonstrations of the new methodology on artificial and real data sets are presented, and a feature selection scheme is demonstrated.

Clustering algorithms: A comparative approach
https://www.researchpad.co/article/5c478c94d5eed0c484bd335e

Many real-world systems can be studied in terms of pattern recognition tasks, so that proper use (and understanding) of machine learning methods in practical applications becomes essential. While many classification methods have been proposed, there is no consensus on which methods are most suitable for a given dataset. As a consequence, it is important to compare methods comprehensively across many possible scenarios. In this context, we performed a systematic comparison of 9 well-known clustering methods available in the R language, assuming normally distributed data. In order to account for the many possible variations of data, we considered artificial datasets with several tunable properties (number of classes, separation between classes, etc.). In addition, we also evaluated the sensitivity of the clustering methods to their parameter configurations. The results revealed that, when considering the default configurations of the adopted methods, the spectral approach tended to present particularly good performance. We also found that the default configuration of the adopted implementations was not always accurate. In these cases, a simple approach based on random selection of parameter values proved to be a good alternative for improving performance. All in all, the reported approach provides guidance for choosing among clustering algorithms.
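A small, hypothetical example of the comparison set-up in R: generate Gaussian classes, cluster them with two of the families compared in the study (k-means and spectral clustering via kernlab), and score the partitions against the known labels with the adjusted Rand index from mclust. Package choices and parameter values are illustrative, not the study's exact protocol.

```r
## Hypothetical comparison set-up: Gaussian classes, two clustering algorithms,
## scored against the known labels with the adjusted Rand index.
library(kernlab)   # specc()
library(mclust)    # adjustedRandIndex()

set.seed(1)
n <- 150; k <- 3
labels <- rep(1:k, each = n / k)
X <- do.call(rbind, lapply(1:k, function(c)
  matrix(rnorm(2 * n / k, mean = 3 * c), ncol = 2)))  # three well-separated 2D classes

km <- kmeans(X, centers = k)$cluster
sp <- as.integer(specc(X, centers = k))

c(kmeans = adjustedRandIndex(km, labels),
  spectral = adjustedRandIndex(sp, labels))
```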

Robust estimation of hemo-dynamic parameters in traditional DCE-MRI models
https://www.researchpad.co/article/5c37b79fd5eed0c4844906c1

Purpose

In dynamic contrast enhanced (DCE) MRI, separation of signal contributions from perfusion and leakage requires robust estimation of parameters in a pharmacokinetic model. We present and quantify the performance of a method to compute tissue hemodynamic parameters from DCE data using established pharmacokinetic models.

Methods

We propose a Bayesian scheme to obtain perfusion metrics from DCE MRI data. Initial performance is assessed through digital phantoms of the extended Tofts model (ETM) and the two-compartment exchange model (2CXM), comparing the Bayesian scheme to the standard Levenberg-Marquardt (LM) algorithm. Digital phantoms are also invoked to identify limitations in the pharmacokinetic models related to measurement conditions. Using computed maps of the extravascular volume (ve) from 19 glioma patients, we analyze differences in the number of non-physiological high-intensity ve values for both the ETM and the 2CXM, using a one-tailed paired t-test assuming unequal variance.
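For reference, the extended Tofts model mentioned above relates the tissue concentration C_t(t) to the arterial input C_p(t) through the transfer constant K^trans, the plasma volume fraction v_p and the extravascular volume fraction v_e (a standard formulation; the 2CXM adds an explicit plasma compartment and is not written out here):

\[
C_t(t) \;=\; v_p\, C_p(t) \;+\; K^{\mathrm{trans}} \int_0^{t} C_p(\tau)\, e^{-k_{ep}\,(t-\tau)}\, d\tau,
\qquad k_{ep} = K^{\mathrm{trans}} / v_e .
\]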

Results

The Bayesian parameter estimation scheme demonstrated superior performance over the LM technique in the digital phantom simulations. In addition, we identified limitations in parameter reliability in relation to scan duration for the 2CXM. DCE data for glioma and cervical cancer patients was analyzed with both algorithms and demonstrated improvement in image readability for the Bayesian method. The Bayesian method demonstrated significantly fewer non-physiological high-intensity ve values for the ETM (p<0.0001) and the 2CXM (p<0.0001).

Conclusion

We have demonstrated substantial improvement of the perceptive quality of pharmacokinetic parameters from advanced compartment models using the Bayesian parameter estimation scheme as compared to the LM technique.

Analyzing dwell times with the Generalized Method of Moments
https://www.researchpad.co/article/5c3e4f85d5eed0c484d764cc

The Generalized Method of Moments (GMM) is a statistical method for the analysis of samples from random processes. First developed for the analysis of econometric data, the method is here formulated to extract hidden kinetic parameters from measurements of single molecule dwell times. Our method is based on the analysis of cumulants of the measured dwell times. We develop a general form of an objective function whose minimization can return estimates of decay parameters for any number of intermediates directly from the data. We test the performance of our technique using both simulated and experimental data. We also compare the performance of our method to nonlinear least-squares minimization (NL-LSQM), a commonly-used technique for analysis of single molecule dwell times. Our findings indicate that the GMM performs comparably to NL-LSQM over most of the parameter range we explore. It offers some benefits compared with NL-LSQM in that it does not require binning, exhibits slightly lower bias and variance with small sample sizes (N<20), and is somewhat superior in identifying fast decay times with these same low count data sets. Additionally, a comparison with the Classical Method of Moments (CMM) shows that the CMM can fail in many cases, whereas the GMM always returns estimates. Our results show that the GMM can be a useful tool and complements standard approaches to analysis of single molecule dwell times.
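For orientation, the generic GMM estimator underlying this kind of analysis minimizes a weighted quadratic form in the moment (here, cumulant) mismatch; the exact objective used in the paper may differ in its choice of moments and weighting matrix W:

\[
\hat{\theta} \;=\; \arg\min_{\theta}\; \left[\hat{m} - m(\theta)\right]^{\mathsf T} W \left[\hat{m} - m(\theta)\right],
\]

where m̂ collects the sample cumulants of the measured dwell times, m(θ) the cumulants predicted by the kinetic model, and W is a positive-definite weighting matrix.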

A semi-empirical model of the energy balance closure in the surface layer
https://www.researchpad.co/article/5c1ab819d5eed0c484026acc

It has been hypothesized that the energy balance closure problem of single-tower eddy-covariance measurements is linked to large-scale turbulent transport. In order to shed light on this problem, we investigate the functional dependence of the normalized residual for the potential temperature and humidity conservation equations, i.e. the imbalance ratio for the fluxes of latent and sensible heat. We set up a suite of simulations consisting of cases with different stability and surface Bowen ratio. We employ a nesting approach in the lower part of the atmospheric boundary-layer to achieve higher spatial resolution near the surface. Our simulations reproduce earlier simulation results for the mixed layer and also mimic the saw-blade pattern of real flux measurements. Focusing on homogeneous terrain, we derive a parameterization for the spatially averaged flux imbalance ratios of latent and sensible heat in the surface layer. We also investigate how the remaining imbalance for a given point measurement is related to the local turbulence, by deriving a statistical model based on turbulence characteristics that are related to large-scale turbulence. The average imbalance ratio scales well with friction velocity, especially for sensible heat. For the latent heat flux, our results show that the Bowen ratio also influences the underestimation. Furthermore, in the surface layer the residual has a linear dependence on the absolute height divided by the boundary-layer height. Our parameterization allows us to deduce an expression for the residual in the energy budget for a particular measurement half hour, based on the measurement height and stability.

Mixed effects approach to the analysis of the stepped wedge cluster randomised trial—Investigating the confounding effect of time through simulation
https://www.researchpad.co/article/5c1c0af1d5eed0c484426f5a

Background

A stepped wedge cluster randomised trial (SWCRT) is a multicentred study which allows an intervention to be rolled out at sites in a random order. Once the intervention is initiated at a site, all participants within that site remain exposed to the intervention for the remainder of the study.

The time since the start of the study (“calendar time”) may affect outcome measures through underlying time trends or periodicity. The time since the intervention was introduced to a site (“exposure time”) may also affect outcomes cumulatively for successful interventions, possibly in addition to a step change when the intervention began.

Methods

Motivated by a SWCRT of self-monitoring for bipolar disorder, we conducted a simulation study to compare model formulations to analyse data from a SWCRT under 36 different scenarios in which time was related to the outcome (improvement in mood score). The aim was to find a model specification that would produce reliable estimates of intervention effects under different scenarios. Nine different formulations of a linear mixed effects model were fitted to these datasets. These models varied in the specification of calendar and exposure times.

Results

Modelling the effects of the intervention was best accomplished by including terms for both calendar time and exposure time. Treating time as categorical (a separate parameter for each measurement time-step) achieved the best coverage probabilities and low bias, but at a cost of wider confidence intervals compared to simpler models for those scenarios which were sufficiently modelled by fewer parameters. Treating time as continuous and including a quadratic time term performed similarly well, with slightly larger variations in coverage probability, but narrower confidence intervals and in some cases lower bias. The impact of misspecifying the covariance structure was comparatively small.

Conclusions

We recommend that unless there is a priori information to indicate the form of the relationship between time and outcomes, data from SWCRTs should be analysed with a linear mixed effects model that includes separate categorical terms for calendar time and exposure time. Prespecified sensitivity analyses should consider the different formulations of these time effects in the model, to assess their impact on estimates of intervention effects.
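A minimal lme4 sketch of the recommended specification, with hypothetical variable names (mood_score, calendar_time, exposure_time, site, swcrt_data): calendar time and exposure time enter as separate categorical fixed effects and site as a random intercept.

```r
## Minimal lme4 sketch of the recommended model; variable names are hypothetical.
library(lme4)
fit <- lmer(mood_score ~ factor(calendar_time) + factor(exposure_time) + (1 | site),
            data = swcrt_data)
```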

Low rank and sparsity constrained method for identifying overlapping functional brain networks
https://www.researchpad.co/article/5c0841fad5eed0c484fcb573

Analysis of functional magnetic resonance imaging (fMRI) data has revealed that brain regions can be grouped into functional brain networks (fBNs) or communities. A community in fMRI analysis signifies a group of brain regions coupled functionally with one another. In neuroimaging, functional connectivity (FC) measures can be utilized to quantify such functionally connected regions for disease diagnosis, which underscores the need to devise novel FC estimation methods. In this paper, we propose a novel method of learning FC by constraining its rank and the sum of non-zero coefficients. The underlying idea is that fBNs are sparse and can be embedded in a relatively lower-dimensional space. In addition, we propose to extract overlapping networks. In many instances, communities are characterized as combinations of disjoint brain regions, although recent studies indicate that brain regions may participate in more than one community. In this paper, large-scale overlapping fBNs are identified from resting-state fMRI data by employing non-negative matrix factorization. Our findings support the existence of overlapping brain networks.
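The factorization step can be illustrated with generic Lee-Seung multiplicative updates for non-negative matrix factorization applied to a non-negative connectivity matrix X; the columns of W can then be read as soft, potentially overlapping community memberships. This sketch omits the paper's additional low-rank and sparsity constraints.

```r
## Generic Lee-Seung multiplicative updates for NMF of a non-negative matrix X into W %*% H.
nmf_mu <- function(X, k, n_iter = 500, eps = 1e-9) {
  W <- matrix(runif(nrow(X) * k), nrow(X), k)
  H <- matrix(runif(k * ncol(X)), k, ncol(X))
  for (it in seq_len(n_iter)) {
    H <- H * (t(W) %*% X) / (t(W) %*% W %*% H + eps)
    W <- W * (X %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  list(W = W, H = H)   # W: region-by-community memberships, H: community profiles
}
```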

S-maup: Statistical test to measure the sensitivity to the modifiable areal unit problem
https://www.researchpad.co/article/5c06f026d5eed0c484c6d1d3

This work presents a nonparametric statistical test, S-maup, to measure the sensitivity of a spatially intensive variable to the effects of the Modifiable Areal Unit Problem (MAUP). To the best of our knowledge, S-maup is the first statistic of its type; it focuses on determining how much the distribution of the variable, at its highest level of spatial disaggregation, will change when it is spatially aggregated. Through a computational experiment, we obtain the basis for the design of the statistical test under the null hypothesis of non-sensitivity to the MAUP. We performed an exhaustive simulation study to approximate the empirical distribution of the test statistic, obtain its critical values, and compute its power and size. The results indicate that, in general, both the statistical size and power improve with increasing sample size. Finally, for illustrative purposes, an empirical application is presented using the Mincer equation in South Africa, where, starting from 206 municipalities, the S-maup statistic is used to find the maximum level of spatial aggregation that avoids the negative consequences of the MAUP.

Impact of a custom-made 3D printed ergonomic grip for direct laryngoscopy on novice intubation performance in a simulated easy and difficult airway scenario—A manikin study
https://www.researchpad.co/article/5bfdb389d5eed0c4845ca405

Direct laryngoscopy using a Macintosh laryngoscope is the most widely used approach; however, this skill is not easy for novices and trainees. We evaluated the performance of novices using a laryngoscope with a three-dimensional (3D)-printed ergonomic grip on an airway manikin. Forty second-year medical students were enrolled. Endotracheal intubation was attempted using a conventional Macintosh laryngoscope with or without a 3D-printed ergonomic support grip. Primary outcomes were intubation time and overall success rate. Secondary outcomes were the number of unsuccessful attempts, first-attempt success rate, airway Cormack-Lehane (CL) grade, and difficulty score. In the easy airway scenario, intubation time and the overall success rate were similar between the two groups. CL grade and ease-of-use scores were significantly better for those using the ergonomic support grip (P < 0.05). In the difficult airway scenario, intubation time (49.7±37.5 vs. 35.5±29.2, P = 0.013), the first-attempt success rate (67.5% vs. 90%, P = 0.029), number of attempts (1.4±0.6 vs. 1.1±0.4, P = 0.006), CL grade (2 [2, 2] vs. 2 [1, 1], P = 0.012), and ease-of-use scores (3.5 [2, 4] vs. 4 [3, 5], P = 0.008) were significantly better for those using the ergonomic support grip. Linear mixed model analysis showed that the ergonomic support grip had a favorable effect on CL grade (P<0.001), ease-of-use scores (P<0.001), intubation time (P = 0.015), and number of intubation attempts (P = 0.029). Our custom 3D-printed ergonomic laryngoscope support grip improved several indicators related to successful endotracheal intubation in the easy and difficult airway scenarios simulated on an airway manikin. This grip may be useful for intubation training and practice.

An efficient outlier removal method for scattered point cloud data
https://www.researchpad.co/article/5b6da1b2463d7e4dccc5faec

Outlier removal is a fundamental data processing task to ensure the quality of scanned point cloud data (PCD), which is becoming increasingly important in industrial applications and reverse engineering. Acquired scanned PCD is usually noisy, sparse and temporally incoherent, so the processing of scanned data is typically an ill-posed problem. In this paper, we present a simple and effective method based on two geometrical-characteristic constraints to trim the noisy points. One of the geometrical characteristics is the local density information and the other is the deviation from the local fitting plane. The local-density-based method provides a preprocessing step that removes sparse and isolated outliers. The removal of non-isolated outliers relies on a local projection method, which places points onto the underlying object surface; the deviation of a point from the local fitting plane then serves as the criterion for removing the remaining noisy points. The experimental results demonstrate the method's ability to remove noisy points from scans of various man-made objects containing complex outliers.
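A hypothetical R sketch of the two constraints described above, for an n × 3 point matrix P: a density criterion based on the mean distance to the k nearest neighbours, and the distance of each point from a plane fitted to its neighbourhood by PCA. The value of k, the threshold multipliers and the brute-force distance matrix are illustrative simplifications, not the authors' implementation.

```r
## Hypothetical two-criterion outlier filter for an n x 3 point matrix P.
remove_outliers <- function(P, k = 10, dens_mult = 2, plane_mult = 2) {
  D <- as.matrix(dist(P))
  knn_idx  <- t(apply(D, 1, function(d) order(d)[2:(k + 1)]))   # k nearest neighbours (excl. self)
  knn_dist <- sapply(seq_len(nrow(P)), function(i) mean(D[i, knn_idx[i, ]]))
  keep_density <- knn_dist < dens_mult * median(knn_dist)        # drop sparse / isolated points

  plane_dist <- sapply(seq_len(nrow(P)), function(i) {
    nb  <- P[knn_idx[i, ], , drop = FALSE]
    ctr <- colMeans(nb)
    nrm <- prcomp(nb)$rotation[, 3]                              # plane normal = least-variance axis
    abs(sum((P[i, ] - ctr) * nrm))                               # point-to-plane distance
  })
  keep_plane <- plane_dist < plane_mult * median(plane_dist)

  P[keep_density & keep_plane, , drop = FALSE]
}
```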
