Search results
(41 - 49 of 49)
- Title
- Detecting epistasis via Markov bases
- Date
- 2011
- Description
-
Rapid research progress in genotyping techniques has enabled large genome-wide association studies. Existing methods often focus on determining associations between single loci and a specific phenotype. However, a particular phenotype is usually the result of complex relationships between multiple loci and the environment. In this paper, we describe a two-stage method for detecting epistasis by combining the traditionally used single-locus search with a search for multiway interactions. Our method is based on an extended version of Fisher’s exact test. To perform this test, a Markov chain is constructed on the space of multidimensional contingency tables using the elements of a Markov basis as moves. We test our method on simulated data and compare it to a two-stage logistic regression method and to a fully Bayesian method, showing that we are able to detect the interacting loci when other methods fail to do so. Finally, we apply our method to a genome-wide data set consisting of 685 dogs and identify epistasis associated with canine hair length for four pairs of single nucleotide polymorphisms (SNPs).
- Collection
- Journal of Algebraic Statistics
- Title
- Open Problems on Connectivity of Fibers with Positive Margins in Multi-dimensional Contingency Tables
- Date
- 2010
- Description
-
Diaconis and Sturmfels developed an algorithm for sampling from conditional distributions for discrete exponential family models, based on the algebraic theory of toric ideals. This algorithm is applied to categorical data analysis through the notion of Markov bases. Since its first application to Markov chain Monte Carlo tests of model fit, many researchers have extensively studied the structure of Markov bases for models in computational algebraic statistics. In this Markov chain Monte Carlo approach, a Markov basis is a set of moves connecting all contingency tables satisfying the given margins. Despite computational advances, there are applied problems where one may never be able to compute a Markov basis. In general, the number of elements in a minimal Markov basis for a model can be exponentially large. Thus, it is important to compute a reduced set of moves that connect all tables instead of computing a full Markov basis. In some cases, such as logistic regression, positive margins are shown to allow a set of Markov connecting moves that is much simpler than the full Markov basis. Such a set is called a Markov subbasis under the assumption of positive margins. In this paper we summarize some computations of, and open problems on, Markov subbases for contingency tables with positive margins under specific models. We also develop algebraic methods for studying the connectivity of Markov moves under margin positivity, with the aim of developing Markov sampling methods for exact conditional inference in statistical models where the Markov basis is hard to compute.
- Collection
- Journal of Algebraic Statistics
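The idea of a Markov basis as "a set of moves connecting all contingency tables satisfying the given margins" can be illustrated on the simplest case. The sketch below is not taken from the paper: it assumes a two-way table under the independence model, where the basic 2x2 swap moves are known to form a Markov basis, and the function names (`basic_moves`, `reachable_tables`, `fiber_by_brute_force`) are hypothetical. It compares the set of tables reached by the moves with the full fiber found by brute force.

```python
from itertools import product


def basic_moves(n_rows, n_cols):
    """Basic +1/-1 moves on 2x2 minors; each leaves all row and column sums unchanged."""
    moves = []
    for i in range(n_rows):
        for k in range(i + 1, n_rows):
            for j in range(n_cols):
                for l in range(j + 1, n_cols):
                    for sign in (1, -1):
                        move = [[0] * n_cols for _ in range(n_rows)]
                        move[i][j] = move[k][l] = sign
                        move[i][l] = move[k][j] = -sign
                        moves.append(tuple(tuple(row) for row in move))
    return moves


def reachable_tables(start, moves):
    """Tables reachable from `start` by applying moves while keeping every cell non-negative."""
    start = tuple(tuple(row) for row in start)
    seen, stack = {start}, [start]
    while stack:
        table = stack.pop()
        for move in moves:
            new = tuple(
                tuple(a + b for a, b in zip(t_row, m_row))
                for t_row, m_row in zip(table, move)
            )
            if new not in seen and all(x >= 0 for row in new for x in row):
                seen.add(new)
                stack.append(new)
    return seen


def fiber_by_brute_force(row_sums, col_sums):
    """All non-negative integer tables with the given margins (small tables only)."""
    n_rows, n_cols = len(row_sums), len(col_sums)
    bound = max(row_sums) + 1  # no cell can exceed its row sum
    fiber = set()
    for cells in product(range(bound), repeat=n_rows * n_cols):
        table = tuple(
            tuple(cells[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)
        )
        if [sum(row) for row in table] == list(row_sums) and \
           [sum(col) for col in zip(*table)] == list(col_sums):
            fiber.add(table)
    return fiber


start = [[2, 1, 0], [0, 1, 2]]  # row sums (3, 3), column sums (2, 2, 2)
connected = reachable_tables(start, basic_moves(2, 3))
print(connected == fiber_by_brute_force([3, 3], [2, 2, 2]))
```

For models where a full Markov basis is too large to compute, the same reachability check could in principle be run with a smaller candidate set of moves to see whether it still connects a given fiber; this is only feasible for small tables.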
- Title
- Generalized Fréchet Bounds for Cell Entries in Multidimensional Contingency Tables, Special Volume in honor of memory of S.E.Fienberg
- Description
-
We consider the lattice, L, of all subsets of a multidimensional contingency table and establish the properties of monotonicity and supermodularity for the marginalization function, n(·), on L. We derive from the supermodularity of n(·) some generalized Fréchet inequalities complementing and extending inequalities of Dobra and Fienberg. Further, we construct new monotonic and supermodular functions from n(·), and we remark on the connection between supermodularity and some correlation inequalities for probability distributions on lattices. We also apply an inequality of Ky Fan to derive a new approach to Fréchet inequalities for multidimensional contingency tables.
- Collection
- Journal of Algebraic Statistics
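For orientation, the classical Fréchet bounds that the generalized inequalities extend can be computed directly. The sketch below covers only the standard two-way case, max(0, r_i + c_j − n) ≤ n_ij ≤ min(r_i, c_j), and not the paper's generalized bounds; the function name `frechet_bounds` is a hypothetical choice.

```python
def frechet_bounds(row_sums, col_sums):
    """Classical Fréchet (lower, upper) bounds for each cell of a two-way table.

    Cell (i, j) of any non-negative table with these margins satisfies
    max(0, r_i + c_j - n) <= n_ij <= min(r_i, c_j), where n is the grand total.
    """
    n = sum(row_sums)
    if n != sum(col_sums):
        raise ValueError("row and column sums must have the same grand total")
    return [
        [(max(0, r + c - n), min(r, c)) for c in col_sums]
        for r in row_sums
    ]


# Margins (5, 1) by (5, 1): the top-left cell is forced to be 4 or 5.
for row in frechet_bounds([5, 1], [5, 1]):
    print(row)
```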
- Title
- Maximal Length Projections in Group Algebras with Applications to Linear Rank Tests of Uniformity
- Description
-
Let G be a finite group, let ℂG be the complex group algebra of G, and let p ∈ ℂG. In this paper, we show how to construct submodules S of ℂG of a fixed dimension with the property that the orthogonal projection of p onto S has maximal length. We then provide an example of how such submodules for the symmetric group S_n can be used to create new linear rank tests of uniformity in statistics for survey data that arise when respondents are asked to give a complete ranking of n items.
- Collection
- Journal of Algebraic Statistics
- Title
- Matrix Completion for the Independence Model
- Description
-
We investigate the problem of completing partial matrices to rank-one matrices in the standard simplex Δ^{mn−1}. The motivation for studying this problem comes from statistics: the lack of an eligible completion provides a falsification test for the hypothesis that partial observations come from the independence model. For each pattern of specified entries, we give equations and inequalities which are satisfied if and only if an eligible completion exists. We also describe the set of valid completions, and we optimize over this set.
- Collection
- Journal of Algebraic Statistics
- Title
- One example of general unidentifiable tensors
- Description
-
The identifiability of parameters in a probabilistic model is a crucial notion in statistical inference. We prove that a general tensor of rank 8 in ℂ^3 ⊗ ℂ^6 ⊗ ℂ^6 has at least 6 decompositions as a sum of simple tensors, so it is not 8-identifiable. This is the highest known example of balanced tensors of dimension 3 that are not k-identifiable when k is smaller than the generic rank.
- Collection
- Journal of Algebraic Statistics
- Title
- L-cumulants, L-cumulant embeddings and algebraic statistics, AS2012 Special Volume, part 1: This issue includes a second series of papers from talks, posters and collaborations resulting from and inspired by the Algebraic Statistics in the Alleghenies Conference at Penn State, which took place in July 2012.
- Description
-
Focusing on the discrete probabilistic setting, we generalize the combinatorial definition of cumulants to L-cumulants. This generalization keeps all the desired properties of the classical cumulants, such as semi-invariance and vanishing for independent blocks of random variables. These properties make L-cumulants useful for the algebraic analysis of statistical models. We illustrate this for general Markov models and hidden Markov processes in the case when the hidden process is binary. The main motivation of this work is to understand cumulant-like coordinates in algebraic statistics and to give a more insightful explanation of why tree cumulants give such an elegant description of binary hidden tree models. Moreover, we argue that L-cumulants can be used in the analysis of certain classical algebraic varieties.
- Collection
- Journal of Algebraic Statistics
- Title
- Hilbert Polynomial of the Kimura 3-Parameter Model, AS2012 Special Volume, part 1: This issue includes a second series of papers from talks, posters and collaborations resulting from and inspired by the Algebraic Statistics in the Alleghenies Conference at Penn State, which took place in July 2012.
- Description
-
In [2] Buczyńska and Wiśniewski showed that the Hilbert polynomial of the algebraic variety associated to the Jukes-Cantor binary model on a trivalent tree depends only on the number of leaves of the tree and not on its shape. We ask if this can be generalized to other group-based models. The Jukes-Cantor binary model has ℤ_2 as the underlying group. We consider the Kimura 3-parameter model with ℤ_2 × ℤ_2 as the underlying group. We show that the generalization of the statement about the Hilbert polynomials to the Kimura 3-parameter model is not possible, as the Hilbert polynomial depends on the shape of a trivalent tree.
- Collection
- Journal of Algebraic Statistics
- Title
- An Iterative Method Converging to a Positive Solution of Certain Systems of Polynomial Equations
- Date
- 2011
- Description
-
We present a numerical algorithm for finding real non-negative solutions to a certain class of polynomial equations. Our methods are based on the expectation-maximization and iterative proportional fitting algorithms, which are used in statistics to find maximum likelihood parameters for certain classes of statistical models. Since our algorithm works by iteratively improving an approximate solution, it finds approximate solutions in cases where there are no exact solutions, such as overconstrained systems.
- Collection
- Journal of Algebraic Statistics
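As background for the abstract above, here is a minimal sketch of classical iterative proportional fitting on a two-way table. It is not the paper's algorithm, only the standard statistical procedure the abstract says the method builds on. The function name `ipf`, the fixed iteration count, and the example seed and margins are all assumptions made for illustration.

```python
def ipf(seed, row_targets, col_targets, n_iter=100):
    """Classical iterative proportional fitting for a two-way table.

    Alternately rescales rows and columns of `seed` so that its margins
    approach `row_targets` and `col_targets` (which must share one total).
    """
    table = [[float(x) for x in row] for row in seed]
    for _ in range(n_iter):
        # Row step: scale each row to match its target sum.
        for i, target in enumerate(row_targets):
            current = sum(table[i])
            if current > 0:
                table[i] = [x * target / current for x in table[i]]
        # Column step: scale each column to match its target sum.
        for j, target in enumerate(col_targets):
            current = sum(row[j] for row in table)
            if current > 0:
                for row in table:
                    row[j] *= target / current
    return table


fitted = ipf([[1, 2], [3, 4]], row_targets=[0.4, 0.6], col_targets=[0.5, 0.5])
print([sum(row) for row in fitted])        # row sums, close to the row targets
print([sum(col) for col in zip(*fitted)])  # column sums, close to the column targets
```

Each step multiplies whole rows or whole columns by a constant, so the odds ratio of the seed table is preserved while the margins are adjusted.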
