Search results
(21 - 40 of 49)
- Title
- An Iterative Method Converging to a Positive Solution of Certain Systems of Polynomial Equations
- Date
- 2011
- Description
-
We present a numerical algorithm for finding real non-negative solutions to a certain class of polynomial equations. Our methods are based on the expectation maximization and iterative proportional fitting algorithms, which are used in statistics to find maximum likelihood parameters for certain classes of statistical models. Since our algorithm works by iteratively improving an approximate solution, we find approximate solutions in the cases when there are no exact solutions, such as overconstrained systems.
- Collection
- Journal of Algebraic Statistics
- Title
- Properties of semi-elementary imsets as sums of elementary imsets
- Date
- 2011
- Description
-
We study properties of semi-elementary imsets and elementary imsets introduced by Studeny [10]. The rules of the semi-graphoid axiom (decomposition, weak union and contraction) for conditional independence statements can be translated into a simple identity among three semi-elementary imsets. By recursively applying the identity, any semi-elementary imset can be written as a sum of elementary imsets, which we call a representation of the semi-elementary imset. A semi-elementary imset has many representations. We study properties of the set of possible representations of a semi-elementary imset and prove that all representations are connected by relations among four elementary imsets.
- Collection
- Journal of Algebraic Statistics
- Title
- Hilbert Polynomial of the Kimura 3-Parameter Model, AS2012 Special Volume, part 1: This issue includes a second series of papers from talks, posters and collaborations resulting from and inspired by the Algebraic Statistics in the Alleghenies Conference at Penn State, which took place in July 2012.
- Description
-
In [2] Buczyńska and Wiśniewski showed that the Hilbert polynomial of the algebraic variety associated to the Jukes-Cantor binary model on a trivalent tree depends only on the number of leaves of the tree and not on its shape. We ask if this can be generalized to other group-based models. The Jukes-Cantor binary model has Z2 as the underlying group. We consider the Kimura 3-parameter model with Z2 × Z2 as the underlying group. We show that the generalization of the statement about the Hilbert polynomials to the Kimura 3-parameter model is not possible as the Hilbert polynomial depends on the shape of a trivalent tree.
- Collection
- Journal of Algebraic Statistics
- Title
- Connectivity for 3 x 3 x K contingency tables
- Description
-
We consider a sequential exact conditional test of no interaction for three-way tables. At each time τ, the test uses as the conditional inference frame the set F(Hτ) of all tables with the same three two-way marginal tables as the obtained table Hτ. For 3 × 3 × K tables, we propose a method to construct F(Hτ) from F(Hτ−1). This enables us to perform the sequential exact conditional test efficiently. The subset Sτ of F(Hτ) consisting of s + Hτ − Hτ−1 for s ∈ F(Hτ−1) contains Hτ, where the operations + and − are defined elementwise. Our argument is based on the minimal Markov basis for 3 × 3 × K contingency tables, and we give a minimal subset M of some Markov basis which has the property that F(Hτ) = {s − m | s ∈ Sτ, m ∈ M}.
- Collection
- Journal of Algebraic Statistics
- Title
- Detecting epistasis via Markov bases
- Date
- 2011
- Description
-
Rapid research progress in genotyping techniques has allowed large genome-wide association studies. Existing methods often focus on determining associations between single loci and a specific phenotype. However, a particular phenotype is usually the result of complex relationships between multiple loci and the environment. In this paper, we describe a two-stage method for detecting epistasis by combining the traditionally used single-locus search with a search for multiway interactions. Our method is based on an extended version of Fisher’s exact test. To perform this test, a Markov chain is constructed on the space of multidimensional contingency tables using the elements of a Markov basis as moves. We test our method on simulated data and compare it to a two-stage logistic regression method and to a fully Bayesian method, showing that we are able to detect the interacting loci when other methods fail to do so. Finally, we apply our method to a genome-wide data set consisting of 685 dogs and identify epistasis associated with canine hair length for four pairs of single nucleotide polymorphisms (SNPs).
- Collection
- Journal of Algebraic Statistics
- Title
- Open Problems on Connectivity of Fibers with Positive Margins in Multi-dimensional Contingency Tables
- Date
- 2010
- Description
-
Diaconis and Sturmfels developed an algorithm for sampling from conditional distributions for a statistical model of discrete exponential families, based on the algebraic theory of toric ideals. This algorithm is applied to categorical data analysis through the notion of Markov bases. Since its first application to the Markov chain Monte Carlo approach for testing the fit of a given model, many researchers have extensively studied the structure of Markov bases for models in computational algebraic statistics. In this approach, a Markov basis is a set of moves connecting all contingency tables satisfying the given margins. Despite the computational advances, there are applied problems where one may never be able to compute a Markov basis. In general, the number of elements in a minimal Markov basis for a model can be exponentially large. Thus, it is important to compute a reduced set of moves that connect all tables instead of computing a full Markov basis. In some cases, such as logistic regression, positive margins are shown to allow a set of Markov connecting moves that is much simpler than the full Markov basis. Such a set is called a Markov subbasis under the assumption of positive margins. In this paper we summarize some computations of and open problems on Markov subbases for contingency tables with positive margins under specific models. We also develop algebraic methods for studying the connectivity of Markov moves under margin positivity, with the aim of developing Markov sampling methods for exact conditional inference in statistical models where the Markov basis is hard to compute.
- Collection
- Journal of Algebraic Statistics
- Title
- Generalized Fréchet Bounds for Cell Entries in Multidimensional Contingency Tables, Special Volume in honor of memory of S.E.Fienberg
- Description
-
We consider the lattice, L, of all subsets of a multidimensional contingency table and establish the properties of monotonicity and supermodularity for the marginalization function, n(·), on L. We derive from the supermodularity of n(·) some generalized Fréchet inequalities complementing and extending inequalities of Dobra and Fienberg. Further, we construct new monotonic and supermodular functions from n(·), and we remark on the connection between supermodularity and some correlation inequalities for probability distributions on lattices. We also apply an inequality of Ky Fan to derive a new approach to Fréchet inequalities for multidimensional contingency tables.
- Collection
- Journal of Algebraic Statistics
- Title
- Maximal Length Projections in Group Algebras with Applications to Linear Rank Tests of Uniformity
- Description
-
Let G be a finite group, let CG be the complex group algebra of G, and let p ∈ CG. In this paper, we show how to construct submodules S of CG of a fixed dimension with the property that the orthogonal projection of p onto S has maximal length. We then provide an example of how such submodules for the symmetric group Sn can be used to create new linear rank tests of uniformity in statistics for survey data that arises when respondents are asked to give a complete ranking of n items.
- Collection
- Journal of Algebraic Statistics
- Title
- Matrix Completion for the Independence Model
- Description
-
We investigate the problem of completing partial matrices to rank-one matrices in the standard simplex Δ_{mn−1}. The motivation for studying this problem comes from statistics: A lack of eligible completion can provide a falsification test for partial observations to come from the independence model. For each pattern of specified entries, we give equations and inequalities which are satisfied if and only if an eligible completion exists. We also describe the set of valid completions, and we optimize over this set.
- Collection
- Journal of Algebraic Statistics
- Title
- One example of general unidentifiable tensors
- Description
-
The identifiability of parameters in a probabilistic model is a crucial notion in statistical inference. We prove that a general tensor of rank 8 in C^3 ⊗ C^6 ⊗ C^6 has at least 6 decompositions as a sum of simple tensors, so it is not 8-identifiable. This is the highest known example of balanced tensors of dimension 3, which are not k-identifiable, when k is smaller than the generic rank.
- Collection
- Journal of Algebraic Statistics
- Title
- L-cumulants, L-cumulant embeddings and algebraic statistics, AS2012 Special Volume, part 1: This issue includes a second series of papers from talks, posters and collaborations resulting from and inspired by the Algebraic Statistics in the Alleghenies Conference at Penn State, which took place in July 2012.
- Description
-
Focusing on the discrete probabilistic setting we generalize the combinatorial definition of cumulants to L-cumulants. This generalization keeps all the desired properties of the classical cumulants like semi-invariance and vanishing for independent blocks of random variables. These properties make L-cumulants useful for the algebraic analysis of statistical models. We illustrate this for general Markov models and hidden Markov processes in the case when the hidden process is binary. The main motivation of this work is to understand cumulant-like coordinates in algebraic statistics and to give a more insightful explanation why tree cumulants give such an elegant description of binary hidden tree models. Moreover, we argue that L-cumulants can be used in the analysis of certain classical algebraic varieties.
- Collection
- Journal of Algebraic Statistics
- Title
- Estimation for Dyadic-Dependent Exponential Random Graph Models
- Date
- 2014-04-30
- Description
-
Graphs are the primary mathematical representation for networks, with nodes or vertices corresponding to units (e.g., individuals) and edges corresponding to relationships. Exponential Random Graph Models (ERGMs) are widely used for describing network data because of their simple structure as an exponential function of a sum of parameters multiplied by their corresponding sufficient statistics. As with other exponential family settings the key computational difficulty is determining the normalizing constant for the likelihood function, a quantity that depends only on the data. In ERGMs for network data, the normalizing constant in the model often makes the parameter estimation intractable for large graphs, when the model involves dependence among dyads in the graph. One way to deal with this problem is to approximate the likelihood function by something tractable, e.g., by using the method of pseudo-likelihood estimation suggested in the early literature. In this paper, we describe the family of ERGMs and explain the increasing complexity that arises from imposing different edge dependence and homogeneous parameter assumptions. We then compare maximum likelihood (ML) and maximum pseudo-likelihood (MPL) estimation schemes with respect to existence and related degeneracy properties for ERGMs involving dependencies among dyads.
- Collection
- Journal of Algebraic Statistics
- Title
- Generic Identification of Binary-Valued Hidden Markov Processes
- Date
- 2014-04-30
- Description
-
The generic identification problem is to decide whether a stochastic process (X_t) is a hidden Markov process and, if so, to infer its parameters for all but a subset of parametrizations that form a lower-dimensional subvariety in parameter space. Partial answers so far available depend on extra assumptions on the processes, which are usually centered around stationarity. Here we present a general solution for binary-valued hidden Markov processes. Our approach is rooted in algebraic statistics and hence is geometric in nature. We find that the algebraic varieties associated with the probability distributions of binary-valued hidden Markov processes are zero sets of determinantal equations, which draws a connection to well-studied objects from algebra. As a consequence, our solution allows for algorithmic implementation based on elementary (linear) algebraic routines.
- Collection
- Journal of Algebraic Statistics
- Title
- Binary hidden Markov models and varieties, AS2012 Special Volume, part 2: This issue includes a second series of papers from talks, posters and collaborations resulting from and inspired by the Algebraic Statistics in the Alleghenies Conference at Penn State, which took place in July 2012.
- Date
- 2013
- Description
-
This paper closely examines HMMs in which all the hidden random variables are binary. Its main contributions are (1) a birational parametrization for every such HMM, with an explicit inverse for recovering the hidden parameters in terms of observables, (2) a semialgebraic model membership test for every such HMM, and (3) minimal defining equations for the 4-node fully binary model, comprising 21 quadrics and 29 cubics, which were computed using Gröbner bases in the cumulant coordinates of Sturmfels and Zwiernik. The new model parameters in (1) are rationally identifiable in the sense of Sullivant, Garcia-Puente, and Spielvogel, and each model's Zariski closure is therefore a rational projective variety of dimension 5. Gröbner basis computations for the model and its graph are found to be considerably faster using these parameters. In the case of two hidden states, item (2) supersedes a previous algorithm of Schönhuth which is only generically defined, and the defining equations (3) yield new invariants for HMMs of all lengths ≥ 4. Such invariants have been used successfully in model selection problems in phylogenetics, and one can hope for similar applications in the case of HMMs.
- Collection
- Journal of Algebraic Statistics
- Title
- Learning Coefficient in Bayesian Estimation of Restricted Boltzmann Machine, AS2012 Special Volume, part 2: This issue includes a second series of papers from talks, posters and collaborations resulting from and inspired by the Algebraic Statistics in the Alleghenies Conference at Penn State, which took place in July 2012.
- Date
- 2013
- Description
-
We consider the real log canonical threshold for the learning model in Bayesian estimation. This threshold corresponds to a learning coefficient of generalization error in Bayesian estimation, which serves to measure learning efficiency in hierarchical learning models [30, 31, 33]. In this paper, we clarify the ideal which gives the log canonical threshold of the restricted Boltzmann machine and consider the learning coefficients of this model.
- Collection
- Journal of Algebraic Statistics
- Title
- The precision space of interpolatory cubature formulæ
- Date
- 2015-06-11
- Description
-
Methods from Commutative Algebra and Numerical Analysis are combined to address a problem common to many disciplines: the estimation of the expected value of a polynomial of a random vector using a linear combination of a finite number of its values. In this work we remark on the error estimation in cubature formulæ for polynomial functions and introduce the notion of a precision space for a cubature rule.
- Collection
- Journal of Algebraic Statistics
- Title
- The degeneration of the Grassmannian into a toric variety and the calculation of the eigenspaces of a torus action
- Date
- 2015-06-11
- Description
-
Using the method of degenerating a Grassmannian into a toric variety, we calculate formulas for the dimensions of the eigenspaces of the action of an n-dimensional torus on a Grassmannian of planes in an n-dimensional space.
- Collection
- Journal of Algebraic Statistics
- Title
- Varieties with maximum likelihood degree one
- Date
- 2014-04-30
- Description
-
We show that algebraic varieties with maximum likelihood degree one are exactly the images of reduced A-discriminantal varieties under monomial maps with finite fibers. The maximum likelihood estimator corresponding to such a variety is Kapranov’s Horn uniformization. This extends Kapranov’s characterization of A-discriminantal hypersurfaces to varieties of arbitrary codimension.
- Collection
- Journal of Algebraic Statistics
- Title
- Maximum Likelihood for Matrices with Rank Constraints
- Date
- 2014-04-30
- Description
-
Maximum likelihood estimation is a fundamental optimization problem in statistics. We study this problem on manifolds of matrices with bounded rank. These represent mixtures of distributions of two independent discrete random variables. We determine the maximum likelihood degree for a range of determinantal varieties, and we apply numerical algebraic geometry to compute all critical points of their likelihood functions. This led to the discovery of maximum likelihood duality between matrices of complementary ranks, a result proved subsequently by Draisma and Rodriguez.
- Collection
- Journal of Algebraic Statistics
- Title
- Uncovering Proximity of Chromosome Territories using Classical Algebraic Statistics
- Date
- 2015-11-09
- Description
-
Exchange type chromosome aberrations (ETCAs) are rearrangements of the genome that occur when chromosomes break and the resulting fragments rejoin with fragments from other chromosomes or from other regions within the same chromosome. ETCAs are commonly observed in cancer cells and in cells exposed to radiation. The frequency of these chromosome rearrangements is correlated with their spatial proximity and can therefore be used to infer the three-dimensional organization of the genome. Extracting statistical significance of spatial proximity from cancer and radiation data has remained somewhat elusive because of the sparsity of the data. Here we propose a new approach to study the three-dimensional organization of the genome using algebraic statistics. We test our method on a published data set of irradiated human blood lymphocyte cells. We provide a rigorous method for testing the overall organization of the genome, and in agreement with previous results we find a random relative positioning of chromosomes, with the exception of the chromosome pairs {1,22} and {13,14}, which have a significantly larger number of ETCAs than the rest of the chromosome pairs, suggesting their spatial proximity. We conclude that algebraic methods can successfully be used to analyze genetic data and have potential applications to larger and more complex data sets.
- Collection
- Journal of Algebraic Statistics