Search results
(1 - 2 of 2)
- Title
- Advances in Machine Learning: Theory and Applications in Time Series Prediction
- Creator
- London, Justin J.
- Date
- 2021
- Description
-
A new time series modeling framework for forecasting, prediction and regime switching for recurrent neural networks (RNNs) using machine learning is introduced. In this framework, we replace the perceptron with an econometric modeling unit. This cell/unit is functionally dedicated to processing the prediction component from the econometric model. These supervised learning methods overcome the parameter estimation and convergence problems of traditional econometric autoregression (AR) models that use MLE and expectation-maximization (EM) methods, which are computationally expensive, assume linearity and Gaussian distributed errors, and suffer from the curse of dimensionality. Consequently, due to these estimation problems and the lower number of lags that can be estimated, AR models are limited in their ability to capture long memory or dependencies. On the other hand, plain RNNs suffer from the vanishing gradient problem, which also limits their ability to have long memory. We introduce a new class of RNN models, the $\alpha$-RNN and dynamic $\alpha_{t}$-RNN, which do not suffer from these problems because they utilize an exponential smoothing parameter. We also introduce MS-RNNs, MS-LSTMs, and MS-GRUs, novel models that overcome the limitations of MS-AR models while still enabling regime (Markov) switching and detection of structural breaks in the data. These models have long memory, can handle non-linear dynamics, and do not require data stationarity or assume error distributions. Thus, they make no assumptions about the data-generating process and better capture temporal dependencies, leading to better forecasting and prediction accuracy than traditional econometric models and plain RNNs. Yet the partial autocorrelation function and econometric tools, such as the ADF, Ljung-Box, and AIC test statistics, can still be used to determine optimal sequence lag lengths to input into these RNN models and to diagnose serial correlation.
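The exponential-smoothing idea behind the $\alpha$-RNN can be sketched minimally: the hidden state becomes a convex combination of a plain-RNN candidate and the previous state, so a small $\alpha$ lets old information persist. This is an illustrative reading of the abstract only; the function name and the tanh candidate update are assumptions, not the author's exact formulation.

```python
import numpy as np

def alpha_rnn_step(x_t, h_prev, W_x, W_h, b, alpha):
    """One hypothetical alpha-RNN step: a plain-RNN candidate update,
    exponentially smoothed with the previous hidden state."""
    h_tilde = np.tanh(W_x @ x_t + W_h @ h_prev + b)   # plain RNN candidate
    return alpha * h_tilde + (1.0 - alpha) * h_prev   # exponential smoothing

# Toy run: 2-d inputs, 3-d hidden state, short random sequence.
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.5, size=(3, 2))   # input-to-hidden weights
W_h = rng.normal(scale=0.5, size=(3, 3))   # hidden-to-hidden weights
b = np.zeros(3)
h = np.zeros(3)
for x_t in rng.normal(size=(5, 2)):
    h = alpha_rnn_step(x_t, h, W_x, W_h, b, alpha=0.3)
```

With $\alpha$ close to 0 the state barely moves per step (long memory); with $\alpha = 1$ the cell reduces to a plain RNN.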
The new framework has the capacity to characterize the non-linear partial autocorrelation of a time series and directly capture dynamic effects such as trends and seasonality. The optimal sequence lag order can greatly influence prediction performance on test data. This structure also provides more interpretability to ML models, since traditional econometric models are embedded into RNNs. The ability to embed econometric models into RNNs will allow firms to improve prediction accuracy over traditional econometric or purely ML models by creating a hybrid of a well-understood traditional econometric model and an ML model. In theory, the traditional econometric model should handle the portion of the estimation error that is best managed by a traditional model, and the ML component should focus on the non-linear portion. This combined structure is a step towards explainable AI and lays the framework for econometric AI.
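The lag-selection point can be illustrated with a small sketch: the sample partial autocorrelation function, computed via the Durbin-Levinson recursion, cuts off after lag $p$ for an AR($p$) process, which suggests a sequence length to feed the RNN. The helper below is a generic textbook construction (NumPy only), not code from the thesis.

```python
import numpy as np

def sample_pacf(x, nlags):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    # Biased sample autocovariances gamma_0 .. gamma_nlags.
    gamma = np.array([np.dot(xc[:n - h], xc[h:]) / n for h in range(nlags + 1)])
    pacf = np.zeros(nlags + 1)
    pacf[0] = 1.0
    phi = np.zeros((nlags + 1, nlags + 1))
    v = gamma[0]                      # innovation variance, updated each step
    for k in range(1, nlags + 1):
        phi[k, k] = (gamma[k] - phi[k - 1, 1:k] @ gamma[1:k][::-1]) / v
        for j in range(1, k):
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
        v *= 1.0 - phi[k, k] ** 2
        pacf[k] = phi[k, k]
    return pacf

# Simulated AR(1) with coefficient 0.8: the PACF should cut off after lag 1.
rng = np.random.default_rng(1)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.8 * x[t - 1] + rng.normal()
p = sample_pacf(x, 5)
```

Lags whose sample PACF lies outside roughly $\pm 2/\sqrt{n}$ are the candidates for the input sequence length.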
- Title
- Fast Automatic Bayesian Cubature Using Matching Kernels and Designs
- Creator
- Rathinavel, Jagadeeswaran
- Date
- 2019
- Description
-
Automatic cubatures approximate multidimensional integrals to user-specified error tolerances. In many real-world integration problems, the analytical solution is either unavailable or difficult to compute. To overcome this, one can use numerical algorithms that approximately estimate the value of the integral. For high-dimensional integrals, quasi-Monte Carlo (QMC) methods are very popular. QMC methods are equal-weight quadrature rules where the quadrature points are chosen deterministically, unlike Monte Carlo (MC) methods where the points are chosen randomly. The families of integration lattice nodes and digital nets are the most popular quadrature points used. These methods consider the integrand to be a deterministic function. An alternative approach, called Bayesian cubature, postulates the integrand to be an instance of a Gaussian stochastic process. For high-dimensional problems, it is difficult to adaptively change the sampling pattern, but one can automatically determine the sample size, $n$, given a fixed and reasonable sampling pattern. We take this approach from a Bayesian perspective. We assume a Gaussian process parameterized by a constant mean and a covariance function defined by a scale parameter and a function specifying how the integrand values at two different points in the domain are related. These parameters are estimated from integrand values or are given non-informative priors. This leads to a credible interval for the integral. The sample size, $n$, is chosen to make the credible interval for the Bayesian posterior error no greater than the desired error tolerance. However, the process just outlined typically requires vector-matrix operations with a computational cost of $O(n^3)$. Our innovation is to pair low discrepancy nodes with matching kernels, which lowers the computational cost to $O(n \log n)$.
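The cost reduction from pairing nodes with matching kernels can be seen in one dimension: for lattice nodes $x_i = i/n$ and a shift-invariant kernel $K(x, y) = k((x - y) \bmod 1)$, the Gram matrix is circulant, so the FFT diagonalizes it and linear solves cost $O(n \log n)$ instead of $O(n^3)$. The Bernoulli-polynomial kernel below is a standard choice for lattice rules, assumed here purely for illustration, not necessarily the kernel used in the thesis.

```python
import numpy as np

n = 64
x = np.arange(n) / n                          # 1-d rank-1 lattice nodes

# Shift-invariant kernel built from the degree-2 Bernoulli polynomial.
B2 = lambda t: (t % 1.0) ** 2 - (t % 1.0) + 1.0 / 6.0
k = lambda t: 1.0 + 2.0 * np.pi ** 2 * B2(t)

# Gram matrix C[i, j] = k((x_i - x_j) mod 1) is circulant with first column c.
C = k(x[:, None] - x[None, :])                # full matrix, only for checking
c = k(x)                                      # its first column

y = np.sin(2 * np.pi * x)                     # integrand values at the nodes
lam = np.fft.fft(c)                           # eigenvalues of the circulant C
w = np.real(np.fft.ifft(np.fft.fft(y) / lam)) # solves C w = y in O(n log n)
```

The same diagonalization speeds up every Gram-matrix operation in the cubature (determinants, quadratic forms), which is the essence of the fast Bayesian transform; for Sobol' nodes the Walsh kernels play the analogous role with the fast Walsh-Hadamard transform.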
We begin the thesis by introducing the Bayesian approach to calculating the posterior cubature error and defining our automatic Bayesian cubature. Although much of this material is known, it is used to develop the necessary foundations. Some of the major contributions of this thesis include the following: 1) The fast Bayesian transform is introduced. This generalizes the techniques that speed up Bayesian cubature when the kernel matches the low discrepancy nodes. 2) The fast Bayesian transform approach is demonstrated using two methods: a) rank-1 lattice sequences and shift-invariant kernels, and b) Sobol' sequences and Walsh kernels. These two methods are implemented as fast automatic Bayesian cubature algorithms in the Guaranteed Automatic Integration Library (GAIL). 3) We develop additional numerical implementation techniques: a) rewriting the covariance kernel to avoid cancellation error, b) gradient descent for the hyperparameter search, and c) non-integer kernel order selection. The thesis concludes by applying our fast automatic Bayesian cubature algorithms to three sample integration problems. We show that our algorithms are faster than basic Bayesian cubature and that they provide answers within the error tolerance in most cases. The Bayesian cubatures that we develop are guaranteed for integrands belonging to a cone of functions that resides in the middle of the sample space. The concept of a cone of functions is also explained briefly.