Non-parametric probabilistic machine learning methodologies suited to real-world data

Roy, Gargi

Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/31173

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Chakrabarty, D	-
dc.contributor.advisor	Lim, J W	-
dc.contributor.author	Roy, Gargi	-
dc.date.accessioned	2025-05-06T17:23:11Z	-
dc.date.available	2025-05-06T17:23:11Z	-
dc.date.issued	2025	-
dc.identifier.uri	http://bura.brunel.ac.uk/handle/2438/31173	-
dc.description	This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London	en_US
dc.description.abstract	Recent years have seen a massive increase in the use of supervised learning for prediction tasks in various real-world applications. In supervised learning, the relationship between two variables - an input and an output - is sought. Although several learning models exist, learning this relation from real-world data poses several challenges owing to characteristics of such data: noise in the observations; non-stationarity in the data; and inhomogeneities in the correlations between different output pairs that are realised at inputs located differently in the space of the input variable. Moreover, real-world data can be high-dimensional, where both the input and the output can be tensor-valued in general. These challenges induce the following desirables in the prediction that is performed after such learning is undertaken: fast prediction that is accurate, uncertainty-included and reliable, as well as low in computational complexity, for easy implementation. Additionally, the prediction exercise - preceded by the learning - needs to be scalable to high dimensions. Although some of the existing learning models address a subset of the challenges given non-stationary data (towards reliable, uncertainty-included predictions), they typically require learning of a large number of hyperparameters, making these learning techniques computationally intensive. Also, they do not come with easy-to-implement algorithms, which makes these models infeasible for applications using medium-sized real-world data. Furthermore, some of the existing models are designed for dedicated applications, using domain-specific model assumptions, the generalisability of which outside those domains can be questioned.
This thesis addresses challenges of real-world data, and presents applications of a generic, completely non-parametric learning model that is reliable, accurate, parsimonious, and works given non-stationary data that can be high-dimensional in general. Equipped with an easy-to-implement algorithm, such a learning technique overcomes the limitations of existing models. More precisely, this thesis attempts to demonstrate reliable learning of a function (that represents the relation between a pair of random variables), by modelling this function as a sample function of a Gaussian Process. Such learning is followed by fast prediction of the output that is realised at test inputs, where said prediction offers closed-form mean and variance of this output. In fact, in this approach, the predictions that follow the learning of the inter-variable relation follow from the identification of the posterior predictive distribution of outputs realised at test inputs. This approach is illustrated with applications in various domains such as finance, energy consumption and astrophysics, where the data are inhomogeneously correlated and have diverse dimensions. From the finance sector, real-world time-series data has been considered, where both the input and the output are scalar-variate. In the real-world energy consumption data, the input is vector-variate and the output is a scalar. The astrophysics application uses astronomical simulation data where the input is a vector and the output is a matrix, making the sought function high-dimensional. Inference is undertaken throughout my doctoral work using Markov chain Monte Carlo (MCMC) sampling techniques. This thesis also highlights the sensitivity of predictions achieved with Deep Neural Networks (DNN) to the architecture of the DNNs. The chapters are organised as follows. The first chapter introduces the topic.
The second chapter discusses various MCMC techniques and illustrates these inference techniques by performing parametric learning with a small real-world dataset. The third chapter introduces the background of Gaussian Process (GP) based learning, and the application of GP-based supervised learning for efficient learning of uncertainty with an under-constraint MCMC for prediction. A probabilistic, non-parametric, non-stationary, parsimonious learning strategy is presented in the fourth chapter, along with results on applying the model and on comparison against existing models. This application is relevant to the case in which both the input and the output are scalar-variate and the data are inhomogeneously correlated. The fifth chapter includes the application of the learning strategy to multivariate (with vector input and scalar output), inhomogeneously correlated real-world data. This chapter also discusses some ideas about inhomogeneities in the correlation structure of the training data and the DNN exposition in both univariate and multivariate setups. An application of the learning of a high-dimensional function is discussed in the sixth chapter. In this application, the input is a vector and the output is a matrix. Prediction of the output at a test input vector is then presented. Finally, the thesis is concluded in chapter seven. The first appendix includes the application of the presented non-parametric learning strategy towards forecasting, along with a new strategy for designing priors for forecasting with real-world inhomogeneously correlated data. The second appendix includes some of the results of inference performed with the various MCMC techniques included in chapter two.	en_US
dc.description.sponsorship	EPSRC & the Prachi Dwivedi award	en_US
dc.publisher	Brunel University London	en_US
dc.relation.uri	http://bura.brunel.ac.uk/handle/2438/31173/1/FulltextThesis.pdf	-
dc.subject	Computational Statistics	en_US
dc.subject	Inhomogeneous Data	en_US
dc.subject	Non-stationary	en_US
dc.subject	Bayesian	en_US
dc.subject	Markov Chain Monte Carlo	en_US
dc.title	Non-parametric probabilistic machine learning methodologies suited to real-world data	en_US
dc.type	Thesis	en_US
Appears in Collections:	Dept of Mathematics Theses Mathematical Sciences

Files in This Item:

File	Description	Size	Format
FulltextThesis.pdf		24.03 MB	Adobe PDF
