BDSI Interactive Seminar Series

Models, Features and Dimension Reduction for Biological Data

Dr Terry Neeman (BDSI) 

Exploring high-dimensional data in biology recalls the beautiful statistical ideas in multivariate techniques from the 20th century. Principal components analysis (PCA) has long been the workhorse for multivariate data visualisation. PCA is a straightforward algorithm: a linear projection of data onto a lower-dimensional subspace that optimally preserves information (variation) in the data. In a 1999 paper, Tipping and Bishop suggested re-imagining PCA in a model-based framework, a statistical model being, after all, a way of representing the relevant and important signals in the data. Model-based PCA opens up more possibilities in dimension reduction when working with “difficult” data: missing values, mixtures of Gaussians, and non-normal data. In this talk, we’ll explore PCA from a dual perspective (sample space and feature space), and introduce the main results from model-based or probabilistic PCA.

Reference: Tipping, M. E. and Bishop, C. M., Probabilistic Principal Component Analysis, Journal of the Royal Statistical Society. Series B (Statistical Methodology), Vol. 61, No. 3 (1999), pp. 611-622
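
For readers who would like a concrete picture before the talk, the two ideas in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not material from the seminar: the function names `pca_project` and `ppca_ml` are hypothetical, the data are simulated, and the rotation in the probabilistic PCA solution is assumed to be the identity. The first function performs classical PCA as a linear projection via the singular value decomposition; the second computes the closed-form maximum-likelihood estimates of the loading matrix and noise variance described by Tipping and Bishop.

```python
import numpy as np


def pca_project(X, q):
    """Classical PCA: project the rows of X onto the top-q principal axes."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:q]                          # principal axes, shape (q, d)
    scores = Xc @ components.T                   # projected data, shape (n, q)
    explained_var = s**2 / (X.shape[0] - 1)      # variance along every axis
    return scores, components, explained_var


def ppca_ml(X, q):
    """Closed-form ML estimates for probabilistic PCA (requires q < d).

    Assumes the arbitrary rotation in the solution is the identity.
    """
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                            # ML (1/n) sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)         # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()                  # average discarded variance
    scale = np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    W = eigvecs[:, :q] * scale                   # loading matrix, shape (d, q)
    return W, sigma2


# Simulated example: 200 samples, 6 features, reduced to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))
scores, components, explained_var = pca_project(X, q=2)
W, sigma2 = ppca_ml(X, q=2)
```

In this sketch the columns of `W` point along the same directions as the classical principal axes; what the model-based view adds is an explicit noise variance `sigma2`, which is what makes extensions to missing values and mixtures possible.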

MuseOmics: Unlocking the Vaults holding Historical Genetic and Gene Expression Data

Dr Erin Hahn (CERC Postdoctoral Fellow, Australian National Wildlife Collection, CSIRO)

Date & time

3–4am 16 March 2020


Eucalyptus Seminar Room, Level 2 RN Robertson Bldg, ANU


Terry Neeman, ANU
Erin Hahn, CSIRO


 Jo Bayley

