Visiting Professor of Statistics, Department of Mathematics, Imperial College London. g.montana [at] imperial.ac.uk

Professor of Biostatistics and Bioinformatics

## Research

- Computational statistics & machine learning
- High-dimensional and big data mining
- Applications in biomedicine - bioinformatics and medical imaging

## Recent events

- Workshop on Genomic Data Integration - 1 March 2013 - Imperial College, London
- Industrial statistics event, Big Data theme - 25 March 2013 - Newton Institute, Cambridge
- Workshop on Big Data Mining - 14-15 May 2013 - Imperial College, London
- Computational Methods Workshop for Massive/Complex Data - 19-20 June 2014 - Imperial College, London

## Research group

- Michelle Krishnan (KCL)
- Alexandre de Brebisson (ICL)
- Petros-Pavlos Ypsilantis (KCL)
- Zhana Kuncheva (ICL)
- Dimosthenis Tsagkrasoulis (ICL)
- Ricardo Monti (ICL)
- Zi Wang (ICL)
- Ryan Ruan (ICL)

- Ai Chung (KCL)
- Chris Minas (IC)
- Rene Gaudoin (IC)

- Peter Nash - Research Associate
- Anand Pandit - PhD Student
- Chris Minas - PhD Student
- Matt Silver - PhD Student
- Orlando Dohering - MPhil Student
- Yue Wang - Visiting PhD Student (NUS)
- Maria Vounou - PhD Student, EPSRC and GSK Clinical Imaging Center
- Maurice Berk - PhD Student, Wellcome Trust
- Brian McWilliams - PhD Student, EPSRC
- Alberto Cozzini - PhD Student, AHL / Man Group
- Theo Tsagaris - PhD Student, Bluecrest Capital
- Mansour Sharabiani - Research Associate, NIHR
- Becky Inkster - Research Associate, VIP award
- Eva Janousova - Academic Visitor, Leonardo da Vinci award
- Francesco Parrella - Academic Visitor, Leonardo da Vinci award

## Preprints and selected publications

- Janousova E. et al (2014). Mapping of cognitive processes on subcortical volumes, cortical thickness and area patterns shows no significant associations. Preprint
- Wang Z. and **Montana G.** (2014) The graph-guided group lasso for genome-wide association studies. In "Regularization, Optimization, Kernels, and Support Vector Machines", Johan A.K. Suykens et al (Editors). In press
- Ruan D., Young A., and **Montana G.** (2014). Differential analysis of biological networks. Preprint
- Wang Z., Curry E., and **Montana G.** (2014). Network-guided regression for detecting associations between DNA methylation and gene expression. Preprint
- Pio Monti R., Hellyer P., Sharp D., Leech R., Anagnostopoulos C., **Montana G.** (2014) Estimating dynamic brain connectivity networks from functional MRI time series. Preprint
- Minas, C. and **Montana, G.** (2014) Hypothesis testing in distance-based regression. Preprint.
- Gaudoin R., **Montana G.**, Jones S., Aylin P. and Bottle A. (2014) Classifier calibration using splined empirical probabilities in clinical risk prediction. Health Care Management Science, to appear.
- Cozzini A, Jasra A., **Montana G.** and Persing A. (2014) A Bayesian mixture of lasso regressions with t-errors. [arXiv] Computational Statistics and Data Analysis. In press
- Minas C. and **Montana G.** (2014) Distance-based analysis of variance: approximate inference [arXiv] Statistical Analysis and Data Mining. In press
- McWilliams B. and **Montana G.** (2014) Subspace clustering of high-dimensional data: a predictive approach. [arXiv] Data Mining and Knowledge Discovery. Volume 28, Issue 3, pp 736-772
- de Marvao A., Dawes T., Shi W., Minas C., Keenan N., Diamond T., Durighel G., **Montana G.**, Rueckert D., Cook S. and O'Regan D. (2014) Automated cardiac phenotyping using 3D high spatial resolution MR imaging. Journal of Cardiovascular MR, 16:16
- Kiskinis E., Chatzeli L., Curry E., Kaforou M., Frontini A., Cinti S., **Montana G.**, Parker M. and Christian M. (2014) RIP140 represses the BRITE adipocyte program including a futile cycle of TAG breakdown and synthesis. Molecular Endocrinology, Vol 28, Issue 3.
- Rosell M., Kaforou M., Frontini A., Okolo A., Nikolopolou E., Millership S., Fenech ME, MacIntyre D, Turner JO, Blackburn E., Gullick W., Cinti S., **Montana G.**, Parker MG, Christian M. (2014) Brown and white adipose tissues. Intrinsic differences in gene expression and response to cold exposure. Am J Physiol Endocrinol Metab.
- Sim, A., Tsagkrasoulis, D. and **Montana, G.** (2013) Random forests on distance matrices for imaging genetics studies. Statistical Applications in Genetics and Molecular Biology. Volume 12, Issue 6, Pages 757-786
- Silver M., Chen P., Ruoying L., Cheng CY, Wong TY, Tai E., Teo YY, and **Montana G.** (2013) Pathways-driven sparse regression identifies pathways and genes associated with high-density lipoprotein cholesterol in two Asian cohorts. [arXiv] PLoS Genetics
- Herberg J., Kaforou M., Gormley S., Sumner E.D., Patel S., Jones KDJ, Paulus S., Fink C., Martinon-Torres F., **Montana G.**, Wright VJ, Levin M. (2013) Transcriptomic profiling in childhood H1N1/09 influenza reveals reduced expression of protein synthesis genes. The Journal of Infectious Diseases 15;208(10):1664-8.
- Minas C., Curry E., and **Montana G.** (2013) A distance-based test of association between paired heterogeneous genomic data. [arXiv] Bioinformatics.
- Pandit AS, Robinson E., Aljabar P., Ball G., Gousias IS, Wang Z., Hajnal JV, Rueckert D., Counsell SJ, **Montana G.**, Edwards AD (2013) Whole-brain mapping of structural connectivity in infants reveals altered connection strength associated with growth and preterm birth. Cerebral Cortex.
- Wang Y., Goh W, Wong L. and **Montana G.** (2013) Random forests on Hadoop for genome-wide studies of multivariate neuroimaging phenotypes. BMC Bioinformatics.
- Cozzini A, Jasra A. and **Montana G.** (2013) Model-based clustering with gene ranking using penalised mixtures of heavy-tailed distributions. Journal of Bioinformatics and Computational Biology. [arXiv]
- Gendrel AV, Apedaile A, Coker H, Termanis A, Zvetkova I, Godwin J, Tang YA, Huntley D, **Montana G.**, Taylor S, Giannoulatou E, Heard E, Stancheva I, Brockdorff N (2012) Smchd1-dependent and -independent pathways determine developmental dynamics of CpG island methylation on the inactive X chromosome. Developmental Cell, to appear.
- Silver M., Janousova E., Hue X., Thompson P. and **Montana G.** (2012) Identification of gene pathways implicated in Alzheimer's disease using longitudinal imaging phenotypes with sparse regression. NeuroImage, 63(3), Pages 1681-1694 [arXiv]
- McWilliams B. and **Montana G.** (2012) Multi-view predictive partitioning in high dimensions. Statistical Analysis and Data Mining, *5(4): 304-321* [arXiv]
- Silver M. and **Montana G.** (2012) Fast identification of biological pathways associated with a quantitative trait using group lasso with overlaps. Statistical Applications in Genetics and Molecular Biology, vol. 11, issue 1, article 7 [arXiv]
- Janousova E., Vounou M, Wolz R., Gray K. R., Rueckert D. and **Montana G.** (2012) Biomarker discovery for sparse classification of brain images in Alzheimer's disease. Annals of the British Machine Vision Association (2), 1-11
- Berk M. and **Montana G.** (2012) A skew-t-normal multi-level reduced-rank functional PCA model with applications to replicated 'omics time series data sets. In Proceedings of the IDA Symposium 2012 [arXiv]
- Inkster B, Strijbis E, Vounou M, Bendtfeld K, Radue EW, Matthews PM, Barkhof F, Polman CH, **Montana G***, Geurts JJG*. (2012) Histone deacetylase gene variants predict brain volume changes in multiple sclerosis. Neurobiology of Aging.
- Strijbis E, Inkster B, Vounou M, Kappos L, Radue EW, Matthews PM, Uitdehaag B, Barkhof F, Polman CH, **Montana G***, Geurts JJG* (2012) Glutamate gene polymorphisms predict brain volume changes in multiple sclerosis. Multiple Sclerosis Journal.
- Vounou M, Janousova E., Wolz R., Stein J., Thompson P., Rueckert D. and **Montana G.** (2011) Sparse reduced-rank regression detects genetic associations with voxel-wise longitudinal phenotypes in Alzheimer's disease. NeuroImage, 60(1):700-716
- McWilliams B. and **Montana G.** (2011) Predictive Subspace Clustering. In Proceedings of the Tenth IEEE International Conference on Machine Learning and Applications, Vol. 1, pp. 247-252.
- Minas C, Waddell S. and **Montana G.** (2011) Distance-based differential analysis of gene curves. Bioinformatics, 27(22): 3135-3141.
- Pathan N., Burmester M., Adamovic T., Berk M., Ng K., Betts H., Macrae M., Waddell S., Paul-Clark M., Levin M., **Montana G.**, Mitchell J. (2011) Intestinal injury and endotoxemia in children undergoing surgery for congenital heart disease. American Journal of Respiratory and Critical Care Medicine, Vol 184, Pages 1261-1269
- Silver M. and **Montana G.** (2011) Pathway selection for GWAS using the group lasso with overlaps. In IEEE International Proceedings of Chemical, Biological & Environmental Engineering, Singapore.
- Janousova E., Vounou M., Wolz R., Rueckert D., and **Montana G.** (2011) Fast brain-wide search of highly discriminative regions in medical images: an application to Alzheimer's disease. In Proceedings of MIUA (Medical Image Understanding and Analysis), London, UK.
- Berk M., Ebbels T, and **Montana G.** (2011) A statistical framework for metabolic profiling using longitudinal data. Bioinformatics, 27(14), pp. 1979-1985.
- Berk M., Hemingway C., Levin M. and **Montana G.** (2011). Longitudinal analysis of gene expression profiles using functional mixed-effects models. In 'Studies in Theoretical and Applied Statistics' pp 57-67. Springer. [arXiv]
- Triantafyllopoulos, K. and **Montana, G.** (2011) Dynamic modeling of mean-reverting spreads for statistical arbitrage. Computational Management Science. Vol 8, Issue 1, pp. 23-49 [arXiv]
- Pathan N, Burmester M, Adamovic T, Berk M, **Montana G**, Levin M, Mitchell J (2010) Gut barrier dysfunction and activation of endotoxin signal pathways in children undergoing surgery for congenital heart disease. In Proceedings of the 40th Critical Care Congress. Lippincott Williams & Wilkins.
- Spanu et al. (2010) Genome expansion and gene loss in powdery mildew fungi reveal functional tradeoffs in extreme parasitism. Science, 10 Dec 2010: Vol. 330 no. 6010 pp. 1543-1546
- McWilliams B. and **Montana G.** (2010) A PRESS statistic for two-block partial least squares regression. In Proceedings of the 10th Conference on Computational Intelligence UK, Colchester [arXiv]
- Vounou M., Nichols T., and **Montana G.** (2010) Detecting genetic associations with high-dimensional neuroimaging phenotypes: a sparse reduced-rank regression approach. NeuroImage, 5;53(3), pp. 1147-59.
- Silver M., **Montana G.**, and Nichols T. (2010). False positives in neuroimaging genetics using voxel based morphometry data. NeuroImage, 15;54(2), pp. 992-1000
- Tang Y. A., Huntley, D., **Montana G.**, Cerase A., Nesterova, T. B. and Brockdorff N. (2010) Efficiency of Xist-mediated silencing on autosomes is linked to chromosomal domain organisation. Epigenetics and Chromatin. 7;3(1):10.
- **Montana G.**, Berk M. and Ebbels T. (2010) Modelling short time series in metabolomics: a functional data analysis approach. In 'Software Tools and Algorithms for Biological Systems', Advances in Experimental Medicine and Biology, 2011, Volume 696, Part 4, 307-315, Springer.
- McWilliams B. and **Montana G.** (2010) Sparse partial least squares for on-line variable selection in multivariate data streams. Statistical Analysis and Data Mining. 3: 170-193. [arXiv]
- McWilliams B. and **Montana G.** (2009) Dynamic asset allocation for bivariate enhanced index tracking using sparse partial least squares. International Workshop on Advances in Machine Learning for Computational Finance, 20-21 July, London. [Video]
- Berk M. and **Montana G.** (2009). Functional modelling of microarray time series with covariate curves. Statistica, *2-3, pp. 153-177* [arXiv]
- **Montana G.**, Triantafyllopoulos K. and Tsagaris T. (2009) Flexible least squares for temporal data mining and statistical arbitrage. Expert Systems with Applications 36(2), pp. 2819-2830. [arXiv]
- **Montana G.** and Parrella F. (2009) Data mining for algorithmic asset management. In 'Data Mining for Business Applications', Springer US.
- **Montana G.** and Parrella F. (2008) Learning to trade with incremental support vector regression experts. Lecture Notes in Artificial Intelligence Vol. 5271, pp. 591-598. Springer-Verlag
- **Montana G.**, Triantafyllopoulos K. and Tsagaris, T. (2008) Data stream mining for market-neutral algorithmic trading. In Proceedings of ACM Symposium on Applied Computing, pp. 966-970.
- Triantafyllopoulos K. and **Montana G.** (2007) Fast estimation of multivariate stochastic volatility. [arXiv]
- **Montana G.** and Hoggart C. (2007) Statistical software for gene mapping by admixture linkage disequilibrium, Briefings in Bioinformatics 8, pp. 393-395
- Adams N.M., Hand D.J., **Montana G.** and Weston D. (2006). Fraud Detection in consumer credit. Expert Update, 9(1), pp. 21-27. (Special Issue on the 2nd UK KDD Workshop)
- **Montana G.** (2006) Statistical methods in genetics. Briefings in Bioinformatics 7(3), pp. 297-308
- **Montana G.** (2005) HapSim: A simulation tool for generating haplotype data with pre-specified allele frequencies and LD patterns. Bioinformatics 21(23), pp. 4309-4311
- Triantafyllopoulos K. and **Montana G.** (2004) Forecasting the London Metal Exchange with a dynamic model. In Proceedings of the 16th Symposium in Computational Statistics, pp. 1885-1892
- **Montana G.** and Pritchard J. K. (2004) Statistical tests for admixture mapping with case-control and case-only data. American Journal of Human Genetics 75, pp. 771-789
- Kendall W.S. and **Montana G.** (2002) Small sets and Markov transition kernels. Stochastic Processes and Their Applications 99(2), pp. 177-19

## Code

- NsRRR: R code for Network-guided sparse Reduced-Rank Regression
- SINGLE: R package implementing the Smooth Incremental Graphical Lasso Estimation algorithm
- GRV: R code for the generalised RV test of association between distance matrices (with data)
- HiPLAR: R packages for High Performance (GPU and multi-core) Linear Algebra in R
- PaRFR: Java implementation of parallel random forest regression for Hadoop
- PsRRR: Python code for pathways-sparse reduced-rank regression (with data)
- PSC: Matlab code for the PSC (predictive subspace clustering) algorithm (with data)
- ISPLS: Matlab code for the ISPLS (incremental sparse partial least squares) algorithm (with data)
- MVPP: Matlab code for the MVPP (multi-view predictive partitioning) algorithm
- PTM: R code for the PTM (penalised finite mixtures of t distributions) model
- DBF: R code for the DBF (distance-based F) test statistic and artificial data simulation
- SME: R code for the SME (smoothing splines mixed effects) model for functional data
- MALDsoft: C code for admixture mapping using hidden Markov models
- HapSim: R package for realistic haplotype data simulation
- Online SVR: C++ code for on-line support vector regression
- DLM: C++ code for fitting dynamic linear models
- I maintain the CRAN Task View on Statistical Genetics

## Recent teaching (2012-2013)

- Machine Learning - MSc in Statistics
- Statistical Learning - MSc in Bioinformatics and Theoretical Systems Biology
- Statistical Pattern Recognition - London Taught Course Centre (PhD students)

## Previous positions

- Research Biostatistician - Statistical Genetics and Biomarkers Group, Bristol-Myers Squibb Company. Pharmaceutical Research Institute. Princeton, USA
- Research Associate - Department of Human Genetics. University of Chicago. Chicago, USA
- PhD in Statistics - Department of Statistics. University of Warwick. Coventry, UK

## Other activities

- Guest editor, Computational Statistics & Data Analysis, special issue on Advances in Data Mining and Robust Statistics, 2013-14
- Member of the Program Committee
  - IDA (Intelligent Data Analysis) 2011-2014
  - ICPRAM (International Conference on Pattern Recognition Applications) 2012-2014
  - MASAMB (Mathematical and Statistical Aspects of Molecular Biology) 2013

- Chair, CompBio 2011
- Chartered Statistician (since 2006) and Fellow of the Royal Statistical Society
- Committee Member of the Business & Industry Section, Royal Statistical Society (2010-)
- Vice Chair of the Statistical Computing Section, Royal Statistical Society (2010-)
- Member of the Computing and Research Committees, IC Dept of Mathematics, 2010-2013
- Visiting Senior Research Fellow, NUS School of Computing, 2011