Sourav Das
- sourav.das@jcu.edu.au
- Adjunct Senior Lecturer
Biography
I am a Senior Lecturer in statistics and data science with varied and interdisciplinary interests in applied and methodological statistics. My research expertise is in the methodology and applications of statistical analysis of time series and spatial processes. I investigate natural events that evolve over space and time. Of particular interest are landslides, tropical cyclones, river flow, and diseases. The statistical problems are wide-ranging and complex.
Students are our core research strength. Mr. Yihong Mei recently won a national summer research scholarship (https://amsi.org.au/) to investigate differences in characteristic features of polar ice reduction. I am looking for students to compete for PhD scholarships internally and nationally. If you are interested, please see below.
Current Interests
My immediate interests fall into the following categories:
Statistical Methodology
- Modelling of spatial and temporal covariance structures.
- Computationally and statistically efficient models for natural hazards.
- Interface of maximum likelihood and interpretable machine learning algorithms for streaming data.
Applications
- Algorithms for landslide monitoring.
- Methods for detecting structural changes in neurological time series.
- Statistical variation in river flow and tropical cyclones.
- Spatial variation in tropical ecological processes.
- Spatial epidemiology for infectious diseases and mental health.
Projects
Application and methods
- Monitoring and characterization of evolving statistical distributions of natural events - Landslides lead to heavy losses of life and property with alarming regularity. Developing nations, with relatively poor infrastructure and high residential density, are particularly vulnerable. Statistically sound early warning systems can significantly help to mitigate risk to life and livelihood. Using spatial and time series data from ground-based sources and low-Earth-orbit satellites, we are investigating the construction of mechanistic and automatic algorithms for monitoring such events to offer early warning systems.
- Signature of EEG signals in neurological disorders - Is there 'natural' pathological clustering of brain channels in the event of neurological disorders such as epilepsy? We use statistical methods for time series and machine learning approaches to investigate and develop related tools.
- Uncertainty in periodic environmental events - Will the next ENSO return in 3 years, 5 years, or would it be 8 years? Changing climate regimes have rendered previous assumptions about periodic climatic events uncertain. We are investigating methods for estimating the uncertainty of periodic natural events.
- Language and odds of mental health disorders - Handwritten or digital texts in the form of clinical prescriptions contain a wealth of information. Using modern language processing tools, we are exploring ways to build text- and sentiment-based evidence for mental health to help facilitate public policy in Far North Queensland.
Methodology
The following problems are often motivated by the above applications.
- Non-stationarity in time series - Methods of collecting, processing and aggregating data have been undergoing a radical transformation in most scientific disciplines due to rapid advances in sensors and related IoT devices. The increasing precision, accuracy and cost-effectiveness of these devices have led to the collection and storage of burgeoning volumes of online spatial and temporal signals in standardized formats. These have opened up immense possibilities for developing statistical methods for data science. Second-order stationarity has been the mainstay of estimation and monitoring algorithms for time series and spatial processes.
- Imputing missing values - Time series data in the earth and environmental science disciplines are often sampled irregularly, leading to missing observations. This poses a challenge to the statistical analysis of time series, as conventional statistical methodology and the follow-on theories of time series have relied on regularly sampled data. But imputing missing values in multiple time series with serial correlation and seasonality is a non-trivial problem. This work is motivated by environmental factors such as sea surface temperature, polar ice melting, cyclones, and climate indices.
- AIC, BIC and information criteria in a massive data world - This project concerns the information content in information-theoretic criteria and other measures. AIC and BIC were originally proposed for nested model selection to determine the order of a time series, and were gradually extended to a range of real observations generated from exponential family distributions. Their appeal lies in their close relationship with entropy and mutual information. In a big data world, however, such criteria are being used indiscriminately in a wide range of supervised learning algorithms. Using simulated and real-world data from terrestrial and marine ecology and the earth sciences, this project would investigate:
- The implications of applying AIC and BIC under mis-specified models.
- How they align with the idea of maximum Fisher information (the variance of the score function).
- How they compare against other Bayesian and frequentist alternatives.
- As a secondary outcome, we would look into the broader question of the significance of data and generative model assumptions for conventional machine learning classifiers.
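As a rough illustration of the original use of AIC and BIC mentioned above (choosing the order of a time series model), the sketch below selects an autoregressive order for a simulated AR(1) series. It is not the project's method. The function names, the simulated data, the Yule-Walker variance estimates (via the Levinson-Durbin recursion) and the Gaussian approximations AIC(p) = n log(sigma_p^2) + 2p and BIC(p) = n log(sigma_p^2) + p log(n) are all assumptions made for this example.

```python
import math
import random
from typing import List, Sequence, Tuple


def sample_autocovariances(x: Sequence[float], max_lag: int) -> List[float]:
    """Biased sample autocovariances gamma(0), ..., gamma(max_lag)."""
    n = len(x)
    mu = sum(x) / n
    d = [v - mu for v in x]
    return [sum(d[t] * d[t + h] for t in range(n - h)) / n
            for h in range(max_lag + 1)]


def innovation_variances(acov: Sequence[float], max_order: int) -> List[float]:
    """Levinson-Durbin recursion: Yule-Walker innovation variance
    for AR(0), AR(1), ..., AR(max_order)."""
    phi: List[float] = []          # AR coefficients of the current order
    var = acov[0]                  # AR(0) innovation variance is gamma(0)
    variances = [var]
    for k in range(1, max_order + 1):
        acc = acov[k] - sum(phi[j] * acov[k - 1 - j] for j in range(k - 1))
        kappa = acc / var          # reflection (partial autocorrelation) coefficient
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        var *= 1.0 - kappa ** 2
        variances.append(var)
    return variances


def order_criteria(x: Sequence[float], max_order: int) -> Tuple[List[float], List[float]]:
    """Approximate Gaussian AIC and BIC (up to a common constant) per AR order."""
    n = len(x)
    variances = innovation_variances(sample_autocovariances(x, max_order), max_order)
    aic = [n * math.log(v) + 2 * p for p, v in enumerate(variances)]
    bic = [n * math.log(v) + p * math.log(n) for p, v in enumerate(variances)]
    return aic, bic


def simulate_ar1(n: int, phi: float, seed: int) -> List[float]:
    """Hypothetical test data: AR(1) with standard Gaussian innovations."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


series = simulate_ar1(2000, 0.7, seed=1)
aic, bic = order_criteria(series, max_order=6)
aic_order = min(range(len(aic)), key=aic.__getitem__)
bic_order = min(range(len(bic)), key=bic.__getitem__)
print("AIC order:", aic_order, "BIC order:", bic_order)
```

Because the BIC penalty per parameter, log(n), exceeds the AIC penalty of 2 once n is larger than about 7, BIC never selects a higher order than AIC on the same fitted models. How either criterion behaves when the candidate family is mis-specified is the open question the project addresses.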
Contact
If you wish to pursue a PhD with me or want to have a discussion on some of my current research interests, please contact me at sourav.das@jcu.edu.au. I am particularly keen to hear from students with a background in one or more of the following: statistical analysis of time series, applied stochastic processes, multivariate statistical methods, mixed effects modelling or spatial statistics.