
Can Machine Learning advance or hinder statistical modelling of dementia risk in the Population.

Lookup NU author(s): Dr Connor Richardson, Sarah Wharton, Professor Fiona Matthews


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Background: Machine learning and artificial intelligence are a rapidly growing area of health research for predicting individual and population disease risk. Predicting dementia in the population using neuropathology data is particularly challenging because of the lack of brain donation resources. Machine learning is growing in popularity due to an increased focus on collecting health data and its powerful ability to compute complex statistics using smaller datasets while handling missing data. There is debate in the research community on the reliability of machine learning for predicting dementia. The thinking behind many machine learning techniques emphasizes use of very large amounts of data, whereas more classical statistical techniques emphasize a prediction of risk based on more specific and targeted data, often resulting in smaller sample sizes. Whether these techniques complement or contradict each other's ability to accurately predict risk of dementia in the population is not yet known. Understanding this is paramount for the future of dementia research, considering the ever-growing integration between neuropathology, data and statistical science.

Method: We applied machine learning techniques for feature ranking and classification as an unbiased comparison of neuropathological features and assessment of their diagnostic performance, using a cohort (n=507) from the Cognitive Function and Ageing Studies (CFAS). The results could then be compared to more traditional statistical modelling using logistic regression.

Result: Decision tree classification gave clear and easy-to-interpret routes for dementia risk, visualising a hierarchy of pathologies that is less clear from logistic regression and odds ratios. It produces an individualised look at dementia risk based on specific combinations of pathology that is unclear in the statistical regression model; however, it does not deal with missing data well. The random forest model is strong in predicting cases of dementia; however, it misclassifies healthy cases as dementia cases 52% of the time. TDP-43 contributes the most to the model and Lewy bodies contribute the least.

Conclusion: Machine learning has the potential to analyse the neuropathology of dementia in sophisticated ways; however, almost half of dementia cases remained misclassified. This points towards a more complex model of dementia pathology that requires further neuropathological assessments and the collection of high-quality cohort data to analyse.
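The statement that a classifier is strong at predicting dementia cases yet labels healthy cases as dementia 52% of the time describes high sensitivity alongside a high false positive rate. The sketch below shows how those rates follow from a confusion matrix. It is an illustration only: the labels are invented toy data chosen to give a false positive rate of 52%, not CFAS data, and the function names `confusion_counts` and `classification_rates` are hypothetical helpers, not part of the authors' analysis.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = dementia, 0 = no dementia)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_rates(tp, fp, tn, fn):
    """Derive the standard rates; assumes both classes are present."""
    return {
        "sensitivity": tp / (tp + fn),          # dementia cases correctly found
        "specificity": tn / (tn + fp),          # healthy cases correctly cleared
        "false_positive_rate": fp / (fp + tn),  # healthy cases called dementia
    }


# Invented toy labels (NOT study data): 25 healthy and 25 dementia cases.
y_true = [0] * 25 + [1] * 25
# Hypothetical predictions: 13 of 25 healthy cases are flagged as dementia,
# and 23 of 25 dementia cases are detected.
y_pred = [1] * 13 + [0] * 12 + [1] * 23 + [0] * 2

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
rates = classification_rates(tp, fp, tn, fn)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

In this toy example the model detects most dementia cases but clears fewer than half of the healthy ones, which is why sensitivity alone can overstate how well a classifier performs.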

Publication metadata

Author(s): Richardson CD, Wharton S, Matthews FE

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: Alzheimer's Association International Conference (AAIC)

Year of Conference: 2022

Acceptance date: 16/03/2022

Publisher: Alzheimer's Association