Diagnosing Disease with Shopping Data
Retailers' loyalty card data is currently an underused and under-explored dataset in health research, despite containing large-scale, longitudinal behavioural information on the population's diet, product use and self-medication.
The GDPR now gives individuals the right to access a copy of the personal data that commercial companies hold on them. As more studies show how what we consume affects our health, the opportunity to link datasets on our purchasing habits to health outcomes should not be missed.
Personal commercial transactional data is the information stored when an exchange occurs between an individual and a business, including customer shopping data. This research will connect store sales and loyalty card data (customer shopping information held by a retailer) to data on respiratory disease and to information from women with ovarian cancer. The linked datasets will be used to investigate whether shopping data can help women with ovarian cancer get diagnosed earlier, and/or whether it can help inform public health decisions in a pandemic.
The aim of this project is to create recommendations for using shopping data in medical research. It asks the question:
How can personal transactional data be collected and analysed for the purposes of health research in a way that is acceptable to society and works for both infectious and chronic disease?
The project is connected to a wider project by partners ALSPAC at Bristol University and the Alan Turing Institute: Donating personal transactional data for research: investigating the public acceptability of using commercial transactional data in public health research.
A series of studies will be carried out to iteratively develop machine learning models whose predictions could support the earlier diagnosis of ovarian cancer and/or improve the understanding of outbreaks of influenza-like illness (ILI).
The project will use a mixed-methods approach, collecting and analysing both qualitative and quantitative data for integrated interpretation. The studies will inform the models' schema creation and feature engineering, and will be used to understand and validate the models' outputs and any interpretations made from them. The iterative design will allow the models to be adjusted for successful implementation in a clinical setting.
Watch this space.
Psychology of personal data donation 2019
Advances in digital technology have led to large amounts of personal data being recorded and retained by industry, constituting an invaluable asset to private organizations. The implementation of the General Data Protection Regulation in the EU … enables the general public to access data collected about them by organisations, opening up the possibility of this data being used for research that benefits the public themselves; for example, to uncover lifestyle causes of poor health outcomes…
Public attitudes towards sharing loyalty card data for academic health research: a qualitative study 2022
A growing number of studies show the potential of loyalty card data for use in health research. However, research into public perceptions of using this data is limited. This study aimed to investigate public attitudes towards donating loyalty card data for academic health research, and the safeguards the public would want to see implemented …
Using Shopping Data to Improve the Diagnosis of Ovarian Cancer: Computational Analysis of a Web-Based Survey 2023
Background: Shopping data can be analyzed using machine learning techniques to study population health. It is unknown if the use of such methods can successfully investigate prediagnosis purchases linked to self-medication of symptoms of ovarian cancer.
Objective: The aims of this study were to gain new domain knowledge from women’s experiences, understand how women’s shopping behavior relates to their pathway to the diagnosis of ovarian cancer, and inform research on computational analysis of shopping data for population health…
Qualitative Investigation of the Novel Use of Shopping Loyalty Card Data in Medical Decision Making
This paper describes early results of a small qualitative study investigating the potential impact of shopping loyalty card data (SLCD) in the diagnostic pathway for ovarian cancer. There is early evidence that pharmaceutical products such as pain relief and medications for irritable bowel syndrome and bloating are bought by women to manage the early symptoms of ovarian cancer…
Value of Commercial Product Sales Data in Healthcare Prediction
The technical report and code for the above project, conducted with the NHS, can be viewed at
NHSX AU Blog
Applying a novel variable importance technique, MCR (model class reliance), to machine learning models in order to assess the Value of Commercial Product Sales Data in Healthcare Prediction
Cancer Therapy Advisor News Story
Could Shopping Data Be Used to Predict Cancer and Diagnose It Earlier?
Carina Storrs, PhD