Diagnosing Disease with Shopping Data
Retailers' loyalty card data is a currently underused and under-explored dataset in health research, despite containing large-scale, longitudinal behavioural information on the population's diet, product use and self-medication.
GDPR now gives individuals the right to access a copy of the personal data that commercial companies hold on them. As more studies show how what we consume affects our health, the opportunity to link datasets on our purchasing habits to health outcomes should not be missed.
Personal commercial transactional data is the information stored when an exchange occurs between an individual and a business, including customer shopping data. This research will connect loyalty card data (customer shopping information held by a retailer) to COVID-19 incidence and to information from women with ovarian cancer. The linked datasets will be used to investigate whether shopping data can help women with ovarian cancer get diagnosed earlier, and/or whether it can help inform public health decisions in a pandemic.
The aim of this project is to create a framework for using shopping data in medical research. It asks the question:
How can personal transactional data be collected and analysed for the purposes of health research in a way that is acceptable to society and works for both infectious and chronic disease?
The project is connected to a wider project by partners ALSPAC at Bristol University and the Alan Turing Institute: “donating personal transactional data for research: investigating the public acceptability of using commercial transactional data in public health research”.
A collection of studies will be carried out to iteratively create machine learning models whose predictions could help in the earlier diagnosis of ovarian cancer and/or the understanding of outbreaks of influenza-like illness (ILI).
The methodology is mixed methods: both qualitative and quantitative data will be collected and analysed for integrated interpretation. The studies will inform the models' schema creation and feature engineering, and will help to understand and validate their outputs and any interpretations made from them. The iterative design will allow the models to be adjusted for successful implementation in a clinical setting.
Watch this space.
Psychology of personal data donation 2019
Advances in digital technology have led to large amounts of personal data being recorded and retained by industry, constituting an invaluable asset to private organizations. The implementation of the General Data Protection Regulation in the EU … enables the general public to access data collected about them by organisations, opening up the possibility of this data being used for research that benefits the public themselves; for example, to uncover lifestyle causes of poor health outcomes…
Value of Commercial Product Sales Data in Healthcare Prediction
The technical report and code for the above project, conducted with the NHS, can be viewed at
NHSX AU Blog
Applying a novel variable importance technique, MCR (model class reliance), to machine learning models in order to assess the Value of Commercial Product Sales Data in Healthcare Prediction
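To give a rough feel for what model class reliance measures, the sketch below approximates it on synthetic data. It is not the code from the project or the technical report, and it is not the published MCR algorithm: it assumes a linear model class, squared-error loss, and a crude random search for near-optimal models. The feature names (`sales`, `clinical`), the tolerance `eps` and every function name are hypothetical. The idea illustrated is that a feature's reliance (loss with that feature permuted, divided by the original loss) is reported as a range over all models that predict almost as well as the best one, rather than as a single number for one fitted model.

```python
import numpy as np


def mse(X, y, w):
    """Mean squared error of the linear model y ~ X @ w."""
    return float(np.mean((y - X @ w) ** 2))


def model_reliance(X, y, w, j, rng):
    """Loss with column j permuted, divided by the original loss."""
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X[:, j])
    return mse(X_perm, y, w) / mse(X, y, w)


def mcr_range(X, y, j, eps=0.05, n_candidates=2000, scale=0.3, seed=0):
    """Approximate the (min, max) reliance on feature j over near-optimal models.

    Assumption: a model counts as near-optimal if its loss is at most
    (1 + eps) times the least-squares loss. Candidates are drawn by adding
    Gaussian noise to the least-squares weights (a crude random search).
    """
    rng = np.random.default_rng(seed)
    w_best, *_ = np.linalg.lstsq(X, y, rcond=None)
    threshold = (1.0 + eps) * mse(X, y, w_best)
    reliances = [model_reliance(X, y, w_best, j, rng)]
    for _ in range(n_candidates):
        w = w_best + rng.normal(scale=scale, size=w_best.shape)
        if mse(X, y, w) <= threshold:
            reliances.append(model_reliance(X, y, w, j, rng))
    return min(reliances), max(reliances)


def make_synthetic(n=500, seed=1):
    """Synthetic data: two correlated features that both drive the outcome."""
    rng = np.random.default_rng(seed)
    sales = rng.normal(size=n)                      # hypothetical sales signal
    clinical = 0.9 * sales + np.sqrt(0.19) * rng.normal(size=n)
    y = sales + clinical + 0.5 * rng.normal(size=n)
    return np.column_stack([sales, clinical]), y


if __name__ == "__main__":
    X, y = make_synthetic()
    low, high = mcr_range(X, y, j=0)
    print(f"Reliance on sales feature ranges from {low:.2f} to {high:.2f}")
```

Because the two synthetic features are correlated, near-optimal models can shift weight between them, so the reliance on either feature comes out as an interval. A wide interval suggests that the feature's apparent importance depends on which well-performing model happens to be chosen.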