Nottingham Trent University is running an online event to gather user requirements, with a particular focus on marketisation and stakeholder engagement, for Phase-IV-AI. This EU-funded project is developing a “Health Data Hub” that provides access to the project’s data synthesis services (DaaS) and multi-party computation services (MaaS).
The project will advance current state-of-the-art data synthesis methods towards a more generalised approach to synthetic data generation, and develop ML orchestration tools for machine learning workflows that use medical data. It will also develop metrics for testing and validation, as well as protocols that enable synthetic data generation without access to real-world data (through multi-party computation).
The project aims to provide:
- Data as a Service (DaaS)
  - Core data generation technologies for data availability and reusability.
  - State-of-the-art de-identification methods and anonymisation tools to support generative models for synthetic data generation with anonymisation, and data augmentation methods for providing on-demand data access.
  - Improved methods and technical pipelines for privacy-preserving data synthesis, covering different data formats such as Electronic Health Records and medical images.
  - Anonymous data on demand or from a repository.
- Model as a Service (MaaS)
  - Privacy-preserving machine learning orchestration tools for machine learning workflows that use medical data residing in hospitals and other healthcare institutions.
  - Easy-to-use and configurable data services that give AI developers access to larger pools of decentralised, de-identified data through multi-party computing.
- Health Data Hub (HDH)
  - Meet the needs of innovators and end users with a sustainable service for the DaaS and the MaaS.
  - Establish a Data Market that facilitates data sharing and monetisation, including incentives-based provision of data to the services.
  - Integrate the data market and the data service ecosystem as a cross-European health data hub in the European Health Data Space.
- Results validated with real-world use-cases, focusing on lung cancer, prostate cancer and ischemic stroke.
This is also an opportunity for collaboration and data sharing across Europe, with distinct data sets providing additional perspectives for disease prediction.
The Ask
The area to focus on is using large (primary care) datasets for symptomatic detection and for predicting current or future disease risk. The project has existing models and is developing new algorithms for comparison, but it needs to understand what users from different disciplines and backgrounds need from the system for it to be worthwhile.
In the first round of requirements development, the project team worked with clinicians and academics; in the upcoming round, the team would like to focus on the requirements of AI companies and developers. Your input is vital in shaping a system that will be relevant to your needs and will help to provide access to an ever-growing repository of health data.