Improving the quality, speed and transparency of curating data to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) using the Carrot tool
Authors
Cox, S., Masood, E., Panagi, V., Macdonald, C., Milligan, G., Horban, S., Santos, R., Hall, C., Lea, D., Tarr, S., Mumtaz, S., Akashili, E., Rae, A., Cole, C., Sheikh, A., Jefferson, E., Quinlan, P. R.
Abstract
Adoption of data standards across the healthcare system is low, so international research usually requires data to be converted to a common data model. One such model is the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), which has gained significant traction among researchers and developers of data platforms.
The Observational Health Data Sciences and Informatics (OHDSI) partnership manages OMOP and provides many open-source tools to help data holders convert their data to the OMOP CDM. The challenge, however, lies in the skills, knowledge and capacity that teams need to carry out the conversion.
The European Health Data Evidence Network (EHDEN) provided funds to allow data owners to bring in external resource to carry out the required conversions, which results in a one-off, point-in-time conversion of the data.
The Carrot software is a new set of open-source tools designed to help address these challenges without requiring external parties to access the data. Data protection rules are becoming stricter, and privacy by design is a core principle of European and UK data protection legislation.
Our aims for the Carrot software were to:
- Provide a standardised mechanism for managing the data curation process
- Capture the rules used to convert the data
- Create a platform that can re-use rules across projects to drive standardisation of process and improve speed without compromising on quality.
Most importantly, the privacy-by-design approach meant delivering this without requiring those creating the rules to have access to any of the data.
Carrot has been delivered and used on a project called CO-CONNECT, where it supports making datasets discoverable via a federated platform. It has been used to create over 45,000 rules, and over 5 million patient records have been converted.
This has been achieved while maintaining our principle that the team creating the rules has no access to the underlying data. Carrot has also facilitated the re-use of existing rules, with the majority of rules being re-used rather than manually curated.
Carrot has demonstrated that it can be used alongside existing OHDSI tools, with a focus on the mapping stage. In the CO-CONNECT project it successfully re-used rules across datasets. The approach is valid and has brought the expected benefits; future work will continue to optimise the generation of rules.
Digital Research contribution
The team provided software development expertise to produce the first version of Carrot. The user interface (UI) was implemented in React.js to give it a modern feel and support dynamic rendering of features. A PostgreSQL database stores OMOP mapping terms so that they can be reused. An API built in Python with Django and the Django REST Framework (DRF) provides access to the terms. The team also developed an Azure Function that ingests user-provided mapping rules and adds them to the database for reuse.
The services were all hosted in the cloud using Azure.
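To illustrate the idea of storing mapping rules and reusing them without access to patient records, the following is a minimal sketch in Python. It is not Carrot's code: the `MappingRule` and `RuleStore` names, the in-memory dictionary (standing in for the PostgreSQL database) and the shape of the field summary are all assumptions made for illustration. The concept IDs 8507 and 8532 are the standard OMOP gender concepts for male and female.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical rule: a source field/value pair mapped to an OMOP concept."""
    source_field: str
    source_value: str
    omop_table: str
    omop_field: str
    concept_id: int


class RuleStore:
    """In-memory stand-in for a rule database (assumption, not Carrot's schema)."""

    def __init__(self):
        self._rules = {}

    @staticmethod
    def _key(field, value):
        # Match case-insensitively so "Sex" and "sex" share rules.
        return (field.strip().lower(), value.strip().lower())

    def add(self, rule):
        self._rules[self._key(rule.source_field, rule.source_value)] = rule

    def suggest(self, field_summary):
        """Split a metadata-only summary into reusable rules and unmapped pairs.

        field_summary maps each field name to its list of distinct values;
        no patient-level rows are needed.
        """
        reused, unmapped = [], []
        for field, values in field_summary.items():
            for value in values:
                rule = self._rules.get(self._key(field, value))
                if rule is not None:
                    reused.append(rule)
                else:
                    unmapped.append((field, value))
        return reused, unmapped


# Rules captured on an earlier (hypothetical) project.
store = RuleStore()
store.add(MappingRule("sex", "M", "person", "gender_concept_id", 8507))
store.add(MappingRule("sex", "F", "person", "gender_concept_id", 8532))

# A new dataset described only by field names and distinct values.
reused, unmapped = store.suggest({"Sex": ["M", "F", "U"]})
print(f"{len(reused)} rules reused, {len(unmapped)} values need manual curation")
```

In this sketch only the pairs returned in `unmapped` would need a person to curate a new rule, which mirrors the goal of re-using most rules rather than creating them by hand.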
Links
ResearchGate (link to follow)