Carrot (CO-CONNECT)

About

Carrot is an open-source toolset designed to support the transformation of healthcare data into the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). The project addresses a common challenge in health research: clinical and administrative data are often stored in many different formats and coding systems, making it difficult to combine datasets or perform large-scale analyses across organisations. By providing tools that standardise how data are mapped and transformed, Carrot helps researchers prepare datasets for interoperable, reproducible research.

Carrot consists of two primary components: Carrot Mapper and Carrot Transform. Carrot Mapper is a web-based application that enables users to define how fields in a source dataset should be mapped to the OMOP standard. It uses metadata from data-profiling tools such as WhiteRabbit to help users understand the structure of their data and create mapping rules. These rules describe how specific variables, values, and relationships in the original dataset correspond to the standardised OMOP schema. The system supports automated mapping suggestions, manual configuration, and the reuse of previously defined mappings, helping to streamline the data curation process and encourage consistency across projects.

Once mapping rules have been defined, Carrot Transform executes the transformation itself. This component is a Python-based command-line tool that runs within the secure environment of a data partner, applying the mapping rules generated in Carrot Mapper to convert source data into OMOP-compliant tables. Running the transformation locally ensures that sensitive health data remain within the organisation’s infrastructure, supporting privacy-by-design principles and compliance with data protection regulations.
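The idea of applying previously defined mapping rules to source data can be illustrated with a minimal Python sketch. The rule layout, the source field names (patient_id, sex, date_of_birth), and the apply_rules helper are hypothetical and chosen only for illustration; they are not Carrot's actual rules format or Carrot Transform's interface. The concept IDs 8507 and 8532 are the standard OMOP gender concepts for male and female.

```python
from typing import Any

Rule = dict[str, Any]

# Hypothetical rule layout: one entry per target field in an OMOP table.
# Each rule names the source field and, optionally, a value-level term mapping.
RULES: dict[str, dict[str, Rule]] = {
    "person": {
        "person_id": {"source_field": "patient_id"},
        "gender_concept_id": {
            "source_field": "sex",
            "term_mapping": {"M": 8507, "F": 8532},
        },
        "birth_datetime": {"source_field": "date_of_birth"},
    }
}


def apply_rules(row: dict[str, str], table_rules: dict[str, Rule]) -> dict[str, Any]:
    """Build one target row from one source row using declarative rules."""
    out: dict[str, Any] = {}
    for target_field, rule in table_rules.items():
        value = row.get(rule["source_field"])
        term_mapping = rule.get("term_mapping")
        if term_mapping is not None:
            # Source values without a mapping become None rather than being guessed.
            value = term_mapping.get(value)
        out[target_field] = value
    return out


# Illustrative source rows; in practice these would stay inside the data partner's environment.
source_rows = [
    {"patient_id": "101", "sex": "F", "date_of_birth": "1980-04-12"},
    {"patient_id": "102", "sex": "M", "date_of_birth": "1975-11-03"},
]
person_rows = [apply_rules(r, RULES["person"]) for r in source_rows]
```

Because the rules in this sketch are plain data, they could be written and reviewed without access to the source rows, while the function that applies them runs wherever the data are held.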

A key feature of the Carrot approach is the separation of mapping and transformation processes. Mapping rules can be created and shared without requiring direct access to the underlying data, allowing domain experts to collaborate on standardisation while maintaining strict data governance controls. Over time, these rules can be reused across datasets and institutions, improving efficiency and promoting consistent data interpretation.

Carrot has been used in several large-scale health data initiatives, including federated research platforms that enable discovery and analysis across multiple datasets. By enabling standardised transformation workflows and reusable mapping rules, the project aims to accelerate the preparation of research-ready datasets while maintaining transparency, reproducibility, and strong privacy protections.

Links

Carrot main page: https://carrot.ac.uk/

Carrot Mapper: https://carrot.ac.uk/mapper

Carrot Transform: https://carrot.ac.uk/transform
