FCP’s Harmonization Copilot lets you quickly produce syntactic and semantic mappings (including custom vocabulary mappings) from a subset of the data in your Electronic Health Records (EHR) or Data Warehouse (DW) to OMOP or to a custom vocabulary. Semantic mapping transforms your data’s locally defined names for procedures, conditions, and other items into the names used by a supported clinical data standard. It is particularly useful if you are collaborating with other research institutions or industry partners, or if you need to package clinical data in an interoperable, standards-compliant format for data sharing. Syntactic mapping, also known as schema mapping, maps source table and column names to the table and column names of the target data model (e.g., OMOP).
In the current version, FCP supports syntactic mapping to OMOP, and semantic mapping to OMOP and to custom vocabularies.
Audience
This feature is designed for those who are responsible for mapping data or creating customized vocabularies for research or clinical data. This includes:
- Clinical Informatics Specialists
- Clinical SMEs
- Medical Doctors
- Data Engineers
Others might also participate in this process.
Important Terms and Concepts
There are several important concepts to understand before you create a data mapping using Harmonization Copilot.
- Custom Vocabulary: A non-standard language used to describe clinical data fields. It can be useful if a standard vocabulary, such as OMOP, is not appropriate.
- Data Profile: All the source columns and their unique values in a dataset.
- EHR/DW: Electronic Health Record/Data Warehouse. The source systems that store and serve the clinical data to be mapped.
- FHIR: Fast Healthcare Interoperability Resources. A standard that defines how healthcare information should be exchanged using a common format and language.
- LLM (Large Language Model): A type of machine learning model that can understand and generate natural-language text. LLMs are used in the Harmonization Copilot to make recommendations for data mappings.
- OMOP: Observational Medical Outcomes Partnership. A common data model that standardizes the content and structure of observational data.
- Semantic Mapping (SeM): Also known as vocabulary mapping, this is the mapping of the values within the columns of interest to terms in the target vocabulary.
- Syntactic Mapping: Also known as schema mapping, this is the mapping of source table/column names to the table/column names of the target data model (e.g., OMOP).
- SNOMED: Systematized Nomenclature of Medicine. An organized collection of clinical/medical terms that provides codes, synonyms, and definitions for clinical reporting and documentation.
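To make the Data Profile term concrete, here is a minimal Python sketch that collects every source column and its unique values from a small in-memory extract. It only illustrates the concept and is not how FCP computes profiles internally. The function name `build_data_profile` and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_data_profile(rows):
    """Return {column: sorted unique non-empty values} for a list of row dicts.

    Hypothetical helper for illustration only; not an FCP API.
    """
    profile = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            values = profile[column]  # register the column even if it is always empty
            if value is not None and value != "":
                values.add(value)
    return {column: sorted(values) for column, values in profile.items()}


# Hypothetical local EHR extract with locally defined names.
rows = [
    {"patient_id": "A1", "sex": "M", "proc_name": "chest xray"},
    {"patient_id": "A2", "sex": "F", "proc_name": "CXR"},
    {"patient_id": "A3", "sex": "F", "proc_name": "chest xray"},
]

profile = build_data_profile(rows)
print(profile["proc_name"])
```

In this sketch, the unique values per column (for example, the two local spellings of a chest X-ray) are the items a semantic mapping would need to cover.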
Workflow
The Harmonization Copilot workflow begins after the data to be converted has been imported into one or more datasets in FCP. (Typically, the data engineer prepares the EHR/DW data so it can be imported into FCP as datasets.)
Semantic Mapping
Here is the workflow for semantic mappings:
- If a custom vocabulary is needed, the user creates it in FCP. If not, the user goes to the next step.
- The user selects the fields from the input dataset(s) to be mapped to the target vocabulary, such as the OMOP Procedure domain.
- The Harmonization Copilot processes the input dataset(s) and creates a list of recommended SeMs.
- The user reviews and corrects the SeMs.
After the review is complete, the data engineer can export the semantic mapping objects from FCP and integrate them into the production ETL.
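The review and export steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `SemanticMapping` record, its `status` field, and the lookup format are hypothetical and do not describe FCP's actual export format. The concept IDs 8507 (male) and 8532 (female) are standard OMOP gender concepts, and concept ID 0 is the OMOP convention for "no matching concept".

```python
from dataclasses import dataclass
from typing import Optional

NO_MATCHING_CONCEPT = 0  # OMOP convention for values with no accepted mapping


@dataclass
class SemanticMapping:
    """Hypothetical record for one recommended value-level mapping."""
    source_column: str
    source_value: str
    target_concept_id: Optional[int]
    status: str = "recommended"  # "recommended", "approved", or "rejected"


def build_lookup(mappings):
    """Keep only reviewer-approved mappings, keyed by (column, value)."""
    return {
        (m.source_column, m.source_value): m.target_concept_id
        for m in mappings
        if m.status == "approved" and m.target_concept_id is not None
    }


def map_value(lookup, column, value):
    """Return the target concept ID, or 0 when no approved mapping exists."""
    return lookup.get((column, value), NO_MATCHING_CONCEPT)


# Recommendations as a copilot might propose them (the third is deliberately wrong).
recommended = [
    SemanticMapping("sex", "M", 8507),
    SemanticMapping("sex", "F", 8532),
    SemanticMapping("sex", "U", 8532),
]

# The reviewer approves the first two and rejects the third.
recommended[0].status = "approved"
recommended[1].status = "approved"
recommended[2].status = "rejected"

lookup = build_lookup(recommended)
gender_concept_ids = [map_value(lookup, "sex", v) for v in ["M", "F", "U"]]
print(gender_concept_ids)
```

Keeping only approved mappings in the lookup means a rejected or unreviewed recommendation never reaches the ETL output; such values fall back to concept ID 0 instead.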
Syntactic Mapping
For syntactic mappings, the workflow is:
- Create a syntactic mapping and indicate the source and target datasets.
- Review each mapping.
- Apply transformations as needed.
- Create and run a code object.
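As a rough illustration of what a syntactic mapping with transformations amounts to, the Python sketch below renames source columns to OMOP `person` table columns and applies a per-column transformation. It is an assumption-laden example, not the code that an FCP code object generates: the source column names, the `SYNTACTIC_MAPPING` structure, and the `apply_syntactic_mapping` function are all hypothetical.

```python
from datetime import datetime

# Hypothetical mapping: source column -> (target table.column, transformation).
SYNTACTIC_MAPPING = {
    "patient_id": ("person.person_source_value", str.strip),
    "sex": ("person.gender_source_value", str.upper),
    "dob": (
        "person.birth_datetime",
        # Assumes the source stores dates as MM/DD/YYYY strings.
        lambda v: datetime.strptime(v, "%m/%d/%Y").isoformat(),
    ),
}


def apply_syntactic_mapping(row, mapping):
    """Rename mapped source columns and apply each column's transformation."""
    out = {}
    for source_column, (target_column, transform) in mapping.items():
        if source_column in row:
            out[target_column] = transform(row[source_column])
    return out


source_row = {"patient_id": " A1 ", "sex": "m", "dob": "03/09/1984"}
target_row = apply_syntactic_mapping(source_row, SYNTACTIC_MAPPING)
print(target_row)
```

Source columns that are absent from the mapping are dropped in this sketch; whether unmapped columns should be dropped or carried through is a design decision to settle when reviewing each mapping.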