This article explains how to run a Data Harmonization code object to apply a syntactic mapping transformation to input datasets.
Background
The Harmonization Copilot automatically turns Syntactic Mappings and any associated Semantic Mappings into reusable ETLs that can be run on your Rhino Client (true to the principles of Federated Computing) in order to transform datasets from the source data format to the target data format.
Syntactic Mappings can be associated with Data Harmonization code objects. Like other code objects in FCP, these can be executed in a federated manner on a Rhino Client. When a Data Harmonization code object runs, the syntactic mapping is applied to the input datasets: source fields are mapped to target fields, and any configured transformations are applied. The outputs are stored as datasets on the same Rhino Client and can be processed further with the same tools available for any dataset on FCP, for example using Dataset Analytics to look at distributions and data completeness, or exporting the dataset to an external location within the site's network.
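To make the idea of "mapping source fields to target fields and applying transformations" concrete, here is a minimal sketch in plain Python. It is not the code that the Harmonization Copilot generates and does not use any FCP or Rhino API; the function name `apply_syntactic_mapping`, the field names, and the transformations are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable

# Hypothetical rule format: target field -> (source field, transformation).
FieldRule = tuple[str, Callable[[Any], Any]]


def apply_syntactic_mapping(
    rows: list[dict[str, Any]], mapping: dict[str, FieldRule]
) -> list[dict[str, Any]]:
    """Build one target-format row per source row by applying each field rule."""
    return [
        {target: transform(row[source]) for target, (source, transform) in mapping.items()}
        for row in rows
    ]


# Illustrative mapping from a made-up source schema to a made-up target schema.
mapping: dict[str, FieldRule] = {
    "person_id": ("PatientID", int),
    "gender": ("Sex", lambda v: {"M": "male", "F": "female"}.get(v, "unknown")),
    "birth_year": ("DOB", lambda v: int(v[:4])),
}

source_rows = [{"PatientID": "17", "Sex": "F", "DOB": "1980-04-02"}]
harmonized = apply_syntactic_mapping(source_rows, mapping)
print(harmonized)
```

In this sketch each target field names exactly one source field and one transformation, which is the simplest case; a real syntactic mapping may involve more elaborate transformations, including lookups through semantic mappings.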
Prerequisites
- You must have the Manage Data Mappings permission.
- You must have the Run Code on This Site permission for the site on which the harmonization is to be run.
- The datasets that you want to harmonize must have been imported into FCP.
- The syntactic mapping you want to apply must exist.
- Any semantic mappings referenced by the syntactic mapping must exist.
Running a Data Harmonization Code Object
- On the Projects page, click the project that contains your data mappings and datasets.
- Select the Code option from the navigation menu on the left-hand side of the page.
- Locate the code object for your syntactic mapping. A code object is automatically generated for each syntactic mapping that is created; it has the same name as the syntactic mapping, and its type is Data Harmonization.
- Select the Run button to the right of the code object to open the Run Settings page.
- Select the Workgroup on which you want to run the code object.
- Select the Input datasets. There should be one for each input data schema. An initial selection is already made for you, but you can change it by opening the dataset dropdown menu and choosing a different dataset. If no datasets match a given source data schema, the text "NO OPTIONS" will appear. In this case, close the dialog, import a matching dataset (from the Datasets option in the left-hand navigation menu), then try again.
- Select the Semantic Mappings, one for each vocabulary or OMOP domain referenced by the transformations in the syntactic mapping. An initial selection is already made for you, but you can change it by opening the semantic mapping dropdown menu and choosing a different semantic mapping. If no semantic mappings match a given target vocabulary or OMOP domain, the text "NO OPTIONS" will appear. In this case, close the dialog, create a matching semantic mapping (from the Data Mappings -> Semantic Mappings page), then try again.
- Click Run to start the code run.
- Click Code Runs in the menu on the left to see the entry for the code run, including its status. Select the status to see a detailed log of the run.
- Once the code has finished running, you will see the status change to Completed if the run completed successfully or Error if there was any error. If the run completed successfully, you will see the number of output datasets under the Output Datasets field. If there was an Error, click on the status to see the detailed logs with a description of the error encountered.
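The input selection rules in the steps above (one dataset per input data schema, one semantic mapping per referenced vocabulary or OMOP domain, and "NO OPTIONS" when nothing matches) can be modeled as a simple pre-run check. The following is a sketch in plain Python under stated assumptions: it does not call FCP or the Rhino Client, and the helper `find_missing_inputs`, the schema names, and the mapping names are hypothetical.

```python
def find_missing_inputs(required: list[str], available: dict[str, list[str]]) -> list[str]:
    """Return the required keys that have no matching candidate ("NO OPTIONS")."""
    return [key for key in required if not available.get(key)]


# Hypothetical inventory: which imported datasets match each source data schema.
required_schemas = ["patients_v1", "visits_v1"]
datasets_by_schema = {"patients_v1": ["Site A patients"], "visits_v1": []}

# Hypothetical inventory: which semantic mappings match each referenced vocabulary.
required_vocabularies = ["SNOMED"]
semantic_mappings_by_vocabulary = {"SNOMED": ["SNOMED mapping v2"]}

missing_datasets = find_missing_inputs(required_schemas, datasets_by_schema)
missing_mappings = find_missing_inputs(required_vocabularies, semantic_mappings_by_vocabulary)

# The run can only be configured once every required input has a candidate.
ready_to_run = not missing_datasets and not missing_mappings
print(missing_datasets, missing_mappings, ready_to_run)
```

Here `visits_v1` has no matching dataset, which corresponds to the "NO OPTIONS" case: a matching dataset would need to be imported before the run can be started.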