The Harmonization Copilot automatically turns Syntactic Mappings, and any associated Semantic Mappings, into reusable ETLs. These ETLs run on your Rhino Client, in keeping with the principles of Federated Computing, and transform datasets from the source data format to the target data format.
Syntactic Mappings can be associated with Code Objects of type Data Harmonization. These behave like other Code Objects in FCP and can be executed in a federated manner on a Rhino Client. The syntactic mapping is applied to the input datasets of such a Code Object: source fields are mapped to target fields, and any configured transformations are applied. The outputs of this process are stored as datasets on the same Rhino Client and can be processed further with the same tools that exist for any dataset on FCP, e.g. using Dataset Analytics to look at distributions and data completeness, or exporting the dataset to an external location within the site's network.
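As a rough illustration of what "applying a syntactic mapping" means conceptually, the sketch below renames source fields to target fields and applies a transformation to each value. It is not FCP's implementation or API: the `apply_syntactic_mapping` function, the field names, and the `gender_codes` lookup (standing in for a semantic mapping) are all hypothetical and exist only for this example.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
# Hypothetical shape: target field -> (source field, transformation of the source value)
SyntacticMapping = Dict[str, Tuple[str, Callable[[Any], Any]]]


def apply_syntactic_mapping(rows: List[Row], mapping: SyntacticMapping) -> List[Row]:
    """Build one target row per source row by renaming and transforming fields."""
    harmonized = []
    for row in rows:
        harmonized.append(
            {
                target: transform(row.get(source))
                for target, (source, transform) in mapping.items()
            }
        )
    return harmonized


# Hypothetical stand-in for a semantic mapping: source codes -> target vocabulary terms.
gender_codes = {"M": "MALE", "F": "FEMALE"}

# Hypothetical syntactic mapping from a source schema to a target schema.
mapping: SyntacticMapping = {
    "person_id": ("patient_id", int),                  # cast the identifier to an integer
    "gender": ("sex", lambda v: gender_codes.get(v)),  # look up the target term
    "birth_year": ("dob", lambda v: int(v[:4])),       # keep only the year of an ISO date
}

source_rows = [{"patient_id": "17", "sex": "F", "dob": "1984-03-02"}]
harmonized_rows = apply_syntactic_mapping(source_rows, mapping)
print(harmonized_rows)
```

In FCP itself, the mapping and its transformations are configured in the product and executed on the Rhino Client; the sketch only mirrors the field-to-field idea described above.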
To run a Data Harmonization Code Object that applies a syntactic mapping transformation to input datasets, complete the following steps.
Prerequisites
- You must have the Manage Data Mappings permission.
- You must have the Run Code on This Site permission for the site on which the harmonization is to be run.
- The datasets that you want to harmonize must have been imported into FCP.
- The syntactic mapping you want to apply must exist.
- Any semantic mappings referenced by the syntactic mapping must exist.
Running a Data Harmonization Code Object
1. From the Projects page, click the project in which you have set up your data mappings and datasets.
2. Click the Code option in the navigation menu on the left-hand side of the screen.
3. A Code Object has been generated automatically for each Syntactic Mapping that you have created. The name of the Code Object is identical to the name of the Syntactic Mapping, and its type is Data Harmonization.
4. Click the Run button to the right of the Code Object. The Run Settings page appears.
5. Select the Workgroup on which you want to run the Code Object.
6. Select the Input datasets, one for each input data schema. An initial selection is already made for you, but you can change it by clicking the dataset dropdown menu and selecting a different dataset. If no datasets match a given source data schema, the text "NO OPTIONS" appears. In this case, close this dialog, import a matching dataset (from the Datasets option in the left-hand navigation menu), and then try again.
7. Select the Semantic Mappings, one for each vocabulary or OMOP domain referenced by the transformations in the syntactic mapping. An initial selection is already made for you, but you can change it by clicking the semantic mapping dropdown menu and selecting a different semantic mapping. If no semantic mappings match a given target vocabulary or OMOP domain, the text "NO OPTIONS" appears. In this case, close this dialog, create a matching semantic mapping (from the Data Mappings -> Semantic Mappings page), and then try again.
8. Click Run.
9. The code begins to run. Click Code Runs in the menu on the left to see the code run entry and its status.
10. Click the status of the code run to see a detailed log from the code run.
11. Once the code has finished running, the status changes to Completed if the run succeeded or to Error if it failed. If the run succeeded, the number of output datasets appears under the "Output Datasets" field. If the status is Error, click the status to see the detailed logs with a description of the error that was encountered.
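Steps 6 and 7 succeed only when every source data schema has a matching dataset and every referenced vocabulary or OMOP domain has a matching semantic mapping. That requirement can be sketched as a simple pre-run check. This is an illustration only: `find_missing_inputs` and all of the names and values below are hypothetical, and none of them are part of FCP or its SDK.

```python
from typing import Dict, List


def find_missing_inputs(
    required_schemas: List[str],
    available_datasets: Dict[str, str],   # hypothetical: dataset name -> data schema name
    required_vocabularies: List[str],
    available_mappings: Dict[str, str],   # hypothetical: semantic mapping name -> vocabulary
) -> List[str]:
    """List every input that would show "NO OPTIONS" on the Run Settings page."""
    problems = []
    for schema in required_schemas:
        if schema not in available_datasets.values():
            problems.append(f"no dataset for source data schema '{schema}'")
    for vocabulary in required_vocabularies:
        if vocabulary not in available_mappings.values():
            problems.append(f"no semantic mapping for '{vocabulary}'")
    return problems


# Hypothetical project state: one of the two required schemas has no dataset yet.
missing = find_missing_inputs(
    required_schemas=["patients_v1", "visits_v1"],
    available_datasets={"site_a_patients": "patients_v1"},
    required_vocabularies=["Gender"],
    available_mappings={"gender_map": "Gender"},
)
print(missing)
```

An empty list corresponds to the case where every dropdown has at least one option; any entry points to a dataset to import or a semantic mapping to create before running again.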