This article explains how to import and export datasets on the Rhino Health Platform to and from the following cloud storage platforms:
- Amazon Web Services (AWS) S3
- Google Cloud Platform's (GCP) Cloud Storage (CS)
- Server Message Block (SMB) network file sharing protocol
Prerequisite
Before you can complete these instructions, you will need to mount the bucket or directory that contains the data you want to access. To do this, follow the steps in Mounting Storage to Your Rhino Client.
Import Datasets
Create a new dataset and point to your cloud storage data using the following paths:
For AWS S3:
/rhino_data/external/s3/my_cloud_storage_folder/YOUR_DATA_PATH_UNDER_BUCKET
For GCP CS:
/rhino_data/external/gcs/my_cloud_storage_folder/YOUR_DATA_PATH_UNDER_BUCKET
For SMB:
/rhino_data/external/smb/my_cloud_storage_folder/YOUR_DATA_PATH_NETWORK_SHARE
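The path convention above can be sketched as a small helper. This is a hypothetical illustration, not part of any Rhino SDK; `my_cloud_storage_folder` is the example mount name from this article, and you would substitute your own mount name and data path.

```python
# Hypothetical helper illustrating the mount-path convention for
# cloud storage mounted to a Rhino Client. Not a Rhino API.

MOUNT_PREFIXES = {
    "s3": "/rhino_data/external/s3",    # AWS S3
    "gcs": "/rhino_data/external/gcs",  # GCP Cloud Storage
    "smb": "/rhino_data/external/smb",  # SMB network share
}

def mount_path(provider: str, mount_name: str, relative_path: str) -> str:
    """Build the path the Rhino Client sees for data under a mounted bucket/share."""
    prefix = MOUNT_PREFIXES[provider]
    return f"{prefix}/{mount_name}/{relative_path.lstrip('/')}"

print(mount_path("s3", "my_cloud_storage_folder", "some_folder/dataset.csv"))
# → /rhino_data/external/s3/my_cloud_storage_folder/some_folder/dataset.csv
```

The same pattern applies to all three providers; only the prefix segment (`s3`, `gcs`, or `smb`) changes.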
The integration is available at the workgroup level: each workgroup can set up its own buckets or network shares, and those buckets or network shares are not accessible to other workgroups.
Example: Importing a file from AWS S3
Suppose you want to import a file that is located in an S3 bucket. Here is how you would do this.
- Select the sprocket icon in the bottom left corner of the Rhino FCP main page, then select Client Mounted Storage from the Settings menu.
- On the Client Mounted Storage page, find the Client Mount Path that contains the file, then click the blue rectangle next to it to copy the path.
- In the Datasets page, on the Overview tab, select the Import New Dataset button located in the top right corner of the page.
- In the Import New Dataset page, enter the following information:
- Name: Name of the dataset.
- Description: Description of the dataset.
- Workgroup: The name of the workgroup that the dataset belongs to.
- Data Schema: The schema for the dataset. Choose an existing schema from the selections offered, or choose to auto-generate a schema from the data.
- DICOM Data Location (if needed): Indicate the import method (DICOM Server or File System).
- DICOM Path (if needed): Indicate the path where the DICOM data is stored.
- File Data Location: Indicate the path of the file data. Paste the Client Mount Path you copied in step 2 of these instructions, and make sure to append the name of the file. For example: /rhino_data/external/s3/my_cloud_storage_folder/some_folder/some_subfolder/dataset.csv.
- Sensitive Data: Indicates whether the dataset contains sensitive data. Note that if you select Auto-generate schema from data in the Data Schema field and also select the Sensitive Data field, you will need to review the generated schema by clicking the Review Schema button. If you instead choose an existing schema (rather than auto-generating one), you can select the Sensitive Data field and then select the Import New Dataset button directly. Rhino FCP behaves this way because it assumes you have already indicated which fields are sensitive in the schema you chose.
- Select Import New Dataset.
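Before submitting the form, it can help to sanity-check the File Data Location value. The sketch below is a hypothetical validator, not part of the Rhino platform; it only encodes the two rules from the steps above: the path must live under the external mount root, and it must end with a file name.

```python
import os

# Root under which all mounted cloud storage appears on the Rhino Client.
EXTERNAL_ROOT = "/rhino_data/external/"

def check_file_data_location(path: str) -> list:
    """Return a list of problems with a candidate File Data Location
    (empty list means the path looks plausible)."""
    problems = []
    if not path.startswith(EXTERNAL_ROOT):
        problems.append(f"path does not start with {EXTERNAL_ROOT}")
    if not os.path.basename(path):
        problems.append("path ends with a directory, not a file name")
    return problems
```

For example, `check_file_data_location("/home/user/dataset.csv")` would flag the path because it does not point under the mounted storage root.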
Export Datasets
To export an existing dataset, follow the steps described in Exporting a Dataset. The Rhino integration with your network storage must be configured with `Is read only` = `False` so that your Rhino Client can save the exported files to your network storage. (If you are not sure whether `Is read only` = `False` in your configuration, please contact the Rhino support team.)
Datasets will be exported to the file storage path set in the integration. For the AWS import example above, datasets would be exported to the S3 bucket mounted at /rhino_data/external/s3/my_cloud_storage_folder/.
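A quick local check that the mount is usable for export can be sketched as follows. This is a hypothetical snippet, not a Rhino tool: it only verifies that the mounted path exists and is writable from the client's perspective, which would fail if the integration were mounted read-only.

```python
import os

def is_export_ready(mount_path: str) -> bool:
    """Return True if the mounted storage path exists and is writable,
    i.e. a read-only mount would make this return False."""
    return os.path.isdir(mount_path) and os.access(mount_path, os.W_OK)
```

For example, `is_export_ready("/rhino_data/external/s3/my_cloud_storage_folder")` should return True on a correctly configured, writable mount.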