This article explains the data processing workflow on the Rhino FCP, from preparing code inputs to using outputs. Users interact with the Rhino Orchestrator, which sends commands to the Rhino Client to import/export datasets or run code, so that data always remains within your network.
Workflow
Step 1: Prepare Your Code Inputs
Make Data Accessible to the Rhino Client
Users with the right security credentials can make data accessible to the Rhino Client in multiple ways. Once accessible, the data is automatically referenced in internal storage. See the following articles for more information:
- Client Mounted Storage: How to access data in AWS S3, GCP CS or the SMB network files
- How can I import data in my local environment onto my Rhino FCP client using SFTP?
- Using SQL to Extract Metrics and Import Data From a Database
Upload External Files Needed for Your Code to Run
Upload files that DO NOT contain sensitive information (e.g., foundation model weights) to your workgroup's cloud storage (S3 bucket on Rhino Orchestrator). Reference them in your code using /external_data/the_folder_in_s3_you_want_to_upload_to/filename. See How to use external files at run time? for more information.
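As a minimal sketch of how code might resolve an uploaded file at run time, the Python helper below builds a path under /external_data/ and fails early if the file is missing. The helper name `external_file` and its `root` parameter are hypothetical and exist only for illustration; they are not part of any Rhino SDK.

```python
from pathlib import Path

# Mount point for external files, as described in this article.
EXTERNAL_DATA_ROOT = Path("/external_data")


def external_file(folder: str, filename: str, root: Path = EXTERNAL_DATA_ROOT) -> Path:
    """Return the path to an uploaded external file (hypothetical helper).

    `folder` is the folder in your workgroup's S3 bucket that you uploaded to.
    `root` is overridable only so the helper can be tried outside a container.
    """
    path = root / folder / filename
    if not path.is_file():
        # Fail early with a clear message instead of partway through a run.
        raise FileNotFoundError(f"External file not found: {path}")
    return path
```

Inside a container, a call such as `external_file("my_folder", "weights.bin")` would resolve to /external_data/my_folder/weights.bin; both names here are placeholders.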
Create Datasets
Datasets are created by importing data that is accessible to the Rhino Client. They can be created using the Rhino FCP UI or SDK, and can include tabular data, DICOM data, or generic file data. See Creating a New Dataset or Dataset Version for more information.
Step 2: Define and Run Code Objects
Prepare Your Code
Write code to compute on input datasets, reading from /input/... and writing to /output/... within the container. Data from the Rhino Client Internal Storage is mounted to containers. The graphic below shows in blue how data is stored within the Rhino Client and in green how your code can access that storage:
Inputs for dataset number "i" should be referenced as follows:
- Tabular Data will be mounted at /input/{i}/dataset.csv
- File Data will be mounted at /input/{i}/file_data/
- DICOM Data will be mounted at /input/{i}/dicom_data/
Similarly, outputs for dataset number "i" should be written as follows:
- Tabular Data should be written to /output/{i}/dataset.csv
- File Data should be written to /output/{i}/file_data/
- DICOM Data should be written to /output/{i}/dicom_data/
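The file data mount points listed above can be illustrated with a short Python sketch that copies every file from /input/{i}/file_data/ to /output/{i}/file_data/, preserving subfolders. The function name `copy_file_data` and the overridable root parameters are assumptions made for illustration; a real run would replace the copy with actual processing.

```python
import shutil
from pathlib import Path
from typing import List


def copy_file_data(i: int,
                   input_root: Path = Path("/input"),
                   output_root: Path = Path("/output")) -> List[str]:
    """Copy file data for dataset number i from input to output (illustrative)."""
    src = input_root / str(i) / "file_data"
    dst = output_root / str(i) / "file_data"
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.rglob("*")):
        if path.is_file():
            relative = path.relative_to(src)
            target = dst / relative
            # Recreate any subfolders before copying the file itself.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            copied.append(relative.as_posix())
    return copied
```

The same pattern would apply to DICOM data by swapping `file_data` for `dicom_data` in both paths.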
Code can read parameters from /input/run_params.json
External files can be referenced from /external_data/...
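Putting the tabular paths and the parameters file together, the following Python sketch reads /input/{i}/dataset.csv, applies a trivial transformation, and writes /output/{i}/dataset.csv. It assumes that run_params.json holds a JSON object and may be absent when no parameters are supplied; the `run_label` parameter, the added column, and the function names are hypothetical and chosen only for illustration.

```python
import csv
import json
from pathlib import Path


def load_run_params(input_root: Path = Path("/input")) -> dict:
    """Read run parameters, returning {} if the file is absent (assumption)."""
    params_file = input_root / "run_params.json"
    if not params_file.is_file():
        return {}
    with params_file.open() as f:
        return json.load(f)


def transform_dataset(i: int,
                      input_root: Path = Path("/input"),
                      output_root: Path = Path("/output")) -> int:
    """Copy tabular dataset i to the output, adding a 'run_label' column.

    Returns the number of data rows written.
    """
    params = load_run_params(input_root)
    label = params.get("run_label", "processed")  # hypothetical parameter

    in_csv = input_root / str(i) / "dataset.csv"
    out_csv = output_root / str(i) / "dataset.csv"
    out_csv.parent.mkdir(parents=True, exist_ok=True)

    rows = 0
    with in_csv.open(newline="") as src, out_csv.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or []) + ["run_label"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            row["run_label"] = label
            writer.writerow(row)
            rows += 1
    return rows
```

Within the container, the script would call `transform_dataset(i)` for each input dataset number it expects. Keeping the roots as parameters also lets the same function be tested locally against temporary folders before it is packaged.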
Create and Run Your Code on FCP
Your prepared code runs on FCP in containers. FCP packages containers into Code Objects and runs them on the Rhino Client(s) through Code Object Runs.
Create a Code Object
On FCP, you can create Code Objects from your prepared code or from a container that includes that code. Code Objects are the computational building blocks on FCP for data transformation, federated training, or third-party software. In addition to packaging your code so it can run on the Rhino Client(s), Code Objects allow you to define the type and number of inputs and outputs to and from your code. FCP supports multiple types of Code Objects to accommodate your code requirements: Python Auto-Containers, Generalized Compute for other languages, and Federated Learning with NV Flare. See What is a Code Object? for more information.
Run Your Code Object
Code Object Runs allow you to run your prepared code on the Rhino Client. They let you select the input datasets you want to run your code on, along with any parameters and external files. When you run your Code Object, the following happens:
- For Python Auto-Containers, a container is created on FCP with your prepared code.
- Your container is packaged so that data is accessible, including parameters and external files.
- The Rhino FCP sends the container to the Rhino Client.
- The Rhino Client runs the code on the input data and generates the output data.
- Output datasets are created from the output data (the data written to /output).
- FCP either auto-generates output data schemas, or validates the output data with predefined output data schema(s).
See What is a Code Run? for more information.
Step 3: Use Your Code Outputs
Once your code outputs are stored as datasets, they can be further used on FCP. You can:
- Export Datasets to the Rhino Client:
  - They will be available in /rhino_data/. See Exporting a Dataset, or if you have network storage, see Importing to and Exporting Datasets from Your Network Storage.
  - Access exported data via SFTP or cloud/external storage.
- Keep working on your data by:
  - Performing Analytics on Federated Datasets.
  - Running additional code.