This article explains how to use external files in a code run. Your workgroup has its own cloud storage (an S3 bucket on Rhino Orchestrator) where you can upload non-sensitive data that might be needed during a code run, e.g., open-source LLM weights.
Important Sensitive Data Instructions
The cloud storage (shown in the Workgroup code and model artifacts storage section, as described in the instructions that follow) is owned by Rhino. Please DO NOT upload sensitive information such as proprietary datasets. For sensitive data, follow the instructions in Creating a New Dataset or Dataset Version instead.
Overview
For your code to access files while running in your agent, complete these four steps:
- Find the bucket to upload your files
- Upload files to your designated bucket
- Reference your files in your code
- Use your files in a specific run
Find the bucket to upload your files
The bucket is created at the client's request as part of the onboarding process into the FCP. You can find it in your settings page under Containers & Artifacts, as follows:
- In your project, select the settings button (it looks like a sprocket) at the bottom left side of the page.
- In the Settings menu, which is near the top of the screen on the left side of the window, select Containers & Artifacts.
- Your storage bucket and bucket prefix appear in the Workgroup code and model artifacts storage part of the page.
Upload files to your designated bucket
AWS offers many alternative ways to upload files to an S3 bucket. We've also provided a script in our user_resource repository you can use: upload-file-to-s3.sh.
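As one of those alternatives, a folder can be uploaded from Python with the boto3 library. The following is a minimal sketch, not the method used by the provided script: the function names `plan_uploads` and `upload_folder` are hypothetical, and the key layout (bucket prefix, then your path in the bucket, then the file's relative path) is an assumption that should be checked against the values shown on your Containers & Artifacts page.

```python
from pathlib import Path


def plan_uploads(local_folder, bucket_prefix, path_in_bucket):
    """Pair every file under local_folder with the S3 key it would be uploaded to."""
    root = Path(local_folder)
    plan = []
    for file_path in sorted(root.rglob("*")):
        if not file_path.is_file():
            continue  # skip directories
        relative = file_path.relative_to(root).as_posix()
        # Assumed key layout: <bucket_prefix>/<path_in_bucket>/<relative path>
        parts = [p.strip("/") for p in (bucket_prefix, path_in_bucket, relative)]
        key = "/".join(p for p in parts if p)
        plan.append((file_path, key))
    return plan


def upload_folder(local_folder, storage_bucket, bucket_prefix, path_in_bucket):
    """Upload every planned file; credentials come from the standard AWS credential chain."""
    import boto3  # third-party package, imported here so planning works without it

    s3 = boto3.client("s3")
    for file_path, key in plan_uploads(local_folder, bucket_prefix, path_in_bucket):
        s3.upload_file(str(file_path), storage_bucket, key)
```

Calling `plan_uploads` first and printing the result is a cheap way to confirm the destination keys before any data is transferred.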
To use the upload-file-to-s3.sh script, first define your S3 credentials, then call it like this:
./upload-file-to-s3.sh the_folder_to_upload storage_bucket bucket_prefix your_path_in_bucket
For example, if you wanted to upload the files in a local folder called "local_folder" and store them in a folder in FCP called "my_files", the command would be:
./upload-file-to-s3.sh ./local_folder external-files-rhino-health-dev my_files
Reference your files in your code
When creating your code object, you can reference your external files using the following path:
/external_data/the_folder_in_s3_you_want_to_upload_to/filename
For the example above, assuming you uploaded a `model_params.txt` file under `my_files`, the file would be accessible at runtime at:
/external_data/my_files/model_params.txt
An example of Python code that reads your file data into a `text` variable would look as follows:
from pathlib import Path
text = Path('/external_data/my_files/model_params.txt').read_text()
Use your files in a specific run
Once your code points to your external files and the code object is created, you can run the code and specify the files you want to use in that run. In the UI, you can simply select them from a dropdown.
Alternatively, you can use the SDK. See this notebook on using external files via the SDK for more details.
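Because files are chosen per run, it can help for the code to fail with a clear message when an expected file was not selected. The sketch below assumes that an unselected file is simply absent under `/external_data` at runtime, which this article does not confirm; `resolve_external_file` is a hypothetical helper name, not part of the SDK.

```python
from pathlib import Path

# Mount point described in this article; everything else here is illustrative.
EXTERNAL_DATA_ROOT = Path("/external_data")


def resolve_external_file(relative_path, root=EXTERNAL_DATA_ROOT):
    """Return the path of an external file, or raise a clear error if it is missing."""
    candidate = Path(root) / relative_path
    if not candidate.is_file():
        # Assumption: a file that was not selected for the run does not exist here.
        raise FileNotFoundError(
            f"Expected external file at {candidate}; "
            "check that it was uploaded and selected for this run."
        )
    return candidate


# Example use inside your code object:
#   text = resolve_external_file("my_files/model_params.txt").read_text()
```

The `root` parameter exists only so the helper can be exercised locally against a temporary folder before the code object is built.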