Calculating Federated Analytics (Part 1) – Rhino Federated Computing

This guide is a tutorial on configuring and retrieving federated metrics using the Rhino SDK.

About this Tutorial

Part 1 of this tutorial addresses the following:

Retrieving a Metric
Basic Metrics (Sum, Count, Mean, Standard Deviation)
Two-by-Two Tables
Odds and Risk Metrics
Prevalence and Incidence
Statistical Tests: Chi-Square, T-Test, One Way ANOVA, Wilcoxon

Part 2 of this tutorial addresses the following:

Correlation Coefficients
Kaplan-Meier
Cox-Proportional Hazard
Regressions

Note: Part 2 of this tutorial is available in Calculating Federated Analytics (Part 2).

Prerequisites

Before you start implementing metrics, make sure you can log in to the Rhino FCP via the SDK using your Rhino account:

import rhino_health as rh 
import getpass 

# Provide your Rhino username 
my_username = "your_username@example.com" 

# Log in to Rhino 
session = rh.login(username=my_username, password=getpass.getpass())

Retrieving a Metric

The Rhino FCP lets you quickly and securely retrieve metrics in a federated way across multiple sites. Metric retrieval always takes two steps:

1. Configuring the metric parameters using the metric object.

2. Making an API call to the Endpoint (EP).

The Metric Object

The metric configuration object is a crucial component, serving as a blueprint for metric retrieval. It allows you to specify the metric variables, grouping preferences, and data filters.

For example, let's take a look at the Mean metric, for the "Height" column:

from rhino_health.lib.metrics import Count, Mean, StandardDeviation

# Replace 'dataset_id_1', 'dataset_id_2' with actual Dataset IDs
dataset_uids = ["dataset_id_1", "dataset_id_2"]

# Create the Mean config
mean_config = Mean(
    variable="Height",
    group_by={"groupings": ["Gender"]},
    data_filters=[
        {"filter_column": "Weight", "filter_type": ">", "filter_value": 50},
        {"filter_column": "Weight", "filter_type": "<", "filter_value": 80},
    ],
)

# Make API call for Mean calculation
response_mean = session.project.aggregate_dataset_metric(dataset_uids, mean_config)

The Mean object is initialized with the variable of interest, that is, the column label in the Dataset on which the metric is calculated.

- Group By: The group_by parameter allows you to organize metrics based on specific categorical variables, providing segmentation. In this example, the mean is grouped by the Gender column in the data.

- Data filters: The data_filters parameter lets you refine your analysis by restricting the calculation to rows that meet certain conditions. In the example above, the mean is calculated only on rows whose Weight value is greater than 50 and less than 80.

After the Metric object is set, one can use the session.project.aggregate_dataset_metric endpoint to retrieve the metric from the chosen Datasets.
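
To make the effect of data_filters and group_by concrete, here is a small local sketch in plain Python that mimics the configuration above on a handful of made-up rows. It does not call the Rhino SDK; the rows are hypothetical, and combining the two filters with AND is an assumption based on the "range" described above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for one site's Dataset.
rows = [
    {"Gender": "F", "Height": 160.0, "Weight": 55},
    {"Gender": "F", "Height": 170.0, "Weight": 65},
    {"Gender": "F", "Height": 180.0, "Weight": 90},  # excluded: Weight is not < 80
    {"Gender": "M", "Height": 175.0, "Weight": 70},
    {"Gender": "M", "Height": 165.0, "Weight": 45},  # excluded: Weight is not > 50
]

# Apply both filters (assumed to combine with AND): 50 < Weight < 80.
filtered = [row for row in rows if 50 < row["Weight"] < 80]

# Group the remaining Height values by Gender, then average each group.
heights_by_gender = defaultdict(list)
for row in filtered:
    heights_by_gender[row["Gender"]].append(row["Height"])

mean_height_by_gender = {
    gender: mean(heights) for gender, heights in heights_by_gender.items()
}
print(mean_height_by_gender)
```

Only the three rows inside the Weight range contribute, and each Gender group gets its own mean.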

The Metric Response Object

When retrieving a metric, all results are returned in a MetricResponse object (or one of its derivatives). MetricResponse is a structured Python object with two main attributes: output, which holds the outcome values (such as statistical measures), and metric_configuration_dict, which records the metric type, applied filters, and relevant variables.

For example, printing the results object for the ChiSquare metric will result in the following:

MetricResponse(output={
    'chi_square': {
        'statistic': 2.2,
        'p_value': 0.809, 
        'dof': 2
    }
}, 
metric_configuration_dict={
    'metric': 'chi_square', 
    'arguments': '{
        "data_filters": [], 
        "count_variable_name": "variable", 
        "variable": "id", 
        "variable_1": "Zb", 
        "variable_2": "E"
     }'
})

The metric results will always be under the output attribute, under the metric name key (in this case chi_square). The metric response values are then stored under the value name (e.g. p_value in the example above). The initial metric configuration used to generate this output can be found under the metric_configuration_dict attribute.
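
The lookup described above can be sketched as follows. Because this snippet is meant to run without a Rhino session, it uses a hypothetical stand-in class (FakeMetricResponse) that only mirrors the two attributes shown in the printed example. Treating 'arguments' as a JSON-encoded string is an assumption based on how it is printed.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class FakeMetricResponse:
    """Hypothetical stand-in mirroring the two attributes printed above."""
    output: Dict[str, Any]
    metric_configuration_dict: Dict[str, Any]


response = FakeMetricResponse(
    output={"chi_square": {"statistic": 2.2, "p_value": 0.809, "dof": 2}},
    metric_configuration_dict={
        "metric": "chi_square",
        # Assumption: the arguments are stored as a JSON string.
        "arguments": json.dumps(
            {"data_filters": [], "variable": "id", "variable_1": "Zb", "variable_2": "E"}
        ),
    },
)

# The metric name is the key under which the values are stored in `output`.
metric_name = response.metric_configuration_dict["metric"]
p_value = response.output[metric_name]["p_value"]

# Decode the configuration arguments to inspect what was requested.
arguments = json.loads(response.metric_configuration_dict["arguments"])
print(metric_name, p_value, arguments["variable_1"], arguments["variable_2"])
```

With a real response, the same two attribute names (output and metric_configuration_dict) are the ones described in this section.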

Basic Metrics: Sum, Count, Mean, Standard Deviation

To calculate basic numeric metrics such as count, mean, sum, and standard deviation, use the following syntax to create the metric configuration and retrieve the results:

from rhino_health.lib.metrics import Count, Mean, StandardDeviation

# Replace 'dataset_id_1', 'dataset_id_2' with actual Dataset IDs
dataset_uids = ["dataset_id_1", "dataset_id_2"]

# Calculate Mean
mean_config = Mean(variable="Height") 
response_mean = session.project.aggregate_dataset_metric(dataset_uids, mean_config)

# Calculate Standard Deviation
stddev_config = StandardDeviation(variable="Height") # Replace with actual variable name
response_stddev = session.project.aggregate_dataset_metric(dataset_uids, stddev_config)

# Calculate Count
count_config = Count(variable="id") 
response_count = session.project.aggregate_dataset_metric(dataset_uids, count_config)
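
For intuition about how such metrics can be combined across sites without sharing row-level data, here is a minimal sketch in plain Python. It is an illustration only, not a description of how the Rhino FCP aggregates internally: each hypothetical site reports just its count, sum, and sum of squares, and the sample standard deviation (n - 1 denominator) is an assumed convention.

```python
import math
from statistics import stdev

# Hypothetical per-site values; in a federated setting these never leave the site.
site_values = {
    "site_a": [1.0, 2.0, 3.0],
    "site_b": [4.0, 5.0, 6.0, 7.0],
}


def summarize(values):
    """Return the only numbers a site would need to share."""
    return {
        "count": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }


summaries = [summarize(values) for values in site_values.values()]

# Combine the per-site summaries into global statistics.
n = sum(s["count"] for s in summaries)
total = sum(s["sum"] for s in summaries)
total_sq = sum(s["sum_sq"] for s in summaries)

pooled_mean = total / n
pooled_std = math.sqrt((total_sq - n * pooled_mean**2) / (n - 1))

# Sanity check against the same statistic computed on the pooled raw data.
all_values = [v for values in site_values.values() for v in values]
print(pooled_mean, pooled_std, stdev(all_values))
```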

Custom aggregation

If you prefer to use your own aggregation method instead of the built-in aggregation provided by our system, you can define a custom approach to combine metric results from multiple sites. In this case, the metric will first be computed independently at each site (e.g., mean, sum, etc.), and you can then determine how to aggregate these individual results.

To implement custom aggregation, create a callable object that defines your aggregation logic and pass it to the metric object.

The method signature should be:

method(metric_name: str, metric_results: List[Dict[str, Any]], **kwargs)

where the metric_results are each of the site's results for the metric, and the method should return a dict with the structure of: {metric_name: <aggregated_value>}.

For example, to override the built-in weighted average and use an unweighted average instead, define a method like this:

from typing import Any, Dict, List

import numpy as np

def non_weighted_average(metric: str, metric_results: List[Dict[str, Any]], **kwargs) -> Dict[str, Any]:
    # Extract the metric value from each site's result (missing values default to 0)
    metric_values = [metric_result.get(metric, 0) for metric_result in metric_results]
    # Compute the non-weighted average across sites
    return {metric: np.mean(metric_values)}

And pass it to the metric objects like this:

mean_config = Mean(variable="Height", custom_aggregation_method=non_weighted_average) 
response_mean = session.project.aggregate_dataset_metric(dataset_uids, mean_config)
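
Before passing a custom aggregation method to a metric, it can help to call it directly on made-up per-site results. The sketch below contrasts an unweighted and a weighted average in plain Python. The shape of the per-site dicts, in particular the "count" key, is hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def unweighted_average(
    metric: str, metric_results: List[Dict[str, Any]], **kwargs
) -> Dict[str, Any]:
    # Every site contributes equally, regardless of how many rows it holds.
    values = [result[metric] for result in metric_results if metric in result]
    return {metric: sum(values) / len(values)}


def weighted_average(
    metric: str, metric_results: List[Dict[str, Any]], count_key: str = "count", **kwargs
) -> Dict[str, Any]:
    # Each site's value is weighted by its row count (hypothetical "count" key).
    total = sum(result[count_key] for result in metric_results)
    weighted_sum = sum(result[metric] * result[count_key] for result in metric_results)
    return {metric: weighted_sum / total}


# Hypothetical per-site results: a small site and a site three times larger.
site_results = [
    {"mean": 170.0, "count": 100},
    {"mean": 180.0, "count": 300},
]

unweighted = unweighted_average("mean", site_results)
weighted = weighted_average("mean", site_results)
print(unweighted, weighted)
```

The two results differ whenever site sizes differ, which is the main thing to consider when choosing between the two approaches.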

Two-by-Two Table

The TwoByTwoTable metric facilitates the creation of a two-by-two contingency table, enabling you to analyze the relationships between variables. Here is an example for generating a two by two table metric for the columns exposed and detected in the data:

from rhino_health.lib.metrics.epidemiology.two_by_two_table_based_metrics import TwoByTwoTable 

# Calculate TBTT 
tbtt_config = TwoByTwoTable(variable="detected", detected_column_name="detected", exposed_column_name="exposed") 
table_result = session.project.aggregate_dataset_metric(dataset_uids, tbtt_config)

The table results are also stored in a response object, which can be converted into a pandas DataFrame to view the results as a table:

import pandas as pd

# Display the result as a DataFrame 
pd.DataFrame(table_result.as_table())

Odds Metric

The Odds metric calculates the odds of an event occurring rather than not occurring, and can be generated like so:

from rhino_health.lib.metrics import Odds

# Calculate Odds
odds_config = Odds(variable="SeriesUID", column_name="Pneumonia")
odds_results = session.project.aggregate_dataset_metric(dataset_uids, odds_config)

Odds Ratio Metric

The OddsRatio metric is used to calculate the odds ratio between two binary variables and can be generated as follows:

from rhino_health.lib.metrics.epidemiology.two_by_two_table_based_metrics import OddsRatio

odds_ratio_config = OddsRatio(
    variable="id",
    exposed_column_name="Smoking",
    detected_column_name="Pneumonia",
)

odds_ratio_results = session.project.aggregate_dataset_metric(dataset_uids, odds_ratio_config)
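
As a reference point for interpreting the result, the textbook odds ratio can be computed by hand from the four cells of a two-by-two table. The counts below are made up, and this plain-Python sketch shows the standard definition rather than the SDK's internal computation.

```python
# Hypothetical two-by-two counts (exposed = Smoking, detected = Pneumonia).
exposed_detected = 30
exposed_not_detected = 70
unexposed_detected = 10
unexposed_not_detected = 90

# Odds of the outcome within each exposure group.
odds_exposed = exposed_detected / exposed_not_detected
odds_unexposed = unexposed_detected / unexposed_not_detected

# Odds ratio: how much higher the odds are in the exposed group.
odds_ratio = odds_exposed / odds_unexposed
print(odds_exposed, odds_unexposed, odds_ratio)
```

An odds ratio above 1 indicates higher odds of the outcome among the exposed group; a value of 1 indicates no difference.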

Risk Metric

The Risk metric calculates the ratio between the true outcome and the total population with respect to detected and exposed columns.

from rhino_health.lib.metrics.base_metric import Risk, DataFilter

risk_config = Risk(
    variable="id",
    exposed_column_name="Smoking",
    detected_column_name="Pneumonia",
)

risk_results = session.project.aggregate_dataset_metric(dataset_uids, risk_config)
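
For comparison, the textbook risk in each exposure group is the number of detected cases divided by the size of that group. The sketch below uses made-up counts and plain Python; it illustrates the standard definition and is not guaranteed to match the exact values the SDK returns.

```python
# Hypothetical two-by-two counts (exposed = Smoking, detected = Pneumonia).
exposed_detected = 30
exposed_not_detected = 70
unexposed_detected = 10
unexposed_not_detected = 90

# Risk: detected cases divided by everyone in the same exposure group.
risk_exposed = exposed_detected / (exposed_detected + exposed_not_detected)
risk_unexposed = unexposed_detected / (unexposed_detected + unexposed_not_detected)

# Overall risk across the whole population.
total = (
    exposed_detected + exposed_not_detected + unexposed_detected + unexposed_not_detected
)
overall_risk = (exposed_detected + unexposed_detected) / total
print(risk_exposed, risk_unexposed, overall_risk)
```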

Prevalence and Incidence

The Prevalence metric calculates the proportion of individuals who have or develop a specific condition over a specified time range, whereas the Incidence describes the occurrence of new cases over a period of time. In this example, the prevalence and incidence of pneumonia are calculated within the given specific time range. Note that the column representing the time data should contain time in UTC format.

from rhino_health.lib.metrics import Prevalence, Incidence

prevalence_config = Prevalence(
    variable="id",
    time_column_name="Time Pneumonia",
    detected_column_name="Pneumonia",
    start_time="2023-02-02T07:07:48Z",
    end_time="2023-06-10T11:24:43Z",
)

prevalence_results = session.project.aggregate_dataset_metric(dataset_uids, prevalence_config)

incidence_config = Incidence(
    variable="id",
    time_column_name="Time Pneumonia",
    detected_column_name="Pneumonia",
    start_time="2023-02-02T07:07:48Z",
    end_time="2023-06-10T11:24:43Z",
)

incidence_results = session.project.aggregate_dataset_metric(dataset_uids, incidence_config)
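
The role of start_time and end_time can be illustrated locally with the standard library. This sketch parses UTC timestamps in the format used above and counts detected cases whose timestamp falls inside the window. The records are made up, treating both boundaries as inclusive is an assumption, and the final proportion is a simplified illustration rather than the SDK's exact prevalence or incidence definition.

```python
from datetime import datetime, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as 2023-02-02T07:07:48Z as a UTC datetime."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


start = parse_utc("2023-02-02T07:07:48Z")
end = parse_utc("2023-06-10T11:24:43Z")

# Hypothetical records using the column names from the example above.
records = [
    {"id": 1, "Pneumonia": True, "Time Pneumonia": "2023-03-01T00:00:00Z"},   # inside
    {"id": 2, "Pneumonia": True, "Time Pneumonia": "2023-01-15T00:00:00Z"},   # before start
    {"id": 3, "Pneumonia": False, "Time Pneumonia": None},                     # not detected
    {"id": 4, "Pneumonia": True, "Time Pneumonia": "2023-06-10T11:24:43Z"},   # on end boundary
]

# Keep detected cases whose timestamp lies within [start, end] (inclusive, assumed).
cases_in_window = [
    record
    for record in records
    if record["Pneumonia"] and start <= parse_utc(record["Time Pneumonia"]) <= end
]

proportion_in_window = len(cases_in_window) / len(records)
print([record["id"] for record in cases_in_window], proportion_in_window)
```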

Statistical Tests: Chi-Square, T-Test, One Way ANOVA, Wilcoxon

Chi-Square Test

The Chi-Square test is employed to assess the independence between two categorical variables. In this example, we examine the association between the occurrence of pneumonia and gender across different Datasets. The result includes the Chi-Square statistic, p-value, and degrees of freedom.

from rhino_health.lib.metrics.statistics_tests import ChiSquare 

chi_square_config = ChiSquare(variable="id", variable_1="Pneumonia", variable_2="Gender") 

result = session.project.aggregate_dataset_metric(dataset_uids, chi_square_config)
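
To see where the statistic and degrees of freedom come from, here is a plain-Python sketch of the standard Pearson Chi-Square computation on a made-up contingency table. It omits the p-value (which needs the Chi-Square distribution) and applies no continuity correction; whether the SDK applies one is not stated here.

```python
from typing import List, Tuple


def chi_square_statistic(table: List[List[float]]) -> Tuple[float, int]:
    """Return the Pearson Chi-Square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = Pneumonia (yes/no), columns = Gender (F/M).
observed_table = [
    [10, 20],
    [20, 10],
]
statistic, dof = chi_square_statistic(observed_table)
print(statistic, dof)
```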

T-Test

The T-test determines whether there is a significant difference between the means of two groups. The implementation uses Welch's t-test, which does not assume equal variances. The result includes the T-test statistic, p-value, and degrees of freedom.

from rhino_health.lib.metrics.statistics_tests import TTest

t_test_config = TTest(numeric_variable="Height", categorical_variable="Gender")

t_test_result = session.project.aggregate_dataset_metric(dataset_uids, t_test_config)
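
For reference, the standard Welch statistic and its Welch-Satterthwaite degrees of freedom can be computed with the standard library as sketched below. The two samples are made up, and the p-value is omitted because it requires the t distribution.

```python
import math
from statistics import mean, variance
from typing import List, Tuple


def welch_t_test(a: List[float], b: List[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group's mean (sample variance / group size).
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)

    t_statistic = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (se_a**2 / (len(a) - 1) + se_b**2 / (len(b) - 1))
    return t_statistic, dof


# Hypothetical Height values for two Gender groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]

t_statistic, dof = welch_t_test(group_a, group_b)
print(t_statistic, dof)
```

Note that the degrees of freedom are generally not an integer, unlike in the equal-variance t-test.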

One-Way ANOVA

The One-Way ANOVA (Analysis of Variance) is applied to assess whether there are any statistically significant differences between the means of three or more independent groups. In this example, we examine the relationship between inflammation level and height. The result will contain the following calculated values: ANOVA statistic value, p-value, DFC, DFE, DFT, MSC, MSE, SSC, SSE, and SST.

from rhino_health.lib.metrics.statistics_tests import OneWayANOVA 

anova_config = OneWayANOVA(variable="id", numeric_variable="Height", categorical_variable="Inflammation Level") 

anova_result = session.project.aggregate_dataset_metric(dataset_uids, anova_config)
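
The abbreviations in the result can be related to the classic ANOVA table. The sketch below assumes that C refers to the between-group ("column") component, E to the within-group error, and T to the total; this reading of the names is an assumption, and the groups are made up.

```python
from statistics import mean
from typing import Dict, List


def one_way_anova(groups: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute the classic one-way ANOVA table entries for the given groups."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)

    # Between-group and within-group sums of squares.
    ssc = sum(len(vs) * (mean(vs) - grand_mean) ** 2 for vs in groups.values())
    sse = sum((v - mean(vs)) ** 2 for vs in groups.values() for v in vs)

    dfc = len(groups) - 1
    dfe = len(all_values) - len(groups)
    msc = ssc / dfc
    mse = sse / dfe
    return {
        "SSC": ssc, "SSE": sse, "SST": ssc + sse,
        "DFC": dfc, "DFE": dfe, "DFT": dfc + dfe,
        "MSC": msc, "MSE": mse, "F": msc / mse,
    }


# Hypothetical Height values grouped by Inflammation Level.
anova_table = one_way_anova({
    "low": [1.0, 2.0, 3.0],
    "medium": [2.0, 3.0, 4.0],
    "high": [5.0, 6.0, 7.0],
})
print(anova_table)
```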

Wilcoxon signed-rank test

The Wilcoxon signed-rank test is a non-parametric rank test for statistical hypothesis testing, used either to test the location of a population based on a sample of data or to compare the locations of two populations using two matched samples.

Our implementation always runs the calculation on a single column of values. To compare two columns, one must first add a column with the differences between the two columns (see example below). Additionally, one must add a column with the absolute values of the column to be used for the calculation (also shown in the example below).

Regarding handling of zeros and ties: For zeros, we use the “signed-rank zero procedure” where the sign for zero values is set to zero. For ties, we use the common “average rank” method where equal values are assigned identical “ranks”, which are equal to the average of the ranks before and after them.
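
The zero and tie handling described above can be sketched in plain Python as follows. Zeros are kept when ranking the absolute values but get a sign of zero, and tied values share the average of the ranks they occupy. The differences are made up, and the sketch reports the signed-rank sum together with the positive and negative rank sums, since which of these the SDK returns is not assumed here.

```python
from typing import List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Sorted positions pos..end hold 1-based ranks pos+1..end+1; average them.
        average = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average
        pos = end + 1
    return ranks


def sign(x: float) -> int:
    """Return -1, 0, or 1; zero differences get sign 0."""
    return (x > 0) - (x < 0)


# Hypothetical paired differences (e.g. MaxWeight - Weight), with a zero and a tie.
differences = [0.0, 1.0, -1.0, 2.0, -3.0]
ranks = average_ranks([abs(d) for d in differences])

signed_rank_sum = sum(sign(d) * r for d, r in zip(differences, ranks))
positive_rank_sum = sum(r for d, r in zip(differences, ranks) if d > 0)
negative_rank_sum = sum(r for d, r in zip(differences, ranks) if d < 0)
print(ranks, signed_rank_sum, positive_rank_sum, negative_rank_sum)
```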

The result of the calculation is the value according to the following formula:

The following example shows calculating the Wilcoxon signed-rank test to examine the relationship between weight and maximum weight in a dataset:

from rhino_health.lib.metrics.statistics_tests import Wilcoxon 

# Add columns with the differences and their absolute values.
datasets = [session.dataset.get_dataset(uid) for uid in dataset_uids]
datasets_with_diffs_columns = []
for dataset in datasets:
    output_datasets, _run_result = dataset.run_code(
        "df['WeightDiff'] = df.MaxWeight - df.Weight\n" +
        "df['WeightDiffAbs'] = df.WeightDiff.abs()"
    )
    datasets_with_diffs_columns.append(output_datasets[0])

# Calculate the test statistic.
new_dataset_uids = [dataset.uid for dataset in datasets_with_diffs_columns]
wilcoxon_config = Wilcoxon(variable="WeightDiff", abs_values_variable="WeightDiffAbs")
wilcoxon_result = session.project.aggregate_dataset_metric(new_dataset_uids, wilcoxon_config)

Next Steps

This tutorial is continued in Calculating Federated Analytics (Part 2). It includes the following types of examples:

Correlation Coefficients
Kaplan-Meier
Cox-Proportional Hazard
Regressions
