metrics package
Submodules
metrics.visualization_utils module
Helper to visualize metrics.
metrics.visualization_utils.get_longest_flops_path(sys_metrics)
Prints the largest amount of FLOPS required to complete training.
To calculate this metric, we:
1. For each round, pick the client that required the largest amount of local training.
2. Sum the FLOPS from the clients picked in step 1 across rounds.
TODO: This metric would make more sense with seconds instead of FLOPS.
Parameters: sys_metrics – pd.DataFrame as written by writer.py.
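The two steps above can be sketched in plain Python. This is only an illustration of the calculation, not the module's implementation: the `(round_number, client_id, flops)` tuple layout and the name `longest_flops_path` are assumptions, and the real function reads a pd.DataFrame whose column names are not given here.

```python
from collections import defaultdict


def longest_flops_path(rows):
    """Sum, over rounds, the largest per-client FLOPS count of each round.

    `rows` is an iterable of (round_number, client_id, flops) tuples.
    This layout is hypothetical and chosen only for the example.
    """
    max_per_round = defaultdict(int)
    for round_number, _client_id, flops in rows:
        # Step 1: keep the most expensive client seen so far in this round.
        max_per_round[round_number] = max(max_per_round[round_number], flops)
    # Step 2: add up the per-round maxima across all rounds.
    return sum(max_per_round.values())


rows = [
    (0, "a", 100), (0, "b", 250),  # round 0: client "b" is the slowest
    (1, "a", 300), (1, "c", 120),  # round 1: client "a" is the slowest
]
total = longest_flops_path(rows)  # 250 + 300
```

The result approximates the critical path of synchronous training, where each round lasts as long as its most expensive client.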
metrics.visualization_utils.load_data(stat_metrics_file='stat_metrics.csv', sys_metrics_file='sys_metrics.csv')
Loads the data from the given stat_metric and sys_metric files.
metrics.visualization_utils.plot_accuracy_vs_round_number(stat_metrics, weighted=False, plot_stds=False, figsize=(10, 8), title_fontsize=16, **kwargs)
Plots the clients’ average test accuracy vs. the round number.
Parameters:
- stat_metrics – pd.DataFrame as written by writer.py.
- weighted – Whether the average across clients should be weighted by number of test samples.
- plot_stds – Whether to plot error bars corresponding to the std between users.
- figsize – Size of the plot as specified by plt.figure().
- title_fontsize – Font size for the plot’s title.
- kwargs – Arguments to be passed to _set_plot_properties.
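The effect of the weighted flag can be illustrated with a small sketch. It assumes that the weighted average uses each client's number of test samples as its weight, as the parameter description states. The helper name `mean_accuracy` and the list-based inputs are hypothetical and not part of the module.

```python
def mean_accuracy(accuracies, num_samples=None):
    """Average per-client accuracies for a single round.

    If `num_samples` is given, each accuracy is weighted by the client's
    number of test samples; otherwise every client counts equally.
    """
    if num_samples is None:
        return sum(accuracies) / len(accuracies)
    total_samples = sum(num_samples)
    weighted_sum = sum(a * n for a, n in zip(accuracies, num_samples))
    return weighted_sum / total_samples


accs = [0.5, 1.0]   # two clients in one round
samples = [30, 10]  # their test-set sizes

unweighted = mean_accuracy(accs)         # (0.5 + 1.0) / 2
weighted = mean_accuracy(accs, samples)  # (0.5 * 30 + 1.0 * 10) / 40
```

The weighted value is lower here because the client with more test samples has the lower accuracy.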
metrics.visualization_utils.plot_accuracy_vs_round_number_per_client(stat_metrics, sys_metrics, max_num_clients, figsize=(15, 12), title_fontsize=16, max_name_len=10, **kwargs)
Plots the clients’ test accuracy vs. the round number.
Parameters:
- stat_metrics – pd.DataFrame as written by writer.py.
- sys_metrics – pd.DataFrame as written by writer.py. Allows us to know which client actually performed training in each round. If None, no indication is given of when each client was trained.
- max_num_clients – Maximum number of clients to plot.
- figsize – Size of the plot as specified by plt.figure().
- title_fontsize – Font size for the plot’s title.
- max_name_len – Maximum length for a client’s id.
- kwargs – Arguments to be passed to _set_plot_properties.
metrics.visualization_utils.plot_bytes_written_and_read(sys_metrics, rolling_window=10, figsize=(10, 8), title_fontsize=16, **kwargs)
Plots the cumulative sum of the bytes written and read by the server.
Parameters:
- sys_metrics – pd.DataFrame as written by writer.py.
- rolling_window – Number of previous rounds to consider in the cumulative sum.
- figsize – Size of the plot as specified by plt.figure().
- title_fontsize – Font size for the plot’s title.
- kwargs – Arguments to be passed to _set_plot_properties.
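One possible reading of rolling_window is a sum over the most recent rounds only. The sketch below shows that reading in plain Python. It assumes that early rounds with fewer than `window` predecessors are summed over whatever is available, which this page does not confirm. The name `rolling_sums` is hypothetical.

```python
from collections import deque


def rolling_sums(values, window):
    """Return, for each position, the sum of the last `window` values."""
    recent = deque(maxlen=window)  # silently drops the oldest value when full
    sums = []
    for value in values:
        recent.append(value)
        sums.append(sum(recent))
    return sums


bytes_per_round = [10, 20, 30, 40]
smoothed = rolling_sums(bytes_per_round, 2)
```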
metrics.visualization_utils.plot_client_computations_vs_round_number(sys_metrics, aggregate_window=20, max_num_clients=20, figsize=(25, 15), title_fontsize=16, max_name_len=10, range_rounds=None)
Plots the clients’ local computations against round number.
Parameters:
- sys_metrics – pd.DataFrame as written by writer.py.
- aggregate_window – Number of rounds that are aggregated. For example, if set to 20, then rounds 0-19, 20-39, etc. will be added together.
- max_num_clients – Maximum number of clients to plot.
- figsize – Size of the plot as specified by plt.figure().
- title_fontsize – Font size for the plot’s title.
- max_name_len – Maximum length for a client’s id.
- range_rounds – Tuple representing the range of rounds to be plotted. The rounds are subsampled before aggregation. If None, all rounds are considered.
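The interaction between range_rounds and aggregate_window can be sketched as follows: rounds are first filtered to the requested range, then grouped into consecutive windows and summed per client. The tuple layout, the half-open interpretation of range_rounds, and the name `aggregate_by_window` are all assumptions made for this example.

```python
from collections import defaultdict


def aggregate_by_window(rows, aggregate_window, range_rounds=None):
    """Sum each client's computations over windows of consecutive rounds.

    `rows` holds (round_number, client_id, computations) tuples, a
    hypothetical layout. `range_rounds` is treated as (start, end) with
    `end` excluded, which is an assumption.
    """
    totals = defaultdict(int)
    for round_number, client_id, computations in rows:
        if range_rounds is not None:
            start, end = range_rounds
            if not start <= round_number < end:
                continue  # subsample the rounds before aggregating
        window_index = round_number // aggregate_window
        totals[(client_id, window_index)] += computations
    return dict(totals)


rows = [(0, "a", 5), (19, "a", 7), (20, "a", 1), (21, "b", 2)]
all_rounds = aggregate_by_window(rows, 20)
first_window = aggregate_by_window(rows, 20, range_rounds=(0, 20))
```

With a window of 20, rounds 0 and 19 fall into window 0 and rounds 20 and 21 into window 1.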
metrics.writer module
Writes the given metrics to a CSV file.
metrics.writer.get_metrics_names(metrics)
Gets the names of the metrics.
Parameters: metrics – Dict keyed by client id. Each element is a dict of metrics for that client in the specified round. The dicts for all clients are expected to have the same set of keys.
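Because every client's dict is expected to share the same keys, the metric names can be read from any single client. The sketch below takes them from the first client. This is an assumption about how such a helper could work, not the module's code, and `metric_names` is a hypothetical name.

```python
def metric_names(metrics):
    """Return the metric names found in a dict keyed by client id."""
    if not metrics:
        return []
    # All clients are expected to share one key set, so the first is enough.
    first_client_metrics = next(iter(metrics.values()))
    return list(first_client_metrics.keys())


metrics = {
    "client_1": {"accuracy": 0.5, "loss": 1.2},
    "client_2": {"accuracy": 0.7, "loss": 0.9},
}
names = metric_names(metrics)
```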
metrics.writer.print_dataframe(df, path, mode='w')
Writes the given dataframe to path as a CSV file.
metrics.writer.print_metrics(round_number, client_ids, metrics, hierarchies, num_samples, path)
Prints or appends the given metrics in a csv.
The resulting dataframe is of the form:
client_id, round_number, hierarchy, num_samples, metric1, metric2
twebbstack, 0, , 18, 0.5, 0.89
Parameters:
- round_number – Number of the round the metrics correspond to. If 0, then the file in path is overwritten. If not 0, we append to that file.
- client_ids – Ids of the clients. Not all ids must be in the following dicts.
- metrics – Dict keyed by client id. Each element is a dict of metrics for that client in the specified round. The dicts for all clients are expected to have the same set of keys.
- hierarchies – Dict keyed by client id. Each element is a list of hierarchies to which the client belongs.
- num_samples – Dict keyed by client id. Each element is the number of test samples for the client.
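The overwrite-on-round-0 and append-otherwise behaviour can be sketched with the standard csv module. This is an illustration under stated assumptions, not the module's implementation, which works with a dataframe. The name `write_round`, the choice to write the header only when overwriting, and joining hierarchies with commas are all hypothetical.

```python
import csv
import os
import tempfile


def write_round(round_number, client_ids, metrics, hierarchies, num_samples, path):
    """Write one round of per-client metrics, overwriting only on round 0."""
    # Metric names are taken from the first client (same keys expected for all).
    names = list(next(iter(metrics.values()), {}))
    mode = "w" if round_number == 0 else "a"
    with open(path, mode, newline="") as f:
        writer = csv.writer(f)
        if mode == "w":
            # Assumption: the header is written only when starting a new file.
            writer.writerow(
                ["client_id", "round_number", "hierarchy", "num_samples"] + names
            )
        for client_id in client_ids:
            # Not every id must appear in the dicts, so fall back to blanks.
            client_metrics = metrics.get(client_id, {})
            writer.writerow(
                [
                    client_id,
                    round_number,
                    ",".join(hierarchies.get(client_id, [])),
                    num_samples.get(client_id, ""),
                ]
                + [client_metrics.get(name, "") for name in names]
            )


path = os.path.join(tempfile.mkdtemp(), "stat_metrics.csv")
write_round(0, ["twebbstack"], {"twebbstack": {"metric1": 0.5, "metric2": 0.89}},
            {}, {"twebbstack": 18}, path)
write_round(1, ["twebbstack"], {"twebbstack": {"metric1": 0.6, "metric2": 0.9}},
            {}, {"twebbstack": 18}, path)

with open(path, newline="") as f:
    rows = list(csv.reader(f))
```

After both calls the file holds one header row followed by one row per client and round, matching the form shown above. The round 1 values are made up for the example.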