metrics package

Submodules

metrics.visualization_utils module

Helper to visualize metrics.

metrics.visualization_utils.get_longest_flops_path(sys_metrics)[source]

Prints the largest amount of FLOPS required to complete training.

To calculate this metric, we:
  1. For each round, pick the client that required the largest amount
    of local training.
  2. Sum the FLOPS from the clients picked in step 1 across rounds.
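
The two steps above can be sketched with pandas as follows. This is a minimal illustration, not the module's implementation: the column names round_number and local_computations and the helper name longest_flops_path are assumptions, and the sketch returns the total instead of printing it.

```python
import pandas as pd

# Toy stand-in for the DataFrame written by writer.py.
# The column names are assumptions made for this sketch.
sys_metrics = pd.DataFrame({
    "client_id": ["a", "b", "a", "b"],
    "round_number": [1, 1, 2, 2],
    "local_computations": [100, 250, 300, 50],
})


def longest_flops_path(df):
    """Hypothetical helper: sum of the per-round maximum local computations."""
    # Step 1: for each round, keep the largest amount of local training.
    per_round_max = df.groupby("round_number")["local_computations"].max()
    # Step 2: sum those per-round maxima across rounds.
    return int(per_round_max.sum())


total = longest_flops_path(sys_metrics)
print(total)
```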

TODO: This metric would make more sense with seconds instead of FLOPS.

Parameters: sys_metrics – pd.DataFrame as written by writer.py.
metrics.visualization_utils.load_data(stat_metrics_file='stat_metrics.csv', sys_metrics_file='sys_metrics.csv')[source]

Loads the data from the CSV files given by stat_metrics_file and sys_metrics_file.

metrics.visualization_utils.plot_accuracy_vs_round_number(stat_metrics, weighted=False, plot_stds=False, figsize=(10, 8), title_fontsize=16, **kwargs)[source]

Plots the clients’ average test accuracy vs. the round number.

Parameters:
  • stat_metrics – pd.DataFrame as written by writer.py.
  • weighted – Whether the average across clients should be weighted by number of test samples.
  • plot_stds – Whether to plot error bars showing the standard deviation across clients.
  • figsize – Size of the plot as specified by plt.figure().
  • title_fontsize – Font size for the plot’s title.
  • kwargs – Arguments to be passed to _set_plot_properties.
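
To illustrate what the weighted flag changes, the sketch below computes the per-round average accuracy both ways without plotting anything. The column names (round_number, num_samples, accuracy) and the helper name mean_accuracy_per_round are assumptions for this example, not part of the documented API.

```python
import pandas as pd

# Toy stand-in for stat_metrics; column names are assumptions.
stat_metrics = pd.DataFrame({
    "client_id": ["a", "b", "a", "b"],
    "round_number": [0, 0, 1, 1],
    "num_samples": [10, 30, 10, 30],
    "accuracy": [0.5, 0.9, 0.6, 1.0],
})


def mean_accuracy_per_round(df, weighted=False):
    """Hypothetical helper: average accuracy across clients for each round."""
    if not weighted:
        # Every client counts equally.
        return df.groupby("round_number")["accuracy"].mean()
    # Each client's accuracy is weighted by its number of test samples.
    weighted_sum = (df["accuracy"] * df["num_samples"]).groupby(df["round_number"]).sum()
    total_samples = df.groupby("round_number")["num_samples"].sum()
    return weighted_sum / total_samples


unweighted = mean_accuracy_per_round(stat_metrics)
weighted = mean_accuracy_per_round(stat_metrics, weighted=True)
```

With these toy numbers the client with 30 test samples pulls the weighted average above the unweighted one in both rounds.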
metrics.visualization_utils.plot_accuracy_vs_round_number_per_client(stat_metrics, sys_metrics, max_num_clients, figsize=(15, 12), title_fontsize=16, max_name_len=10, **kwargs)[source]

Plots the clients’ test accuracy vs. the round number.

Parameters:
  • stat_metrics – pd.DataFrame as written by writer.py.
  • sys_metrics – pd.DataFrame as written by writer.py. Allows us to know which client actually performed training in each round. If None, no indication is given of when each client was trained.
  • max_num_clients – Maximum number of clients to plot.
  • figsize – Size of the plot as specified by plt.figure().
  • title_fontsize – Font size for the plot’s title.
  • max_name_len – Maximum length for a client’s id.
  • kwargs – Arguments to be passed to _set_plot_properties.
metrics.visualization_utils.plot_bytes_written_and_read(sys_metrics, rolling_window=10, figsize=(10, 8), title_fontsize=16, **kwargs)[source]

Plots the cumulative sum of the bytes written and read by the server.

Parameters:
  • sys_metrics – pd.DataFrame as written by writer.py.
  • rolling_window – Number of previous rounds to consider in the cumulative sum.
  • figsize – Size of the plot as specified by plt.figure().
  • title_fontsize – Font size for the plot’s title.
  • kwargs – Arguments to be passed to _set_plot_properties.
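
One plausible reading of rolling_window is a rolling sum over per-round byte totals. The sketch below follows that reading. The column names bytes_written and bytes_read, the helper name rolling_bytes, and the choice of min_periods=1 are all assumptions made for illustration.

```python
import pandas as pd

# Toy stand-in for sys_metrics; column names are assumptions.
sys_metrics = pd.DataFrame({
    "round_number": [1, 1, 2, 3, 4],
    "bytes_written": [10, 20, 5, 5, 40],
    "bytes_read": [1, 2, 3, 4, 5],
})


def rolling_bytes(df, rolling_window=10):
    """Hypothetical helper: rolling sum of bytes over the last rolling_window rounds."""
    # Total bytes per round, summed over all clients in that round.
    per_round = df.groupby("round_number")[["bytes_written", "bytes_read"]].sum()
    # Sum over the current round and the preceding rounds inside the window.
    return per_round.rolling(rolling_window, min_periods=1).sum()


result = rolling_bytes(sys_metrics, rolling_window=2)
```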
metrics.visualization_utils.plot_client_computations_vs_round_number(sys_metrics, aggregate_window=20, max_num_clients=20, figsize=(25, 15), title_fontsize=16, max_name_len=10, range_rounds=None)[source]

Plots the clients’ local computations against round number.

Parameters:
  • sys_metrics – pd.DataFrame as written by writer.py.
  • aggregate_window – Number of rounds that are aggregated. For example, if set to 20, then rounds 0-19, 20-39, etc. will be added together.
  • max_num_clients – Maximum number of clients to plot.
  • figsize – Size of the plot as specified by plt.figure().
  • title_fontsize – Font size for the plot’s title.
  • max_name_len – Maximum length for a client’s id.
  • range_rounds – Tuple representing the range of rounds to be plotted. The rounds are subsampled before aggregation. If None, all rounds are considered.
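
The interaction between aggregate_window and range_rounds can be sketched as a filter followed by integer-division bucketing. The column names, the helper name aggregate_computations, and the half-open interpretation of range_rounds are assumptions for this example.

```python
import pandas as pd

# Toy stand-in for sys_metrics; column names are assumptions.
sys_metrics = pd.DataFrame({
    "client_id": ["a", "a", "a", "b", "b"],
    "round_number": [0, 19, 20, 5, 39],
    "local_computations": [1, 2, 4, 8, 16],
})


def aggregate_computations(df, aggregate_window=20, range_rounds=None):
    """Hypothetical helper: per-client computations summed per window of rounds."""
    if range_rounds is not None:
        # Subsample the rounds first (assumed half-open: start <= round < end).
        start, end = range_rounds
        df = df[(df["round_number"] >= start) & (df["round_number"] < end)]
    # Rounds 0-19 fall in window 0, rounds 20-39 in window 1, and so on.
    window_index = df["round_number"] // aggregate_window
    return df.groupby(["client_id", window_index])["local_computations"].sum()


totals = aggregate_computations(sys_metrics)
first_window_only = aggregate_computations(sys_metrics, range_rounds=(0, 20))
```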

metrics.writer module

Writes the given metrics to a CSV file.

metrics.writer.get_metrics_names(metrics)[source]

Gets the names of the metrics.

Parameters: metrics – Dict keyed by client id. Each element is a dict of metrics for that client in the specified round. The dicts for all clients are expected to have the same set of keys.
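
Because every client's dict is expected to share the same keys, the names can be read from any single client. The sketch below shows that idea; the helper name metric_names and the empty-input behavior are assumptions, not the module's code.

```python
def metric_names(metrics):
    """Hypothetical helper: metric names taken from the first client's dict."""
    if not metrics:
        # Assumed behavior for an empty input.
        return []
    # All clients are expected to share the same keys, so any one will do.
    first_client_metrics = next(iter(metrics.values()))
    return list(first_client_metrics.keys())


metrics = {
    "client_1": {"accuracy": 0.5, "loss": 1.2},
    "client_2": {"accuracy": 0.89, "loss": 0.4},
}
names = metric_names(metrics)
print(names)
```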
metrics.writer.print_dataframe(df, path, mode='w')[source]

Writes the given dataframe to path as a CSV file.

metrics.writer.print_metrics(round_number, client_ids, metrics, hierarchies, num_samples, path)[source]

Writes or appends the given metrics to a CSV file.

The resulting dataframe is of the form:
client_id, round_number, hierarchy, num_samples, metric1, metric2
twebbstack, 0, , 18, 0.5, 0.89
Parameters:
  • round_number – Number of the round the metrics correspond to. If 0, then the file in path is overwritten. If not 0, we append to that file.
  • client_ids – Ids of the clients. Not every id needs to appear in the dicts below.
  • metrics – Dict keyed by client id. Each element is a dict of metrics for that client in the specified round. The dicts for all clients are expected to have the same set of keys.
  • hierarchies – Dict keyed by client id. Each element is a list of hierarchies to which the client belongs.
  • num_samples – Dict keyed by client id. Each element is the number of test samples for the client.
  • path – Path of the CSV file to write to.
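
The overwrite-or-append behavior and the row layout described above can be sketched with the standard csv module. This is an illustration under assumptions, not the module's implementation: the helper name write_metrics, writing the header only when round_number is 0, joining hierarchies with "|", and leaving missing values empty are all choices made for this example.

```python
import csv
import os
import tempfile


def write_metrics(round_number, client_ids, metrics, hierarchies, num_samples, path):
    """Hypothetical helper: one CSV row per client, overwriting only at round 0."""
    metric_names = sorted({name for m in metrics.values() for name in m})
    overwrite = round_number == 0
    with open(path, "w" if overwrite else "a", newline="") as f:
        writer = csv.writer(f)
        if overwrite:
            # Assumption: the header is written once, when the file is created.
            writer.writerow(
                ["client_id", "round_number", "hierarchy", "num_samples"] + metric_names
            )
        for cid in client_ids:
            # Ids missing from a dict produce empty cells.
            client_metrics = metrics.get(cid, {})
            writer.writerow(
                [cid, round_number, "|".join(hierarchies.get(cid, [])), num_samples.get(cid, "")]
                + [client_metrics.get(name, "") for name in metric_names]
            )


path = os.path.join(tempfile.mkdtemp(), "stat_metrics.csv")
write_metrics(0, ["twebbstack"], {"twebbstack": {"metric1": 0.5, "metric2": 0.89}},
              {}, {"twebbstack": 18}, path)
write_metrics(1, ["twebbstack"], {"twebbstack": {"metric1": 0.6, "metric2": 0.9}},
              {}, {"twebbstack": 18}, path)

with open(path, newline="") as f:
    rows = list(csv.reader(f))
```

Calling the helper again with round_number 0 would discard both rows and start the file over.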

Module contents