utils package
Submodules
utils.constants module
utils.language_utils module
Utils for language models.
utils.language_utils.bag_of_words(line, vocab)
Returns the bag-of-words representation of the given phrase using the given vocab.
Parameters:
- line – string representing the phrase to be parsed
- vocab – dictionary with words as keys and indices as values
Returns: integer list
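As a rough illustration of what such a function might do, here is a minimal sketch. It is not the package's implementation: the name bag_of_words_sketch is hypothetical, and it assumes whitespace tokenization, a count per vocabulary entry, contiguous vocab indices starting at 0, and that words missing from vocab are ignored.

```python
def bag_of_words_sketch(line, vocab):
    # Assumption: vocab indices are 0..len(vocab)-1, so the result has
    # one integer slot per vocabulary word.
    bag = [0] * len(vocab)
    # Assumption: plain whitespace tokenization; the documented function
    # may split the phrase differently.
    for word in line.split():
        if word in vocab:  # assumption: out-of-vocab words are skipped
            bag[vocab[word]] += 1
    return bag


vocab = {"the": 0, "cat": 1, "sat": 2}
print(bag_of_words_sketch("the cat saw the dog", vocab))  # [2, 1, 0]
```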
utils.language_utils.line_to_indices(line, indd, max_words=25)
Converts the given phrase into a list of word indices.
If the phrase has more than max_words words, returns a list containing the indices of the first max_words words. If the phrase has fewer than max_words words, repeatedly appends the integer representing the unknown index to the returned list until its length is max_words.
Parameters:
- line – string representing a phrase/sequence of words
- indd – dictionary with string words as keys and int indices as values
- max_words – maximum number of word indices in the returned list
Returns: indl, a list of word indices, one index for each word in the phrase
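The truncate-or-pad behaviour described above could be sketched as follows. The name line_to_indices_sketch is hypothetical, and the sketch makes three assumptions the documentation does not state: the unknown index is len(indd), words missing from indd also map to that index, and the phrase is split on whitespace.

```python
def line_to_indices_sketch(line, indd, max_words=25):
    # Assumption: the "unknown" index is one past the largest known index.
    unk_id = len(indd)
    # Assumption: whitespace tokenization; keep at most max_words words.
    words = line.split()[:max_words]
    # Assumption: out-of-vocabulary words map to the unknown index too.
    indl = [indd.get(word, unk_id) for word in words]
    # Pad short phrases with the unknown index up to max_words entries.
    indl += [unk_id] * (max_words - len(indl))
    return indl


indd = {"hello": 0, "world": 1}
print(line_to_indices_sketch("hello big world", indd, max_words=5))    # [0, 2, 1, 2, 2]
print(line_to_indices_sketch("hello world hello", indd, max_words=2))  # [0, 1]
```

Either way, the returned list always has exactly max_words entries, which is convenient for fixed-length model inputs.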
utils.language_utils.split_line(line)
Splits the given line/phrase into a list of words.
Parameters:
- line – string representing the phrase to be split
Returns: list of strings, with each string representing a word
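One plausible way to split a phrase into words is a regular expression. The sketch below is an assumption about the tokenization rule, not the documented one: the hypothetical split_line_sketch keeps runs of word characters and apostrophes together and treats a few punctuation marks as separate tokens.

```python
import re


def split_line_sketch(line):
    # Assumption: a "word" is a run of letters, digits, underscores or
    # apostrophes; the punctuation marks . , ! ? ; become their own tokens.
    return re.findall(r"[\w']+|[.,!?;]", line)


print(split_line_sketch("Hello, world! It's fine."))
# ['Hello', ',', 'world', '!', "It's", 'fine', '.']
```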
utils.model_utils module
utils.model_utils.batch_data(data, batch_size)
data is a dict of the form {'x': [list], 'y': [list]}. Returns x and y, each of which is a list of sub-lists of length batch_size.
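A minimal sketch of this kind of batching is shown below. The name batch_data_sketch is hypothetical, and the sketch assumes no shuffling and a shorter final batch when the data length is not a multiple of batch_size; the documented function may handle either point differently.

```python
def batch_data_sketch(data, batch_size):
    xs, ys = data["x"], data["y"]
    # Slice both lists at the same offsets so that x and y stay aligned.
    # Assumption: the last batch is simply shorter if len(xs) is not a
    # multiple of batch_size.
    batched_x = [xs[i:i + batch_size] for i in range(0, len(xs), batch_size)]
    batched_y = [ys[i:i + batch_size] for i in range(0, len(ys), batch_size)]
    return batched_x, batched_y


data = {"x": [1, 2, 3, 4, 5], "y": [0, 1, 0, 1, 0]}
x, y = batch_data_sketch(data, batch_size=2)
print(x)  # [[1, 2], [3, 4], [5]]
print(y)  # [[0, 1], [0, 1], [0]]
```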
utils.model_utils.read_data(train_data_dir, test_data_dir)
Parses the data in the given train and test data directories.
Assumes:
- the data in the input directories are .json files with keys 'users' and 'user_data'
- the set of train set users is the same as the set of test set users
Returns:
- clients – list of client ids
- groups – list of group ids; empty list if none found
- train_data – dictionary of train data
- test_data – dictionary of test data
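The parsing described above might look roughly like the following sketch. The helper names read_dir_sketch and read_data_sketch are hypothetical, and the 'hierarchies' key used for group ids is an assumption: the documentation names only the 'users' and 'user_data' keys.

```python
import json
import os


def read_dir_sketch(data_dir):
    clients, groups, data = [], [], {}
    for name in sorted(os.listdir(data_dir)):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(data_dir, name)) as f:
            cdata = json.load(f)
        clients.extend(cdata["users"])
        # Assumption: group ids, if present, live under a 'hierarchies' key.
        groups.extend(cdata.get("hierarchies", []))
        data.update(cdata["user_data"])
    return clients, groups, data


def read_data_sketch(train_data_dir, test_data_dir):
    clients, groups, train_data = read_dir_sketch(train_data_dir)
    test_clients, _, test_data = read_dir_sketch(test_data_dir)
    # The documented assumption: train and test cover the same users.
    if set(clients) != set(test_clients):
        raise ValueError("train and test users differ")
    return clients, groups, train_data, test_data
```

Calling read_data_sketch on two directories of such .json files yields the four values in the order listed under Returns.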
utils.tf_utils module
utils.tf_utils.graph_size(graph)
Returns the size of the given graph in bytes.
The size of the graph is calculated by summing the sizes of its trainable variables. The size of each variable is the number of bytes in its dtype multiplied by its number of elements, which is given by its shape attribute.
Parameters:
- graph – TF graph
Returns: integer representing the size of the graph (in bytes)
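The arithmetic behind this calculation can be illustrated without TensorFlow. In the sketch below, the hypothetical graph_size_sketch takes (shape, bytes per element) pairs as stand-ins for a graph's trainable variables; it shows only the size computation, not how the variables are read from a TF graph.

```python
import math


def graph_size_sketch(variables):
    # variables: iterable of (shape, dtype_size_in_bytes) pairs, a
    # hypothetical stand-in for the trainable variables of a TF graph.
    total = 0
    for shape, dtype_size in variables:
        # Number of elements is the product of the shape's dimensions;
        # math.prod(()) is 1, so scalars count as one element.
        total += math.prod(shape) * dtype_size
    return total


# Example: a 784x10 float32 weight matrix plus a 10-element float32 bias.
variables = [((784, 10), 4), ((10,), 4)]
print(graph_size_sketch(variables))  # 31400
```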