utils package

Submodules

utils.constants module

utils.language_utils module

Utils for language models.

utils.language_utils.bag_of_words(line, vocab)

Returns the bag-of-words representation of the given phrase using the given vocab.

Parameters:
	line – string representing the phrase to be parsed
	vocab – dictionary with words as keys and indices as values
Returns:	integer list
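
A minimal sketch of how such a bag-of-words vector could be built. It is not the module's actual code: the name bag_of_words_sketch, the regex tokenization, the use of counts rather than 0/1 flags, and the assumption that vocab indices run from 0 to len(vocab) - 1 are all illustrative choices.

```python
import re


def bag_of_words_sketch(line, vocab):
    """Count how often each vocabulary word occurs in `line` (hypothetical helper)."""
    # Assumption: words and basic punctuation marks are separate tokens.
    tokens = re.findall(r"[\w']+|[.,!?;]", line)
    # Assumption: vocab indices are exactly 0 .. len(vocab) - 1.
    bag = [0] * len(vocab)
    for token in tokens:
        if token in vocab:  # out-of-vocabulary tokens are ignored
            bag[vocab[token]] += 1
    return bag


vocab = {"the": 0, "cat": 1, "sat": 2}
print(bag_of_words_sketch("the cat saw the dog", vocab))  # [2, 1, 0]
```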

utils.language_utils.get_word_emb_arr(path)

utils.language_utils.letter_to_vec(letter)

Returns the one-hot representation of the given letter.

utils.language_utils.line_to_indices(line, indd, max_words=25)

Converts the given phrase into a list of word indices.

If the phrase has more than max_words words, returns a list containing the indices of the first max_words words. If the phrase has fewer than max_words words, repeatedly appends the integer representing the unknown index to the returned list until the list’s length is max_words.

Parameters:
	line – string representing a phrase/sequence of words
	indd – dictionary with string words as keys and int indices as values
	max_words – maximum number of word indices in the returned list
Returns:	indl – list of word indices, one index for each word in the phrase, truncated or padded to length max_words
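
The truncate-or-pad behaviour described for line_to_indices can be sketched as follows. This is an illustration only: the name line_to_indices_sketch, the whitespace tokenization, and the choice of len(indd) as the unknown index are assumptions, not details confirmed by this page.

```python
def line_to_indices_sketch(line, indd, max_words=25):
    """Map a phrase to exactly `max_words` word indices (hypothetical helper)."""
    # Assumption: the unknown index is one past the largest vocabulary index.
    unk_id = len(indd)
    # Assumption: words are separated by whitespace; keep at most max_words.
    words = line.split()[:max_words]
    indl = [indd.get(word, unk_id) for word in words]
    # Pad short phrases with the unknown index up to max_words entries.
    indl += [unk_id] * (max_words - len(indl))
    return indl


indd = {"hello": 0, "world": 1}
print(line_to_indices_sketch("hello there world", indd, max_words=5))  # [0, 2, 1, 2, 2]
print(line_to_indices_sketch("hello world hello", indd, max_words=2))  # [0, 1]
```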

utils.language_utils.split_line(line)

Splits the given line/phrase into a list of words.

Parameters:	line – string representing phrase to be split
Returns:	list of strings, with each string representing a word

utils.language_utils.val_to_vec(size, val)

Converts a target value into a one-hot vector.

Parameters:
	size – size of the vector
	val – integer in range [0, size), i.e. a valid index into a vector of length size
Returns:	vec – one-hot vector with a 1 at index val
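
As an illustration of the one-hot conversion, here is a small sketch. The name val_to_vec_sketch and the explicit range check are assumptions; the check reflects that a vector of length size only has indices 0 through size - 1.

```python
def val_to_vec_sketch(size, val):
    """Return a list of `size` zeros with a single 1 at position `val`."""
    if not 0 <= val < size:
        raise ValueError("val must be a valid index into a vector of length size")
    vec = [0] * size
    vec[val] = 1
    return vec


print(val_to_vec_sketch(4, 2))  # [0, 0, 1, 0]
```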

utils.language_utils.word_to_indices(word)

Returns a list of character indices.

Parameters:	word – string
Returns:	indices – int list with length len(word)
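
A sketch of the character-to-index mapping, under the assumption that each character is looked up in a fixed alphabet string. The alphabet used here (ALPHABET) and the name word_to_indices_sketch are hypothetical; the module's real character set is not shown on this page.

```python
import string

# Hypothetical alphabet; the real module may use a different character set.
ALPHABET = string.ascii_lowercase + " "


def word_to_indices_sketch(word):
    """Return one alphabet position per character of `word`."""
    # str.find returns -1 for characters that are not in ALPHABET.
    return [ALPHABET.find(char) for char in word]


print(word_to_indices_sketch("cab"))  # [2, 0, 1]
```

The result always has len(word) entries, matching the documented return value.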

utils.model_utils module

utils.model_utils.batch_data(data, batch_size)

data is a dict := {‘x’: [list], ‘y’: [list]}

Returns x, y, which are both lists of size-batch_size lists.
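
The batching described for batch_data might look like the sketch below. The name batch_data_sketch is hypothetical, and two behaviours are assumptions: the data is not shuffled, and the final batch is shorter when the number of samples is not a multiple of batch_size.

```python
def batch_data_sketch(data, batch_size):
    """Split data['x'] and data['y'] into parallel lists of batches."""
    xs, ys = data["x"], data["y"]
    # Slice both lists with the same boundaries so x and y batches stay aligned.
    batched_x = [xs[i:i + batch_size] for i in range(0, len(xs), batch_size)]
    batched_y = [ys[i:i + batch_size] for i in range(0, len(ys), batch_size)]
    return batched_x, batched_y


data = {"x": [1, 2, 3, 4, 5], "y": [0, 1, 0, 1, 0]}
batched_x, batched_y = batch_data_sketch(data, 2)
print(batched_x)  # [[1, 2], [3, 4], [5]]
print(batched_y)  # [[0, 1], [0, 1], [0]]
```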

utils.model_utils.read_data(train_data_dir, test_data_dir)

Parses the data in the given train and test data directories.

Assumes:
- the data in the input directories are .json files with keys ‘users’ and ‘user_data’
- the set of train set users is the same as the set of test set users

Returns:
	clients – list of client ids
	groups – list of group ids; empty list if none found
	train_data – dictionary of train data
	test_data – dictionary of test data
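
A rough sketch of how directories of such .json files could be parsed. The helper names read_dir_sketch and read_data_sketch are hypothetical, and the 'hierarchies' key used for group ids is an assumption; this page only confirms the 'users' and 'user_data' keys.

```python
import json
import os


def read_dir_sketch(data_dir):
    """Merge every .json file in data_dir into (clients, groups, data)."""
    clients, groups, data = [], [], {}
    for name in sorted(os.listdir(data_dir)):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(data_dir, name)) as f:
            cdata = json.load(f)
        clients.extend(cdata["users"])
        # Assumption: group ids, if present, live under a 'hierarchies' key.
        groups.extend(cdata.get("hierarchies", []))
        data.update(cdata["user_data"])
    return clients, groups, data


def read_data_sketch(train_data_dir, test_data_dir):
    """Return (clients, groups, train_data, test_data) for the two directories."""
    train_clients, train_groups, train_data = read_dir_sketch(train_data_dir)
    test_clients, _, test_data = read_dir_sketch(test_data_dir)
    # The documented precondition: train and test cover the same users.
    assert set(train_clients) == set(test_clients)
    return train_clients, train_groups, train_data, test_data
```

Checking the shared-users precondition explicitly makes a mismatched train/test split fail early instead of surfacing later as a missing key.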

utils.tf_utils module

utils.tf_utils.graph_size(graph)

Returns the size of the given graph in bytes.

The size of the graph is calculated by summing the sizes of its trainable variables. The size of each variable is the number of bytes in its dtype multiplied by its number of elements, as given by its shape attribute.

Parameters:	graph – TF graph
Returns:	integer representing size of graph (in bytes)
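
The size arithmetic can be illustrated without TensorFlow. In the sketch below, each trainable variable is represented by a hypothetical (dtype name, shape) pair and DTYPE_BYTES is an assumed lookup table; a real implementation would read these values from the graph's variables instead.

```python
from math import prod

# Assumed bytes-per-element table for a few common dtypes.
DTYPE_BYTES = {"float32": 4, "float64": 8, "int32": 4}


def graph_size_sketch(variables):
    """Sum bytes-per-element times element count over (dtype, shape) pairs."""
    total = 0
    for dtype_name, shape in variables:
        # prod(()) is 1, so a scalar variable counts as a single element.
        total += DTYPE_BYTES[dtype_name] * prod(shape)
    return total


# A 784x10 float32 weight matrix plus a length-10 float32 bias vector.
variables = [("float32", (784, 10)), ("float32", (10,))]
print(graph_size_sketch(variables))  # 31400
```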