tf.lookup.TextFileInitializer
View source on GitHub |
Table initializers from a text file.
tf.lookup.TextFileInitializer( filename, key_dtype, key_index, value_dtype, value_index, vocab_size=None, delimiter='\t', name=None )
This initializer assigns one entry in the table for each line in the file.
The key and value type of the table to initialize is given by key_dtype
and value_dtype
.
The key and value content to get from each line is specified by the key_index
and value_index
.
-
TextFileIndex.LINE_NUMBER
means use the line number starting from zero, expects data type int64. -
TextFileIndex.WHOLE_LINE
means use the whole line content, expects data type string. - A value
>=0
means use the index (starting at zero) of the split line based ondelimiter
.
For example if we have a file with the following content:
import tempfile f = tempfile.NamedTemporaryFile(delete=False) content='\n'.join(["emerson 10", "lake 20", "palmer 30",]) f.file.write(content.encode('utf-8')) f.file.close()
The following snippet initializes a table with the first column as keys and second column as values:
emerson -> 10
lake -> 20
palmer -> 30
init= tf.lookup.TextFileInitializer( filename=f.name, key_dtype=tf.string, key_index=0, value_dtype=tf.int64, value_index=1, delimiter=" ") table = tf.lookup.StaticHashTable(init, default_value=-1) table.lookup(tf.constant(['palmer','lake','tarkus'])).numpy()
Similarly to initialize the whole line as keys and the line number as values.
emerson 10 -> 0
lake 20 -> 1
palmer 30 -> 2
init = tf.lookup.TextFileInitializer( filename=f.name, key_dtype=tf.string, key_index=tf.lookup.TextFileIndex.WHOLE_LINE, value_dtype=tf.int64, value_index=tf.lookup.TextFileIndex.LINE_NUMBER, delimiter=" ") table = tf.lookup.StaticHashTable(init, -1) table.lookup(tf.constant('palmer 30')).numpy() 2
Args | |
---|---|
filename | The filename of the text file to be used for initialization. The path must be accessible from wherever the graph is initialized (eg. trainer or eval workers). The filename may be a scalar Tensor . |
key_dtype | The key data type. |
key_index | the index that represents information of a line to get the table 'key' values from. |
value_dtype | The value data type. |
value_index | the index that represents information of a line to get the table 'value' values from.' |
vocab_size | The number of elements in the file, if known. |
delimiter | The delimiter to separate fields in a line. |
name | A name for the operation (optional). |
Raises | |
---|---|
ValueError | when the filename is empty, or when the table key and value data types do not match the expected data types. |
Attributes | |
---|---|
key_dtype | The expected table key dtype. |
value_dtype | The expected table value dtype. |
Methods
initialize
initialize( table )
Initializes the table from a text file.
Args | |
---|---|
table | The table to be initialized. |
Returns | |
---|---|
The operation that initializes the table. |
Raises | |
---|---|
TypeError | when the keys and values data types do not match the table key and value data types. |
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/lookup/TextFileInitializer