tf.contrib.lookup.index_to_string_table_from_file
Returns a lookup table that maps a Tensor of indices into strings.
    tf.contrib.lookup.index_to_string_table_from_file(
        vocabulary_file,
        vocab_size=None,
        default_value='UNK',
        name=None,
        key_column_index=TextFileIndex.LINE_NUMBER,
        value_column_index=TextFileIndex.WHOLE_LINE,
        delimiter='\t'
    )
This operation constructs a lookup table to map int64 indices into string values. The table is initialized from a vocabulary file specified in vocabulary_file, where the whole line is the value and the zero-based line number is the index.
Any input which does not have a corresponding index in the vocabulary file (an out-of-vocabulary entry) is assigned the default_value.
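The default mapping described above can be modeled in plain Python. This is only a sketch of the lookup semantics, not TensorFlow's implementation: the helper name build_index_to_string and the temporary three-line file are assumptions made for illustration.

```python
import os
import tempfile


def build_index_to_string(vocabulary_file, default_value="UNK"):
    """Hypothetical plain-Python model: line number -> whole line, with a default."""
    with open(vocabulary_file, encoding="utf-8") as f:
        # The zero-based line number is the key; the whole line
        # (without its trailing newline) is the value.
        table = {i: line.rstrip("\n") for i, line in enumerate(f)}
    # Indices missing from the table fall back to default_value.
    return lambda index: table.get(index, default_value)


# Write an assumed three-line vocabulary file to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("emerson\nlake\npalmer\n")
    lookup = build_index_to_string(path, default_value="UNKNOWN")
    results = [lookup(1), lookup(5)]

print(results)  # ['lake', 'UNKNOWN']
```

Index 1 resolves to the second line of the file, while index 5 has no matching line and so receives the default value.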
The underlying table must be initialized by calling session.run(tf.compat.v1.tables_initializer()) or session.run(table.init()) once.
To specify multi-column vocabulary files, use key_column_index, value_column_index, and delimiter.
- TextFileIndex.LINE_NUMBER means use the line number starting from zero, expects data type int64.
- TextFileIndex.WHOLE_LINE means use the whole line content, expects data type string.
- A value >=0 means use the index (starting at zero) of the split line based on delimiter.
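The column-selection rules above can be sketched in plain Python. This is an illustrative model under stated assumptions, not the library's code: the constants LINE_NUMBER and WHOLE_LINE are stand-ins for the TextFileIndex values, and select_field, build_table, and the sample lines are hypothetical.

```python
# Stand-ins for TextFileIndex.LINE_NUMBER and TextFileIndex.WHOLE_LINE.
# Any negative sentinels would do for this sketch.
LINE_NUMBER = -1
WHOLE_LINE = -2


def select_field(line, line_number, column_index, delimiter="\t"):
    """Pick the line number, the whole line, or one delimited column."""
    if column_index == LINE_NUMBER:
        return line_number
    if column_index == WHOLE_LINE:
        return line
    # A value >= 0 selects that zero-based column of the split line.
    return line.split(delimiter)[column_index]


def build_table(lines, key_column_index=LINE_NUMBER,
                value_column_index=WHOLE_LINE, delimiter="\t"):
    """Build an int -> str mapping from already-read vocabulary lines."""
    table = {}
    for n, line in enumerate(lines):
        key = int(select_field(line, n, key_column_index, delimiter))
        value = str(select_field(line, n, value_column_index, delimiter))
        table[key] = value
    return table


# Assumed two-column vocabulary: an integer id, a tab, then a token.
lines = ["10\temerson", "20\tlake", "30\tpalmer"]

default_table = build_table(lines)
column_table = build_table(lines, key_column_index=0, value_column_index=1)

print(default_table[1])   # whole second line, tab included
print(column_table[20])   # lake
```

With the defaults, keys are line numbers and values are whole lines. Passing key_column_index=0 and value_column_index=1 instead reads the key from the first column and the value from the second.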
Sample Usages:
If we have a vocabulary file "test.txt" with the following content:
    emerson
    lake
    palmer
    indices = tf.constant([1, 5], tf.int64)
    table = tf.contrib.lookup.index_to_string_table_from_file(
        vocabulary_file="test.txt", default_value="UNKNOWN")
    values = table.lookup(indices)
    ...
    tf.compat.v1.tables_initializer().run()
    values.eval()  # ==> ["lake", "UNKNOWN"]
Args | |
---|---|
vocabulary_file | The vocabulary filename, may be a constant scalar Tensor. |
vocab_size | Number of elements in the vocabulary, if known. |
default_value | The value to use for out-of-vocabulary indices. |
name | A name for this op (optional). |
key_column_index | The column index from the text file to get the key values from. The default is to use the line number, starting from zero. |
value_column_index | The column index from the text file to get the values from. The default is to use the whole line content. |
delimiter | The delimiter to separate fields in a line. |
Returns | |
---|---|
The lookup table that maps int64 index Tensors to their associated string values. |
Raises | |
---|---|
ValueError | when vocabulary_file is empty. |
ValueError | when vocab_size is invalid. |
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/contrib/lookup/index_to_string_table_from_file