tf.contrib.lookup.IdTableWithHashBuckets
String to Id table wrapper that assigns out-of-vocabulary keys to buckets.
Inherits From: LookupInterface
tf.contrib.lookup.IdTableWithHashBuckets( table, num_oov_buckets, hasher_spec=tf.contrib.lookup.FastHashSpec, name=None, key_dtype=None )
For example, if an instance of IdTableWithHashBuckets
is initialized with a string-to-id table that maps:
emerson -> 0
lake -> 1
palmer -> 2
The IdTableWithHashBuckets
object will performs the following mapping:
emerson -> 0
lake -> 1
palmer -> 2
-
<other term> -> bucket_id
, where bucket_id will be between3
and3 + num_oov_buckets - 1
, calculated by:hash(<term>) % num_oov_buckets + vocab_size
If input_tensor is ["emerson", "lake", "palmer", "king", "crimson"]
, the lookup result is [0, 1, 2, 4, 7]
.
If table
is None, only out-of-vocabulary buckets are used.
Example usage:
num_oov_buckets = 3 input_tensor = tf.constant(["emerson", "lake", "palmer", "king", "crimnson"]) table = tf.IdTableWithHashBuckets( tf.StaticHashTable(tf.TextFileIdTableInitializer(filename), default_value), num_oov_buckets) out = table.lookup(input_tensor). table.init.run() print(out.eval())
The hash function used for generating out-of-vocabulary buckets ID is handled by hasher_spec
.
Args | |
---|---|
table | Table that maps tf.string or tf.int64 keys to tf.int64 ids. |
num_oov_buckets | Number of buckets to use for out-of-vocabulary keys. |
hasher_spec | A HasherSpec to specify the hash function to use for assignation of out-of-vocabulary buckets (optional). |
name | A name for the operation (optional). |
key_dtype | Data type of keys passed to lookup . Defaults to table.key_dtype if table is specified, otherwise tf.string . Must be string or integer, and must be castable to table.key_dtype . |
Raises | |
---|---|
ValueError | when table in None and num_oov_buckets is not positive. |
TypeError | when hasher_spec is invalid. |
Attributes | |
---|---|
init | DEPRECATED FUNCTION |
initializer | |
key_dtype | The table key dtype. |
name | The name of the table. |
resource_handle | Returns the resource handle associated with this Resource. |
value_dtype | The table value dtype. |
Methods
lookup
lookup( keys, name=None )
Looks up keys
in the table, outputs the corresponding values.
It assigns out-of-vocabulary keys to buckets based in their hashes.
Args | |
---|---|
keys | Keys to look up. May be either a SparseTensor or dense Tensor . |
name | Optional name for the op. |
Returns | |
---|---|
A SparseTensor if keys are sparse, otherwise a dense Tensor . |
Raises | |
---|---|
TypeError | when keys doesn't match the table key data type. |
size
size( name=None )
Compute the number of elements in this table.
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/contrib/lookup/IdTableWithHashBuckets