tf.sparse.bincount
Count the number of times an integer value appears in a tensor.
tf.sparse.bincount( values, weights=None, axis=0, minlength=None, maxlength=None, binary_output=False, name=None )
This op takes an N-dimensional Tensor
, RaggedTensor
, or SparseTensor
, and returns an N-dimensional int64 SparseTensor where element [i0...i[axis], j]
contains the number of times the value j
appears in slice [i0...i[axis], :]
of the input tensor. Currently, only N=0 and N=-1 are supported.
Args | |
---|---|
values | A Tensor, RaggedTensor, or SparseTensor whose values should be counted. These tensors must have a rank of 2 if axis=-1 . |
weights | If non-None, must be the same shape as arr. For each value in value , the bin will be incremented by the corresponding weight instead of 1. |
axis | The axis to slice over. Axes at and below axis will be flattened before bin counting. Currently, only 0 , and -1 are supported. If None, all axes will be flattened (identical to passing 0 ). |
minlength | If given, ensures the output has length at least minlength , padding with zeros at the end if necessary. |
maxlength | If given, skips values in values that are equal or greater than maxlength , ensuring that the output has length at most maxlength . |
binary_output | If True, this op will output 1 instead of the number of times a token appears (equivalent to one_hot + reduce_any instead of one_hot + reduce_add). Defaults to False. |
name | A name for this op. |
Returns | |
---|---|
A SparseTensor with output.shape = values.shape[:axis] + [N] , where N is
|
Examples:
Bin-counting every item in individual batches
This example takes an input (which could be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where the value of (i,j) is the number of times value j appears in batch i.
data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64) output = tf.sparse.bincount(data, axis=-1) print(output) SparseTensor(indices=tf.Tensor( [[ 0 10] [ 0 20] [ 0 30] [ 1 11] [ 1 101] [ 1 10001]], shape=(6, 2), dtype=int64), values=tf.Tensor([1 2 1 2 1 1], shape=(6,), dtype=int64), dense_shape=tf.Tensor([ 2 10002], shape=(2,), dtype=int64))
Bin-counting with defined output shape
This example takes an input (which could be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where the value of (i,j) is the number of times value j appears in batch i. However, all values of j above 'maxlength' are ignored. The dense_shape of the output sparse tensor is set to 'minlength'. Note that, while the input is identical to the example above, the value '10001' in batch item 2 is dropped, and the dense shape is [2, 500] instead of [2,10002] or [2, 102].
minlength = maxlength = 500 data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64) output = tf.sparse.bincount( data, axis=-1, minlength=minlength, maxlength=maxlength) print(output) SparseTensor(indices=tf.Tensor( [[ 0 10] [ 0 20] [ 0 30] [ 1 11] [ 1 101]], shape=(5, 2), dtype=int64), values=tf.Tensor([1 2 1 2 1], shape=(5,), dtype=int64), dense_shape=tf.Tensor([ 2 500], shape=(2,), dtype=int64))
Binary bin-counting
This example takes an input (which could be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where (i,j) is 1 if the value j appears in batch i at least once and is 0 otherwise. Note that, even though some values (like 20 in batch 1 and 11 in batch 2) appear more than once, the 'values' tensor is all 1s.
data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64) output = tf.sparse.bincount(data, binary_output=True, axis=-1) print(output) SparseTensor(indices=tf.Tensor( [[ 0 10] [ 0 20] [ 0 30] [ 1 11] [ 1 101] [ 1 10001]], shape=(6, 2), dtype=int64), values=tf.Tensor([1 1 1 1 1 1], shape=(6,), dtype=int64), dense_shape=tf.Tensor([ 2 10002], shape=(2,), dtype=int64))
Weighted bin-counting
This example takes two inputs - a values tensor and a weights tensor. These tensors must be identically shaped, and have the same row splits or indices in the case of RaggedTensors or SparseTensors. When performing a weighted count, the op will output a SparseTensor where the value of (i, j) is the sum of the values in the weight tensor's batch i in the locations where the values tensor has the value j. In this case, the output dtype is the same as the dtype of the weights tensor.
data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64) weights = [[2, 0.25, 15, 0.5], [2, 17, 3, 0.9]] output = tf.sparse.bincount(data, weights=weights, axis=-1) print(output) SparseTensor(indices=tf.Tensor( [[ 0 10] [ 0 20] [ 0 30] [ 1 11] [ 1 101] [ 1 10001]], shape=(6, 2), dtype=int64), values=tf.Tensor([2. 0.75 15. 5. 17. 0.9], shape=(6,), dtype=float32), dense_shape=tf.Tensor([ 2 10002], shape=(2,), dtype=int64))
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/sparse/bincount