tf.distribute.HierarchicalCopyAllReduce

Hierarchical copy all-reduce implementation of CrossDeviceOps.

Inherits From: CrossDeviceOps

Compat aliases for migration

tf.compat.v1.distribute.HierarchicalCopyAllReduce

tf.distribute.HierarchicalCopyAllReduce(
    num_packs=1
)

This implementation reduces values to one GPU along the edges of a device hierarchy, then broadcasts the result back to each GPU along the same path. For the batch API, tensors are repacked or aggregated for more efficient cross-device transport.

This reduction was created for the Nvidia DGX-1 and assumes that GPUs are connected as they are on a DGX-1 machine. If your GPUs are interconnected differently, it is likely to be slower than tf.distribute.ReductionToOneDevice.

For reductions that are not all-reduces, it falls back to tf.distribute.ReductionToOneDevice.
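The reduce-then-broadcast pattern described above can be sketched in plain Python. This is a toy model, not TensorFlow's implementation: the function name `hierarchical_all_reduce`, the `group_size` parameter, and the choice of the first device in each group as its leader are all assumptions made for illustration.

```python
from typing import List


def hierarchical_all_reduce(per_device: List[float], group_size: int) -> List[float]:
    """Sum one value per device through a two-level hierarchy, then broadcast.

    Hypothetical model: devices are split into consecutive groups of
    `group_size`, and the first device of each group acts as its leader.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")

    num_devices = len(per_device)
    leaders = list(range(0, num_devices, group_size))

    # Step 1: reduce within each group onto the group's leader.
    partial_sums = {
        leader: sum(per_device[leader:leader + group_size]) for leader in leaders
    }

    # Step 2: reduce the leaders' partial sums onto the root (device 0).
    total = sum(partial_sums.values())

    # Step 3: broadcast back along the same path: root -> leaders -> members.
    result = [0.0] * num_devices
    for leader in leaders:
        for device in range(leader, min(leader + group_size, num_devices)):
            result[device] = total
    return result


# Eight devices in two groups of four (the group size is an assumption here).
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
result = hierarchical_all_reduce(values, group_size=4)
print(result)  # every device ends up holding the same total, 36.0
```

Every device ends with the same value, which is what distinguishes an all-reduce from a reduction to a single device.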

Here is how you can use HierarchicalCopyAllReduce in tf.distribute.MirroredStrategy:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
  cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())

Args
`num_packs`	a non-negative integer. The number of packs to split values into. If zero, no packing will be done.

Raises
`ValueError`	if `num_packs` is negative.
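To make the `num_packs` contract concrete, here is a minimal sketch of one way values could be grouped into packs. The helper `split_into_packs` is hypothetical and models only the documented behavior (a negative value raises `ValueError`, zero means no packing); how TensorFlow actually packs tensors internally may differ.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_packs(values: Sequence[T], num_packs: int) -> List[List[T]]:
    """Group `values` into at most `num_packs` contiguous packs.

    Hypothetical helper, not part of TensorFlow.
    """
    if num_packs < 0:
        raise ValueError(f"num_packs must be >= 0, got {num_packs}")
    if num_packs == 0:
        # No packing: every value is transferred on its own.
        return [[value] for value in values]

    packs: List[List[T]] = []
    base, extra = divmod(len(values), num_packs)
    start = 0
    for i in range(num_packs):
        # The first `extra` packs take one additional value each.
        size = base + (1 if i < extra else 0)
        if size:
            packs.append(list(values[start:start + size]))
        start += size
    return packs


names = ["w1", "b1", "w2", "b2", "w3"]
print(split_into_packs(names, 2))  # [['w1', 'b1', 'w2'], ['b2', 'w3']]
print(split_into_packs(names, 0))  # [['w1'], ['b1'], ['w2'], ['b2'], ['w3']]
```

Fewer, larger packs mean fewer cross-device transfers, which is the motivation the description gives for repacking in the batch API.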

Methods

`batch_reduce`

batch_reduce(
    reduce_op, value_destination_pairs, options=None
)

Reduce values to destinations in batches.

See tf.distribute.StrategyExtended.batch_reduce_to. This can only be called in the cross-replica context.

Args
`reduce_op`	a `tf.distribute.ReduceOp` specifying how values should be combined.
`value_destination_pairs`	a sequence of (value, destinations) pairs. See `tf.distribute.CrossDeviceOps.reduce` for descriptions.
`options`	a `tf.distribute.experimental.CommunicationOptions`. See `tf.distribute.experimental.CommunicationOptions` for details.

Returns
A list of `tf.Tensor` or `tf.distribute.DistributedValues`, one per pair in `value_destination_pairs`.

Raises
`ValueError`	if `value_destination_pairs` is not an iterable of tuples of `tf.distribute.DistributedValues` and destinations.
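The shape of the `batch_reduce` contract (a sequence of pairs in, one result per pair out, `ValueError` on malformed input) can be illustrated with a small plain-Python model. The function `batch_reduce_sum`, the device-name strings, and the use of dictionaries to stand in for per-replica values are assumptions of this sketch, not TensorFlow types.

```python
from typing import Dict, List, Sequence, Tuple

# A per-replica value is modeled as {device name: value on that device}.
Pair = Tuple[Dict[str, float], List[str]]


def batch_reduce_sum(value_destination_pairs: Sequence[Pair]) -> List[Dict[str, float]]:
    """Sum each per-replica value and place the result on its destinations.

    Hypothetical model of the batch contract; it always sums.
    """
    results = []
    for pair in value_destination_pairs:
        if not isinstance(pair, tuple) or len(pair) != 2:
            raise ValueError("each element must be a (value, destinations) tuple")
        per_replica, destinations = pair
        total = sum(per_replica.values())
        results.append({device: total for device in destinations})
    return results


pairs = [
    ({"GPU:0": 1.0, "GPU:1": 2.0}, ["GPU:0", "GPU:1"]),  # all-reduce
    ({"GPU:0": 10.0, "GPU:1": 20.0}, ["CPU:0"]),         # reduce to one device
]
reduced = batch_reduce_sum(pairs)
print(len(reduced))  # 2, one result per input pair
```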

`broadcast`

broadcast(
    tensor, destinations
)

Broadcast tensor to destinations.

This can only be called in the cross-replica context.

Args
`tensor`	a `tf.Tensor`-like object. The value to broadcast.
`destinations`	a `tf.distribute.DistributedValues`, a `tf.Variable`, a `tf.Tensor`-like object, or a device string. It specifies the devices to broadcast to. Note that if it is a `tf.Variable`, the value is broadcast to the devices of that variable; this method does not update the variable.

Returns
A `tf.Tensor` or `tf.distribute.DistributedValues`.

`reduce`

reduce(
    reduce_op, per_replica_value, destinations, options=None
)

Reduce per_replica_value to destinations.

See tf.distribute.StrategyExtended.reduce_to. This can only be called in the cross-replica context.

Args
`reduce_op`	a `tf.distribute.ReduceOp` specifying how values should be combined.
`per_replica_value`	a `tf.distribute.DistributedValues`, or a `tf.Tensor`-like object.
`destinations`	a `tf.distribute.DistributedValues`, a `tf.Variable`, a `tf.Tensor`-like object, or a device string. It specifies the devices to reduce to. To perform an all-reduce, pass the same object as both `per_replica_value` and `destinations`. Note that if it is a `tf.Variable`, the value is reduced to the devices of that variable; this method does not update the variable.
`options`	a `tf.distribute.experimental.CommunicationOptions`. See `tf.distribute.experimental.CommunicationOptions` for details.

Returns
A `tf.Tensor` or `tf.distribute.DistributedValues`.

Raises
`ValueError`	if `per_replica_value` can't be converted to a `tf.distribute.DistributedValues`, or if `destinations` is not a string, `tf.Variable`, or `tf.distribute.DistributedValues`.
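The roles of `reduce_op` and `destinations` can be shown with a toy model in plain Python. The function `reduce_model` and the string op names are hypothetical stand-ins (the real API takes a `tf.distribute.ReduceOp`); the sketch only illustrates how the same per-replica value yields an all-reduce or a reduce-to-one-device depending on the destinations.

```python
from typing import Dict, List


def reduce_model(reduce_op: str, per_replica: Dict[str, float],
                 destinations: List[str]) -> Dict[str, float]:
    """Combine one value per replica and place the result on `destinations`.

    Hypothetical model: `reduce_op` is "SUM" or "MEAN", and `per_replica`
    maps a device name to the value held on that device.
    """
    if reduce_op not in ("SUM", "MEAN"):
        raise ValueError(f"unsupported reduce_op: {reduce_op}")
    if not per_replica:
        raise ValueError("per_replica must not be empty")

    combined = sum(per_replica.values())
    if reduce_op == "MEAN":
        combined /= len(per_replica)
    return {device: combined for device in destinations}


grads = {"GPU:0": 1.0, "GPU:1": 3.0}

# All-reduce: the destinations are the devices that already hold the value.
all_reduced = reduce_model("SUM", grads, list(grads))
print(all_reduced)  # {'GPU:0': 4.0, 'GPU:1': 4.0}

# Reduce to a single, different device.
to_one = reduce_model("MEAN", grads, ["CPU:0"])
print(to_one)  # {'CPU:0': 2.0}
```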

© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/distribute/HierarchicalCopyAllReduce