tf.distribute.OneDeviceStrategy
View source on GitHub |
A distribution strategy for running on a single device.
Inherits From: Strategy
tf.distribute.OneDeviceStrategy( device )
Using this strategy will place any variables created in its scope on the specified device. Input distributed through this strategy will be prefetched to the specified device. Moreover, any functions called via strategy.run
will also be placed on the specified device as well.
Typical usage of this strategy could be testing your code with the tf.distribute.Strategy API before switching to other strategies which actually distribute to multiple devices/machines.
For example:
strategy = tf.distribute.OneDeviceStrategy(device="/gpu:0") with strategy.scope(): v = tf.Variable(1.0) print(v.device) # /job:localhost/replica:0/task:0/device:GPU:0 def step_fn(x): return x * 2 result = 0 for i in range(10): result += strategy.run(step_fn, args=(i,)) print(result) # 90
Args | |
---|---|
device | Device string identifier for the device on which the variables should be placed. See class docs for more details on how the device is used. Examples: "/cpu:0", "/gpu:0", "/device:CPU:0", "/device:GPU:0" |
Attributes | |
---|---|
cluster_resolver | Returns the cluster resolver associated with this strategy. In general, when using a multi-worker Strategies that intend to have an associated Single-worker strategies usually do not have a The os.environ['TF_CONFIG'] = json.dumps({ 'cluster': { 'worker': ["localhost:12345", "localhost:23456"], 'ps': ["localhost:34567"] }, 'task': {'type': 'worker', 'index': 0} }) # This implicitly uses TF_CONFIG for the cluster and current task info. strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy() ... if strategy.cluster_resolver.task_type == 'worker': # Perform something that's only applicable on workers. Since we set this # as a worker above, this block will run on this particular instance. elif strategy.cluster_resolver.task_type == 'ps': # Perform something that's only applicable on parameter servers. Since we # set this as a worker above, this block will not run on this particular # instance. For more information, please see |
extended | tf.distribute.StrategyExtended with additional methods. |
num_replicas_in_sync | Returns number of replicas over which gradients are aggregated. |
Methods
distribute_datasets_from_function
distribute_datasets_from_function( dataset_fn, options=None )
Distributes tf.data.Dataset
instances created by calls to dataset_fn
.
dataset_fn
will be called once for each worker in the strategy. In this case, we only have one worker and one device so dataset_fn
is called once.
The dataset_fn
should take an tf.distribute.InputContext
instance where information about batching and input replication can be accessed:
def dataset_fn(input_context): batch_size = input_context.get_per_replica_batch_size(global_batch_size) d = tf.data.Dataset.from_tensors([[1.]]).repeat().batch(batch_size) return d.shard( input_context.num_input_pipelines, input_context.input_pipeline_id) inputs = strategy.distribute_datasets_from_function(dataset_fn) for batch in inputs: replica_results = strategy.run(replica_fn, args=(batch,))
Args | |
---|---|
dataset_fn | A function taking a tf.distribute.InputContext instance and returning a tf.data.Dataset . |
options | tf.distribute.InputOptions used to control options on how this dataset is distributed. |
Returns | |
---|---|
A "distributed Dataset ", which the caller can iterate over like regular datasets. |
experimental_distribute_dataset
experimental_distribute_dataset( dataset, options=None )
Distributes a tf.data.Dataset instance provided via dataset.
In this case, there is only one device, so this is only a thin wrapper around the input dataset. It will, however, prefetch the input data to the specified device. The returned distributed dataset can be iterated over similar to how regular datasets can.
Note: Currently, the user cannot add any more transformations to a distributed dataset.
Example:
strategy = tf.distribute.OneDeviceStrategy() dataset = tf.data.Dataset.range(10).batch(2) dist_dataset = strategy.experimental_distribute_dataset(dataset) for x in dist_dataset: print(x) # [0, 1], [2, 3],...
Args: dataset: tf.data.Dataset
to be prefetched to device. options: tf.distribute.InputOptions
used to control options on how this dataset is distributed. Returns: A "distributed Dataset
" that the caller can iterate over.
experimental_distribute_values_from_function
experimental_distribute_values_from_function( value_fn )
Generates tf.distribute.DistributedValues
from value_fn
.
This function is to generate tf.distribute.DistributedValues
to pass into run
, reduce
, or other methods that take distributed values when not using datasets.
Args | |
---|---|
value_fn | The function to run to generate values. It is called for each replica with tf.distribute.ValueContext as the sole argument. It must return a Tensor or a type that can be converted to a Tensor. |
Returns | |
---|---|
A tf.distribute.DistributedValues containing a value for each replica. |
Example usage:
- Return constant value per replica:
strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"]) def value_fn(ctx): return tf.constant(1.) distributed_values = ( strategy.experimental_distribute_values_from_function( value_fn)) local_result = strategy.experimental_local_results(distributed_values) local_result (<tf.Tensor: shape=(), dtype=float32, numpy=1.0>, <tf.Tensor: shape=(), dtype=float32, numpy=1.0>)
- Distribute values in array based on replica_id:
strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"]) array_value = np.array([3., 2., 1.]) def value_fn(ctx): return array_value[ctx.replica_id_in_sync_group] distributed_values = ( strategy.experimental_distribute_values_from_function( value_fn)) local_result = strategy.experimental_local_results(distributed_values) local_result (3.0, 2.0)
- Specify values using num_replicas_in_sync:
strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"]) def value_fn(ctx): return ctx.num_replicas_in_sync distributed_values = ( strategy.experimental_distribute_values_from_function( value_fn)) local_result = strategy.experimental_local_results(distributed_values) local_result (2, 2)
- Place values on devices and distribute:
strategy = tf.distribute.TPUStrategy() worker_devices = strategy.extended.worker_devices multiple_values = [] for i in range(strategy.num_replicas_in_sync): with tf.device(worker_devices[i]): multiple_values.append(tf.constant(1.0)) def value_fn(ctx): return multiple_values[ctx.replica_id_in_sync_group] distributed_values = strategy. experimental_distribute_values_from_function( value_fn)
experimental_local_results
experimental_local_results( value )
Returns the list of all local per-replica values contained in value
.
In OneDeviceStrategy
, the value
is always expected to be a single value, so the result is just the value in a tuple.
Args | |
---|---|
value | A value returned by experimental_run() , run() , extended.call_for_each_replica() , or a variable created in scope . |
Returns | |
---|---|
A tuple of values contained in value . If value represents a single value, this returns (value,). |
gather
gather( value, axis )
Gather value
across replicas along axis
to the current device.
Given a tf.distribute.DistributedValues
or tf.Tensor
-like object value
, this API gathers and concatenates value
across replicas along the axis
-th dimension. The result is copied to the "current" device
- which would typically be the CPU of the worker on which the program is running. For
tf.distribute.TPUStrategy
, it is the first TPU host. For multi-clientMultiWorkerMirroredStrategy
, this is CPU of each worker.
This API can only be called in the cross-replica context. For a counterpart in the replica context, see tf.distribute.ReplicaContext.all_gather
.
Note: For all strategies excepttf.distribute.TPUStrategy
, the inputvalue
on different replicas must have the same rank, and their shapes must be the same in all dimensions except theaxis
-th dimension. In other words, their shapes cannot be different in a dimensiond
whered
does not equal to theaxis
argument. For example, given atf.distribute.DistributedValues
with component tensors of shape(1, 2, 3)
and(1, 3, 3)
on two replicas, you can callgather(..., axis=1, ...)
on it, but notgather(..., axis=0, ...)
orgather(..., axis=2, ...)
. However, fortf.distribute.TPUStrategy.gather
, all tensors must have exactly the same rank and same shape.
Note: Given atf.distribute.DistributedValues
value
, its component tensors must have a non-zero rank. Otherwise, consider usingtf.expand_dims
before gathering them.
strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"]) # A DistributedValues with component tensor of shape (2, 1) on each replica distributed_values = strategy.experimental_distribute_values_from_function(lambda _: tf.identity(tf.constant([[1], [2]]))) @tf.function def run(): return strategy.gather(distributed_values, axis=0) run() <tf.Tensor: shape=(4, 1), dtype=int32, numpy= array([[1], [2], [1], [2]], dtype=int32)>
Consider the following example for more combinations:
strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1", "GPU:2", "GPU:3"]) single_tensor = tf.reshape(tf.range(6), shape=(1,2,3)) distributed_values = strategy.experimental_distribute_values_from_function(lambda _: tf.identity(single_tensor)) @tf.function def run(axis): return strategy.gather(distributed_values, axis=axis) axis=0 run(axis) <tf.Tensor: shape=(4, 2, 3), dtype=int32, numpy= array([[[0, 1, 2], [3, 4, 5]], [[0, 1, 2], [3, 4, 5]], [[0, 1, 2], [3, 4, 5]], [[0, 1, 2], [3, 4, 5]]], dtype=int32)> axis=1 run(axis) <tf.Tensor: shape=(1, 8, 3), dtype=int32, numpy= array([[[0, 1, 2], [3, 4, 5], [0, 1, 2], [3, 4, 5], [0, 1, 2], [3, 4, 5], [0, 1, 2], [3, 4, 5]]], dtype=int32)> axis=2 run(axis) <tf.Tensor: shape=(1, 2, 12), dtype=int32, numpy= array([[[0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2], [3, 4, 5, 3, 4, 5, 3, 4, 5, 3, 4, 5]]], dtype=int32)>
Args | |
---|---|
value | a tf.distribute.DistributedValues instance, e.g. returned by Strategy.run , to be combined into a single tensor. It can also be a regular tensor when used with tf.distribute.OneDeviceStrategy or the default strategy. The tensors that constitute the DistributedValues can only be dense tensors with non-zero rank, NOT a tf.IndexedSlices . |
axis | 0-D int32 Tensor. Dimension along which to gather. Must be in the range [0, rank(value)). |
Returns | |
---|---|
A Tensor that's the concatenation of value across replicas along axis dimension. |
reduce
reduce( reduce_op, value, axis )
Reduce value
across replicas.
In OneDeviceStrategy
, there is only one replica, so if axis=None, value is simply returned. If axis is specified as something other than None, such as axis=0, value is reduced along that axis and returned.
Example:
t = tf.range(10) result = strategy.reduce(tf.distribute.ReduceOp.SUM, t, axis=None).numpy() # result: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] result = strategy.reduce(tf.distribute.ReduceOp.SUM, t, axis=0).numpy() # result: 45
Args | |
---|---|
reduce_op | A tf.distribute.ReduceOp value specifying how values should be combined. |
value | A "per replica" value, e.g. returned by run to be combined into a single tensor. |
axis | Specifies the dimension to reduce along within each replica's tensor. Should typically be set to the batch dimension, or None to only reduce across replicas (e.g. if the tensor has no batch dimension). |
Returns | |
---|---|
A Tensor . |
run
run( fn, args=(), kwargs=None, options=None )
Run fn
on each replica, with the given arguments.
In OneDeviceStrategy
, fn
is simply called within a device scope for the given device, with the provided arguments.
Args | |
---|---|
fn | The function to run. The output must be a tf.nest of Tensor s. |
args | (Optional) Positional arguments to fn . |
kwargs | (Optional) Keyword arguments to fn . |
options | (Optional) An instance of tf.distribute.RunOptions specifying the options to run fn . |
Returns | |
---|---|
Return value from running fn . |
scope
scope()
Returns a context manager selecting this Strategy as current.
Inside a with strategy.scope():
code block, this thread will use a variable creator set by strategy
, and will enter its "cross-replica context".
In OneDeviceStrategy
, all variables created inside strategy.scope()
will be on device
specified at strategy construction time. See example in the docs for this class.
Returns | |
---|---|
A context manager to use for creating variables with this strategy. |
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/distribute/OneDeviceStrategy