tf.data.experimental.TFRecordWriter
View source on GitHub |
Writes a dataset to a TFRecord file.
tf.data.experimental.TFRecordWriter( filename, compression_type=None )
The elements of the dataset must be scalar strings. To serialize dataset elements as strings, you can use the tf.io.serialize_tensor
function.
dataset = tf.data.Dataset.range(3) dataset = dataset.map(tf.io.serialize_tensor) writer = tf.data.experimental.TFRecordWriter("/path/to/file.tfrecord") writer.write(dataset)
To read back the elements, use TFRecordDataset
.
dataset = tf.data.TFRecordDataset("/path/to/file.tfrecord") dataset = dataset.map(lambda x: tf.io.parse_tensor(x, tf.int64))
To shard a dataset
across multiple TFRecord files:
dataset = ... # dataset to be written def reduce_func(key, dataset): filename = tf.strings.join([PATH_PREFIX, tf.strings.as_string(key)]) writer = tf.data.experimental.TFRecordWriter(filename) writer.write(dataset.map(lambda _, x: x)) return tf.data.Dataset.from_tensors(filename) dataset = dataset.enumerate() dataset = dataset.apply(tf.data.experimental.group_by_window( lambda i, _: i % NUM_SHARDS, reduce_func, tf.int64.max ))
Args | |
---|---|
filename | a string path indicating where to write the TFRecord data. |
compression_type | (Optional.) a string indicating what type of compression to use when writing the file. See tf.io.TFRecordCompressionType for what types of compression are available. Defaults to None . |
Methods
write
write( dataset )
Writes a dataset to a TFRecord file.
An operation that writes the content of the specified dataset to the file specified in the constructor.
If the file exists, it will be overwritten.
Args | |
---|---|
dataset | a tf.data.Dataset whose elements are to be written to a file |
Returns | |
---|---|
In graph mode, this returns an operation which when executed performs the write. In eager mode, the write is performed by the method itself and there is no return value. |
Raises TypeError: if dataset
is not a tf.data.Dataset
. TypeError: if the elements produced by the dataset are not scalar strings.
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/data/experimental/TFRecordWriter