sparse_feature_a = sparse_column_with_hash_bucket(...)
sparse_feature_b = sparse_column_with_hash_bucket(...)
sparse_feature_a_emb = embedding_column(sparse_id_column=sparse_feature_a,
                                        ...)
sparse_feature_b_emb = embedding_column(sparse_id_column=sparse_feature_b,
                                        ...)

estimator = DNNClassifier(
    feature_columns=[sparse_feature_a_emb, sparse_feature_b_emb],
    hidden_units=[1024, 512, 256])

# Or estimator using the ProximalAdagradOptimizer optimizer with
# regularization.
estimator = DNNClassifier(
    feature_columns=[sparse_feature_a_emb, sparse_feature_b_emb],
    hidden_units=[1024, 512, 256],
    optimizer=tf.compat.v1.train.ProximalAdagradOptimizer(
        learning_rate=0.1,
        l1_regularization_strength=0.001
    ))

# Input builders
def input_fn_train():  # returns x, y (where y represents label's class index).
    pass
estimator.fit(input_fn=input_fn_train)

def input_fn_eval():  # returns x, y (where y represents label's class index).
    pass
estimator.evaluate(input_fn=input_fn_eval)

def input_fn_predict():  # returns x, None
    pass
# predict_classes returns class indices.
estimator.predict_classes(input_fn=input_fn_predict)

If the user specifies label_keys in the constructor, labels must be strings from the label_keys vocabulary. Example:
label_keys = ['label0', 'label1', 'label2']
estimator = DNNClassifier(
    feature_columns=[sparse_feature_a_emb, sparse_feature_b_emb],
    hidden_units=[1024, 512, 256],
    label_keys=label_keys)

def input_fn_train():  # returns x, y (where y is one of label_keys).
    pass
estimator.fit(input_fn=input_fn_train)

def input_fn_eval():  # returns x, y (where y is one of label_keys).
    pass
estimator.evaluate(input_fn=input_fn_eval)

def input_fn_predict():  # returns x, None
    pass
# predict_classes returns one of label_keys.
estimator.predict_classes(input_fn=input_fn_predict)
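
The relationship between string labels and class indices can be pictured as a vocabulary lookup. The following is a minimal sketch in plain Python, assuming that the position of each string in label_keys acts as its class index. The helpers encode_labels and decode_classes are hypothetical names used for illustration only; they are not part of the estimator API.

```python
label_keys = ['label0', 'label1', 'label2']

# Assumption: the position of a string in label_keys is its class index.
key_to_index = {key: index for index, key in enumerate(label_keys)}


def encode_labels(labels):
    """Map string labels to class indices; unknown labels raise KeyError."""
    return [key_to_index[label] for label in labels]


def decode_classes(class_indices):
    """Map predicted class indices back to their string labels."""
    return [label_keys[index] for index in class_indices]


encoded = encode_labels(['label2', 'label0'])
decoded = decode_classes(encoded)
```

A label outside the vocabulary fails the lookup, which is why the labels returned by the input functions must come from label_keys.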
The input of fit and evaluate should have the following features; otherwise there will be a KeyError:
* if weight_column_name is not None, a feature with key=weight_column_name whose value is a Tensor.
* for each column in feature_columns:
  - if column is a SparseColumn, a feature with key=column.name whose value is a SparseTensor.
  - if column is a WeightedSparseColumn, two features: the first keyed by the id column name, the second keyed by the weight column name. Both features' values must be SparseTensors.
  - if column is a RealValuedColumn, a feature with key=column.name whose value is a Tensor.
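
The key requirements above can be checked before training to avoid a late KeyError. The sketch below is only an illustration: the namedtuple classes are hypothetical stand-ins for the real column types (they model nothing but the names), and required_feature_keys and missing_feature_keys are made-up helper names.

```python
from collections import namedtuple

# Hypothetical stand-ins for the column types named above; only the
# attributes needed to derive feature keys are modeled.
SparseColumn = namedtuple('SparseColumn', ['name'])
RealValuedColumn = namedtuple('RealValuedColumn', ['name'])
WeightedSparseColumn = namedtuple(
    'WeightedSparseColumn', ['id_column_name', 'weight_column_name'])


def required_feature_keys(feature_columns, weight_column_name=None):
    """Collect every key the features dict is expected to contain."""
    keys = []
    if weight_column_name is not None:
        keys.append(weight_column_name)
    for column in feature_columns:
        if isinstance(column, WeightedSparseColumn):
            # Weighted sparse columns need two features: ids and weights.
            keys.extend([column.id_column_name, column.weight_column_name])
        else:
            keys.append(column.name)
    return keys


def missing_feature_keys(features, feature_columns, weight_column_name=None):
    """Return the required keys that are absent from the features dict."""
    required = required_feature_keys(feature_columns, weight_column_name)
    return [key for key in required if key not in features]


columns = [
    SparseColumn('query'),
    WeightedSparseColumn('terms', 'term_weights'),
    RealValuedColumn('age'),
]
features = {'query': None, 'terms': None, 'age': None}
missing = missing_feature_keys(features, columns, weight_column_name='w')
```

Here the features dict lacks both the example-weight feature and the weight half of the weighted sparse column, so both keys are reported as missing.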
Args
hidden_units
List of hidden units per layer. All layers are fully connected. Ex. [64, 32] means first layer has 64 nodes and second one has 32.
feature_columns
An iterable containing all the feature columns used by the model. All items in the set should be instances of classes derived from FeatureColumn.
model_dir
Directory to save model parameters, graph, etc. This can also be used to load checkpoints from the directory into an estimator to continue training a previously saved model.
n_classes
Number of label classes. Defaults to binary classification. Must be greater than 1. Note: Class labels are integers representing the class index (i.e. values from 0 to n_classes-1). For arbitrary label values (e.g. string labels), convert to class indices first.
weight_column_name
A string naming the feature column that holds example weights. It is used to down-weight or boost examples during training. Each example's loss is multiplied by its weight.
optimizer
An instance of tf.Optimizer used to train the model. If None, will use an Adagrad optimizer.
activation_fn
Activation function applied to each layer. If None, will use tf.nn.relu. Note that a string containing the unqualified name of the op may also be provided, e.g., "relu", "tanh", or "sigmoid".
dropout
When not None, the probability that a given coordinate will be dropped out.
gradient_clip_norm
A float > 0. If provided, gradients are clipped to their global norm with this clipping ratio. See tf.clip_by_global_norm for more details.
enable_centered_bias
A bool. If True, the estimator will learn a centered bias variable for each class. The rest of the model structure learns the residual after the centered bias.
config
RunConfig object to configure the runtime settings.
feature_engineering_fn
Feature engineering function. Takes features and labels which are the output of input_fn and returns features and labels which will be fed into the model.
embedding_lr_multipliers
Optional. A dictionary from EmbeddingColumn to a float multiplier. The learning rate for that column's embedding variables is multiplied by this value.
input_layer_min_slice_size
Optional. The min slice size of input layer partitions. If not provided, will use the default of 64M.
label_keys
Optional list of strings with size [n_classes] defining the label vocabulary. Only supported for n_classes > 2.
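
As an illustration of the gradient_clip_norm argument, the sketch below shows clipping by global norm on plain Python lists. It follows the scaling rule documented for tf.clip_by_global_norm (each gradient is multiplied by clip_norm / max(global_norm, clip_norm)), but it is a standalone approximation written for this page, not the library's implementation.

```python
import math


def clip_by_global_norm(gradients, clip_norm):
    """Scale all gradient vectors jointly so their global norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for grad in gradients for g in grad))
    # When the global norm is already within bounds the scale is exactly 1.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in grad] for grad in gradients]
    return clipped, global_norm


gradients = [[3.0, 0.0], [0.0, 4.0]]  # global norm is 5.0
clipped, norm = clip_by_global_norm(gradients, clip_norm=1.0)
```

Because every gradient is scaled by the same factor, the direction of the overall update is preserved and only its magnitude is limited.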
Raises
ValueError
If n_classes < 2.
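
The effect of weight_column_name (each example's loss multiplied by its weight) can be sketched as follows. This assumes a per-example cross-entropy loss, which is an assumption made for illustration; weighted_log_loss is a hypothetical helper, and how the estimator reduces the weighted losses to a single value is not covered here.

```python
import math


def weighted_log_loss(probabilities, labels, weights):
    """Per-example cross-entropy, each multiplied by its example weight."""
    losses = []
    for probs, label, weight in zip(probabilities, labels, weights):
        example_loss = -math.log(probs[label])
        losses.append(weight * example_loss)
    return losses


probabilities = [[0.5, 0.5], [0.25, 0.75]]
labels = [0, 1]          # class indices, as required when label_keys is unset
weights = [2.0, 0.0]     # boost the first example, ignore the second
losses = weighted_log_loss(probabilities, labels, weights)
```

A weight of 0.0 removes an example's contribution entirely, while a weight above 1.0 boosts it.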
Attributes
config
model_dir
Returns a path in which the eval process will look for checkpoints.
model_fn
Returns the model_fn which is bound to self.params.
Exports the inference graph as a SavedModel into the given directory.
Args
export_dir_base
A string containing a directory to write the exported graph and checkpoints.
serving_input_fn
A function that takes no argument and returns an InputFnOps.
default_output_alternative_key
The name of the head to serve when none is specified. Not needed for single-headed models.
assets_extra
A dict specifying how to populate the assets.extra directory within the exported SavedModel. Each key should give the destination path (including the filename) relative to the assets.extra directory. The corresponding value gives the full path of the source file to be copied. For example, the simple case of copying a single file without renaming it is specified as {'my_asset_file.txt': '/path/to/my_asset_file.txt'}.
as_text
Whether to write the SavedModel proto in text format.
checkpoint_path
The checkpoint path to export. If None (the default), the most recent checkpoint found within the model directory is chosen.
graph_rewrite_specs
An iterable of GraphRewriteSpec. Each element will produce a separate MetaGraphDef within the exported SavedModel, tagged and rewritten as specified. Defaults to a single entry using the default serving tag ("serve") and no rewriting.
strip_default_attrs
Boolean. If True, default-valued attributes will be removed from the NodeDefs. For a detailed guide, see Stripping Default-Valued Attributes.
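
The key/value semantics of assets_extra (destination path relative to assets.extra mapped to the full source path) can be illustrated with a small standalone sketch. The copy_assets_extra helper is hypothetical and uses only the standard library; it is not the export code itself, and temporary directories stand in for real paths.

```python
import os
import shutil
import tempfile


def copy_assets_extra(assets_extra, export_dir):
    """Copy each source file to its destination under <export_dir>/assets.extra."""
    assets_dir = os.path.join(export_dir, 'assets.extra')
    copied = []
    for dest_relative, source_path in assets_extra.items():
        dest_path = os.path.join(assets_dir, dest_relative)
        # Destination keys may include subdirectories, so create them first.
        os.makedirs(os.path.dirname(dest_path), exist_ok=True)
        shutil.copyfile(source_path, dest_path)
        copied.append(dest_path)
    return copied


# Demonstration with a temporary directory standing in for real paths.
work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, 'my_asset_file.txt')
with open(source_path, 'w') as handle:
    handle.write('vocabulary')

export_dir = os.path.join(work_dir, 'export')
copied = copy_assets_extra({'my_asset_file.txt': source_path}, export_dir)
```

Using a different key, such as 'vocab/terms.txt', would rename the file and place it in a subdirectory of assets.extra.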
Incremental fit on a batch of samples. (deprecated arguments)
This method is expected to be called several times consecutively on different or the same chunks of the dataset. It can implement either iterative training or out-of-core/online training.
This is especially useful when the whole dataset is too big to fit in memory at once, or when the model takes a long time to converge and you want to split training into subparts.
Args
x
Matrix of shape [n_samples, n_features...]. Can be iterator that returns arrays of features. The training input samples for fitting the model. If set, input_fn must be None.
y
Vector or matrix [n_samples] or [n_samples, n_outputs]. Can be iterator that returns array of labels. The training label values (class labels in classification, real numbers in regression). If set, input_fn must be None.
input_fn
Input function. If set, x, y, and batch_size must be None.
steps
Number of steps for which to train model. If None, train forever.
batch_size
Minibatch size to use on the input. Defaults to the first dimension of x. Must be None if input_fn is provided.
monitors
List of BaseMonitor subclass instances. Used for callbacks inside the training loop.
Returns
self, for chaining.
Raises
ValueError
If at least one of x and y is provided, and input_fn is provided.
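
The calling pattern described above (repeated calls on successive chunks, a return value suitable for chaining, and a ValueError when x or y is combined with input_fn) can be sketched with a toy model. RunningMeanModel and chunks are hypothetical names; the class only tracks a running mean of the labels and stands in for a real estimator.

```python
class RunningMeanModel:
    """Toy stand-in for an estimator that can be trained incrementally."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def partial_fit(self, x=None, y=None, input_fn=None):
        # Mirror the documented rule: x/y and input_fn are mutually exclusive.
        if input_fn is not None and (x is not None or y is not None):
            raise ValueError('Provide either x and y, or input_fn, not both.')
        if input_fn is not None:
            x, y = input_fn()
        for target in y:
            self.count += 1
            self.mean += (target - self.mean) / self.count
        return self  # returned for chaining


def chunks(values, size):
    """Yield successive slices so the full dataset is never needed at once."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
model = RunningMeanModel()
for chunk in chunks(targets, size=2):
    model.partial_fit(x=chunk, y=chunk)
```

After the loop the toy model has seen all six values two at a time, which is the same result a single pass over the whole list would give.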
Returns predictions for given features. (deprecated argument values)
By default, returns predicted classes, but this default will be dropped soon. Users should either pass outputs or call the predict_classes method.
Args
x
Features.
input_fn
Input function. If set, x must be None.
batch_size
Override default batch size.
outputs
List of str, names of the outputs to predict. If None, returns classes.
as_iterable
If True, return an iterable which keeps yielding predictions for each example until inputs are exhausted. Note: The inputs must terminate if you want the iterable to terminate (e.g. be sure to pass num_epochs=1 if you are using something like read_batch_features).
Returns
Numpy array of predicted classes with shape batch_size. Each predicted class is represented by its class index (i.e. integer from 0 to n_classes-1). If outputs is set, returns a dict of predictions.
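
The as_iterable behavior noted above (predictions keep coming until the inputs are exhausted) is easiest to see with generators. In this sketch batched_inputs, predict_iterable, and predict_one are hypothetical names, and the num_epochs argument plays the role that num_epochs=1 plays for a real input function: it makes the input finite so the iterable terminates.

```python
def batched_inputs(examples, batch_size, num_epochs=1):
    """Yield batches for a fixed number of epochs so iteration terminates."""
    for _ in range(num_epochs):
        for start in range(0, len(examples), batch_size):
            yield examples[start:start + batch_size]


def predict_iterable(batches, predict_one):
    """Lazily yield one prediction per example until the batches run out."""
    for batch in batches:
        for example in batch:
            yield predict_one(example)


def predict_one(example):
    """Hypothetical per-example rule standing in for a trained model."""
    return 1 if example >= 0 else 0


examples = [-2.0, 0.5, 3.0, -1.0, 4.0]
batches = batched_inputs(examples, batch_size=2)
predictions = list(predict_iterable(batches, predict_one))
```

If the input generator looped forever, the call to list would never return, which is the situation the note about terminating inputs warns against.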
Returns predicted classes for given features. (deprecated argument values)
Args
x
Features.
input_fn
Input function. If set, x must be None.
batch_size
Override default batch size.
as_iterable
If True, return an iterable which keeps yielding predictions for each example until inputs are exhausted. Note: The inputs must terminate if you want the iterable to terminate (e.g. be sure to pass num_epochs=1 if you are using something like read_batch_features).
Returns
Numpy array of predicted classes with shape batch_size. Each predicted class is represented by its class index (i.e. integer from 0 to n_classes-1).
Returns predicted probabilities for given features. (deprecated argument values)
Args
x
Features.
input_fn
Input function. If set, x must be None.
batch_size
Override default batch size.
as_iterable
If True, return an iterable which keeps yielding predictions for each example until inputs are exhausted. Note: The inputs must terminate if you want the iterable to terminate (e.g. be sure to pass num_epochs=1 if you are using something like read_batch_features).
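
To relate predicted probabilities to predicted classes, the sketch below assumes the probabilities come from a softmax over per-class scores and that the predicted class is the index of the largest probability. Both points are assumptions made for illustration and are not stated in this section; softmax and argmax here are plain Python helpers, not estimator methods.

```python
import math


def softmax(logits):
    """Convert per-class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def argmax(values):
    """Index of the largest value, i.e. the predicted class index."""
    return max(range(len(values)), key=lambda index: values[index])


logits_batch = [[2.0, 1.0, 0.1], [0.0, 0.0, 5.0]]
probabilities = [softmax(logits) for logits in logits_batch]
classes = [argmax(probs) for probs in probabilities]
```

Each row of probabilities has one entry per class, and the class indices fall in the range 0 to n_classes-1.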
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
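
The <component>__<parameter> convention can be sketched with a small recursive setter. Component and this set_params function are hypothetical and only demonstrate how a double-underscore key is split and forwarded to a nested object; they are not the estimator's own implementation.

```python
class Component:
    """Minimal object exposing parameters as plain attributes."""

    def __init__(self, **params):
        for name, value in params.items():
            setattr(self, name, value)


def set_params(target, **params):
    """Set attributes on target; 'a__b' keys are forwarded to target.a."""
    for key, value in params.items():
        name, separator, sub_key = key.partition('__')
        if separator:
            # Nested form: recurse into the named component.
            set_params(getattr(target, name), **{sub_key: value})
        else:
            setattr(target, name, value)
    return target


pipeline = Component(
    scaler=Component(with_mean=True),
    classifier=Component(dropout=0.5),
)
set_params(pipeline, classifier__dropout=0.2, verbose=True)
```

The key classifier__dropout updates only the nested classifier component, while verbose is set directly on the outer object.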