sklearn.dummy.DummyClassifier
-
class sklearn.dummy.DummyClassifier(*, strategy='prior', random_state=None, constant=None)
DummyClassifier is a classifier that makes predictions using simple rules.
This classifier is useful as a simple baseline to compare with other (real) classifiers. Do not use it for real problems.
Read more in the User Guide.
New in version 0.13.
- Parameters
-
-
strategy : {“stratified”, “most_frequent”, “prior”, “uniform”, “constant”}, default=“prior”
Strategy to use to generate predictions.
- “stratified”: generates predictions by respecting the training set’s class distribution.
- “most_frequent”: always predicts the most frequent label in the training set.
- “prior”: always predicts the class that maximizes the class prior (like “most_frequent”), and predict_proba returns the class prior.
- “uniform”: generates predictions uniformly at random.
- “constant”: always predicts a constant label that is provided by the user. This is useful for metrics that evaluate a non-majority class.
Changed in version 0.24: The default value of strategy changed to “prior”.
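As a minimal sketch of how the strategies differ, the snippet below fits three of them on hypothetical toy data (the arrays `X` and `y` are illustrative assumptions, not part of the reference). It shows that “prior” and “most_frequent” make the same predictions but return different probability estimates.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical toy data: the features are ignored, only y matters.
X = np.zeros((8, 1))
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])  # empirical class prior: 0.75 / 0.25

prior_clf = DummyClassifier(strategy="prior").fit(X, y)
freq_clf = DummyClassifier(strategy="most_frequent").fit(X, y)
uniform_clf = DummyClassifier(strategy="uniform", random_state=0).fit(X, y)

# "prior" and "most_frequent" both always predict the majority class.
prior_pred = prior_clf.predict(X)
freq_pred = freq_clf.predict(X)

# Their probability estimates differ: "prior" returns the class prior,
# "most_frequent" returns a one-hot row for the majority class,
# and "uniform" assigns equal probability to every class.
prior_proba = prior_clf.predict_proba(X[:1])
freq_proba = freq_clf.predict_proba(X[:1])
uniform_proba = uniform_clf.predict_proba(X[:1])

print(prior_proba, freq_proba, uniform_proba)
```

The difference matters for probability-based metrics such as log loss, where “prior” is usually the more informative baseline.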
-
random_state : int, RandomState instance or None, default=None
Controls the randomness to generate the predictions when strategy='stratified' or strategy='uniform'. Pass an int for reproducible output across multiple function calls. See Glossary.
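The effect of an integer random_state can be sketched as follows, assuming hypothetical toy data. Two estimators built with the same seed should produce identical “stratified” predictions.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical toy data; the feature values are irrelevant.
X = np.zeros((10, 1))
y = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

# Same integer seed on two independent estimators.
clf_a = DummyClassifier(strategy="stratified", random_state=0).fit(X, y)
clf_b = DummyClassifier(strategy="stratified", random_state=0).fit(X, y)

pred_a = clf_a.predict(X)
pred_b = clf_b.predict(X)

# With an int seed, the random draws are reproducible.
same = np.array_equal(pred_a, pred_b)
print(same)
```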
constant : int or str or array-like of shape (n_outputs,), default=None
The explicit constant as predicted by the “constant” strategy. This parameter is useful only for the “constant” strategy.
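A short sketch of why the “constant” strategy helps when the metric targets a non-majority class. The data and the choice of F1 as the metric are assumptions made for illustration only.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import f1_score

# Hypothetical imbalanced data: class 1 is the minority (positive) class.
X = np.zeros((10, 1))
y = np.array([0] * 8 + [1] * 2)

majority = DummyClassifier(strategy="most_frequent").fit(X, y)
always_pos = DummyClassifier(strategy="constant", constant=1).fit(X, y)

# The majority baseline never predicts class 1, so its F1 on class 1 is 0.
f1_majority = f1_score(y, majority.predict(X), zero_division=0)

# Always predicting class 1 gives recall 1.0 and precision 0.2.
f1_constant = f1_score(y, always_pos.predict(X))

print(f1_majority, f1_constant)
```

Here the constant baseline sets a nonzero floor for F1 on the minority class, which the majority baseline cannot provide.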
-
- Attributes
-
-
classes_ : ndarray of shape (n_classes,) or list of such arrays
Class labels for each output.
-
n_classes_ : int or list of int
Number of labels for each output.
-
class_prior_ : ndarray of shape (n_classes,) or list of such arrays
Probability of each class for each output.
-
n_outputs_ : int
Number of outputs.
-
sparse_output_ : bool
True if the array returned from predict is to be in sparse CSC format. It is automatically set to True if the input y is passed in sparse format.
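The fitted attributes can be inspected as in this sketch, which uses hypothetical single-output and two-output targets. In the multi-output case the per-class attributes become lists with one entry per output.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

X = np.zeros((4, 1))  # hypothetical features, ignored by the estimator

# Single output with string labels.
y_single = np.array(["a", "b", "b", "b"])
single = DummyClassifier(strategy="prior").fit(X, y_single)
print(single.classes_)        # class labels, sorted
print(single.n_classes_)      # number of classes
print(single.class_prior_)    # empirical class frequencies
print(single.n_outputs_, single.sparse_output_)

# Two outputs: attributes are lists, one entry per output column.
y_multi = np.array([[0, 1], [0, 2], [1, 2], [0, 2]])
multi = DummyClassifier(strategy="prior").fit(X, y_multi)
print(multi.classes_)
print(multi.class_prior_)
```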
-
Examples
>>> import numpy as np
>>> from sklearn.dummy import DummyClassifier
>>> X = np.array([-1, 1, 1, 1])
>>> y = np.array([0, 1, 1, 1])
>>> dummy_clf = DummyClassifier(strategy="most_frequent")
>>> dummy_clf.fit(X, y)
DummyClassifier(strategy='most_frequent')
>>> dummy_clf.predict(X)
array([1, 1, 1, 1])
>>> dummy_clf.score(X, y)
0.75
Methods
fit(X, y[, sample_weight]): Fit the random classifier.
get_params([deep]): Get parameters for this estimator.
predict(X): Perform classification on test vectors X.
predict_log_proba(X): Return log probability estimates for the test vectors X.
predict_proba(X): Return probability estimates for the test vectors X.
score(X, y[, sample_weight]): Return the mean accuracy on the given test data and labels.
set_params(**params): Set the parameters of this estimator.
-
fit(X, y, sample_weight=None)
Fit the random classifier.
- Parameters
-
-
X : array-like of shape (n_samples, n_features)
Training data.
-
y : array-like of shape (n_samples,) or (n_samples, n_outputs)
Target values.
-
sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
-
- Returns
-
-
self : object
-
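A minimal sketch of how sample_weight influences fit, assuming hypothetical data and weights. The weights change the estimated class prior, and therefore the class predicted by the “prior” strategy.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical data: class 0 is more frequent by count.
X = np.zeros((4, 1))
y = np.array([0, 0, 0, 1])

# Hypothetical weights that make the single class-1 sample dominate.
weights = np.array([1.0, 1.0, 1.0, 9.0])

unweighted = DummyClassifier(strategy="prior").fit(X, y)
weighted = DummyClassifier(strategy="prior").fit(X, y, sample_weight=weights)

# Unweighted prior is [0.75, 0.25]; weighted prior is [3/12, 9/12].
print(unweighted.class_prior_, weighted.class_prior_)

# The predicted class flips from 0 to 1 once the weights are applied.
pred_unweighted = unweighted.predict(X)
pred_weighted = weighted.predict(X)
```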
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
-
-
deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
-
- Returns
-
-
params : dict
Parameter names mapped to their values.
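For illustration, get_params on a DummyClassifier returns its three constructor parameters. The parameter values below are arbitrary choices for the sketch.

```python
from sklearn.dummy import DummyClassifier

clf = DummyClassifier(strategy="constant", constant=1)

# Maps each constructor parameter name to its current value.
params = clf.get_params()
print(params)
```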
-
-
predict(X)
Perform classification on test vectors X.
- Parameters
-
-
X : array-like of shape (n_samples, n_features)
Test data.
-
- Returns
-
-
y : array-like of shape (n_samples,) or (n_samples, n_outputs)
Predicted target values for X.
-
-
predict_log_proba(X)
Return log probability estimates for the test vectors X.
- Parameters
-
-
X : {array-like, object with finite length or shape}
Test data; requires length = n_samples.
-
- Returns
-
-
P : ndarray of shape (n_samples, n_classes) or list of such arrays
Returns the log probability of the sample for each class in the model, where classes are ordered arithmetically for each output.
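As a sketch on hypothetical data, the log probabilities are the elementwise natural logarithm of the probability estimates, with columns in the order of classes_.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical toy data with class prior 0.75 / 0.25.
X = np.zeros((4, 1))
y = np.array([0, 0, 0, 1])

clf = DummyClassifier(strategy="prior").fit(X, y)

proba = clf.predict_proba(X[:2])
log_proba = clf.predict_log_proba(X[:2])

# Each row is [log(0.75), log(0.25)], matching the order of clf.classes_.
print(log_proba)
```

Strategies that assign zero probability to a class, such as “most_frequent”, yield -inf entries here.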
-
-
predict_proba(X)
Return probability estimates for the test vectors X.
- Parameters
-
-
X : array-like of shape (n_samples, n_features)
Test data.
-
- Returns
-
-
P : ndarray of shape (n_samples, n_classes) or list of such arrays
Returns the probability of the sample for each class in the model, where classes are ordered arithmetically, for each output.
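The “list of such arrays” return form can be sketched with a hypothetical two-output target. Each list entry holds the probabilities for one output.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical two-output target: one column per output.
X = np.zeros((4, 1))
y = np.array([[0, 1], [0, 2], [1, 2], [0, 2]])

clf = DummyClassifier(strategy="prior").fit(X, y)

# For multi-output targets, predict_proba returns one array per output.
proba = clf.predict_proba(X[:3])
n_arrays = len(proba)

# proba[k] has shape (n_samples, n_classes for output k).
for k, p in enumerate(proba):
    print(k, clf.classes_[k], p[0])
```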
-
-
score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, a harsh metric because it requires the entire label set of each sample to be predicted correctly.
- Parameters
-
-
X : None or array-like of shape (n_samples, n_features)
Test samples. Passing None as test samples gives the same result as passing real test samples, since DummyClassifier operates independently of the sampled observations.
-
y : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for X.
-
sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
-
- Returns
-
-
score : float
Mean accuracy of self.predict(X) with respect to y.
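A brief sketch of the X=None behavior described above, on hypothetical data. Because the features are ignored, scoring with None should match scoring with real test samples.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical toy data.
X = np.array([[-1.0], [1.0], [1.0], [1.0]])
y = np.array([0, 1, 1, 1])

clf = DummyClassifier(strategy="most_frequent").fit(X, y)

# Always predicting class 1 is correct for 3 of 4 samples.
score_real = clf.score(X, y)
score_none = clf.score(None, y)

print(score_real, score_none)
```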
-
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
-
-
**params : dict
Estimator parameters.
-
- Returns
-
-
self : estimator instance
Estimator instance.
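The <component>__<parameter> form can be sketched with a small Pipeline. The step names "scale" and "clf", the choice of StandardScaler, and the data are assumptions made for this example.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical pipeline with illustrative step names.
pipe = Pipeline([("scale", StandardScaler()), ("clf", DummyClassifier())])

# Update the nested DummyClassifier through the pipeline.
pipe.set_params(clf__strategy="constant", clf__constant=1)

# Hypothetical toy data; class 1 must appear in y for constant=1 to be valid.
X = np.arange(8, dtype=float).reshape(4, 2)
y = np.array([0, 0, 0, 1])

pred = pipe.fit(X, y).predict(X)
print(pipe.named_steps["clf"].strategy, pred)
```

set_params returns the estimator itself, so calls can be chained with fit.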
-
© 2007–2020 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/0.24/modules/generated/sklearn.dummy.DummyClassifier.html