sklearn.impute.MissingIndicator
-
class sklearn.impute.MissingIndicator(*, missing_values=nan, features='missing-only', sparse='auto', error_on_new=True)
Binary indicators for missing values.
Note that this component typically should not be used in a vanilla Pipeline consisting of transformers and a classifier, but rather could be added using a FeatureUnion or ColumnTransformer.
Read more in the User Guide.
New in version 0.20.
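As a minimal sketch of the FeatureUnion approach mentioned above, the indicator columns can be appended to imputed features. The step names "imputed" and "indicators", the mean strategy, and the toy array are illustrative assumptions, not part of the original page.

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import FeatureUnion

X = np.array([[np.nan, 1.0, 3.0],
              [4.0, 0.0, np.nan],
              [8.0, 1.0, 0.0]])

# Concatenate mean-imputed features with binary missingness indicators.
# Step names are arbitrary labels chosen for this example.
union = FeatureUnion([
    ("imputed", SimpleImputer(strategy="mean")),
    ("indicators", MissingIndicator()),
])
Xt = union.fit_transform(X)

# 3 imputed columns followed by 2 indicator columns (features 0 and 2).
print(Xt.shape)
```

The combined output can then be passed to a downstream estimator, so the model sees both the filled-in values and where they were filled in.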
- Parameters
-
-
missing_values : int, float, string, np.nan or None, default=np.nan
-
The placeholder for the missing values. All occurrences of missing_values will be marked as missing in the indicator mask. For pandas’ dataframes with nullable integer dtypes with missing values, missing_values should be set to np.nan, since pd.NA will be converted to np.nan.
-
features : {'missing-only', 'all'}, default='missing-only'
-
Whether the imputer mask should represent all or a subset of features.
- If ‘missing-only’ (default), the imputer mask will only represent features containing missing values during fit time.
- If ‘all’, the imputer mask will represent all features.
-
sparse : bool or 'auto', default='auto'
-
Whether the imputer mask format should be sparse or dense.
- If ‘auto’ (default), the imputer mask will be of the same type as the input.
- If True, the imputer mask will be a sparse matrix.
- If False, the imputer mask will be a numpy array.
-
error_on_new : bool, default=True
-
If True, transform will raise an error when there are features with missing values in transform that have no missing values in fit. This is applicable only when features='missing-only'.
-
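The interaction between features and error_on_new can be sketched as follows. The arrays are made up for illustration; feature 1 is complete during fit but has a missing value at transform time.

```python
import numpy as np
from sklearn.impute import MissingIndicator

X_fit = np.array([[np.nan, 1.0, 3.0],
                  [4.0, 0.0, 2.0]])
X_new = np.array([[5.0, np.nan, 1.0]])

# features="all": one indicator column per input feature.
mask_all = MissingIndicator(features="all").fit(X_fit).transform(X_new)

# Defaults (features="missing-only", error_on_new=True): feature 1 is
# missing only at transform time, so transform raises ValueError.
strict = MissingIndicator().fit(X_fit)
try:
    strict.transform(X_new)
    raised = False
except ValueError:
    raised = True

# error_on_new=False: the newly missing feature is ignored and only
# the features seen with missing values during fit are reported.
lenient = MissingIndicator(error_on_new=False).fit(X_fit)
mask_lenient = lenient.transform(X_new)

print(mask_all, raised, mask_lenient)
```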
- Attributes
-
-
features_ : ndarray, shape (n_missing_features,) or (n_features,)
-
The feature indices which will be returned when calling transform. They are computed during fit. For features='all', it is equal to range(n_features).
-
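A short sketch of how features_ depends on the features parameter, using a small made-up array in which only features 0 and 2 contain missing values:

```python
import numpy as np
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

subset = MissingIndicator(features="missing-only").fit(X)
everything = MissingIndicator(features="all").fit(X)

# Indices of the input features that the output columns correspond to.
print(subset.features_)
print(everything.features_)
```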
Examples
>>> import numpy as np
>>> from sklearn.impute import MissingIndicator
>>> X1 = np.array([[np.nan, 1, 3],
...                [4, 0, np.nan],
...                [8, 1, 0]])
>>> X2 = np.array([[5, 1, np.nan],
...                [np.nan, 2, 3],
...                [2, 4, 0]])
>>> indicator = MissingIndicator()
>>> indicator.fit(X1)
MissingIndicator()
>>> X2_tr = indicator.transform(X2)
>>> X2_tr
array([[False,  True],
       [ True, False],
       [False, False]])
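As a further sketch, a non-NaN placeholder can be combined with a forced sparse output. The sentinel value -1 is an assumption chosen for this example; any value accepted by missing_values would work the same way.

```python
import numpy as np
from scipy import sparse as sp
from sklearn.impute import MissingIndicator

# -1 stands in for "missing" in this made-up integer array.
X = np.array([[-1, 1, 3],
              [4, 0, -1],
              [8, 1, 0]])

indicator = MissingIndicator(missing_values=-1, sparse=True)
mask = indicator.fit_transform(X)

# sparse=True returns a sparse matrix even though the input is dense.
print(sp.issparse(mask))
print(mask.toarray())
```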
Methods
fit(X[, y])            Fit the transformer on X.
fit_transform(X[, y])  Generate missing values indicator for X.
get_params([deep])     Get parameters for this estimator.
set_params(**params)   Set the parameters of this estimator.
transform(X)           Generate missing values indicator for X.
-
fit(X, y=None)
Fit the transformer on X.
- Parameters
-
-
X : {array-like, sparse matrix}, shape (n_samples, n_features)
-
Input data, where n_samples is the number of samples and n_features is the number of features.
-
- Returns
-
-
self : object
-
Returns self.
-
-
fit_transform(X, y=None)
Generate missing values indicator for X.
- Parameters
-
-
X : {array-like, sparse matrix}, shape (n_samples, n_features)
-
The input data to complete.
-
- Returns
-
-
Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing)
-
The missing indicator for input data. The data type of Xt will be boolean.
-
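A small sketch of fit_transform on a made-up array, checking that it gives the same boolean mask as calling fit followed by transform on the same data:

```python
import numpy as np
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1.0],
              [2.0, np.nan],
              [3.0, 4.0]])

one_step = MissingIndicator().fit_transform(X)
two_step = MissingIndicator().fit(X).transform(X)

print(one_step.dtype)
print(np.array_equal(one_step, two_step))
```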
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
-
-
deep : bool, default=True
-
If True, will return the parameters for this estimator and contained subobjects that are estimators.
-
- Returns
-
-
params : dict
-
Parameter names mapped to their values.
-
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
-
-
**params : dict
-
Estimator parameters.
-
- Returns
-
-
self : estimator instance
-
Estimator instance.
-
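The get_params and set_params methods can be sketched together, including the <component>__<parameter> form on a nested object. A FeatureUnion is used as the nested object here, and the step names "imputed" and "indicators" are illustrative assumptions.

```python
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import FeatureUnion

indicator = MissingIndicator()
print(indicator.get_params())

# set_params returns the estimator itself, so calls can be chained.
returned = indicator.set_params(features="all", error_on_new=False)

# Nested form: "<component>__<parameter>" addresses one step of the union.
union = FeatureUnion([
    ("imputed", SimpleImputer()),
    ("indicators", MissingIndicator()),
])
union.set_params(indicators__sparse=False)
nested_value = union.get_params()["indicators__sparse"]
print(nested_value)
```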
-
transform(X)
Generate missing values indicator for X.
- Parameters
-
-
X : {array-like, sparse matrix}, shape (n_samples, n_features)
-
The input data to complete.
-
- Returns
-
-
Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing)
-
The missing indicator for input data. The data type of Xt will be boolean.
-
© 2007–2020 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/0.24/modules/generated/sklearn.impute.MissingIndicator.html