sklearn.impute.MissingIndicator
-
class sklearn.impute.MissingIndicator(*, missing_values=nan, features='missing-only', sparse='auto', error_on_new=True)[source] -
Binary indicators for missing values.
Note that this component typically should not be used in a vanilla Pipeline consisting of transformers and a classifier, but rather could be added using a FeatureUnion or ColumnTransformer.
Read more in the User Guide.
New in version 0.20.
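As a minimal sketch of the FeatureUnion usage mentioned above, the indicator can be stacked next to an imputer so that downstream steps see both the filled-in values and the missingness mask. The step names "imputed" and "indicator" and the choice of SimpleImputer with a mean strategy are illustrative assumptions, not something this page prescribes.

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import FeatureUnion

X = np.array([[np.nan, 1.0, 3.0],
              [4.0, 0.0, np.nan],
              [8.0, 1.0, 0.0]])

# Stack mean-imputed values side by side with the missingness mask.
# The step names are arbitrary labels chosen for this sketch.
union = FeatureUnion([
    ("imputed", SimpleImputer(strategy="mean")),
    ("indicator", MissingIndicator()),
])
Xt = union.fit_transform(X)

# 3 imputed columns + 2 indicator columns (features 0 and 2 had NaNs).
print(Xt.shape)  # (3, 5)
```

The resulting array could then be passed to a classifier, for instance as the first step of a Pipeline.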
- Parameters
-
-
missing_values : int, float, string, np.nan or None, default=np.nan -
The placeholder for the missing values. All occurrences of missing_values will be flagged in the indicator mask. For pandas’ dataframes with nullable integer dtypes with missing values, missing_values should be set to np.nan, since pd.NA will be converted to np.nan.
-
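A short sketch of a non-default placeholder, assuming a dataset where the integer -1 is used as the sentinel for missing entries (the sentinel value is a made-up example):

```python
import numpy as np
from sklearn.impute import MissingIndicator

# -1 is a hypothetical sentinel standing in for "missing" in this data.
X = np.array([[-1, 1, 3],
              [4, 0, -1],
              [8, 1, 0]])

indicator = MissingIndicator(missing_values=-1)
mask = indicator.fit_transform(X)

# Columns 0 and 2 contain the sentinel, so the mask has two columns.
print(mask.shape)  # (3, 2)
```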
features : {‘missing-only’, ‘all’}, default=‘missing-only’ -
Whether the imputer mask should represent all or a subset of features.
- If ‘missing-only’ (default), the imputer mask will only represent features containing missing values during fit time.
- If ‘all’, the imputer mask will represent all features.
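The difference between the two settings can be sketched as follows, using a small array in which only the first and last columns contain NaN:

```python
import numpy as np
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

# Only the columns that had missing values at fit time are kept.
mask_subset = MissingIndicator(features="missing-only").fit_transform(X)

# Every input column gets an indicator column, even if it is all False.
mask_all = MissingIndicator(features="all").fit_transform(X)

print(mask_subset.shape)  # (3, 2)
print(mask_all.shape)     # (3, 3)
```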
-
sparse : bool or ‘auto’, default=‘auto’ -
Whether the imputer mask format should be sparse or dense.
- If ‘auto’ (default), the imputer mask will be of same type as input.
- If True, the imputer mask will be a sparse matrix.
- If False, the imputer mask will be a numpy array.
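As an illustration of the options above, the following sketch passes the same dense input with sparse='auto' and with sparse=True. It relies on scipy.sparse.issparse only to inspect the result.

```python
import numpy as np
from scipy import sparse as sp
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

# 'auto' follows the input type: dense in, dense out.
dense_mask = MissingIndicator(sparse="auto").fit_transform(X)

# True forces a sparse matrix even for dense input.
sparse_mask = MissingIndicator(sparse=True).fit_transform(X)

print(isinstance(dense_mask, np.ndarray))  # True
print(sp.issparse(sparse_mask))            # True
```

Both results hold the same boolean values; only the container differs.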
-
error_on_new : bool, default=True -
If True, transform will raise an error when there are features with missing values in transform that have no missing values in fit. This is applicable only when features='missing-only'.
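A minimal sketch of both behaviours, assuming training data with NaN only in the first column and new data that also has NaN in the second column:

```python
import numpy as np
from sklearn.impute import MissingIndicator

X_fit = np.array([[np.nan, 1.0],
                  [2.0, 3.0]])
X_new = np.array([[5.0, np.nan],
                  [np.nan, 7.0]])

# Default: a feature that is newly missing at transform time is an error.
strict = MissingIndicator(error_on_new=True).fit(X_fit)
try:
    strict.transform(X_new)
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# With error_on_new=False the new missing feature is silently ignored;
# only the features selected during fit are reported.
lenient = MissingIndicator(error_on_new=False).fit(X_fit)
mask = lenient.transform(X_new)
print(mask.shape)  # (2, 1)
```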
-
- Attributes
-
-
features_ : ndarray, shape (n_missing_features,) or (n_features,) -
The feature indices which will be returned when calling transform. They are computed during fit. For features='all', it is equal to range(n_features).
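For instance, the attribute can be inspected after fitting, as in this short sketch:

```python
import numpy as np
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

# Columns 0 and 2 contain NaN, so only those indices are stored.
subset = MissingIndicator(features="missing-only").fit(X)
print(subset.features_.tolist())  # [0, 2]

# With features='all' every column index is stored.
everything = MissingIndicator(features="all").fit(X)
print(everything.features_.tolist())  # [0, 1, 2]
```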
-
Examples
>>> import numpy as np
>>> from sklearn.impute import MissingIndicator
>>> X1 = np.array([[np.nan, 1, 3],
...                [4, 0, np.nan],
...                [8, 1, 0]])
>>> X2 = np.array([[5, 1, np.nan],
...                [np.nan, 2, 3],
...                [2, 4, 0]])
>>> indicator = MissingIndicator()
>>> indicator.fit(X1)
MissingIndicator()
>>> X2_tr = indicator.transform(X2)
>>> X2_tr
array([[False,  True],
       [ True, False],
       [False, False]])
Methods
fit(X[, y])            Fit the transformer on X.
fit_transform(X[, y])  Generate missing values indicator for X.
get_params([deep])     Get parameters for this estimator.
set_params(**params)   Set the parameters of this estimator.
transform(X)           Generate missing values indicator for X.
-
fit(X, y=None)[source] -
Fit the transformer on X.
- Parameters
-
-
X : {array-like, sparse matrix}, shape (n_samples, n_features) -
Input data, where n_samples is the number of samples and n_features is the number of features.
-
- Returns
-
-
self : object -
Returns self.
-
-
fit_transform(X, y=None)[source] -
Generate missing values indicator for X.
- Parameters
-
-
X : {array-like, sparse matrix}, shape (n_samples, n_features) -
The input data to complete.
-
- Returns
-
-
Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing) -
The missing indicator for input data. The data type of Xt will be boolean.
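A brief sketch of the return value, which also checks that fit_transform gives the same mask as calling fit and then transform on the same data:

```python
import numpy as np
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

# One-step call.
Xt = MissingIndicator().fit_transform(X)

# Equivalent two-step call on the same data.
Xt_two_step = MissingIndicator().fit(X).transform(X)

print(Xt.dtype)                         # bool
print(np.array_equal(Xt, Xt_two_step))  # True
```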
-
-
get_params(deep=True)[source] -
Get parameters for this estimator.
- Parameters
-
-
deep : bool, default=True -
If True, will return the parameters for this estimator and contained subobjects that are estimators.
-
- Returns
-
-
params : dict -
Parameter names mapped to their values.
-
-
set_params(**params)[source] -
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
-
-
**params : dict -
Estimator parameters.
-
- Returns
-
-
self : estimator instance -
Estimator instance.
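The two forms described for set_params can be sketched as follows. The FeatureUnion and its step names "imputed" and "indicator" are assumptions made for illustration.

```python
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import FeatureUnion

# Simple estimator: parameters are set by name, and the same
# instance is returned so calls can be chained.
indicator = MissingIndicator()
returned = indicator.set_params(features="all", error_on_new=False)
print(returned is indicator)  # True

# Nested object: <component>__<parameter> reaches into a named step.
union = FeatureUnion([
    ("imputed", SimpleImputer()),
    ("indicator", MissingIndicator()),
])
union.set_params(indicator__features="all")
print(union.get_params()["indicator__features"])  # all
```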
-
-
transform(X)[source] -
Generate missing values indicator for X.
- Parameters
-
-
X : {array-like, sparse matrix}, shape (n_samples, n_features) -
The input data to complete.
-
- Returns
-
-
Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing) -
The missing indicator for input data. The data type of Xt will be boolean.
-
© 2007–2020 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/0.24/modules/generated/sklearn.impute.MissingIndicator.html