sklearn.preprocessing.robust_scale

sklearn.preprocessing.robust_scale(X, *, axis=0, with_centering=True, with_scaling=True, quantile_range=25.0, 75.0, copy=True, unit_variance=False) [source]

Standardize a dataset along any axis

Center to the median and component wise scale according to the interquartile range.

Notes

This implementation will refuse to center scipy.sparse matrices since it would make them non-sparse and would potentially crash the program with memory exhaustion problems.

Instead the caller is expected to either set explicitly with_centering=False (in that case, only variance scaling will be performed on the features of the CSR matrix) or to call X.toarray() if he/she expects the materialized dense array to fit in memory.

To avoid memory copy the caller should pass a CSR matrix.

For a comparison of the different scalers, transformers, and normalizers, see examples/preprocessing/plot_all_scaling.py.

Warning

Risk of data leak

Do not use robust_scale unless you know what you are doing. A common mistake is to apply it to the entire data before splitting into training and test sets. This will bias the model evaluation because information would have leaked from the test set to the training set. In general, we recommend using RobustScaler within a Pipeline in order to prevent most risks of data leaking: pipe = make_pipeline(RobustScaler(), LogisticRegression()).

© 2007–2020 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/0.24/modules/generated/sklearn.preprocessing.robust_scale.html