sklearn.datasets.fetch_kddcup99
-
sklearn.datasets.fetch_kddcup99(*, subset=None, data_home=None, shuffle=False, random_state=None, percent10=True, download_if_missing=True, return_X_y=False, as_frame=False)
[source] -
Load the kddcup99 dataset (classification).
Download it if necessary.
Classes
23
Samples total
4898431
Dimensionality
41
Features
discrete (int) or continuous (float)
Read more in the User Guide.
New in version 0.18.
- Parameters
-
-
subset{‘SA’, ‘SF’, ‘http’, ‘smtp’}, default=None
-
To return the corresponding classical subsets of kddcup 99. If None, return the entire kddcup 99 dataset.
-
data_homestr, default=None
-
Specify another download and cache folder for the datasets. By default all scikit-learn data is stored in ‘~/scikit_learn_data’ subfolders. .. versionadded:: 0.19
-
shufflebool, default=False
-
Whether to shuffle dataset.
-
random_stateint, RandomState instance or None, default=None
-
Determines random number generation for dataset shuffling and for selection of abnormal samples if
subset='SA'
. Pass an int for reproducible output across multiple function calls. See Glossary. -
percent10bool, default=True
-
Whether to load only 10 percent of the data.
-
download_if_missingbool, default=True
-
If False, raise a IOError if the data is not locally available instead of trying to download the data from the source site.
-
return_X_ybool, default=False
-
If True, returns
(data, target)
instead of a Bunch object. See below for more information about thedata
andtarget
object.New in version 0.20.
-
as_framebool, default=False
-
If
True
, returns a pandas Dataframe for thedata
andtarget
objects in theBunch
returned object;Bunch
return object will also have aframe
member.New in version 0.24.
-
- Returns
-
-
dataBunch
-
Dictionary-like object, with the following attributes.
-
data{ndarray, dataframe} of shape (494021, 41)
-
The data matrix to learn. If
as_frame=True
,data
will be a pandas DataFrame. -
target{ndarray, series} of shape (494021,)
-
The regression target for each sample. If
as_frame=True
,target
will be a pandas Series. -
framedataframe of shape (494021, 42)
-
Only present when
as_frame=True
. Containsdata
andtarget
. -
DESCRstr
-
The full description of the dataset.
-
feature_nameslist
-
The names of the dataset columns
- target_names: list
-
The names of the target columns
-
-
(data, target)tuple if return_X_y is True
-
New in version 0.20.
-
© 2007–2020 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/0.24/modules/generated/sklearn.datasets.fetch_kddcup99.html