sklearn.linear_model.TweedieRegressor

class sklearn.linear_model.TweedieRegressor(*, power=0.0, alpha=1.0, fit_intercept=True, link='auto', max_iter=100, tol=0.0001, warm_start=False, verbose=0)

Generalized Linear Model with a Tweedie distribution.

This estimator can be used to model different GLMs depending on the power parameter, which determines the underlying distribution.

Read more in the User Guide.

New in version 0.23.

Parameters

power : float, default=0

The power determines the underlying target distribution according to the following table:

Power	Distribution
0	Normal
1	Poisson
(1,2)	Compound Poisson Gamma
2	Gamma
3	Inverse Gaussian

For 0 < power < 1, no distribution exists.
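As a rough illustration of the table above, the sketch below fits three models whose power matches the kind of target being modelled. The synthetic data, the seed, and the variable names are assumptions made for this example only.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))
mu = np.exp(0.5 + X[:, 0] - 0.5 * X[:, 1])  # true mean, log-linear in X

# Count-like target: Poisson (power=1).
y_counts = rng.poisson(mu).astype(float)
poisson_reg = TweedieRegressor(power=1, alpha=0.1).fit(X, y_counts)

# Strictly positive continuous target: Gamma (power=2).
y_positive = rng.gamma(shape=2.0, scale=mu / 2.0)
gamma_reg = TweedieRegressor(power=2, alpha=0.1).fit(X, y_positive)

# Non-negative target with exact zeros: compound Poisson Gamma (1 < power < 2).
y_mixed = np.where(rng.uniform(size=200) < 0.3, 0.0, y_positive)
cpg_reg = TweedieRegressor(power=1.5, alpha=0.1).fit(X, y_mixed)
```

The value 1.5 for the compound Poisson Gamma case is an arbitrary choice inside the interval (1, 2); in practice it is usually tuned.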

alpha : float, default=1

Constant that multiplies the penalty term and thus determines the regularization strength. alpha = 0 is equivalent to unpenalized GLMs. In this case, the design matrix X must have full column rank (no collinearities).
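The effect of alpha can be seen by comparing the size of the fitted coefficients under weak and strong regularization. This is a minimal sketch on synthetic data (an assumption of this example), using power=0 so that the model reduces to a penalized least-squares fit.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Same model, two regularization strengths.
weak = TweedieRegressor(power=0, alpha=0.01).fit(X, y)
strong = TweedieRegressor(power=0, alpha=10.0).fit(X, y)

# A larger alpha shrinks the coefficient vector towards zero.
weak_norm = np.linalg.norm(weak.coef_)
strong_norm = np.linalg.norm(strong.coef_)
print(weak_norm, strong_norm)
```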

link : {‘auto’, ‘identity’, ‘log’}, default=‘auto’

The link function of the GLM, i.e. the mapping from the linear predictor X @ coef + intercept to the prediction y_pred. Option ‘auto’ sets the link depending on the chosen family as follows:

‘identity’ for Normal distribution
‘log’ for Poisson, Gamma and Inverse Gaussian distributions
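One practical consequence of the link choice is the range of the predictions: with the log link they are always positive, while the identity link can produce negative values. The sketch below reuses the small dataset from the Examples section; the query point X_new is a hypothetical input chosen far outside the training range.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([2, 3.5, 5, 5.5])

identity_reg = TweedieRegressor(power=0, link="identity").fit(X, y)
log_reg = TweedieRegressor(power=0, link="log").fit(X, y)

# Hypothetical query point far outside the training range.
X_new = np.array([[-10.0, -10.0]])
pred_identity = identity_reg.predict(X_new)  # linear predictor, may be negative
pred_log = log_reg.predict(X_new)            # exp(linear predictor), always > 0
print(pred_identity, pred_log)
```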

fit_intercept : bool, default=True

Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).

max_iter : int, default=100

The maximal number of iterations for the solver.

tol : float, default=1e-4

Stopping criterion. For the lbfgs solver, the iteration will stop when max{|g_j|, j = 1, ..., d} <= tol where g_j is the j-th component of the gradient (derivative) of the objective function.

warm_start : bool, default=False

If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_.
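A common use of warm_start is refitting the same estimator with a slightly different alpha, so that the solver starts from the previous solution instead of from scratch. The following is a sketch only; the alpha values are arbitrary and the data is the small dataset from the Examples section.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([2, 3.5, 5, 5.5])

reg = TweedieRegressor(power=1, alpha=0.5, warm_start=True)
reg.fit(X, y)
first_coef = reg.coef_.copy()
first_n_iter = reg.n_iter_

# The second fit starts from coef_ and intercept_ of the first fit.
reg.set_params(alpha=0.4)
reg.fit(X, y)
second_coef = reg.coef_.copy()
second_n_iter = reg.n_iter_
print(first_n_iter, second_n_iter)
```

Whether the second fit needs fewer iterations depends on the data and on how far apart the two alpha values are.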

verbose : int, default=0

For the lbfgs solver, set verbose to any positive number for verbosity.

Attributes

coef_ : array of shape (n_features,)
    Estimated coefficients for the linear predictor (X @ coef_ + intercept_) in the GLM.
intercept_ : float
    Intercept (a.k.a. bias) added to linear predictor.
n_iter_ : int
    Actual number of iterations used in the solver.
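The fitted attributes are enough to reproduce predict by hand: apply the inverse link to the linear predictor. The sketch below assumes the log link, whose inverse is the exponential, and reuses the small dataset from the Examples section.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

X = np.array([[1, 2], [2, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([2, 3.5, 5, 5.5])

reg = TweedieRegressor(power=1, alpha=0.1, link="log").fit(X, y)

# Inverse log link applied to the linear predictor.
manual_pred = np.exp(X @ reg.coef_ + reg.intercept_)
same = np.allclose(manual_pred, reg.predict(X))
print(same)
```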

Examples

>>> from sklearn import linear_model
>>> clf = linear_model.TweedieRegressor()
>>> X = [[1, 2], [2, 3], [3, 4], [4, 3]]
>>> y = [2, 3.5, 5, 5.5]
>>> clf.fit(X, y)
TweedieRegressor()
>>> clf.score(X, y)
0.839...
>>> clf.coef_
array([0.599..., 0.299...])
>>> clf.intercept_
1.600...
>>> clf.predict([[1, 1], [3, 4]])
array([2.500..., 4.599...])

Methods

`fit`(X, y[, sample_weight])	Fit a Generalized Linear Model.
`get_params`([deep])	Get parameters for this estimator.
`predict`(X)	Predict using GLM with feature matrix X.
`score`(X, y[, sample_weight])	Compute D^2, the percentage of deviance explained.
`set_params`(**params)	Set the parameters of this estimator.

fit(X, y, sample_weight=None)

Fit a Generalized Linear Model.

Parameters

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Training data.
y : array-like of shape (n_samples,)
    Target values.
sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.

Returns

self : returns an instance of self.
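One typical use of sample_weight, for instance in insurance pricing, is to model a claim frequency (claims per unit of exposure) and weight each observation by its exposure. The sketch below uses synthetic data; the names exposure, claims, and frequency are assumptions of this example, not part of the API.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

rng = np.random.default_rng(0)
n = 300
X = rng.uniform(size=(n, 2))
exposure = rng.uniform(0.1, 1.0, size=n)  # e.g. fraction of a year insured
claims = rng.poisson(exposure * np.exp(-1.0 + X[:, 0]))

# Target is a rate; observations with more exposure get more weight.
frequency = claims / exposure
reg = TweedieRegressor(power=1, alpha=1e-3)
reg.fit(X, frequency, sample_weight=exposure)
rate_pred = reg.predict(X)
```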

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : dict
    Parameter names mapped to their values.

predict(X)

Predict using GLM with feature matrix X.

Parameters

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Samples.

Returns

y_pred : array of shape (n_samples,)
    Returns predicted values.

score(X, y, sample_weight=None)

Compute D^2, the percentage of deviance explained.

D^2 is a generalization of the coefficient of determination R^2. R^2 uses the squared error, whereas D^2 uses the deviance. The two are equal for the Normal distribution (power=0).

D^2 is defined as \(D^2 = 1-\frac{D(y_{true},y_{pred})}{D_{null}}\), where \(D_{null}\) is the null deviance, i.e. the deviance of a model with intercept alone, which corresponds to \(y_{pred} = \bar{y}\). The mean \(\bar{y}\) is weighted by sample_weight. The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse).

Parameters

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Test samples.
y : array-like of shape (n_samples,)
    True values of target.
sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.

Returns

score : float
    D^2 of self.predict(X) w.r.t. y.
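The definition of D^2 can be checked by hand with sklearn.metrics.mean_tweedie_deviance: compute the deviance of the model and of the constant prediction equal to the mean of y, then take one minus their ratio. This is a sketch on synthetic Poisson data (an assumption of this example) without sample weights; the last lines also compare score with R^2 for power=0.

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor
from sklearn.metrics import mean_tweedie_deviance, r2_score

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = rng.poisson(np.exp(0.5 + X[:, 0])).astype(float)

reg = TweedieRegressor(power=1, alpha=0.1).fit(X, y)
y_pred = reg.predict(X)

# Deviance of the fitted model and of the intercept-only (null) model.
dev_model = mean_tweedie_deviance(y, y_pred, power=1)
dev_null = mean_tweedie_deviance(y, np.full_like(y, y.mean()), power=1)
d2_manual = 1 - dev_model / dev_null
d2_score = reg.score(X, y)

# For power=0 the deviance is the squared error, so D^2 equals R^2.
normal_reg = TweedieRegressor(power=0).fit(X, y)
d2_normal = normal_reg.score(X, y)
r2_normal = r2_score(y, normal_reg.predict(X))
print(d2_manual, d2_score, d2_normal, r2_normal)
```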

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params : dict
    Estimator parameters.

Returns

self : estimator instance
    Estimator instance.
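The <component>__<parameter> form described for set_params can be sketched with a pipeline. The step name tweedieregressor below is the one make_pipeline derives from the class name; the choice of StandardScaler as a preprocessing step is an assumption of this example.

```python
from sklearn.linear_model import TweedieRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

pipe = make_pipeline(StandardScaler(), TweedieRegressor(power=1))

# Update parameters of the nested regressor through the pipeline.
pipe.set_params(tweedieregressor__alpha=0.5, tweedieregressor__max_iter=200)

nested = pipe.named_steps["tweedieregressor"]
print(nested.alpha, nested.max_iter)
```

The same parameter names can be used as keys in a parameter grid for a search such as GridSearchCV.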

Examples using `sklearn.linear_model.TweedieRegressor`

Release Highlights for scikit-learn 0.23

Tweedie regression on insurance claims

© 2007–2020 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/0.24/modules/generated/sklearn.linear_model.TweedieRegressor.html
