pandas.api.extensions.ExtensionArray

class pandas.api.extensions.ExtensionArray

Abstract base class for custom 1-D array types.

pandas will recognize instances of this class as proper arrays with a custom type and will not attempt to coerce them to objects. They may be stored directly inside a DataFrame or Series.

New in version 0.23.0.

Notes

The interface includes the following abstract methods that must be implemented by subclasses:

  • _from_sequence
  • _from_factorized
  • __getitem__
  • __len__
  • dtype
  • nbytes
  • isna
  • take
  • copy
  • _concat_same_type
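
A minimal sketch of a subclass that implements the methods listed above may help make the interface concrete. TokenDtype and TokenArray are hypothetical names invented for this illustration, the object-dtype NumPy backing array is an assumption rather than a requirement, and the signatures follow the 0.23 interface (newer pandas releases may require additional methods).

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray, ExtensionDtype, take


class TokenDtype(ExtensionDtype):
    """Hypothetical dtype, invented for this sketch."""

    name = "token"
    type = str

    @classmethod
    def construct_array_type(cls):
        return TokenArray


class TokenArray(ExtensionArray):
    """Hypothetical array of strings stored in an object ndarray (an assumption)."""

    def __init__(self, values):
        # Stored under ._data, not .values or ._values.
        self._data = np.asarray(values, dtype=object)

    @classmethod
    def _from_sequence(cls, scalars, dtype=None, copy=False):
        return cls(list(scalars))

    @classmethod
    def _from_factorized(cls, values, original):
        return cls(values)

    def __getitem__(self, item):
        if pd.api.types.is_integer(item):
            return self._data[item]          # scalar lookup
        return type(self)(self._data[item])  # slice, mask or index array

    def __len__(self):
        return len(self._data)

    @property
    def dtype(self):
        return TokenDtype()

    @property
    def nbytes(self):
        return self._data.nbytes

    def isna(self):
        # Treats both None and NaN as missing.
        return pd.isna(self._data)

    def take(self, indices, allow_fill=False, fill_value=None):
        if allow_fill and fill_value is None:
            fill_value = self.dtype.na_value
        result = take(
            self._data, indices, allow_fill=allow_fill, fill_value=fill_value
        )
        return type(self)(result)

    def copy(self, deep=False):
        return type(self)(self._data.copy())

    @classmethod
    def _concat_same_type(cls, to_concat):
        return cls(np.concatenate([x._data for x in to_concat]))


arr = TokenArray(["a", None, "b"])
ser = pd.Series(arr)  # stored as-is, not coerced to object
```

In this sketch, methods that are not overridden (unique, factorize, argsort, fillna) fall back to the base-class implementations.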

An additional method is available to satisfy pandas’ internal, private block API.

  • _formatting_values

Some methods require casting the ExtensionArray to an ndarray of Python objects with self.astype(object), which may be expensive. When performance is a concern, we highly recommend overriding the following methods:

  • fillna
  • unique
  • factorize / _values_for_factorize
  • argsort / _values_for_argsort
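
As a rough illustration of such overrides, the sketch below backs a hypothetical MeasureArray with a single float64 ndarray (an assumption) and overrides unique, _values_for_argsort and _values_for_factorize so that they operate on that ndarray directly instead of on an object-dtype copy. MeasureDtype and MeasureArray are invented names; the required methods are included only so the example runs, and fillna is left out for brevity.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray, ExtensionDtype, take


class MeasureDtype(ExtensionDtype):
    """Hypothetical dtype, invented for this sketch."""

    name = "measure"
    type = float

    @classmethod
    def construct_array_type(cls):
        return MeasureArray


class MeasureArray(ExtensionArray):
    """Hypothetical array backed by one float64 ndarray (an assumption)."""

    def __init__(self, values):
        self._data = np.asarray(values, dtype="float64")

    # --- required interface, kept as short as possible ---
    @classmethod
    def _from_sequence(cls, scalars, dtype=None, copy=False):
        return cls(list(scalars))

    @classmethod
    def _from_factorized(cls, values, original):
        return cls(values)

    def __getitem__(self, item):
        result = self._data[item]
        return result if np.ndim(result) == 0 else type(self)(result)

    def __len__(self):
        return len(self._data)

    @property
    def dtype(self):
        return MeasureDtype()

    @property
    def nbytes(self):
        return self._data.nbytes

    def isna(self):
        return np.isnan(self._data)

    def take(self, indices, allow_fill=False, fill_value=None):
        if allow_fill and fill_value is None:
            fill_value = self.dtype.na_value
        result = take(
            self._data, indices, allow_fill=allow_fill, fill_value=fill_value
        )
        return type(self)(result)

    def copy(self, deep=False):
        return type(self)(self._data.copy())

    @classmethod
    def _concat_same_type(cls, to_concat):
        return cls(np.concatenate([x._data for x in to_concat]))

    # --- overrides that use the float64 data directly ---
    def unique(self):
        return type(self)(pd.unique(self._data))

    def _values_for_argsort(self):
        return self._data

    def _values_for_factorize(self):
        # Second element is the value that marks missing entries.
        return self._data, np.nan


arr = MeasureArray([3.0, 1.0, 3.0, 2.0])
order = arr.argsort()            # sorts via _values_for_argsort
codes, uniques = arr.factorize() # encodes via _values_for_factorize
distinct = arr.unique()
```

The inherited argsort and factorize pick up the overridden _values_for_* hooks, so only the small hook methods need to be written.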

This class does not inherit from ‘abc.ABCMeta’ for performance reasons. Methods and properties required by the interface raise pandas.errors.AbstractMethodError, and no register method is provided for registering virtual subclasses.
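
The practical consequence can be seen in a short sketch: a subclass that implements nothing can still be instantiated, and the error only appears when a required method is called. IncompleteArray is a hypothetical name used for illustration.

```python
import abc

import pandas as pd
from pandas.api.extensions import ExtensionArray


class IncompleteArray(ExtensionArray):
    """Hypothetical subclass that implements none of the required methods."""


# Instantiation succeeds: without abc.ABCMeta nothing is checked up front.
arr = IncompleteArray()

try:
    len(arr)  # __len__ is part of the required interface
except pd.errors.AbstractMethodError as exc:
    error_name = type(exc).__name__
else:
    error_name = None

uses_abc_meta = isinstance(ExtensionArray, abc.ABCMeta)
has_register = hasattr(ExtensionArray, "register")
```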

ExtensionArrays are limited to 1 dimension.

They may be backed by none, one, or many NumPy arrays. For example, pandas.Categorical is an extension array backed by two arrays, one for codes and one for categories. An array of IPv6 addresses may be backed by a NumPy structured array with two fields, one for the lower 64 bits and one for the upper 64 bits. Or they may be backed by some other storage type, like Python lists.

The ExtensionArray interface imposes no rules on how the data is stored; pandas only assumes that it can be converted to a NumPy array. Currently, however, the backing data cannot be stored in attributes called .values or ._values, to ensure full compatibility with pandas internals. Other names, such as .data, ._data, ._items, …, can be used freely.
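
The IPv6 storage layout mentioned above could look roughly like the following sketch. The field names "hi" and "lo" and the helper functions pack_ipv6 and unpack_ipv6 are assumptions made for illustration; they are not part of pandas.

```python
import ipaddress

import numpy as np

# Assumed layout: two unsigned 64-bit fields, upper and lower half.
IPV6_STORAGE = np.dtype([("hi", "u8"), ("lo", "u8")])
MASK_64 = (1 << 64) - 1


def pack_ipv6(addresses):
    """Store textual IPv6 addresses in a structured array."""
    storage = np.empty(len(addresses), dtype=IPV6_STORAGE)
    for i, text in enumerate(addresses):
        value = int(ipaddress.IPv6Address(text))
        storage["hi"][i] = value >> 64       # upper 64 bits
        storage["lo"][i] = value & MASK_64   # lower 64 bits
    return storage


def unpack_ipv6(storage):
    """Rebuild IPv6Address objects from the two 64-bit fields."""
    return [
        ipaddress.IPv6Address((int(hi) << 64) | int(lo))
        for hi, lo in zip(storage["hi"], storage["lo"])
    ]


packed = pack_ipv6(["2001:db8::1", "::1"])
restored = unpack_ipv6(packed)
```

An extension array built on this layout would keep the structured array in an attribute such as ._data and use 16 bytes per address.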

Attributes

dtype     An instance of ‘ExtensionDtype’.
nbytes    The number of bytes needed to store this object in memory.
ndim      Extension Arrays are only allowed to be 1-dimensional.
shape     Return a tuple of the array dimensions.

Methods

argsort([ascending, kind])                  Return the indices that would sort this array.
astype(dtype[, copy])                       Cast to a NumPy array with ‘dtype’.
copy([deep])                                Return a copy of the array.
factorize([na_sentinel])                    Encode the extension array as an enumerated type.
fillna([value, method, limit])              Fill NA/NaN values using the specified method.
isna()                                      Boolean NumPy array indicating if each value is missing.
take(indices[, allow_fill, fill_value])     Take elements from an array.
unique()                                    Compute the ExtensionArray of unique values.
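
As a brief illustration of these methods on a concrete array, the sketch below calls several of them on pandas.Categorical, which the notes above name as an extension array. The input values are arbitrary.

```python
import pandas as pd

cat = pd.Categorical(["b", "a", None, "b"])

missing = cat.isna()              # [False, False, True, False]
codes, uniques = cat.factorize()  # codes: [0, 1, -1, 0]
distinct = cat.unique()
# With allow_fill=True, -1 marks a position to fill with a missing value.
taken = cat.take([0, -1], allow_fill=True)
```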

© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.api.extensions.ExtensionArray.html