numpy.random.Generator.hypergeometric
method
Generator.hypergeometric(ngood, nbad, nsample, size=None)
Draw samples from a hypergeometric distribution.
Samples are drawn from a hypergeometric distribution with the specified parameters: ngood (ways to make a good selection), nbad (ways to make a bad selection), and nsample (number of items sampled, which is less than or equal to the sum ngood + nbad).
Parameters
ngood : int or array_like of ints
    Number of ways to make a good selection. Must be nonnegative and less than 10**9.
nbad : int or array_like of ints
    Number of ways to make a bad selection. Must be nonnegative and less than 10**9.
nsample : int or array_like of ints
    Number of items sampled. Must be nonnegative and less than or equal to ngood + nbad.
size : int or tuple of ints, optional
    Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if ngood, nbad, and nsample are all scalars. Otherwise, np.broadcast(ngood, nbad, nsample).size samples are drawn.
Returns
out : ndarray or scalar
    Drawn samples from the parameterized hypergeometric distribution. Each sample is the number of good items within a randomly selected subset of size nsample taken from a set of ngood good items and nbad bad items.
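The rules for size and the returned shape can be illustrated with a short sketch. The seed and the parameter values below are arbitrary choices made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(12345)  # arbitrary seed, only for repeatability

# All parameters scalar and size=None: a single value is returned.
single = rng.hypergeometric(ngood=7, nbad=3, nsample=5)

# An explicit size gives an array of exactly that shape.
grid = rng.hypergeometric(7, 3, 5, size=(2, 4))

# Array parameters are broadcast against each other; with size=None the
# result takes the broadcast shape, here (2, 1) with (3,) -> (2, 3).
ngood = np.array([[5], [50]])
nsample = np.array([1, 2, 3])
mixed = rng.hypergeometric(ngood, 4, nsample)

print(np.ndim(single), grid.shape, mixed.shape)  # 0 (2, 4) (2, 3)
```

Each entry of mixed is drawn with its own (ngood, nsample) pair, so no entry can exceed the corresponding nsample.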
See also
multivariate_hypergeometric
    Draw samples from the multivariate hypergeometric distribution.
scipy.stats.hypergeom
    Probability mass function, cumulative distribution function, etc.
Notes
The probability mass function of the hypergeometric distribution is
$$P(x) = \frac{\binom{g}{x}\binom{b}{n-x}}{\binom{g+b}{n}},$$
where $0 \le x \le n$ and $n - b \le x \le g$, for P(x) the probability of x good results in the drawn sample, g = ngood, b = nbad, and n = nsample.
Consider an urn containing black and white marbles, of which ngood are black and nbad are white. If you draw nsample marbles without replacement, the hypergeometric distribution describes the number of black marbles in the drawn sample.
This distribution is very similar to the binomial distribution, except that here samples are drawn without replacement, whereas in the binomial case samples are drawn with replacement (or the sample space is infinite). As the sample space becomes large, this distribution approaches the binomial.
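As a rough check of the two claims above, the sketch below evaluates P(x) directly from binomial coefficients, compares it with the observed frequencies of drawn samples, and then measures how far the distribution is from a binomial with p = g/(g + b) for a small and a large population. The helper names (hypergeom_pmf, binom_pmf, tv_distance_to_binomial), the seed, and the parameter values are assumptions made for this illustration and are not part of NumPy.

```python
import math
import numpy as np

def hypergeom_pmf(x, g, b, n):
    """P(x) = C(g, x) * C(b, n - x) / C(g + b, n); zero outside the support."""
    if x < max(0, n - b) or x > min(n, g):
        return 0.0
    return math.comb(g, x) * math.comb(b, n - x) / math.comb(g + b, n)

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Exact probabilities versus observed frequencies.
g, b, n = 6, 4, 5
exact = np.array([hypergeom_pmf(x, g, b, n) for x in range(n + 1)])

rng = np.random.default_rng(2020)  # arbitrary seed, only for repeatability
draws = rng.hypergeometric(g, b, n, size=200_000)
empirical = np.bincount(draws, minlength=n + 1) / draws.size
max_abs_error = np.abs(exact - empirical).max()

def tv_distance_to_binomial(g, b, n):
    """Total variation distance between the hypergeometric and binomial PMFs."""
    p = g / (g + b)
    return 0.5 * sum(
        abs(hypergeom_pmf(x, g, b, n) - binom_pmf(x, n, p)) for x in range(n + 1)
    )

small_population = tv_distance_to_binomial(6, 4, 5)
large_population = tv_distance_to_binomial(6000, 4000, 5)
print(max_abs_error, small_population, large_population)
```

With the same proportion of good items, the distance to the binomial should shrink as the population grows, which is the sense in which the distribution approaches the binomial.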
The arguments ngood and nbad must each be less than 10**9. For extremely large arguments, the algorithm used to compute the samples [4] breaks down because of loss of precision in floating point calculations. For such large values, if nsample is not also large, the distribution can be approximated with the binomial distribution, binomial(n=nsample, p=ngood/(ngood + nbad)).
References
[1] Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.
[2] Weisstein, Eric W. “Hypergeometric Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/HypergeometricDistribution.html
[3] Wikipedia, “Hypergeometric distribution”, https://en.wikipedia.org/wiki/Hypergeometric_distribution
[4] Stadlober, Ernst, “The ratio of uniforms approach for generating discrete random variates”, Journal of Computational and Applied Mathematics, 31, pp. 181-189 (1990).
Examples
Draw samples from the distribution:
>>> import numpy as np
>>> rng = np.random.default_rng()
>>> ngood, nbad, nsamp = 100, 2, 10
>>> # number of good, number of bad, and number of samples
>>> s = rng.hypergeometric(ngood, nbad, nsamp, 1000)
>>> from matplotlib.pyplot import hist
>>> hist(s)
>>> #   note that it is very unlikely to grab both bad items
Suppose you have an urn with 15 white and 15 black marbles. If you pull 15 marbles at random, how likely is it that 12 or more of them are one color?
>>> s = rng.hypergeometric(15, 15, 15, 100000)
>>> (np.sum(s >= 12) + np.sum(s <= 3)) / 100000.
>>> #   answer is approximately 0.003 ... pretty unlikely!
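The simulated estimate for the urn question can be cross-checked against the exact probability. This is a minimal sketch using only the standard library; the names pmf and p_extreme are chosen for this illustration.

```python
from math import comb

white, black, drawn = 15, 15, 15
total = comb(white + black, drawn)  # number of ways to pull 15 of the 30 marbles

def pmf(x):
    """Probability of exactly x white marbles among the drawn ones."""
    return comb(white, x) * comb(black, drawn - x) / total

# "12 or more of one color" means 12 or more white,
# or 12 or more black (that is, 3 or fewer white).
p_extreme = sum(pmf(x) for x in range(12, drawn + 1)) + sum(pmf(x) for x in range(0, 4))
print(round(p_extreme, 4))  # 0.0028
```

The exact value, about 0.0028, agrees with the simulated answer of roughly 0.003.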
© 2005–2020 NumPy Developers
Licensed under the 3-clause BSD License.
https://numpy.org/doc/1.19/reference/random/generated/numpy.random.Generator.hypergeometric.html