Each Answer to this Q is separated by one/two green lines.

Can anyone help me out in fitting a gamma distribution in python? Well, I’ve got some data : X and Y coordinates, and I want to find the gamma parameters that fit this distribution… In the Scipy doc, it turns out that a fit method actually exists but I don’t know how to use it :s.. First, in which format the argument “data” must be, and how can I provide the second argument (the parameters) since that’s what I’m looking for?

Generate some gamma data:

```
import scipy.stats as stats
alpha = 5
loc = 100.5
beta = 22
data = stats.gamma.rvs(alpha, loc=loc, scale=beta, size=10000)
print(data)
# [ 202.36035683 297.23906376 249.53831795 ..., 271.85204096 180.75026301
# 364.60240242]
```

Here we fit the data to the gamma distribution:

```
fit_alpha, fit_loc, fit_beta=stats.gamma.fit(data)
print(fit_alpha, fit_loc, fit_beta)
# (5.0833692504230008, 100.08697963283467, 21.739518937816108)
print(alpha, loc, beta)
# (5, 100.5, 22)
```

I was unsatisfied with the ss.gamma.rvs-function as it can generate negative numbers, something the gamma-distribution is supposed not to have. So I fitted the sample through expected value = mean(data) and variance = var(data) (see wikipedia for details) and wrote a function that can yield random samples of a gamma distribution without scipy (which I found hard to install properly, on a sidenote):

```
import random
import numpy
data = [6176, 11046, 670, 6146, 7945, 6864, 767, 7623, 7212, 9040, 3213, 6302, 10044, 10195, 9386, 7230, 4602, 6282, 8619, 7903, 6318, 13294, 6990, 5515, 9157]
# Fit gamma distribution through mean and average
mean_of_distribution = numpy.mean(data)
variance_of_distribution = numpy.var(data)
def gamma_random_sample(mean, variance, size):
"""Yields a list of random numbers following a gamma distribution defined by mean and variance"""
g_alpha = mean*mean/variance
g_beta = mean/variance
for i in range(size):
yield random.gammavariate(g_alpha,1/g_beta)
# force integer values to get integer sample
grs = [int(i) for i in gamma_random_sample(mean_of_distribution,variance_of_distribution,len(data))]
print("Original data: ", sorted(data))
print("Random sample: ", sorted(grs))
# Original data: [670, 767, 3213, 4602, 5515, 6146, 6176, 6282, 6302, 6318, 6864, 6990, 7212, 7230, 7623, 7903, 7945, 8619, 9040, 9157, 9386, 10044, 10195, 11046, 13294]
# Random sample: [1646, 2237, 3178, 3227, 3649, 4049, 4171, 5071, 5118, 5139, 5456, 6139, 6468, 6726, 6944, 7050, 7135, 7588, 7597, 7971, 10269, 10563, 12283, 12339, 13066]
```

If you want a long example including a discussion about estimating or fixing the support of the distribution, then you can find it in https://github.com/scipy/scipy/issues/1359 and the linked mailing list message.

Preliminary support to fix parameters, such as location, during fit has been added to the trunk version of scipy.

OpenTURNS has a simple way to do this with the `GammaFactory`

class.

First, let’s generate a sample:

```
import openturns as ot
gammaDistribution = ot.Gamma()
sample = gammaDistribution.getSample(100)
```

Then fit a Gamma to it:

```
distribution = ot.GammaFactory().build(sample)
```

Then we can draw the PDF of the Gamma:

```
import openturns.viewer as otv
otv.View(distribution.drawPDF())
```

which produces:

More details on this topic at: http://openturns.github.io/openturns/latest/user_manual/_generated/openturns.GammaFactory.html

1): the “data” variable could be in the format of a python list or tuple, or a numpy.ndarray, which could be obtained by using:

```
data=numpy.array(data)
```

where the 2nd data in the above line should be a list or a tuple, containing your data.

2: the “parameter” variable is a first guess you could optionally provide to the fitting function as a starting point for the fitting process, so it could be omitted.

3: a note on @mondano’s answer. The usage of moments (mean and variances) to work out the gamma parameters are reasonably good for large shape parameters (alpha>10), but could yield poor results for small values of alpha (See *Statistical methods in the atmospheric scineces* by Wilks, and THOM, H. C. S., 1958: A note on the gamma distribution. Mon. Wea. Rev., 86, 117–122.

Using Maximum Likelihood Estimators, as that implemented in the scipy module, is regarded a better choice in such cases.