The random module (http://docs.python.org/2/library/random.html) provides several functions that sample from fixed distributions. For example, random.gauss will sample a random point from a normal distribution with a given mean and sigma.

I’m looking for a way to extract a number N of random samples within a given interval using my own distribution, as fast as possible, in Python. This is what I mean:

from numpy import exp

def my_dist(x):
    # Some distribution; assume c1, c2, c3 and c4 are known.
    f = c1*exp(-((x-c2)**c3)/c4)
    return f

# Draw N random samples from my distribution between given limits a,b.
N = 1000
N_rand_samples = ran_func_sample(my_dist, a, b, N)

where ran_func_sample is what I’m after and a, b are the limits within which to draw the samples. Is there anything of that sort in Python?

You need to use the inverse transform sampling method to get random values distributed according to the law you want. With this method you simply apply the inverse of your distribution's CDF to random numbers drawn from the standard uniform distribution on the interval [0, 1].

After you find the inverted function, you can get 1000 numbers distributed according to the needed distribution in this obvious way:

[inverted_function(random.random()) for x in range(1000)]
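For example, here is a minimal sketch for an exponential distribution with rate lam (my own choice of example; the question's my_dist would need its own inverse CDF, possibly computed numerically). The CDF is F(x) = 1 - exp(-lam*x), so the inverse is F^-1(u) = -ln(1 - u)/lam:

import math
import random

lam = 2.0  # assumed rate parameter, for illustration

def inverted_function(u):
    # inverse CDF of the exponential distribution: F^-1(u) = -ln(1 - u)/lam
    return -math.log(1.0 - u) / lam

samples = [inverted_function(random.random()) for _ in range(1000)]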


This code implements sampling from n-dimensional discrete probability distributions. By setting a flag on the object, it can also be used as a piecewise-constant probability distribution, which can then be used to approximate arbitrary pdfs. Well, arbitrary pdfs with compact support; if you want to efficiently sample extremely long tails, a non-uniform description of the pdf would be required. But this is still efficient even for things like Airy point spread functions (which I created it for, initially). The internal sorting of values is absolutely critical there for accuracy: the many small values in the tails should contribute substantially, but without sorting they get drowned out by floating-point precision.

import numpy as np


class Distribution(object):
    """
    draws samples from an n-dimensional probability distribution,
    by means of a discrete inversion of the cumulative distribution function

    the pdf can be sorted first to prevent numerical error in the cumulative sum;
    this is set as the default. for big density functions with high contrast,
    it is absolutely necessary, and for small density functions,
    the overhead is minimal

    a call to this distribution object returns indices into the density array
    """
    def __init__(self, pdf, sort = True, interpolation = True, transform = lambda x: x):
        self.shape          = pdf.shape
        self.pdf            = pdf.ravel()
        self.sort           = sort
        self.interpolation  = interpolation
        self.transform      = transform

        #a pdf cannot be negative
        assert(np.all(pdf>=0))

        #sort the pdf by magnitude
        if self.sort:
            self.sortindex = np.argsort(self.pdf, axis=None)
            self.pdf = self.pdf[self.sortindex]
        #construct the cumulative distribution function
        self.cdf = np.cumsum(self.pdf)
    @property
    def ndim(self):
        return len(self.shape)
    @property
    def sum(self):
        """cached sum of all pdf values; the pdf need not sum to one, and is imlpicitly normalized"""
        return self.cdf[-1]
    def __call__(self, N):
        """draw """
        #pick numbers which are uniformly random over the cumulative distribution function
        choice = np.random.uniform(high = self.sum, size = N)
        #find the indices corresponding to this point on the CDF
        index = np.searchsorted(self.cdf, choice)
        #if necessary, map the indices back to their original ordering
        if self.sort:
            index = self.sortindex[index]
        #map back to multi-dimensional indexing
        index = np.unravel_index(index, self.shape)
        index = np.vstack(index)
        #is this a discrete or piecewise continuous distribution?
        if self.interpolation:
            index = index + np.random.uniform(size=index.shape)
        return self.transform(index)


if __name__=='__main__':
    shape = 3,3
    pdf = np.ones(shape)
    pdf[1]=0
    dist = Distribution(pdf, transform=lambda i:i-1.5)
    print(dist(10))
    import matplotlib.pyplot as pp
    pp.scatter(*dist(1000))
    pp.show()

And, as a more realistic example:

x = np.linspace(-100, 100, 512)
p = np.exp(-x**2)
pdf = p[:,None]*p[None,:]     #2d gaussian
dist = Distribution(pdf, transform=lambda i:i-256)
print(dist(1000000).mean(axis=1))    # should be in the 1/sqrt(1e6) range
import matplotlib.pyplot as pp
pp.scatter(*dist(1000))
pp.show()
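To connect this back to the original question, here is a sketch of sampling my_dist on [a, b] using the piecewise-constant mode (the values of a, b and the constants c1-c4 are assumed, purely for illustration):

a, b, n_bins = 0.0, 10.0, 1024
c1, c2, c3, c4 = 1.0, 5.0, 2.0, 2.0   # assumed constants
x = np.linspace(a, b, n_bins)
pdf = c1*np.exp(-((x - c2)**c3)/c4)
# transform maps the (jittered) bin indices back onto the interval [a, b]
dist = Distribution(pdf, transform=lambda i: a + i*(b - a)/n_bins)
N_rand_samples = dist(1000).ravel()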

I was in a similar situation, but I wanted to sample from a multivariate distribution, so I implemented a rudimentary version of Metropolis-Hastings (an MCMC method).

import numpy as np

def metropolis_hastings(target_density, size=500000):
    burnin_size = 10000
    size += burnin_size
    x0 = np.array([[0, 0]])
    xt = x0
    samples = []
    for i in range(size):
        xt_candidate = np.array([np.random.multivariate_normal(xt[0], np.eye(2))])
        accept_prob = (target_density(xt_candidate))/(target_density(xt))
        if np.random.uniform(0, 1) < accept_prob:
            xt = xt_candidate
        samples.append(xt)
    samples = np.array(samples[burnin_size:])
    samples = np.reshape(samples, [samples.shape[0], 2])
    return samples

This requires a function target_density that takes in a data point and computes its (possibly unnormalized) probability density.
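For instance, a minimal usage sketch with an unnormalized 2-D Gaussian as the target (my own choice of density, for illustration):

def target_density(x):
    x1, x2 = x[0, 0], x[0, 1]
    # unnormalized correlated 2-D Gaussian
    return np.exp(-(x1**2 + x2**2 - x1*x2))

samples = metropolis_hastings(target_density, size=100000)
print(samples.mean(axis=0))  # should be close to [0, 0]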

import numpy as np
import scipy.interpolate as interpolate

def inverse_transform_sampling(data, n_bins, n_samples):
    # estimate the pdf from a normalized histogram of the data
    hist, bin_edges = np.histogram(data, bins=n_bins, density=True)
    # build the empirical cdf on the bin edges (starts at 0, ends at 1)
    cum_values = np.zeros(bin_edges.shape)
    cum_values[1:] = np.cumsum(hist*np.diff(bin_edges))
    # interpolate the inverse cdf and evaluate it at uniform random numbers
    inv_cdf = interpolate.interp1d(cum_values, bin_edges)
    r = np.random.rand(n_samples)
    return inv_cdf(r)

So if we pass in a data sample with a specific distribution, the inverse_transform_sampling function will return a dataset with approximately the same distribution (the approximation improves with more bins and more data). The advantage here is that we can choose the sample size ourselves by specifying it in the n_samples argument.
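For example, a quick usage sketch (the gamma-distributed input data is my own choice, for illustration):

data = np.random.gamma(shape=2.0, scale=1.0, size=10000)
new_samples = inverse_transform_sampling(data, n_bins=50, n_samples=5000)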

Here is a rather nice way of performing inverse transform sampling with a decorator.

import numpy as np
from scipy.interpolate import interp1d

def inverse_sample_decorator(dist):
    
    def wrapper(pnts, x_min=-100, x_max=100, n=1e5, **kwargs):
        
        x = np.linspace(x_min, x_max, int(n))
        cumulative = np.cumsum(dist(x, **kwargs))
        cumulative -= cumulative.min()
        f = interp1d(cumulative/cumulative.max(), x)
        return f(np.random.random(pnts))
    
    return wrapper

Using this decorator on a Gaussian distribution, for example:

@inverse_sample_decorator
def gauss(x, amp=1.0, mean=0.0, std=0.2):
    return amp*np.exp(-(x-mean)**2/std**2/2.0)

You can then generate sample points from the distribution by calling the function. The keyword arguments x_min and x_max are the limits of the original distribution and are passed to gauss along with the other keyword arguments that parameterise the distribution.

samples = gauss(5000, mean=20, std=0.8, x_min=19, x_max=21)

Alternatively, this can be done as a function that takes the distribution as an argument (as in your original question):

def inverse_sample_function(dist, pnts, x_min=-100, x_max=100, n=1e5,
                            **kwargs):

    x = np.linspace(x_min, x_max, int(n))
    cumulative = np.cumsum(dist(x, **kwargs))
    cumulative -= cumulative.min()
    f = interp1d(cumulative/cumulative.max(), x)

    return f(np.random.random(pnts))
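A usage sketch mirroring the question's interface (the constants c1-c4 and the limits are assumed, purely for illustration):

def my_dist(x, c1=1.0, c2=0.0, c3=2.0, c4=0.5):
    # assumed constants, for illustration
    return c1*np.exp(-((x - c2)**c3)/c4)

N_rand_samples = inverse_sample_function(my_dist, 1000, x_min=-2, x_max=2)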