[Solved] Seaborn: Setting a distplot bin range?

So I have this data set showing the GDP of countries in billions (so a GDP of 1 trillion = 1000).

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

df = pd.read_csv('2014_World_GDP')
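# DataFrame.sort was removed from pandas; sort_values is the replacement
df = df.sort_values('GDP (BILLIONS)', ascending=False)
# avoid naming the column `sorted`, which would shadow the Python builtin
gdp = df['GDP (BILLIONS)']

fig, ax = plt.subplots(figsize=(12, 8))
sns.distplot(gdp, bins=8, kde=False, ax=ax)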

The above code gives me the following figure:
image

What I want to do, however, is set the bin ranges so they look more like [250,500,750,1000,2000,5000,10000,20000].

Is there a way to do that in seaborn?

Solution #1:

You could use logarithmic bins, which would work well with data that is distributed as yours is. Here is an example:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame()
df['GDP (BILLIONS)'] = 2000*1./(np.random.random(250))
df.sort_values(by='GDP (BILLIONS)',ascending=False, inplace=True)

fig, ax = plt.subplots(1,2,figsize=(8, 3))

sns.distplot(df['GDP (BILLIONS)'].values,bins=8,kde=False,ax=ax[0])
ax[0].set_title('Linear Bins')

LogMin, LogMax = np.log10(df['GDP (BILLIONS)'].min()),np.log10(df['GDP (BILLIONS)'].max())
newBins = np.logspace(LogMin, LogMax,8)
sns.distplot(df['GDP (BILLIONS)'].values,bins=newBins,kde=False,ax=ax[1])
ax[1].set_xscale('log')
ax[1].set_title('Log Bins')

fig.show()
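A small NumPy-only sketch of how the log-spaced edges behave, in case it helps. The seed and the variable names here are my own choices, not part of the answer above. One thing it makes visible: a sequence passed as bins is read as bin edges, so the 8 values from np.logspace above give 7 bins, and you would ask for 9 edges to match the 8 bins of the linear panel.

```python
import numpy as np

# Synthetic heavy-tailed sample, same recipe as in the answer (seed is arbitrary).
rng = np.random.default_rng(0)
gdp = 2000.0 / rng.random(250)

# Nine edges give eight bins; the edges are evenly spaced in log10.
log_min, log_max = np.log10(gdp.min()), np.log10(gdp.max())
edges = np.logspace(log_min, log_max, 9)

# np.geomspace builds the same edges directly from the raw min and max.
edges_alt = np.geomspace(gdp.min(), gdp.max(), 9)

# Count how many values fall into each log-spaced bin.
counts, _ = np.histogram(gdp, bins=edges_alt)
print(len(edges_alt) - 1, "bins; counts per bin:", counts)
```

Either array can be passed as bins= to the plotting call; np.geomspace just saves the manual log10 step.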


Respondent: Robbie

Solution #2:

You could just pass your bin edges as a sequence; in your case that would be:

sns.distplot(df['GDP (BILLIONS)'].values,
             bins=[250,500,750,1000,2000,5000,10000,20000],
             kde=False,ax=ax[0])

However, doing this alone won’t change the x-axis scale; you would need the set_xscale('log') line from Robbie’s answer to do that.
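To see what an explicit edge list does to the counts, here is a rough sketch using np.histogram, which is what matplotlib's hist (and therefore distplot) relies on for binning. The GDP values are made up purely for illustration. Eight edges give seven bins, and anything below 250 or above 20000 is silently left out.

```python
import numpy as np

# Bin edges requested in the question (GDP in billions).
edges = [250, 500, 750, 1000, 2000, 5000, 10000, 20000]

# Hypothetical GDP values, invented for this example only.
gdp = np.array([120, 300, 480, 760, 1500, 3900, 17000, 25000])

# Each bin is half-open [left, right), except the last, which includes 20000.
counts, _ = np.histogram(gdp, bins=edges)

print(len(counts))   # 7 bins from 8 edges
print(counts)        # 120 and 25000 fall outside the edges and are not counted
```

If the smallest or largest countries matter, extend the first and last edges to cover them. Note also that distplot is deprecated in newer seaborn releases; its replacement, histplot, accepts the same kind of edge sequence through bins.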

Respondent: Kc3

The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
