# Standard deviation of a list

I want to find the mean and standard deviation of the 1st, 2nd, … digits (i.e. elements) of several lists (A through Z). For example, I have

``````
A_rank=[0.8,0.4,1.2,3.7,2.6,5.8]
B_rank=[0.1,2.8,3.7,2.6,5,3.4]
C_rank=[1.2,3.4,0.5,0.1,2.5,6.1]
# etc (up to Z_rank )...
``````

Now I want to take the mean and std of `*_rank[0]`, the mean and std of `*_rank[1]`, etc.
(i.e., the mean and std of the 1st digit from all the (A..Z)_rank lists;
the mean and std of the 2nd digit from all the (A..Z)_rank lists;
the mean and std of the 3rd digit…; etc.).

Since Python 3.4 / PEP 450 there is a `statistics` module in the standard library, which has a function `stdev` for calculating the sample standard deviation of iterables like yours:

``````
>>> A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
>>> import statistics
>>> statistics.stdev(A_rank)
2.0634114147853952
``````
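For the per-position statistics asked about in the question, one possible approach is to transpose the lists with `zip(*...)` and apply `statistics.mean` and `statistics.stdev` to each column. This is only a minimal sketch using the three lists shown in the question; the names `ranks`, `columns`, `column_means` and `column_stdevs` are illustrative and not from the original answer. Note that `stdev` gives the sample standard deviation; `pstdev` would be the choice if the lists are the entire population.

```python
import statistics

A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
B_rank = [0.1, 2.8, 3.7, 2.6, 5, 3.4]
C_rank = [1.2, 3.4, 0.5, 0.1, 2.5, 6.1]

# Hypothetical container: collect however many lists you have here.
ranks = [A_rank, B_rank, C_rank]

# zip(*ranks) yields one tuple per position:
# (A_rank[0], B_rank[0], C_rank[0]), (A_rank[1], B_rank[1], C_rank[1]), ...
columns = list(zip(*ranks))

column_means = [statistics.mean(col) for col in columns]
column_stdevs = [statistics.stdev(col) for col in columns]  # sample std (divides by n - 1)

for i, (m, s) in enumerate(zip(column_means, column_stdevs)):
    print(f"position {i}: mean={m:.4f}, stdev={s:.4f}")
```

Keeping the lists in one container rather than in 26 separate variables is what makes the transposition a one-liner.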

I would put `A_rank` et al. into a 2D NumPy array, and then use `numpy.mean()` and `numpy.std()` along `axis=0` to compute the means and the standard deviations:

``````
In [1]: import numpy

In [2]: arr = numpy.array([A_rank, B_rank, C_rank])

In [3]: numpy.mean(arr, axis=0)
Out[3]:
array([ 0.7       ,  2.2       ,  1.8       ,  2.13333333,  3.36666667,
        5.1       ])

In [4]: numpy.std(arr, axis=0)
Out[4]:
array([ 0.45460606,  1.29614814,  1.37355985,  1.50628314,  1.15566239,
        1.2083046 ])
``````

Here’s some pure-Python code you can use to calculate the mean and standard deviation.

All code below is based on the `statistics` module in Python 3.4+.

``````
def mean(data):
    """Return the sample arithmetic mean of data."""
    n = len(data)
    if n < 1:
        raise ValueError('mean requires at least one data point')
    return sum(data)/n # in Python 2 use sum(data)/float(n)

def _ss(data):
    """Return sum of square deviations of sequence data."""
    c = mean(data)
    ss = sum((x-c)**2 for x in data)
    return ss

def stddev(data, ddof=0):
    """Calculates the population standard deviation
    by default; specify ddof=1 to compute the sample
    standard deviation."""
    n = len(data)
    if n < 2:
        raise ValueError('variance requires at least two data points')
    ss = _ss(data)
    pvar = ss/(n-ddof)
    return pvar**0.5
``````

Note: for improved accuracy when summing floats, the `statistics` module uses a custom function `_sum` rather than the built-in `sum` which I’ve used in its place.
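To illustrate why the summation method can matter, here is a small sketch comparing the built-in `sum` with `math.fsum` from the standard library. `math.fsum` is not what the `statistics` module uses internally; it is shown only as a readily available alternative that avoids accumulated rounding error. The variable names are illustrative.

```python
import math

values = [0.1] * 10

naive_total = sum(values)        # left-to-right float addition accumulates rounding error
exact_total = math.fsum(values)  # tracks partial sums and returns a correctly rounded result

print(naive_total == 1.0)  # False
print(exact_total == 1.0)  # True
```

For short lists of well-scaled values the difference is usually negligible, which is why plain `sum` is acceptable in the code above.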

Now we have for example:

``````
>>> mean([1, 2, 3])
2.0
>>> stddev([1, 2, 3]) # population standard deviation
0.816496580927726
>>> stddev([1, 2, 3], ddof=1) # sample standard deviation
1.0
``````

In Python 2.7.1, you may calculate the standard deviation using `numpy.std()`:

• Population std: just use `numpy.std()` with no arguments besides your data list.
• Sample std: pass `ddof` (Delta Degrees of Freedom) set to 1, as in the following example:

`numpy.std(<your-list>, ddof=1)`

The divisor used in the calculation is `N - ddof`, where `N` is the number of elements. By default `ddof` is zero.
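Written out, with $x_1, \dots, x_N$ denoting the data and $\bar{x}$ their mean (symbols introduced here for illustration), the quantity being computed is:

```latex
s_{\mathrm{ddof}} = \sqrt{\frac{1}{N - \mathrm{ddof}} \sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2},
\qquad
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i
```

Setting $\mathrm{ddof} = 0$ divides by $N$ (population standard deviation); setting $\mathrm{ddof} = 1$ divides by $N - 1$ (sample standard deviation).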

With `ddof=1`, it calculates the sample std rather than the population std.

In Python 2.7 you can use NumPy's `numpy.std()`, which gives the population standard deviation.

In Python 3.4, `statistics.stdev()` returns the sample standard deviation. The `statistics.pstdev()` function gives the same result as `numpy.std()` with its default arguments.
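As a quick check of the difference, the following sketch compares the two `statistics` functions on a tiny data set. The variable names are illustrative; only `statistics.pstdev`, `statistics.stdev` and `math` functions from the standard library are used.

```python
import math
import statistics

data = [1, 2, 3]
n = len(data)

population_sd = statistics.pstdev(data)  # divides the squared deviations by n
sample_sd = statistics.stdev(data)       # divides the squared deviations by n - 1

# The two differ only in the divisor, so one can be rescaled into the other.
rescaled = population_sd * math.sqrt(n / (n - 1))

print(population_sd)
print(sample_sd)
print(math.isclose(sample_sd, rescaled))  # True
```

The gap between the two shrinks as `n` grows, so the choice matters most for short lists like the ones in the question.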

Here are a few methods in Python:

``````
import math  # needed for math.sqrt in approach 2
import statistics as st

n = int(input())  # number of data points
data = list(map(int, input().split()))
``````

# Approach 1: using a function

``````
stdev = st.pstdev(data)
``````

# Approach 2: calculate the variance and take its square root

``````
variance = st.pvariance(data)
devia = math.sqrt(variance)
``````

# Approach 3: using basic math

``````
mean = sum(data)/n
variance = sum([((x - mean) ** 2) for x in data]) / n  # population variance
stddev = variance ** 0.5

print("{0:0.1f}".format(stddev))
``````

# Note:

• `variance` calculates the variance of a sample
• `pvariance` calculates the variance of an entire population
• `stdev` and `pstdev` differ in the same way

Pure Python code:

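``````
from functools import reduce  # a builtin in Python 2; must be imported in Python 3
from math import sqrt

def stddev(lst):
    """Return the population standard deviation of lst."""
    mean = float(sum(lst)) / len(lst)
    return sqrt(float(reduce(lambda x, y: x + y, map(lambda x: (x - mean) ** 2, lst))) / len(lst))
``````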

The other answers cover how to compute the standard deviation in Python sufficiently, but no one explains how to do the bizarre traversal you've described.

I'm going to assume A-Z is the entire population. If not, see Ome's answer on how to infer from a sample.

So to get the standard deviation/mean of the first digit of every list you would need something like this:

``````
#standard deviation
numpy.std([A_rank[0], B_rank[0], C_rank[0], ..., Z_rank[0]])

#mean
numpy.mean([A_rank[0], B_rank[0], C_rank[0], ..., Z_rank[0]])
``````

To shorten the code and generalize this to any nth digit, use the following function I generated for you:

``````
def getAllNthRanks(n):
    return [A_rank[n], B_rank[n], C_rank[n], D_rank[n], E_rank[n], F_rank[n], G_rank[n], H_rank[n], I_rank[n], J_rank[n], K_rank[n], L_rank[n], M_rank[n], N_rank[n], O_rank[n], P_rank[n], Q_rank[n], R_rank[n], S_rank[n], T_rank[n], U_rank[n], V_rank[n], W_rank[n], X_rank[n], Y_rank[n], Z_rank[n]]
``````

Now you can simply get the std and mean of all the nth places from A-Z like this:

``````
#standard deviation
numpy.std(getAllNthRanks(n))

#mean
numpy.mean(getAllNthRanks(n))
``````

The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.