# [Solved] Replace values in NumPy array based on dictionary and avoid overlap between new values and keys

I want to replace values in a 2D NumPy array based on the following dictionary in Python:

```
code    region
334     0
4       22
8       31
12      16
16      17
24      27
28      18
32      21
36      1
```

I want to find the cells in the 2D `numpy` array that match `code` and replace them with the corresponding value in the `region` column. The issue is that replacing one pair at a time turns `code = 12` into `region = 16`, and then on the next line every cell with a value of 16 (including the ones that were just assigned 16) is replaced with 17. How do I prevent that?
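To make the problem concrete, here is a minimal sketch of the naive in-place loop. The `grid` array is hypothetical sample data (not from the question); the mapping is the table above.

```python
import numpy as np

# Mapping taken from the table above (code -> region).
mapping = {334: 0, 4: 22, 8: 31, 12: 16, 16: 17, 24: 27, 28: 18, 32: 21, 36: 1}

# Hypothetical 2D sample data; every cell is one of the codes.
grid = np.array([[12, 16],
                 [28, 4]])

# Naive approach: each pass compares against the already-modified array,
# so a freshly written 16 is picked up again by the 16 -> 17 pass.
naive = grid.copy()
for code, region in mapping.items():
    naive[naive == code] = region

print(naive)
# [[17 17]
#  [18 22]]
# The top-left cell should be 16 (12 -> 16), but it was remapped a second time.
```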

## Solution #1:

Here’s a vectorized approach based on `np.searchsorted`: it locates each array element among the dictionary keys and then indexes into the values to do the replacement in a single step, so no replaced value is ever looked up again:

```python
import numpy as np

def replace_with_dict(ar, dic):
    # Extract keys and values as arrays
    k = np.array(list(dic.keys()))
    v = np.array(list(dic.values()))

    # Indices that would sort the keys
    sidx = k.argsort()

    # searchsorted gives each element's position among the sorted keys
    # (sorter is needed because k itself is not necessarily sorted).
    # Indexing into sidx maps that back to the position in the original
    # key order, and indexing into v yields the replacement values.
    # Assumes every element of ar is a key of dic.
    return v[sidx[np.searchsorted(k, ar, sorter=sidx)]]
```
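The chained indexing in that return statement can be hard to follow, so here is a sketch that splits it into named steps on a tiny input. The arrays and the intermediate names `pos` and `orig` are illustrative only and assume every element of `ar` is one of the keys.

```python
import numpy as np

k = np.array([334, 4, 12, 16])   # keys, deliberately unsorted
v = np.array([0, 22, 16, 17])    # values aligned with k
ar = np.array([16, 12, 334, 4])  # hypothetical input; all elements are keys

sidx = k.argsort()                         # [1, 2, 3, 0]: order that sorts k
pos = np.searchsorted(k, ar, sorter=sidx)  # [2, 1, 3, 0]: rank among sorted keys
orig = sidx[pos]                           # [3, 2, 0, 1]: index into original k
out = v[orig]                              # [17, 16, 0, 22]: mapped values

print(out)
```

Because every lookup reads from the untouched input `ar`, the 12 that maps to 16 is never mapped again to 17.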

Sample run:

```
In : dic = {334:0, 4:22, 8:31, 12:16, 16:17, 24:27, 28:18, 32:21, 36:1}
...:
...: np.random.seed(0)
...: a = np.random.choice(list(dic.keys()), 20)
...:

In : a
Out:
array([ 28,  16,  32,  32, 334,  32,  28,   4,   8, 334,  12,  36,  36,
        24,  12, 334, 334,  36,  24,  28])

In : replace_with_dict(a, dic)
Out:
array([18, 17, 21, 21,  0, 21, 18, 22, 31,  0, 16,  1,  1, 27, 16,  0,  0,
        1, 27, 18])
```

The sampled values of `a` are those of the original session; since they depend on the order of `dic.keys()`, the draw can differ between Python versions.

### Improvement

For big arrays, a faster variant sorts the keys and values arrays once up front and then calls `searchsorted` without `sorter`:

```python
import numpy as np

def replace_with_dict2(ar, dic):
    # Extract keys and values as arrays
    k = np.array(list(dic.keys()))
    v = np.array(list(dic.values()))

    # Indices that would sort the keys
    sidx = k.argsort()

    # Sort keys and values together so searchsorted needs no sorter
    ks = k[sidx]
    vs = v[sidx]
    return vs[np.searchsorted(ks, ar)]
```

Runtime test:

```
In : dic = {334:0, 4:22, 8:31, 12:16, 16:17, 24:27, 28:18, 32:21, 36:1}
...:
...: np.random.seed(0)
...: a = np.random.choice(list(dic.keys()), 20000)

In : out1 = replace_with_dict(a, dic)
...: out2 = replace_with_dict2(a, dic)
...: print(np.allclose(out1, out2))
True

In : %timeit replace_with_dict(a, dic)
1000 loops, best of 3: 453 µs per loop

In : %timeit replace_with_dict2(a, dic)
1000 loops, best of 3: 341 µs per loop
```

### Generic case: not all array elements are in the dictionary

If the elements of the input array are not all guaranteed to be in the dictionary, a bit more work is needed. Elements that match no key are left unchanged:

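```python
import numpy as np

def replace_with_dict2_generic(ar, dic, assume_all_present=True):
    # Extract keys and values as arrays
    k = np.array(list(dic.keys()))
    v = np.array(list(dic.values()))

    # Indices that would sort the keys
    sidx = k.argsort()

    ks = k[sidx]
    vs = v[sidx]
    idx = np.searchsorted(ks, ar)

    if not assume_all_present:
        # Elements larger than every key get index len(vs); clamp to a valid index
        idx[idx == len(vs)] = 0
        # Replace only exact key matches; leave all other elements unchanged
        mask = ks[idx] == ar
        return np.where(mask, vs[idx], ar)
    else:
        return vs[idx]
```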

Sample run:

```
In : dic = {334:0, 4:22, 8:31, 12:16, 16:17, 24:27, 28:18, 32:21, 36:1}
...:
...: np.random.seed(0)
...: a = np.random.choice(list(dic.keys()), 20)
...: a[-1] = 400

In : a
Out:
array([ 28,  16,  32,  32, 334,  32,  28,   4,   8, 334,  12,  36,  36,
        24,  12, 334, 334,  36,  24, 400])

In : replace_with_dict2_generic(a, dic, assume_all_present=False)
Out:
array([ 18,  17,  21,  21,   0,  21,  18,  22,  31,   0,  16,   1,   1,
        27,  16,   0,   0,   1,  27, 400])
```
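Since the question is about a 2D array, it may help to see that the same lookup works unchanged on one: `np.searchsorted` returns insertion points with the same shape as the array it is given. The helper name `remap_2d` and the `grid` values below are hypothetical, and the sketch assumes unmatched cells should pass through untouched, as in the sample run above.

```python
import numpy as np

def remap_2d(ar, dic):
    """Hypothetical helper: searchsorted lookup that keeps unmatched cells."""
    k = np.array(list(dic.keys()))
    v = np.array(list(dic.values()))
    sidx = k.argsort()
    ks, vs = k[sidx], v[sidx]

    idx = np.searchsorted(ks, ar)   # same shape as ar
    idx[idx == len(ks)] = 0         # clamp out-of-range insertion points
    # Only cells that exactly equal a key are replaced.
    return np.where(ks[idx] == ar, vs[idx], ar)

dic = {334: 0, 4: 22, 8: 31, 12: 16, 16: 17, 24: 27, 28: 18, 32: 21, 36: 1}

# 400 and 5 are not keys, so they should come back unchanged.
grid = np.array([[12, 16, 400],
                 [5, 334, 36]])

print(remap_2d(grid, dic))
# [[ 16  17 400]
#  [  5   0   1]]
```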

## Solution #2:

The way I’d do this is in two passes: first, get the indexes corresponding to the values you want to replace, and then replace the values.

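```python
import numpy as np

arr = np.array([1, 2, 3, 1, 2, 3])
code = np.array([1, 2])
region = np.array([2, 3])

# Pass 1: record where each code occurs before changing anything
index_list = []
for val in code:
    index_list.append(np.where(arr == val))

# Pass 2: write the replacements at the recorded positions
for indexes, replace_val in zip(index_list, region):
    arr[indexes] = replace_val
```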
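A possible variation on the same two-pass idea, sketched here as an assumption rather than as part of the answer: write into a copy and compute every mask against the untouched original, so no index list needs to be stored. The names `mapping` and `out` are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 1, 2, 3])
mapping = {1: 2, 2: 3}  # same pairs as code/region above

# Masks are always computed on the original arr, never on out,
# so a value written in one iteration cannot be replaced by a later one.
out = arr.copy()
for old, new in mapping.items():
    out[arr == old] = new

print(out)
# [2 3 3 2 3 3]
```

This loops once per dictionary entry, so it is likely to suit small mappings better than large ones.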

The answers are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.