# [Solved] Replace values in NumPy array based on dictionary and avoid overlap between new values and keys

I want to replace values in a 2D numpy array based on following dictionary in python:

```
code region
334 0
4 22
8 31
12 16
16 17
24 27
28 18
32 21
36 1
```

I want to find cells in `numpy`

2D array which match `code`

and replace by corresponding value in `region`

column. The issue is that this will result in replacing `code = 12`

by `region = 16`

and in the next line, all cells with value of 16 (including the ones which just got assigned a value of 16) will be replaced by a value of 17. How do I prevent that?

##
Solution #1:

Here’s a vectorized one based on `np.searchsorted`

to trace back the locations for each of those keys in the array and then replacing and please excuse the almost *sexist* function name here (couldn’t help it though) –

```
def replace_with_dict(ar, dic):
# Extract out keys and values
k = np.array(list(dic.keys()))
v = np.array(list(dic.values()))
# Get argsort indices
sidx = k.argsort()
# Drop the magic bomb with searchsorted to get the corresponding
# places for a in keys (using sorter since a is not necessarily sorted).
# Then trace it back to original order with indexing into sidx
# Finally index into values for desired output.
return v[sidx[np.searchsorted(k,ar,sorter=sidx)]]
```

Sample run –

```
In [82]: dic ={334:0, 4:22, 8:31, 12:16, 16:17, 24:27, 28:18, 32:21, 36:1}
...:
...: np.random.seed(0)
...: a = np.random.choice(dic.keys(), 20)
...:
In [83]: a
Out[83]:
array([ 28, 16, 32, 32, 334, 32, 28, 4, 8, 334, 12, 36, 36,
24, 12, 334, 334, 36, 24, 28])
In [84]: replace_with_dict(a, dic)
Out[84]:
array([18, 17, 21, 21, 0, 21, 18, 22, 31, 0, 16, 1, 1, 27, 16, 0, 0,
1, 27, 18])
```

**Improvement**

A faster one for big arrays would be sort the values and keys arrays and then use `searchsorted`

without `sorter`

, like so –

```
def replace_with_dict2(ar, dic):
# Extract out keys and values
k = np.array(list(dic.keys()))
v = np.array(list(dic.values()))
# Get argsort indices
sidx = k.argsort()
ks = k[sidx]
vs = v[sidx]
return vs[np.searchsorted(ks,ar)]
```

Runtime test –

```
In [91]: dic ={334:0, 4:22, 8:31, 12:16, 16:17, 24:27, 28:18, 32:21, 36:1}
...:
...: np.random.seed(0)
...: a = np.random.choice(dic.keys(), 20000)
In [92]: out1 = replace_with_dict(a, dic)
...: out2 = replace_with_dict2(a, dic)
...: print np.allclose(out1, out2)
True
In [93]: %timeit replace_with_dict(a, dic)
1000 loops, best of 3: 453 µs per loop
In [95]: %timeit replace_with_dict2(a, dic)
1000 loops, best of 3: 341 µs per loop
```

**Generic case when all array elements are not in dictionary**

If all elements in the input array are not guaranteed to be in the dictionary, we need a bit more work as listed below –

```
def replace_with_dict2_generic(ar, dic, assume_all_present=True):
# Extract out keys and values
k = np.array(list(dic.keys()))
v = np.array(list(dic.values()))
# Get argsort indices
sidx = k.argsort()
ks = k[sidx]
vs = v[sidx]
idx = np.searchsorted(ks,ar)
if assume_all_present==0:
idx[idx==len(vs)] = 0
mask = ks[idx] == ar
return np.where(mask, vs[idx], ar)
else:
return vs[idx]
```

Sample run –

```
In [163]: dic ={334:0, 4:22, 8:31, 12:16, 16:17, 24:27, 28:18, 32:21, 36:1}
...:
...: np.random.seed(0)
...: a = np.random.choice(dic.keys(), (20))
...: a[-1] = 400
In [165]: a
Out[165]:
array([ 28, 16, 32, 32, 334, 32, 28, 4, 8, 334, 12, 36, 36,
24, 12, 334, 334, 36, 24, 400])
In [166]: replace_with_dict2_generic(a, dic, assume_all_present=False)
Out[166]:
array([ 18, 17, 21, 21, 0, 21, 18, 22, 31, 0, 16, 1, 1,
27, 16, 0, 0, 1, 27, 400])
```

##
Solution #2:

The way I’d do this is in two passes: first, get the indexes corresponding to the values you want to replace, and then replace the values.

```
arr = np.array([1,2,3,1,2,3])
code = np.array([1,2])
region = np.array([2,3])
index_list = []
for val in code:
index_list.append(np.where(arr == val)[0])
for indexes, replace_val in zip(index_list, region):
arr[indexes] = replace_val
```