Each Answer to this Q is separated by one/two green lines.

So in R when I have a data frame consisting of say 4 columns, call it `df`

and I want to compute the ratio by sum product of a group, I can it in such a way:

```
// generate data
df = data.frame(a=c(1,1,0,1,0),b=c(1,0,0,1,0),c=c(10,5,1,5,10),d=c(3,1,2,1,2));
| a b c d |
| 1 1 10 3 |
| 1 0 5 1 |
| 0 0 1 2 |
| 1 1 5 1 |
| 0 0 10 2 |
// compute sum product ratio
df = df%>% group_by(a,b) %>%
mutate(
ratio=c/sum(c*d)
);
| a b c d ratio |
| 1 1 10 3 0.286 |
| 1 1 5 1 0.143 |
| 1 0 5 1 1 |
| 0 0 1 2 0.045 |
| 0 0 10 2 0.454 |
```

But in python I need to resort to loops.

I know there should be a more elegant way than raw loops in python, anyone got any ideas?

According to this thread on pandas github we can use the `transform()`

method to replicate the combination of `dplyr::groupby()`

and `dplyr::mutate()`

. For this example, it would look as follows:

```
df = pd.DataFrame(
dict(
a=(1 , 1, 0, 1, 0 ),
b=(1 , 0, 0, 1, 0 ),
c=(10, 5, 1, 5, 10),
d=(3 , 1, 2, 1, 2 ),
)
).assign(
prod_c_d = lambda x: x['c'] * x['d'],
ratio = lambda x: x['c'] / (x.groupby(['a','b']).transform('sum')['prod_c_d'])
)
```

This example uses pandas method chaining. For more information on how to use method chaining to replicate `dplyr`

workflows see this blogpost.

The method using `apply()`

and `groupby()`

does not work for me because it does not seem to be adaptable. For example, it does not work if we delete `g.c/`

from the lambda expression.

```
df['ratio'] = df.groupby(['a','b'], group_keys=False)\
.apply(lambda g: (g.c * g.d).sum() )
```

It’s pretty easy to translate your R code into python with `datar`

:

```
>>> from datar.all import f, c, tibble, sum, group_by, mutate
[2021-06-24 13:32:29][datar][WARNING] Builtin name "sum" has been overriden by datar.
>>>
>>> df = tibble(a=c(1,1,0,1,0),b=c(1,0,0,1,0),c=c(10,5,1,5,10),d=c(3,1,2,1,2))
>>> df
a b c d
<int64> <int64> <int64> <int64>
0 1 1 10 3
1 1 0 5 1
2 0 0 1 2
3 1 1 5 1
4 0 0 10 2
>>> df >> group_by(f.a, f.b) >> mutate(ratio=f.c/sum(f.c*f.d))
a b c d ratio
<int64> <int64> <int64> <int64> <float64>
0 1 1 10 3 0.285714
1 1 0 5 1 1.000000
2 0 0 1 2 0.045455
3 1 1 5 1 0.142857
4 0 0 10 2 0.454545
[Groups: a, b (n=3)]
```

Disclaimer: I am the author of the `datar`

package.