How do I get an average after grouping by columns?

Popularity: 30   Posted: 2023-06-27 21:45:33.0

I have a DataFrame named df, and I want to get the average usage time of each app for each gender group.

import pandas as pd 
df=pd.DataFrame({'user':[2,3,4,4,5,5],'gender':[0,0,1,1,1,1],
'app':['k','k','k','k','s','s'],'time':[6,10,10,6,3,1]})

Input:

  app  gender  time  user
0   k    0     6     2
1   k    0    10     3
2   k    1    10     4
3   k    1     6     4
4   s    1     3     5
5   s    1     1     5

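For app k, the gender 0 group's total time is 16 (10 + 6) over its 2 rows, so the average usage time 0_k is 8.0.

The gender 1 group's total time on app k is also 16 (10 + 6 + 0 + 0, where the two 0s stand for that group's app s rows), so the average over its 4 rows, 1_k, is 4.0.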

Expected:

dict = {'0_k': 8.0, '0_s': 0, '1_k': 4.0, '1_s': 1.0}

IIUC, I think you need:

df['new_col'] = df.gender.astype(str)+'_'+df.app
df['Average'] = df.groupby(['gender','app'])['time'].transform('sum')/\
                df.groupby(['gender'])['time'].transform('count')

print(df)
   user  gender app  time new_col  Average
0     2       0   k     6     0_k      8.0
1     3       0   k    10     0_k      8.0
2     4       1   k    10     1_k      4.0
3     4       1   k     6     1_k      4.0
4     5       1   s     3     1_s      1.0
5     5       1   s     1     1_s      1.0

d = dict(df[['new_col','Average']].values)

print(d)
{'0_k': 8.0, '1_k': 4.0, '1_s': 1.0}
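
The dictionary above has no '0_s' entry, because gender 0 has no rows for app s, while the Expected output in the question lists it as 0. One possible way to include such missing pairs, as a rough sketch: unstack the per-pair totals with a fill value before dividing. The names totals, rows_per_gender, averages and result are only illustrative, and treating an absent gender/app pair as 0 is an assumption taken from the Expected output.

```python
import pandas as pd

df = pd.DataFrame({
    'user': [2, 3, 4, 4, 5, 5],
    'gender': [0, 0, 1, 1, 1, 1],
    'app': ['k', 'k', 'k', 'k', 's', 's'],
    'time': [6, 10, 10, 6, 3, 1],
})

# Total time per (gender, app). unstack() builds a gender x app table,
# and fill_value=0 covers pairs that never occur (assumption: they count as 0).
totals = df.groupby(['gender', 'app'])['time'].sum().unstack(fill_value=0)

# Number of rows per gender: the same denominator the answer above uses.
rows_per_gender = df.groupby('gender')['time'].count()

# Divide each gender's row of totals by that gender's row count.
averages = totals.div(rows_per_gender, axis=0)

# Flatten the table into {'<gender>_<app>': average}.
result = {
    f'{gender}_{app}': float(averages.loc[gender, app])
    for gender in averages.index
    for app in averages.columns
}
print(result)
```

Unlike the version above, this does not add helper columns to df, so df can be reused unchanged afterwards.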

Alternatively, as a single expression:

(df.groupby(["app", "gender"]).sum()/df.groupby(["gender"]).count()).time

app  gender
k    0         8.0
     1         4.0
s    1         1.0

To convert it to a dictionary:

result = (df.groupby(["app", "gender"]).sum()/df.groupby(["gender"]).count()).time.to_dict()

{('k', 0): 8.0, ('k', 1): 4.0, ('s', 1): 1.0}
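
The keys here are (app, gender) tuples, whereas the question asks for strings such as '0_k'. A minimal sketch of one way to relabel them, assuming the same sum-over-count ratio as above; ratio, flat and by_label are illustrative names, and only the time column is selected so that the other columns do not take part in the division.

```python
import pandas as pd

df = pd.DataFrame({
    'user': [2, 3, 4, 4, 5, 5],
    'gender': [0, 0, 1, 1, 1, 1],
    'app': ['k', 'k', 'k', 'k', 's', 's'],
    'time': [6, 10, 10, 6, 3, 1],
})

# Total time per (app, gender) divided by the row count of each gender.
# level='gender' tells pandas which MultiIndex level to match the counts on.
ratio = df.groupby(['app', 'gender'])['time'].sum().div(
    df.groupby('gender')['time'].count(), level='gender'
)

# Move the index levels into ordinary columns so they can be read by name.
flat = ratio.rename('average').reset_index()

# Build '<gender>_<app>' keys in the format the question uses.
by_label = {
    f'{gender}_{app}': float(average)
    for gender, app, average in zip(flat['gender'], flat['app'], flat['average'])
}
print(by_label)
```

Pairs that never occur in the data (gender 0 with app s) are still absent from this dictionary.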