python – Pandas hierarchical indexing and computation
Published: 2020-12-20 11:04:34  Category: Python  Source: compiled from the web
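Given:

    import pandas as pd

    df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                       "which": ["A", "A", "A", "B", "B", "B"],
                       "score": [88, 80, 90, 92, 95, 99]})
    df.set_index(['panum', 'which'], inplace=True)
    df

                 score
    panum which
    PA1   A         88
          A         80
          A         90
    PA2   B         92
          B         95
          B         99

Is it possible to write something that creates a new index entry in 'which' holding the maximum for each panum, so that two new rows are created, (PA1, Max) and (PA2, Max)?

Update

I have corrected the index. The example above is not what I meant.

                  score
    panum factor
    PA1   init       90
          resub      94
          final      93
    PA2   init       60
          resub      90
          final      88

My question in this better scenario is: "I want to create a new 'panum' called mean, which will have three rows: (mean, init), (mean, resub), (mean, final)". The pseudocode would be something like df['mean'] = (df['pa1'] + df['pa2']) / 2

I know this is a different question!

Solution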
You can create a new DataFrame of the maxima, add a second index level with the value max, append it to the original and finally call sort_index:
    # max(level=...) and DataFrame.append require pandas < 2.0
    m = df.max(level=0).assign(max='max').set_index('max', append=True)
    print(m)
               score
    panum max
    PA1   max     90
    PA2   max     99

    df = df.append(m).sort_index()
    print(df)
                 score
    panum which
    PA1   A         88
          A         80
          A         90
          max       90
    PA2   B         92
          B         95
          B         99
          max       99

Edited answer: for the mean, aggregate by the second level instead, and use swaplevel so the result aligns correctly with the final DataFrame:

    df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                       "factor": ["init", "resub", "final"] * 2,
                       "score": [90, 94, 93, 60, 90, 88]})
    df.set_index(['panum', 'factor'], inplace=True)
    print(df)
                  score
    panum factor
    PA1   init       90
          resub      94
          final      93
    PA2   init       60
          resub      90
          final      88

    # mean(level=1) averages each factor across the panum groups
    m = (df.mean(level=1)
           .assign(factor='mean')
           .set_index('factor', append=True)
           .swaplevel(0, 1))
    print(m)
                   score
    factor factor
    mean   init     75.0
           resub    92.0
           final    90.5

    df = df.append(m)
    print(df)
                  score
    panum factor
    PA1   init     90.0
          resub    94.0
          final    93.0
    PA2   init     60.0
          resub    90.0
          final    88.0
    mean  init     75.0
          resub    92.0
          final    90.5

(Editor: 李大同)
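The `level=` argument of reductions such as `max` and `mean`, and the `DataFrame.append` method, were removed in pandas 2.0. The following is a minimal sketch of the same two results on current pandas, assuming `groupby(level=...)` and `pd.concat` as replacements. The names `df1`, `m1`, `out1`, `df2`, `m2` and `out2` are illustrative only. As an assumption of this sketch rather than part of the original answer, the new level is named after the level it joins (`which`, `panum`) so the index names match when concatenating.

```python
import pandas as pd

# Case 1: add one "max" row per panum group.
df1 = pd.DataFrame({
    "panum": ["PA1"] * 3 + ["PA2"] * 3,
    "which": ["A"] * 3 + ["B"] * 3,
    "score": [88, 80, 90, 92, 95, 99],
}).set_index(["panum", "which"])

m1 = (df1.groupby(level=0).max()           # stands in for df.max(level=0)
         .assign(which="max")              # label for the new second-level entry
         .set_index("which", append=True))
out1 = pd.concat([df1, m1]).sort_index()   # stands in for df.append(m).sort_index()
print(out1)

# Case 2: add a "mean" group that averages each factor across panum.
df2 = pd.DataFrame({
    "panum": ["PA1"] * 3 + ["PA2"] * 3,
    "factor": ["init", "resub", "final"] * 2,
    "score": [90, 94, 93, 60, 90, 88],
}).set_index(["panum", "factor"])

m2 = (df2.groupby(level="factor", sort=False).mean()  # sort=False keeps init, resub, final order
         .assign(panum="mean")
         .set_index("panum", append=True)
         .swaplevel(0, 1))                 # reorder levels to (panum, factor)
out2 = pd.concat([df2, m2])
print(out2)
```

Passing `sort=False` matters only for row order: with the default `sort=True`, the mean rows would come out alphabetically (final, init, resub) instead of in their original order.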