
Multiplying and summing certain columns by name in pandas (Python)

Published: 2020-12-20 11:04:45 | Category: Python | Source: compiled from the web

I have a small sample dataset:

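import pandas as pd
d = {
    'measure1_x': [10, 12, 20, 30, 21], 'measure2_x': [11, 12, 10, 3, 3],
    'measure3_x': [10, 0, 12, 1, 1], 'measure1_y': [1, 2, 2, 3, 1],
    'measure2_y': [1, 1, 1, 3, 3], 'measure3_y': [1, 0, 2, 1, 1],
}
df = pd.DataFrame(d)
# reindex_axis was removed in pandas 1.0; select the columns in the desired order instead
df = df[['measure1_x', 'measure2_x', 'measure3_x', 'measure1_y', 'measure2_y', 'measure3_y']]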


measure1_x  measure2_x  measure3_x  measure1_y  measure2_y  measure3_y
        10          11          10           1           1           1
        12          12           0           2           1           0
        20          10          12           2           1           2
        30           3           1           3           3           1
        21           3           1           1           3           1


total = measure1_x * measure1_y + measure2_x * measure2_y + measure3_x * measure3_y


measure1_x  measure2_x  measure3_x  measure1_y  measure2_y  measure3_y  total
        10          11          10           1           1           1     31
        12          12           0           2           1           0     36
        20          10          12           2           1           2     74
        30           3           1           3           3           1    100
        21           3           1           1           3           1     31


# First identify the column names that contain '_x' and '_y', then check whether
# the names are the same after removing '_x' and '_y'. If a pair has the same
# name, multiply the two columns; do that for all pairs and sum the results
# to get the total.

for colname in df.columns:
    if "_x" in colname.lower() or "_y" in colname.lower():
        if "_x" in colname.lower():
            colnamex = colname
        if "_y" in colname.lower():
            colnamey = colname

        # if colnamex[:-2] and colnamey[:-2] are the same, then multiply and sum
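
The approach described in the comments above could be completed roughly as follows. This is only a minimal sketch, not the asker's final code: it assumes each `_x` column has a `_y` partner with the same base name, and the helper name `paired_total` is made up for illustration.

```python
import pandas as pd

def paired_total(df, x_suffix="_x", y_suffix="_y"):
    """Sum col_x * col_y over every pair sharing a base name (hypothetical helper)."""
    total = pd.Series(0, index=df.index)
    for colname in df.columns:
        if colname.endswith(x_suffix):
            # Strip the suffix to get the shared base name, e.g. 'measure1'
            base = colname[: -len(x_suffix)]
            partner = base + y_suffix
            if partner in df.columns:  # skip _x columns without a _y partner
                total = total + df[colname] * df[partner]
    return total

df = pd.DataFrame({
    "measure1_x": [10, 12, 20, 30, 21],
    "measure2_x": [11, 12, 10, 3, 3],
    "measure3_x": [10, 0, 12, 1, 1],
    "measure1_y": [1, 2, 2, 3, 1],
    "measure2_y": [1, 1, 1, 3, 3],
    "measure3_y": [1, 0, 2, 1, 1],
})
df["total"] = paired_total(df)
print(df["total"].tolist())  # [31, 36, 74, 100, 31]
```

Because the pairing is done by name, this does not depend on the order of the columns, at the cost of a Python-level loop over the column names.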



Thought I'd try something a little different this time:


import numpy as np

df = df.sort_index(axis=1)  # optional, do this if your columns aren't sorted

i = df.filter(like='_x')
j = df.filter(like='_y')
df['Total'] = np.einsum('ij,ij->i', i, j)  # same as (i.values * j).sum(axis=1)

   measure1_x  measure2_x  measure3_x  measure1_y  measure2_y  measure3_y  Total
0          10          11          10           1           1           1     31
1          12          12           0           2           1           0     36
2          20          10          12           2           1           2     74
3          30           3           1           3           3           1    100
4          21           3           1           1           3           1     31
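
For readers unfamiliar with the subscript string: `'ij,ij->i'` multiplies the two arrays element by element and sums over the column axis `j`, leaving one value per row `i`. The small check below is only an illustration of that equivalence; the arrays `a` and `b` are the first two rows of the sample data, split into their `_x` and `_y` halves.

```python
import numpy as np

# First two rows of the sample data: _x values in a, _y values in b
a = np.array([[10, 11, 10],
              [12, 12, 0]])
b = np.array([[1, 1, 1],
              [2, 1, 0]])

# Row-wise dot product written as an einsum ...
row_dot = np.einsum('ij,ij->i', a, b)
# ... and the same thing as an explicit multiply-then-sum
same = (a * b).sum(axis=1)

print(row_dot.tolist())  # [31, 36]
```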

A slightly more robust version, which filters out non-numeric columns and performs an assertion beforehand:

df = df.sort_index(axis=1).select_dtypes(exclude=[object])
i = df.filter(regex='.*_x') 
j = df.filter(regex='.*_y')

assert i.shape == j.shape

df['Total'] = np.einsum('ij,ij->i', i, j)
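
Both versions above rely on the `_x` and `_y` columns lining up by position, which is why the columns are sorted first. If that ordering cannot be guaranteed, one alternative is to strip the suffixes and let pandas align the columns by label. The following is a sketch under that assumption and is not part of the original answer; the column order in the sample frame is deliberately shuffled to show that it does not matter.

```python
import pandas as pd

# Same sample data, with the columns deliberately out of order
df = pd.DataFrame({
    "measure2_y": [1, 1, 1, 3, 3],
    "measure1_x": [10, 12, 20, 30, 21],
    "measure3_x": [10, 0, 12, 1, 1],
    "measure1_y": [1, 2, 2, 3, 1],
    "measure3_y": [1, 0, 2, 1, 1],
    "measure2_x": [11, 12, 10, 3, 3],
})

# Anchor the patterns with $ so only true suffixes match, then drop the suffix
x = df.filter(regex=r"_x$").rename(columns=lambda c: c[:-2])
y = df.filter(regex=r"_y$").rename(columns=lambda c: c[:-2])

# Every base name must appear on both sides
assert set(x.columns) == set(y.columns)

# DataFrame.mul aligns on column labels, so positional order is irrelevant
df["Total"] = x.mul(y).sum(axis=1)
print(df["Total"].tolist())  # [31, 36, 74, 100, 31]
```

The trade-off is an extra rename step in exchange for not depending on column order.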



