python – Getting a new column with pandas (consensus of the others)
Published: 2020-12-20 11:10:20 Category: Python Source: compiled from the web
I need some help with a pandas dataframe.
Here is the dataframe:

    group  col1    col2  name
    1      dog     40    canidae
    1      dog     40    canidae
    1      dog     40    canidae
    1      dog     40    canidae
    1      dog     40
    1      dog     40    canidae
    1      dog     40    canidae
    2      frog    85    dendrobatidae
    2      frog    89    leptodactylidae
    2      frog    89    leptodactylidae
    2      frog    82    leptodactylidae
    2      frog    89
    2      frog    81
    2      frog    89    dendrobatidae
    3      horse   87    equidae1
    3      donkey  76    equidae2
    3      zebra   67    equidae3
    4      bird    54    psittacidae
    4      bird    56
    4      bird    34
    5      bear    67
    5      bear    54

What I want is to add a column "consensus_name" and get:

    group  col1    col2  name             consensus_name
    1      dog     40    canidae          canidae
    1      dog     40    canidae          canidae
    1      dog     40                     canidae
    1      dog     40    canidae          canidae
    1      dog     40    canidae          canidae
    2      frog    85    dendrobatidae    leptodactylidae
    2      frog    89    leptodactylidae  leptodactylidae
    2      frog    89    leptodactylidae  leptodactylidae
    2      frog    82    leptodactylidae  leptodactylidae
    2      frog    89                     leptodactylidae
    2      frog    81                     leptodactylidae
    2      frog    89    dendrobatidae    leptodactylidae
    3      horse   87    equidae1         equidae3
    3      donkey  76    equidae2         equidae3
    3      zebra   67    equidae3         equidae3
    4      bird    54    psittacidae      psittacidae
    4      bird    56                     psittacidae
    4      bird    34                     psittacidae
    5      bear    67                     NA
    5      bear    54                     NA

To fill the new column, I take the most represented name within each group.

> For group 1, there are 4 rows named 'canidae' and one row with nothing, so I write 'canidae' in the consensus_name column for every row.

Does anyone have an idea how to do this with pandas? Thanks for your help :)

Output for anky:

        group  col1    col2  name             consensus_name
    0   1      dog     40    canidae          canidae
    1   1      dog     40    canidae          canidae
    2   1      dog     40    canidae          canidae
    3   1      dog     40    canidae          canidae
    4   1      dog     40    NaN              canidae
    5   1      dog     40    canidae          canidae
    6   1      dog     40    canidae          canidae
    7   2      frog    85    dendrobatidae    dendrobatidae
    8   2      frog    89    leptodactylidae  leptodactylidae
    9   2      frog    89    leptodactylidae  leptodactylidae
    10  2      frog    82    leptodactylidae  leptodactylidae
    11  2      frog    89    NaN              leptodactylidae
    12  2      frog    81    NaN              leptodactylidae
    13  2      frog    89    dendrobatidae    dendrobatidae
    14  3      horse   87    equidae1         equidae1
    15  3      donkey  76    equidae2         equidae2
    16  3      zebra   67    equidae3         equidae3
    17  4      bird    54    psittacidae      psittacidae
    18  4      bird    56    NaN              psittacidae
    19  4      bird    34    NaN              psittacidae
    20  5      bear    67    NaN              NaN
    21  5      bear    54    NaN              NaN

Solution
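For reproducibility, the question's dataframe can be rebuilt roughly as follows. This is only a sketch: the rows are copied from the question, and blank names are assumed to be NaN, as the indexed output in the question suggests.

```python
import numpy as np
import pandas as pd

# Rows copied from the question; blank names are assumed to be NaN.
rows = [
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, np.nan),
    (1, "dog", 40, "canidae"),
    (1, "dog", 40, "canidae"),
    (2, "frog", 85, "dendrobatidae"),
    (2, "frog", 89, "leptodactylidae"),
    (2, "frog", 89, "leptodactylidae"),
    (2, "frog", 82, "leptodactylidae"),
    (2, "frog", 89, np.nan),
    (2, "frog", 81, np.nan),
    (2, "frog", 89, "dendrobatidae"),
    (3, "horse", 87, "equidae1"),
    (3, "donkey", 76, "equidae2"),
    (3, "zebra", 67, "equidae3"),
    (4, "bird", 54, "psittacidae"),
    (4, "bird", 56, np.nan),
    (4, "bird", 34, np.nan),
    (5, "bear", 67, np.nan),
    (5, "bear", 54, np.nan),
]

df = pd.DataFrame(rows, columns=["group", "col1", "col2", "name"])
print(df)
```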
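Use pandas.core.groupby.SeriesGroupBy.transform and pass it 'max'. Note that 'max' on strings returns the lexicographically largest name in each group, not the most frequent one; it matches the expected output here only because of how these particular names sort: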
    # First replace missing names with an empty string
    df['name'] = df['name'].fillna('')
    # 'max' picks the lexicographically largest name in each group
    df['consensus_name'] = df.groupby('group')['name'].transform('max')
    print(df)

        group  col1    col2  name             consensus_name
    0   1      dog     40    canidae          canidae
    1   1      dog     40    canidae          canidae
    2   1      dog     40    canidae          canidae
    3   1      dog     40    canidae          canidae
    4   1      dog     40                     canidae
    5   1      dog     40    canidae          canidae
    6   1      dog     40    canidae          canidae
    7   2      frog    85    dendrobatidae    leptodactylidae
    8   2      frog    89    leptodactylidae  leptodactylidae
    9   2      frog    89    leptodactylidae  leptodactylidae
    10  2      frog    82    leptodactylidae  leptodactylidae
    11  2      frog    89                     leptodactylidae
    12  2      frog    81                     leptodactylidae
    13  2      frog    89    dendrobatidae    leptodactylidae
    14  3      horse   87    equidae1         equidae3
    15  3      donkey  76    equidae2         equidae3
    16  3      zebra   67    equidae3         equidae3
    17  4      bird    54    psittacidae      psittacidae
    18  4      bird    56                     psittacidae
    19  4      bird    34                     psittacidae
    20  5      bear    67
    21  5      bear    54

Edit, after it was pointed out that the above does not work in general:

    import pandas as pd

    # Start from the original df, where missing names are still NaN,
    # and forward-fill them within each group
    df['name'] = df.groupby('group')['name'].ffill()
    # All modes per group; reset_index exposes the mode position as 'level_1'
    df_group = df.groupby('group')['name'].apply(lambda x: x.mode(dropna=False)).reset_index()
    # On ties, keep the last mode of each group
    df_group = df_group[df_group['level_1'] == df_group.groupby('group')['level_1'].transform('max')]
    df_group = df_group.rename(columns={'name': 'consensus_name'})
    df_final = pd.merge(df, df_group, on='group')
    print(df_final)

        group  col1    col2  name             level_1  consensus_name
    0   1      dog     40    canidae          0        canidae
    1   1      dog     40    canidae          0        canidae
    2   1      dog     40    canidae          0        canidae
    3   1      dog     40    canidae          0        canidae
    4   1      dog     40    canidae          0        canidae
    5   1      dog     40    canidae          0        canidae
    6   1      dog     40    canidae          0        canidae
    7   2      frog    85    dendrobatidae    0        leptodactylidae
    8   2      frog    89    leptodactylidae  0        leptodactylidae
    9   2      frog    89    leptodactylidae  0        leptodactylidae
    10  2      frog    82    leptodactylidae  0        leptodactylidae
    11  2      frog    89    leptodactylidae  0        leptodactylidae
    12  2      frog    81    leptodactylidae  0        leptodactylidae
    13  2      frog    89    dendrobatidae    0        leptodactylidae
    14  3      horse   87    equidae1         2        equidae3
    15  3      donkey  76    equidae2         2        equidae3
    16  3      zebra   67    equidae3         2        equidae3
    17  4      bird    54    psittacidae      0        psittacidae
    18  4      bird    56    psittacidae      0        psittacidae
    19  4      bird    34    psittacidae      0        psittacidae
    20  5      bear    67    NaN              0        NaN
    21  5      bear    54    NaN              0        NaN

(Editor: 李大同)
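As a possible alternative to the merge-based edit above, the most frequent name can also be broadcast directly with transform, which leaves the name column untouched and avoids the extra level_1 column. This is a minimal sketch and not part of the original answer: the helper name `consensus` is hypothetical, the data is a small subset of the question's rows, and ties are assumed to resolve to the alphabetically last name, mirroring the "last mode" choice in the edit.

```python
import numpy as np
import pandas as pd


def consensus(names: pd.Series):
    """Return the most frequent non-null name, or NaN if the group has none."""
    modes = names.mode()  # NaN is dropped by default; tied modes come back sorted
    # Assumption: on ties, take the last (alphabetically largest) mode.
    return modes.iloc[-1] if not modes.empty else np.nan


# Small illustrative subset of the question's data (groups 1, 2, 3 and 5).
df = pd.DataFrame({
    "group": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 5, 5],
    "name": [
        "canidae", np.nan, "canidae",
        "dendrobatidae", "leptodactylidae", "leptodactylidae", np.nan,
        "equidae1", "equidae2", "equidae3",
        np.nan, np.nan,
    ],
})

# transform broadcasts the scalar returned for each group to every row of that group.
df["consensus_name"] = df.groupby("group")["name"].transform(consensus)
print(df)
```

With this tie-break, group 3 resolves to equidae3 as in the question's expected output, and a group with no names at all keeps NaN.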