python - Pandas DataFrame with a MultiIndex: grouping by DateTime level values
Published: 2020-12-20 11:48:46 Category: Python Source: compiled from the web
I have a pandas DataFrame with a MultiIndex that looks like this:
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
# multi-indexed dataframe
df = pd.DataFrame(np.random.randn(8760 * 3,3))
df['concept'] = "some_value"
df['datetime'] = pd.date_range(start='2016',periods=len(df),freq='60Min')
df.set_index(['concept','datetime'],inplace=True)
df.sort_index(inplace=True)
Console output:
df.head()
Out[23]:
0 1 2
datetime
2016 0.458802 0.413004 0.091056
2016 -0.051840 -1.780310 -0.304122
2016 -1.119973 0.954591 0.279049
2016 -0.691850 -0.489335 0.554272
2016 -1.278834 -1.292012 -0.637931
df.tail()
Out[24]:
0 1 2
datetime
2018 -1.872155 0.434520 -0.526520
2018 0.345213 0.989475 -0.892028
2018 -0.162491 0.908121 -0.993499
2018 -1.094727 0.307312 0.515041
2018 -0.880608 -1.065203 -1.438645
Now I want to compute yearly sums over the 'datetime' level. My first attempt was the following, but it does not work:
# sum along years
years = df.index.get_level_values('datetime').year.tolist()
df.index.set_levels([years],level=['datetime'],inplace=True)
df = df.groupby(level=['datetime']).sum()
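This also feels heavy-handed to me, since the task should be easy to accomplish. So here is my question: how do I get yearly sums over the 'datetime' level? Is there a simple way to do this by applying a function to the DateTime level values?
Solution
You can pass the year of the second MultiIndex level to groupby: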
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
# multi-indexed dataframe
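df = pd.DataFrame(np.random.randn(8760 * 3,3))
df['concept'] = "some_value"
df['datetime'] = pd.date_range(start='2016',periods=len(df),freq='60Min')
df.set_index(['concept','datetime'],inplace=True)
df.sort_index(inplace=True)
print(df.head())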
0 1 2
concept datetime
some_value 2016-01-01 00:00:00 1.973437 0.101535 -0.693360
2016-01-01 01:00:00 1.221657 -1.983806 -0.075609
2016-01-01 02:00:00 -0.208122 -2.203801 1.254084
2016-01-01 03:00:00 0.694332 -0.235864 0.538468
2016-01-01 04:00:00 -0.928815 -1.417445 1.534218
# sum along years
#years = df.index.get_level_values('datetime').year.tolist()
#df.index.set_levels([years],inplace=True)
print(df.index.levels[1].year)
[2016 2016 2016 ...,2018 2018 2018]
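# levels[1] holds the unique datetime values; it lines up with the rows here
# only because 'concept' has a single value
df = df.groupby(df.index.levels[1].year).sum()
print(df.head())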
0 1 2
2016 -93.901914 -32.205514 -22.460965
2017 205.681817 67.701669 -33.960801
2018 67.438355 150.954614 -21.381809
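A caveat on `df.index.levels[1]`: a level stores each distinct value once, so it has as many entries as there are rows only when 'concept' takes a single value, as in the question. The sketch below uses a small hypothetical frame (the concepts 'a' and 'b' and the name `demo` are assumptions, not part of the original post) to show the two lengths diverging.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: two concepts sharing the same four hourly timestamps
stamps = pd.date_range(start='2016', periods=4, freq='60min')
idx = pd.MultiIndex.from_product([['a', 'b'], stamps],
                                 names=['concept', 'datetime'])
demo = pd.DataFrame(np.ones((len(idx), 1)), index=idx)

n_rows = len(demo)                          # one entry per row
n_level_values = len(demo.index.levels[1])  # one entry per distinct timestamp

print(n_rows, n_level_values)
```

With more than one concept, the level is shorter than the frame, so it can no longer serve as a row-by-row grouping key.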
Alternatively, you can use:
df = df.groupby(df.index.get_level_values('datetime').year).sum()
print(df.head())
0 1 2
2016 -93.901914 -32.205514 -22.460965
2017 205.681817 67.701669 -33.960801
2018 67.438355 150.954614 -21.381809
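If the frame held several concepts, the same idea could plausibly be extended by grouping on both the concept and the year. The following is a minimal sketch under that assumption; the frame `demo`, the column name 'value', and the concepts 'a' and 'b' are made up for illustration and do not come from the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: two concepts, four hourly timestamps straddling New Year
stamps = pd.date_range(start='2016-12-31 22:00', periods=4, freq='60min')
idx = pd.MultiIndex.from_product([['a', 'b'], stamps],
                                 names=['concept', 'datetime'])
demo = pd.DataFrame({'value': np.arange(8.0)}, index=idx)

# Per-row keys taken from the index levels
concept = demo.index.get_level_values('concept')
year = demo.index.get_level_values('datetime').year.rename('year')

# Yearly sums, kept separate for each concept
yearly = demo.groupby([concept, year]).sum()
print(yearly)
```

Because `get_level_values` returns one label per row, both keys stay aligned with the frame regardless of how many concepts it contains.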
