
python - Bug in pandas resample / TimeGrouper

Published: 2020-12-20 13:07:52 | Category: Python | Source: compiled from the web
Here is a reproducible example:

from pandas import DataFrame,Timestamp,TimeGrouper

i = [Timestamp('2015-10-07 03:50:01.543999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:02.504000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:02.180000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:04.380000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:14.744000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:17.380000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:19.860000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:19.996000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:32.823999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:37.867999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:41.956000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:46.584000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:46.828000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:49.047999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:53.668000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:55.675999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:55.464000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:50:57.123999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:02.127999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:02.327999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:07.484000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:08.504000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:08.520000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:08.119999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:15.547999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:16.996000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:23.888000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:24.671999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:26.719999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:51:29.924000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:00.372000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:02.900000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:05.883999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:05.888000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:29.119999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:31.319999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:33.676000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:33.987999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:33.248000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:43.288000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:45.068000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:48.259999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:57.196000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:52:59.743999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:00.244000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:00.248000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:00.356000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:00.380000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:03.012000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:14.055999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:18.447999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:18.472000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:27.259999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:30.831999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:30.840000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:31.631999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:41.776000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:44.119999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:52.319999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:54.239999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:54.243999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:53:54.311999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:02.648000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:04.268000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:05.980000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:08.959999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:09.144000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:09.223999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:10.828000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:12.751999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:15.480000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:15.484000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:16.963999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:17.460000+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:34.519999+0000',tz='UTC'),
     Timestamp('2015-10-07 03:54:35.319999+0000',tz='UTC')]

p = [1965.25,1965.25,1965.5,1965.75,1965.0,1964.75,1964.5,1965.25]

Now let's create the DataFrame:

df = DataFrame(data=p,index=i,columns=['price'])

Here is what the data looks like for the 03:54 minute:

df[df.index >= Timestamp('2015-10-07 03:54:00+00:00')].head(12)

                                    price
2015-10-07 03:54:02.648000+00:00  1965.00
2015-10-07 03:54:02.648000+00:00  1965.00
2015-10-07 03:54:02.648000+00:00  1965.00
2015-10-07 03:54:02.648000+00:00  1965.00
2015-10-07 03:54:02.648000+00:00  1965.00
2015-10-07 03:54:02.648000+00:00  1965.00
2015-10-07 03:54:02.648000+00:00  1964.75
2015-10-07 03:54:02.648000+00:00  1964.75
2015-10-07 03:54:02.648000+00:00  1964.75
2015-10-07 03:54:02.648000+00:00  1964.75
2015-10-07 03:54:04.268000+00:00  1964.75
2015-10-07 03:54:04.268000+00:00  1964.75

The first price is 1965.00. Now let's build 1-minute bars using the pandas resample method:

df.resample(rule='1Min',how='ohlc',closed='left',label='left')

                             price                           
                              open     high      low    close
2015-10-07 03:50:00+00:00  1965.25  1965.75  1965.25  1965.50
2015-10-07 03:51:00+00:00  1965.75  1965.75  1965.25  1965.50
2015-10-07 03:52:00+00:00  1965.75  1965.75  1965.25  1965.75
2015-10-07 03:53:00+00:00  1965.50  1965.75  1965.25  1965.50
2015-10-07 03:54:00+00:00  1964.75  1965.25  1964.50  1965.25

The open price for the 03:54 bar is reported as 1964.75, but it should be 1965.00.
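The how= keyword used in the call above was later removed from resample; current pandas expresses the same request as a method chain. The following is a minimal sketch of that form, assuming a recent pandas release and a handful of made-up ticks (not the question's data) whose index is already in chronological order. The names ticks and bars are hypothetical.

```python
import pandas as pd

# Hypothetical ticks (not the question's data): two rows share the
# timestamp that opens the 03:54 minute.
idx = pd.DatetimeIndex(
    [
        "2015-10-07 03:53:54.311999",
        "2015-10-07 03:54:02.648000",
        "2015-10-07 03:54:02.648000",
        "2015-10-07 03:54:04.268000",
    ],
    tz="UTC",
)
ticks = pd.DataFrame({"price": [1965.50, 1965.00, 1964.75, 1964.75]}, index=idx)

# Method-chain form of resample(rule='1Min', how='ohlc', ...).
bars = ticks["price"].resample("1min", closed="left", label="left").ohlc()
print(bars)
```

With the index already sorted, the open of each bar is simply the first row that falls inside that minute.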

Adding a row column makes the problem obvious:

df['row'] = range(1,df.shape[0] + 1)
grouped = df.groupby(TimeGrouper(freq='1Min',label='left'))
grouped.get_group(Timestamp('2015-10-07 03:54:00+00:00')).head(12)
                                    price  row
2015-10-07 03:54:02.648000+00:00  1964.75  189
2015-10-07 03:54:02.648000+00:00  1964.75  188
2015-10-07 03:54:02.648000+00:00  1964.75  187
2015-10-07 03:54:02.648000+00:00  1964.75  186
2015-10-07 03:54:02.648000+00:00  1965.00  185
2015-10-07 03:54:02.648000+00:00  1965.00  181
2015-10-07 03:54:02.648000+00:00  1965.00  183
2015-10-07 03:54:02.648000+00:00  1965.00  182
2015-10-07 03:54:02.648000+00:00  1965.00  180
2015-10-07 03:54:02.648000+00:00  1965.00  184
2015-10-07 03:54:04.268000+00:00  1964.75  190
2015-10-07 03:54:04.268000+00:00  1964.75  191

The TimeGrouper class changes the order of rows that share the same timestamp.
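Only rows that tie on timestamp can be affected, so it may help to check how many there are and whether the index is in chronological order at all. The sketch below uses made-up ticks and hypothetical variable names. The stable sort at the end is a suggestion that the question did not test: a merge sort keeps tied rows in their arrival order, which may make any later grouping step less dependent on how the grouper sorts internally.

```python
import pandas as pd

# Hypothetical ticks: out of chronological order, two rows share a timestamp.
idx = pd.DatetimeIndex(
    [
        "2015-10-07 03:54:04.268000",
        "2015-10-07 03:54:02.648000",
        "2015-10-07 03:54:02.648000",
    ],
    tz="UTC",
)
ticks = pd.DataFrame({"price": [1964.75, 1965.00, 1964.75]}, index=idx)
ticks["row"] = range(1, len(ticks) + 1)  # arrival order, as in the question

print(ticks.index.is_monotonic_increasing)  # is the index sorted?
print(ticks.index.has_duplicates)           # do any timestamps repeat?

# Every row whose timestamp occurs more than once.
tied = ticks[ticks.index.duplicated(keep=False)]

# Stable sort: rows with equal timestamps keep their arrival order.
stable = ticks.sort_index(kind="mergesort")
print(stable)
```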

A temporary workaround is to sort each group by the row column before applying the ohlc transformation.
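The workaround could look roughly like the sketch below. It assumes a recent pandas release, where pd.Grouper(freq=...) takes the place of TimeGrouper, and it uses made-up ticks whose rows are deliberately shuffled. The helper name ohlc_in_row_order is hypothetical.

```python
import pandas as pd

# Hypothetical ticks: "row" records arrival order, but the rows that share
# a timestamp are stored shuffled, as in the grouped output above.
idx = pd.DatetimeIndex(
    [
        "2015-10-07 03:54:02.648000",
        "2015-10-07 03:54:02.648000",
        "2015-10-07 03:54:02.648000",
        "2015-10-07 03:54:04.268000",
    ],
    tz="UTC",
)
ticks = pd.DataFrame(
    {"price": [1964.75, 1965.00, 1965.00, 1964.75], "row": [3, 1, 2, 4]},
    index=idx,
)


def ohlc_in_row_order(frame, freq="1min"):
    """Build OHLC bars, sorting each time bin by "row" before aggregating."""
    bars = {}
    for label, group in frame.groupby(pd.Grouper(freq=freq)):
        if group.empty:  # skip minutes that contain no ticks
            continue
        prices = group.sort_values("row")["price"]
        bars[label] = {
            "open": prices.iloc[0],
            "high": prices.max(),
            "low": prices.min(),
            "close": prices.iloc[-1],
        }
    return pd.DataFrame.from_dict(bars, orient="index")


bars = ohlc_in_row_order(ticks)
print(bars)
```

Unlike resample, this sketch omits minutes without ticks instead of emitting empty bars.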

Thanks for your attention!

Solution

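The following code may work for you. It drops rows with duplicate timestamps, keeping the first of each, before resampling.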

df = DataFrame(data=p,index=i,columns=['price'])
df = df.reset_index().drop_duplicates(subset='index').set_index('index')

df.resample(rule='1Min',how='ohlc',closed='left',label='left')

It produces the following output:

                             price
                              open     high      low    close
index
2015-10-07 03:50:00+00:00  1965.25  1965.75  1965.25  1965.50
2015-10-07 03:51:00+00:00  1965.75  1965.75  1965.25  1965.50
2015-10-07 03:52:00+00:00  1965.75  1965.75  1965.25  1965.75
2015-10-07 03:53:00+00:00  1965.50  1965.75  1965.25  1965.50
2015-10-07 03:54:00+00:00  1965.00  1965.25  1964.50  1965.25
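The de-duplication step can also be written as a boolean mask on the index, which avoids the round trip through reset_index. The sketch below uses made-up ticks and hypothetical variable names, and compares the two forms. It also illustrates a trade-off worth keeping in mind: dropping tied rows discards their prices, so a bar's high or low may differ from what the full data would give.

```python
import pandas as pd

# Hypothetical ticks: the second row ties with the first on timestamp and
# carries the lowest price.
idx = pd.DatetimeIndex(
    [
        "2015-10-07 03:54:02.648000",
        "2015-10-07 03:54:02.648000",
        "2015-10-07 03:54:04.268000",
    ],
    tz="UTC",
)
ticks = pd.DataFrame({"price": [1965.00, 1964.50, 1964.75]}, index=idx)

# Form used in the answer: go through a temporary "index" column.
via_reset = ticks.reset_index().drop_duplicates(subset="index").set_index("index")

# Mask form: keep only the first row for each timestamp.
via_mask = ticks[~ticks.index.duplicated(keep="first")]

print(via_mask)
# Compare the lowest price before and after de-duplication.
print(ticks["price"].min(), via_mask["price"].min())
```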

(Editor: 李大同)
