postgresql – Postgres – calculating changes from cumulative data

Published: 2020-12-13 15:53:42 | Category: Encyclopedia | Source: compiled from the web
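I collect data from several API sources with Python and load it into two tables in Postgres.

I then use this data to generate reports, joining and grouping/filtering it. I add thousands of rows every day.

Cost, revenue and sales are always cumulative: each data point covers the period starting at t1 for that product, and t2 is the time of the data pull.

The most recent pull therefore includes all earlier data, going back to t1. t1 and t2 are timestamp without time zone columns in Postgres. I am currently on Postgres 10.

Sample: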

id,vendor_id,product_id,t1,t2,cost,revenue,sales
1,a,2018-01-01,2018-04-18,50,200,34
2,b,2018-05-01,10,100,10
3,c,2018-01-02,12,9
4,d,2018-01-03,8
5,e,2018-25-02,7

6,2018-04-17,40,30
7,95,8
8,5
9,8,90,4
10,9,0-,3

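Cost and revenue come from two tables, which I join on vendor_id, product_id and t2.

Is there a way to go through all the data, "shift" it and subtract, so that instead of cumulative data I end up with time-series data?

Should this be done before storing, or is it better done when producing the reports?

For reference, if I currently want a report of the change between two points in time, I run two subqueries. That seems backwards compared with computing the data as a time series once and simply aggregating over the interval I need.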

with report1 as (select ...),report2 as (select ...)
select .. from report1 left outer join report2 on ...

Many thanks in advance!

JR

Solution

You can use LAG():

Window Functions:

…returns value evaluated at the row that is offset rows before the
current row within the partition; if there is no such row, instead
return default (which must be of the same type as value). Both offset
and default are evaluated with respect to the current row. If omitted,
offset defaults to 1 and default to null.

-- Sample cumulative snapshots: one vendor, products a and b, three daily pulls each.
-- Fields not listed for rows 2-6 are assumed to repeat the values of earlier rows.
with sample_data as (
        select 1 as id, 'a'::text as vendor_id, 'a'::text as product_id, '2018-01-01'::date as t1, '2018-04-18'::date as t2, 50 as cost, 200 as revenue, 36 as sales
        union all
        select 2, 'a', 'b', '2018-01-01'::date, '2018-04-18'::date, 55, 200, 34
        union all
        select 3, 'a', 'a', '2018-01-01'::date, '2018-04-17'::date, 35, 150, 25
        union all
        select 4, 'a', 'b', '2018-01-01'::date, '2018-04-17'::date, 25, 140, 23
        union all
        select 5, 'a', 'a', '2018-01-01'::date, '2018-04-16'::date, 16, 70, 12
        union all
        select 6, 'a', 'b', '2018-01-01'::date, '2018-04-16'::date, 13, 65, 11
)
select sd.*,
       -- The earliest pull of each series has no previous row, so lag() is null
       -- and coalesce() falls back to the cumulative value itself.
       coalesce(cost - lag(cost) over w, cost) as cost_new,
       coalesce(revenue - lag(revenue) over w, revenue) as revenue_new,
       coalesce(sales - lag(sales) over w, sales) as sales_new
from sample_data sd
-- One window per vendor/product series, ordered by pull time.
window w as (partition by vendor_id, product_id order by t2)
order by t2 desc
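
On the question of differencing before storing: since the data is already collected with Python, the same "subtract the previous pull" logic could be applied there before the insert. The following is only a minimal sketch under assumptions. The function name `to_deltas`, the dict-based row layout and the `*_new` key names are hypothetical, and the three snapshots reuse the cost/revenue/sales figures from the sample above, assuming they belong to one vendor/product series.

```python
from datetime import date

METRICS = ("cost", "revenue", "sales")


def to_deltas(rows):
    """Turn cumulative snapshots into per-pull changes.

    rows: iterable of dicts with vendor_id, product_id, t2 and the METRICS keys.
    Returns copies of the rows, ordered by t2, with cost_new / revenue_new /
    sales_new added.
    """
    previous = {}  # last cumulative row seen per (vendor_id, product_id)
    result = []
    # Sorting by t2 plays the role of ORDER BY t2 inside the SQL window.
    for row in sorted(rows, key=lambda r: r["t2"]):
        key = (row["vendor_id"], row["product_id"])  # the PARTITION BY columns
        prev = previous.get(key)
        out = dict(row)
        for metric in METRICS:
            # The first pull of a series keeps its cumulative value,
            # mirroring coalesce(x - lag(x), x).
            out[metric + "_new"] = row[metric] if prev is None else row[metric] - prev[metric]
        previous[key] = row
        result.append(out)
    return result


# Hypothetical snapshots for a single vendor/product series (assumed grouping).
snapshots = [
    {"vendor_id": "a", "product_id": "a", "t2": date(2018, 4, 18), "cost": 50, "revenue": 200, "sales": 36},
    {"vendor_id": "a", "product_id": "a", "t2": date(2018, 4, 17), "cost": 35, "revenue": 150, "sales": 25},
    {"vendor_id": "a", "product_id": "a", "t2": date(2018, 4, 16), "cost": 16, "revenue": 70, "sales": 12},
]

for r in to_deltas(snapshots):
    print(r["t2"], r["cost_new"], r["revenue_new"], r["sales_new"])
```

Storing deltas makes interval reports a plain SUM over the chosen date range, but a missed or late pull then has to be handled at load time. Keeping the cumulative rows and differencing with LAG() at report time avoids that, at the cost of running the window function on every query.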
