postgresql - Postgres - Calculating changes in cumulative data
I collect data from several API sources with Python and add it to 2 tables in Postgres.
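I then use this data to generate reports, joining and grouping/filtering the data. I add thousands of rows every day.

Cost, revenue and sales are always cumulative, meaning that each data point counts from t1 for that product, and t2 is the time the data was pulled. The most recent data pull therefore includes all previous data back to t1. t1 and t2 are timestamps without time zone in Postgres. I am currently on Postgres 10.

Sample: id,vendor_id,product_id,t1,t2,cost,revenue,sales 1,a,2018-01-01,2018-04-18,50,200,34 2,b,2018-05-01,10,100,10 3,c,2018-01-02,12,9 4,d,2018-01-03,8 5,e,2018-25-02,7 6,2018-04-17,40,30 7,95,8 8,5 9,8,90,4 10,9,0-,3

Cost and revenue come from two tables, which I join on vendor_id, product_id and t2.

Is there a way to go through all the data, "shift" it and subtract, so that instead of cumulative data I have time-series data? Should this be done before storing, or is it better done when producing the reports?

For reference, if I currently want a report of the change between two points in time, I run two subqueries. That seems backwards compared with storing the data as a time series and simply aggregating over the interval I need.

with report1 as (select ...), report2 as (select ...) select .. from report1 left outer join report2 on ...

Thanks a lot in advance! JR

Solution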
You can use LAG(), one of the window functions:
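-- Sample rows: vendor_id, product_id, t1 and some t2/revenue values are
-- assumed from the repeating pattern where the source omits them.
with sample_data as (
    select 1 as id, 'a'::text vendor_id, 'a'::text product_id, '2018-01-01'::date as t1, '2018-04-18'::date as t2, 50 as cost, 200 as revenue, 36 as sales
    union all
    select 2 as id, 'a'::text vendor_id, 'b'::text product_id, '2018-01-01'::date as t1, '2018-04-18'::date as t2, 55 as cost, 200 as revenue, 34 as sales
    union all
    select 3 as id, 'a'::text vendor_id, 'a'::text product_id, '2018-01-01'::date as t1, '2018-04-17'::date as t2, 35 as cost, 150 as revenue, 25 as sales
    union all
    select 4 as id, 'a'::text vendor_id, 'b'::text product_id, '2018-01-01'::date as t1, '2018-04-17'::date as t2, 25 as cost, 140 as revenue, 23 as sales
    union all
    select 5 as id, 'a'::text vendor_id, 'a'::text product_id, '2018-01-01'::date as t1, '2018-04-16'::date as t2, 16 as cost, 70 as revenue, 12 as sales
    union all
    select 6 as id, 'a'::text vendor_id, 'b'::text product_id, '2018-01-01'::date as t1, '2018-04-16'::date as t2, 13 as cost, 65 as revenue, 11 as sales
)
-- For each vendor/product, subtract the previous snapshot (ordered by t2).
-- The first snapshot has no predecessor, so coalesce() keeps its own value.
select sd.*,
    coalesce(cost - lag(cost) over (partition by vendor_id, product_id order by t2), cost) cost_new,
    coalesce(revenue - lag(revenue) over (partition by vendor_id, product_id order by t2), revenue) revenue_new,
    coalesce(sales - lag(sales) over (partition by vendor_id, product_id order by t2), sales) sales_new
from sample_data sd
order by t2 desc;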
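If the differencing were done in the Python collection step before storing, rather than at report time, the same per-group logic as the LAG() query could be sketched roughly as follows. This is only an illustration: the function name `decumulate`, the dict-per-row shape and the sample values are assumptions, not part of the original answer.

```python
from collections import defaultdict
from datetime import date


def decumulate(rows, metrics=("cost", "revenue", "sales")):
    """Turn cumulative snapshots into per-interval changes.

    Mirrors coalesce(x - lag(x) over (partition by vendor_id, product_id
    order by t2), x): the first snapshot of each group keeps its own value.
    """
    # Group snapshots by (vendor_id, product_id), like PARTITION BY.
    groups = defaultdict(list)
    for row in rows:
        groups[(row["vendor_id"], row["product_id"])].append(row)

    result = []
    for snapshots in groups.values():
        # Order each group by pull time, like ORDER BY t2.
        snapshots.sort(key=lambda r: r["t2"])
        previous = None
        for row in snapshots:
            out = dict(row)
            for metric in metrics:
                base = previous[metric] if previous is not None else 0
                out[metric + "_new"] = row[metric] - base
            result.append(out)
            previous = row
    return result


# Hypothetical cumulative snapshots for one vendor and two products.
rows = [
    {"vendor_id": "a", "product_id": "a", "t2": date(2018, 4, 18), "cost": 50, "revenue": 200, "sales": 36},
    {"vendor_id": "a", "product_id": "b", "t2": date(2018, 4, 18), "cost": 55, "revenue": 200, "sales": 34},
    {"vendor_id": "a", "product_id": "a", "t2": date(2018, 4, 17), "cost": 35, "revenue": 150, "sales": 25},
    {"vendor_id": "a", "product_id": "b", "t2": date(2018, 4, 17), "cost": 25, "revenue": 140, "sales": 23},
    {"vendor_id": "a", "product_id": "a", "t2": date(2018, 4, 16), "cost": 16, "revenue": 70, "sales": 12},
    {"vendor_id": "a", "product_id": "b", "t2": date(2018, 4, 16), "cost": 13, "revenue": 65, "sales": 11},
]

deltas = decumulate(rows)
```

Each element of `deltas` carries the original columns plus `cost_new`, `revenue_new` and `sales_new`. Like the SQL version, this assumes every pull for a product is present; a missed pull would fold two intervals into one difference.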