Forming groups of spatio-temporally near trajectories in R or PostgreSQL
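I am doing some trajectory analysis with R and PostgreSQL. To form groups of trajectory segments whose consecutive positions are close in space and time, I created the table below. What I am still missing is the column group_id, and that is what my question is about.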
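    bike_id1  datetime             bike_id2  near   group_id
    1         2016-05-28 11:00:00  2         TRUE   1
    1         2016-05-28 11:00:05  2         TRUE   1
    1         2016-05-28 11:00:10  2         FALSE  NA
    [...]
    2         2016-05-28 11:00:05  3         TRUE   1
    2         2016-05-28 11:00:10  3         TRUE   1

This is the result of comparing every trajectory with every other trajectory (all combinations without repetition), inner-joined on datetime (which is always sampled at multiples of 5 seconds). It shows that, for certain positions, bikes 1 and 2 were sampled at the same time and are spatially near each other (within some arbitrary threshold).

Now I would like to give a unique ID (group_id) to each segment in which two bikes are near each other in space and time. This is where I am stuck: I want group_id to respect groups with more than two trajectories. The method that assigns group_id should be aware that if bikes 1 and 2 are in a group at 2016-05-28 11:00:05, then bike 3 belongs to the same group if it is near bike 2 at that same timestamp (2016-05-28 11:00:05).

Are there tools in R or PostgreSQL that would help me with this task? Running a loop over the table seems like the wrong approach.

EDIT:

    CREATE TABLE nearness
            -- ( seq SERIAL NOT NULL UNIQUE -- surrogate for convenience
            ( bike1 INTEGER NOT NULL
            , bike2 INTEGER NOT NULL
            , stamp timestamp NOT NULL
            , near boolean
            , PRIMARY KEY(bike1,bike2,stamp)
            );

    INSERT INTO nearness( bike1,stamp,near) VALUES (1,2,'2016-05-28 11:00:00',TRUE),(1,'2016-05-28 11:00:05','2016-05-28 11:00:10','2016-05-28 11:00:20',TRUE) -- <<-- gap here
    ,'2016-05-28 11:00:25','2016-05-28 11:00:30',FALSE),(4,5,'2016-05-28 11:00:15',(2,3,TRUE) -- <<-- bike 1,3 are in one grp @ 11:00:05
    ,TRUE) -- <<-- no group here
    ,(6,7,FALSE) ;

Solution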
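UPDATE: [after understanding the real question ;-] Finding the equivalence groups of bikes (set, bike_set) is in fact a relational-division problem. Finding the begin and end of the segments (clust) within a set of bikes is basically the same as in the first attempt.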
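The segment detection mentioned above (finding where a run of consecutive 5-second observations begins and ends) can be sketched on its own. This is a minimal illustration, not the query from this answer: it swaps in the window functions lag() and a running sum() instead of NOT EXISTS subqueries, and the CTE obs with its four timestamps is hypothetical sample data for a single, fixed set of bikes.

```sql
-- Hypothetical sample: one fixed bike set observed on a 5-second grid,
-- with a gap between 11:00:05 and 11:00:20.
WITH obs(stamp) AS (
    VALUES ('2016-05-28 11:00:00'::timestamp)
         , ('2016-05-28 11:00:05')
         , ('2016-05-28 11:00:20')
         , ('2016-05-28 11:00:25')
), flagged AS (
    -- A row starts a new segment when there is no observation at most
    -- 5 seconds before it (lag() is NULL for the very first row,
    -- so the CASE falls through to 1 there as well).
    SELECT stamp
         , CASE WHEN stamp - lag(stamp) OVER (ORDER BY stamp) <= interval '5 sec'
                THEN 0 ELSE 1 END AS is_first
    FROM obs
)
-- The running total of segment starts numbers the segments.
SELECT stamp
     , sum(is_first) OVER (ORDER BY stamp) AS clust
FROM flagged
ORDER BY stamp;
-- clust comes out as 1, 1, 2, 2 for the four rows
```

With several bike sets, the same idea would need PARTITION BY on the set (the array) in both window definitions.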
Clusters are stored in arrays. (I trust that the clusters will not become too large.)

Note: the code trusts that bike2 > bike1. This is needed to keep the arrays sorted, and thus canonical. The actual content is not guaranteed to be canonical, because the order of addition in a recursive query is not guaranteed. This may need some extra work.

    CREATE TABLE nearness ( bike1 INTEGER NOT NULL,FALSE) -- <<-- these False-records serve no purpose
    ,FALSE) -- <<-- result would be the same without them
    ,FALSE) ;

    -- Recursive union-find to glue together sets of bike_ids
    -- , occurring at the same moment.
    -- Sets are represented as {ordered,unique} arrays here
    WITH RECURSIVE wood AS (
        WITH omg AS (
            SELECT bike1, bike2, stamp
                 , row_number() OVER(ORDER BY bike1,stamp) AS seq
                 , ARRAY[bike1,bike2]::integer[] AS arr
            FROM nearness n
            WHERE near = True
            )
        -- Find all existing combinations of bikes
        SELECT o1.stamp, o1.seq
             , ARRAY[o1.bike1,o1.bike2]::integer[] AS arr
        FROM omg o1
        UNION ALL
        SELECT o2.stamp, o2.seq
             -- avoid duplicates inside the array
             , CASE WHEN o2.bike1 = ANY(w.arr) THEN w.arr || o2.bike2
                    ELSE w.arr || o2.bike1 END AS arr
        FROM omg o2
        JOIN wood w ON o2.stamp = w.stamp AND o2.seq > w.seq
            AND (o2.bike1 = ANY(w.arr) OR o2.bike2 = ANY(w.arr))
            AND NOT (o2.bike1 = ANY(w.arr) AND o2.bike2 = ANY(w.arr))
        )
    , uniq AS (
        -- suppress partial sets caused by the recursive union-find buildup
        SELECT * FROM wood w
        WHERE NOT EXISTS (
            SELECT * FROM wood nx
            WHERE nx.stamp = w.stamp
            AND nx.arr @> w.arr AND nx.arr <> w.arr -- contains but not equal
            )
        )
    , xsets AS (
        -- make unique sets of bikes
        SELECT DISTINCT arr
        -- , MIN(seq) AS grp
        FROM uniq
        GROUP BY arr
        )
    , sets AS (
        -- enumerate the sets of bikes
        SELECT arr
             , row_number() OVER () AS setnum
        FROM xsets
        )
    , drag AS (
        -- Detect beginning and end of segments of consecutive observations
        -- within a constant set of bike_ids
        SELECT u.*
             -- Edge-detection begin of group
             , NOT EXISTS (
                SELECT * FROM uniq nx
                WHERE nx.arr = u.arr
                AND nx.stamp < u.stamp
                AND nx.stamp >= u.stamp - '5 sec'::interval
                ) AS is_first
             -- Edge-detection end of group
             , NOT EXISTS (
                SELECT * FROM uniq nx
                WHERE nx.arr = u.arr
                AND nx.stamp > u.stamp
                AND nx.stamp <= u.stamp + '5 sec'::interval
                ) AS is_last
             , row_number() OVER(ORDER BY arr,stamp) AS nseq
        FROM uniq u
        )
    , top AS (
        -- id and groupnum for the start of a group
        SELECT nseq
             , row_number() OVER () AS clust
        FROM drag
        WHERE is_first
        )
    , bot AS (
        -- id and groupnum for the end of a group
        SELECT nseq
             , row_number() OVER () AS clust
        FROM drag
        WHERE is_last
        )
    SELECT w.seq as orgseq -- results, please ...
         , w.stamp
         , g0.clust AS clust
         , row_number() OVER(www) AS rn
         , s.setnum
         , s.arr AS bike_set
    FROM drag w
    JOIN sets s ON s.arr = w.arr
    JOIN top g0 ON g0.nseq <= w.seq
    JOIN bot g1 ON g1.nseq >= w.seq AND g1.clust = g0.clust
    WINDOW www AS (PARTITION BY g1.clust ORDER BY w.stamp)
    ORDER BY g1.clust, w.stamp
    ;

Result:

     orgseq |        stamp        | clust | rn | setnum | bike_set
    --------+---------------------+-------+----+--------+----------
          1 | 2016-05-28 11:00:00 |     1 |  1 |      1 | {1,2}
          4 | 2016-05-28 11:00:20 |     3 |  1 |      1 | {1,2}
          5 | 2016-05-28 11:00:25 |     3 |  2 |      1 | {1,2}
          6 | 2016-05-28 11:00:05 |     4 |  1 |      3 | {1,3}
          7 | 2016-05-28 11:00:10 |     4 |  2 |      3 | {1,3}
          8 | 2016-05-28 11:00:10 |     4 |  3 |      2 | {4,5}
    (6 rows)
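The note above says the arrays may need extra work to be truly canonical. One possible way to do that, sketched here as an assumption and not taken from the answer, is to rebuild each array as a sorted, de-duplicated copy before comparing sets. The input array literal is hypothetical.

```sql
-- Canonicalize an integer array: unnest it, drop duplicates, sort,
-- and collect the values back into an array.
SELECT ARRAY(
           SELECT DISTINCT x
           FROM unnest(ARRAY[3,1,2,1]) AS t(x)
           ORDER BY x
       ) AS canonical;
-- canonical is {1,2,3}
```

Applied to the arr column (for instance where uniq is built), this would make two arrays that hold the same bikes in a different order compare as equal. Whether the recursive query ever produces such differently ordered arrays depends on the data, so treat this as an optional safeguard.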