sql – 如何查找无向图的所有连接子图
对于我正在努力解决的问题,我需要一些帮助.
示例表: ID |Identifier1 | Identifier2 --------------------------------- 1 | a | c 2 | b | f 3 | a | g 4 | c | h 5 | b | j 6 | d | f 7 | e | k 8 | i | 9 | l | h 我想在两列之间对彼此相关的标识符进行分组,并分配唯一的组ID. 期望的输出: Identifier | Gr_ID | Gr.Members --------------------------------------------------- a | 1 | (a,c,g,h,l) b | 2 | (b,d,f,j) c | 1 | (a,l) d | 2 | (b,j) e | 3 | (e,k) f | 2 | (b,j) g | 1 | (a,l) h | 1 | (a,l) j | 2 | (b,j) k | 3 | (e,k) l | 1 | (a,l) i | 4 | (i) 注意:Gr.Members列不是必需的,主要用于更清晰的视图.
但是必须将组ID分配给每个标识符(由两列的并集选择)而不是行. 有关如何构建查询以提供所需输出的任何帮助? 谢谢. 更新:下面是一些额外的样本集及其预期输出. 给定表: Identifier1 | Identifier2 ---------------------------- a | f a | g a | NULL b | c b | a b | h b | j b | NULL b | NULL b | g c | k c | b d | l d | f d | g d | m d | a d | NULL d | a e | c e | b e | NULL 预期输出:所有记录应属于组ID = 1的同一组. 给定表: Identifier1 | Identifier2 -------------------------- a | a b | b c | a c | b c | c 预期输出:记录应位于组ID = 1的同一组中. 解决方法这是一个不使用游标的变体,但使用单个递归查询.本质上,它将数据视为图中的边,并递归遍历图的所有边,在检测到循环时停止.然后它将所有找到的循环放入组中,并为每个组分配一个数字. 请参阅下面有关其工作原理的详细说明.我建议您运行查询CTE-by-CTE并检查每个中间结果以了解它的作用. 样品1 DECLARE @T TABLE (ID int,Ident1 char(1),Ident2 char(1)); INSERT INTO @T (ID,Ident1,Ident2) VALUES (1,'a','a'),(2,'b','b'),(3,'c',(4,(5,'c'); 样本2 我添加了另一行z值,以使多行具有不成对的值. DECLARE @T TABLE (ID int,(1,'c'),'f'),'g'),'h'),'j'),(6,'d',(7,'e','k'),(8,'i',NULL),(88,'z','z'),(9,'l','h'); 样本3 DECLARE @T TABLE (ID int,(10,(11,(12,(13,'l'),(14,(15,(16,'m'),(17,(18,(19,(20,(21,(22,NULL); 询问 WITH CTE_Idents AS ( SELECT Ident1 AS Ident FROM @T UNION SELECT Ident2 AS Ident FROM @T ),CTE_Pairs AS ( SELECT Ident1,Ident2 FROM @T WHERE Ident1 <> Ident2 UNION SELECT Ident2 AS Ident1,Ident1 AS Ident2 FROM @T WHERE Ident1 <> Ident2 ),CTE_Recursive AS ( SELECT CAST(CTE_Idents.Ident AS varchar(8000)) AS AnchorIdent,Ident2,CAST(',' + Ident1 + ',' + Ident2 + ',' AS varchar(8000)) AS IdentPath,1 AS Lvl FROM CTE_Pairs INNER JOIN CTE_Idents ON CTE_Idents.Ident = CTE_Pairs.Ident1 UNION ALL SELECT CTE_Recursive.AnchorIdent,CTE_Pairs.Ident1,CTE_Pairs.Ident2,CAST(CTE_Recursive.IdentPath + CTE_Pairs.Ident2 + ',CTE_Recursive.Lvl + 1 AS Lvl FROM CTE_Pairs INNER JOIN CTE_Recursive ON CTE_Recursive.Ident2 = CTE_Pairs.Ident1 WHERE CTE_Recursive.IdentPath NOT LIKE CAST('%,' + CTE_Pairs.Ident2 + ',%' AS varchar(8000)) ),CTE_RecursionResult AS ( SELECT AnchorIdent,Ident2 FROM CTE_Recursive ),CTE_CleanResult AS ( SELECT AnchorIdent,Ident1 AS Ident FROM CTE_RecursionResult UNION SELECT AnchorIdent,Ident2 AS Ident FROM CTE_RecursionResult ) SELECT CTE_Idents.Ident,CASE WHEN CA_Data.XML_Value IS NULL THEN CTE_Idents.Ident ELSE CA_Data.XML_Value END AS GroupMembers,DENSE_RANK() OVER(ORDER BY CASE WHEN CA_Data.XML_Value IS NULL THEN CTE_Idents.Ident ELSE CA_Data.XML_Value END ) AS GroupID FROM CTE_Idents CROSS APPLY ( SELECT CTE_CleanResult.Ident+',' FROM CTE_CleanResult WHERE CTE_CleanResult.AnchorIdent = CTE_Idents.Ident ORDER BY CTE_CleanResult.Ident FOR XML PATH(''),TYPE ) AS CA_XML(XML_Value) CROSS APPLY ( SELECT CA_XML.XML_Value.value('.','NVARCHAR(MAX)') ) AS CA_Data(XML_Value) WHERE CTE_Idents.Ident IS NOT NULL ORDER BY Ident; 结果1 +-------+--------------+---------+ | Ident | GroupMembers | GroupID | +-------+--------------+---------+ | a | a,b,| 1 | | b | a,| 1 | | c | a,| 1 | +-------+--------------+---------+ 结果2 +-------+--------------+---------+ | Ident | GroupMembers | GroupID | +-------+--------------+---------+ | a | a,l,| 1 | | b | b,j,| 2 | | c | a,| 1 | | d | b,| 2 | | e | e,k,| 3 | | f | b,| 2 | | g | a,| 1 | | h | a,| 1 | | i | i | 4 | | j | b,| 2 | | k | e,| 3 | | l | a,| 1 | | z | z | 5 | +-------+--------------+---------+ 结果3 +-------+--------------------------+---------+ | Ident | GroupMembers | GroupID | +-------+--------------------------+---------+ | a | a,e,m,| 1 | | d | a,| 1 | | e | a,| 1 | | f | a,| 1 | | g | a,| 1 | | j | a,| 1 | | k | a,| 1 | | l | a,| 1 | | m | a,| 1 | +-------+--------------------------+---------+ 这个怎么运作 我将使用第二组样本数据进行解释. CTE_Idents CTE_Idents给出了Ident1和Ident2列中出现的所有标识符的列表. +-------+ | Ident | +-------+ | NULL | | a | | b | | c | | d | | e | | f | | g | | h | | i | | j | | k | | l | | z | +-------+ CTE_Pairs CTE_Pairs给出了两个方向上图形的所有边的列表.同样,UNION用于删除任何重复项. +--------+--------+ | Ident1 | Ident2 | +--------+--------+ | a | c | | a | g | | b | f | | b | j | | c | a | | c | h | | d | f | | e | k | | f | b | | f | d | | g | a | | h | c | | h | l | | j | b | | k | e | | l | h | +--------+--------+ CTE_Recursive CTE_Recursive是查询的主要部分,它从每个唯一标识符开始递归遍历图形. CTE_Recursive.IdentPath NOT LIKE CAST('%,%' AS varchar(8000)) 一旦我们遇到之前包含在Path中的Identifier,递归就会在连接节点列表耗尽时停止. +-------------+--------+--------+-------------+-----+ | AnchorIdent | Ident1 | Ident2 | IdentPath | Lvl | +-------------+--------+--------+-------------+-----+ | a | a | c |,a,| 1 | | a | a | g |,| 1 | | b | b | f |,| 1 | | b | b | j |,| 1 | | c | c | a |,| 1 | | c | c | h |,| 1 | | d | d | f |,| 1 | | e | e | k |,| 1 | | f | f | b |,| 1 | | f | f | d |,| 1 | | g | g | a |,| 1 | | h | h | c |,| 1 | | h | h | l |,| 1 | | j | j | b |,| 1 | | k | k | e |,| 1 | | l | l | h |,| 1 | | l | h | c |,| 2 | | l | c | a |,| 3 | | l | a | g |,| 4 | | j | b | f |,| 2 | | j | f | d |,| 3 | | h | c | a |,| 2 | | h | a | g |,| 3 | | g | a | c |,| 2 | | g | c | h |,| 3 | | g | h | l |,| 4 | | f | b | j |,| 2 | | d | f | b |,| 2 | | d | b | j |,| 3 | | c | h | l |,| 2 | | c | a | g |,| 2 | | b | f | d |,| 2 | | a | c | h |,| 2 | | a | h | l |,| 3 | +-------------+--------+--------+-------------+-----+ CTE_CleanResult CTE_CleanResult仅从CTE_Recursive中保留相关部分,并再次使用UNION合并Ident1和Ident2. +-------------+-------+ | AnchorIdent | Ident | +-------------+-------+ | a | a | | a | c | | a | g | | a | h | | a | l | | b | b | | b | d | | b | f | | b | j | | c | a | | c | c | | c | g | | c | h | | c | l | | d | b | | d | d | | d | f | | d | j | | e | e | | e | k | | f | b | | f | d | | f | f | | f | j | | g | a | | g | c | | g | g | | g | h | | g | l | | h | a | | h | c | | h | g | | h | h | | h | l | | j | b | | j | d | | j | f | | j | j | | k | e | | k | k | | l | a | | l | c | | l | g | | l | h | | l | l | +-------------+-------+ 最后的选择 现在我们需要为每个AnchorIdent构建一串逗号分隔的Ident值.用FOR XML交叉应用就可以了.DENSE_RANK()计算每个AnchorIdent的GroupID号. (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |