scala - What are the various join types in Spark?
Published: 2020-12-16 09:04:13 | Category: Security | Source: compiled from the web
I checked the documentation, and it says the following join types are supported:

    Type of join to perform. Default inner. Must be one of: inner, cross, outer, full, full_outer, left, left_outer, right, right_outer, left_semi, left_anti.

I looked at the StackOverflow answer on SQL joins, and the top few answers do not mention some of the joins listed above, such as left_semi and left_anti. What do they mean in Spark?

Solution
Here is a simple illustrative experiment:
    import org.apache.spark._
    import org.apache.spark.sql._
    import org.apache.spark.sql.expressions._
    import org.apache.spark.sql.functions._

    object SparkSandbox extends App {

      // Shadows org.apache.spark.sql.Row inside this object.
      case class Row(id: Int, value: String)

      private[this] implicit val spark = SparkSession.builder().master("local[*]").getOrCreate()
      import spark.implicits._
      spark.sparkContext.setLogLevel("ERROR")

      val r1 = Seq(Row(1, "A1"), Row(2, "A2"), Row(3, "A3"), Row(4, "A4")).toDS()
      // id 4 appears twice on the right side, so matching left rows are duplicated.
      val r2 = Seq(Row(3, "A3"), Row(4, "A4"), Row(4, "A4_1"), Row(5, "A5"), Row(6, "A6")).toDS()

      val joinTypes = Seq("inner", "outer", "full", "full_outer", "left", "left_outer",
        "right", "right_outer", "left_semi", "left_anti")

      // Run the same join on "id" once per join type and print the result.
      joinTypes foreach { joinType =>
        println(s"${joinType.toUpperCase()} JOIN")
        r1.join(right = r2, usingColumns = Seq("id"), joinType = joinType).orderBy("id").show()
      }
    }

Output:

    INNER JOIN
    +---+-----+-----+
    | id|value|value|
    +---+-----+-----+
    |  3|   A3|   A3|
    |  4|   A4| A4_1|
    |  4|   A4|   A4|
    +---+-----+-----+

    OUTER JOIN
    +---+-----+-----+
    | id|value|value|
    +---+-----+-----+
    |  1|   A1| null|
    |  2|   A2| null|
    |  3|   A3|   A3|
    |  4|   A4|   A4|
    |  4|   A4| A4_1|
    |  5| null|   A5|
    |  6| null|   A6|
    +---+-----+-----+

    FULL JOIN
    +---+-----+-----+
    | id|value|value|
    +---+-----+-----+
    |  1|   A1| null|
    |  2|   A2| null|
    |  3|   A3|   A3|
    |  4|   A4| A4_1|
    |  4|   A4|   A4|
    |  5| null|   A5|
    |  6| null|   A6|
    +---+-----+-----+

    FULL_OUTER JOIN
    +---+-----+-----+
    | id|value|value|
    +---+-----+-----+
    |  1|   A1| null|
    |  2|   A2| null|
    |  3|   A3|   A3|
    |  4|   A4| A4_1|
    |  4|   A4|   A4|
    |  5| null|   A5|
    |  6| null|   A6|
    +---+-----+-----+

    LEFT JOIN
    +---+-----+-----+
    | id|value|value|
    +---+-----+-----+
    |  1|   A1| null|
    |  2|   A2| null|
    |  3|   A3|   A3|
    |  4|   A4| A4_1|
    |  4|   A4|   A4|
    +---+-----+-----+

    LEFT_OUTER JOIN
    +---+-----+-----+
    | id|value|value|
    +---+-----+-----+
    |  1|   A1| null|
    |  2|   A2| null|
    |  3|   A3|   A3|
    |  4|   A4| A4_1|
    |  4|   A4|   A4|
    +---+-----+-----+

    RIGHT JOIN
    +---+-----+-----+
    | id|value|value|
    +---+-----+-----+
    |  3|   A3|   A3|
    |  4|   A4|   A4|
    |  4|   A4| A4_1|
    |  5| null|   A5|
    |  6| null|   A6|
    +---+-----+-----+

    RIGHT_OUTER JOIN
    +---+-----+-----+
    | id|value|value|
    +---+-----+-----+
    |  3|   A3|   A3|
    |  4|   A4| A4_1|
    |  4|   A4|   A4|
    |  5| null|   A5|
    |  6| null|   A6|
    +---+-----+-----+

    LEFT_SEMI JOIN
    +---+-----+
    | id|value|
    +---+-----+
    |  3|   A3|
    |  4|   A4|
    +---+-----+

    LEFT_ANTI JOIN
    +---+-----+
    | id|value|
    +---+-----+
    |  1|   A1|
    |  2|   A2|
    +---+-----+
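The LEFT_SEMI and LEFT_ANTI results above can be read as filters on the left side: left_semi keeps left rows whose id has a match on the right, left_anti keeps left rows whose id has no match, and neither returns any right-side column. The following is a minimal sketch of that reading using plain Scala collections instead of Spark. The names `Rec`, `leftSemi`, `leftAnti`, and `JoinSemantics` are hypothetical and exist only for this illustration; the sketch models equality on a non-null key and is not how Spark executes these joins.

```scala
// Plain-Scala model of left_semi and left_anti; no Spark dependency.
// All names here are hypothetical and used only for illustration.
object JoinSemantics {
  final case class Rec(id: Int, value: String)

  // left_semi: keep each left row whose id has at least one match on the right.
  // Only left columns are returned, and a left row is never duplicated,
  // even when the right side holds several rows with the same id.
  def leftSemi(left: Seq[Rec], right: Seq[Rec]): Seq[Rec] = {
    val rightKeys = right.map(_.id).toSet
    left.filter(r => rightKeys.contains(r.id))
  }

  // left_anti: keep each left row whose id has no match on the right.
  def leftAnti(left: Seq[Rec], right: Seq[Rec]): Seq[Rec] = {
    val rightKeys = right.map(_.id).toSet
    left.filterNot(r => rightKeys.contains(r.id))
  }

  // Same data as r1 and r2 in the experiment.
  val r1: Seq[Rec] = Seq(Rec(1, "A1"), Rec(2, "A2"), Rec(3, "A3"), Rec(4, "A4"))
  val r2: Seq[Rec] =
    Seq(Rec(3, "A3"), Rec(4, "A4"), Rec(4, "A4_1"), Rec(5, "A5"), Rec(6, "A6"))

  def main(args: Array[String]): Unit = {
    println(leftSemi(r1, r2)) // List(Rec(3,A3), Rec(4,A4))
    println(leftAnti(r1, r2)) // List(Rec(1,A1), Rec(2,A2))
  }
}
```

Note the difference from INNER JOIN in the experiment: id 4 appears twice there because it matches two right rows, while left_semi returns it once. Together, the left_semi and left_anti results partition the left dataset.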