scala – Spark: how to union all DataFrames in a loop
Published: 2020-12-16 18:27:34 | Category: Security | Source: compiled from the web
Is there a way to union DataFrames with other DataFrames inside a loop?
Here is some sample code:

    var fruits = List("apple", "orange", "melon")
    for (x <- fruits) {
      var df = Seq(("aaa", "bbb", x)).toDF("aCol", "bCol", "name")
    }
I would like to end up with something like:

    aCol | bCol | fruitsName
    aaa  | bbb  | apple
    aaa  | bbb  | orange
    aaa  | bbb  | melon

Thanks again.

Solution
Steffen Schmitz's answer is the most concise one, in my opinion.
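That concise answer is not quoted on this page, but the pattern it refers to is the usual one for this question: build one small DataFrame per element, then fold them together with union instead of mutating a var inside a loop. A minimal sketch, assuming a SparkSession named `spark` is in scope (as in spark-shell):

    // Sketch of the concise approach: one DataFrame per fruit,
    // combined with reduce(_ union _). Assumes `spark` exists.
    import spark.implicits._

    val fruits = List("apple", "orange", "melon")

    val combined = fruits
      .map(x => Seq(("aaa", "bbb", x)).toDF("aCol", "bCol", "fruitsName"))
      .reduce(_ union _)

    combined.show()

This avoids the hand-written empty starting DataFrame entirely. Note that each union extends the logical plan, so for very long lists of DataFrames the chained plan can get expensive to analyze.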
If you are looking for more customization (field types, etc.), here is a more detailed answer:

    import org.apache.spark.sql.types.{StructType, StructField, StringType}
    import org.apache.spark.sql.Row

    // initialize an empty DataFrame with an explicit schema
    val schema = StructType(
      StructField("aCol", StringType, true) ::
      StructField("bCol", StringType, true) ::
      StructField("name", StringType, true) :: Nil)
    var initialDF = spark.createDataFrame(sc.emptyRDD[Row], schema)

    // list to iterate through
    val fruits = List("apple", "orange", "melon")

    for (x <- fruits) {
      // union returns a new Dataset, so reassign the result
      initialDF = initialDF.union(Seq(("aaa", "bbb", x)).toDF)
    }

    initialDF.show()

Reference:

> How to create an empty DataFrame with a specified schema?

(Editor: Li Datong)