
scala – Error in StructField(a,StringType,false): it is false when it should be true

Published: 2020-12-16 08:46:22 | Category: Security | Source: compiled from the web
I get this error in my Scala test:

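StructType(StructField(a,StringType,true),StructField(b,StringType,true),StructField(c,StringType,true),StructField(d,StringType,true),StructField(e,StringType,true),StructField(f,StringType,true),StructField(NewColumn,StringType,false)) did not equal StructType(StructField(a,StringType,true),StructField(b,StringType,true),StructField(c,StringType,true),StructField(d,StringType,true),StructField(e,StringType,true),StructField(f,StringType,true),StructField(NewColumn,StringType,true))

ScalaTestFailureLocation: com.holdenkarau.spark.testing.TestSuite$class at (TestSuite.scala:13)

Expected :StructType(StructField(a,StringType,true),StructField(b,StringType,true),StructField(c,StringType,true),StructField(d,StringType,true),StructField(e,StringType,true),StructField(f,StringType,true),StructField(NewColumn,StringType,true))

Actual   :StructType(StructField(a,StringType,true),StructField(b,StringType,true),StructField(c,StringType,true),StructField(d,StringType,true),StructField(e,StringType,true),StructField(f,StringType,true),StructField(NewColumn,StringType,false))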

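The last StructField (NewColumn) has nullable = false when it should be true, and I don't know why. Here true means the schema accepts null values in that column.

Here is my test: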

val schema1 = Array("a","b","c","d","e","f")
val df = List(("a1","b1","c1","d1","e1","f1"),("a2","b2","c2","d2","e2","f2"))
  .toDF(schema1: _*)

val schema2 = Array("a","b","c","d","e","f","NewColumn")

val dfExpected = List(
  ("a1","b1","c1","d1","e1","f1","a1_b1_c1_d1_e1_f1"),
  ("a2","b2","c2","d2","e2","f2","a2_b2_c2_d2_e2_f2")
).toDF(schema2: _*)

val transformer = KeyContract("NewColumn",schema1)
val newDf = transformer(df)
newDf.columns should contain ("NewColumn")
assertDataFrameEquals(newDf,dfExpected)

Here is KeyContract:

case class KeyContract(tempColumn: String,columns: Seq[String],unsigned: Boolean = true) extends Transformer {

  override def apply(input: DataFrame): DataFrame = {
    import org.apache.spark.sql.functions._

    val inputModif = columns.foldLeft(input) { (tmpDf,columnName) =>
      tmpDf.withColumn(columnName,when(col(columnName).isNull,lit("")).otherwise(col(columnName)))
    }

    inputModif.withColumn(tempColumn,concat_ws("_",columns.map(col): _*))
  }
}

Thanks in advance!!

Solution

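This happens because concat_ws never returns null here, so the resulting field is marked as not nullable. A DataFrame built with toDF from Scala strings, on the other hand, marks every string column as nullable, so the two schemas differ in that one flag.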
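To see the difference directly, you can inspect the nullable flag of each field. The following is a minimal sketch meant for spark-shell or a Scala script; the local SparkSession, the two-column input, and the name `nullability` are assumptions made for illustration and are not part of the original test.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat_ws}

// Assumption: a throwaway local session created only for this demonstration.
val spark = SparkSession.builder.master("local[1]").appName("nullable-demo").getOrCreate()
import spark.implicits._

// String columns created through toDF are marked nullable.
val df = List(("a1", "b1"), ("a2", "b2")).toDF("a", "b")

// concat_ws with a literal separator yields a non-nullable column.
val withKey = df.withColumn("NewColumn", concat_ws("_", col("a"), col("b")))

// Map each column name to its nullable flag.
val nullability = withKey.schema.fields.map(f => f.name -> f.nullable).toMap
```

In this sketch `nullability("a")` and `nullability("b")` are true while `nullability("NewColumn")` is false, which is the same mismatch the failing assertion reports.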

If you want to use the second DataFrame as the reference, you have to build it from Rows with an explicit schema:

import org.apache.spark.sql.{Row,SparkSession}
import org.apache.spark.sql.types._

val spark: SparkSession = SparkSession.builder.getOrCreate()

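// schema2 must name all seven columns (a to f, then NewColumn), one per Row value.
val dfExpected = spark.createDataFrame(spark.sparkContext.parallelize(List(
  Row("a1","b1","c1","d1","e1","f1","a1_b1_c1_d1_e1_f1"),
  Row("a2","b2","c2","d2","e2","f2","a2_b2_c2_d2_e2_f2")
)),StructType(schema2.map { c => StructField(c,StringType,c != "NewColumn") }))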

This way the last column is not nullable:

dfExpected.printSchema
root
 |-- a: string (nullable = true)
 |-- b: string (nullable = true)
 |-- c: string (nullable = true)
 |-- d: string (nullable = true)
 |-- e: string (nullable = true)
 |-- f: string (nullable = true)
 |-- NewColumn: string (nullable = false)
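If you would rather keep building the expected DataFrame with toDF, a possible alternative is to relax nullability on the actual result before comparing. The sketch below is one way to do that, not something the original answer proposes: `asNullable` is a hypothetical helper (it is not part of Spark or spark-testing-base), it only handles top-level fields, and the local SparkSession and sample data are assumptions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws}
import org.apache.spark.sql.types.StructType

// Hypothetical helper: rebuilds the DataFrame with every top-level field
// marked nullable. Nested struct fields are left untouched.
def asNullable(df: DataFrame): DataFrame = {
  val relaxed = StructType(df.schema.fields.map(_.copy(nullable = true)))
  df.sparkSession.createDataFrame(df.rdd, relaxed)
}

// Assumption: a throwaway local session created only for this demonstration.
val spark = SparkSession.builder.master("local[1]").appName("relax-nullable").getOrCreate()
import spark.implicits._

val actual = List(("a1", "b1")).toDF("a", "b")
  .withColumn("NewColumn", concat_ws("_", col("a"), col("b")))

// After relaxing, the schema matches what toDF produces for string columns.
val relaxedActual = asNullable(actual)
val allNullable = relaxedActual.schema.fields.forall(_.nullable)
```

With a helper like this, the test could compare `asNullable(newDf)` against the toDF-based dfExpected. The trade-off is that the test then no longer checks the nullability that KeyContract actually produces.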
