Scala – Error in StructField(a,StringType,false): it is false when it should be true
Published: 2020-12-16 08:46:22 Source: compiled from the web
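I get this error in my Scala test:

    StructType(StructField(a,StringType,true), StructField(b,StringType,true), StructField(c,StringType,true), StructField(d,StringType,true), StructField(e,StringType,true), StructField(f,StringType,true), StructField(NewColumn,StringType,false)) did not equal StructType(StructField(a,StringType,true), StructField(b,StringType,true), StructField(c,StringType,true), StructField(d,StringType,true), StructField(e,StringType,true), StructField(f,StringType,true), StructField(NewColumn,StringType,true))
    ScalaTestFailureLocation: com.holdenkarau.spark.testing.TestSuite$class at (TestSuite.scala:13)
    Expected :StructType(StructField(a,StringType,true), StructField(b,StringType,true), StructField(c,StringType,true), StructField(d,StringType,true), StructField(e,StringType,true), StructField(f,StringType,true), StructField(NewColumn,StringType,true))
    Actual   :StructType(StructField(a,StringType,true), StructField(b,StringType,true), StructField(c,StringType,true), StructField(d,StringType,true), StructField(e,StringType,true), StructField(f,StringType,true), StructField(NewColumn,StringType,false))

The last StructField is false when it should be true, and I don't know why. Here, true means the schema accepts null values for that column.

Here is my test:

    val schema1 = Array("a", "b", "c", "d", "e", "f")
    val df = List(
      ("a1", "b1", "c1", "d1", "e1", "f1"),
      ("a2", "b2", "c2", "d2", "e2", "f2")
    ).toDF(schema1: _*)

    val schema2 = Array("a", "b", "c", "d", "e", "f", "NewColumn")
    val dfExpected = List(
      ("a1", "b1", "c1", "d1", "e1", "f1", "a1_b1_c1_d1_e1_f1"),
      ("a2", "b2", "c2", "d2", "e2", "f2", "a2_b2_c2_d2_e2_f2")
    ).toDF(schema2: _*)

    val transformer = KeyContract("NewColumn", schema1)
    val newDf = transformer(df)

    newDf.columns should contain ("NewColumn")
    assertDataFrameEquals(newDf, dfExpected)

Here is KeyContract:

    case class KeyContract(tempColumn: String, columns: Seq[String], unsigned: Boolean = true) extends Transformer {
      override def apply(input: DataFrame): DataFrame = {
        import org.apache.spark.sql.functions._

        // Replace nulls with empty strings in every key column
        val inputModif = columns.foldLeft(input) { (tmpDf, columnName) =>
          tmpDf.withColumn(columnName, when(col(columnName).isNull, lit("")).otherwise(col(columnName)))
        }

        // Join the key columns with "_" into the new column
        inputModif.withColumn(tempColumn, concat_ws("_", columns.map(col): _*))
      }
    }

Thanks in advance!

Solution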
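This happens because concat_ws skips null inputs instead of returning null, so with a literal separator such as "_" it never returns null and Spark marks the resulting field as non-nullable. The expected DataFrame is built with toDF from tuples of strings, where every column, including NewColumn, is nullable, so the two schemas differ.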
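The nullability flags can be checked directly on the schema. The following is a minimal sketch, not code from the question: the object name ConcatWsNullability, the method nullableFlags, and the two-column sample data are all illustrative assumptions, and it expects a SparkSession to be passed in.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat_ws}

// Hypothetical helper (not part of the original question).
object ConcatWsNullability {
  // Builds a tiny DataFrame, adds a concat_ws column, and returns
  // a map of column name -> nullable flag taken from the schema.
  def nullableFlags(spark: SparkSession): Map[String, Boolean] = {
    import spark.implicits._

    // String columns created from tuples via toDF are nullable.
    val df = List(("a1", "b1"), ("a2", "b2")).toDF("a", "b")

    // Same construction as in KeyContract, with a literal "_" separator.
    val withKey = df.withColumn("NewColumn", concat_ws("_", col("a"), col("b")))

    withKey.schema.fields.map(f => f.name -> f.nullable).toMap
  }
}
```

Calling ConcatWsNullability.nullableFlags(spark) with a local session (for example one built with master("local[1]"), which is also an assumption here) should report a and b as nullable and NewColumn as non-nullable, which is the mismatch the test failure points at.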
If you want to use the second DataFrame as the reference, you have to build it from an explicit schema and Rows:

    import org.apache.spark.sql.{Row, SparkSession}
    import org.apache.spark.sql.types._

    val spark: SparkSession = SparkSession.builder.getOrCreate()

    val dfExpected = spark.createDataFrame(
      spark.sparkContext.parallelize(List(
        Row("a1", "b1", "c1", "d1", "e1", "f1", "a1_b1_c1_d1_e1_f1"),
        Row("a2", "b2", "c2", "d2", "e2", "f2", "a2_b2_c2_d2_e2_f2")
      )),
      // Every column is nullable except NewColumn
      StructType(schema2.map { c => StructField(c, StringType, c != "NewColumn") })
    )

This way, the last column is non-nullable:

    dfExpected.printSchema
    root
     |-- a: string (nullable = true)
     |-- b: string (nullable = true)
     |-- c: string (nullable = true)
     |-- d: string (nullable = true)
     |-- e: string (nullable = true)
     |-- f: string (nullable = true)
     |-- NewColumn: string (nullable = false)

(Editor: 李大同)
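If the expected DataFrame is more convenient to build with toDF, another option is to copy the nullable flags from a reference schema onto it before comparing. This is a sketch of that alternative, not the answer's method: SchemaAlign and withNullabilityOf are hypothetical names, and the approach assumes the rows really contain no nulls in columns that end up marked non-nullable.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.StructType

// Hypothetical helper (not part of the original answer).
object SchemaAlign {
  // Returns `df` with the same rows, names, and types, but with each
  // column's nullable flag taken from the same-named field in `reference`.
  // Columns missing from `reference` keep their current flag.
  def withNullabilityOf(df: DataFrame, reference: StructType): DataFrame = {
    val flags = reference.fields.map(f => f.name -> f.nullable).toMap
    val aligned = StructType(
      df.schema.fields.map(f => f.copy(nullable = flags.getOrElse(f.name, f.nullable)))
    )
    // Re-create the DataFrame from its rows with the adjusted schema.
    df.sparkSession.createDataFrame(df.rdd, aligned)
  }
}
```

In the test above this would be used as assertDataFrameEquals(newDf, SchemaAlign.withNullabilityOf(dfExpected, newDf.schema)). Column names and types are still compared as before; only the nullability difference is removed.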