scala – How do I add a line number to each line?
Published: 2020-12-16 09:07:55 | Category: Security | Source: compiled from the web
Suppose this is my data:

    ‘Maps‘ and ‘Reduces‘ are two phases of solving a query in HDFS.
    ‘Map’ is responsible to read data from input location.
    it will generate a key value pair.
    that is,an intermediate output in local machine.
    ’Reducer’ is responsible to process the intermediate.
    output received from the mapper and generate the final output.

I want to add a number to each line, as in the output below, and save the result to a file:

    1,‘Maps‘ and ‘Reduces‘ are two phases of solving a query in HDFS.
    2,‘Map’ is responsible to read data from input location.
    3,it will generate a key value pair.
    4,that is,an intermediate output in local machine.
    5,’Reducer’ is responsible to process the intermediate.
    6,output received from the mapper and generate the final output.

I tried:

    object DS_E5 {
      def main(args: Array[String]): Unit = {
        var i = 0
        val conf = new SparkConf().setAppName("prep").setMaster("local")
        val sc = new SparkContext(conf)
        val sample1 = sc.textFile("data.txt")
        for (sample <- sample1) {
          i = i + 1
          val ss = sample.map(l => (i, sample))
          println(ss)
        }
      }
    }

But its output looks like this:

    Vector((1,‘Maps‘ and ‘Reduces‘ are two phases of solving a query in HDFS.)) ...

How can I edit my code to produce the output I want?

Solution
zipWithIndex is what you need. It maps an RDD[T] to an RDD[(T, Long)] by adding the element's index in the second position of each pair.

    sample1
      .zipWithIndex()
      .map { case (line, i) => i.toString + "," + line }

Or, using string interpolation (see @Daniel C. Sobral's comment):

    sample1
      .zipWithIndex()
      .map { case (line, i) => s"$i,$line" }

Note that zipWithIndex starts counting at 0, so use i + 1 if the numbering should start at 1, as in the desired output.
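
The question also asks for the numbered lines to be saved to a file. Here is a minimal sketch of a complete program that does both, assuming the same `data.txt` input and local master as the question. The object name `NumberLines`, the helper `numbered`, and the output path `numbered-output` are illustrative assumptions, not part of the original answer.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object NumberLines {
  // Hypothetical helper: turns the zero-based index from zipWithIndex
  // into a one-based "n,line" string.
  def numbered(line: String, zeroBasedIndex: Long): String =
    s"${zeroBasedIndex + 1},$line"

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("prep").setMaster("local")
    val sc = new SparkContext(conf)
    try {
      sc.textFile("data.txt")
        .zipWithIndex()                              // RDD[(String, Long)]
        .map { case (line, i) => numbered(line, i) } // RDD[String]
        .saveAsTextFile("numbered-output")           // assumed path; must not already exist
    } finally {
      sc.stop()
    }
  }
}
```

`saveAsTextFile` writes a directory rather than a single file: the numbered lines end up in one or more `part-*` files inside `numbered-output`, and the call fails if that directory already exists.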