The map/reduce idea in Scala
1. map

scala> val v = Vector(1, 2, 3, 4)
scala> val v2 = v.map(n => n * 2)
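As a minimal illustration (plain Scala, no Spark needed), map applies the given function to every element and builds a new collection; the original collection is left unchanged:

```scala
object MapDemo extends App {
  val v = Vector(1, 2, 3, 4)
  // map applies n => n * 2 to each element and returns a new Vector.
  val v2 = v.map(n => n * 2)
  println(v)   // Vector(1, 2, 3, 4) -- the original is unchanged
  println(v2)  // Vector(2, 4, 6, 8)
}
```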
2. reduce

scala> val v = Vector(1, 2, 3, 4)
scala> val v3 = v.reduce((sum, n) => sum + n)

Many examples write v.reduce((a, b) => a + b), which is not easy to understand. What actually happens is that the first two values are passed in, processed, and the result is then passed in together with the next value, and so on until every element has been processed.

3. A concrete example: computing the average age from a file

Each line of the input file holds an index and an age separated by a space, e.g. "1 54".

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object AvgAgeCalculator {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Spark Exercise:Average Age Calculator")
    val sc = new SparkContext(conf)
    val dataFile = sc.textFile("file:///Users/walle/Documents/spark_projects/sparkage/sample_age_data.txt", 5)
    val count = dataFile.count()
    // Each line is processed individually: split on the space and take the
    // second field (the age). Scala arrays are indexed with (), not [].
    val ageData = dataFile.map(line => line.split(" ")(1))
    // Sum all the ages.
    val totalAge = ageData.map(age => Integer.parseInt(String.valueOf(age)))
      .collect().reduce((a, b) => a + b)
    val avgAge: Double = totalAge.toDouble / count.toDouble
    println("Average Age is " + avgAge)
  }
}
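The same average can be computed with plain Scala collections, which makes the map/reduce pipeline easy to follow without a Spark cluster. This is only a sketch: the in-memory data below is hypothetical stand-in input, not the contents of sample_age_data.txt.

```scala
object LocalAvgAge extends App {
  // Hypothetical stand-in for the lines of the age data file.
  val lines = Vector("1 54", "2 30", "3 18", "4 62")
  // map: extract the age (second space-separated field) and parse it.
  val ages = lines.map(line => line.split(" ")(1).toInt)
  // reduce: fold the ages into a running sum, two values at a time:
  // ((54 + 30) + 18) + 62 = 164
  val totalAge = ages.reduce((sum, n) => sum + n)
  val avgAge = totalAge.toDouble / ages.length
  println("Average Age is " + avgAge)  // Average Age is 41.0
}
```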
Reference: http://www.codeblogbt.com/archives/148029 (Editor: 李大同)