Puzzling runtime behavior of Scala parallel collections

Published: 2020-12-16 19:04:11 | Category: Security | Source: compiled from the web

Edit: My sample size was too small. When I ran against real data on 8 CPUs, I saw a 7.2x speedup. Not too shabby for adding 4 characters to my code ;)

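I am currently trying to "sell" management on the benefits of using Scala, especially when it comes to scaling across CPUs. To that end, I created a simple test application that does a bunch of vector math, and I was a bit surprised to find that the runtime was not noticeably better on my quad-core machine. Interestingly, the runtime is worst on the first pass through the collection and gets better on subsequent calls. Is something lazy in the parallel collections causing this, or am I just doing it wrong? It should be noted that I come from the C++/C# world, so it is entirely possible that I have messed up my configuration. Regardless, here is my setup: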

IntelliJ Scala plugin

Scala 2.9.1.final

Windows 7 64-bit, quad-core processor (no hyperthreading)

import util.Random

  // simple Vector3D class that has final x, y, z components, a length, and a '-' function
  class Vector3D(val x:Double,val y:Double,val z:Double)
  {
    def length = math.sqrt(x*x+y*y+z*z)
    def -(rhs : Vector3D ) = new Vector3D(x - rhs.x,y - rhs.y,z - rhs.z)
  }

object MainClass {

  def main(args : Array[String]) =
  {
    println("Available CPU's: " + Runtime.getRuntime.availableProcessors())
    println("Parallelism Degree set to: " + collection.parallel.ForkJoinTasks.defaultForkJoinPool.getParallelism);
    // my position
    val myPos = new Vector3D(0,0,0);

    val r = new Random(0);

    // define a function nextRand that gets us a random between 0 and 100
    def nextRand = r.nextDouble() * 100;

    // make 10 million random targets
    val targets = (0 until 10000000).map(_ => new Vector3D(nextRand,nextRand,nextRand)).toArray
    // take the .par hit before we start profiling
    val parTargets = targets.par

    println("Created " + targets.length + " vectors")

    // define a range function
    val rangeFunc : (Vector3D => Double) = (targetPos) => (targetPos - myPos).length

    // we'll select ones that are <50
    val within50 : (Vector3D => Boolean) = (targetPos) => rangeFunc(targetPos) < 50

    // time it sequentially
    val startTime_sequential = System.currentTimeMillis()
    val numTargetsInRange_sequential = targets.filter(within50)
    val endTime_sequential = System.currentTimeMillis()
    println("Sequential (ms): " + (endTime_sequential - startTime_sequential))

    // do the parallel version 10 times
    for(i <- 1 to 10)
    {

      val startTime_par = System.currentTimeMillis()
      val numTargetsInRange_parallel = parTargets.filter(within50)
      val endTime_par = System.currentTimeMillis()

      val ms = endTime_par - startTime_par;
      println("Iteration[" + i + "] Executed in " + ms + " ms")
    }
  }
}

The output of the program is:

Available CPU's: 4
Parallelism Degree set to: 4
Created 10000000 vectors
Sequential (ms): 216
Iteration[1] Executed in 227 ms
Iteration[2] Executed in 253 ms
Iteration[3] Executed in 76 ms
Iteration[4] Executed in 78 ms
Iteration[5] Executed in 77 ms
Iteration[6] Executed in 80 ms
Iteration[7] Executed in 78 ms
Iteration[8] Executed in 78 ms
Iteration[9] Executed in 79 ms
Iteration[10] Executed in 82 ms

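So what is going on here? The first 2 times we run the filter it is slow, and then things speed up? I understand that there is inherently a startup cost to parallelism. I am just trying to figure out where it makes sense to express parallelism in my application, and specifically I want to be able to show management a program that runs 3-4 times faster on a quad-core box. Is this just not a good problem for that?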

Ideas?

Solution

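You have the micro-benchmark disease. You are most likely benchmarking the JIT compilation phase: the first iterations run before the JIT has compiled the hot code, so they are slow. You need to warm up the JIT with some untimed runs before you start measuring.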
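As a rough illustration of the warm-up advice, here is a minimal sketch, not the answerer's own code. The names `WarmupSketch`, `timeMs`, and `measure` are hypothetical helpers introduced for this example, and the run counts are arbitrary assumptions. It uses a plain sequential `filter` on an array of doubles so that it runs on any Scala version; the same harness could wrap `parTargets.filter(within50)` from the question.

```scala
object WarmupSketch {
  // Hypothetical helper: evaluates `body` once and returns (result, elapsed milliseconds).
  def timeMs[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1e6
    (result, elapsedMs)
  }

  // Hypothetical helper: runs `body` untimed `warmupRuns` times so the JIT can
  // compile the hot path, then returns the timings of `timedRuns` further runs.
  def measure[A](warmupRuns: Int, timedRuns: Int)(body: => A): Seq[Double] = {
    for (_ <- 1 to warmupRuns) body
    (1 to timedRuns).map(_ => timeMs(body)._2)
  }

  def main(args: Array[String]): Unit = {
    val r = new scala.util.Random(0)
    // Smaller than the question's 10 million elements, purely to keep the sketch quick.
    val data = Array.fill(1000000)(r.nextDouble() * 100)

    // The run counts (5 and 5) are assumptions; real code may need more warm-up.
    val timings = measure(warmupRuns = 5, timedRuns = 5)(data.filter(_ < 50))
    timings.zipWithIndex.foreach { case (ms, i) =>
      println("Timed run " + (i + 1) + ": " + ms + " ms")
    }
  }
}
```

Only the timed runs are reported, so the slow early iterations seen in the question's output would fall into the discarded warm-up phase. Note that the question's sequential figure was taken from a single cold run, so it would need the same treatment before comparing it with the parallel timings.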

The best idea is probably to use a micro-benchmarking framework such as http://code.google.com/p/caliper/, which takes care of all of this for you.
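To give a feel for the kind of bookkeeping a benchmarking framework automates, here is a hand-rolled sketch that warms up, takes several samples, and reports the median rather than a single reading. This is not Caliper's API: `MedianTimingSketch`, `median`, and `medianMs` are hypothetical names, and a real framework does considerably more (for example, guarding against dead-code elimination and running trials in separate JVMs).

```scala
object MedianTimingSketch {
  // Median of a non-empty sequence; for an even count, the mean of the two middle values.
  def median(xs: Seq[Double]): Double = {
    require(xs.nonEmpty, "median of an empty sequence is undefined")
    val sorted = xs.sorted
    val n = sorted.length
    if (n % 2 == 1) sorted(n / 2)
    else (sorted(n / 2 - 1) + sorted(n / 2)) / 2.0
  }

  // Hypothetical helper: untimed warm-up runs, then the median of the timed runs in ms.
  def medianMs(warmupRuns: Int, timedRuns: Int)(body: => Any): Double = {
    for (_ <- 1 to warmupRuns) body
    val samples = (1 to timedRuns).map { _ =>
      val start = System.nanoTime()
      body
      (System.nanoTime() - start) / 1e6
    }
    median(samples)
  }

  def main(args: Array[String]): Unit = {
    // The ten parallel timings (ms) posted in the question, including the two slow ones.
    val posted = Seq(227.0, 253.0, 76.0, 78.0, 77.0, 80.0, 78.0, 78.0, 79.0, 82.0)
    println("Median of posted timings (ms): " + median(posted))

    // Timing a small piece of work with assumed run counts.
    val data = Array.tabulate(100000)(_.toDouble)
    println("Median filter time (ms): " + medianMs(warmupRuns = 5, timedRuns = 9)(data.filter(_ < 50000)))
  }
}
```

On the question's ten posted timings the median is 78.5 ms, while the mean is 110.8 ms because the two slow early iterations pull it up. That is one reason to prefer a robust statistic over a single measurement or a plain average.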

Edit: For benchmarking Scala projects with Caliper, there is a nice SBT template, referenced from this blog post.

(Editor: 李大同)