为什么Scala的尾递归慢于Java？

发布时间：2020-12-16 18:59:20 所属栏目：安全来源：网络整理

导读：使用尾递归进行简单加法的 Scala代码 def add(list : List[Int],sum:Int):Int = { //Thread.dumpStack() if (list.isEmpty) { sum } else { val headVal = list.head add(list.tail,sum + headVal) }} 下面是递归模式下添加的java代码. public static int ad

使用尾递归进行简单加法的 Scala代码

def add(list : List[Int],sum:Int):Int = {
    //Thread.dumpStack()
    if (list.isEmpty) {
        sum
    } else {
        val headVal  = list.head
        add(list.tail,sum + headVal)
    }
}

下面是递归模式下添加的java代码.

public static int add(List<Integer> list,Integer sum) {
    // Thread.dumpStack();
    if (list.isEmpty()) {
        return sum;
    } else {
        int headVal = list.remove(0);
        return add(list,sum + headVal);
    }
}

Java版本运行速度至少快10倍.这是1000条目.通过使用System.nanoTime()API之前和之后的测量时间.

Scala版本2.10,Java版本Java 7.两个运行的JVM属性相同.

解决方法

首先,您展示的Scala方法不在(类)上下文中.如果在类中使用此方法,则无法应用尾递归优化,因为该方法既不是final也不是private.如果添加@tailrec,编译将失败.如果我用10000运行它,它会导致堆栈溢出.

至于Java版本：Java版本使用可变列表：头/??尾分解改变了基础列表.
因此,在总结之后,您不能再使用该列表,因为它是空的.

此外,Scala中的List与Java List具有完全不同的含义; Scala列表用于头/尾分解.据我所知java.util.List没有tail方法,Java编译器也没有应用tailrec优化,因此比较是“不公平的”.

无论如何,我已经在不同的场景上运行了一些基于JMH的测试.

您可以真正比较的两个场景是“Scala while”和“Java for”.他们既不使用OOP也不使用函数编程,这只是必要的.

五种不同Scala场景的结果

(请滚动到右侧,最后一栏中有一个小描述)

Benchmark                   Mode   Samples         Mean   Mean error    Units
a.b.s.b.Benchmark5a.run    thrpt        10      238,515        7,769   ops/ms Like in the question,but with @tailrec
a.b.s.b.Benchmark5b.run    thrpt        10      130,202        2,232   ops/ms Using List.sum
a.b.s.b.Benchmark5c.run    thrpt        10     2756,132       29,920   ops/ms while (no List,vars,imperative)
a.b.s.b.Benchmark5d.run    thrpt        10      237,286        2,203   ops/ms tailrec version with pattern matching
a.b.s.b.Benchmark5e.run    thrpt        10      214,719        2,483   ops/ms Like in the question (= no tailrec opt)

> 5a和5e就像问题中的那样; 5a与@tailrec.
> 5b：List.sum：非常慢
> 5c：不是一个大惊喜,命令式版本是最快的版本.
> 5d使用模式匹配而不是if(那将是“我的风格”),我添加了这个的来源：

package app.benchmark.scala.benchmark5

import scala.annotation._
import org.openjdk.jmh.annotations.GenerateMicroBenchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State
import org.openjdk.jmh.runner.Runner
import org.openjdk.jmh.runner.RunnerException
import org.openjdk.jmh.runner.options.Options
import org.openjdk.jmh.runner.options.OptionsBuilder

@State(Scope.Benchmark)
object BenchmarkState5d {
  val list = List.range(1,1000)
}

class Benchmark5d {
  private def add(list : List[Int]): Int = {
    @tailrec
    def add(list : List[Int],sum: Int): Int = {
      list match {
        case Nil =>
          sum
        case h :: t =>
          add(t,h + sum)
      }
    }
    add(list,0)
  }

  @GenerateMicroBenchmark
  def run() = {
    add(BenchmarkState5d.list)
  }
}

三种Java场景

Benchmark                   Mode   Samples         Mean   Mean error    Units
a.b.j.b.Benchmark5a.run    thrpt        10       40,437        0,532   ops/ms mutable (rebuilds the list in each iteration)
a.b.j.b.Benchmark5b.run    thrpt        10        0,450        0,008   ops/ms subList
a.b.j.b.Benchmark5c.run    thrpt        10     2735,951       29,177   ops/ms for

如果你真的想要在函数式编程风格(= immutable,tail recursion,head / tail分解)的意义上进行比较,那么Java版本要慢五倍.

正如Marko Topolnik在评论中所指出的那样：

subList doesn’t copy the tail,but it does something comparably bad,when applied to a LinkedList: it wraps the original list and uses an offset to accomodate the semantics. The result is that an O(n) recursive algorithm becomes O(n2)—the same as if the tail was repeatedly copied. Plus,wrappers accrete so you end up with the list wrapped a thousand times. Definitely not comparable to a head/tail list

public class Benchmark5a {
    public static int add(List<Integer> list,Integer sum) {
        if (list.isEmpty()) {
            return sum;
        } else {
            int headVal = list.remove(0);
            return add(list,sum + headVal);
        }
    }

    @GenerateMicroBenchmark
    public long run() {
        final List<Integer> list = new LinkedList<Integer>();
        for(int i = 0; i < 1000; i++) {
            list.add(i);
        }
        return add(list,0);
    }

    public static void main(String[] args) {
        System.out.println(new Benchmark5a().run());
    }
}

@State(Scope.Benchmark)
class BenchmarkState5b {
    public final static List<Integer> list = new LinkedList<Integer>();

    static {
        for(int i = 0; i < 1000; i++) {
            list.add(i);
        }
    }
}

public class Benchmark5b {
    public static int add(List<Integer> list,int sum) {
        if (list.isEmpty()) {
            return sum;
        } else {
            int headVal = list.get(0);
            return add(list.subList(1,list.size()),sum + headVal);
        }
    }

    @GenerateMicroBenchmark
    public long run() {
        return add(BenchmarkState5b.list,0);
    }

    public static void main(String[] args) {
        System.out.println(new Benchmark5b().run());
    }
}

Scala结果很详细

(所有结果仅显示最后一个场景,以及整体结果)

[...]

# VM invoker: /home/oracle-jdk-1.8-8u40/data/oracle-jdk-1.8.0_40/jre/bin/java
# VM options: <none>
# Fork: 1 of 1
# Warmup: 3 iterations,1 s each
# Measurement: 10 iterations,1 s each
# Threads: 1 thread,will synchronize iterations
# Benchmark mode: Throughput,ops/time
# Benchmark: app.benchmark.scala.benchmark5.Benchmark5e.run
# Warmup Iteration   1: 166,153 ops/ms
# Warmup Iteration   2: 215,242 ops/ms
# Warmup Iteration   3: 216,632 ops/ms
Iteration   1: 215,526 ops/ms
Iteration   2: 213,720 ops/ms
Iteration   3: 213,967 ops/ms
Iteration   4: 215,468 ops/ms
Iteration   5: 216,247 ops/ms
Iteration   6: 217,514 ops/ms
Iteration   7: 215,503 ops/ms
Iteration   8: 211,969 ops/ms
Iteration   9: 212,989 ops/ms
Iteration  10: 214,291 ops/ms

Result : 214,719 ±(99.9%) 2,483 ops/ms
  Statistics: (min,avg,max) = (211,969,214,719,217,514),stdev = 1,642
  Confidence interval (99.9%): [212,236,202]


Benchmark                   Mode   Samples         Mean   Mean error    Units
a.b.s.b.Benchmark5a.run    thrpt        10      238,769   ops/ms
a.b.s.b.Benchmark5b.run    thrpt        10      130,232   ops/ms
a.b.s.b.Benchmark5c.run    thrpt        10     2756,920   ops/ms
a.b.s.b.Benchmark5d.run    thrpt        10      237,203   ops/ms
a.b.s.b.Benchmark5e.run    thrpt        10      214,483   ops/ms

Java结果详细

# VM invoker: /home/oracle-jdk-1.8-8u40/data/oracle-jdk-1.8.0_40/jre/bin/java
# VM options: <none>
# Fork: 1 of 1
# Warmup: 3 iterations,ops/time
# Benchmark: app.benchmark.java.benchmark5.Benchmark5c.run
# Warmup Iteration   1: 2777,495 ops/ms
# Warmup Iteration   2: 2888,040 ops/ms
# Warmup Iteration   3: 2692,851 ops/ms
Iteration   1: 2737,169 ops/ms
Iteration   2: 2745,368 ops/ms
Iteration   3: 2754,105 ops/ms
Iteration   4: 2706,131 ops/ms
Iteration   5: 2721,593 ops/ms
Iteration   6: 2769,261 ops/ms
Iteration   7: 2734,461 ops/ms
Iteration   8: 2741,494 ops/ms
Iteration   9: 2740,012 ops/ms
Iteration  10: 2709,915 ops/ms

Result : 2735,951 ±(99.9%) 29,177 ops/ms
  Statistics: (min,max) = (2706,131,2735,951,2769,261),stdev = 19,299
  Confidence interval (99.9%): [2706,774,2765,128]


Benchmark                   Mode   Samples         Mean   Mean error    Units
a.b.j.b.Benchmark5a.run    thrpt        10       40,532   ops/ms
a.b.j.b.Benchmark5b.run    thrpt        10        0,008   ops/ms
a.b.j.b.Benchmark5c.run    thrpt        10     2735,177   ops/ms

更新：使用ArrayList添加了另一个Java方案5d

Benchmark                   Mode   Samples         Mean   Mean error    Units
a.b.j.b.Benchmark5a.run    thrpt        10       34,931        0,504   ops/ms
a.b.j.b.Benchmark5b.run    thrpt        10        0,430        0,005   ops/ms
a.b.j.b.Benchmark5c.run    thrpt        10     2610,085        9,664   ops/ms
a.b.j.b.Benchmark5d.run    thrpt        10       56,693        1,218   ops/ms

（编辑：李大同）

【声明】本站内容均来自网络，其相关言论仅代表作者个人观点，不代表本站立场。若无意侵犯到您的权利，请及时与联系站长删除相关内容!