java – Comparing get/put operations on direct and non-direct ByteBuffers

Published: 2020-12-14 17:42:44 Category: Java Source: compiled from the web


Is get/put from a non-direct ByteBuffer faster than get/put from a direct ByteBuffer?

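If I have to read from or write to a direct ByteBuffer, is it better to first read/write into a thread-local byte array and then update the direct ByteBuffer in full from that byte array (for writes)?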

Answer

Is get/put from a non-direct bytebuffer faster than get/put from direct bytebuffer ?

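If you compare a heap buffer with a direct buffer that does not use native byte order, the performance is very similar. (Most systems are little-endian, while the default for a direct ByteBuffer is big-endian.)

If you use natively ordered byte buffers, performance can be noticeably better for multi-byte values. For single bytes it makes little difference whatever you do.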
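To make the byte-order point concrete, here is a minimal sketch showing that a newly allocated ByteBuffer (heap or direct) starts out big-endian, and what changes in memory when a different order is selected. The class and method names (`ByteOrderDemo`, `defaultOrder`, `firstByteOfOne`) are illustrative only and are not part of the original answer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteOrderDemo {
    // Returns the byte order of a freshly allocated buffer (heap or direct).
    static ByteOrder defaultOrder(boolean direct) {
        ByteBuffer bb = direct ? ByteBuffer.allocateDirect(8) : ByteBuffer.allocate(8);
        return bb.order();
    }

    // Writes the int value 1 using the given order and returns the first byte in memory.
    static byte firstByteOfOne(ByteOrder order) {
        ByteBuffer bb = ByteBuffer.allocateDirect(4).order(order);
        bb.putInt(0, 1);
        return bb.get(0);
    }

    public static void main(String[] args) {
        System.out.println("default heap order:   " + defaultOrder(false));
        System.out.println("default direct order: " + defaultOrder(true));
        System.out.println("native order:         " + ByteOrder.nativeOrder());
        System.out.println("first byte, BIG_ENDIAN:    " + firstByteOfOne(ByteOrder.BIG_ENDIAN));
        System.out.println("first byte, LITTLE_ENDIAN: " + firstByteOfOne(ByteOrder.LITTLE_ENDIAN));
    }
}
```

When the buffer's order differs from `ByteOrder.nativeOrder()`, each multi-byte get or put has to swap bytes, which is the extra work the paragraph above refers to.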

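In HotSpot/OpenJDK, ByteBuffer uses the Unsafe class, and many of its native methods are treated as intrinsics. This is JVM-dependent; AFAIK the Android VM treats them as intrinsics in recent versions.

If you dump the generated assembly, you can see that the intrinsics in Unsafe are turned into a single machine-code instruction, i.e. they do not have the overhead of a JNI call.

In fact, if you are into micro-tuning, you may find that most of the time in a ByteBuffer getXxxx or putXxxx is spent on bounds checking rather than on the actual memory access. For this reason I still use Unsafe directly when I need maximum performance (note: Oracle discourages this).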
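The bounds checks mentioned above are easy to observe, even if their cost is not. The following sketch (class and method names are illustrative) shows that both the absolute and the relative `getInt` validate the access on every call and throw instead of reading past the limit. It shows only that the checks exist; it does not measure how much time they take.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BoundsCheckDemo {
    // Absolute read: the index is validated against the limit on each call.
    static boolean absoluteReadIsChecked() {
        ByteBuffer bb = ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
        try {
            bb.getInt(6); // would need bytes 6..9, but the limit is 8
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    // Relative read: the remaining byte count is validated on each call.
    static boolean relativeReadIsChecked() {
        ByteBuffer bb = ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
        bb.position(6); // only 2 bytes remain, fewer than an int needs
        try {
            bb.getInt();
            return false;
        } catch (BufferUnderflowException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("absolute getInt checked: " + absoluteReadIsChecked());
        System.out.println("relative getInt checked: " + relativeReadIsChecked());
    }
}
```

Unsafe performs no such validation, which is why it can be faster and also why an incorrect address can crash the JVM rather than throw an exception.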

If I have to read / write from direct bytebuffer,is it better to first read /write in to a thread local byte array and then update ( for writes ) the direct bytebuffer fully with the byte array ?

I would hate to see what that is better than. ;) It sounds very complicated.

Often the simplest solution is both better and faster.
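For reference, the two approaches being weighed can be sketched side by side as follows. This is an assumption about what the question has in mind, not code from the original post: the staging array here is a plain local array rather than a thread-local one, and all names are illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class StagingVsDirectDemo {
    // Simple approach: write each value straight into the direct buffer.
    static ByteBuffer writeDirectly(int[] values) {
        ByteBuffer direct = ByteBuffer.allocateDirect(values.length * 4).order(ByteOrder.nativeOrder());
        for (int v : values)
            direct.putInt(v);
        direct.flip();
        return direct;
    }

    // Approach from the question: fill a byte array first, then bulk-copy it.
    static ByteBuffer writeViaStagingArray(int[] values) {
        byte[] staging = new byte[values.length * 4];
        ByteBuffer heapView = ByteBuffer.wrap(staging).order(ByteOrder.nativeOrder());
        for (int v : values)
            heapView.putInt(v);
        ByteBuffer direct = ByteBuffer.allocateDirect(staging.length).order(ByteOrder.nativeOrder());
        direct.put(staging); // one bulk copy from the array into the direct buffer
        direct.flip();
        return direct;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4};
        // Both buffers end up holding the same bytes.
        System.out.println(writeDirectly(values).equals(writeViaStagingArray(values)));
    }
}
```

Both methods produce identical buffer contents; the staging variant simply adds an extra copy and more code to maintain.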

You can test this yourself with the following code.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class DirectBufferTest {
    public static void main(String... args) {
        ByteBuffer bb1 = ByteBuffer.allocateDirect(256 * 1024).order(ByteOrder.nativeOrder());
        ByteBuffer bb2 = ByteBuffer.allocateDirect(256 * 1024).order(ByteOrder.nativeOrder());
        // Repeat the test so the JIT has a chance to compile the loop.
        for (int i = 0; i < 10; i++)
            runTest(bb1, bb2);
    }

    private static void runTest(ByteBuffer bb1, ByteBuffer bb2) {
        bb1.clear();
        bb2.clear();
        long start = System.nanoTime();
        // Copy bb1 into bb2 one int at a time.
        while (bb2.remaining() > 0)
            bb2.putInt(bb1.getInt());
        long time = System.nanoTime() - start;
        // One getInt plus one putInt for every 4 bytes.
        int operations = bb1.capacity() / 4 * 2;
        System.out.printf("Each putInt/getInt took an average of %.1f ns%n", (double) time / operations);
    }
}

prints

Each putInt/getInt took an average of 83.9 ns
Each putInt/getInt took an average of 1.4 ns
Each putInt/getInt took an average of 34.7 ns
Each putInt/getInt took an average of 1.3 ns
Each putInt/getInt took an average of 1.2 ns
Each putInt/getInt took an average of 1.3 ns
Each putInt/getInt took an average of 1.2 ns
Each putInt/getInt took an average of 1.2 ns
Each putInt/getInt took an average of 1.2 ns
Each putInt/getInt took an average of 1.2 ns

I am pretty sure a JNI call takes longer than 1.2 ns.
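The same loop can also be pointed at heap buffers to compare them with direct buffers on your own machine. The sketch below is an adaptation, not part of the original answer; the names are illustrative, and no timings are quoted because they depend on the JVM and hardware.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class HeapVsDirectDemo {
    // Copies src into dst one int at a time; returns the average ns per getInt/putInt.
    static double timeCopy(ByteBuffer src, ByteBuffer dst) {
        src.clear();
        dst.clear();
        long start = System.nanoTime();
        while (dst.remaining() > 0)
            dst.putInt(src.getInt());
        long time = System.nanoTime() - start;
        int operations = src.capacity() / 4 * 2;
        return (double) time / operations;
    }

    public static void main(String[] args) {
        int size = 256 * 1024;
        ByteOrder order = ByteOrder.nativeOrder();
        ByteBuffer heap1 = ByteBuffer.allocate(size).order(order);
        ByteBuffer heap2 = ByteBuffer.allocate(size).order(order);
        ByteBuffer direct1 = ByteBuffer.allocateDirect(size).order(order);
        ByteBuffer direct2 = ByteBuffer.allocateDirect(size).order(order);
        // Early iterations include JIT warm-up, so look at the later lines.
        for (int i = 0; i < 10; i++)
            System.out.printf("heap: %.1f ns, direct: %.1f ns%n",
                    timeCopy(heap1, heap2), timeCopy(direct1, direct2));
    }
}
```

A hand-rolled loop like this is only a rough indicator; a benchmark harness such as JMH would give more trustworthy numbers.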

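To demonstrate that it is not the "JNI" call itself but the overhead around it that causes the delay, you can write the same loop using Unsafe directly.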

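import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import sun.misc.Unsafe;
import sun.nio.ch.DirectBuffer;

// Relies on internal JDK classes. On JDK 9+ this may need
// --add-exports java.base/sun.nio.ch=ALL-UNNAMED to compile and run.
public class UnsafeBufferTest {
    public static void main(String... args) {
        ByteBuffer bb1 = ByteBuffer.allocateDirect(256 * 1024).order(ByteOrder.nativeOrder());
        ByteBuffer bb2 = ByteBuffer.allocateDirect(256 * 1024).order(ByteOrder.nativeOrder());
        for (int i = 0; i < 10; i++)
            runTest(bb1, bb2);
    }

    private static void runTest(ByteBuffer bb1, ByteBuffer bb2) {
        Unsafe unsafe = getTheUnsafe();
        long start = System.nanoTime();
        // Raw native addresses of the two direct buffers.
        long addr1 = ((DirectBuffer) bb1).address();
        long addr2 = ((DirectBuffer) bb2).address();
        // Copy one int at a time with no bounds checks.
        for (int i = 0, len = Math.min(bb1.capacity(), bb2.capacity()); i < len; i += 4)
            unsafe.putInt(addr1 + i, unsafe.getInt(addr2 + i));
        long time = System.nanoTime() - start;
        int operations = bb1.capacity() / 4 * 2;
        System.out.printf("Each putInt/getInt took an average of %.1f ns%n", (double) time / operations);
    }

    // Obtains the Unsafe singleton reflectively, since Unsafe.getUnsafe() rejects application code.
    public static Unsafe getTheUnsafe() {
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            return (Unsafe) theUnsafe.get(null);
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    }
}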

prints

Each putInt/getInt took an average of 40.4 ns
Each putInt/getInt took an average of 44.4 ns
Each putInt/getInt took an average of 0.4 ns
Each putInt/getInt took an average of 0.3 ns
Each putInt/getInt took an average of 0.3 ns
Each putInt/getInt took an average of 0.3 ns
Each putInt/getInt took an average of 0.3 ns
Each putInt/getInt took an average of 0.3 ns
Each putInt/getInt took an average of 0.3 ns
Each putInt/getInt took an average of 0.3 ns

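So you can see that the native call is much faster than you would expect of a JNI call. The main reason for the remaining delay could be L2 cache speed.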

(Editor: 李大同)
