scala – collection.mutable.OpenHashMap vs collection.mutabl

Published: 2020-12-16 19:13:07 | Category: Security | Source: compiled from the web

For put and get operations, OpenHashMap is about 5 times faster than HashMap: https://gist.github.com/1423303

Are there cases where HashMap should be preferred over OpenHashMap?

Solution

Your code exactly matches one of the use cases OpenHashMap is designed for. Your code:

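// `time` and `warmup` are helpers that are not shown here (presumably defined in the linked gist).
println("scala OpenHashMap: " + time(warmup) {
  val m = new scala.collection.mutable.OpenHashMap[Int, Int]
  var i = 0
  var start = System.currentTimeMillis()
  // Insert 100,000 consecutive integer keys.
  while (i < 100000) { m.put(i, i); i = i + 1 }
})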
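
The quoted snippet depends on `time` and `warmup`, which are not shown. As a rough, self-contained sketch of the same measurement, the version below substitutes a hypothetical `timeMillis` helper and a crude warm-up loop; both are assumptions for illustration, not the code from the gist. On Scala 2.13 and later, `OpenHashMap` compiles with a deprecation warning.

```scala
import scala.collection.mutable

object MapPutBenchmark {
  // Hypothetical helper (not from the original gist): evaluates `body` once
  // and returns the elapsed wall-clock time in milliseconds.
  def timeMillis[A](body: => A): Long = {
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1000000L
  }

  // Inserts the consecutive keys 0 until n, as in the quoted loop,
  // and returns the resulting map size.
  def fill(m: mutable.Map[Int, Int], n: Int): Int = {
    var i = 0
    while (i < n) { m.put(i, i); i += 1 }
    m.size
  }

  def main(args: Array[String]): Unit = {
    val n = 100000
    // Crude warm-up so the JIT has likely compiled both code paths before timing.
    for (_ <- 1 to 20) {
      fill(new mutable.OpenHashMap[Int, Int], n)
      fill(new mutable.HashMap[Int, Int], n)
    }
    println("scala OpenHashMap: " + timeMillis(fill(new mutable.OpenHashMap[Int, Int], n)) + " ms")
    println("scala HashMap:     " + timeMillis(fill(new mutable.HashMap[Int, Int], n)) + " ms")
  }
}
```

A single timed run like this is noisy; the numbers depend on the JVM, the Scala version, and the key distribution, so treat them as indicative only.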

The explanation from the OpenHashMap scaladoc:

A mutable hash map based on an open hashing scheme. The precise scheme
is undefined, but it should make a reasonable effort to ensure that an
insert with consecutive hash codes is not unnecessarily penalised. In
particular, mappings of consecutive integer keys should work without
significant performance loss.

Emphasis mine. This explains your findings. When should you use OpenHashMap rather than HashMap? See Wikipedia. From there:

Chained hash tables with linked lists are popular because they require
only basic data structures with simple algorithms, and can use simple
hash functions that are unsuitable for other methods.

The cost of a table operation is that of scanning the entries of the
selected bucket for the desired key. If the distribution of keys is
sufficiently uniform, the average cost of a lookup depends only on the
average number of keys per bucket, that is, on the load factor.

Chained hash tables remain effective even when the number of table
entries n is much higher than the number of slots. Their performance
degrades more gracefully (linearly) with the load factor. For example,
a chained hash table with 1000 slots and 10,000 stored keys (load
factor 10) is five to ten times slower than a 10,000-slot table (load
factor 1); but still 1000 times faster than a plain sequential list,
and possibly even faster than a balanced search tree.

For separate-chaining, the worst-case scenario is when all entries
were inserted into the same bucket, in which case the hash table is
ineffective and the cost is that of searching the bucket data
structure. If the latter is a linear list, the lookup procedure may
have to scan all its entries; so the worst-case cost is proportional
to the number n of entries in the table.
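
To make the load-factor argument concrete, here is a toy separate-chaining table that counts how many chain entries a successful lookup scans. It is only a sketch of the general technique described in the quotes; `ChainedSet`, `probes`, and `averageProbes` are hypothetical names, and this is not how `scala.collection.mutable.HashMap` is implemented.

```scala
object ChainingSketch {
  // Toy separate-chaining table, for illustration only: each slot holds an
  // immutable List of keys, and a lookup scans the list in the selected slot.
  final class ChainedSet(slots: Int, hash: Int => Int) {
    private val buckets = Array.fill(slots)(List.empty[Int])

    private def slotOf(key: Int): Int = java.lang.Math.floorMod(hash(key), slots)

    def add(key: Int): Unit = {
      val s = slotOf(key)
      if (!buckets(s).contains(key)) buckets(s) = key :: buckets(s)
    }

    // Number of entries examined before `key` is found (or the chain ends).
    def probes(key: Int): Int = {
      val chain = buckets(slotOf(key))
      val idx = chain.indexOf(key)
      if (idx >= 0) idx + 1 else chain.length
    }
  }

  // Average number of entries scanned per successful lookup of keys 0 until n.
  def averageProbes(slots: Int, n: Int, hash: Int => Int): Double = {
    val t = new ChainedSet(slots, hash)
    (0 until n).foreach(k => t.add(k))
    (0 until n).map(k => t.probes(k)).sum.toDouble / n
  }

  def main(args: Array[String]): Unit = {
    val n = 10000
    println(averageProbes(10000, n, k => k)) // load factor 1: prints 1.0
    println(averageProbes(1000, n, k => k))  // load factor 10: prints 5.5
    println(averageProbes(1000, n, _ => 0))  // all keys in one bucket: prints 5000.5
  }
}
```

With uniformly spread keys the scan cost grows linearly with the load factor (1.0 versus 5.5 entries here), while a hash function that sends every key to the same bucket degrades to a scan proportional to n.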

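That is the general explanation. As always with these things, your performance will vary depending on the use case; if you care about it, you need to measure it.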

(Editor: 李大同)
