
Is C#'s TPL faster than C++'s PPL?

Published: 2020-12-16 05:05:50 | Category: Encyclopedia | Source: compiled from the web
I wrote a very simple application that uses a Fibonacci function to compare TPL's Parallel.ForEach with PPL's parallel_for_each. The result is very strange: on a PC with 8 cores, the C# version is 11 seconds faster than the C++ one.

Both VS2010 and the 2011 preview give the same results.

C# code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Diagnostics;

namespace ConsoleApplication1
{
    class Program
    {

        static void Main(string[] args)
        {
            var ll = new ConcurrentQueue<Tuple<int, int>>();
            // Note: fibonacci(47) = 2971215073 overflows Int32 and wraps silently.
            var a = new int[12] { 40, 41, 42, 43, 44, 45, 46, 47, 35, 25, 36, 37 };

            long elapsed = time_call(() =>
            {
                Parallel.ForEach(a, (n) => { ll.Enqueue(new Tuple<int, int>(n, fibonacci(n))); });
            });

            Console.WriteLine("TPL C# elapsed time: " + elapsed + "\r\n");
            foreach (var ss in ll)
            {
                Console.WriteLine(String.Format("fib<{0}>: {1}", ss.Item1, ss.Item2));
            }

            Console.ReadLine();
        }

        // Returns the elapsed wall-clock time of f in milliseconds.
        static long time_call(Action f)
        {
            var p = Stopwatch.StartNew(); // StartNew already starts the stopwatch
            f();
            p.Stop();
            return p.ElapsedMilliseconds;
        }

        // Computes the nth Fibonacci number.
        static int fibonacci(int n)
        {
            if (n < 2) return n;
            return fibonacci(n - 1) + fibonacci(n - 2);
        }
    }
}

C++ code:

#include <windows.h>
#include <ppl.h>
#include <concurrent_vector.h>
#include <array>
#include <tuple>
#include <algorithm>
#include <iostream>

using namespace Concurrency;
using namespace std;

template <class Function>
__int64 time_call(Function&& f) {
    __int64 begin = GetTickCount();
    f();
    return GetTickCount() - begin;
}

// Computes the nth Fibonacci number.
int fibonacci(int n) {
    if (n < 2) return n;
    return fibonacci(n-1) + fibonacci(n-2);
}

int wmain() {
    __int64 elapsed;
    // Same inputs as the C# version; fibonacci(47) overflows a 32-bit int here too.
    array<int, 12> a = { 40, 41, 42, 43, 44, 45, 46, 47, 35, 25, 36, 37 };
    concurrent_vector<tuple<int, int>> results2;

    elapsed = time_call([&] {
        parallel_for_each(a.begin(), a.end(), [&](int n) {
            results2.push_back(make_tuple(n, fibonacci(n)));
        });
    });

    wcout << L"PPL time: " << elapsed << L" ms" << endl << endl;
    for_each(results2.begin(), results2.end(), [](tuple<int, int>& pair) {
        wcout << L"fib(" << get<0>(pair) << L"): " << get<1>(pair) << endl;
    });

    cin.ignore();
}

Can you point out which part of my C++ code is wrong?

With task_group I get timings like the C# code:

task_group tasks;
elapsed = time_call([&]
{
    for_each(begin(a), end(a), [&](int n)
    {
        tasks.run([&, n] { results2.push_back(make_tuple(n, fibonacci(n))); });
    });
    tasks.wait();
});
Solution

Here is the explanation from Rahul V. Patil of the Microsoft team:

Hello,

Thanks for bringing this up. Indeed, you've identified the overhead
associated with the default parallel for – especially when the
number of iterations is small and the work size is variable. The
default parallel for starts off by breaking the work down into 8
chunks (on 8 cores). As the work finishes, it is dynamically
load-balanced. The default works great in most cases (a large number
of iterations), and when the underlying work per iteration is not
well understood (say, you call into a library) – but it does come
with unacceptable overheads in some cases.

The solution is exactly what you've identified in your alternate
implementation. To that effect, we'll have a parallel for partitioner
called "simple" in the next version of Visual Studio, which will be
similar to the alternate implementation you describe and will have
much better performance.

PS: The C# and C++ parallel for each implementations use slightly
different algorithms in how they go through the iterations – hence you
will see slightly different performance characteristics depending on
the workload.

Regards
