在c中使用多线程快速排序
我使用多线程方法实现了一个quicksort程序,在C中使用Portfolio任务.
但我不确定什么是对的!在我看来,在一个线程中,算法的工作速度比两个或四个线程快.我能以某种方式搞砸同步吗? 谢谢任何人帮助我. 码: #include <thread> #include <chrono> #include <mutex> #include <condition_variable> #include <iostream> #include <queue> #include <vector> #include <set> #include <ctime> #include <algorithm> using namespace std; //print array template <typename T> void print(const vector<T> &arr) { for (size_t i = 0; i < arr.size(); i++) cout << arr[i] << " "; cout << endl; } //queue tasks queue< pair<int,int> > tasks; //mutexs for set and queue task mutex q_mutex,s_mutex; //condition variable condition_variable cv; //set set<int> ss; //partition algorithm template <typename T> int partition(vector<T> &arr,int l,int r) { T tmp = arr[r]; //as pivot element int i = l - 1; for (int j = l; j <= r - 1; j++) if (arr[j] < tmp) { i++; swap(arr[i],arr[j]); } swap(arr[i + 1],arr[r]); i++; return i; } //quick sort template <typename T> void quick_sort(vector<T> &arr) { while (true) { unique_lock<mutex> u_lock(q_mutex); //lock mutex //sort is fineshed if ( ss.size() == arr.size() ) //u_lock.unlock() return; //if queue task is not empty if ( tasks.size() > 0 ) { //get task from queue pair<int,int> cur_task = tasks.front(); tasks.pop(); int l = cur_task.first,r = cur_task.second; if (l < r) { int q = partition(arr,l,r); //split array //Add indexes in set s_mutex.lock(); ss.insert(q); ss.insert(l); ss.insert(r); s_mutex.unlock(); //push new tasks for left and right part tasks.push( make_pair(l,q - 1) ); tasks.push( make_pair(q + 1,r) ); //wakeup some thread which waiting cv.notify_one(); } } else //if queue is empty cv.wait(u_lock); } } //Size array const int ARR_SIZE = 100000; //Count threads const int THREAD_COUNT = 8; thread thrs[THREAD_COUNT]; //generatin array void generate_arr(vector<int> &arr) { srand(time( NULL )); std::generate(arr.begin(),arr.end(),[](){return rand() % 10000; }); } //check for sorting bool is_sorted(const vector<int> &arr) { for (size_t i = 0; i < arr.size() - 1; i++) if ( ! (arr[i] <= arr[i + 1]) ) return false; return true; } int main() { //time clock_t start,finish; vector<int> arr(ARR_SIZE); //generate array generate_arr(arr); cout << endl << "Generating finished!" << endl << endl; cout << "Array before sorting" << endl << endl; //Before sorting print(arr); cout << endl << endl; cout << "Checking is_sorted finished! The result is " << (is_sorted(arr) == 0? "false": "true") << "." << endl << endl; //add task tasks.push( make_pair(0,arr.size() - 1) ); //================================================== start = clock(); for (int i = 0; i < THREAD_COUNT; i++) thrs[i] = thread( quick_sort<int>,ref(arr) ); finish = clock(); //================================================== for (auto& th : thrs) th.join(); cout << "Sorting finished!" << endl << endl; cout << "Array after sorting" << endl << endl; //After sorting print(arr); cout << endl << endl; cout << "Checking is_sorted finished! The result is " << (is_sorted(arr) == 0? "false": "true") << "." << endl << endl; cout << "Runtime: " << (double)(finish - start) / CLOCKS_PER_SEC << endl; return 0; } 解决方法
与性能问题相关的线程数要多得多.其中,
>您需要具有实际并发性,而不仅仅是多个线程.正如@ Rakete1111和@ user1034749都观察到的那样,你没有. 以下是一些可以提高多线程性能的方法: >在方法quick_sort()中,不要在实际排序期间锁定互斥锁q_mutex,就像您目前所做的那样(您使用的unique_lock构造函数锁定互斥锁,并且您不会在unique_lock的生命周期内解锁它). 您还可以考虑增加运行测试的元素数量. 100000并不是那么多,您可能会看到更大问题的不同性能特征.在现代硬件上进行这样的测试,1000000个元素根本不合理. (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |