Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
937 views
in Technique[技术] by (71.8m points)

multithreading - Multithread program in C++ shows the same performance as a serial one

I just want to write a simple program in C++, which creates two threads and each of them fills vector by squares of integers (0, 1, 4, 9, ...). Here is my code:

#include <iostream>
#include <vector>
#include <functional>
#include <thread>
#include <time.h>

#define MULTI 1
#define SIZE 10000000

void fill(std::vector<unsigned long long int> &v, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        v.push_back(i * i);
    }
}

int main()
{
    std::vector<unsigned long long int> v1, v2;
    v1.reserve(SIZE);
    v2.reserve(SIZE);
    #if !MULTI
    clock_t t = clock();
    fill(v1, SIZE);
    fill(v2, SIZE);
    t = clock() - t;
    #else
    clock_t t = clock();
    std::thread first(fill, std::ref(v1), SIZE);
    fill(v2, SIZE);
    first.join();
    t = clock() - t;
    #endif
    std::cout << (float)t / CLOCKS_PER_SEC << std::endl;
    return 0;
}

But when I run my program, I see, that there is no significant difference between the serial version and the parallel one (or sometimes parallel version shows even worse results). Any idea what happens?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

When I execute your code with MSVC2015 on a i7, I observe:

  • in debug mode, multithread is 14s compared to 26s in monothread. So it's almost twice as fast. The results are as expected.
  • in release mode, multithread is 0.3 compared to 0.2 in monothread, so it's slower, as you've reported.

This suggest that your issue is related to the fact that the optimized fill() is too short compared to the overhead of creating a thread.

Note also that even when there is enought work to do in fill() (e.g. the unoptimized version), the multithread will not multiply the time by two. Multithreading will increase overall throughput per second on a multicore processor, but each thread taken separately might run a little bit slower than usual.

Edit: additional information

The multithreading performance depends on a lot of factors, among others, for example the number of cores on your processor, the cores used by other processes running during the test, and as remarked by doug in his comment, the profile of the multithreaded task (i.e. memory vs. computing).

To illustrate this, here the results of an informal benchmark that shows that decrease of individual thread throughput is much faster for memory intensive than for floating point intensive computations, and global throughput grows much slower (if at all):

enter image description here

Using the following functions for each thread :

// computation intensive
void mytask(unsigned long long loops)
{
    volatile double x; 
    for (unsigned long long i = 0; i < loops; i++) {
        x = sin(sqrt(i) / i*3.14159);
    }
}

//memory intensive
void mytask2(vector<unsigned long long>& v, unsigned long long loops)
{
    for (unsigned long long i = 0; i < loops; i++) {
        v.push_back(i*3+10);
    }
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...