Win32 native monitor is much faster than boost

Windows Vista (finally!) introduced APIs to build a monitor object.

http://msdn2.microsoft.com/en-us/library/ms683469.aspx

So, how’s performance? Let’s benchmark it.

I wrote two bounded FIFO classes. One is built on the Boost threading library (http://www.boost.org/doc/html/thread.html) and the other on the new Win32 APIs. The test application starts two threads: one pushes 10M integers (1024 * 1024 * 10) into a queue with 100 slots, and the other pulls them out.

The result was surprising.

The Win32 API FIFO is 13.5 times faster than the Boost one (about 5.8 seconds versus 77.8 seconds of elapsed time).

Wow.

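// ----------------------
//  Boost FIFO
// ----------------------
#include <deque>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition.hpp>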
class Queue {
public:
    Queue(size_t queueSize) :
        m_queueSize(queueSize){
    }

    ~Queue(){
    }
    void Push(const int& obj){
        boost::mutex::scoped_lock lock(m_mutex);

        while(m_queue.size() == m_queueSize){
            // no slot available
            m_waitForSlot.wait(lock);
        }
        m_queue.push_back(obj);
        m_waitForObj.notify_one();
    }

    void Pull(int& obj){
        boost::mutex::scoped_lock lock(m_mutex);
        while(m_queue.empty()){
            // no obj available
            m_waitForObj.wait(lock);
        }
        obj = m_queue.front();
        m_queue.pop_front();
        m_waitForSlot.notify_one();
    }
private:
    mutable boost::mutex m_mutex;
    boost::condition m_waitForSlot;
    boost::condition m_waitForObj;
    const size_t m_queueSize;
    std::deque<int> m_queue;
};


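// ----------------------
//  Win32 FIFO
// ----------------------
// Condition variables require Windows Vista or later (_WIN32_WINNT >= 0x0600).
#include <windows.h>
#include <deque>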
class Queue {
private:
    CONDITION_VARIABLE m_waitForObj;
    CONDITION_VARIABLE m_waitForSlot;
    CRITICAL_SECTION m_cs;

    std::deque< int > m_queue;
    size_t m_queueSize;  // capacity; size_t matches the type of m_queue.size()
public:
    Queue(size_t queueSize)
        :m_queueSize(queueSize){
        InitializeConditionVariable(&m_waitForObj);
        InitializeConditionVariable(&m_waitForSlot);
        InitializeCriticalSection(&m_cs);
    }
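    ~Queue(){
        // Release the critical section; condition variables need no cleanup.
        DeleteCriticalSection(&m_cs);
    }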

    void Push(int v){
        EnterCriticalSection(&m_cs);
        while(m_queueSize == m_queue.size()){
            // no slot is available
            SleepConditionVariableCS(&m_waitForSlot, &m_cs, INFINITE);
        }
        m_queue.push_back(v);
        WakeConditionVariable(&m_waitForObj);
        LeaveCriticalSection(&m_cs);
    }

    void Pull(int& v){
        EnterCriticalSection(&m_cs);
        while(m_queue.empty()){
            // no obj is available
            SleepConditionVariableCS(&m_waitForObj, &m_cs, INFINITE);
        }
        v = m_queue.front();
        m_queue.pop_front();
        WakeConditionVariable(&m_waitForSlot);
        LeaveCriticalSection(&m_cs);
    }
};


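// ----------------------
//  Test code
// ----------------------
#include <iostream>
#include <boost/thread/thread.hpp>
#include <boost/timer.hpp>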

Queue queue(100);

void PushLoop(){
    for(int i = 0; i < 1024 * 1024 * 10; i++){
        queue.Push(i);
    }
}

void PullLoop(){
    for(int i = 0; i < 1024 * 1024 * 10; i++){
        int v;
        queue.Pull(v);
    }
}

int main(){
    boost::timer tim;
    boost::thread pushThread(PushLoop);
    boost::thread pullThread(PullLoop);
    pushThread.join();
    pullThread.join();
    std::cout << tim.elapsed() << std::endl;
}
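
For readers who have neither Boost nor a Vista-era SDK at hand, the same monitor pattern can be sketched with the C++11 standard library. This is not one of the two implementations measured in this post, and no timing is claimed for it; the names `StdQueue` and `RunPushPull` are made up for illustration. It only shows the shape of the bounded FIFO and of the push/pull test in portable form.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical name. Same monitor pattern as the two Queue classes above,
// written against std::mutex / std::condition_variable (an assumption,
// not the code that was benchmarked).
class StdQueue {
public:
    explicit StdQueue(std::size_t capacity) : m_capacity(capacity) {
        assert(capacity > 0);  // a zero-slot queue would block Push forever
    }

    void Push(int v) {
        std::unique_lock<std::mutex> lock(m_mutex);
        // Sleep while the queue is full; the predicate re-checks after every wakeup.
        m_waitForSlot.wait(lock, [this] { return m_queue.size() < m_capacity; });
        m_queue.push_back(v);
        m_waitForObj.notify_one();
    }

    void Pull(int& v) {
        std::unique_lock<std::mutex> lock(m_mutex);
        // Sleep while the queue is empty.
        m_waitForObj.wait(lock, [this] { return !m_queue.empty(); });
        v = m_queue.front();
        m_queue.pop_front();
        m_waitForSlot.notify_one();
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_waitForSlot;
    std::condition_variable m_waitForObj;
    const std::size_t m_capacity;
    std::deque<int> m_queue;
};

// Pushes 0..count-1 from one thread and pulls them from another.
// Returns the sum of the pulled values so the caller can check that
// nothing was lost or duplicated.
long long RunPushPull(std::size_t capacity, int count) {
    StdQueue queue(capacity);
    long long sum = 0;
    std::thread pusher([&] {
        for (int i = 0; i < count; ++i) queue.Push(i);
    });
    std::thread puller([&] {
        for (int i = 0; i < count; ++i) {
            int v = 0;
            queue.Pull(v);
            sum += v;
        }
    });
    pusher.join();
    puller.join();
    return sum;
}
```

Calling `RunPushPull(100, 1024 * 1024 * 10)` mirrors the queue size and iteration count of the test code above. Passing a predicate to `wait` is equivalent to the explicit `while` loops in the Boost and Win32 versions.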

Each test was run 5 times and the results were averaged with the timeit.exe command.

PS F:repostrunkrzxxxsandboxwin32_condition_var> timeit -f boost.dat
Average for F:repostrunkrzxxxsa key over 5 runs

Version Number:   Windows NT 6.0 (Build 6000)
Exit Time:        1173:02 pm, Monday, January 1 1601
Elapsed Time:     0:01:17.822
Process Time:     0:01:46.189
System Calls:     108784020
Context Switches: 26967278
Page Faults:      56206
Bytes Read:       1991820
Bytes Written:    35648
Bytes Other:      208091


PS F:repostrunkrzxxxsandboxwin32_condition_var> timeit -f win32.dat
Average for .win32cv.exe key over 5 runs

Version Number:   Windows NT 6.0 (Build 6000)
Exit Time:        1173:02 pm, Monday, January 1 1601
Elapsed Time:     0:00:05.767
Process Time:     0:00:08.888
System Calls:     2266545
Context Switches: 1832126
Page Faults:      5391
Bytes Read:       283530
Bytes Written:    10094
Bytes Other:      25284
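
The two reports can be compared metric by metric with a few lines of arithmetic. The sketch below copies the figures from the timeit output above (elapsed and process times converted to seconds); the struct and function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Figures copied from the two timeit reports above (averages over 5 runs).
struct TimeitReport {
    double elapsedSec;       // Elapsed Time
    double processSec;       // Process Time
    double systemCalls;      // System Calls
    double contextSwitches;  // Context Switches
};

const TimeitReport kBoost = {77.822, 106.189, 108784020.0, 26967278.0};
const TimeitReport kWin32 = {5.767, 8.888, 2266545.0, 1832126.0};

// How many times larger the Boost figure is than the Win32 figure.
double Ratio(double boostValue, double win32Value) {
    assert(win32Value > 0.0);
    return boostValue / win32Value;
}
```

`Ratio(kBoost.elapsedSec, kWin32.elapsedSec)` gives roughly 13.5, which is where the headline number comes from. The same division gives roughly 11.9 for process time, 14.7 for context switches, and 48.0 for system calls, so the Boost build makes far more system calls than its elapsed time alone suggests.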

About Moto

Engineer who likes coding
This entry was posted in Optimization.