Program timing under Windows

[C/C++] timing function comparison

Many timing functions are available. The usual pattern is to call the timing function and record the current time tstart, run the section of code to be measured, call the timing function again to record the end time tend, and then subtract tstart from tend to get the execution time. The functions differ in accuracy, however. The notes below briefly record several of them.

Method 1

time() returns the current system time. The result has type time_t, which is in practice a large integer: the number of seconds from 00:00:00 on January 1, 1970, UTC (Coordinated Universal Time), known as the UNIX epoch, to the current time.

void test1()
{
    time_t start,stop;
    start = time(NULL);
    foo();//dosomething
    stop = time(NULL);
    printf("Use Time:%ld\n",(long)(stop-start));   //cast so the argument matches %ld
}
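
Subtracting two time_t values directly assumes that time_t is a plain count of seconds. The standard difftime() function makes no such assumption. A minimal sketch follows; elapsed_seconds is a hypothetical helper name, not part of the original listing.

```c
#include <time.h>

/* Hypothetical helper: seconds between two time() readings.
   difftime(end, start) returns end - start as a double. */
double elapsed_seconds(time_t start, time_t stop)
{
    return difftime(stop, start);
}
```

In test1 this would be printed with printf("Use Time:%f\n", elapsed_seconds(start, stop)).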
Method 2

clock() returns the number of CPU clock ticks elapsed between the start of the program's process and the call to clock(). MSDN calls this wall-clock time. The constant CLOCKS_PER_SEC gives the number of clock ticks per second. On POSIX systems such as Linux, clock() measures the processor time used by the process rather than wall-clock time.

void test2()
{
    double dur;
    clock_t start,end;
    start = clock();
    foo();//dosomething
    end = clock();
    dur = (double)(end - start);
    printf("Use Time:%f\n",(dur/CLOCKS_PER_SEC));
}
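
The pattern in test2 can be wrapped in a reusable helper. The sketch below is an illustration rather than the author's code: ticks_to_seconds, cpu_seconds and the function-pointer parameter are assumed names.

```c
#include <time.h>

/* Convert a clock_t tick count to seconds. */
double ticks_to_seconds(clock_t ticks)
{
    return (double)ticks / CLOCKS_PER_SEC;
}

/* Hypothetical helper: run work() once and return the time that
   clock() reports for it, in seconds. */
double cpu_seconds(void (*work)(void))
{
    clock_t start = clock();
    work();
    clock_t end = clock();
    return ticks_to_seconds(end - start);
}
```

With the foo() function from this article, the call would be cpu_seconds(foo).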
Method 3

timeGetTime() returns the system time in milliseconds, measured as the time elapsed since the system started. It is part of the Windows API, and programs that use it must link against winmm.lib.

void test3()
{
    DWORD t1,t2;
    t1 = timeGetTime();
    foo();//dosomething
    t2 = timeGetTime();
    printf("Use Time:%f\n",(t2-t1)*1.0/1000);
}
Method 4

QueryPerformanceCounter() returns the value of the high-resolution performance counter, which can time at microsecond granularity. However, the actual smallest unit of the counter depends on the system, so you must query the system for the frequency of the ticks that QueryPerformanceCounter() returns. QueryPerformanceFrequency() provides this value as the number of ticks per second.

void test4()
{
    LARGE_INTEGER t1,t2,tc;
    QueryPerformanceFrequency(&tc);
    QueryPerformanceCounter(&t1);
    foo();//dosomething
    QueryPerformanceCounter(&t2);
    printf("Use Time:%f\n",(t2.QuadPart - t1.QuadPart)*1.0/tc.QuadPart);
}
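
The expression in test4 converts the tick difference to seconds in floating point. When an integer microsecond value is wanted instead, multiplying the tick count by 1,000,000 first can overflow 64 bits for large counts. The sketch below splits the count into whole seconds and a remainder. Plain int64_t values stand in for the QuadPart fields so that it compiles on any platform, and ticks_to_us is an assumed name.

```c
#include <stdint.h>

/* Convert a tick difference to microseconds.
   ticks: counter difference (t2.QuadPart - t1.QuadPart in test4)
   freq:  ticks per second (tc.QuadPart in test4), must be > 0
   Splitting into whole seconds and a remainder keeps the
   intermediate products small. */
int64_t ticks_to_us(int64_t ticks, int64_t freq)
{
    int64_t whole = ticks / freq;   /* complete seconds */
    int64_t rest  = ticks % freq;   /* leftover ticks, less than freq */
    return whole * 1000000 + rest * 1000000 / freq;
}
```

The result is truncated toward zero, so a difference shorter than one microsecond is reported as 0.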
Method 5

GetTickCount() returns the number of milliseconds elapsed since the operating system started. Its return type is DWORD.

void test5()
{
    DWORD t1,t2;
    t1 = GetTickCount();
    foo();//dosomething
    t2 = GetTickCount();
    printf("Use Time:%f\n",(t2-t1)*1.0/1000);
}
Method 6

Method 6 uses the RDTSC instruction. CPUs from the Intel Pentium onward contain a component called the Time Stamp Counter, which records the number of clock cycles since the CPU was powered on as a 64-bit unsigned integer. Because current CPU frequencies are very high, this counter can provide nanosecond-level timing accuracy, which none of the methods above can match. These CPUs provide a machine instruction, RDTSC (Read Time Stamp Counter), that reads the counter and stores it in the EDX:EAX register pair. EDX:EAX also happens to be the register pair in which C++ returns a 64-bit function value on the Win32 platform, so the instruction can be treated like an ordinary function call. Because the C++ inline assembler does not support RDTSC directly, the _emit pseudo-instruction is used to embed the instruction's machine code bytes, 0x0F and 0x31.

inline unsigned __int64 GetCycleCount()
{
    __asm
    {
        _emit 0x0F;
        _emit 0x31;
    }
}

void test6()
{
    unsigned __int64 t1,t2;   //keep all 64 bits; unsigned long is only 32 bits on Win32
    t1 = GetCycleCount();
    foo();//dosomething
    t2 = GetCycleCount();
    printf("Use Time:%f\n",(t2 - t1)*1.0/FREQUENCY);   //FREQUENCY is the CPU clock frequency in Hz
}
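
The time-stamp counter is 64 bits wide, and the full width matters. If the value is truncated to 32 bits (unsigned long is 32 bits on Win32), the truncated count wraps within a few seconds at GHz clock rates. A small sketch of that arithmetic, with wrap_seconds as a hypothetical helper:

```c
/* Seconds until a counter of the given bit width wraps when it
   advances at hz ticks per second. */
double wrap_seconds(unsigned bits, double hz)
{
    double range = 1.0;
    unsigned i;
    for (i = 0; i < bits; i++)
        range *= 2.0;               /* range ends up as 2^bits */
    return range / hz;
}
```

At 1.6GHz, wrap_seconds(32, 1.6e9) is roughly 2.7 seconds, while the full 64-bit counter would take centuries to wrap.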
Method 7

gettimeofday() is a timing function on Linux, declared as int gettimeofday(struct timeval *tv, struct timezone *tz). It stores the current time in the structure pointed to by tv and the local time zone information in the structure pointed to by tz.

//The timeval structure is defined as:
struct timeval{
    long tv_sec;  /*seconds*/
    long tv_usec; /*microseconds*/
};
//The timezone structure is defined as:
struct timezone{
    int tz_minuteswest; /*minutes west of Greenwich*/
    int tz_dsttime;     /*type of daylight saving time correction*/
};
void test7()
{
    struct timeval t1,t2;
    double timeuse;
    gettimeofday(&t1,NULL);
    foo();
    gettimeofday(&t2,NULL);
    timeuse = t2.tv_sec - t1.tv_sec + (t2.tv_usec - t1.tv_usec)/1000000.0;
    printf("Use Time:%f\n",timeuse);
}
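
In the expression used by test7, the tv_usec difference is negative whenever the second rolls over between the two readings, and the seconds term compensates for it. An integer version makes that carry explicit. This is a sketch, and timeval_diff_us is an assumed helper name.

```c
#include <stdint.h>
#include <sys/time.h>

/* Microseconds elapsed from *t1 to *t2, in integer arithmetic. */
int64_t timeval_diff_us(const struct timeval *t1, const struct timeval *t2)
{
    int64_t sec  = (int64_t)t2->tv_sec  - (int64_t)t1->tv_sec;
    int64_t usec = (int64_t)t2->tv_usec - (int64_t)t1->tv_usec;  /* may be negative */
    return sec * 1000000 + usec;
}
```

gettimeofday() reports wall-clock time, so the result can jump if the system clock is adjusted during the measurement. POSIX clock_gettime() with CLOCK_MONOTONIC avoids that for interval timing.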
Method 8

On Linux, timing with the RDTSC instruction works the same way as in Method 6, but the implementation is slightly different.

#if defined (__i386__)
static __inline__ unsigned long long GetCycleCount(void)
{
        unsigned long long int x;
        __asm__ volatile("rdtsc":"=A"(x));
        return x;
}
#elif defined (__x86_64__)
static __inline__ unsigned long long GetCycleCount(void)
{
        unsigned hi,lo;
        __asm__ volatile("rdtsc":"=a"(lo),"=d"(hi));
        return ((unsigned long long)lo)|(((unsigned long long)hi)<<32);
}
#endif

void test8()
{
        unsigned long long t1,t2;   //unsigned long is only 32 bits on i386
        t1 = GetCycleCount();
        foo();//dosomething
        t2 = GetCycleCount();
        printf("Use Time:%f\n",(t2 - t1)*1.0/FREQUENCY); //FREQUENCY is the CPU clock frequency in Hz
}
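
FREQUENCY is not defined in the listings. It stands for the CPU clock rate in Hz and has to be supplied by the reader. A sketch of the conversion with the rate passed as a parameter follows; cycles_to_seconds is an assumed name, and the result is only meaningful if the counter ticks at a constant rate.

```c
#include <stdint.h>

/* Convert a cycle-count difference to seconds.
   cpu_hz: counter rate in Hz, for example 2.67e9 for a 2.67GHz CPU. */
double cycles_to_seconds(uint64_t cycles, double cpu_hz)
{
    return (double)cycles / cpu_hz;
}
```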
A simple comparison table follows:

No.  Function                 Type                          Accuracy  Resolution
1    time                     C system call                 low       <1s
2    clock                    C system call                 low       <10ms
3    timeGetTime              Windows API                   medium    <1ms
4    QueryPerformanceCounter  Windows API                   high      <0.1ms
5    GetTickCount             Windows API                   medium    <1ms
6    RDTSC                    instruction                   high      <0.1ms
7    gettimeofday             C system call (Linux)         high      <0.1ms

In summary, methods 1, 2, 7 and 8 work on Linux, and methods 1, 2, 3, 4, 5 and 6 work on Windows. Note that timeGetTime() and GetTickCount() both return a DWORD, a 32-bit unsigned value. Once the millisecond count exceeds what it can hold (after about 49.7 days of uptime), it wraps around to 0, which can distort the measurement.
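
For short measurements the wraparound is less harmful than it sounds, because unsigned subtraction is modular: t2 - t1 is still correct across a single wrap. The sketch below uses uint32_t to stand in for DWORD so that it compiles on any platform; elapsed_ms is a hypothetical name.

```c
#include <stdint.h>

/* Elapsed milliseconds between two 32-bit tick readings.
   Unsigned arithmetic wraps modulo 2^32, so the result is right
   as long as the true interval is shorter than 2^32 ms. */
uint32_t elapsed_ms(uint32_t start, uint32_t end)
{
    return end - start;
}
```

For intervals that may be longer than that, the Windows API also offers GetTickCount64(), which returns a 64-bit count.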
The test results on Windows, with a CPU clock frequency of 1.6GHz, are as follows (in seconds):

1 Use Time:0
2 Use Time:0.390000
3 Use Time:0.388000
4 Use Time:0.394704
5 Use Time:0.407000
6 Use Time:0.398684
On Linux, with a CPU clock frequency of 2.67GHz, the results are as follows (in seconds):

1 Use Time:1
2 Use Time:0.290000
7 Use Time:0.288476
8 Use Time:0.297843
Because the precision of time() is low, running the program several times gives different results: sometimes 0 and sometimes 1.
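
The 0-or-1 behaviour follows from time() reporting whole seconds only: a run of a few tenths of a second shows up as 1 exactly when it crosses a second boundary. A sketch of that arithmetic, with whole_second_delta as a hypothetical helper and the start times chosen arbitrarily:

```c
/* Difference between two whole-second readings for a run that
   begins at start_s and lasts duration_s (both non-negative, in
   seconds). The casts truncate, as a whole-second clock does. */
long whole_second_delta(double start_s, double duration_s)
{
    return (long)(start_s + duration_s) - (long)start_s;
}
```

A 0.39 s run starting at 10.2 s gives 0, while the same run starting at 10.7 s gives 1.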

The foo() function is as follows:

void foo()
{
    long i;
    for (i=0;i<100000000;i++)
    {
        long a= 0;
        a = a+1;
    }
}
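
One caveat with this benchmark body: the variable a has no observable effect, so a compiler with optimizations enabled may delete the loop, and every method would then report almost zero. A hedged variant, not the function used for the results above, keeps the work alive with a volatile variable. busy_loop is an assumed name, and unlike foo() it accumulates the count instead of resetting it on each iteration.

```c
/* Variant of foo() whose work cannot be optimized away: each
   access to a volatile object must actually be performed. */
long busy_loop(long n)
{
    volatile long a = 0;
    long i;
    for (i = 0; i < n; i++)
    {
        a = a + 1;
    }
    return a;
}
```

Calling busy_loop(100000000) performs the same number of iterations as foo().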

Tags: C++

Posted on Thu, 11 Nov 2021 23:03:40 -0500 by zeb