#### Oussama Ben Ghorbel

##### Guest

**f(x) = c * ln(x) * cos(x)**

n=10000000

for (int pp = 2; pp<17; pp++)

{

p = pp;

int chunk = n/p; //acts like floor

omp_set_num_threads(p);

double start_parallel = omp_get_wtime();

//start parallel

#pragma omp parallel shared(tt,chunk) private (i)

{

//printf("thread number %d\n",omp_get_thread_num());

#pragma omp for schedule(dynamic,chunk) nowait

for(i=0; i<n; i++)

{

//tt[i] = f(tt[i]);

tt[i] = f1(tt[i]); //the speedup is much higher with f1 since log and cos

//computations are polynomial; see function.

}

} //end parallel

double end_parallel = omp_get_wtime();

double cpu_time_used_parallel = (double) (end_parallel - start_parallel);

printf("parallel: for n=%d, p=%d, time taken=%f, speedup=%f\n",

n,p,cpu_time_used_parallel,

cpu_time_used_seq/cpu_time_used_parallel);

}

Result:

Started varying threads:

parallel: for n=10000000, p=2, time taken=0.153774, speedup=3.503831

parallel: for n=10000000, p=3, time taken=0.064447, speedup=8.360370

parallel: for n=10000000, p=4, time taken=0.044694, speedup=12.055239

parallel: for n=10000000, p=5, time taken=0.048700, speedup=11.063550

parallel: for n=10000000, p=6, time taken=0.039009, speedup=13.811989

parallel: for n=10000000, p=7, time taken=0.041735, speedup=12.910017

parallel: for n=10000000, p=8, time taken=0.041268, speedup=13.055919

parallel: for n=10000000, p=9, time taken=0.039032, speedup=13.804157

parallel: for n=10000000, p=10, time taken=0.038970, speedup=13.825767

parallel: for n=10000000, p=11, time taken=0.039843, speedup=13.522884

parallel: for n=10000000, p=12, time taken=0.041356, speedup=13.028237

parallel: for n=10000000, p=13, time taken=0.041039, speedup=13.128763

parallel: for n=10000000, p=14, time taken=0.047433, speedup=11.359218

parallel: for n=10000000, p=15, time taken=0.048430, speedup=11.125202

parallel: for n=10000000, p=16, time taken=0.051950, speedup=10.371477

**Note:** The speedup here is computed against the sequential algorithm (threads = 1).

The speedup does not seem to be really affected by the variation of **p** (number of threads).

Am I doing this right, or is the cause the inefficient way I increment the number of threads (i.e., theoretically speaking, changing **p** won't seriously affect O(myprogram))?