Understanding gcc optimization -O3 on a multicore system


I'm comparing serial versus parallel implementations of some code on a quad-core processor. One of the things I'd like to understand and measure is how the serial code performs when running on a single core.

When I compile the serial code, I use gcc's -O3 option, and at first I noticed that the serial code wasn't doing too shabbily. However, one thing I noticed is that when a compute-intensive process is running on one of the other cores, the serial version's performance drops.

Here are the numbers:

Total time elapsed: 1s, 233ms <- serial code running
Total time elapsed: 1s, 238ms <- serial code running
Total time elapsed: 2s, 128ms <- serial code run with other code running on another core
Total time elapsed: 2s, 220ms <- serial code run with other code running on another core

I'm guessing there may be background processes running on one of the four cores. But as best I can gather, running two processes on a quad-core processor shouldn't saturate all four cores.

What I'm wondering is whether there is any reason to believe that a step in the -O3 process allows the code to take advantage of the quad-core setup, or, perhaps more precisely, why the supposed "serial version" performs better when the other cores are available. I tried to understand the gcc documentation and gathered that there are references to threading. I don't fully understand them, and I'm wondering if someone can help me understand precisely how -O3 might or might not take advantage of more than one core.

For what it's worth, I'm using an Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz and running Linux Mint 13.

Thanks.

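-O3 does not do anything in the face of more than one core. None of the optimizations it enables spread a serial program across cores.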

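You are seeing the effects of shared resources on the processor: memory bandwidth and cache. When another compute-intensive process runs on a different core, it competes for those shared resources, so the serial code can slow down even though it still has a core to itself.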

