
PCMech.com helps normal people get their geek on. We talk about computers, technology, the Internet, social media - anything that makes a geek feel warm and fuzzy inside.


Understanding Processor Pipelining

Posted Oct 19, 2004 by richard  

Even though Intel has the advantage of faster clock speeds, a high clock speed doesn’t help much unless the processor also has plenty of one thing: cache. Cache is where the instructions the processor is about to work on reside. All of the cache’s contents come from main memory (RAM). When a processor is about to work on an instruction, it first checks whether that instruction is in the cache. Cache runs at the core speed of the processor, so if you have an Intel 2.6GHz processor, the cache will also be running at 2.6GHz. Now compare this to main memory, which, without overclocking, runs at 400MHz at most. If the processor can’t find an instruction in the cache, it has to wait for main memory to deliver it, which effectively slows the processor down to main memory speed until the instruction arrives. This causes a massive drop in performance, and is the reason the Celeron processors are such poor performers. With less cache, fewer instructions are ready for the processor, which increases the chance that it will have to wait on main memory. The reason the Intel processors are more susceptible to this slowdown lies in the techniques processors use to decide which instruction to work on next.
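
To put some numbers on the effect of cache size, here is a minimal sketch in Python. It is a toy model, not a description of any real processor: the `simulate` function, the least-recently-used replacement policy, and the cycle costs (1 cycle for a cache hit, 100 for a trip to main memory) are all assumptions picked for illustration.

```python
from collections import OrderedDict

def simulate(addresses, cache_lines, hit_cost=1, miss_cost=100):
    """Return (hits, misses, total_cycles) for a toy LRU cache.

    hit_cost and miss_cost are made-up cycle counts, not measured values.
    """
    cache = OrderedDict()  # keys are cached addresses, oldest first
    hits = misses = 0
    for addr in addresses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            misses += 1
            cache[addr] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits, misses, hits * hit_cost + misses * miss_cost

# A loop that touches the same 8 addresses, 100 times over.
trace = list(range(8)) * 100

small = simulate(trace, cache_lines=4)  # the loop does not fit in the cache
large = simulate(trace, cache_lines=8)  # the loop fits in the cache
print("small cache:", small)  # (0, 800, 80000)
print("large cache:", large)  # (792, 8, 1592)
```

In this toy run the smaller cache misses on every single access, because each address is evicted just before it is needed again, while the larger cache only misses on the first pass through the loop.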


This is what is called pipeline optimization: the processor always tries to keep the pipeline full. To do this, it has to use techniques to guess what will come next. There are three different ways of optimizing the pipeline:


Speculative execution - This applies when the CPU has an instruction, and the next instruction cannot take place until the CPU knows the answer to the first one. The CPU has to work out the answer to the first instruction, but say there are two possible answers, and only one is correct. Without speculative execution, the CPU would send one of the possible answers down the pipeline, which in an Intel CPU would take 20 clock cycles to complete. If the CPU chooses the correct answer, everything is fine and it can go right on to the next instruction. But what if it chooses the wrong one? The CPU then has to send the other answer down the pipeline, which means the 20 clock cycles spent on the first one were wasted. So what speculative execution does is send both possible answers down the pipeline. The CPU processes both, then discards the incorrect one. That means a lot less time is wasted.
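
The arithmetic behind that trade-off can be sketched in a few lines of Python. This is a rough cost model only: the function names are made up for illustration, and it assumes one instruction enters the pipeline per clock cycle, so that a second instruction sent right behind the first finishes one cycle later.

```python
PIPELINE_DEPTH = 20  # clock cycles per trip, the figure used in the text above

def cycles_one_at_a_time(first_guess_correct, depth=PIPELINE_DEPTH):
    """Send one possible answer down the pipeline; resend the other if wrong."""
    return depth if first_guess_correct else 2 * depth

def cycles_both_answers(depth=PIPELINE_DEPTH):
    """Send both possible answers back to back and discard the wrong one.

    Assumption: the second enters one cycle behind the first.
    """
    return depth + 1

# If a blind guess is right half of the time, the average cost is:
average_blind = (cycles_one_at_a_time(True) + cycles_one_at_a_time(False)) / 2

print(average_blind)          # 30.0
print(cycles_both_answers())  # 21
```

Under these assumptions, sending both answers costs about the same as a lucky guess and far less than an unlucky one.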


Branch prediction - This is a tough one, and it can mean the difference between running at full speed without having to slow down and having to start all over again with an empty pipeline. It builds on speculative execution. Remember when I said the processor will always want a full pipeline? Well, having two possible answers is no exception. What happens is this: the processor sees the two possible answers and, using the branch target buffer, makes an educated guess about which one is correct before it executes anything. So both are no longer executed; only the one the CPU predicts will be correct is. The branch target buffer is a bank of the answers that turned out to be right for earlier instructions, and by looking at this bank the CPU can make a good guess at the correct answer from what it has already done. After sending down the instruction it has guessed to be correct, the CPU also sends the instructions that would come after it down the pipeline. If the CPU was right with the branch prediction, a lot of time is saved. If not, the whole pipeline has to be flushed and restarted, because everything in it counted on the first guess being correct. This is why the Pentium 4 needs more intelligent branch prediction technology: with a long pipeline, it takes a long time for a new set of instructions to reach the end of the pipeline.
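
Here is a minimal sketch of the idea in Python. It does not model the branch target buffer of any real CPU. Instead it uses a two-bit saturating counter, a common textbook prediction scheme swapped in for illustration, and the `run_predictor` name and the penalty of 20 cycles per flush are assumptions (the 20 simply matches the pipeline length mentioned earlier).

```python
def run_predictor(outcomes, flush_penalty=20):
    """Predict each branch with a 2-bit saturating counter.

    outcomes: iterable of booleans, True meaning the branch was taken.
    Returns (mispredictions, wasted_cycles).
    """
    counter = 2  # 0-1 predict "not taken", 2-3 predict "taken"
    mispredictions = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            mispredictions += 1  # wrong guess: the pipeline gets flushed
        # Nudge the counter toward what actually happened, within 0..3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return mispredictions, mispredictions * flush_penalty

# A loop branch: taken 9 times, then not taken once on exit, repeated 10 times.
loop_branch = ([True] * 9 + [False]) * 10

print(run_predictor(loop_branch))                    # (10, 200)
print(run_predictor(loop_branch, flush_penalty=10))  # (10, 100)
```

The predictor only gets the loop exit wrong, once per loop. The number of wrong guesses is the same in both runs, but each one costs twice as much with the longer pipeline.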


Out of order execution - This applies when a second instruction cannot be performed yet, because the CPU has to know the answer to the first instruction before it can work out the answer to the second one. Without out of order execution, the CPU would execute the first instruction and leave the rest of the pipeline empty, which would be a massive waste of resources. So what happens is this: the CPU executes the first instruction, then executes other instructions that have no dependency on it. This way, the CPU can work on other instructions while it is waiting for the first one.
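
The difference can be sketched with a toy scheduler in Python. Everything here is an assumption made for illustration: the `schedule` function, the instruction names, the latencies, and the rule that at most one instruction starts per clock cycle. It also assumes every dependency names an instruction that appears in the program.

```python
def schedule(instrs, in_order):
    """Return the cycle at which the last instruction finishes.

    instrs: list of (name, latency, deps) in program order; deps is a set
    of instruction names whose answers are needed first.
    """
    finish = {}             # name -> cycle at which its answer is available
    pending = list(instrs)  # not yet started, in program order
    cycle = 0
    while pending:
        # Instructions whose inputs are all available by this cycle.
        ready = [i for i in pending
                 if all(d in finish and finish[d] <= cycle for d in i[2])]
        if in_order:
            # Only the oldest pending instruction is allowed to start.
            ready = [i for i in ready if i is pending[0]]
        if ready:
            name, latency, _ = ready[0]  # start the oldest ready instruction
            finish[name] = cycle + latency
            pending.remove(ready[0])
        cycle += 1
    return max(finish.values())

program = [
    ("load_a", 10, set()),      # slow first instruction
    ("add_b", 1, {"load_a"}),   # needs the answer to load_a
    ("mul_c", 1, set()),        # independent
    ("sub_d", 1, set()),        # independent
]

print(schedule(program, in_order=True))   # 13
print(schedule(program, in_order=False))  # 11
```

In the in-order run, `mul_c` and `sub_d` sit idle behind `add_b` until `load_a` finishes. In the out of order run they are done long before that, so only `add_b` is left to finish after the slow instruction.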
