Just fetch. Essentially, this delays the address register by one instruction. Everything else stays the same, except that jumps take effect one instruction late, which creates a branch delay slot. The point is to take the RAM access time out of the timing loop.

EDIT: Looking at this again, the RAM access is only a small part of the delay. The bulk comes from the logic levels (17 deep!).
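To make the branch slot concrete, here is a toy Python model of the idea. It is only a sketch and not the actual design: the three-instruction set (`nop`, `jmp`, `halt`), the `run` function, and `PROGRAM` are all made up for illustration. It assumes the fetch for the next address is already in flight while the current instruction executes, so a jump can only redirect the fetch after that.

```python
# Toy model of a registered (delayed) fetch. Hypothetical instruction set:
# ("nop",), ("jmp", target), ("halt",). Not the real machine.

PROGRAM = [
    ("nop",),     # 0
    ("jmp", 4),   # 1
    ("nop",),     # 2  sits in the branch slot when fetch is delayed
    ("nop",),     # 3  skipped in both models
    ("halt",),    # 4
]


def fetch(program, pc):
    """Read one instruction; addresses past the end read as nop."""
    return program[pc] if pc < len(program) else ("nop",)


def run(program, fetch_delay, max_cycles=50):
    """Return the addresses of the instructions executed, in order."""
    pc = 0
    ir = None  # (address, instruction) latched by the fetch stage
    trace = []
    for _ in range(max_cycles):
        if fetch_delay:
            # The fetch at the current pc runs in parallel with the
            # execution of the instruction latched one cycle earlier.
            fetched = (pc, fetch(program, pc))
            pc += 1
            if ir is not None:
                addr, instr = ir
                trace.append(addr)
                if instr[0] == "halt":
                    break
                if instr[0] == "jmp":
                    # Too late to stop the fetch already in flight:
                    # only the following fetch sees the new address.
                    pc = instr[1]
            ir = fetched
        else:
            # Fetch and execute in the same cycle, so the RAM access
            # sits inside the timing loop and jumps act immediately.
            addr, instr = pc, fetch(program, pc)
            trace.append(addr)
            pc += 1
            if instr[0] == "halt":
                break
            if instr[0] == "jmp":
                pc = instr[1]
    return trace


if __name__ == "__main__":
    print("immediate fetch:", run(PROGRAM, fetch_delay=False))
    print("delayed fetch:  ", run(PROGRAM, fetch_delay=True))
```

With the delayed fetch, the instruction at address 2 still executes after the jump at address 1, which is the branch slot. The model says nothing about the logic depth mentioned in the EDIT; it only shows the change in jump behavior.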