1 
Solutions
Solution 1.1
1.1.1  Computer used to run large problems and usually accessed via a network: 
5 supercomputers
1.1.2  10^15 or 2^50 bytes: 7 petabyte
1.1.3  Computer composed of hundreds to thousands of processors and terabytes 
of memory: 3 servers
1.1.4  Today’s science fiction application that probably will be available in the near 
future: 1 virtual worlds
1.1.5  A kind of memory called random access memory: 12 RAM
1.1.6  Part of a computer called central processor unit: 13 CPU
1.1.7  Thousands of processors forming a large cluster: 8 datacenters
1.1.8  A microprocessor containing several processors in the same chip: 10 multicore 
processors
1.1.9  Desktop computer without screen or keyboard usually accessed via a network: 
4 low-end servers
1.1.10  Currently the largest class of computer that runs one application or one 
set of related applications: 9 embedded computers
1.1.11  Special language used to describe hardware components: 11 VHDL
1.1.12  Personal computer delivering good performance to single users at low 
cost: 2 desktop computers
1.1.13  Program that translates statements in high-level language to assembly 
language: 15 compiler
S2 
Chapter 1  Solutions
1.1.14  Program that translates symbolic instructions to binary instructions: 
21 assembler
1.1.15  High-level language for business data processing: 25 COBOL
1.1.16  Binary language that the processor can understand: 19 machine language
1.1.17  Commands that the processors understand: 17 instruction
1.1.18  High-level language for scientific computation: 26 Fortran
1.1.19  Symbolic representation of machine instructions: 18 assembly language
1.1.20  Interface between user’s program and hardware providing a variety of 
services and supervision functions: 14 operating system
1.1.21  Software/programs developed by the users: 24 application software
1.1.22  Binary digit (value 0 or 1): 16 bit
1.1.23  Software layer between the application software and the hardware that 
includes the operating system and the compilers: 23 system software
1.1.24  High-level language used to write application and system software: 20 C
1.1.25  Portable language composed of words and algebraic expressions that 
must be translated into assembly language before being run on a computer: 22 high-level 
language
1.1.26  10^12 or 2^40 bytes: 6 terabyte
Solution 1.2
1.2.1  8 bits × 3 colors = 24 bits/pixel = 3 bytes/pixel, rounded up here to 4 bytes/pixel. 
1280 × 800 pixels = 1,024,000 pixels. 1,024,000 pixels × 4 bytes/pixel = 4,096,000 bytes 
(approx. 4 Mbytes).
1.2.2  2 GB = 2000 Mbytes. No. frames = 2000 Mbytes/4 Mbytes = 500 frames.
1.2.3  Network speed: 1 gigabit network ==> 1 gigabit per second = 125 Mbytes/
second. File size: 256 Kbytes = 0.256 Mbytes. Time for 0.256 Mbytes = 0.256/125 s = 
2.048 ms.
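The arithmetic in 1.2.1 to 1.2.3 can be checked with a short Python sketch. The variable names are illustrative only, decimal units are assumed (1 Mbyte = 10^6 bytes), and the 4 bytes/pixel figure simply follows the solution text.

```python
# Values follow the worked solution; names are illustrative, not from the text.
BYTES_PER_PIXEL = 4            # figure used in 1.2.1
WIDTH, HEIGHT = 1280, 800

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL          # 1.2.1
frames_in_2gb = 2000e6 // 4e6                            # 1.2.2, with the rounded 4 Mbytes per frame
network_bytes_per_s = 1e9 / 8                            # 1 gigabit/s expressed in bytes/s
transfer_ms = 256e3 / network_bytes_per_s * 1e3          # 1.2.3, 256 Kbytes taken as 256,000 bytes

print(frame_bytes, frames_in_2gb, transfer_ms)
```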
 
1.2.4  2 microseconds from cache ==> 20 microseconds from DRAM. 
20 microseconds from DRAM ==> 2 seconds from magnetic disk. 
20 microseconds from DRAM ==> 2 ms from flash memory.
Solution 1.3
1.3.1  P2 has the highest performance
performance of P1 (instructions/sec) = 2 × 10^9/1.5 = 1.33 × 10^9
performance of P2 (instructions/sec) = 1.5 × 10^9/1.0 = 1.5 × 10^9
performance of P3 (instructions/sec) = 3 × 10^9/2.5 = 1.2 × 10^9
1.3.2  No. cycles = time × clock rate
cycles(P1) = 10 × 2 × 10^9 = 20 × 10^9
cycles(P2) = 10 × 1.5 × 10^9 = 15 × 10^9
cycles(P3) = 10 × 3 × 10^9 = 30 × 10^9
time = (No. instr. × CPI)/clock rate, then No. instructions = No. cycles/CPI
instructions(P1) = 20 × 10^9/1.5 = 13.33 × 10^9
instructions(P2) = 15 × 10^9/1 = 15 × 10^9
instructions(P3) = 30 × 10^9/2.5 = 12 × 10^9
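As a cross-check of 1.3.1 and 1.3.2, the sketch below recomputes instructions per second, cycle counts, and instruction counts from the clock rates and CPIs used above. The function and dictionary names are assumptions made for illustration.

```python
# Clock rates (Hz) and CPIs as used in the solution above.
processors = {"P1": (2e9, 1.5), "P2": (1.5e9, 1.0), "P3": (3e9, 2.5)}
RUN_TIME_S = 10  # execution time used in 1.3.2


def instructions_per_second(clock_hz, cpi):
    # Performance = clock rate / CPI
    return clock_hz / cpi


def cycles_and_instructions(clock_hz, cpi, seconds):
    # No. cycles = time x clock rate; No. instructions = No. cycles / CPI
    cycles = seconds * clock_hz
    return cycles, cycles / cpi


for name, (clock_hz, cpi) in processors.items():
    ips = instructions_per_second(clock_hz, cpi)
    cycles, instrs = cycles_and_instructions(clock_hz, cpi, RUN_TIME_S)
    print(f"{name}: {ips:.3g} instr/s, {cycles:.3g} cycles, {instrs:.4g} instructions")
```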
1.3.3  time_new = time_old × 0.7 = 7 s
CPI_new = CPI_old × 1.2, then CPI(P1) = 1.8, CPI(P2) = 1.2, CPI(P3) = 3
ƒ = No. instr. × CPI/time, then
ƒ(P1) = 13.33 × 10^9 × 1.8/7 = 3.43 GHz
ƒ(P2) = 15 × 10^9 × 1.2/7 = 2.57 GHz
ƒ(P3) = 12 × 10^9 × 3/7 = 5.14 GHz
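The required clock rates in 1.3.3 follow directly from ƒ = No. instr. × CPI/time. A minimal sketch, assuming the instruction counts from 1.3.2 and hypothetical helper names, is given below. Results are rounded, so the last digit can differ from a truncated figure.

```python
def required_clock_hz(instructions, cpi, seconds):
    """Clock rate needed to run `instructions` at `cpi` within `seconds`."""
    return instructions * cpi / seconds


# (instruction count, original CPI) per processor, taken from 1.3.2.
old = {"P1": (20e9 / 1.5, 1.5), "P2": (15e9, 1.0), "P3": (12e9, 2.5)}

# CPI grows by 20%; execution time shrinks by 30% (10 s -> 7 s).
new_clock_ghz = {
    name: required_clock_hz(instrs, cpi * 1.2, 10 * 0.7) / 1e9
    for name, (instrs, cpi) in old.items()
}
print({name: round(f, 2) for name, f in new_clock_ghz.items()})
```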
1.3.4  IPC = 1/CPI = No. instr./(time × clock rate)
IPC(P1) = 1.42
IPC(P2) = 2
IPC(P3) = 3.33
1.3.5  Time_new/Time_old = 7/10 = 0.7. So ƒ_new = ƒ_old/0.7 = 1.5 GHz/0.7 = 2.14 GHz.
1.3.6  Time_new/Time_old = 9/10 = 0.9. So Instructions_new = Instructions_old × 0.9 = 
30 × 10^9 × 0.9 = 27 × 10^9.
Solution 1.4
1.4.1  P2
Class A: 10^5 instr.
Class B: 2 × 10^5 instr.
Class C: 5 × 10^5 instr.
Class D: 2 × 10^5 instr.
Time = No. instr. × CPI/clock rate
P1: Time class A = 0.66 × 10^-4 s
Time class B = 2.66 × 10^-4 s
Time class C = 10 × 10^-4 s
Time class D = 5.33 × 10^-4 s
Total time P1 = 18.65 × 10^-4 s
P2: Time class A = 10^-4 s
Time class B = 2 × 10^-4 s
Time class C = 5 × 10^-4 s
Time class D = 3 × 10^-4 s
Total time P2 = 11 × 10^-4 s
1.4.2  CPI = time × clock rate/No. instr.
CPI(P1) = 18.65 × 10^-4 × 1.5 × 10^9/10^6 = 2.79
CPI(P2) = 11 × 10^-4 × 2 × 10^9/10^6 = 2.2
1.4.3
clock cycles(P1) = 10^5 × 1 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 4 = 28 × 10^5
clock cycles(P2) = 10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 3 = 22 × 10^5
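The per-class bookkeeping in 1.4.1 to 1.4.3 can be reproduced with a small sketch. The per-class CPIs and clock rates below are read off the figures in the solution (the exercise table itself is not reproduced here), and all names are illustrative. Exact arithmetic gives a CPI of 2.80 for P1; the 2.79 in the text comes from the truncated per-class times.

```python
COUNTS = {"A": 1e5, "B": 2e5, "C": 5e5, "D": 2e5}   # instructions per class
CPI = {
    "P1": {"A": 1, "B": 2, "C": 3, "D": 4},
    "P2": {"A": 2, "B": 2, "C": 2, "D": 3},
}
CLOCK_HZ = {"P1": 1.5e9, "P2": 2e9}


def total_cycles(proc):
    # Sum over classes of (instruction count x class CPI)
    return sum(COUNTS[c] * CPI[proc][c] for c in COUNTS)


def total_time(proc):
    # Time = cycles / clock rate
    return total_cycles(proc) / CLOCK_HZ[proc]


def global_cpi(proc):
    # Overall CPI = cycles / total instruction count
    return total_cycles(proc) / sum(COUNTS.values())


for proc in ("P1", "P2"):
    print(proc, total_cycles(proc), total_time(proc), round(global_cpi(proc), 2))
```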
1.4.4 
(500 × 1 + 50 × 5 + 100 × 5 + 50 × 2) × 0.5 × 10^-9 = 675 ns
1.4.5  CPI = time × clock rate/No. instr.
CPI = 675 × 10^-9 × 2 × 10^9/700 = 1.93
1.4.6
Time = (500 × 1 + 50 × 5 + 50 × 5 + 50 × 2) × 0.5 × 10^-9 = 550 ns
Speed-up = 675 ns/550 ns = 1.23
The instruction count drops from 700 to 650, so
CPI = 550 × 10^-9 × 2 × 10^9/650 = 1.69
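A quick way to check 1.4.4 to 1.4.6 is to compute time and overall CPI from the instruction mix. This is a sketch with a hypothetical `run` helper. Note that the CPI is computed over the instructions actually executed, which is 650 once the third class is cut from 100 to 50.

```python
CYCLE_TIME_S = 0.5e-9  # 2 GHz clock, as in the solution above


def run(mix):
    """mix: list of (instruction count, CPI) pairs. Returns (time in s, overall CPI)."""
    cycles = sum(n * cpi for n, cpi in mix)
    instructions = sum(n for n, _ in mix)
    return cycles * CYCLE_TIME_S, cycles / instructions


base_time, base_cpi = run([(500, 1), (50, 5), (100, 5), (50, 2)])
fast_time, fast_cpi = run([(500, 1), (50, 5), (50, 5), (50, 2)])  # third class halved, as in 1.4.6
speed_up = base_time / fast_time
print(base_time, base_cpi, fast_time, fast_cpi, speed_up)
```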
 
Solution 1.5
1.5.1 
a.  1G, 0.75G inst/s
b.  1G, 1.5G inst/s
1.5.2
a.  P2 is 1.33 times faster than P1
b.  P1 is 1.03 times faster than P2
1.5.3
a.  P2 is 1.31 times faster than P1
b.  P1 is 1.00 times faster than P2
1.5.4
a.  2.05 µs
b.  1.93 µs
1.5.5 
a.  0.71 µs
b.  0.86 µs
1.5.6
a.  1.30 times faster
b.  1.40 times faster
Solution 1.6
1.6.1 
    Compiler A CPI    Compiler B CPI
a.  1.00              1.17
b.  0.80              0.58
1.6.2 
a.  0.86
b.  1.37
1.6.3 
    Compiler A speed-up    Compiler B speed-up
a.  1.52                   1.77
b.  1.21                   0.88
1.6.4 
    P1 peak       P2 peak
a.  4G Inst/s     3G Inst/s
b.  4G Inst/s     3G Inst/s
1.6.5  Speed-up, P1 versus P2:
a.  0.967105263
b.  0.730263158
1.6.6 
a.  6.204081633
b.  8.216216216
Solution 1.7
1.7.1 
Geometric mean clock rate ratio = (1.28 × 1.56 × 2.64 × 3.03 × 10.00 × 1.80 × 
0.74)^(1/7) = 2.15
Geometric mean power ratio = (1.24 × 1.20 × 2.06 × 2.88 × 2.59 × 1.37 × 0.92)^(1/7) = 
1.62
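The two geometric means in 1.7.1 can be reproduced as follows. This is only a sketch: `geometric_mean` is a helper defined here, and the ratio lists are copied from the lines above.

```python
import math


def geometric_mean(ratios):
    # n-th root of the product of n ratios
    return math.prod(ratios) ** (1 / len(ratios))


clock_ratios = [1.28, 1.56, 2.64, 3.03, 10.00, 1.80, 0.74]
power_ratios = [1.24, 1.20, 2.06, 2.88, 2.59, 1.37, 0.92]

print(round(geometric_mean(clock_ratios), 2), round(geometric_mean(power_ratios), 2))
```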
1.7.2 
Largest clock rate ratio = 2000 MHz/200 MHz = 10 (Pentium Pro to Pentium 4 
Willamette)
Largest power ratio = 29.1 W/10.1 W = 2.88 (Pentium to Pentium Pro)
 
1.7.3 
Clock rate: (2.667 × 10^9)/(12.5 × 10^6) = 213.4
Power: 95 W/3.3 W = 28.79
1.7.4  C = P/(V^2 × clock rate)
80286: C = 0.0105 × 10^-6
80386: C = 0.01025 × 10^-6
80486: C = 0.00784 × 10^-6
Pentium: C = 0.00612 × 10^-6
Pentium Pro: C = 0.0133 × 10^-6
Pentium 4 Willamette: C = 0.0122 × 10^-6
Pentium 4 Prescott: C = 0.0183 × 10^-6
Core 2: C = 0.0294 × 10^-6
1.7.5  3.3/1.75 = 1.89 (Pentium Pro to Pentium 4 Willamette)
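The capacitive loads in 1.7.4 come from solving Power = C × V^2 × clock rate for C. The sketch below applies that to the two processors whose power, voltage, and clock rate all appear in this solution (Pentium Pro and Core 2). The function name is an assumption, and results are rounded rather than truncated.

```python
def capacitive_load(power_w, voltage_v, clock_hz):
    """Solve Power = C * V^2 * f for C (in farads)."""
    return power_w / (voltage_v ** 2 * clock_hz)


# (power in W, voltage in V, clock rate in Hz), values quoted in Solution 1.7
chips = {
    "Pentium Pro": (29.1, 3.3, 200e6),
    "Core 2": (95.0, 1.1, 2.667e9),
}

for name, (p, v, f) in chips.items():
    print(f"{name}: C = {capacitive_load(p, v, f) * 1e6:.4f} x 10^-6 F")
```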
1.7.6 
Pentium to Pentium Pro: 3.3/5 = 0.66
Pentium Pro to Pentium 4 Willamette: 1.75/3.3 = 0.53
Pentium 4 Willamette to Pentium 4 Prescott: 1.25/1.75 = 0.71
Pentium 4 Prescott to Core 2: 1.1/1.25 = 0.88
Geometric mean = 0.68
Solution 1.8
1.8.1  Power_1 = V^2 × clock rate × C. Power_2 = 0.9 Power_1
C_2/C_1 = (0.9 × 5^2 × 0.5 × 10^9)/(3.3^2 × 1 × 10^9) = 1.03
1.8.2  Power_2/Power_1 = (V_2^2 × clock rate_2)/(V_1^2 × clock rate_1)
Power_2/Power_1 = 0.87 => Reduction of 13%
1.8.3 
Power_2 = V_2^2 × 1 × 10^9 × 0.8 × C_1 = 0.6 × Power_1
Power_1 = 5^2 × 0.5 × 10^9 × C_1
V_2^2 × 1 × 10^9 × 0.8 × C_1 = 0.6 × 5^2 × 0.5 × 10^9 × C_1
V_2 = ((0.6 × 5^2 × 0.5 × 10^9)/(1 × 10^9 × 0.8))^(1/2) = 3.06 V
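The three results of 1.8.1 to 1.8.3 all follow from Power = C × V^2 × clock rate. A minimal sketch is shown below; `dynamic_power` and the variable names are illustrative, and C_1 is set to an arbitrary unit because it cancels out of every ratio.

```python
import math


def dynamic_power(c, v, f):
    # Power = C x V^2 x clock rate
    return c * v ** 2 * f


C1 = 1.0  # arbitrary unit; cancels out below
p1 = dynamic_power(C1, 5.0, 0.5e9)            # old design: 5 V at 0.5 GHz

# 1.8.1: C2/C1 when Power2 = 0.9 x Power1 at 3.3 V and 1 GHz
c_ratio = 0.9 * p1 / dynamic_power(C1, 3.3, 1e9)
# 1.8.2: Power2/Power1 with equal capacitance
p_ratio = dynamic_power(C1, 3.3, 1e9) / p1
# 1.8.3: V2 when C2 = 0.8 x C1, clock = 1 GHz, Power2 = 0.6 x Power1
v2 = math.sqrt(0.6 * p1 / (0.8 * C1 * 1e9))

print(round(c_ratio, 2), round(p_ratio, 2), round(v2, 2))
```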
1.8.4  Power_new = 1 × C_old × (V_old × 2^(-1/4))^2 × clock rate × 2^(1/2) = Power_old. 
Thus, power scales by 1.
1.8.5  1/2^(-1/2) = 2^(1/2)
1.8.6  Voltage = 1.1 × 2^(-1/4) = 0.92 V. Clock rate = 2.667 × 2^(1/2) = 3.771 GHz
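The scaling argument in 1.8.4 and 1.8.6 can be verified numerically. The sketch assumes the voltage scales by 2^(-1/4), the clock rate by 2^(1/2), and the capacitance stays fixed, which is how the figures above work out. Variable names are illustrative.

```python
voltage_scale = 2 ** -0.25   # supply voltage factor
clock_scale = 2 ** 0.5       # clock rate factor
cap_scale = 1.0              # capacitance held constant

# Power = C x V^2 x f, so the power factor is the product of the factors
power_scale = cap_scale * voltage_scale ** 2 * clock_scale

new_voltage = 1.1 * voltage_scale      # volts, starting from 1.1 V
new_clock_ghz = 2.667 * clock_scale    # GHz, starting from 2.667 GHz
print(power_scale, new_voltage, new_clock_ghz)
```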
Solution 1.9
1.9.1 
a.  1/49 × 100 = 2%
b.  45/120 × 100 = 37.5%
1.9.2 
a.  I_leak = 1/3.3 = 0.3 A
b.  I_leak = 45/1.1 = 40.9 A
1.9.3 
a.  Power_st/Power_dyn = 1/49 = 0.02
b.  Power_st/Power_dyn = 45/75 = 0.6
1.9.4  Power_st/Power_dyn = 0.6 => Power_st = 0.6 × Power_dyn
a.  Power_st = 0.6 × 40 W = 24 W
b.  Power_st = 0.6 × 30 W = 18 W
1.9.5 
a.  I_lk = 24/0.8 = 30 A
b.  I_lk = 18/0.8 = 22.5 A
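For 1.9.4 and 1.9.5, static power is a fixed fraction of dynamic power and the leakage current is static power divided by supply voltage. A short sketch, with a hypothetical `leakage_current` helper and the 0.6 ratio and 0.8 V supply taken from the lines above:

```python
RATIO = 0.6       # Power_st / Power_dyn
VOLTAGE_V = 0.8   # supply voltage used in 1.9.5


def leakage_current(static_power_w, voltage_v):
    # I_leak = Power_st / V
    return static_power_w / voltage_v


results = []
for dynamic_w in (40.0, 30.0):             # cases a and b
    static_w = RATIO * dynamic_w
    results.append((static_w, leakage_current(static_w, VOLTAGE_V)))

print(results)
```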