1 
Solutions
Solution 1.1
1.1.1  Computer used to run large problems and usually accessed via a network: 
(3) servers
1.1.2  10^15 or 2^50 bytes: (7) petabyte
1.1.3  A class of computers composed of hundreds to thousands of processors and terabytes of memory and having the highest performance and cost: (5) supercomputers
1.1.4  Today's science fiction application that will probably be available in the near future: (1) virtual worlds
1.1.5  A kind of memory called random access memory: (12) RAM
1.1.6  Part of a computer called central processor unit: (13) CPU
1.1.7  Thousands of processors forming a large cluster: (8) data centers
1.1.8  Microprocessors containing several processors in the same chip: (10) multicore processors
1.1.9  Desktop computer without a screen or keyboard usually accessed via a network: (4) low-end servers
1.1.10  A computer used to run one predetermined application or collection of software: (9) embedded computers
1.1.11  Special language used to describe hardware components: (11) VHDL
1.1.12  Personal computer delivering good performance to single users at low cost: 
(2) desktop computers
1.1.13  Program that translates statements in high-level language to assembly language: (15) compiler
Sol01-9780123747501.indd   S1
9/5/11   11:24 AM
1.1.14  Program that translates symbolic instructions to binary instructions: (21) assembler
1.1.15  High-level language for business data processing: (25) Cobol
1.1.16  Binary language that the processor can understand: (19) machine language
1.1.17  Commands that the processors understand: (17) instruction
1.1.18  High-level language for scientific computation: (26) Fortran
1.1.19  Symbolic representation of machine instructions: (18) assembly language
1.1.20  Interface between user's program and hardware providing a variety of services and supervision functions: (14) operating system
1.1.21  Software/programs developed by the users: (24) application software
1.1.22  Binary digit (value 0 or 1): (16) bit
1.1.23  Software layer between the application software and the hardware that includes the operating system and the compilers: (23) system software
1.1.24  High-level language used to write application and system software: (20) C
1.1.25  Portable language composed of words and algebraic expressions that must be translated into assembly language before it can run on a computer: (22) high-level language
1.1.26  10^12 or 2^40 bytes: (6) terabyte
Solution 1.2
1.2.1  8 bits × 3 colors = 24 bits/pixel = 3 bytes/pixel.
a. Configuration 1: 640 × 480 pixels = 307,200 pixels => 307,200 × 3 = 921,600 bytes/frame
   Configuration 2: 1280 × 1024 pixels = 1,310,720 pixels => 1,310,720 × 3 = 3,932,160 bytes/frame
b. Configuration 1: 1024 × 768 pixels = 786,432 pixels => 786,432 × 3 = 2,359,296 bytes/frame
   Configuration 2: 2560 × 1600 pixels = 4,096,000 pixels => 4,096,000 × 3 = 12,288,000 bytes/frame
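As a quick check of the arithmetic in 1.2.1, the following Python sketch recomputes the bytes per frame for each display. The function name `frame_bytes` and the list of resolutions are illustrative choices, not part of the original solution.

```python
BYTES_PER_PIXEL = 3  # 8 bits x 3 colors = 24 bits = 3 bytes


def frame_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Return the size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


# Resolutions taken from the four configurations in 1.2.1.
for width, height in [(640, 480), (1280, 1024), (1024, 768), (2560, 1600)]:
    print(f"{width} x {height}: {frame_bytes(width, height):,} bytes/frame")
```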
1.2.2  No. frames = integer part of (Capacity of main memory/bytes per frame)
a. Configuration 1: Main memory: 2 GB = 2,000 Mbytes. Frame: 640 × 480 × 3 bytes = 0.9216 Mbytes => No. frames = 2,170
   Configuration 2: Main memory: 4 GB = 4,000 Mbytes. Frame: 1280 × 1024 × 3 bytes = 3.93216 Mbytes => No. frames = 1,017
b. Configuration 1: Main memory: 2 GB = 2,000 Mbytes. Frame: 1024 × 768 × 3 bytes = 2.359296 Mbytes => No. frames = 847
   Configuration 2: Main memory: 4 GB = 4,000 Mbytes. Frame: 2560 × 1600 × 3 bytes = 12.288 Mbytes => No. frames = 325
1.2.3  File size: 256 Kbytes = 0.256 Mbytes. Same solution for a) and b).
Configuration 1: Network speed: 100 Mbit/sec = 12.5 Mbytes/sec. Time = 0.256/12.5 s = 20.48 ms
Configuration 2: Network speed: 1 Gbit/sec = 125 Mbytes/sec. Time = 0.256/125 s = 2.048 ms
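The transfer times in 1.2.3 can be reproduced with a short Python sketch. It assumes decimal prefixes (1 Kbyte = 1,000 bytes, 1 Mbit = 10^6 bits), as the solution does; the helper name `transfer_time_ms` is hypothetical.

```python
def transfer_time_ms(file_bytes, link_bits_per_sec):
    """Time in milliseconds to send file_bytes over a link of the given bit rate."""
    seconds = file_bytes * 8 / link_bits_per_sec
    return seconds * 1000


FILE_BYTES = 256_000  # 256 Kbytes, decimal prefix assumed

for label, rate in (("100 Mbit/sec", 100e6), ("1 Gbit/sec", 1e9)):
    print(f"{label}: {transfer_time_ms(FILE_BYTES, rate):.3f} ms")
```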
1.2.4
a. 2 microseconds from cache ⇒ 20 microseconds from DRAM.
b. 2 microseconds from cache ⇒ 20 microseconds from DRAM.
1.2.5
a. 2 microseconds from cache ⇒ 2 ms from Flash memory.
b. 2 microseconds from cache ⇒ 4.28 ms from Flash memory.
1.2.6
a. 2 microseconds from cache ⇒ 2 s from magnetic disk.
b. 2 microseconds from cache ⇒ 5.7 s from magnetic disk.
Solution 1.3
1.3.1  P2 has the highest performance.
Instr/sec = f/CPI
a. performance of P1 (instructions/sec) = 3 × 10^9/1.5 = 2 × 10^9
   performance of P2 (instructions/sec) = 2.5 × 10^9/1.0 = 2.5 × 10^9
   performance of P3 (instructions/sec) = 4 × 10^9/2.2 = 1.8 × 10^9
b. performance of P1 (instructions/sec) = 2 × 10^9/1.2 = 1.66 × 10^9
   performance of P2 (instructions/sec) = 3 × 10^9/0.8 = 3.75 × 10^9
   performance of P3 (instructions/sec) = 4 × 10^9/2 = 2 × 10^9
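The comparison in 1.3.1 can be sketched in Python as follows. The clock rates and CPIs are the ones used in the solution; the dictionary layout and the names `instructions_per_second` and `fastest` are assumptions made for illustration.

```python
def instructions_per_second(clock_hz, cpi):
    """Instr/sec = f / CPI."""
    return clock_hz / cpi


# (clock rate in Hz, CPI) for each processor, as used in 1.3.1.
PART_A = {"P1": (3e9, 1.5), "P2": (2.5e9, 1.0), "P3": (4e9, 2.2)}
PART_B = {"P1": (2e9, 1.2), "P2": (3e9, 0.8), "P3": (4e9, 2.0)}


def fastest(processors):
    """Name of the processor with the highest instruction rate."""
    return max(processors, key=lambda name: instructions_per_second(*processors[name]))


for label, processors in (("a", PART_A), ("b", PART_B)):
    for name, (clock_hz, cpi) in processors.items():
        print(f"{label}. {name}: {instructions_per_second(clock_hz, cpi):.3e} instr/sec")
    print(f"{label}. fastest: {fastest(processors)}")
```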
1.3.2  No. cycles = time × clock rate
time = (No. Instr × CPI)/clock rate, then No. instructions = No. cycles/CPI
a. cycles(P1) = 10 × 3 × 10^9 = 30 × 10^9
   cycles(P2) = 10 × 2.5 × 10^9 = 25 × 10^9
   cycles(P3) = 10 × 4 × 10^9 = 40 × 10^9
   No. instructions(P1) = 30 × 10^9/1.5 = 20 × 10^9
   No. instructions(P2) = 25 × 10^9/1 = 25 × 10^9
   No. instructions(P3) = 40 × 10^9/2.2 = 18.18 × 10^9
b. cycles(P1) = 10 × 2 × 10^9 = 20 × 10^9
   cycles(P2) = 10 × 3 × 10^9 = 30 × 10^9
   cycles(P3) = 10 × 4 × 10^9 = 40 × 10^9
   No. instructions(P1) = 20 × 10^9/1.2 = 16.66 × 10^9
   No. instructions(P2) = 30 × 10^9/0.8 = 37.5 × 10^9
   No. instructions(P3) = 40 × 10^9/2 = 20 × 10^9
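A minimal Python sketch of the two relations used in 1.3.2, assuming the 10 s run time that appears in the calculations; the function names are hypothetical.

```python
RUN_TIME_S = 10  # execution time used in 1.3.2


def cycle_count(time_s, clock_hz):
    """No. cycles = time x clock rate."""
    return time_s * clock_hz


def instruction_count(time_s, clock_hz, cpi):
    """No. instructions = No. cycles / CPI."""
    return cycle_count(time_s, clock_hz) / cpi


# Part a of 1.3.2: (clock rate in Hz, CPI) per processor.
for name, (clock_hz, cpi) in {"P1": (3e9, 1.5), "P2": (2.5e9, 1.0), "P3": (4e9, 2.2)}.items():
    cycles = cycle_count(RUN_TIME_S, clock_hz)
    instrs = instruction_count(RUN_TIME_S, clock_hz, cpi)
    print(f"{name}: {cycles:.3e} cycles, {instrs:.3e} instructions")
```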
1.3.3  time_new = time_old × 0.7 = 7 s
a. CPI_new = CPI_old × 1.2, then CPI(P1) = 1.8, CPI(P2) = 1.2, CPI(P3) = 2.64
   f = No. Instr × CPI/time, then
   f(P1) = 20 × 10^9 × 1.8/7 = 5.14 GHz
   f(P2) = 25 × 10^9 × 1.2/7 = 4.28 GHz
   f(P3) = 18.18 × 10^9 × 2.64/7 = 6.85 GHz
b. CPI_new = CPI_old × 1.2, then CPI(P1) = 1.44, CPI(P2) = 0.96, CPI(P3) = 2.4
   f = No. Instr × CPI/time, then
   f(P1) = 16.66 × 10^9 × 1.44/7 = 3.42 GHz
   f(P2) = 37.5 × 10^9 × 0.96/7 = 5.14 GHz
   f(P3) = 20 × 10^9 × 2.4/7 = 6.85 GHz
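Because the instruction count is fixed, f = No. Instr × CPI/time implies f_new/f_old = (CPI_new/CPI_old)/(time_new/time_old). The Python sketch below applies that ratio directly; the function name and default arguments are illustrative assumptions.

```python
def required_clock_hz(old_clock_hz, cpi_factor=1.2, time_factor=0.7):
    """Clock rate needed when CPI grows by cpi_factor and time shrinks by time_factor.

    Derived from f = N * CPI / time with the instruction count N held constant.
    """
    return old_clock_hz * cpi_factor / time_factor


# Original clock rates from part a of 1.3.1 (P1, P2, P3).
for name, clock_hz in {"P1": 3e9, "P2": 2.5e9, "P3": 4e9}.items():
    print(f"{name}: {required_clock_hz(clock_hz) / 1e9:.3f} GHz")
```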
1.3.4  IPC = 1/CPI = No. instr/(time × clock rate)
a. IPC(P1) = 0.95
   IPC(P2) = 1.2
   IPC(P3) = 2.5
b. IPC(P1) = 2
   IPC(P2) = 1.25
   IPC(P3) = 0.89
1.3.5
a. Time_new/Time_old = 7/10 = 0.7. So f_new = f_old/0.7 = 2.5 GHz/0.7 = 3.57 GHz.
b. Time_new/Time_old = 5/8 = 0.625. So f_new = f_old/0.625 = 4.8 GHz.
1.3.6
a. Time_new/Time_old = 9/10 = 0.9. Then Instructions_new = Instructions_old × 0.9 = 30 × 10^9 × 0.9 = 27 × 10^9.
b. Time_new/Time_old = 7/8 = 0.875. Then Instructions_new = Instructions_old × 0.875 = 26.25 × 10^9.
Solution 1.4
1.4.1
Class A: 10^5 instr.
Class B: 2 × 10^5 instr.
Class C: 5 × 10^5 instr.
Class D: 2 × 10^5 instr.
Time = No. instr × CPI/clock rate
a. Total time P1 = (10^5 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 3)/(2.5 × 10^9) = 10.4 × 10^-4 s
   Total time P2 = (10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2)/(3 × 10^9) = 6.66 × 10^-4 s
b. Total time P1 = (10^5 × 2 + 2 × 10^5 × 1.5 + 5 × 10^5 × 2 + 2 × 10^5)/(2.5 × 10^9) = 6.8 × 10^-4 s
   Total time P2 = (10^5 + 2 × 10^5 × 2 + 5 × 10^5 + 2 × 10^5)/(3 × 10^9) = 4 × 10^-4 s
1.4.2  CPI = time × clock rate/No. instr
a. CPI (P1) = 10.4 × 10^-4 × 2.5 × 10^9/10^6 = 2.6
   CPI (P2) = 6.66 × 10^-4 × 3 × 10^9/10^6 = 2.0
b. CPI (P1) = 6.8 × 10^-4 × 2.5 × 10^9/10^6 = 1.7
   CPI (P2) = 4 × 10^-4 × 3 × 10^9/10^6 = 1.2
1.4.3
a. clock cycles (P1) = 10^5 × 1 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 3 = 26 × 10^5
   clock cycles (P2) = 10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2 = 20 × 10^5
b. clock cycles (P1) = 17 × 10^5
   clock cycles (P2) = 12 × 10^5
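Solutions 1.4.1 to 1.4.3 all follow from the per-class cycle sum. The Python sketch below works through part a; the per-class CPIs and clock rates are read off the expressions in 1.4.1, while the data layout and function names are assumptions made for illustration.

```python
# Instruction counts per class (10^6 instructions in total).
INSTR_COUNTS = {"A": 100_000, "B": 200_000, "C": 500_000, "D": 200_000}

# Part a: clock rate and CPI per instruction class for each processor.
P1 = {"clock_hz": 2.5e9, "cpi": {"A": 1, "B": 2, "C": 3, "D": 3}}
P2 = {"clock_hz": 3.0e9, "cpi": {"A": 2, "B": 2, "C": 2, "D": 2}}


def total_cycles(cpi_by_class):
    """Clock cycles = sum over classes of (instruction count x class CPI)."""
    return sum(INSTR_COUNTS[cls] * cpi_by_class[cls] for cls in INSTR_COUNTS)


def exec_time_s(processor):
    """Time = clock cycles / clock rate."""
    return total_cycles(processor["cpi"]) / processor["clock_hz"]


def average_cpi(processor):
    """Global CPI = clock cycles / total instruction count."""
    return total_cycles(processor["cpi"]) / sum(INSTR_COUNTS.values())


for name, proc in (("P1", P1), ("P2", P2)):
    print(f"{name}: {total_cycles(proc['cpi'])} cycles, "
          f"{exec_time_s(proc):.3e} s, CPI = {average_cpi(proc):.2f}")
```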
1.4.4
a. (650 × 1 + 100 × 5 + 600 × 5 + 50 × 2) × 0.5 × 10^-9 s = 2,125 ns
b. (750 × 1 + 250 × 5 + 500 × 5 + 500 × 2) × 0.5 × 10^-9 s = 2,750 ns
1.4.5  CPI = time × clock rate/No. instr
a. CPI = 2,125 × 10^-9 × 2 × 10^9/1,400 = 3.03
b. CPI = 2,750 × 10^-9 × 2 × 10^9/2,000 = 2.75
1.4.6
a. Time = (650 × 1 + 100 × 5 + 300 × 5 + 50 × 2) × 0.5 × 10^-9 s = 1,375 ns
   Speedup = 2,125 ns/1,375 ns = 1.54
   CPI = 1,375 × 10^-9 × 2 × 10^9/1,100 = 2.5
b. Time = (750 × 1 + 250 × 5 + 250 × 5 + 500 × 2) × 0.5 × 10^-9 s = 2,125 ns
   Speedup = 2,750 ns/2,125 ns = 1.29
   CPI = 2,125 × 10^-9 × 2 × 10^9/1,750 = 2.43
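The calculations in 1.4.4 to 1.4.6 can be sketched in Python as below, using part a. Each instruction mix is written as (count, CPI) pairs taken from the expressions in the solution, and the 0.5 ns cycle time corresponds to the 2 GHz clock used there. The names `exec_time_ns` and `average_cpi` are hypothetical.

```python
CYCLE_TIME_NS = 0.5  # 2 GHz clock

# Part a: (instruction count, CPI) per instruction type.
MIX_ORIGINAL = [(650, 1), (100, 5), (600, 5), (50, 2)]
# Same mix with the third type's count halved, as in 1.4.6.
MIX_REDUCED = [(650, 1), (100, 5), (300, 5), (50, 2)]


def exec_time_ns(mix):
    """Execution time = total clock cycles x cycle time."""
    return sum(count * cpi for count, cpi in mix) * CYCLE_TIME_NS


def average_cpi(mix):
    """CPI = total clock cycles / total instruction count."""
    return sum(count * cpi for count, cpi in mix) / sum(count for count, _ in mix)


speedup = exec_time_ns(MIX_ORIGINAL) / exec_time_ns(MIX_REDUCED)
print(f"original: {exec_time_ns(MIX_ORIGINAL)} ns, CPI = {average_cpi(MIX_ORIGINAL):.3f}")
print(f"reduced:  {exec_time_ns(MIX_REDUCED)} ns, CPI = {average_cpi(MIX_REDUCED):.3f}")
print(f"speedup:  {speedup:.3f}")
```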
Solution 1.5
1.5.1
a. P1: 2 × 10^9 inst/sec, P2: 2 × 10^9 inst/sec
b. P1: 2 × 10^9 inst/sec, P2: 3 × 10^9 inst/sec
1.5.2
a. T(P2)/T(P1) = 4/7; P2 is 1.75 times faster than P1
b. T(P2)/T(P1) = 4.66/5; P2 is 1.07 times faster than P1
1.5.3
a. T(P2)/T(P1) = 4.5/8; P2 is 1.77 times faster than P1
b. T(P2)/T(P1) = 5.33/5.5; P2 is 1.03 times faster than P1
1.5.4
a. 2.91 µs
b. 2.50 µs
1.5.5
a. 0.78 µs
b. 0.90 µs
1.5.6
a. T = 0.68 µs => 1.14 times faster
b. T = 0.75 µs => 1.20 times faster
Solution 1.6
1.6.1  CPI = T_exec × f/No. Instr
      Compiler A CPI    Compiler B CPI
a.    1.8               1.5
b.    1.1               1.25
1.6.2  f_A/f_B = (No. Instr(A) × CPI(A))/(No. Instr(B) × CPI(B))
a. f_A/f_B = 1
b. f_A/f_B = 0.73
1.6.3
      Speedup vs. Compiler A    Speedup vs. Compiler B
a.    T_new/T_A = 0.36          T_new/T_B = 0.36
b.    T_new/T_A = 0.6           T_new/T_B = 0.44
1.6.4
      P1 Peak                   P2 Peak
a.    4 × 10^9 Inst/s           2 × 10^9 Inst/s
b.    4 × 10^9 Inst/s           3 × 10^9 Inst/s
1.6.5  Speedup, P1 versus P2:
a. T1/T2 = 1.9
b. T1/T2 = 1.5
1.6.6
a. 4.37 GHz
b. 6 GHz
Solution 1.7
1.7.1
Geometric mean clock rate ratio = (1.28 × 1.56 × 2.64 × 3.03 × 10.00 × 1.80 × 0.74)^(1/7) = 2.15
Geometric mean power ratio = (1.24 × 1.20 × 2.06 × 2.88 × 2.59 × 1.37 × 0.92)^(1/7) = 1.62
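The two geometric means in 1.7.1 can be checked with a few lines of Python. The ratio lists are copied from the solution; `geometric_mean` is a small helper written for this sketch (the standard library also offers `statistics.geometric_mean`).

```python
import math


def geometric_mean(ratios):
    """n-th root of the product of n ratios."""
    return math.prod(ratios) ** (1 / len(ratios))


# Generation-to-generation ratios listed in 1.7.1.
CLOCK_RATE_RATIOS = [1.28, 1.56, 2.64, 3.03, 10.00, 1.80, 0.74]
POWER_RATIOS = [1.24, 1.20, 2.06, 2.88, 2.59, 1.37, 0.92]

print(f"clock rate: {geometric_mean(CLOCK_RATE_RATIOS):.2f}")
print(f"power:      {geometric_mean(POWER_RATIOS):.2f}")
```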
1.7.2
Largest clock rate ratio = 2000 MHz/200 MHz = 10 (Pentium Pro to Pentium 4 
Willamette)
Largest power ratio = 29.1 W/10.1 W = 2.88 (Pentium to Pentium Pro)
1.7.3
Clock rate: (2.667 × 10^9)/(12.5 × 10^6) = 213.36
Power: 95 W/3.3 W = 28.78
1.7.4  C = P/(V^2 × clock rate)
80286: C = 0.0105 × 10^-6
80386: C = 0.01025 × 10^-6
80486: C = 0.00784 × 10^-6
Pentium: C = 0.00612 × 10^-6
Pentium Pro: C = 0.0133 × 10^-6
Pentium 4 Willamette: C = 0.0122 × 10^-6
Pentium 4 Prescott: C = 0.0183 × 10^-6
Core 2: C = 0.0294 × 10^-6
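As an illustration of the formula in 1.7.4, the Python sketch below computes the capacitive load for the Pentium Pro only, since its power (29.1 W), clock rate (200 MHz), and voltage (3.3 V) all appear elsewhere in this solution. The function name is a hypothetical choice, and the other processors would need the figures from the exercise's table.

```python
def capacitive_load(power_w, voltage_v, clock_hz):
    """C = P / (V^2 x clock rate), from Power = V^2 x clock rate x C."""
    return power_w / (voltage_v ** 2 * clock_hz)


# Pentium Pro figures quoted in 1.7.2 and 1.7.6.
pentium_pro_c = capacitive_load(power_w=29.1, voltage_v=3.3, clock_hz=200e6)
print(f"Pentium Pro: C = {pentium_pro_c / 1e-6:.4f} x 10^-6")
```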
1.7.5  3.3/1.75 = 1.89 (Pentium Pro to Pentium 4 Willamette)
1.7.6
Pentium to Pentium Pro: 3.3/5 = 0.66
Pentium Pro to Pentium 4 Willamette: 1.75/3.3 = 0.53
Pentium 4 Willamette to Pentium 4 Prescott: 1.25/1.75 = 0.71
Pentium 4 Prescott to Core 2: 1.1/1.25 = 0.88
Geometric mean = 0.68
Solution 1.8
1.8.1  Power = V^2 × clock rate × C. Power_2 = 0.9 Power_1
a. C_2/C_1 = 0.9 × 1.75^2 × 1.5 × 10^9/(1.2^2 × 2 × 10^9) = 1.43
b. C_2/C_1 = 0.9 × 1.1^2 × 3 × 10^9/(0.8^2 × 4 × 10^9) = 1.27
1.8.2  Power_2/Power_1 = V_2^2 × clock rate_2/(V_1^2 × clock rate_1)
a. Power_2/Power_1 = 0.62 => Reduction of 38%
b. Power_2/Power_1 = 0.7 => Reduction of 30%
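Both parts of Solution 1.8 follow from Power = V^2 × clock rate × C. The Python sketch below evaluates the two ratios using the voltages and clock rates that appear in 1.8.1; the function names and argument order are assumptions made for illustration.

```python
def power_ratio_same_c(v1, f1, v2, f2):
    """Power_2/Power_1 when the capacitive load is unchanged (1.8.2)."""
    return (v2 ** 2 * f2) / (v1 ** 2 * f1)


def capacitance_ratio(power_ratio, v1, f1, v2, f2):
    """C_2/C_1 for a given Power_2/Power_1 (1.8.1)."""
    return power_ratio / power_ratio_same_c(v1, f1, v2, f2)


# (V_1, f_1, V_2, f_2) read off the expressions in 1.8.1.
CASES = {"a": (1.75, 1.5e9, 1.2, 2e9), "b": (1.1, 3e9, 0.8, 4e9)}

for label, (v1, f1, v2, f2) in CASES.items():
    print(f"{label}. C_2/C_1 = {capacitance_ratio(0.9, v1, f1, v2, f2):.3f}, "
          f"Power_2/Power_1 = {power_ratio_same_c(v1, f1, v2, f2):.3f}")
```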