Computer Organization and Design Revised 4th Solutions (exercise answers, high-definition English)....pdf

1 Solutions

Solution 1.1

1.1.1 Computer used to run large problems and usually accessed via a network: (3) servers
1.1.2 10^15 or 2^50 bytes: (7) petabyte
1.1.3 A class of computers composed of hundreds to thousands of processors and terabytes of memory and having the highest performance and cost: (5) supercomputers
1.1.4 Today’s science fiction application that probably will be available in the near future: (1) virtual worlds
1.1.5 A kind of memory called random access memory: (12) RAM
1.1.6 Part of a computer called central processor unit: (13) CPU
1.1.7 Thousands of processors forming a large cluster: (8) data centers
1.1.8 Microprocessors containing several processors in the same chip: (10) multicore processors
1.1.9 Desktop computer without a screen or keyboard usually accessed via a network: (4) low-end servers
1.1.10 A computer used to run one predetermined application or collection of software: (9) embedded computers
1.1.11 Special language used to describe hardware components: (11) VHDL
1.1.12 Personal computer delivering good performance to single users at low cost: (2) desktop computers
1.1.13 Program that translates statements in high-level language to assembly language: (15) compiler

Sol01-9780123747501.indd S1 9/5/11 11:24 AM
S2 Chapter 1 Solutions

1.1.14 Program that translates symbolic instructions to binary instructions: (21) assembler
1.1.15 High-level language for business data processing: (25) Cobol
1.1.16 Binary language that the processor can understand: (19) machine language
1.1.17 Commands that the processors understand: (17) instruction
1.1.18 High-level language for scientific computation: (26) Fortran
1.1.19 Symbolic representation of machine instructions: (18) assembly language
1.1.20 Interface between user’s program and hardware providing a variety of services and supervision functions: (14) operating system
1.1.21 Software/programs developed by the users: (24) application software
1.1.22 Binary digit (value 0 or 1): (16) bit
1.1.23 Software layer between the application software and the hardware that includes the operating system and the compilers: (23) system software
1.1.24 High-level language used to write application and system software: (20) C
1.1.25 Portable language composed of words and algebraic expressions that must be translated into assembly language before being run on a computer: (22) high-level language
1.1.26 10^12 or 2^40 bytes: (6) terabyte

Solution 1.2

1.2.1 8 bits × 3 colors = 24 bits/pixel = 3 bytes/pixel.
a. Configuration 1: 640 × 480 pixels = 307,200 pixels => 307,200 × 3 = 921,600 bytes/frame
   Configuration 2: 1280 × 1024 pixels = 1,310,720 pixels => 1,310,720 × 3 = 3,932,160 bytes/frame
b. Configuration 1: 1024 × 768 pixels = 786,432 pixels => 786,432 × 3 = 2,359,296 bytes/frame
   Configuration 2: 2560 × 1600 pixels = 4,096,000 pixels => 4,096,000 × 3 = 12,288,000 bytes/frame
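The frame-size arithmetic in 1.2.1 can be checked with a few lines of Python. This is a minimal sketch: the function name `frame_bytes` and its default of 3 bytes per pixel (8 bits × 3 colors) are illustrative choices, not part of the original solution.

```python
def frame_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Bytes needed to store one frame at the given resolution."""
    return width * height * bytes_per_pixel


# Resolutions listed in Solution 1.2.1 (parts a and b).
resolutions = [(640, 480), (1280, 1024), (1024, 768), (2560, 1600)]

for width, height in resolutions:
    print(f"{width} x {height}: {frame_bytes(width, height):,} bytes/frame")
```

Using integer arithmetic throughout keeps the byte counts exact, which matters when the result is later divided into a memory capacity.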
1.2.2 No. frames = integer part of (Capacity of main memory/bytes per frame)
a. Configuration 1: Main memory: 2 GB = 2000 Mbytes. Frame: 640 × 480 × 3 = 921,600 bytes = 0.9216 Mbytes => No. frames = 2170
   Configuration 2: Main memory: 4 GB = 4000 Mbytes. Frame: 3,932,160 bytes = 3.93216 Mbytes => No. frames = 1017
b. Configuration 1: Main memory: 2 GB = 2000 Mbytes. Frame: 2,359,296 bytes = 2.359296 Mbytes => No. frames = 847
   Configuration 2: Main memory: 4 GB = 4000 Mbytes. Frame: 12,288,000 bytes = 12.288 Mbytes => No. frames = 325

1.2.3 File size: 256 Kbytes = 0.256 Mbytes. Same solution for a) and b)
Configuration 1: Network speed: 100 Mbit/sec = 12.5 Mbytes/sec. Time = 0.256/12.5 = 20.48 ms
Configuration 2: Network speed: 1 Gbit/sec = 125 Mbytes/sec. Time = 0.256/125 = 2.048 ms

1.2.4
a. 2 microseconds from cache ⇒ 20 microseconds from DRAM.
b. 2 microseconds from cache ⇒ 20 microseconds from DRAM.

1.2.5
a. 2 microseconds from cache ⇒ 2 ms from Flash memory.
b. 2 microseconds from cache ⇒ 4.28 ms from Flash memory.

1.2.6
a. 2 microseconds from cache ⇒ 2 s from magnetic disk.
b. 2 microseconds from cache ⇒ 5.7 s from magnetic disk.

Solution 1.3

1.3.1 P2 has the highest performance. Instr/sec = f/CPI
a. performance of P1 (instructions/sec) = 3 × 10^9/1.5 = 2 × 10^9
   performance of P2 (instructions/sec) = 2.5 × 10^9/1.0 = 2.5 × 10^9
   performance of P3 (instructions/sec) = 4 × 10^9/2.2 = 1.8 × 10^9
b. performance of P1 (instructions/sec) = 2 × 10^9/1.2 = 1.66 × 10^9
   performance of P2 (instructions/sec) = 3 × 10^9/0.8 = 3.75 × 10^9
   performance of P3 (instructions/sec) = 4 × 10^9/2 = 2 × 10^9
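The relation Instr/sec = f/CPI from 1.3.1 can be sketched as follows. The clock rates and CPI values are those of part a; the names `instructions_per_second`, `processors`, and `rates` are hypothetical and used only for illustration.

```python
def instructions_per_second(clock_hz: float, cpi: float) -> float:
    """Instruction throughput: clock rate divided by cycles per instruction."""
    return clock_hz / cpi


# (clock rate in Hz, CPI) for part a of Solution 1.3.1.
processors = {"P1": (3e9, 1.5), "P2": (2.5e9, 1.0), "P3": (4e9, 2.2)}

rates = {
    name: instructions_per_second(clock_hz, cpi)
    for name, (clock_hz, cpi) in processors.items()
}
fastest = max(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name}: {rate:.2e} instructions/sec")
print("Highest performance:", fastest)
```

Note that P3 has the highest clock rate but not the highest throughput, because its CPI is also the largest.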
1.3.2 No. cycles = time × clock rate. Time = (No. Instr × CPI)/clock rate, then No. instructions = No. cycles/CPI
a. cycles(P1) = 10 × 3 × 10^9 = 30 × 10^9
   cycles(P2) = 10 × 2.5 × 10^9 = 25 × 10^9
   cycles(P3) = 10 × 4 × 10^9 = 40 × 10^9
   No. instructions(P1) = 30 × 10^9/1.5 = 20 × 10^9
   No. instructions(P2) = 25 × 10^9/1 = 25 × 10^9
   No. instructions(P3) = 40 × 10^9/2.2 = 18.18 × 10^9
b. cycles(P1) = 10 × 2 × 10^9 = 20 × 10^9
   cycles(P2) = 10 × 3 × 10^9 = 30 × 10^9
   cycles(P3) = 10 × 4 × 10^9 = 40 × 10^9
   No. instructions(P1) = 20 × 10^9/1.2 = 16.66 × 10^9
   No. instructions(P2) = 30 × 10^9/0.8 = 37.5 × 10^9
   No. instructions(P3) = 40 × 10^9/2 = 20 × 10^9

1.3.3 time_new = time_old × 0.7 = 7 s
a. CPI_new = CPI_old × 1.2, then CPI(P1) = 1.8, CPI(P2) = 1.2, CPI(P3) = 2.64
   f = No. Instr × CPI/time, then
   f(P1) = 20 × 10^9 × 1.8/7 = 5.14 GHz
   f(P2) = 25 × 10^9 × 1.2/7 = 4.28 GHz
   f(P3) = 18.18 × 10^9 × 2.64/7 = 6.85 GHz
b. CPI_new = CPI_old × 1.2, then CPI(P1) = 1.44, CPI(P2) = 0.96, CPI(P3) = 2.4
   f = No. Instr × CPI/time, then
   f(P1) = 16.66 × 10^9 × 1.44/7 = 3.42 GHz
   f(P2) = 37.5 × 10^9 × 0.96/7 = 5.14 GHz
   f(P3) = 20 × 10^9 × 2.4/7 = 6.85 GHz

1.3.4 IPC = 1/CPI = No. instr/(time × clock rate)
a. IPC(P1) = 0.95, IPC(P2) = 1.2, IPC(P3) = 2.5
b. IPC(P1) = 2, IPC(P2) = 1.25, IPC(P3) = 0.89

1.3.5
a. Time_new/Time_old = 7/10 = 0.7. So f_new = f_old/0.7 = 2.5 GHz/0.7 = 3.57 GHz.
b. Time_new/Time_old = 5/8 = 0.625. So f_new = f_old/0.625 = 4.8 GHz.
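The chain of formulas in 1.3.2 and 1.3.3 (cycles from time and clock rate, instruction count from cycles and CPI, then the clock rate needed for a shorter run at a higher CPI) can be sketched as below. The inputs are the part a values for P1 and P2; the function names are hypothetical. Printed values are rounded to two decimals, so the last digit can differ from worked answers that truncate.

```python
def cycles(time_s: float, clock_hz: float) -> float:
    """No. cycles = time x clock rate."""
    return time_s * clock_hz


def instruction_count(n_cycles: float, cpi: float) -> float:
    """No. instructions = No. cycles / CPI."""
    return n_cycles / cpi


def required_clock(n_instr: float, cpi: float, time_s: float) -> float:
    """f = No. instr x CPI / time."""
    return n_instr * cpi / time_s


# Part a inputs: 10 s run time, (clock rate in Hz, CPI) per processor.
for name, clock_hz, cpi in [("P1", 3e9, 1.5), ("P2", 2.5e9, 1.0)]:
    n_cycles = cycles(10, clock_hz)
    n_instr = instruction_count(n_cycles, cpi)
    # 1.3.3: run time reduced by 30% (7 s) while CPI grows by 20%.
    f_new = required_clock(n_instr, cpi * 1.2, 10 * 0.7)
    print(f"{name}: {n_cycles:.3g} cycles, {n_instr:.3g} instructions, "
          f"new clock {f_new / 1e9:.2f} GHz")
```

Because the instruction count is unchanged, the result reduces to f_new = f_old × 1.2/0.7, which is a quick sanity check on each answer.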
1.3.6
a. Time_new/Time_old = 9/10 = 0.9. Then Instructions_new = Instructions_old × 0.9 = 30 × 10^9 × 0.9 = 27 × 10^9.
b. Time_new/Time_old = 7/8 = 0.875. Then Instructions_new = Instructions_old × 0.875 = 26.25 × 10^9.

Solution 1.4

1.4.1 Class A: 10^5 instr. Class B: 2 × 10^5 instr. Class C: 5 × 10^5 instr. Class D: 2 × 10^5 instr.
Time = No. instr × CPI/clock rate
a. Total time P1 = (10^5 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 3)/(2.5 × 10^9) = 10.4 × 10^-4 s
   Total time P2 = (10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2)/(3 × 10^9) = 6.66 × 10^-4 s
b. Total time P1 = (10^5 × 2 + 2 × 10^5 × 1.5 + 5 × 10^5 × 2 + 2 × 10^5)/(2.5 × 10^9) = 6.8 × 10^-4 s
   Total time P2 = (10^5 + 2 × 10^5 × 2 + 5 × 10^5 + 2 × 10^5)/(3 × 10^9) = 4 × 10^-4 s

1.4.2 CPI = time × clock rate/No. instr
a. CPI (P1) = 10.4 × 10^-4 × 2.5 × 10^9/10^6 = 2.6
   CPI (P2) = 6.66 × 10^-4 × 3 × 10^9/10^6 = 2.0
b. CPI (P1) = 6.8 × 10^-4 × 2.5 × 10^9/10^6 = 1.7
   CPI (P2) = 4 × 10^-4 × 3 × 10^9/10^6 = 1.2

1.4.3
a. clock cycles (P1) = 10^5 × 1 + 2 × 10^5 × 2 + 5 × 10^5 × 3 + 2 × 10^5 × 3 = 26 × 10^5
   clock cycles (P2) = 10^5 × 2 + 2 × 10^5 × 2 + 5 × 10^5 × 2 + 2 × 10^5 × 2 = 20 × 10^5
b. clock cycles (P1) = 17 × 10^5
   clock cycles (P2) = 12 × 10^5

1.4.4
a. (650 × 1 + 100 × 5 + 600 × 5 + 50 × 2) × 0.5 × 10^-9 s = 2,125 ns
b. (750 × 1 + 250 × 5 + 500 × 5 + 500 × 2) × 0.5 × 10^-9 s = 2,750 ns

1.4.5 CPI = time × clock rate/No. instr
a. CPI = 2,125 × 10^-9 × 2 × 10^9/1,400 = 3.03
b. CPI = 2,750 × 10^-9 × 2 × 10^9/2,000 = 2.75
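The calculations in 1.4.1 through 1.4.3 are all weighted sums over instruction classes. A minimal sketch, using the part a counts and per-class CPI values that appear in the worked expressions; the function and variable names are hypothetical:

```python
def total_cycles(counts, cpis):
    """Clock cycles = sum over classes of (instruction count x class CPI)."""
    return sum(n * cpi for n, cpi in zip(counts, cpis))


def execution_time(counts, cpis, clock_hz):
    """Time = total clock cycles / clock rate."""
    return total_cycles(counts, cpis) / clock_hz


def average_cpi(counts, cpis):
    """Overall CPI = total clock cycles / total instruction count."""
    return total_cycles(counts, cpis) / sum(counts)


# Instruction counts for classes A to D (10^6 instructions in total).
counts = [100_000, 200_000, 500_000, 200_000]

# Part a: per-class CPIs and clock rates read from the expressions in 1.4.1.
p1 = {"cpis": [1, 2, 3, 3], "clock_hz": 2.5e9}
p2 = {"cpis": [2, 2, 2, 2], "clock_hz": 3.0e9}

for name, cpu in [("P1", p1), ("P2", p2)]:
    t = execution_time(counts, cpu["cpis"], cpu["clock_hz"])
    cpi = average_cpi(counts, cpu["cpis"])
    print(f"{name}: {total_cycles(counts, cpu['cpis'])} cycles, "
          f"{t:.3e} s, CPI {cpi:.2f}")
```

Computing the average CPI directly from cycles and instruction count avoids the rounding that creeps in when it is derived from an already rounded execution time.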
1.4.6
a. Time = (650 × 1 + 100 × 5 + 300 × 5 + 50 × 2) × 0.5 × 10^-9 s = 1,375 ns
   Speedup = 2,125 ns/1,375 ns = 1.54
   CPI = 1,375 × 10^-9 × 2 × 10^9/1,100 = 2.5
b. Time = (750 × 1 + 250 × 5 + 250 × 5 + 500 × 2) × 0.5 × 10^-9 s = 2,125 ns
   Speedup = 2,750 ns/2,125 ns = 1.29
   CPI = 2,125 × 10^-9 × 2 × 10^9/1,750 = 2.43

Solution 1.5

1.5.1
a. P1: 2 × 10^9 inst/sec, P2: 2 × 10^9 inst/sec
b. P1: 2 × 10^9 inst/sec, P2: 3 × 10^9 inst/sec

1.5.2
a. T(P2)/T(P1) = 4/7; P2 is 1.75 times faster than P1
b. T(P2)/T(P1) = 4.66/5; P2 is 1.07 times faster than P1

1.5.3
a. T(P2)/T(P1) = 4.5/8; P2 is 1.77 times faster than P1
b. T(P2)/T(P1) = 5.33/5.5; P2 is 1.03 times faster than P1

1.5.4
a. 2.91 µs
b. 2.50 µs

1.5.5
a. 0.78 µs
b. 0.90 µs

1.5.6
a. T = 0.68 µs => 1.14 times faster
b. T = 0.75 µs => 1.20 times faster
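The speedup and CPI figures in 1.4.6 part a can be reproduced with the sketch below. It assumes the baseline instruction counts from 1.4.4 part a, the same per-class CPIs (1, 5, 5, 2), and a 0.5 ns cycle time, all read from the worked expressions; the names are hypothetical.

```python
CYCLE_TIME_NS = 0.5      # 0.5 ns cycle time, i.e. a 2 GHz clock
CLASS_CPIS = [1, 5, 5, 2]  # per-class CPI, in the order used in the solution


def exec_time_ns(counts, cpis=CLASS_CPIS, cycle_time_ns=CYCLE_TIME_NS):
    """Execution time in ns: total clock cycles x cycle time."""
    return sum(n * cpi for n, cpi in zip(counts, cpis)) * cycle_time_ns


baseline = [650, 100, 600, 50]  # instruction counts before the change
reduced = [650, 100, 300, 50]   # third class cut in half

t_old = exec_time_ns(baseline)
t_new = exec_time_ns(reduced)
speedup = t_old / t_new
cpi_new = t_new / CYCLE_TIME_NS / sum(reduced)

print(f"old time {t_old:.0f} ns, new time {t_new:.0f} ns")
print(f"speedup {speedup:.3f}, new CPI {cpi_new:.2f}")
```

The CPI drops even though no per-class CPI changed, because the halved class was one of the expensive ones and now makes up a smaller share of the instruction mix.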
Solution 1.6

1.6.1 CPI = T_exec × f/No. Instr
a. Compiler A CPI = 1.8, Compiler B CPI = 1.5
b. Compiler A CPI = 1.1, Compiler B CPI = 1.25

1.6.2 f_A/f_B = (No. Instr(A) × CPI(A))/(No. Instr(B) × CPI(B))
a. f_A/f_B = 1
b. f_A/f_B = 0.73

1.6.3
a. Speedup vs. Compiler A: T_new/T_A = 0.36. Speedup vs. Compiler B: T_new/T_B = 0.36
b. Speedup vs. Compiler A: T_new/T_A = 0.6. Speedup vs. Compiler B: T_new/T_B = 0.44

1.6.4
a. P1 Peak: 4 × 10^9 Inst/s. P2 Peak: 2 × 10^9 Inst/s
b. P1 Peak: 4 × 10^9 Inst/s. P2 Peak: 3 × 10^9 Inst/s

1.6.5 Speedup, P1 versus P2:
a. T1/T2 = 1.9
b. T1/T2 = 1.5

1.6.6
a. 4.37 GHz
b. 6 GHz

Solution 1.7

1.7.1
Geometric mean clock rate ratio = (1.28 × 1.56 × 2.64 × 3.03 × 10.00 × 1.80 × 0.74)^(1/7) = 2.15
Geometric mean power ratio = (1.24 × 1.20 × 2.06 × 2.88 × 2.59 × 1.37 × 0.92)^(1/7) = 1.62
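The geometric means in 1.7.1 are the n-th root of the product of n generation-to-generation ratios. A minimal sketch using the ratios listed in the solution; the helper name `geometric_mean` is an illustrative choice (Python's standard library also offers `statistics.geometric_mean`).

```python
import math


def geometric_mean(ratios):
    """n-th root of the product of n positive ratios."""
    return math.prod(ratios) ** (1 / len(ratios))


# Generation-to-generation ratios as listed in Solution 1.7.1.
clock_ratios = [1.28, 1.56, 2.64, 3.03, 10.00, 1.80, 0.74]
power_ratios = [1.24, 1.20, 2.06, 2.88, 2.59, 1.37, 0.92]

print(f"clock rate: {geometric_mean(clock_ratios):.2f}")
print(f"power:      {geometric_mean(power_ratios):.2f}")
```

The geometric mean is the natural average here because the ratios compound multiplicatively from one processor generation to the next.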
1.7.2
Largest clock rate ratio = 2000 MHz/200 MHz = 10 (Pentium Pro to Pentium 4 Willamette)
Largest power ratio = 29.1 W/10.1 W = 2.88 (Pentium to Pentium Pro)

1.7.3
Clock rate: 2.667 × 10^9/(12.5 × 10^6) = 213.36
Power: 95 W/3.3 W = 28.78

1.7.4 C = P/(V^2 × clock rate)
80286: C = 0.0105 × 10^-6
80386: C = 0.01025 × 10^-6
80486: C = 0.00784 × 10^-6
Pentium: C = 0.00612 × 10^-6
Pentium Pro: C = 0.0133 × 10^-6
Pentium 4 Willamette: C = 0.0122 × 10^-6
Pentium 4 Prescott: C = 0.0183 × 10^-6
Core 2: C = 0.0294 × 10^-6

1.7.5 3.3/1.75 = 1.88 (Pentium Pro to Pentium 4 Willamette)

1.7.6
Pentium to Pentium Pro: 3.3/5 = 0.66
Pentium Pro to Pentium 4 Willamette: 1.75/3.3 = 0.53
Pentium 4 Willamette to Pentium 4 Prescott: 1.25/1.75 = 0.71
Pentium 4 Prescott to Core 2: 1.1/1.25 = 0.88
Geometric mean = 0.68

Solution 1.8

1.8.1 Power = V^2 × clock rate × C. Power_2 = 0.9 × Power_1
a. C_2/C_1 = 0.9 × 1.75^2 × 1.5 × 10^9/(1.2^2 × 2 × 10^9) = 1.43
b. C_2/C_1 = 0.9 × 1.1^2 × 3 × 10^9/(0.8^2 × 4 × 10^9) = 1.27

1.8.2 Power_2/Power_1 = V_2^2 × clock rate_2/(V_1^2 × clock rate_1)
a. Power_2/Power_1 = 0.62 => Reduction of 38%
b. Power_2/Power_1 = 0.7 => Reduction of 30%
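The relation Power = V^2 × clock rate × C used in 1.7.4, 1.8.1, and 1.8.2 can be sketched as below. The Pentium Pro and Core 2 inputs (power, voltage, clock rate) are pieced together from the figures quoted in 1.7.2, 1.7.3, and 1.7.6, and the part a voltages and clock rates come from the expression in 1.8.1; treating (1.75 V, 1.5 GHz) as the old design and (1.2 V, 2 GHz) as the new one is an inference from that expression. The function names are hypothetical.

```python
def capacitive_load(power_w: float, voltage_v: float, clock_hz: float) -> float:
    """C = P / (V^2 x clock rate)."""
    return power_w / (voltage_v ** 2 * clock_hz)


def power_ratio(v_new: float, f_new: float, v_old: float, f_old: float) -> float:
    """Power_2/Power_1 at equal capacitive load: V_2^2 f_2 / (V_1^2 f_1)."""
    return (v_new ** 2 * f_new) / (v_old ** 2 * f_old)


# 1.7.4: values assumed from 1.7.2, 1.7.3, and 1.7.6.
print(f"Pentium Pro C = {capacitive_load(29.1, 3.3, 200e6):.3e}")
print(f"Core 2      C = {capacitive_load(95, 1.1, 2.667e9):.3e}")

# 1.8.2 part a: new design at 1.2 V / 2 GHz, old design at 1.75 V / 1.5 GHz.
ratio = power_ratio(1.2, 2e9, 1.75, 1.5e9)
print(f"Power_2/Power_1 = {ratio:.3f}")

# 1.8.1 part a: capacitive load ratio that yields Power_2 = 0.9 x Power_1.
c_ratio = 0.9 / ratio
print(f"C_2/C_1 = {c_ratio:.3f}")
```

Writing 1.8.1 as 0.9 divided by the 1.8.2 ratio makes the link between the two parts explicit: both follow from the same power equation, with either C or the power target held fixed.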