SOLUTIONS MANUAL
COMPUTER ORGANIZATION AND
ARCHITECTURE
DESIGNING FOR PERFORMANCE
EIGHTH EDITION
WILLIAM STALLINGS
Copyright 2009: William Stallings
All rights reserved. No part of this document may be reproduced, in any form or
by any means, or posted on the Internet, without permission in writing from the
author. Selected solutions may be shared with students, provided that they are
not available, unsecured, on the Web.
NOTICE
This manual contains solutions to the review questions and
homework problems in Computer Organization and Architecture,
Eighth Edition. If you spot an error in a solution or in the
wording of a problem, I would greatly appreciate it if you
would forward the information via email to ws@shore.net.
An errata sheet for this manual, if needed, is available at
http://www.box.net/shared/q4a7bmmtyc . File name is
S-COA8e-mmyy
W.S.
TABLE OF CONTENTS
Chapter 1 Introduction
Chapter 2 Computer Evolution and Performance
Chapter 3 Computer Function and Interconnection
Chapter 4 Cache Memory
Chapter 5 Internal Memory
Chapter 6 External Memory
Chapter 7 Input/Output
Chapter 8 Operating System Support
Chapter 9 Computer Arithmetic
Chapter 10 Instruction Sets: Characteristics and Functions
Chapter 11 Instruction Sets: Addressing Modes and Formats
Chapter 12 Processor Structure and Function
Chapter 13 Reduced Instruction Set Computers
Chapter 14 Instruction-Level Parallelism and Superscalar Processors
Chapter 15 Control Unit Operation
Chapter 16 Microprogrammed Control
Chapter 17 Parallel Processing
Chapter 18 Multicore Computers
Chapter 19 Number Systems
Chapter 20 Digital Logic
Chapter 21 The IA-64 Architecture
Appendix B Assembly Language and Related Topics
CHAPTER 1 INTRODUCTION
ANSWERS TO QUESTIONS
1.1 Computer architecture refers to those attributes of a system visible to a
programmer or, put another way, those attributes that have a direct impact on the
logical execution of a program. Computer organization refers to the operational
units and their interconnections that realize the architectural specifications.
Examples of architectural attributes include the instruction set, the number of bits
used to represent various data types (e.g., numbers, characters), I/O mechanisms,
and techniques for addressing memory. Organizational attributes include those
hardware details transparent to the programmer, such as control signals;
interfaces between the computer and peripherals; and the memory technology
used.
1.2 Computer structure refers to the way in which the components of a computer are
interrelated. Computer function refers to the operation of each individual
component as part of the structure.
1.3 Data processing; data storage; data movement; and control.
1.4 Central processing unit (CPU): Controls the operation of the computer and
performs its data processing functions; often simply referred to as processor.
Main memory: Stores data.
I/O: Moves data between the computer and its external environment.
System interconnection: Some mechanism that provides for communication
among CPU, main memory, and I/O. A common example of system
interconnection is by means of a system bus, consisting of a number of conducting
wires to which all the other components attach.
1.5 Control unit: Controls the operation of the CPU and hence the computer.
Arithmetic and logic unit (ALU): Performs the computer’s data processing
functions.
Registers: Provide storage internal to the CPU.
CPU interconnection: Some mechanism that provides for communication among
the control unit, ALU, and registers.
CHAPTER 2 COMPUTER EVOLUTION AND
PERFORMANCE
ANSWERS TO QUESTIONS
2.1 In a stored program computer, programs are represented in a form suitable for
storing in memory alongside the data. The computer gets its instructions by reading
them from memory, and a program can be set or altered by setting the values of a
portion of memory.
2.2 A main memory, which stores both data and instructions; an arithmetic and logic
unit (ALU) capable of operating on binary data; a control unit, which interprets
the instructions in memory and causes them to be executed; and input and output
(I/O) equipment operated by the control unit.
2.3 Gates, memory cells, and interconnections among gates and memory cells.
2.4 Moore observed that the number of transistors that could be put on a single chip
was doubling every year and correctly predicted that this pace would continue
into the near future.
2.5 Similar or identical instruction set: In many cases, the same set of machine
instructions is supported on all members of the family. Thus, a program that
executes on one machine will also execute on any other. Similar or identical
operating system: The same basic operating system is available for all family
members. Increasing speed: The rate of instruction execution increases in going
from lower to higher family members. Increasing number of I/O ports: In going
from lower to higher family members. Increasing memory size: In going from
lower to higher family members. Increasing cost: In going from lower to higher
family members.
2.6 In a microprocessor, all of the components of the CPU are on a single chip.
ANSWERS TO PROBLEMS
2.1 This program is developed in [HAYE98]. The vectors A, B, and C are each stored
in 1,000 contiguous locations in memory, beginning at locations 1001, 2001, and
3001, respectively. The program begins with the left half of location 3. A counting
variable N is set to 999 and decremented after each step until it reaches –1. Thus,
the vectors are processed from high location to low location.
Location  Instruction         Comments
0         999                 Constant (count N)
1         1                   Constant
2         1000                Constant
3L        LOAD M(2000)        Transfer A(I) to AC
3R        ADD M(3000)         Compute A(I) + B(I)
4L        STOR M(4000)        Transfer sum to C(I)
4R        LOAD M(0)           Load count N
5L        SUB M(1)            Decrement N by 1
5R        JUMP+ M(6, 20:39)   Test N and branch to 6R if nonnegative
6L        JUMP M(6, 0:19)     Halt
6R        STOR M(0)           Update N
7L        ADD M(1)            Increment AC by 1
7R        ADD M(2)
8L        STOR M(3, 8:19)     Modify address in 3L
8R        ADD M(2)
9L        STOR M(3, 28:39)    Modify address in 3R
9R        ADD M(2)
10L       STOR M(4, 8:19)     Modify address in 4L
10R       JUMP M(3, 0:19)     Branch to 3L
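The control flow of the program above can be mirrored in a short high-level sketch. This is only an illustration in Python, not IAS code: the function name is hypothetical, and the address patching done in locations 6R through 10L is replaced by ordinary list indexing. It shows the same high-to-low traversal, with the count starting at the last index and the loop stopping once the count goes negative.

```python
def vector_add_high_to_low(a, b):
    """Add two equal-length vectors, walking from the last element down to the first.

    Hypothetical helper that mirrors the loop structure of the IAS program:
    the count N starts at len(a) - 1 (999 in the manual) and the loop ends
    when N becomes negative.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if not a:
        return []
    c = [0] * len(a)
    n = len(a) - 1              # location 0: constant count N
    while True:
        c[n] = a[n] + b[n]      # 3L, 3R, 4L: load A(I), add B(I), store C(I)
        n -= 1                  # 4R, 5L: load N and subtract 1
        if n < 0:               # 5R: the JUMP+ is not taken when N is negative
            break               # 6L: halt
        # 6R to 10R: store N and patch the three addresses, then branch to 3L.
        # In this sketch the patched addresses are simply the index n.
    return c
```

For three-element vectors the elements are processed in the order 2, 1, 0, which corresponds to the manual's statement that the vectors are processed from high location to low location.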
2.2 a.
Opcode    Operand
00000001  000000000010
b. First, the CPU must access memory to fetch the instruction. The
instruction contains the address of the data we want to load. During the execute
phase, the CPU accesses memory again to load the data value located at that
address, for a total of two trips to memory.
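The bit pattern in part (a) is an 8-bit opcode followed by a 12-bit address. As a minimal sketch, assuming that layout, the packing can be written as follows; the function name is hypothetical and chosen only for illustration.

```python
def encode_ias_instruction(opcode: int, address: int) -> str:
    """Return an 8-bit opcode and a 12-bit address as a binary string.

    Hypothetical helper: it only formats the two fields, separated by a
    space, in the same layout as the answer to part (a).
    """
    if not 0 <= opcode < 2 ** 8:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address < 2 ** 12:
        raise ValueError("address must fit in 12 bits")
    return f"{opcode:08b} {address:012b}"


# Opcode 00000001 with operand address 2, as in part (a).
print(encode_ias_instruction(0b00000001, 2))
```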
2.3 To read a value from memory, the CPU puts the address of the value it wants into
the MAR. The CPU then asserts the Read control line to memory and places the
address on the address bus. Memory places the contents of the addressed location
on the data bus. This data is then transferred to the MBR. To write a value
to memory, the CPU puts the address of the value it wants to write into the MAR.
The CPU also places the data it wants to write into the MBR. The CPU then asserts
the Write control line to memory and places the address on the address bus and
the data on the data bus. Memory transfers the data on the data bus into the
corresponding memory location.
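The read and write sequences just described can be sketched in a few lines. This is a simplified model under stated assumptions: the class name, the 4096-word size (matching a 12-bit address), and the use of plain Python attributes for the registers are all illustrative, and the buses and control lines appear only as comments.

```python
class SimpleMemorySystem:
    """Toy model of a CPU-side MAR and MBR in front of a word-addressed memory."""

    def __init__(self, size: int = 4096):
        self.cells = [0] * size   # main memory
        self.mar = 0              # memory address register
        self.mbr = 0              # memory buffer register

    def read(self, address: int) -> int:
        self.mar = address                  # CPU puts the address into the MAR
        # Read line asserted; MAR contents go out on the address bus.
        data_bus = self.cells[self.mar]     # memory drives the data bus
        self.mbr = data_bus                 # data is transferred to the MBR
        return self.mbr

    def write(self, address: int, value: int) -> None:
        self.mar = address                  # address of the target location
        self.mbr = value                    # data to be written
        # Write line asserted; address and data go out on their buses.
        self.cells[self.mar] = self.mbr     # memory stores the data


memory = SimpleMemorySystem()
memory.write(0x0FA, 7)
print(memory.read(0x0FA))
```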
2.4
Address  Contents
08A      LOAD M(0FA)
         STOR M(0FB)
08B      LOAD M(0FA)
         JUMP +M(08D)
08C      LOAD –M(0FA)
         STOR M(0FB)
08D
This program will store the absolute value of the contents of memory location 0FA into
memory location 0FB.
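The effect of this program can be checked with a small sketch that steps through the six instructions in order. It assumes two instructions per word (left, then right), models memory as a Python dictionary, and uses a hypothetical function name; it is an illustration, not an IAS emulator.

```python
def run_abs_program(memory: dict) -> dict:
    """Step through the program at 08A to 08C and return the updated memory.

    Assumes each word holds a left and a right instruction, and that
    memory maps addresses to signed integers.
    """
    ac = memory[0x0FA]       # 08A left:  LOAD M(0FA)
    memory[0x0FB] = ac       # 08A right: STOR M(0FB)
    ac = memory[0x0FA]       # 08B left:  LOAD M(0FA)
    if ac >= 0:              # 08B right: JUMP +M(08D), taken if AC is nonnegative
        return memory        # 08D: end of the program
    ac = -memory[0x0FA]      # 08C left:  LOAD -M(0FA)
    memory[0x0FB] = ac       # 08C right: STOR M(0FB)
    return memory


print(run_abs_program({0x0FA: -5, 0x0FB: 0})[0x0FB])
```

For a nonnegative value the first store already leaves the correct result in 0FB and the jump skips the negation; for a negative value the second store overwrites it with the negated value.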
2.5 All data paths to/from MBR are 40 bits. All data paths to/from MAR are 12 bits.
Paths to/from AC are 40 bits. Paths to/from MQ are 40 bits.
2.6 The purpose is to increase performance. When an address is presented to a memory
module, there is some time delay before the read or write operation can be
performed. While this is happening, an address can be presented to the other
module. For a series of requests for successive words, the maximum rate is
doubled.
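The doubling in the answer to 2.6 can be illustrated with a toy timing model. All parameters here are assumptions made for the sketch: each module is busy for 2 time units per access, a new address can be issued every 1 time unit, and successive words alternate between modules. The function name is hypothetical.

```python
def time_for_successive_words(n_words: int, n_modules: int,
                              module_time: int = 2, issue_time: int = 1) -> int:
    """Return the time at which the last of n_words accesses completes.

    Toy model: word i goes to module i % n_modules, a module handles one
    access at a time, and addresses are issued at most one per issue_time.
    """
    free_at = [0] * n_modules   # time at which each module becomes idle
    now = 0                     # earliest time the next address can be issued
    finish = 0
    for word in range(n_words):
        module = word % n_modules
        start = max(now, free_at[module])
        free_at[module] = start + module_time
        finish = free_at[module]
        now = start + issue_time
    return finish


print(time_for_successive_words(8, 1))  # single module
print(time_for_successive_words(8, 2))  # two interleaved modules
```

Under these assumed timings, eight successive words take 16 time units with one module and 9 with two, and the ratio approaches 2 as the number of words grows.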
2.7 The discrepancy can be explained by noting that factors other than clock
speed make a big difference in overall system speed. In particular, memory systems and
advances in I/O processing contribute to the performance ratio. A system is only as fast as
its slowest link. In recent years, the bottlenecks have been the performance of memory
modules and bus speed.
2.8 As noted in the answer to Problem 2.7, even though the Intel machine may have a
faster clock speed (2.4 GHz vs. 1.2 GHz), that does not necessarily mean the system
will perform faster. Different systems are not comparable on clock speed. Other
factors such as the system components (memory, buses, architecture) and the
instruction sets must also be taken into account. A more accurate measure is to run
both systems on a benchmark. Benchmark programs exist for certain tasks, such as
running office applications, performing floating-point operations, graphics
operations, and so on. The systems can be compared to each other on how long
they take to complete these tasks. According to Apple Computer, the G4 is
comparable to or better than a higher-clock-speed Pentium on many benchmarks.
2.9 This representation is wasteful because to represent a single decimal digit from 0
through 9 we need to have ten tubes. If we could have an arbitrary number of
these tubes ON at the same time, then those same tubes could be treated as binary
bits. With ten bits, we can represent 2^10 patterns, or 1024 patterns. For integers,
these patterns could be used to represent the numbers from 0 through 1023.
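The count in the answer to 2.9 can be confirmed by enumerating both encodings. This is a small sketch with hypothetical function names: the first function lists the patterns with exactly one tube ON, and the second lists every ON/OFF combination of the same tubes.

```python
from itertools import product


def one_of_ten_patterns(tubes: int = 10) -> list:
    """Patterns with exactly one tube ON, one pattern per decimal digit."""
    return [tuple(1 if i == digit else 0 for i in range(tubes))
            for digit in range(tubes)]


def binary_patterns(tubes: int = 10) -> list:
    """All ON/OFF combinations when any number of tubes may be ON."""
    return list(product((0, 1), repeat=tubes))


print(len(one_of_ten_patterns()))  # one pattern per digit 0 through 9
print(len(binary_patterns()))      # every combination of ten binary tubes
```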
2.10 CPI = 1.55; MIPS rate = 25.8; Execution time = 3.87 ms. Source: [HWAN93]
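The three quantities are related by CPI = (total cycles) / (instruction count), MIPS rate = f / (CPI × 10^6), and T = Ic × CPI / f. The sketch below evaluates them. The 40 MHz clock and the instruction mix are assumptions recalled from the problem statement, which is not reproduced in this manual, so they should be checked against the textbook; the function names are hypothetical.

```python
def cycles_per_instruction(mix):
    """Average CPI for a list of (instruction_count, cycles_per_instruction) pairs."""
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cycles for count, cycles in mix)
    return total_cycles / total_instructions


def mips_rate(clock_hz, cpi):
    """Millions of instructions per second for a given clock rate and CPI."""
    return clock_hz / (cpi * 1e6)


def execution_time(instruction_count, cpi, clock_hz):
    """Execution time in seconds: Ic * CPI / f."""
    return instruction_count * cpi / clock_hz


# ASSUMED inputs (not given in this manual): 40 MHz clock, 100,000 instructions.
assumed_clock_hz = 40e6
assumed_mix = [
    (45_000, 1),  # integer arithmetic
    (32_000, 2),  # data transfer
    (15_000, 2),  # floating point
    (8_000, 2),   # control transfer
]

cpi = cycles_per_instruction(assumed_mix)
instruction_count = sum(count for count, _ in assumed_mix)
print(cpi, mips_rate(assumed_clock_hz, cpi),
      execution_time(instruction_count, cpi, assumed_clock_hz))
```

Under these assumed inputs the functions give a CPI of 1.55, a MIPS rate of about 25.8, and an execution time of about 3.875 × 10^-3 seconds.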