Foreign-language source text
Object landscapes and lifetimes
Technically, OOP is just about abstract data typing, inheritance, and
polymorphism, but other issues can be at least as important. The remainder of this
section will cover these issues.
One of the most important factors is the way objects are created and destroyed.
Where is the data for an object and how is the lifetime of the object controlled? There
are different philosophies at work here. C++ takes the approach that control of
efficiency is the most important issue, so it gives the programmer a choice. For
maximum run-time speed, the storage and lifetime can be determined while the
program is being written, by placing the objects on the stack (these are sometimes
called automatic or scoped variables) or in the static storage area. This places a
priority on the speed of storage allocation and release, and control of these can be
very valuable in some situations. However, you sacrifice flexibility because you must
know the exact quantity, lifetime, and type of objects while you're writing the
program. If you are trying to solve a more general problem such as computer-aided
design, warehouse management, or air-traffic control, this is too restrictive.
The second approach is to create objects dynamically in a pool of memory called
the heap. In this approach, you don't know until run-time how many objects you need,
what their lifetime is, or what their exact type is. Those are determined at the spur of
the moment while the program is running. If you need a new object, you simply make
it on the heap at the point that you need it. Because the storage is managed
dynamically, at run-time, the amount of time required to allocate storage on the heap
is significantly longer than the time to create storage on the stack. (Creating storage
on the stack is often a single assembly instruction to move the stack pointer down, and
another to move it back up.) The dynamic approach makes the generally logical
assumption that objects tend to be complicated, so the extra overhead of finding
storage and releasing that storage will not have an important impact on the creation of
an object. In addition, the greater flexibility is essential to solve the general
programming problem.
Java uses the second approach exclusively. Every time you want to create an
object, you use the new keyword to build a dynamic instance of that object.
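As a minimal sketch of dynamic creation, the example below builds a number of objects that is only known at run time. The Aircraft class and the createFleet method are hypothetical names introduced for illustration; they do not come from the text.

```java
// Hypothetical class used only for illustration.
class Aircraft {
    final String callSign;

    Aircraft(String callSign) {
        this.callSign = callSign;
    }
}

class HeapCreationDemo {
    // Creates 'count' Aircraft objects; the count is supplied at run time.
    static Aircraft[] createFleet(int count) {
        Aircraft[] fleet = new Aircraft[count];      // the array itself is created with 'new'
        for (int i = 0; i < count; i++) {
            fleet[i] = new Aircraft("FLIGHT-" + i);  // each object is built dynamically
        }
        return fleet;
    }

    public static void main(String[] args) {
        // The number of objects comes from the command line, not from the source code.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        Aircraft[] fleet = createFleet(count);
        System.out.println("Created " + fleet.length + " aircraft");
    }
}
```

Nothing in the source code fixes how many Aircraft objects exist; that is decided while the program runs.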
There's another issue, however, and that's the lifetime of an object. With
languages that allow objects to be created on the stack, the compiler determines how
long the object lasts and can automatically destroy it. However, if you create it on the
heap the compiler has no knowledge of its lifetime. In a language like C++, you must
determine programmatically when to destroy the object, which can lead to memory
leaks if you don’t do it correctly (and this is a common problem in C++ programs).
Java provides a feature called a garbage collector that automatically discovers when
an object is no longer in use and destroys it. A garbage collector is much more
convenient because it reduces the number of issues that you must track and the code
you must write. More important, the garbage collector provides a much higher level
of insurance against the insidious problem of memory leaks (which has brought many
a C++ project to its knees).
The rest of this section looks at additional factors concerning object lifetimes and
landscapes.
1 Collections and iterators
If you don’t know how many objects you’re going to need to solve a particular
problem, or how long they will last, you also don’t know how to store those objects.
How can you know how much space to create for those objects? You can’t, since that
information isn’t known until run-time.
The solution to most problems in object-oriented design seems flippant: you
create another type of object. The new type of object that solves this particular
problem holds references to other objects. Of course, you can do the same thing with
an array, which is available in most languages. But there’s more. This new object,
generally called a container (also called a collection, but the Java library uses that
term in a different sense so this book will use “container”), will expand itself
whenever necessary to accommodate everything you place inside it. So you don’t
need to know how many objects you’re going to hold in a container. Just create a
container object and let it take care of the details.
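The following sketch illustrates a container that grows on demand. It assumes a current JDK and uses java.util.ArrayList; the class GrowingContainerDemo and the method collectWords are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

class GrowingContainerDemo {
    // Collects every word of the input; the final count is not known in advance.
    static List<String> collectWords(String text) {
        List<String> words = new ArrayList<>();  // no size is specified up front
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                words.add(word);                 // the container expands as needed
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> words = collectWords("objects are created at run time");
        System.out.println(words.size() + " words: " + words);
    }
}
```

The calling code never states how many elements the container will hold.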
Fortunately, a good OOP language comes with a set of containers as part of the
package. In C++, it’s part of the Standard C++ Library and is sometimes called the
Standard Template Library (STL). Object Pascal has containers in its Visual
Component Library (VCL). Smalltalk has a very complete set of containers. Java also
has containers in its standard library. In some libraries, a generic container is
considered good enough for all needs, and in others (Java, for example) the library has
different types of containers for different needs: a vector (called an ArrayList in Java)
for consistent access to all elements, and a linked list for consistent insertion at all
elements, for example, so you can choose the particular type that fits your needs.
Container libraries may also include sets, queues, hash tables, trees, stacks, etc.
All containers have some way to put things in and get things out; there are
usually functions to add elements to a container, and others to fetch those elements
back out. But fetching elements can be more problematic, because a single-selection
function is restrictive. What if you want to manipulate or compare a set of elements in
the container instead of just one?
The solution is an iterator, which is an object whose job is to select the elements
within a container and present them to the user of the iterator. As a class, it also
provides a level of abstraction. This abstraction can be used to separate the details of
the container from the code that’s accessing that container. The container, via the
iterator, is abstracted to be simply a sequence. The iterator allows you to traverse that
sequence without worrying about the underlying structure—that is, whether it’s an
ArrayList, a LinkedList, a Stack, or something else. This gives you the flexibility to
easily change the underlying data structure without disturbing the code in your
program. Java began (in versions 1.0 and 1.1) with a standard iterator, called
Enumeration, for all of its container classes. Java 2 has added a much more complete
container library that contains an iterator called Iterator that does more than the older
Enumeration.
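A small sketch of this abstraction, assuming a current JDK: the method totalLength (a hypothetical name) receives only a java.util.Iterator, so it cannot tell which kind of container produced the sequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class IteratorDemo {
    // Sees only a sequence of strings; the container behind the iterator is hidden.
    static int totalLength(Iterator<String> it) {
        int total = 0;
        while (it.hasNext()) {
            total += it.next().length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("stack", "queue", "set");
        List<String> arrayBacked = new ArrayList<>(names);
        List<String> linked = new LinkedList<>(names);

        // The same traversal code works for both underlying structures.
        System.out.println(totalLength(arrayBacked.iterator()));
        System.out.println(totalLength(linked.iterator()));
    }
}
```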
From a design standpoint, all you really want is a sequence that can be
manipulated to solve your problem. If a single type of sequence satisfied all of your
needs, there’d be no reason to have different kinds. There are two reasons that you
need a choice of containers. First, containers provide different types of interfaces and
external behavior. A stack has a different interface and behavior than that of a queue,
which is different from that of a set or a list. One of these might provide a more
flexible solution to your problem than the other. Second, different containers have
different efficiencies for certain operations. The best example is an ArrayList and a
LinkedList. Both are simple sequences that can have identical interfaces and external
behaviors. But certain operations can have radically different costs. Randomly
accessing elements in an ArrayList is a constant-time operation; it takes the same
amount of time regardless of the element you select. However, in a LinkedList it is
expensive to move through the list to randomly select an element, and it takes longer
to find an element that is further down the list. On the other hand, if you want to insert
an element in the middle of a sequence, it’s much cheaper in a LinkedList than in an
ArrayList. These and other operations have different efficiencies depending on the
underlying structure of the sequence. In the design phase, you might start with a
LinkedList and, when tuning for performance, change to an ArrayList. Because of the
abstraction via iterators, you can change from one to the other with minimal impact
on your code.
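The sketch below shows how little code has to change when one sequence type is swapped for the other. It assumes a current JDK, where the shared java.util.List interface plays the same decoupling role that the text attributes to iterators; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class SequenceChoiceDemo {
    // Appends the values 0..n-1 to whichever sequence is supplied.
    static List<Integer> fill(List<Integer> sequence, int n) {
        for (int i = 0; i < n; i++) {
            sequence.add(i);
        }
        return sequence;
    }

    // Written against List only, so either implementation can be passed in.
    static int insertInMiddleAndRead(List<Integer> sequence, int value) {
        int middle = sequence.size() / 2;
        sequence.add(middle, value);  // ArrayList shifts later elements; LinkedList relinks nodes
        return sequence.get(middle);  // constant time in ArrayList; traversal in LinkedList
    }

    public static void main(String[] args) {
        // Changing the implementation means changing only the constructor call.
        List<Integer> first = fill(new LinkedList<>(), 4);
        List<Integer> second = fill(new ArrayList<>(), 4);
        System.out.println(insertInMiddleAndRead(first, 99));
        System.out.println(insertInMiddleAndRead(second, 99));
    }
}
```

Only the line that constructs the container names a concrete class, which is what keeps the later performance-tuning change cheap.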
In the end, remember that a container is only a storage cabinet to put objects in.
If that cabinet solves all of your needs, it doesn’t really matter how it is implemented
(a basic concept with most types of objects). If you’re working in a programming
environment that has built-in overhead due to other factors, then the cost difference
between an ArrayList and a LinkedList might not matter. You might need only one
type of sequence. You can even imagine the “perfect” container abstraction, which
can automatically change its underlying implementation according to the way it is
used.
2 The singly rooted hierarchy
One of the issues in OOP that has become especially prominent since the
introduction of C++ is whether all classes should ultimately be inherited from a single
base class. In Java (as with virtually all other OOP languages) the answer is “yes” and
the name of this ultimate base class is simply Object. It turns out that the benefits of
the singly rooted hierarchy are many.
All objects in a singly rooted hierarchy have an interface in common, so they are
all ultimately the same type. The alternative (provided by C++) is that you don’t know
that everything is the same fundamental type. From a backward-compatibility
standpoint this fits the model of C better and can be thought of as less restrictive, but
when you want to do full-on object-oriented programming you must then build your
own hierarchy to provide the same convenience that’s built into other OOP languages.
And in any new class library you acquire, some other incompatible interface will be
used. It requires effort (and possibly multiple inheritance) to work the new interface
into your design. Is the extra “flexibility” of C++ worth it? If you need it—if you have
a large investment in C—it’s quite valuable. If you’re starting from scratch, other
alternatives such as Java can often be more productive.
All objects in a singly rooted hierarchy (such as Java provides) can be
guaranteed to have certain functionality. You know you can perform certain basic
operations on every object in your system. A singly rooted hierarchy, along with
creating all objects on the heap, greatly simplifies argument passing (one of the more
complex topics in C++).
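As an illustrative sketch, the method describe below (a hypothetical name) accepts any object and relies only on operations that the root class Object provides: getClass, toString, and equals.

```java
class RootTypeDemo {
    // Accepts any object, because every class ultimately inherits from Object.
    static String describe(Object obj) {
        return obj.getClass().getSimpleName() + " -> " + obj.toString();
    }

    // equals is also inherited from Object, so any two objects can be compared.
    static boolean sameValue(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Object[] things = { "text", Integer.valueOf(42), new StringBuilder("sb") };
        for (Object thing : things) {
            System.out.println(describe(thing));
        }
        System.out.println(sameValue("text", "text"));
    }
}
```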
A singly rooted hierarchy makes it much easier to implement a garbage collector
(which is conveniently built into Java). The necessary support can be installed in the
base class, and the garbage collector can thus send the appropriate messages to every
object in the system. Without a singly rooted hierarchy and a system to manipulate an
object via a reference, it is difficult to implement a garbage collector.
Since run-time type information is guaranteed to be in all objects, you’ll never
end up with an object whose type you cannot determine. This is especially important
with system level operations, such as exception handling, and to allow greater
flexibility in programming.
3 Collection libraries and support for easy collection use
Because a container is a tool that you’ll use frequently, it makes sense to have a
library of containers that are built in a reusable fashion, so you can take one off the
shelf and plug it into your program. Java provides such a library, which should satisfy
most needs.
Downcasting vs. templates/generics
To make these containers reusable, they hold the one universal type in Java that
was previously mentioned: Object. The singly rooted hierarchy means that everything
is an Object, so a container that holds Objects can hold anything. This makes
containers easy to reuse.
To use such a container, you simply add object references to it, and later ask for
them back. But, since the container holds only Objects, when you add your object
reference into the container it is upcast to Object, thus losing its identity. When you
fetch it back, you get an Object reference, and not a reference to the type that you put
in. So how do you turn it back into something that has the useful interface of the
object that you put into the container?
Here, the cast is used again, but this time you’re not casting up the inheritance
hierarchy to a more general type, you cast down the hierarchy to a more specific type.
This manner of casting is called downcasting. With upcasting, you know, for example,
that a Circle is a type of Shape so it’s safe to upcast, but you don’t know that an
Object is necessarily a Circle or a Shape so it’s hardly safe to downcast unless you
know that’s what you’re dealing with.
It’s not completely dangerous, however, because if you downcast to the wrong
thing you’ll get a run-time error called an exception, which will be described shortly.
When you fetch object references from a container, though, you must have some way
to remember exactly what they are so you can perform a proper downcast.
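The sketch below shows both outcomes of a downcast, assuming a current JDK. Shape and Circle are hypothetical classes named after the example in the text, and the container is declared as List<Object> so that it hands back plain Object references, as described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes named after the example in the text.
class Shape {
    String name() { return "Shape"; }
}

class Circle extends Shape {
    @Override
    String name() { return "Circle"; }

    double radius() { return 1.0; }
}

class DowncastDemo {
    // Fetches the first element and downcasts it from Object back to Circle.
    static double firstRadius(List<Object> container) {
        Object fetched = container.get(0);  // the container returns a plain Object
        Circle circle = (Circle) fetched;   // downcast, checked at run time
        return circle.radius();
    }

    // Returns true when the downcast is rejected with a ClassCastException.
    static boolean downcastFails(List<Object> container) {
        try {
            firstRadius(container);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Object> circles = new ArrayList<>();
        circles.add(new Circle());           // upcast to Object on the way in
        System.out.println(firstRadius(circles));

        List<Object> strings = new ArrayList<>();
        strings.add("not a circle");
        System.out.println(downcastFails(strings));
    }
}
```

The compiler accepts both calls; only the run-time check distinguishes the safe downcast from the wrong one.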
Downcasting and the run-time checks require extra time for the running program,
and extra effort from the programmer. Wouldn’t it make sense to somehow create the
container so that it knows the types that it holds, eliminating the need for the
downcast and a possible mistake? The solution is parameterized types, which are
classes that the compiler can automatically customize to work with particular types.
For example, with a parameterized container, the compiler could customize that
container so that it would accept only Shapes and fetch only Shapes.
Parameterized types are an important part of C++, partly because C++ has no
singly rooted hierarchy. In C++, the keyword that implements parameterized types is
“template.” Java currently has no parameterized types since it is possible for it to get
by—however awkwardly—using the singly rooted hierarchy. However, a current
proposal for parameterized types uses a syntax that is strikingly similar to C++
templates.
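JDK releases from Java 5 onward, which postdate this text, do provide parameterized types under the name generics. The sketch below assumes such a JDK; Shape and Circle are hypothetical classes named after the example in the text. It shows a container restricted to Shapes, so elements come back as Shape without a downcast.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes named after the example in the text.
class Shape {
    double area() { return 0.0; }
}

class Circle extends Shape {
    final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    double area() { return Math.PI * radius * radius; }
}

class ParameterizedContainerDemo {
    // The parameter type states that this container accepts and returns only Shapes.
    static double totalArea(List<Shape> shapes) {
        double total = 0.0;
        for (Shape shape : shapes) {  // elements arrive as Shape; no downcast is needed
            total += shape.area();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Shape> shapes = new ArrayList<>();
        shapes.add(new Circle(1.0));
        shapes.add(new Shape());
        // shapes.add("text") would be rejected by the compiler.
        System.out.println(totalArea(shapes));
    }
}
```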
Translation
Object creation and lifetimes
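Technically speaking, OOP (object-oriented programming) involves only abstract data
types, inheritance, and polymorphism, but other issues can be just as important. This
section discusses those issues.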
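One of the most important issues is the way objects are created and destroyed.
Where does the data an object needs live, and how is the object's lifetime controlled?
Different languages answer this question in different ways. C++ holds that execution
efficiency is the most important issue, so it lets the programmer choose. For the
fastest run-time speed, storage and lifetime can be decided while the program is being
written, simply by placing objects on the stack (these are sometimes called automatic
or scoped variables) or in the static storage area. This gives priority to the
allocation and release of storage, and in some situations control over that priority is
very valuable. However, we also sacrifice flexibility, because the exact quantity,
lifetime, and type of the objects must be known while the program is being written. If
the problem to be solved is a more general one, such as computer-aided design,
warehouse management, or air-traffic control, this approach is too restrictive.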
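The second approach is to create objects dynamically in a pool of memory called
the “heap”. With this approach, you do not know until run time how many objects are
needed, how long they will last, or what their exact type is. These are all decided
only while the program is running. If a new object is needed, it is simply created on
the heap at the moment it is needed. Because storage is managed dynamically at run
time, allocating storage on the heap takes much longer than creating it on the stack
(creating storage on the stack generally takes only a simple instruction that moves
the stack pointer down or up). The dynamic approach assumes that objects tend to be
complicated anyway, so the extra overhead of finding storage and releasing it will not
noticeably affect object creation. In addition, the greater flexibility is essential
for solving general programming problems.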
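C++ lets us decide whether to create objects while writing the program or at run
time, and this kind of control is more flexible. You might think that, since it is so
flexible, objects should always be created on the heap rather than on the stack.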
But there is another issue to consider: the object's lifetime. If an object is
created on the stack or in static storage, the compiler determines the object