
Game Programming Gems 2.pdf

Section 1 General Programming
1.1 Optimization for C++ Games
1.2 Inline Functions Versus Macros
1.3 Programming with Abstract Interfaces
1.4 Exporting C++ Classes from DLLs
1.5 Protect Yourself from DLL Hell and Missing OS Functions
1.6 Dynamic Type Information
1.7 A Property Class for Generic C++ Member Access
1.8 A Game Entity Factory
1.9 Adding Deprecation Facilities to C++
1.10 A Drop-in Debug Memory Manager
1.11 A Built-in Game Profiling Module
1.12 Linear Programming Model for Windows-based Games
1.13 Stack Winding
1.14 Self-Modifying Code
1.15 File Management Using Resource Files
1.16 Game Input Recording and Playback
1.17 A Flexible Text Parsing System
1.18 A Generic Tweaker
1.19 Genuine Random Number Generation
1.20 Using Bloom Filters to Improve Computational Performance
1.21 3ds max Skin Exporter and Animation Toolkit
1.22 Using Web Cameras in Video Games
Section 2 Mathematics
2.1 Floating-Point Tricks: Improving Performance with IEEE Floating Point
2.2 Vector and Plane Tricks
2.3 Fast, Robust Intersection of 3D Line Segments
2.4 Inverse Trajectory Determination
2.5 The Parallel Transport Frame
2.6 Smooth C2 Quaternion-based Flythrough Paths
2.7 Recursive Dimensional Clustering: A Fast Algorithm for Collision Detection
2.8 Programming Fractals
Section 3 Artificial Intelligence
3.1 Strategies for Optimizing AI
3.2 Micro-Threads for Game Object AI
3.3 Managing AI with Micro-Threads
3.4 An Architecture for RTS Command Queuing
3.5 A High-Performance Tile-based Line-of-Sight and Search System
3.6 Influence Mapping
3.7 Strategic Assessment Techniques
3.8 Terrain Reasoning for 3D Action Games
3.9 Expanded Geometry for Points-of-Visibility Pathfinding
3.10 Optimizing Points-of-Visibility Pathfinding
3.11 Flocking with Teeth: Predators and Prey
3.12 A Generic Fuzzy State Machine in C++
3.13 Imploding Combinatorial Explosion in a Fuzzy System
3.14 Using a Neural Network in a Game: A Concrete Example
Section 4 Geometry Management
4.1 Comparison of VIPM Methods
4.2 Simplified Terrain Using Interlocking Tiles
4.3 Sphere Trees for Fast Visibility Culling, Ray Tracing, and Range Searching
4.4 Compressed Axis-Aligned Bounding Box Trees
4.5 Direct Access Quadtree Lookup
4.6 Approximating Fish Tank Refractions
4.7 Rendering Print Resolution Screenshots
4.8 Applying Decals to Arbitrary Surfaces
4.9 Rendering Distant Scenery with Skyboxes
4.10 Self-Shadowing Characters
4.11 Classic Super Mario 64 Third-Person Control and Animation
Section 5 Graphics Display
5.1 Cartoon Rendering: Real-time Silhouette Edge Detection and Rendering
5.2 Cartoon Rendering Using Texture Mapping and Programmable Vertex Shaders
5.3 Dynamic Per-Pixel Lighting Techniques
5.4 Generating Procedural Clouds Using 3D Hardware
5.5 Texture Masking for Faster Lens Flare
5.6 Practical Priority Buffer Shadows
5.7 Impostors: Adding Clutter
5.8 Operations for Hardware-Accelerated Procedural Texture Animation
Section 6 Audio Programming
6.1 Game Audio Design Patterns
6.2 A Technique to Instantaneously Reuse Voices in a Sample-based Synthesizer
6.3 Software-based DSP Effects
6.4 Interactive Processing Pipeline for Digital Audio
6.5 A Basic Music Sequencer for Games
6.6 An Interactive Music Sequencer for Games
6.7 A Low-Level Sound API
APPENDIX
GameGems II Converted by Borz borzpro@yahoo.com 2002.12.01
1.1 Optimization for C++ Games

Andrew Kirmse, LucasArts Entertainment
ark@alum.mit.edu

Well-written C++ games are often more maintainable and reusable than their plain C counterparts, but is it worth it? Can complex C++ programs hope to match traditional C programs in speed?

With a good compiler and thorough knowledge of the language, it is indeed possible to create efficient games in C++. This gem describes techniques you can use to speed up games in particular. It assumes that you're already convinced of the benefits of using C++, and that you're familiar with the general principles of optimization (see Further Investigations for these).

One general principle that merits repeating is the absolute importance of profiling. In the absence of profiling, programmers tend to make two types of mistakes. First, they optimize the wrong code. The great majority of a program is not performance critical, so any time spent speeding it up is wasted. Intuition about which code is performance critical is untrustworthy; only by direct measurement can you be sure. Second, programmers sometimes make "optimizations" that actually slow down the code. This is particularly a problem in C++, where a deceptively simple line can actually generate a significant amount of machine code. Examine your compiler's output, and profile often.

Object Construction and Destruction

The creation and destruction of objects is a central concept in C++, and is the main area where the compiler generates code "behind your back." Poorly designed programs can spend substantial time calling constructors, copying objects, and generating costly temporary objects. Fortunately, common sense and a few simple rules can make object-heavy code run within a hair's breadth of the speed of C.

• Delay construction of objects until they're needed. The fastest code is that which never runs; why create an object if you're not going to use it? Thus, in the following code:

    void Function(int arg)
    {
        Object obj;
        if (arg == 0)
            return;
        // ...
    }

even when arg is zero, we pay the cost of calling Object's constructor and destructor. If arg is often zero, and especially if Object itself allocates memory, this waste can add up in a hurry. The solution, of course, is to move the declaration of obj until after the if check.

Be careful about declaring nontrivial objects in loops, however. If you delay construction of an object until it's needed in a loop, you'll pay for the construction and destruction of the object on every iteration. It's better to declare the object before the loop and pay these costs only once. If a function is called inside an inner loop, and the function creates an object on the stack, you could instead create the object outside the loop and pass it by reference to the function.

• Use initializer lists. Consider the following class:

    class Vehicle
    {
    public:
        Vehicle(const std::string &name) // Don't do this!
        {
            mName = name;
        }
    private:
        std::string mName;
    };

Because member variables are constructed before the body of the constructor is invoked, this code calls the constructor for the string mName, and then calls the = operator to copy in the object's name. What's particularly bad about this example is that the default constructor for string may well allocate memory; in fact, more memory than may be necessary to hold the actual name assigned to the variable in the constructor for Vehicle. The following code is much better, and avoids the call to operator=. Further, given more information (in this case, the actual string to be stored), the nondefault string constructor can often be more efficient, and the compiler may be able to optimize away the Vehicle constructor invocation when the body is empty:

    class Vehicle
    {
    public:
        Vehicle(const std::string &name) : mName(name)
        {
        }
    private:
        std::string mName;
    };

• Prefer preincrement to postincrement. The problem with writing x = y++ is that the increment function has to make a copy of the original value of y, increment y, and then return the original value. Thus, postincrement involves the construction of a temporary object, while preincrement doesn't. For integers, there's no additional overhead, but for user-defined types, this is wasteful. You should use preincrement whenever you have the option. You almost always have the option in for loop iterators.

• Avoid operators that return by value. The canonical way to write vector addition in C++ is this:

    Vector operator+(const Vector &v1, const Vector &v2)

This operator must return a new Vector object, and furthermore, it must return it by value. While this allows useful and readable expressions like v = v1 + v2, the cost of a temporary construction and a Vector copy is usually too much for something called as often as vector addition. It's sometimes possible to arrange code so that the compiler is able to optimize away the temporary object (this is known as the "return value optimization"), but in general, it's better to swallow your pride and write the slightly uglier, but usually faster:

    void Vector::Add(const Vector &v1, const Vector &v2)

Note that operator+= doesn't suffer from the same problem, as it modifies its first argument in place, and doesn't need to return a temporary. Thus, you should use operators like += instead of + when possible.

• Use lightweight constructors. Should the constructor for the Vector class in the previous example initialize its elements to zero? This may come in handy in a few spots in your code, but it forces every caller to pay the price of the initialization, whether they use it or not. In particular, temporary vectors and member variables will implicitly incur the extra cost. A good compiler may well optimize away some of the extra code, but why take the chance?
As a general rule, you want an object's constructor to initialize each of its member variables, because uninitialized data can lead to subtle bugs. However, in small classes that are frequently instantiated, especially as temporaries, you should be prepared to compromise this rule for performance. Prime candidates in many games are the Vector and Matrix classes. These classes should provide methods (or alternate constructors) to set themselves to zero and the identity, respectively, but the default constructor should be empty.
As a corollary to this principle, you should provide additional constructors to classes where this will improve performance. If the Vehicle class in our second example were instead written like this:

    class Vehicle
    {
    public:
        Vehicle()
        {
        }
        void SetName(const std::string &name)
        {
            mName = name;
        }
    private:
        std::string mName;
    };

we'd incur the cost of constructing mName, and then setting it again later via SetName(). Similarly, it's cheaper to use copy constructors than to construct an object and then call operator=. Prefer constructing an object with Vehicle v1(v2) over writing Vehicle v1; v1 = v2;.

If you want to prevent the compiler from automatically copying an object for you, declare a private copy constructor and operator= for the object's class, but don't implement either function. Any attempt to copy the object will then result in a compile-time error. Also get into the habit of declaring single-argument constructors as explicit, unless you mean to use them as type conversions. This prevents the compiler from generating hidden temporary objects when converting types.

• Preallocate and cache objects. A game will typically have a few classes that it allocates and frees frequently, such as weapons or particles. In a C game, you'd typically allocate a big array up front and use its elements as necessary. With a little planning, you can do the same thing in C++. The idea is that instead of continually constructing and destructing objects, you request new ones and return old ones to a cache. The cache can be implemented as a template, so that it works for any class, provided that the class has a default constructor. Code for a sample cache class template is on the accompanying CD.

You can either allocate objects to fill the cache as you need them, or preallocate all of the objects up front.
If, in addition, you maintain a stack discipline on the objects (meaning that before you delete object X, you first delete all objects allocated after X), you can allocate the cache in a contiguous block of memory.
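
The object cache described above might be sketched as follows. This is not the sample from the accompanying CD: the names ObjectCache, Acquire, Release, Available, and Particle are hypothetical, and the sketch assumes the simplest variant, in which every object is preallocated up front in one contiguous array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity cache. All objects are default-constructed
// once, then handed out and returned without further construction.
template <typename T>
class ObjectCache
{
public:
    explicit ObjectCache(std::size_t capacity)
        : mObjects(capacity)
    {
        mFree.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i)
            mFree.push_back(&mObjects[i]);
    }

    // Hands out an already constructed object, or a null pointer
    // if every object in the cache is currently in use.
    T *Acquire()
    {
        if (mFree.empty())
            return nullptr;
        T *obj = mFree.back();
        mFree.pop_back();
        return obj;
    }

    // Returns an object to the cache. The object is not destroyed,
    // so the caller is responsible for resetting its state.
    void Release(T *obj)
    {
        assert(obj != nullptr);
        mFree.push_back(obj);
    }

    std::size_t Available() const { return mFree.size(); }

private:
    // Declared but not implemented, so the cache cannot be copied.
    ObjectCache(const ObjectCache &);
    ObjectCache &operator=(const ObjectCache &);

    std::vector<T> mObjects;   // contiguous storage, constructed once
    std::vector<T *> mFree;    // objects currently available
};

// Hypothetical cached type; it only needs a default constructor.
struct Particle
{
    float x, y, z;
    Particle() : x(0.0f), y(0.0f), z(0.0f) { }
};
```

A caller would create one ObjectCache<Particle> at load time, call Acquire() when a particle is spawned, and call Release() when it expires. Because the most recently released object is reused first, recently touched memory tends to be handed out again.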
Memory Management

C++ applications generally need to be more aware of the details of memory management than C applications do. In C, all allocations are explicit through malloc() and free(), while C++ can implicitly allocate memory while constructing temporary objects and member variables. Most C++ games (like most C games) will require their own memory manager.

Because a C++ game is likely to perform many allocations, it must be especially careful about fragmenting the heap. One option is to take one of the traditional approaches: either don't allocate any memory at all after the game starts up, or maintain a large contiguous block of memory that is periodically freed (between levels, for example). On modern machines, such draconian measures are not necessary, if you're willing to be vigilant about your memory usage.

The first step is to override the global new and delete operators. Use custom implementations of these operators to redirect the game's most common allocations away from malloc() and into preallocated blocks of memory. For example, if you find that you have at most 10,000 4-byte allocations outstanding at any one time, you should allocate 40,000 bytes up front and issue blocks out as necessary. To keep track of which blocks are free, maintain a free list by pointing each free block to the next free block. On allocation, remove the front block from the list, and on deallocation, add the freed block to the front again. Figure 1.1.1 illustrates how the free list of small blocks might wind its way through a contiguous larger block after a sequence of allocations and frees.

FIGURE 1.1.1 A linked free list.

You'll typically find that a game has many small, short-lived allocations, and thus you'll want to reserve space for many small blocks.
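
A minimal sketch of the free list described above, assuming fixed-size blocks carved from a single preallocated buffer. BlockPool and its member names are hypothetical, and hooking the pool into the global new and delete operators is left out. One assumption differs from the 4-byte example in the text: each block is rounded up so that it can hold a pointer, which means 8 bytes per block on a typical 64-bit platform.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size block pool. Each free block stores the pointer
// to the next free block in its own bytes, so the list costs no extra memory.
class BlockPool
{
public:
    BlockPool(std::size_t blockSize, std::size_t blockCount)
        : mBlockSize(RoundUp(blockSize)),
          mStorage(mBlockSize * blockCount),
          mFreeHead(nullptr)
    {
        // Thread the list through the buffer back to front, so that the
        // first block in memory ends up at the front of the free list.
        for (std::size_t i = blockCount; i > 0; --i)
            Free(&mStorage[(i - 1) * mBlockSize]);
    }

    // Removes the front block from the free list, or returns a null
    // pointer if the pool is exhausted.
    void *Allocate()
    {
        if (mFreeHead == nullptr)
            return nullptr;
        FreeBlock *block = mFreeHead;
        mFreeHead = block->next;
        return block;
    }

    // Adds the freed block to the front of the free list.
    // p must have come from Allocate() on this pool.
    void Free(void *p)
    {
        assert(p != nullptr);
        FreeBlock *block = new (p) FreeBlock;
        block->next = mFreeHead;
        mFreeHead = block;
    }

private:
    struct FreeBlock { FreeBlock *next; };

    // Blocks must be large enough, and spaced suitably, to hold a FreeBlock.
    static std::size_t RoundUp(std::size_t size)
    {
        const std::size_t unit = sizeof(FreeBlock);
        const std::size_t n = size < unit ? unit : size;
        return ((n + unit - 1) / unit) * unit;
    }

    std::size_t mBlockSize;
    std::vector<unsigned char> mStorage;   // the preallocated buffer
    FreeBlock *mFreeHead;                  // front of the free list
};
```

Both Allocate() and Free() run in constant time, and a freed block is the next one handed out, matching the "add the freed block to the front again" rule in the text.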
Reserving many larger blocks wastes a substantial amount of memory for those blocks that are not currently in use; above a certain size, you'll want to pass allocations off to a separate large block allocator, or just to malloc().

Virtual Functions

Critics of C++ in games often point to virtual functions as a mysterious feature that drains performance. Conceptually, the mechanism is simple. To generate a virtual function call on an object, the compiler accesses the object's virtual function table, retrieves a pointer to the member function, sets up the call, and jumps to the member function's address. This is to be compared with a function call in C, where the compiler sets up the call and jumps to a fixed address. The extra overhead for the virtual function call is the indirection to the virtual function table; because the address of the call isn't known in advance, there can also be a penalty for missing the processor's instruction cache.

Any substantial C++ program will make heavy use of virtual functions, so the idea is to avoid these calls in performance-critical areas. Here is a typical example:

    class BaseClass
    {
    public:
        virtual char *GetPointer() = 0;
    };

    class Class1 : public BaseClass
    {
        virtual char *GetPointer();
    };

    class Class2 : public BaseClass
    {
        virtual char *GetPointer();
    };

    void Function(BaseClass *pObj)
    {
        char *ptr = pObj->GetPointer();
    }

If Function() is performance critical, we want to change the call to GetPointer from virtual to inline. One way to do this is to add a new protected data member to BaseClass, which is returned by an inline version of GetPointer(), and set the data member in each class:

    class BaseClass
    {
    public:
        inline char *GetPointerFast()
        {
            return mpData;
        }
    protected:
        inline void SetPointer(char *pData)
        {
            mpData = pData;
        }
    private:
        char *mpData;
    };

    // Class1 and Class2 call SetPointer as necessary
    // in member functions

    void Function(BaseClass *pObj)
    {
        char *ptr = pObj->GetPointerFast();
    }

A more drastic measure is to rearrange your class hierarchy. If Class1 and Class2 have only slight differences, it might be worth combining them into a single class, with a flag indicating whether you want the class to behave like Class1 or Class2 at runtime. With this change (and the removal of the pure virtual BaseClass), the GetPointer function in the previous example can again be made inline. This transformation is far from elegant, but in inner loops on machines with small caches, you'd be willing to do much worse to get rid of a virtual function call.

Although each new virtual function adds only the size of a pointer to a per-class table (usually a negligible cost), the first virtual function in a class requires a pointer to the virtual function table on a per-object basis. This means that you don't want to have any virtual functions at all in small, frequently used classes where this extra overhead is unacceptable. Because inheritance generally requires the use of one or more virtual functions (a virtual destructor if nothing else), you don't want any hierarchy for small, heavily used objects.

Code Size

Compilers have a somewhat deserved reputation for generating bloated code for C++. Because memory is limited, and because small is fast, it's important to make your executable as small as possible. The first thing to do is get the compiler on your side. If your compiler stores debugging information in the executable, disable the generation of debugging information. (Note that Microsoft Visual C++ stores debugging information separate from the executable, so this may not be necessary.) Exception handling generates extra code; get rid of as much exception-generating code as possible. Make sure the linker is configured to strip out unused functions and classes.
Enable the compiler's highest level of optimization, and try setting it to optimize for size instead of speed; sometimes this actually produces faster code because of better instruction cache coherency. (Be sure to verify that intrinsic functions are still enabled if you use this setting.) Get rid of all of your space-wasting strings in debugging print statements, and have the compiler combine duplicate constant strings into single instances.

Inlining is often the culprit behind suspiciously large functions. Compilers are free to respect or ignore your inline keywords, and they may well inline functions without telling you. This is another reason to keep your constructors lightweight, so that objects on the stack don't wind up generating lots of inline code. Also be careful of overloaded operators; a simple expression like m1 = m2 * m3 can generate a ton of