logo资料库

数据流分析理论及实践Data Flow Analysis Theory and Practice.pdf

第1页 / 共379页
第2页 / 共379页
第3页 / 共379页
第4页 / 共379页
第5页 / 共379页
第6页 / 共379页
第7页 / 共379页
第8页 / 共379页
资料共379页,剩余部分请下载后查看
2880_c000.pdf
Data Flow Analysis: Theory and Practice
Preface
Contents
Appendix A: An Introduction to GCC
References
2880_c001.pdf
Table of Contents
Chapter 1: An Introduction to Data Flow Analysis
1.1 A Motivating Example
1.1.1 Optimizing for Heap Memory
1.1.2 Computing Liveness
Modelling Interprocedural Effects
Simple Intraprocedural Liveness Analysis
Intraprocedural Analysis with Interprocedural Approximation
Interprocedural Analysis
A Comparison of Liveness Computed by Three Methods
1.1.3 Computing Aliases
1.1.4 Performing Optimization
1.1.5 General Observations
1.2 Program Analysis: The Larger Perspective
Applications of Analysis
Approaches to Program Analysis
Time of Performing Analysis
Scope of Analysis
Flow Sensitivity of Analysis
Context Sensitivity of Analysis
Granularity of Performing Analysis
Program Representations Used for Analysis
Representations of Information
1.3 Characteristics of Data Flow Analysis
1.4 Summary and Concluding Remarks
1.5 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c002.pdf
Table of Contents
Part I: Intraprocedural Data Flow Analysis
Chapter 2: Classical Bit Vector Data Flow Analysis
2.1 Basic Concepts and Notations
2.2 Discovering Local Data Flow Information
2.3 Discovering Global Properties of Variables
2.3.1 Live Variables Analysis
2.3.2 Dead Variables Analysis
2.3.3 Reaching Definitions Analysis
2.3.4 Reaching Definitions for Copy Propagation
2.4 Discovering Global Properties of Expressions
2.4.1 Available Expressions Analysis
2.4.2 Partially Available Expressions Analysis
2.4.3 Anticipable Expressions Analysis
2.4.4 Classical Partial Redundancy Elimination
Hoisting Path of an Expression
Transformation Using Hoisting Path
Limitations of Partial Redundancy Elimination
Handling Critical Edges
Handling Critical Nodes
2.4.5 Lazy Code Motion
2.5 Combined May-Must Analyses
2.6 Summary and Concluding Remarks
2.7 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c003.pdf
Table of Contents
Chapter 3: Theoretical Abstractions in Data Flow Analysis
3.1 Graph Properties Relevant to Data Flow Analysis
3.2 Data Flow Framework
3.2.1 Modeling Data Flow Values Using Lattices
Partially Ordered Sets
Lattices and Complete Lattices
Meet Semilattices
3.2.2 Modeling Flow Functions
3.2.3 Data Flow Frameworks
3.3 Data Flow Assignments
3.3.1 Meet Over Paths Assignment
3.3.2 Fixed Point Assignment
3.3.3 Existence of Fixed Point Assignment
3.4 Computing Data Flow Assignments
3.4.1 Computing MFP Assignment
3.4.2 Comparing MFP and MOP Assignments
3.4.3 Undecidability of MOP Assignment Computation
3.5 Complexity of Data Flow Analysis for Rapid Frameworks
3.5.1 Properties of Data Flow Frameworks
3.5.2 Complexity for General CFGs
3.5.3 Complexity in Special Cases
3.6 Summary and Concluding Remarks
3.7 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c004.pdf
Table of Contents
Chapter 4: General Data Flow Frameworks
4.1 Non-Separable Flow Functions
4.2 Discovering Properties of Variables
4.2.1 Faint Variables Analysis
4.2.2 Possibly Uninitialized Variables Analysis
4.2.3 Constant Propagation
Classical Constant Propagation Using Def-Use Chains
Data Flow Analysis for Constant Propagation
Properties of Constant Propagation Data Flow Framework
4.2.4 Variants of Constant Propagation
Conditional Constant Propagation
Copy Constant Propagation
Linear Constant Propagation
4.3 Discovering Properties of Pointers
4.3.1 Points-To Analysis of Stack and Static Data
Points-To Analysis with Degree of Certainty
4.3.2 Alias Analysis of Stack and Static Data
A Comparison of Points-to and Alias Relations
4.3.3 Formulating Data Flow Equations for Alias Analysis
4.4 Liveness Analysis of Heap Data
4.4.1 Access Expressions and Access Paths
4.4.2 Liveness of Access Paths
4.4.3 Representing Sets of Access Paths by Access Graphs
Summarization
Operations on Access Graphs
Safety of Access Graph Operations
4.4.4 Data Flow Analysis for Explicit Liveness
Convergence of Explicit Liveness Analysis
Efficiency of Explicit Liveness Analysis
4.4.5 The Motivating Example Revisited
Intraprocedural Analysis by Ignoring the Interprocedural Effects
Intraprocedural Analysis with Conservative Interprocedural Approximation
4.5 Modeling Entity Dependence
4.5.1 Primitive Entity Functions
4.5.2 Composite Entity Functions
4.6 Summary and Concluding Remarks
4.7 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c005.pdf
Table of Contents
Chapter 5: Complexity of Iterative Data Flow Analysis
5.1 Generic Flow Functions and Data Flow Equations
5.2 Generic Round-Robin Iterative Algorithm
5.3 Complexity of Round-Robin Iterative Algorithm
5.3.1 Identifying the Core Work Using Work List
5.3.2 Information Flow Paths in Bit Vector Frameworks
Information Flow Paths and the Work List Algorithm
5.3.3 Defining Complexity Using Information Flow Paths
5.3.4 Information Flow Paths in Fast Frameworks
5.3.5 Information Flow Paths in Non-separable Frameworks
5.4 Summary and Concluding Remarks
5.5 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c006.pdf
Table of Contents
Chapter 6: Single Static Assignment Form as Intermediate Representation
6.1 Introduction
6.1.1 An Overview of SSA
6.1.2 Benefits of SSA Representation
6.2 Construction of SSA Form Programs
6.2.1 Dominance Frontier
6.2.2 Placement of phi-instructions
6.2.3 Renaming of Variables
6.2.4 Correctness of the Algorithm
6.3 Destruction of SSA
6.3.1 An Algorithm for SSA Destruction
6.3.2 SSA Destruction and Register Allocation
Overview
Spilling
The Spilling Algorithm
Coloring
Coalescing by Recoloring
SSA coalescing
Register Copies
6.4 Summary and Concluding Remarks
6.5 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c007.pdf
Table of Contents
Part II: Interprocedural Data Flow Analysis
Chapter 7: Introduction to Interprocedural Data Flow Analysis
7.1 A Motivating Example
7.2 Program Representations for Interprocedural Analysis
7.3 Modeling Interprocedural Data Flow Analysis
7.3.1 Summary Flow Functions
7.3.2 Inherited and Synthesized Data Flow Information
7.3.3 Approaches to Interprocedural Data Flow Analysis
7.4 Compromising Precision for Scalability
7.4.1 Flow and Context Insensitivity
Flow insensitivity
Context insensitivity
7.4.2 Side Effects Analysis
7.5 Language Features Influencing Interprocedural Analysis
7.6 Common Variants of Interprocedural Data Flow Analysis
7.6.1 Intraprocedural Analysis with Conservative Interprocedural Approximation
7.6.2 Intraprocedural Analysis with Side Effects Computation
Flow insensitive side effects
Flow sensitive side effects
7.6.3 Whole Program Analysis
Context insensitive whole program analysis
Context sensitive whole program analysis
7.7 An Aside on Interprocedural Optimizations
7.8 Summary and Concluding Remarks
7.9 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c008.pdf
Table of Contents
Chapter 8: Functional Approach to Interprocedural Data Flow Analysis
8.1 Side Effects Analysis of Procedure Calls
8.1.1 Computing Flow Sensitive Side Effects
8.1.2 Computing Flow Insensitive Side Effects
8.2 Handling the Effects of Parameters
8.2.1 Defining Aliasing of Parameters
8.2.2 Formulating Alias Analysis of Parameters
8.2.3 Augmenting Data Flow Analyses Using Parameter Aliases
8.2.4 Efficient Parameter Alias Analysis
8.3 Whole Program Analysis
8.3.1 Lattice of Flow Functions
8.3.2 Reducing Function Compositions and Confluences
Function compositions and confluences for bit vector frameworks
Function compositions and confluences for general frameworks
8.3.3 Constructing Summary Flow Functions
8.3.4 Computing Data Flow Information
8.3.5 Enumerating Summary Flow Functions
8.4 Summary and Concluding Remarks
8.5 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c009.pdf
Table of Contents
Chapter 9: Value-Based Approach to Interprocedural Data Flow Analysis
9.1 Program Model for Value-Based Approaches to Interprocedural Data Flow Analysis
9.2 Interprocedural Analysis Using Restricted Contexts
9.3 Interprocedural Analysis Using Unrestricted Contexts
9.3.1 Using Call Strings to Represent Unrestricted Contexts
9.3.2 Issues in Termination of Call String Construction
9.4 Bounding Unrestricted Contexts Using Data Flow Values
9.4.1 Call String Invariants
An aside on flow function with periodic points
9.4.2 Value-Based Termination of Call String Construction
9.5 The Motivating Example Revisited
9.6 Summary and Concluding Remarks
9.7 Bibliographic Notes
Appendix A: An Introduction to GCC
References
2880_c010.pdf
Table of Contents
Part III: Implementing Data Flow Analysis
Chapter 10: Implementing Data Flow Analysis in GCC
10.1 Specifying a Data Flow Analysis
10.1.1 Registering a Pass With the Pass Manager in GCC
10.1.2 Specifying Available Expressions Analysis
10.1.3 Specifying Other Bit Vector Data Flow Analyses
10.2 An Example of Data Flow Analysis
10.2.1 Executing the Data Flow Analyzer
10.2.2 Examining the Gimple Version of CFG
10.2.3 Examining the Result of Data Flow Analysis
10.3 Implementing the Generic Data Flow Analyzer gdfa
10.3.1 Specification Primitives
10.3.2 Interface with GCC
Traversal Over CFG and Basic Blocks
Discovering the Entities in a Statement
Constructing and Manipulating Data Flow Values
Facilities for Printing Entities
10.3.3 The Preparatory Pass
10.3.4 Local Data Flow Analysis
10.3.5 Global Data Flow Analysis
10.4 Extending the Generic Data Flow Analyzer gdfa
Appendix A: An Introduction to GCC
References
2880_a001.pdf
Table of Contents
Appendix A: An Introduction to GCC
A.1 About GCC
A.2 Building GCC
Configuring GCC
Compiling GCC
Installing GCC
Downloading and Installing gdfa
A.3 Further Readings in GCC
References
2880_references.pdf
Table of Contents
References
Appendix A: An Introduction to GCC
Data Flow Analysis Theory and Practice © 2009 by Taylor & Francis Group, LLC
Data Flow Analysis Theory and Practice Uday P. Khedker Amitabha Sanyal Bageshri Karkare Boca Raton London New York CRC Press is an imprint of the Taylor & Francis Group, an informa business © 2009 by Taylor & Francis Group, LLC
CRC Press Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742 © 2009 by Taylor & Francis Group, LLC CRC Press is an imprint of Taylor & Francis Group, an Informa business No claim to original U.S. Government works Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1 International Standard Book Number-13: 978-0-8493-2880-0 (Hardcover) This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher can- not assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint. Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www.copy- right.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that pro- vides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Library of Congress Cataloging-in-Publication Data Khedker, Uday. Bageshri Karkare. p. cm. Data flow analysis : theory and practice / Uday Khedker, Amitabha Sanyal, Includes bibliographical references and index. ISBN 978-0-8493-2880-0 (hardcover : alk. paper) 1. Compilers (Computer programs) 2. Data flow computing. 3. Software engineering. 4. Computer software--Verification. I. Sanyal, Amitabha. II. Karkare, Bageshri. III. Title. QA76.76.C65K54 2009 004’.35--dc22 2009002056 Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com © 2009 by Taylor & Francis Group, LLC
Preface Data flow analysis is a classical static analysis technique that has been used to dis- cover useful properties of programs being analyzed. It has found many useful ap- plications ranging from compiler optimizations to software engineering to software verification. Modern compilers use this technique to produce code that maximize performance. In software engineering, it is used to re-engineer or reverse engineer programs. Finally, data flow analysis based techniques are used in software verifica- tion to prove the soundness of programs with respect to properties of interest. This book provides a detailed treatment of data flow analysis. Although we explain it in the context of compiler optimizations, the concepts are general enough to be used for other applications. This is possible because we use a general model of data flow equations to represent the specification of data flow analysis. These data flow equations are defined in terms of constant and dependent Gen and Kill components. For classical bit vector frameworks, the constant Gen and Kill suffice; dependent parts are required for frameworks like constant propagation, points-to analysis etc. Such a modeling explicates the inter-dependence of data flow values and leads to an orthogonal generality that models flow functions in terms of a rather small set of constituent functions called entity functions. On the one hand, modeling flow functions in terms of entity functions allows us to define information flow paths that explain empirical observations for a large class of data flow frameworks and facilitate tight complexity bounds on solution procedures for data flow equations. On the other hand, this modeling also allows reasoning about the feasibility of constructing summary flow functions. The book is organized in three parts: The first part deals with the specification of data flow frameworks and the solution process at the intraprocedural level. This part presents the lattice theoretic modeling of data flow frameworks apart from the gen- eralizations of constant and dependent parts in flow functions and entity functions as constituents of flow functions. It shows how these generalizations lead to tight complexity bounds. This part also presents a large number of data flow frameworks. The diversity of these analyses is an evidence of the wide applicability of the gener- alizations presented. The final chapter of the first part presents SSA representation of programs. This is interesting because it builds an additional layer of abstraction over the control flow graph representation of programs and directly relates the definition points and the use points of data. This increases the efficiency with which a class of optimizations can be performed. The second part of the book presents interprocedural data flow analysis. As a matter of choice, we avoid methods that are specific to a particular application or © 2009 by Taylor & Francis Group, LLC v
vi a particular data flow framework and instead, focus on generic approaches. The first approach is a functional approach that constructs context independent summary flow functions of procedures. These flow functions are used at the call points to incorporate the effects of procedure calls. The second approach is a value-based approach that computes distinct values for distinct calling contexts; this is achieved by augmenting the data flow values with context information. The third part of the book describes the implementation of a generic data flow analyzer for bit vector frameworks in GCC and shows how it can be instantiated to a given framework. This book is an outcome of our notes for the course CS618: Program Analysis which is a graduate course at the Department of Computer Science and Engineering, IIT Bombay. The slides used in the course and the source of the generic data flow analyzer gdfaare available at the web page of the book: http://www.cse.iitb.ac.in/˜uday/dfaBook-web As errors are discovered, we will upload an errata on the above web page. Any additional material that we find relevant to a course based on this book will also be made available on the same web page. Many people have gone through the earlier versions of this manuscript. The reg- istrants of CS618 were our captive audience for testing our examples—some ex- amples tested their patience in the examinations of CS618. The following students of CS618 pointed out errors to us: Abhishek Shrivastav, Amitraj Singh Chouhan, Dhritiman Das, Harshada Gune, Md. Naseerunddin, Nilesh Padariya, Prashima Sharma, and Pushpraj Agrawal. Among others, Jaishri Waghmare, Prashant Singh Rawat, Sameera Deshpande, Santosh Sonawane, and Seema Ravandale read some chapters and gave valuable comments. Seema extended gdfa to include support for reaching definitions analysis. Sameera’s help in preparing the first draft of the index is gratefully acknowledged. Finally, this book would not have been possible without the patience and constant encouragement of our families. They have gracefully tolerated our mental, if not physical, absence, relieving us from a sense of guilt. We express a deep sense of gratitude for their support. Uday P. Khedker, Amitabha Sanyal, and Bageshri Karkare © 2009 by Taylor & Francis Group, LLC
vii To my mother Rajani and the memory of my father Prabhakar Khedker Uday Khedker To my parents Arunojjwal and Prakriti Sanyal Amitabha Sanyal © 2009 by Taylor & Francis Group, LLC
Contents Preface 1 An Introduction to Data Flow Analysis . . . . 1.1 A Motivating Example . . . . . . Optimizing for Heap Memory . . . . . . Computing Liveness . . . . . . . Computing Aliases . . . . Performing Optimization . . . . . . . . . General Observations . . . . . . . . . . . 1.1.1 . . . . 1.1.2 . . . . 1.1.3 . . . . 1.1.4 1.1.5 . . . . Program Analysis: The Larger Perspective . . . . . . . . Characteristics of Data Flow Analysis Summary and Concluding Remarks . . . . . . . . . . . . Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 1.3 1.4 1.5 I Intraprocedural Data Flow Analysis 2 Classical Bit Vector Data Flow Analysis Basic Concepts and Notations . . . . . . . 2.3.1 2.3.2 2.3.3 2.3.4 2.4 Discovering Global Properties of Expressions 2.1 . . . . 2.2 Discovering Local Data Flow Information . . . . 2.3 Discovering Global Properties of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Live Variables Analysis . . . . Dead Variables Analysis . . . . . . . . Reaching Definitions Analysis . . Reaching Definitions for Copy Propagation . . . . . . . 2.4.1 . . . Available Expressions Analysis . . . . 2.4.2 . . . Partially Available Expressions Analysis 2.4.3 Anticipable Expressions Analysis . . . . . . . 2.4.4 Classical Partial Redundancy Elimination . . . . . . 2.4.5 Lazy Code Motion . . . . Combined May-Must Analyses . . . . . . . . Summary and Concluding Remarks Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5 2.6 2.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . © 2009 by Taylor & Francis Group, LLC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v 1 1 1 4 9 10 10 12 16 18 19 21 23 23 24 26 26 29 29 32 33 33 36 37 39 49 53 56 57 ix
分享到:
收藏