Inside the Python virtual machine (2019).pdf

Table of Contents
Introduction
The View From 30,000ft
Compiling Python Source Code
From Source To Parse Tree
Python tokens
From Parse Tree To Abstract Syntax Tree
Building The Symbol Table
From AST To Code Objects
Python Objects
PyObject
Under the cover of Types
Type Object Case Studies
Minting type instances
Objects and their attributes
Method Resolution Order (MRO)
Code Objects
Exploring code objects
Code Objects within other code objects
Code Objects in the VM
Frames Objects
Allocating Frame Objects
Interpreter and Thread States
The Interpreter state
The Thread state
Intermezzo: The abstract.c Module
The evaluation loop, ceval.c
Putting names in place
The parts of the machine
The Evaluation loop
A sampling of opcodes
The Block Stack
A Short Note on Exception Handling
From Class code to bytecode
Generators: Behind the scenes.
The Generator object
Running a generator
Inside The Python Virtual Machine
Obi Ike-Nwosu

This book is for sale at http://leanpub.com/insidethepythonvirtualmachine

This version was published on 2019-03-02

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and many iterations to get reader feedback, pivot until you have the right book and build traction once you do.

© 2015 - 2019 Obi Ike-Nwosu

Also by Obi Ike-Nwosu: Intermediate Python
1. Introduction

The Python programming language has been around for quite a while. Guido van Rossum started development work on the first version in 1989, and it has since grown to become one of the more popular languages, used in applications ranging from graphical interfaces to financial¹ and data analysis² applications.

This write-up aims to go behind the scenes of the Python interpreter and provide a conceptual overview of how a Python program is executed. This material targets CPython, which as of this writing is the most popular implementation of Python and is considered the standard. Python and CPython are used interchangeably in this text, but any mention of Python refers to CPython, the version of Python implemented in C. Other implementations include PyPy, which is Python implemented in a restricted subset of Python, and Jython, which is Python implemented on the Java Virtual Machine.

I like to think of the execution of a Python program as split into two or three main phases, listed below, depending on how the interpreter is invoked. These are covered in different measures within this write-up:

1. Initialization: This involves the setup of the various data structures needed by the Python process. This probably only counts when a program is being executed non-interactively through the interpreter shell.
2. Compiling: This involves activities such as parsing source code to build syntax trees, creating abstract syntax trees, building symbol tables and generating code objects.
3. Interpreting: This involves the actual execution of the generated code objects within some context.

The process of generating parse trees and abstract syntax trees from source code is language agnostic, so the same methods that apply to other languages also apply to Python; as a result, not much on this subject is covered here.
On the other hand, the process of building symbol tables and code objects from the abstract syntax tree is the more interesting part of the compilation phase; it is handled in a more or less Python-specific way, so attention is paid to it. The interpreting of compiled code objects and all the data structures used in the process is also covered. Topics that will be touched upon include, but are not limited to, the process of building symbol tables and generating code objects, Python objects, frame objects, code objects, function objects, Python opcodes, the interpreter loop, generators and user-defined classes.

¹http://tpq.io/
²http://pandas.pydata.org/
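As a rough illustration of the compiling and interpreting phases described above, the following sketch walks a tiny program through them using the standard library's ast, symtable and dis modules together with the built-in compile and exec. These are Python-level windows onto the process rather than the C code paths CPython itself follows, and the sample source and the "<sketch>" filename are made up for this example.

```python
import ast
import dis
import symtable

# Hypothetical sample program, invented purely for illustration.
SOURCE = "def double(x):\n    return x * 2\n\nresult = double(21)\n"

# Compiling, step 1: source text -> abstract syntax tree.
tree = ast.parse(SOURCE, filename="<sketch>", mode="exec")
top_level_nodes = [type(node).__name__ for node in tree.body]

# Compiling, step 2: the symbol table records which names live in which scope.
table = symtable.symtable(SOURCE, "<sketch>", "exec")
module_names = sorted(table.get_identifiers())
function_scope = table.get_children()[0]
function_names = sorted(function_scope.get_identifiers())

# Compiling, step 3: abstract syntax tree -> code object.
code = compile(tree, filename="<sketch>", mode="exec")

# Interpreting: execute the code object within some context (a namespace).
namespace = {}
exec(code, namespace)

print(top_level_nodes)               # ['FunctionDef', 'Assign']
print(module_names, function_names)  # ['double', 'result'] ['x']
print(namespace["result"])           # 42
dis.dis(code)                        # bytecode listing; varies by Python version
```

The bytecode printed by dis.dis differs between Python versions, so it is best read as a snapshot of one interpreter rather than a fixed reference.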
This material is aimed at anybody interested in gaining some insight into how the CPython virtual machine functions. It is assumed that the reader is already familiar with Python and understands the fundamentals of the language. As part of this exposé on the virtual machine, we go through a considerable amount of C code, so a reader with a rudimentary understanding of C will find it easier to follow. When all is said and done, all that is needed to get through this material is a healthy desire to learn about the CPython virtual machine.

This work is an expanded version of personal notes taken while investigating the inner workings of the Python interpreter. There is a substantial amount of wisdom available in PyCon videos³, school lectures⁴ and blog write-ups⁵. This work would not be complete without acknowledging these fantastic sources, which have been leveraged in its production.

At the end of this book, a reader should be able to understand the intricacies of how the Python interpreter executes a program. This includes the various steps involved in executing the program and the various data structures that are crucial to its execution. We start off with a gentle bird's-eye view of what happens when a trivial program is executed by passing the module name to the interpreter at the command line. The CPython executable can be installed from source by following the instructions in the Python Developer's Guide⁶. Python version 3 is used throughout this material.

³https://www.youtube.com/watch?v=XGF3Qu4dUqk
⁴http://pgbovine.net/cpython-internals.htm/
⁵https://tech.blog.aknin.name/2010/04/02/pythons-innards-introduction/
⁶https://docs.python.org/devguide/index.html#
2. The View From 30,000ft

This chapter provides a high-level exposé of how the interpreter goes about executing a Python program. In subsequent chapters, we zoom in on the various pieces of the puzzle and describe them in more detail. Regardless of the complexity of a Python program, this process is the same. The excellent explanation of this process provided by Yaniv Aknin in his Python's Innards series¹ provides some of the basis and motivation for this discussion.

Given a Python module, test.py, this module can be executed at the command line by passing it as an argument to the Python interpreter program, as in $ python test.py. This is just one of the ways of invoking the Python executable; we could also start the interactive interpreter, execute a string as code, and so on, but these other methods of execution are not of interest to us. When the module is passed as an argument to the executable on the command line, figure 2.1 best captures the flow of the various activities involved in the actual execution of the supplied module.

Figure 2.1: Flow during execution of source code

The Python executable is a C program just like any other C program, such as the Linux kernel or a simple hello world program in C, so pretty much the same process happens when the Python executable is invoked. Take a moment to grasp this: the Python executable is just another program that runs your own program. The same argument can be made for the relationship between C and assembly or LLVM. The standard process initialization, which depends on the platform the executable is running on, starts once the Python executable is invoked with the module name as argument,

¹https://tech.blog.aknin.name/2010/04/02/pythons-innards-introduction/