Scipy Lecture Notes
www.scipy-lectures.org
2017 Edition
Edited by Gaël Varoquaux, Emmanuelle Gouillart, Olav Vahtras
Gaël Varoquaux • Emmanuelle Gouillart • Olav Vahtras • Christopher Burns • Adrian Chauve • Robert Cimrman • Christophe Combelles • Pierre de Buyl • Ralf Gommers • André Espaze • Zbigniew Jędrzejewski-Szmek • Valentin Haenel • Gert-Ludwig Ingold • Fabian Pedregosa • Didrik Pinte • Nicolas P. Rougier • Pauli Virtanen and many others...
Contents
1 Getting started with Python for science . . . 2
1.1 Python scientific computing ecosystem . . . 2
1.2 The Python language . . . 10
1.3 NumPy: creating and manipulating numerical data . . . 47
1.4 Matplotlib: plotting . . . 101
1.5 Scipy : high-level scientific computing . . . 197
1.6 Getting help and finding documentation . . . 278
2 Advanced topics . . . 282
2.1 Advanced Python Constructs . . . 282
2.2 Advanced NumPy . . . 299
2.3 Debugging code . . . 334
2.4 Optimizing code . . . 345
2.5 Sparse Matrices in SciPy . . . 351
2.6 Image manipulation and processing using Numpy and Scipy . . . 372
2.7 Mathematical optimization: finding minima of functions . . . 418
2.8 Interfacing with C . . . 462
3 Packages and applications . . . 484
3.1 Statistics in Python . . . 484
3.2 Sympy : Symbolic Mathematics in Python . . . 530
3.3 Scikit-image: image processing . . . 538
3.4 Traits: building interactive dialogs . . . 565
3.5 3D plotting with Mayavi . . . 585
3.6 scikit-learn: machine learning in Python . . . 600
Index . . . 688
Scipy lecture notes, Edition 2017.1
CHAPTER 1
Getting started with Python for science
This part of the Scipy lecture notes is a self-contained introduction to everything that is needed to use Python for
science, from the language itself, to numerical computing or plotting.
1.1 Python scientific computing ecosystem
Authors: Fernando Perez, Emmanuelle Gouillart, Gaël Varoquaux, Valentin Haenel
1.1.1 Why Python?
The scientist’s needs
• Get data (simulation, experiment control),
• Manipulate and process data,
• Visualize results, both quickly for understanding and with high-quality figures for reports or publications.
Python’s strengths
• Batteries included: a rich collection of existing building blocks for classic numerical methods, plotting, and data
processing. We don’t want to re-program the plotting of a curve, a Fourier transform, or a fitting algorithm.
Don’t reinvent the wheel!
• Easy to learn: most scientists are not paid as programmers, nor have they been trained as such. They need
to be able to draw a curve, smooth a signal, or do a Fourier transform in a few minutes.
• Easy communication: to keep code alive within a lab or a company, it should be as readable as a book to
collaborators, students, or maybe customers. Python syntax is simple, avoiding strange symbols or lengthy
routine specifications that would divert the reader from the mathematical or scientific understanding of the
code.
• Efficient code: Python numerical modules are computationally efficient. But very fast code becomes
useless if too much time is spent writing it. Python aims for both quick development times and
quick execution times.
• Universal: Python is a language used for many different problems. Learning Python avoids having to learn
new software for each new problem.
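As a rough illustration of the "few minutes" claim above, here is a minimal sketch that smooths a signal and takes its Fourier transform using NumPy only. The synthetic signal, its 5 Hz frequency, the sampling rate, and the 5-point moving-average window are all assumptions chosen for the example, not values from these notes.

```python
import numpy as np

# Hypothetical test signal: a 5 Hz sine sampled at 100 Hz for 2 s, plus noise.
rng = np.random.default_rng(0)
sample_rate = 100.0
t = np.arange(200) / sample_rate
signal = np.sin(2 * np.pi * 5.0 * t) + 0.3 * rng.standard_normal(t.size)

# Smooth the signal with a 5-point moving average.
window = np.ones(5) / 5
smoothed = np.convolve(signal, window, mode="same")

# Fourier transform of the real-valued signal, with the matching frequencies.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)

# Strongest non-constant component (skip the zero-frequency bin).
dominant = freqs[np.argmax(np.abs(spectrum[1:])) + 1]
print(f"dominant frequency: {dominant:.1f} Hz")
```

Drawing the curve itself would take one more call to a plotting library such as Matplotlib, which is left out here to keep the sketch dependent on NumPy alone.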
How does Python compare to other solutions?
Compiled languages: C, C++, Fortran. . .
Pros
• Very fast. For heavy computations, it’s difficult to outperform these languages.
Cons
• Painful usage: no interactivity during development, mandatory compilation steps, verbose
syntax, manual memory management. These are difficult languages for non-programmers.
Matlab scripting language
Pros
• Very rich collection of libraries with numerous algorithms, for many different domains. Fast
execution because these libraries are often written in a compiled language.
• Pleasant development environment: comprehensive help, integrated editor, etc.
• Commercial support is available.
Cons
• Base language is quite poor and can become restrictive for advanced users.
• Not free.
Julia
Pros
• Fast code, yet interactive and simple.
• Easily connects to Python or C.
Cons
• Ecosystem limited to numerical computing.
• Still young.
Other scripting languages: Scilab, Octave, R, IDL, etc.
Pros
• Open-source, free, or at least cheaper than Matlab.
• Some features can be very advanced (statistics in R, etc.)
Cons
• Fewer available algorithms than in Matlab, and the language is not more advanced.
• Some software is dedicated to one domain. Ex: Gnuplot to draw curves. These programs
are very powerful, but they are restricted to a single type of usage, such as plotting.
Python
Pros
• Very rich scientific computing libraries
• Well-thought-out language that makes it possible to write very readable and well-structured code: we
“code what we think”.
• Many libraries beyond scientific computing (web server, serial port access, etc.)
• Free and open-source software, widely spread, with a vibrant community.
• A variety of powerful environments to work in, such as IPython, Spyder, Jupyter notebooks,
PyCharm.
Cons
• Does not offer all the algorithms that can be found in more specialized software or toolboxes.
1.1.2 The Scientific Python ecosystem
Unlike Matlab, or R, Python does not come with a pre-bundled set of modules for scientific computing. Below are
the basic building blocks that can be combined to obtain a scientific computing environment:
Python, a generic and modern computing language
• The language: flow control, data types (string, int), data collections (lists, dictionaries), etc.
• Modules of the standard library: string processing, file management, simple network protocols.
• A large number of specialized modules or applications written in Python: web framework, etc. . . . and
scientific computing.
• Development tools (automatic testing, documentation generation)
See also:
chapter on Python language
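The language features listed above can be sketched with the standard library alone. The sample strings, the threshold of 10, and the file name below are hypothetical values picked for illustration.

```python
import os
import tempfile

# Data types and collections: strings, ints, a list and a dictionary.
samples = ["a=1", "b=22", "c=333"]
values = {}

# Flow control: loop over the list and branch on a condition.
for item in samples:
    name, raw = item.split("=")  # string processing
    number = int(raw)
    if number > 10:
        values[name] = number

# File management: write the results to a temporary file and read them back.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "values.txt")
    with open(path, "w") as f:
        for name in sorted(values):
            f.write(f"{name} {values[name]}\n")
    with open(path) as f:
        lines = f.read().splitlines()

print(lines)
```

Nothing here requires a third-party package: the scientific libraries described next build on these same language constructs.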
Core numeric libraries
• Numpy: numerical computing with powerful numerical array objects, and routines to manipulate them.
http://www.numpy.org/
See also:
chapter on numpy
• Scipy : high-level numerical routines. Optimization, regression, interpolation, etc. http://www.scipy.org/
See also:
chapter on scipy
• Matplotlib : 2-D visualization, “publication-ready” plots http://matplotlib.org/
See also:
chapter on matplotlib
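To give a feel for how the first two of these libraries divide the work, here is a minimal sketch, assuming NumPy and SciPy are installed. The quadratic function and the grid are made up for the example. NumPy supplies the array and element-wise arithmetic, while SciPy supplies a high-level optimization routine.

```python
import numpy as np
from scipy import optimize

# NumPy: create an array and operate on it without explicit loops.
x = np.linspace(0.0, 4.0, 9)  # 9 evenly spaced points from 0 to 4
y = (x - 2.0) ** 2 + 1.0      # element-wise arithmetic on the whole array

# SciPy: a high-level routine that minimizes a scalar function.
result = optimize.minimize_scalar(lambda v: (v - 2.0) ** 2 + 1.0)

print("grid minimum at x =", x[np.argmin(y)])
print("optimizer minimum at x =", round(result.x, 6))
```

The grid search only finds the best of the sampled points, whereas the optimizer refines the location of the minimum without needing a grid at all.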
Advanced interactive environments:
• IPython, an advanced Python console http://ipython.org/
• Jupyter, notebooks in the browser http://jupyter.org/
Domain-specific packages,
• Mayavi for 3-D visualization
• pandas, statsmodels, seaborn for statistics
• sympy for symbolic computing
• scikit-image for image processing
• scikit-learn for machine learning
and many more packages not documented in the Scipy lectures.
See also:
chapters on advanced topics
chapters on packages and applications