pandas Official Documentation (Chinese Edition).pdf

Published: 2022-05-29 | Uploaded by: admin | Category: Manuals | File size: 2.80M | Format: pdf

[Page previews: pages 1 to 8 of 214]

Pandas Official Tutorials

10 Minutes to pandas

Pandas Cookbook

Chapter 1

Chapter 2

Chapter 3

Chapter 4

Chapter 5

Chapter 6

Chapter 7

Chapter 8

Chapter 9

Learn Pandas

01 - Lesson

02 - Lesson

03 - Lesson

04 - Lesson

05 - Lesson

06 - Lesson

07 - Lesson

08 - Lesson

09 - Lesson

10 - Lesson

11 - Lesson

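Pandas Official Tutorials

The official tutorials are the tutorials listed on the tutorials page of the official documentation.

Name                    Original                Translator
10 Minutes to pandas    10 Minutes to pandas    ChaoSimple
Pandas Cookbook         Pandas cookbook         飞龙
Learn Pandas            Learn Pandas            派兰数据

Available formats: online reading, PDF, EPUB, MOBI, code repository.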

10 Minutes to pandas

Original: 10 Minutes to pandas
Translator: ChaoSimple
Source: 【原】十分钟搞定pandas

This is a simple translation of "10 Minutes to pandas" from the official pandas website. It is a short introduction to pandas; for more detail, see the Cookbook.

By convention, the required packages are imported as follows:

    In [1]: import pandas as pd
    In [2]: import numpy as np
    In [3]: import matplotlib.pyplot as plt

I. Object Creation

See the Intro to Data Structures section for details on this topic.

1. Create a Series by passing a list; pandas creates a default integer index:

    In [4]: s = pd.Series([1,3,5,np.nan,6,8])

    In [5]: s
    Out[5]:
    0    1.0
    1    3.0
    2    5.0
    3    NaN
    4    6.0
    5    8.0
    dtype: float64
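The output above shows float64 even though most of the inputs are integers. A minimal sketch of why, assuming a current pandas and NumPy installation: np.nan is a floating-point value, so the whole Series is upcast to float64. The names index_labels and n_missing are illustrative helpers, not part of the original tutorial.

```python
import numpy as np
import pandas as pd

# A list containing np.nan: the integers are upcast to float64
# because NaN is a floating-point value.
s = pd.Series([1, 3, 5, np.nan, 6, 8])

# The default index runs from 0 to len(s) - 1.
index_labels = list(s.index)

# isna() marks the missing entry; summing the booleans counts it.
n_missing = int(s.isna().sum())

print(s.dtype)
print(index_labels)
print(n_missing)
```

Checking dtype and isna() right after construction is a cheap way to confirm that missing values were read as intended.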

2. Create a DataFrame by passing a NumPy array, with a datetime index and labeled columns:

    In [6]: dates = pd.date_range('20130101', periods=6)

    In [7]: dates
    Out[7]:
    DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
                   '2013-01-05', '2013-01-06'],
                  dtype='datetime64[ns]', freq='D')

    In [8]: df = pd.DataFrame(np.random.randn(6,4), index=dates, columns=list('ABCD'))

    In [9]: df
    Out[9]:
                       A         B         C         D
    2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
    2013-01-02  1.212112 -0.173215  0.119209 -1.044236
    2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
    2013-01-04  0.721555 -0.706771 -1.039575  0.271860
    2013-01-05 -0.424972  0.567020  0.276232 -1.087401
    2013-01-06 -0.673690  0.113648 -1.478427  0.524988

3. Create a DataFrame by passing a dict of objects that can be converted to series-like structures:

    In [10]: df2 = pd.DataFrame({ 'A' : 1.,
       ....:                      'B' : pd.Timestamp('20130102'),
       ....:                      'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
       ....:                      'D' : np.array([3] * 4,dtype='int32'),
       ....:                      'E' : pd.Categorical(["test","train","test","train"]),
       ....:                      'F' : 'foo' })
       ....:

    In [11]: df2
    Out[11]:
         A          B    C  D      E    F
    0  1.0 2013-01-02  1.0  3   test  foo
    1  1.0 2013-01-02  1.0  3  train  foo
    2  1.0 2013-01-02  1.0  3   test  foo
    3  1.0 2013-01-02  1.0  3  train  foo

4. View the data type of each column:

    In [12]: df2.dtypes
    Out[12]:
    A           float64
    B    datetime64[ns]
    C           float32
    D             int32
    E          category
    F            object
    dtype: object

5. If you are using IPython, Tab completion automatically recognizes all attributes as well as user-defined column names. The following is a subset of the attributes that are completed:

    In [13]: df2.<TAB>
    df2.A                  df2.boxplot
    df2.abs                df2.C
    df2.add                df2.clip
    df2.add_prefix         df2.clip_lower
    df2.add_suffix         df2.clip_upper
    df2.align              df2.columns
    df2.all                df2.combine
    df2.any                df2.combineAdd
    df2.append             df2.combine_first
    df2.apply              df2.combineMult
    df2.applymap           df2.compound
    df2.as_blocks          df2.consolidate
    df2.asfreq             df2.convert_objects
    df2.as_matrix          df2.copy
    df2.astype             df2.corr
    df2.at                 df2.corrwith
    df2.at_time            df2.count
    df2.axes               df2.cov
    df2.B                  df2.cummax
    df2.between_time       df2.cummin
    df2.bfill              df2.cumprod
    df2.blocks             df2.cumsum
    df2.bool               df2.D

II. Viewing Data

For details, see the Basics section.

1. View the top and bottom rows of a DataFrame:

    In [14]: df.head()
    Out[14]:
                       A         B         C         D
    2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
    2013-01-02  1.212112 -0.173215  0.119209 -1.044236
    2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
    2013-01-04  0.721555 -0.706771 -1.039575  0.271860
    2013-01-05 -0.424972  0.567020  0.276232 -1.087401

    In [15]: df.tail(3)
    Out[15]:
                       A         B         C         D
    2013-01-04  0.721555 -0.706771 -1.039575  0.271860
    2013-01-05 -0.424972  0.567020  0.276232 -1.087401
    2013-01-06 -0.673690  0.113648 -1.478427  0.524988

2. Display the index, the columns, and the underlying NumPy data:

    In [16]: df.index
    Out[16]:
    DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
                   '2013-01-05', '2013-01-06'],
                  dtype='datetime64[ns]', freq='D')

    In [17]: df.columns
    Out[17]: Index([u'A', u'B', u'C', u'D'], dtype='object')

    In [18]: df.values
    Out[18]:
    array([[ 0.4691, -0.2829, -1.5091, -1.1356],
           [ 1.2121, -0.1732,  0.1192, -1.0442],
           [-0.8618, -2.1046, -0.4949,  1.0718],
           [ 0.7216, -0.7068, -1.0396,  0.2719],
           [-0.425 ,  0.567 ,  0.2762, -1.0874],
           [-0.6737,  0.1136, -1.4784,  0.525 ]])

3. describe() gives a quick statistical summary of the data:

    In [19]: df.describe()
    Out[19]:
                  A         B         C         D
    count  6.000000  6.000000  6.000000  6.000000
    mean   0.073711 -0.431125 -0.687758 -0.233103
    std    0.843157  0.922818  0.779887  0.973118
    min   -0.861849 -2.104569 -1.509059 -1.135632
    25%   -0.611510 -0.600794 -1.368714 -1.076610
    50%    0.022070 -0.228039 -0.767252 -0.386188
    75%    0.658444  0.041933 -0.034326  0.461706
    max    1.212112  0.567020  0.276232  1.071804

4. Transpose the data:

    In [20]: df.T
    Out[20]:
       2013-01-01  2013-01-02  2013-01-03  2013-01-04  2013-01-05  2013-01-06
    A    0.469112    1.212112   -0.861849    0.721555   -0.424972   -0.673690
    B   -0.282863   -0.173215   -2.104569   -0.706771    0.567020    0.113648
    C   -1.509059    0.119209   -0.494929   -1.039575    0.276232   -1.478427
    D   -1.135632   -1.044236    1.071804    0.271860   -1.087401    0.524988

5. Sorting by an axis
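The preview stops at this heading, so the tutorial's own example for item 5 is not shown here. As a minimal sketch of what sorting by an axis can look like, assuming the standard DataFrame.sort_index method: the frame below is a seeded stand-in for df (the seed and the names by_columns and by_rows are assumptions made for illustration), so its values differ from the ones printed above.

```python
import numpy as np
import pandas as pd

# Stand-in for the tutorial's df; the seed (0) is an arbitrary choice.
rng = np.random.default_rng(0)
dates = pd.date_range("2013-01-01", periods=6)
df = pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list("ABCD"))

# axis=1 sorts by column labels; ascending=False reverses the order.
by_columns = df.sort_index(axis=1, ascending=False)

# axis=0 sorts by row labels; here the latest date comes first.
by_rows = df.sort_index(axis=0, ascending=False)

print(by_columns.columns.tolist())  # ['D', 'C', 'B', 'A']
print(by_rows.index[0])
```

sort_index returns a new object and leaves df unchanged, which keeps the original ordering available for later steps.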
