dataanalyst_201602-student_slides.pdf

Cloudera Data Analyst Training: Using Pig, Hive, and Impala with Hadoop 201602
Introduction Chapter 1
Course Chapters
§ Introduction
§ Hadoop Fundamentals
§ Introduction to Pig
§ Basic Data Analysis with Pig
§ Processing Complex Data with Pig
§ Multi-Dataset Operations with Pig
§ Pig Troubleshooting and Optimization
§ Introduction to Impala and Hive
§ Querying with Impala and Hive
§ Impala and Hive Data Management
§ Data Storage and Performance
§ Relational Data Analysis With Impala and Hive
§ Working with Impala
§ Analyzing Text and Complex Data with Hive
§ Hive Optimization
§ Extending Hive
§ Choosing the Best Tool for the Job
§ Conclusion
© Copyright 2010-2016 Cloudera. All rights reserved. Not to be reproduced or shared without prior written consent from Cloudera.

Chapter Topics: Introduction
§ About this Course
§ About Cloudera
§ Course Logistics
§ Introductions

Course Objectives (1)
During this course, you will learn
§ The purpose of Hadoop and its related tools
§ The features that Pig, Hive, and Impala offer for data acquisition, storage, and analysis
§ How to identify typical use cases for large-scale data analysis
§ How to load data from relational databases and other sources
§ How to manage data in HDFS and export it for use with other systems
§ How Pig, Hive, and Impala improve productivity for typical analysis tasks
§ The language syntax and data formats supported by these tools

Course Objectives (2)
§ How to design and execute queries on data stored in HDFS
§ How to join diverse datasets to gain valuable business insight
§ How Hive and Impala can be extended with custom functions and scripts
§ How to analyze structured, semi-structured, and unstructured data
§ How to store and query data for better performance
§ How to determine which tool is the best choice for a given task

About Cloudera (1)
§ The leader in Apache Hadoop-based software and services
§ Founded by Hadoop experts from Facebook, Yahoo, Google, and Oracle
§ Provides support, consulting, training, and certification for Hadoop users
§ Staff includes committers to virtually all Hadoop projects
§ Many authors of industry standard books on Apache Hadoop projects