Big Data Analytics Platform Technical Documentation
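Abstract: This document describes how to quickly set up a distributed big data analytics platform, consisting mainly of an HDFS storage layer at the bottom and a Spark analysis layer on top.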
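Implementation: there are two main ways to build a distributed big data analytics platform:
1. A real cluster of physical machines, managed with the Apache Ambari framework
2. A private cloud built through virtualization, using Kubernetes + Docker to provide a cloud platform whose network spans physical machines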
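Environment: three physical machines running CentOS 7. (Kubernetes releases iterate frequently; its services are now managed through systemctl on CentOS 7, and it requires glibc 1.4 or later underneath, so releases older than CentOS 7 need a glibc upgrade, which is troublesome to carry out.)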
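The glibc requirement can be checked before any installation work starts. The following is a minimal sketch, assuming GNU coreutils (for sort -V) and a glibc-based system; the function names and the REQUIRED_GLIBC variable are illustrative and not part of any tool mentioned here.

```shell
# Return success (exit status 0) if version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort, so 2.5 correctly sorts before 2.14.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the glibc version of the current machine.
# getconf GNU_LIBC_VERSION prints a line of the form "glibc <version>" on glibc systems.
current_glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# Usage (REQUIRED_GLIBC is a placeholder; take the value from the release notes
# of the Kubernetes build being deployed):
#   if version_ge "$(current_glibc_version)" "$REQUIRED_GLIBC"; then
#       echo "glibc is new enough"
#   else
#       echo "glibc must be upgraded"
#   fi
```

A plain string comparison would rank 2.5 above 2.14, which is why the sketch delegates the ordering to sort -V.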
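Ambari cluster setup:
Step 1: Download the Ambari installation files
1. Choose the installation method that matches your operating system. (Here we use the first one, the CentOS method with yum install; the yum repository file must be downloaded beforehand.)
Redhat/CentOS/Oracle:
cd /etc/yum.repos.d/
wget <ambari-repo-url>
SUSE:
cd /etc/zypp/repos.d
wget <ambari-repo-url>
Ubuntu:
cd /etc/apt/sources.list.d
wget <ambari-repo-url>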
2. Choose the appropriate ambari-repo-url to download:
(Redhat / CentOS / Oracle) 6, 7:
http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/ambari.repo
(Redhat / CentOS / Oracle) 5:
http://public-repo-1.hortonworks.com/ambari/centos5/1.x/updates/1.7.0/ambari.repo
SUSE 11:
http://public-repo-1.hortonworks.com/ambari/suse11/1.x/updates/1.7.0/ambari.repo
Ubuntu 12:
http://public-repo-1.hortonworks.com/ambari/ubuntu12/1.x/updates/1.7.0/ambari.list
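The mapping from platform to repository URL and target directory can be captured in two small helpers. This is a sketch only: the function names and the platform keys (centos6, centos5, suse11, ubuntu12) are chosen here for illustration, mirroring the path component of the URLs listed above.

```shell
# Print the Ambari 1.7.0 repository URL for a platform key.
ambari_repo_url() {
    base="http://public-repo-1.hortonworks.com/ambari"
    case "$1" in
        centos6|centos5|suse11) echo "$base/$1/1.x/updates/1.7.0/ambari.repo" ;;
        ubuntu12)               echo "$base/$1/1.x/updates/1.7.0/ambari.list" ;;
        *) echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}

# Print the directory in which the repository file belongs for a platform key.
repo_dir() {
    case "$1" in
        centos*) echo /etc/yum.repos.d ;;
        suse*)   echo /etc/zypp/repos.d ;;
        ubuntu*) echo /etc/apt/sources.list.d ;;
        *) echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}

# Usage (run as root on the target machine):
#   cd "$(repo_dir centos6)" && wget "$(ambari_repo_url centos6)"
```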
Step 2: Install, start, and use Ambari Server
1. Install Ambari Server from the public repository with yum or the equivalent package manager:
Redhat/CentOS/Oracle:
yum install ambari-server
SUSE:
zypper install ambari-server
Ubuntu:
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
apt-get update
apt-get install ambari-server
2. Run ambari-server setup to configure the environment your cluster depends on, then start the server:
ambari-server setup
ambari-server start
3. Install and start ambari-agent on the agent (slave) nodes:
yum install ambari-agent
ambari-agent start
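A manually installed agent also has to know where the server is. The sketch below assumes the usual package layout, in which the agent reads a hostname= key from /etc/ambari-agent/conf/ambari-agent.ini; verify the path and key on your installation. The function name set_agent_server is illustrative.

```shell
# Rewrite the "hostname=" line of an ambari-agent.ini file so the agent reports to the given server.
# $1: path to the ini file, $2: host name of the Ambari Server.
set_agent_server() {
    ini="$1"
    server="$2"
    tmp="$(mktemp)" || return 1
    # Replace only the value of the hostname key; all other lines pass through unchanged.
    if sed "s/^hostname=.*/hostname=$server/" "$ini" > "$tmp"; then
        # Write back with cat so the original file keeps its owner and permissions.
        cat "$tmp" > "$ini"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage on each agent node (node01 is the server chosen in this document):
#   set_agent_server /etc/ambari-agent/conf/ambari-agent.ini node01 && ambari-agent restart
```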
Detailed steps:
Step 1: Download the repository file for your system
[root@jifeng02 ~]# cd /etc/yum.repos.d/
Choose the appropriate repository URL and download it with wget:
[root@jifeng02 yum.repos.d]# wget http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.6.1/ambari.repo
--2014-10-17 22:30:35-- http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.6.1/ambari.repo
Resolving public-repo-1.hortonworks.com... 54.192.157.143, 54.230.158.152, 54.230.158.197, ...
Connecting to public-repo-1.hortonworks.com|54.192.157.143|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 472 [binary/octet-stream]
Saving to: “ambari.repo”
100%[======================================================================>] 472 --.-K/s in 0s
2014-10-17 22:30:35 (81.0 MB/s) - “ambari.repo” saved [472/472]
Step 2: Prepare the environment
(1) Preparation (basic tools, initial configuration)
Tools: yum, rpm, scp, curl, wget, pdsh (the first few usually ship with the system; pdsh must be installed manually)
yum install pdsh
(2) Configure hosts
Edit /etc/hosts (vim /etc/hosts) and add:
10.*.*.01 node01
10.*.*.02 node02
10.*.*.03 node03
10.*.*.04 node04
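Because every node needs the same entries, it helps to add them in a way that is safe to repeat. Below is a minimal sketch of an idempotent append; the function name is illustrative, and the addresses in the usage comment are placeholders from the documentation range, since the real addresses are masked above.

```shell
# Append "IP NAME" to a hosts file unless a non-comment line already lists NAME.
# $1: hosts file, $2: IP address, $3: host name.
add_host_entry() {
    file="$1"; ip="$2"; name="$3"
    # Compare NAME against every field after the address column, skipping comment lines.
    if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (placeholder addresses; substitute the real ones for your network):
#   add_host_entry /etc/hosts 192.0.2.11 node01
#   add_host_entry /etc/hosts 192.0.2.12 node02
```

Running the same call twice leaves a single entry, so the script can be pushed to all nodes without checking which ones were already configured.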
(3) Configure passwordless SSH login
node01 is chosen as the Ambari Server, so passwordless SSH login from this node to the other nodes must be configured:
cd ~
ssh-keygen
Press Enter at every prompt; this generates the default private and public keys:
.ssh/id_rsa
.ssh/id_rsa.pub
Configure passwordless login to the local node:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Then set the permissions (authorized_keys must exist before chmod is run on it):
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
Configure passwordless login to the other nodes (dn1, dn2, and dn3 here):
scp ~/.ssh/authorized_keys dn1:/root/.ssh/
scp ~/.ssh/authorized_keys dn2:/root/.ssh/
scp ~/.ssh/authorized_keys dn3:/root/.ssh/
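Copying authorized_keys with scp overwrites whatever keys the target already holds and fails if /root/.ssh does not exist there. As an alternative, the sketch below swaps in ssh-copy-id, which appends the key instead. It only prints the commands so the host list can be reviewed first; the function name is illustrative.

```shell
# Print one ssh-copy-id command per host instead of running it.
# $1: remote user, remaining arguments: host names.
print_key_copy_commands() {
    user="$1"; shift
    for host in "$@"; do
        # The ~ stays literal here and is expanded only when the printed command is run.
        printf 'ssh-copy-id -i ~/.ssh/id_rsa.pub %s@%s\n' "$user" "$host"
    done
}

# Usage: review the output, then pipe it to a shell to execute it.
#   print_key_copy_commands root dn1 dn2 dn3
#   print_key_copy_commands root dn1 dn2 dn3 | sh
```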
Download the private key from the master; it is needed later when configuring the Ambari agents:
.ssh/id_rsa
Synchronize the cluster clocks (NTP)
On node01, run yum install ntp, then edit /etc/ntp.conf:
server 192.168.0.1
fudge 192.168.0.1 stratum 8
Run: chkconfig ntpd on
Run: watch ntpq -p to monitor the synchronization status.
On node02 and node03, run yum install ntp, then edit /etc/ntp.conf:
server node01
server 192.168.0.1
fudge 192.168.0.1 stratum 8
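Instead of watching the ntpq -p table by eye, a script can look for the peer that ntpq marks with a leading "*", which is the source the daemon is currently synchronized to. A minimal sketch follows; the function name is illustrative.

```shell
# Read `ntpq -p` output on stdin and print the peer currently selected for synchronization.
# ntpq prefixes that peer with "*"; the exit status is non-zero when no peer is selected yet.
selected_ntp_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 } END { exit !found }'
}

# Usage on a client node:
#   if peer="$(ntpq -p | selected_ntp_peer)"; then
#       echo "synchronized to $peer"
#   else
#       echo "not synchronized yet"
#   fi
```

It can take several minutes after ntpd starts before any peer is selected, so a non-zero status right after startup is expected.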
Disable SELinux and iptables, and set the umask:
setenforce 0
chkconfig iptables off
/etc/init.d/iptables stop
umask 022
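Note that setenforce 0 only lasts until the next reboot. To make the change persistent, the SELINUX= line in /etc/selinux/config has to be edited as well. The following is a sketch of that edit; the function name is illustrative, and it takes the file path as a parameter so it can be tried on a copy first.

```shell
# Rewrite the SELINUX= line of an SELinux config file.
# $1: config file (normally /etc/selinux/config), $2: enforcing | permissive | disabled.
set_selinux_mode() {
    conf="$1"; mode="$2"
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "invalid SELinux mode: $mode" >&2; return 1 ;;
    esac
    tmp="$(mktemp)" || return 1
    # The pattern requires "=" directly after SELINUX, so SELINUXTYPE= is left untouched.
    if sed "s/^SELINUX=.*/SELINUX=$mode/" "$conf" > "$tmp"; then
        cat "$tmp" > "$conf"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (as root):
#   set_selinux_mode /etc/selinux/config permissive
```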
Disable PackageKit:
vim /etc/yum/pluginconf.d/refresh-packagekit.conf
Change the setting to enabled=0
Step 3: Install, set up, and start ambari-server
yum install ambari-server
[root@jifeng02 jifeng]# yum install ambari-server
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
* base: centos.ustc.edu.cn
* extras: centos.ustc.edu.cn
* updates: centos.ustc.edu.cn
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package ambari-server.noarch 0:1.6.1-98 will be installed
--> Processing Dependency: postgresql-server >= 8.1 for package: ambari-server-1.6.1-98.noarch
--> Running transaction check
---> Package postgresql-server.i686 0:8.4.20-1.el6_5 will be installed
--> Processing Dependency: postgresql-libs(x86-32) = 8.4.20-1.el6_5 for package: postgresql-server-8.4.20-1.el6_5.i686
--> Processing Dependency: postgresql(x86-32) = 8.4.20-1.el6_5 for package: postgresql-server-8.4.20-1.el6_5.i686
--> Processing Dependency: libssl.so.10(libssl.so.10) for package: postgresql-server-8.4.20-1.el6_5.i686
--> Processing Dependency: libcrypto.so.10(libcrypto.so.10) for package: postgresql-server-8.4.20-1.el6_5.i686
--> Running transaction check
---> Package openssl.i686 0:1.0.0-27.el6 will be updated
--> Processing Dependency: openssl = 1.0.0-27.el6 for package: openssl-devel-1.0.0-27.el6.i686
---> Package openssl.i686 0:1.0.1e-30.el6_5.2 will be an update
---> Package postgresql.i686 0:8.4.13-1.el6_3 will be updated
--> Processing Dependency: postgresql(x86-32) = 8.4.13-1.el6_3 for package: postgresql-devel-8.4.13-1.el6_3.i686
---> Package postgresql.i686 0:8.4.20-1.el6_5 will be an update
---> Package postgresql-libs.i686 0:8.4.13-1.el6_3 will be updated
---> Package postgresql-libs.i686 0:8.4.20-1.el6_5 will be an update
--> Running transaction check
---> Package openssl-devel.i686 0:1.0.0-27.el6 will be updated
---> Package openssl-devel.i686 0:1.0.1e-30.el6_5.2 will be an update
---> Package postgresql-devel.i686 0:8.4.13-1.el6_3 will be updated
---> Package postgresql-devel.i686 0:8.4.20-1.el6_5 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package             Arch     Version              Repository              Size
================================================================================
Installing:
 ambari-server       noarch   1.6.1-98             Updates-ambari-1.6.1    39 M
Installing for dependencies:
 postgresql-server   i686     8.4.20-1.el6_5       updates                3.4 M
Updating for dependencies:
 openssl             i686     1.0.1e-30.el6_5.2    updates                1.5 M
 openssl-devel       i686     1.0.1e-30.el6_5.2    updates                1.2 M
 postgresql          i686     8.4.20-1.el6_5       updates                2.6 M
 postgresql-devel    i686     8.4.20-1.el6_5       updates                810 k
 postgresql-libs     i686     8.4.20-1.el6_5       updates                205 k
Transaction Summary
================================================================================
Install       2 Package(s)
Upgrade       5 Package(s)
Total download size: 49 M
Is this ok [y/N]: y
Downloading Packages:
(1/7): ambari-server-1.6.1-98.noarch.rpm            |  39 MB     09:05
(2/7): openssl-1.0.1e-30.el6_5.2.i686.rpm           | 1.5 MB     00:01
(3/7): openssl-devel-1.0.1e-30.el6_5.2.i686.rpm     | 1.2 MB     00:01
(4/7): postgresql-8.4.20-1.el6_5.i686.rpm           | 2.6 MB     00:03
(5/7): postgresql-devel-8.4.20-1.el6_5.i686.rpm     | 810 kB     00:01
(6/7): postgresql-libs-8.4.20-1.el6_5.i686.rpm      | 205 kB     00:00
(7/7): postgresql-server-8.4.20-1.el6_5.i686.rpm    | 3.4 MB     00:03
--------------------------------------------------------------------------------
Total                                       89 kB/s |  49 MB     09:20
warning: rpmts_HdrFromFdno: Header V4 RSA/SHA1 Signature, key ID 07513cad: NOKEY
Retrieving key from http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
Importing GPG key 0x07513CAD:
Userid: "Jenkins (HDP Builds) "
From : http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
Is this ok [y/N]: y
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Updating : openssl-1.0.1e-30.el6_5.2.i686 1/12
Updating : postgresql-libs-8.4.20-1.el6_5.i686 2/12
Updating : openssl-devel-1.0.1e-30.el6_5.2.i686 3/12
Updating : postgresql-8.4.20-1.el6_5.i686 4/12
Installing : postgresql-server-8.4.20-1.el6_5.i686 5/12
Installing : ambari-server-1.6.1-98.noarch 6/12
Updating : postgresql-devel-8.4.20-1.el6_5.i686 7/12
Cleanup : postgresql-devel-8.4.13-1.el6_3.i686 8/12
Cleanup : postgresql-8.4.13-1.el6_3.i686 9/12
Cleanup : postgresql-libs-8.4.13-1.el6_3.i686 10/12
Cleanup : openssl-devel-1.0.0-27.el6.i686 11/12
Cleanup : openssl-1.0.0-27.el6.i686 12/12
Verifying : postgresql-8.4.20-1.el6_5.i686 1/12
Verifying : postgresql-devel-8.4.20-1.el6_5.i686 2/12
Verifying : openssl-1.0.1e-30.el6_5.2.i686 3/12
Verifying : ambari-server-1.6.1-98.noarch 4/12
Verifying : postgresql-server-8.4.20-1.el6_5.i686 5/12
Verifying : openssl-devel-1.0.1e-30.el6_5.2.i686 6/12
Verifying : postgresql-libs-8.4.20-1.el6_5.i686 7/12
Verifying : postgresql-8.4.13-1.el6_3.i686 8/12
Verifying : openssl-1.0.0-27.el6.i686 9/12
Verifying : postgresql-libs-8.4.13-1.el6_3.i686 10/12
Verifying : postgresql-devel-8.4.13-1.el6_3.i686 11/12
Verifying : openssl-devel-1.0.0-27.el6.i686 12/12
Installed:
ambari-server.noarch 0:1.6.1-98
Dependency Installed:
postgresql-server.i686 0:8.4.20-1.el6_5
Dependency Updated:
openssl.i686 0:1.0.1e-30.el6_5.2
openssl-devel.i686 0:1.0.1e-30.el6_5.2
postgresql.i686 0:8.4.20-1.el6_5
postgresql-devel.i686 0:8.4.20-1.el6_5
postgresql-libs.i686 0:8.4.20-1.el6_5
Complete!
Run setup to configure the installation environment:
[root@jifeng02 jifeng]# ambari-server setup
Using python /usr/bin/python2.6
Setup ambari-server
Checking SELinux...
SELinux status is 'enabled'
SELinux mode is 'enforcing'
Temporarily disabling SELinux
WARNING: SELinux is set to 'permissive' mode and temporarily disabled.
OK to continue [y/n] (y)? y
Customize user account for ambari-server daemon [y/n] (n)? y
Enter user account for ambari-server daemon (root):jifeng
Adjusting ambari-server permissions and ownership...