Configuring Your Systems and Installing Greenplum
Perform the following tasks in order:
1. Make sure your systems meet the System Requirements
2. Setting the Greenplum Recommended OS Parameters
3. (master only) Creating the Greenplum Database Administrative User Account
4. (master only) Installing the Greenplum Database Software
5. Installing and Configuring Greenplum on all Hosts
6. Creating the Data Storage Areas
7. Synchronizing System Clocks
Unless noted, these tasks should be performed for all hosts in your Greenplum Database array (master,
standby master and segments).
1. System Requirements
The following table lists the minimum recommended requirements for a production environment.
Table 1: System Prerequisites for Greenplum Database 5.0
Operating System:
Note: See the Greenplum Database Release Notes for current supported platform information.
File Systems:
• xfs required for data storage on SUSE Linux and Red Hat (ext3 supported for root file system)
Minimum CPU:
Pentium Pro compatible (P3/Athlon and above)
Minimum Memory:
16 GB RAM per server
Disk Requirements:
• 150MB per host for Greenplum installation
• Approximately 300MB per segment instance for metadata
• Appropriate free space for data with disks at no more than 70% capacity
• High-speed, local storage
Network Requirements:
• 10 Gigabit Ethernet within the array
• Dedicated, non-blocking switch
• NIC bonding is recommended when multiple interfaces are present
Software and Utilities:
• zlib compression libraries
• bash shell
• GNU tar
• GNU zip
• GNU sed (used by Greenplum Database gpinitsystem)
• perl
• secure shell
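The 70% disk-capacity guideline in the table can be checked with a short script. The following is a minimal sketch, not part of Greenplum; the function names and the /data path are hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# Sketch: compare file system usage with the 70% capacity guideline.
# Function names and the example path are illustrative assumptions.

# Print the used percentage (digits only) of the file system holding $1.
disk_used_pct() {
    # df -P prints one line per file system; column 5 is the capacity, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when the given percentage is within the limit (default 70).
within_capacity() {
    local used="$1" limit="${2:-70}"
    [ "$used" -le "$limit" ]
}

# Example usage (hypothetical data directory):
# within_capacity "$(disk_used_pct /data)" || echo "/data is above 70% capacity"
```

Running the check on every segment host before loading data gives an early warning that a data directory is close to the limit.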
Important: SSL is supported only on the Greenplum Database master host system. It is not
supported on the segment host systems.
Important: On all Greenplum Database host systems, SELinux must be disabled. You should also
disable firewall software such as iptables (on systems such as RHEL 6.x and CentOS 6.x) or firewalld
(on systems such as RHEL 7.x and CentOS 7.x).
(1) This command checks the status of SELinux when run as root:
# sestatus
SELinux status: disabled
You can disable SELinux by editing the /etc/selinux/config file. As root, change the value
of the SELINUX parameter in the config file and reboot the system:
SELINUX=disabled
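The edit to /etc/selinux/config can also be scripted. The following is a minimal sketch that assumes GNU sed (listed in the prerequisites above); the function name is hypothetical, and the backup copy is an added precaution rather than a documented requirement.

```shell
#!/bin/bash
# Sketch: set SELINUX=disabled in a config file, keeping a backup copy.
# The function name is an illustrative assumption.

disable_selinux_in_config() {
    local cfg="${1:-/etc/selinux/config}"
    # Keep the original file as <file>.bak before editing in place.
    cp -p "$cfg" "$cfg.bak" &&
    # Only the SELINUX= line is rewritten; SELINUXTYPE= does not match the pattern.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}

# Example usage, as root, followed by a reboot:
# disable_selinux_in_config
```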
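(2) This command checks the status of iptables when run as root (CentOS 6.x):
# /sbin/chkconfig --list iptables
These commands disable and stop iptables when run as root; the last command checks the status again:
# chkconfig iptables off
# service iptables stop
# chkconfig --list iptables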
This is the output if iptables is disabled.
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off
(3) This command checks the status of firewalld when run as root (CentOS 7.x):
# systemctl status firewalld
This is the output if firewalld is disabled.
firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service;
disabled; vendor preset: enabled)
Active: inactive (dead)
These commands disable firewalld when run as root:
# systemctl stop firewalld.service
# systemctl disable firewalld.service
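On CentOS 6.x, the chkconfig output shown in (2) can be checked in a script instead of by eye. The following is a minimal sketch; the function name is hypothetical, and it only inspects the text passed to it.

```shell
#!/bin/bash
# Sketch: decide from one "chkconfig --list" line whether a service is off
# in every runlevel. The function name is an illustrative assumption.

all_runlevels_off() {
    case "$1" in
        *:on*) return 1 ;;   # at least one runlevel is still "on"
        *)     return 0 ;;   # no runlevel is "on" (also true for empty input)
    esac
}

# Example usage, as root:
# all_runlevels_off "$(/sbin/chkconfig --list iptables)" || echo "iptables is still enabled"
```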
2. Setting the Greenplum Recommended OS Parameters
Greenplum requires that certain Linux operating system (OS) parameters be set on all hosts in your
Greenplum Database system (masters and segments).
In general, the following categories of system parameters need to be altered:
• Shared Memory - A Greenplum Database instance will not work unless the shared memory segment
for your kernel is properly sized. Most default OS installations have the shared memory values set
too low for Greenplum Database. On Linux systems, you must also disable the OOM (out of memory)
killer. For information about Greenplum Database shared memory requirements, see the Greenplum
Database server configuration parameter shared_buffers in the Greenplum Database Reference
Guide.
• Network - On high-volume Greenplum Database systems, certain network-related tuning parameters
must be set to optimize network connections made by the Greenplum interconnect.
• User Limits - User limits control the resources available to processes started by a user's shell.
Greenplum Database requires a higher limit on the allowed number of file descriptors that a single
process can have open. The default settings may cause some Greenplum Database queries to fail
because they will run out of file descriptors needed to process the query.
(1) Linux System Settings
• Edit the /etc/hosts file and make sure that it includes the host names and all interface address
names for every machine participating in your Greenplum Database system.
• Set the following parameters in the /etc/sysctl.conf file and reboot:
Note: Red marks entries to add, blue marks entries left unchanged, and green marks entries to modify. A value in parentheses is the existing default and is not part of the setting.
kernel.shmmax = 500000000 (default 68719476736)
kernel.shmmni = 4096
kernel.shmall = 4000000000 (default 4294967296)
kernel.sem = 500 1024000 200 4096
kernel.sysrq = 1 (default 0)
kernel.core_uses_pid = 1 (default 1)
kernel.msgmnb = 65536 (default 65536)
kernel.msgmax = 65536 (default 65536)
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1 (default 1)
net.ipv4.conf.default.accept_source_route = 0 (default 0)
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 10000 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
• Set the following parameters in the /etc/security/limits.conf file:
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
Note: For Red Hat Enterprise Linux (RHEL) 6.x and CentOS 6.x, parameter values in the
/etc/security/limits.d/90-nproc.conf file override the values in the limits.conf file. If a parameter value
is set in both conf files, ensure that the parameter is set properly in the 90-nproc.conf file. The Linux
module pam_limits sets user limits by reading the values from the limits.conf file and then from the
90-nproc.conf file.
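After editing /etc/sysctl.conf and /etc/security/limits.conf, it can help to confirm that every recommended line is actually present. The following is a minimal sketch; the function names and the idea of keeping the recommended lines in a separate "expected" file are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: list recommended settings that are missing from a configuration file.
# Function names and the "expected" file are illustrative assumptions.

# Normalize spacing so "a=1", "a = 1" and "a  =  1" compare equal,
# then drop comment lines and blank lines.
normalize_settings() {
    sed -e 's/[[:space:]]*=[[:space:]]*/ = /' \
        -e 's/[[:space:]][[:space:]]*/ /g' \
        -e 's/^ //' -e 's/ $//' \
        -e '/^#/d' -e '/^$/d' "$1"
}

# Print each line of the expected file ($2) that is absent from the target file ($1).
missing_settings() {
    local tmp
    tmp="$(mktemp)" || return 1
    normalize_settings "$1" > "$tmp"
    # -F fixed strings, -x whole-line match, -v keep expected lines with no match.
    normalize_settings "$2" | grep -Fxv -f "$tmp"
    rm -f "$tmp"
    return 0
}

# Example usage (expected.sysctl is a hypothetical file holding the recommended lines):
# missing_settings /etc/sysctl.conf expected.sysctl
```

The comparison is line-based only. It does not model the override order between limits.conf and 90-nproc.conf described in the note above, so check 90-nproc.conf separately on RHEL 6.x and CentOS 6.x.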
3. Creating the Greenplum Database Administrative User Account
You must create a dedicated operating system user account on the master node to run Greenplum
Database. You administer Greenplum Database as this operating system user. This user account is
named, by convention, gpadmin.
You cannot run the Greenplum Database server as root.
The gpadmin user account must have permission to access the services and directories required to
install and run Greenplum Database.
To create the gpadmin operating system user account on the Greenplum Database master node and set
the account password, run the groupadd, useradd, and passwd commands as the root user.
For example:
# groupadd gpadmin
# useradd gpadmin -g gpadmin
# passwd gpadmin
New password:
Retype new password:
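If the setup may be run more than once, the account creation can be guarded so that it does not fail when gpadmin already exists. The following is a minimal sketch using the same groupadd and useradd commands; the function names are hypothetical.

```shell
#!/bin/bash
# Sketch: create the gpadmin group and user only when they are missing.
# Function names are illustrative assumptions; run ensure_gpadmin as root.

# Succeed if the named operating system user exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

ensure_gpadmin() {
    # getent succeeds when the group is already defined.
    getent group gpadmin >/dev/null || groupadd gpadmin
    user_exists gpadmin || useradd -g gpadmin gpadmin
}

# Example usage, as root:
# ensure_gpadmin && passwd gpadmin
```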
4. Installing the Greenplum Database Software
(1) Log in as root on the machine that will become the Greenplum Database master host.
If you do not have root access on the master host machine, run the binary installer as the gpadmin
user and install the software into a directory in which you have write permission.
(2) Download or copy the Binary Installation distribution file to the master host machine. The Binary
Installer distribution filename has the form greenplum-db-, followed by the version, a hyphen, the
platform, and the .zip extension, where the platform is similar to RHEL7-x86_64 (Red Hat 64-bit) or
SuSE12-x86_64 (SUSE Linux 64-bit).
(3) Unzip the installer file:
# unzip greenplum-db-5.9.0-rhel6-x86_64.zip
(4) Launch the installer using bash:
# /bin/bash greenplum-db-5.9.0-rhel6-x86_64.bin
(5) The installer prompts you to accept the Greenplum Database license agreement. Type yes to accept
the license agreement.
(6) The installer prompts you to provide an installation path. Press ENTER to accept the default install
path (/usr/local/greenplum-db-), or enter an absolute path to a custom install location.
You must have write permission to the location you specify.
(7) The installer installs the Greenplum Database software and creates a greenplum-db symbolic link
one directory level above the version-specific installation directory. The symbolic link is used to
facilitate patch maintenance and upgrades between versions. The installed location is referred to as
$GPHOME.
(8) If you installed as root, change the ownership and group of the installed files to gpadmin:
# chown -R gpadmin /usr/local/greenplum*
# chgrp -R gpadmin /usr/local/greenplum*
(9) To perform additional required system configuration tasks and to install Greenplum Database on
other hosts, go to the next task Installing and Configuring Greenplum on all Hosts.
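The greenplum-db symbolic link from step (7) is what lets $GPHOME stay the same across versions. The installer creates the link itself; the following minimal sketch only illustrates the idea of repointing such a link, and the function name and the second directory name in the usage line are hypothetical.

```shell
#!/bin/bash
# Sketch: repoint a greenplum-db symbolic link at another version directory.
# The function name is an illustrative assumption.

# $1 = parent directory (for example /usr/local)
# $2 = name of a version-specific directory inside $1 (stored as a relative target)
switch_gphome_link() {
    # -s: symbolic link, -f: replace an existing link,
    # -n: do not follow an existing link that points to a directory
    ln -sfn "$2" "$1/greenplum-db"
}

# Example usage, as root (the target directory name is hypothetical):
# switch_gphome_link /usr/local greenplum-db-next
```

Because scripts and environment files refer to the link rather than to the version-specific directory, they keep working after the link is moved.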
About Your Greenplum Database Installation (after a successful installation, the installation directory contains the following)
• greenplum_path.sh — This file contains the environment variables for Greenplum Database. See
Setting Greenplum Environment Variables.
• bin — This directory contains the Greenplum Database management utilities. This directory also
contains the PostgreSQL client and server programs, most of which are also used in Greenplum
Database.
• docs/cli_help — This directory contains help files for Greenplum Database command-line utilities.
• docs/cli_help/gpconfigs — This directory contains sample gpinitsystem configuration files and
host files that can be modified and used when installing and initializing a Greenplum Database system.
• docs/javadoc — This directory contains javadocs for the gNet extension (gphdfs protocol). The jar
files for the gNet extension are installed in the $GPHOME/lib/hadoop directory.
• etc — Sample configuration file for OpenSSL and a sample configuration file to be used with the
gpcheck management utility.
• ext — Bundled programs (such as Python) used by some Greenplum Database utilities.
• include — The C header files for Greenplum Database.
• lib — Greenplum Database and PostgreSQL library files.
• sbin — Supporting/Internal scripts and programs.
• share — Shared files for Greenplum Database.
5. Installing and Configuring Greenplum on all Hosts
When run as root, gpseginstall copies the Greenplum Database installation from the current host
and installs it on a list of specified hosts, creates the Greenplum operating system user account
(typically named gpadmin), sets the account password (default is changeme), sets the ownership of the
Greenplum Database installation directory, and exchanges ssh keys between all specified host address
names (both as root and as the specified user account).
Note: If you are setting up a single node system, you can still use gpseginstall to perform
the required system configuration tasks on the current host. In this case, the hostfile_exkeys
should have only the current host name.
To install and configure Greenplum Database on all specified hosts
(1) Log in to the master host as root:
$ su -
(2) Source the path file from your master host's Greenplum Database installation directory:
# source /usr/local/greenplum-db/greenplum_path.sh
(3) Create a file called hostfile_exkeys that has the machine configured host names and host
addresses (interface names) for each host in your Greenplum system (master, standby master and
segments). Make sure there are no blank lines or extra spaces. For example, if you have a master,
standby master and two segments, your file would look something like this:
mdw
sdw1
sdw2
smdw
(4) Run the gpseginstall utility referencing the hostfile_exkeys file you just created. This example
runs the utility as root. The utility creates the Greenplum operating system user account gpadmin as
a system account on all hosts and, because of the -p option, sets the password for that user to 123qwe
(instead of the default changeme) on all segment hosts.
# gpseginstall -f hostfile_exkeys -p 123qwe
Use the -u and -p options to specify a different operating system account name and password. See
gpseginstall for option information and running the utility as a non-root user.
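Step (3) requires that hostfile_exkeys contain no blank lines or extra spaces. The following is a minimal sketch of a check that can be run before gpseginstall; the function name is hypothetical.

```shell
#!/bin/bash
# Sketch: verify that a host file has no blank lines and no stray whitespace.
# The function name is an illustrative assumption.

hostfile_is_clean() {
    # The file must exist and be non-empty ...
    [ -s "$1" ] &&
    # ... and no line may be empty or contain a space, tab, or carriage return.
    ! grep -q -e '^$' -e '[[:space:]]' "$1"
}

# Example usage:
# hostfile_is_clean hostfile_exkeys || echo "fix hostfile_exkeys before running gpseginstall"
```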
Confirming Your Installation
(1) Log in to the master host as gpadmin:
$ su - gpadmin
(2) Source the path file from the Greenplum Database installation directory:
$ source /usr/local/greenplum-db/greenplum_path.sh