Oracle Exadata
Hardware and System Health Check
March 2018
Exadata Inspection Report
Customer:
Date: 2018.03.01
System Type: Exadata X2-2 1/2 rack
Rack Master Serial Number: 1051AK22E0
Document Information
Author:
Reviewer:
Approver:
Date written: 2018-03-01
Date reviewed:
Date approved:
Contents
1. Purpose
2. Scope of Inspection
3. Configuration Information
3.1. IP Address List
3.2. Rack Label Information
4. Inspection Items
4.1. InfiniBand Switches
4.2. Ethernet Switch
4.3. Storage Servers
4.4. Oracle Database Servers
4.5. PDU Check
4.5.1. PDU-A
4.5.2. PDU-B
5. Inspection Results and Recommendations
5.1. Inspection Results
5.2. Recommendations
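1. Purpose
The hardware health check service uses periodic inspections to establish how the hardware is running, to capture the platform's normal performance baseline, and to record configuration details and changes. Accumulated over time, this record makes it easier to anticipate changes in business demand and in hardware operating state, and to adjust accordingly so that the hardware platform stays in good working order. When an inspection finds an anomaly, the adjustment can be made ahead of time. Adjustments that require downtime can be scheduled in advance, which avoids sudden failures and reduces the impact on the business.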
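2. Scope of Inspection
The hardware health check covers the following:
System hardware
CPU, memory, and disks: check that the hardware in the system is operating normally.
Operating system
Check the utilization of system resources such as CPU, memory, and disk, and review the system logs.
If a problem is found during the inspection, the relevant information is recorded. Where an adjustment is needed, we discuss it with your engineers, agree on a plan and a suitable time for the change, and monitor the system afterwards. More complex problems are identified in the report, with a recommendation that the customer carry out further detailed checks and propose tests and a solution so that the problem can be resolved.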
3. Configuration Information
3.1. IP Address List
Hostname      Net0 IP           ILOM IP
dm01sw-ib1    10.224.150.153
dm01sw-kvm    10.224.150.151
dm01sw-ip     10.224.150.152
dm01sw-ib2    10.224.150.154
dm01db04      10.224.150.132    10.224.150.143
dm01db03      10.224.150.131    10.224.150.142
dm01db02      10.224.150.130    10.224.150.141
dm01db01      10.224.150.129    10.224.150.140
dm01cel07     10.224.150.139    10.224.150.150
dm01cel06     10.224.150.138    10.224.150.149
dm01cel05     10.224.150.137    10.224.150.148
dm01cel04     10.224.150.136    10.224.150.147
dm01cel03     10.224.150.135    10.224.150.146
dm01cel02     10.224.150.134    10.224.150.145
dm01cel01     10.224.150.133    10.224.150.144
dm01sw-ib3    10.224.150.155
PDU-A         10.224.150.156
PDU-B         10.224.150.157

RU (rack unit) columns: X2-2 Full, Half, Qtr; X2-8. Values as listed:
22 21 20 19 18 17 16 14 12 10 8 6 4 2 1
22
22 21 20 17 16 6 4 2
21 20 19 18 17 16 14 12 10 8 6 4 2 1
23 22 21
24-28 16-20 14 12 10 8 6 4 2 1
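An address list like the one above can be cross-checked mechanically for duplicates and for addresses that fall outside the management subnet. The following is a minimal sketch, not part of the inspection procedure itself. It covers only the database and storage servers, and it assumes that each host's two addresses appear in the order Net0, then ILOM, and that the management network is a /24 (the prefix length is an assumption; it is not stated in this report). The function name check_inventory is hypothetical.

```python
import ipaddress

# Net0 / ILOM pairs for the database and storage servers, taken from the list
# above. Assumption: the first address of each pair is Net0, the second ILOM.
SERVERS = {
    "dm01db04": ("10.224.150.132", "10.224.150.143"),
    "dm01db03": ("10.224.150.131", "10.224.150.142"),
    "dm01db02": ("10.224.150.130", "10.224.150.141"),
    "dm01db01": ("10.224.150.129", "10.224.150.140"),
    "dm01cel07": ("10.224.150.139", "10.224.150.150"),
    "dm01cel06": ("10.224.150.138", "10.224.150.149"),
    "dm01cel05": ("10.224.150.137", "10.224.150.148"),
    "dm01cel04": ("10.224.150.136", "10.224.150.147"),
    "dm01cel03": ("10.224.150.135", "10.224.150.146"),
    "dm01cel02": ("10.224.150.134", "10.224.150.145"),
    "dm01cel01": ("10.224.150.133", "10.224.150.144"),
}


def check_inventory(servers, prefix_len=24):
    """Return a list of problems: duplicate or out-of-subnet addresses.

    prefix_len is an assumed value; adjust it to the real netmask.
    """
    problems = []
    seen = {}       # address -> "host interface" that first used it
    network = None  # derived from the first address encountered
    for host, addrs in servers.items():
        for label, text in zip(("Net0", "ILOM"), addrs):
            addr = ipaddress.ip_address(text)
            if network is None:
                network = ipaddress.ip_network(f"{text}/{prefix_len}", strict=False)
            if addr not in network:
                problems.append(f"{host} {label} {addr} is outside {network}")
            if addr in seen:
                problems.append(f"{host} {label} {addr} duplicates {seen[addr]}")
            else:
                seen[addr] = f"{host} {label}"
    return problems


if __name__ == "__main__":
    for line in check_inventory(SERVERS) or ["no problems found"]:
        print(line)
```

The switch and PDU addresses can be added to the same dictionary if they should be checked as well.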
3.2. Rack Label Information
The cables between the various units within the rack are labelled by manufacturing. The cables are also colour-coded as follows:
Black – InfiniBand data.
Black – InfiniBand switch & storage cell node Ethernet management cables.
Red – ILOM Ethernet management cables.
Blue – Component Gigabit Ethernet management (eth0) cables on DB nodes.
Orange – Component KVM switch to dongle cables (not on X2-8).
Black – AC power jumper cables.
Some examples are given here (Exadata X2):
U21 P19 (local / where it is): Rack Unit 21 Port 19.
U8 Video (remote / where it's going to): Rack Unit 8 video dongle.
At an InfiniBand switch (connection to second switch) (black cable):
R1 U20 P8A (local): Rack Unit 20 Port 8A on the switch.
R1 U24 P8A (remote): Rack Unit 24 Port 8A on the switch.
At an InfiniBand switch to a PCI card on a server (black cable):
R1 U20 P15A (local): Rack Unit 20 Port 15A on the switch.
R1 U12 PCIE3-1 (remote): Rack Unit 12 PCIE card in slot 3, port #1.
At a server's ILOM to Ethernet switch (red cable):
R1 U8 ILOM (local): Rack Unit 8 ILOM NET MGT port.
R1 U23 P38 (remote): Rack Unit 23 Port 38 on the Ethernet switch.
At a server's power cable / PDU2 (black cable):
U19 PS0 (local): Rack Unit 19 Power Supply 0.
PDU A G2-3 (remote): Group 2 Output 3 on PDU A (left side, viewed from rear).
For data cables, the label at the opposite end of the cable has local and remote exchanged; for power cables, the labels at each end are identical.
For the Exadata X2-2:
The X2-8 does not have a KVM.
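When cable labels are recorded during an inspection, the labelling scheme described above can be decoded with a small parser. The sketch below is illustrative only: the names CableLabel, parse_label and opposite_end are hypothetical, and the pattern covers just the rack-unit forms shown in the examples (an optional R<rack>, then U<unit>, then a port or device name). PDU-side labels in Group/Output form such as G2-3 are not handled.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CableLabel(NamedTuple):
    rack: Optional[int]  # None when the label has no R<n> prefix
    unit: int            # rack unit number
    endpoint: str        # port or device name, e.g. "P8A", "ILOM", "PCIE3-1"


# Optional "R<rack>", then "U<unit>", then the remainder as the endpoint.
_LABEL = re.compile(r"^(?:R(?P<rack>\d+)\s+)?U(?P<unit>\d+)\s+(?P<endpoint>\S.*)$")


def parse_label(text: str) -> CableLabel:
    """Split a label such as 'R1 U20 P8A' into rack, unit and endpoint."""
    match = _LABEL.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised label: {text!r}")
    rack = int(match.group("rack")) if match.group("rack") else None
    return CableLabel(rack, int(match.group("unit")), match.group("endpoint"))


def opposite_end(local: str, remote: str, is_power: bool = False) -> Tuple[str, str]:
    """Return the (local, remote) pair expected at the other end of the cable.

    Data cables have local and remote exchanged; power cables carry
    identical labels at both ends.
    """
    return (local, remote) if is_power else (remote, local)


if __name__ == "__main__":
    print(parse_label("R1 U20 P8A"))
    print(opposite_end("R1 U20 P8A", "R1 U24 P8A"))
```

Keeping the rack field optional lets the same function accept both the short form (U21 P19) and the rack-qualified form (R1 U20 P8A) that appear in the examples.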
4. Inspection Items
4.1. InfiniBand Switches
CHECK THE SUN DATACENTER 36-PORT MANAGED QDR INFINIBAND SWITCHES
IB switch #1 is only relevant to half & full-rack installations; ignore it for the smaller configurations.
Switch 1 / 2 / 3: √ √ √

Task: Log in as root. The switch OS is Linux-based.
Comment: localhost: root, password: welcome1

Task: Examine the firmware version:
[root@dm01sw-ib1 ~]# version
SUN DCS 36p version: 1.3.3-2
Build time: Apr 4 2011 11:15:19
SP board info:
Manufacturing Date: 2010.07.27
Serial Number: "NCD570331"
Hardware Revision: 0x0005
Comment: <== Expected firmware: 1.3.3-2. The original systems were released with firmware version 1.3.3-2. X2-2 systems were released with firmware version 1.3.3-2. The current version is 1.3.3-2.