1.1 Configure hostnames
- Edit the /etc/hosts file on each of the four nodes (linux001, linux002, linux003, linux004); the contents must be identical on every node.
[root@linux001 ~]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       linux001 localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
172.23.176.103  linux001
172.23.176.101  linux002
172.23.176.104  linux003
172.23.176.102  linux004
- The lines 172.23.176.103 linux001, 172.23.176.101 linux002, 172.23.176.104 linux003, and 172.23.176.102 linux004 are the entries to add; perform the same edit on every server. When you are done editing, press Esc, then Shift+:, type wq!, and press Enter to save and exit. A verification sketch follows below.
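- A quick way to verify the entries is to resolve and ping each peer from every node (a minimal check; the addresses assume the values above):
[root@linux001 ~]# getent hosts linux002 linux003 linux004    (each name should resolve to its 172.23.176.x address)
[root@linux001 ~]# ping -c 1 linux002
[root@linux001 ~]# ping -c 1 linux003
[root@linux001 ~]# ping -c 1 linux004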
1.2 Configure SSH mutual trust
- Take linux001 and linux002 as an example. First configure passwordless login from linux001 to linux002. On linux001, run:
[root@linux001 ~]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
[root@linux001 ~]# cd /root/.ssh
[root@linux001 ~]# mv id_dsa.pub linux001.pub
[root@linux001 ~]# scp linux001.pub root@linux002:/root/.ssh
- On linux002, run:
[root@linux002 ~]# cd /root/.ssh
[root@linux002 ~]# cat linux001.pub >> authorized_keys
[root@linux002 ~]# chmod 600 authorized_keys
[root@linux002 ~]# cd ..
[root@linux002 ~]# chmod 700 .ssh
- Then configure passwordless login from linux002 to linux001. On linux002, run:
[root@linux002 ~]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
[root@linux002 ~]# cd /root/.ssh
[root@linux002 ~]# mv id_dsa.pub linux002.pub
[root@linux002 ~]# scp linux002.pub root@linux001:/root/.ssh
- On linux001, run:
[root@linux001 ~]# cd /root/.ssh
[root@linux001 ~]# cat linux002.pub >> authorized_keys
[root@linux001 ~]# chmod 600 authorized_keys
[root@linux001 ~]# cd ..
[root@linux001 ~]# chmod 700 .ssh
- Configure the remaining pairs the same way: linux001 and linux003, linux001 and linux004, linux002 and linux003, linux002 and linux004, linux003 and linux004 (12 configurations in total, one per direction). A verification sketch follows below.
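- After each pair has been configured, a simple check is to run a remote command in both directions; if no password prompt appears, the trust is in place (a sketch run from linux001; the very first connection may still ask you to confirm the host key fingerprint):
[root@linux001 ~]# ssh linux002 hostname    (should print linux002 without asking for a password)
[root@linux001 ~]# ssh linux003 hostname
[root@linux001 ~]# ssh linux004 hostname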
2 Multipath configuration
2.1 Required packages
- Required packages:
device-mapper-1.02.13-6.9.i586.rpm
This package works at the lower layer and handles device virtualization and mapping.
multipath-tools-0.4.7-34.18.i586.rpm
The multipath management and monitoring tools, mainly responsible for detecting path status and managing the paths.
- Check whether the packages are installed; install them if they are not (on all 4 machines):
[root@linux001 ~]# rpm -qa | grep device-mapper    (checks whether device-mapper is installed; if nothing is printed, install it with zypper install <package>)
[root@linux001 ~]# rpm -qa | grep multipath-tools  (checks whether multipath-tools is installed; if nothing is printed, install it with zypper install <package>)
- Installation commands:
[root@linux001 ~]# zypper install device-mapper
[root@linux001 ~]# zypper install multipath-tools
- If the multipath modules did not load successfully, use the following commands to initialize and start DM for the first time, or reboot the system:
[root@linux001 ~]# modprobe dm-multipath
[root@linux001 ~]# modprobe dm-round-robin
[root@linux001 ~]# service multipathd start
[root@linux001 ~]# multipath -v2
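- To confirm the device-mapper multipath modules are loaded and the daemon is running, something like the following can be used (a minimal check):
[root@linux001 ~]# lsmod | grep dm_multipath      (the module should be listed)
[root@linux001 ~]# lsmod | grep dm_round_robin
[root@linux001 ~]# service multipathd status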
2.2 Check WWIDs and path status
linux001:/ # multipath -ll        (on any one node)
mpath1 (36000d31000eea500000000000000000008) dm-11 COMPELNT,Compellent Vol
[size=8.0T][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:5:6 sdh 8:112  [active][undef]
 \_ 4:0:7:6 sdn 8:208  [active][undef]
 \_ 4:0:7:6 sdw 65:96  [active][undef]
 \_ 4:0:7:6 sdz 65:144 [active][undef]

mpath2 (36000d31000eea500000000000000000003) dm-6 COMPELNT,Compellent Vol
[size=2.0T][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:4:1 sdc 8:32   [active][undef]
 \_ 4:0:6:1 sdi 8:128  [active][undef]
 \_ 4:0:4:1 sdo 8:224  [active][undef]
 \_ 4:0:5:1 sdr 65:16  [active][undef]

mpath3 (36000d31000eea500000000000000000007) dm-8 COMPELNT,Compellent Vol
[size=500G][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:4:5 sde 8:64   [active][undef]
 \_ 4:0:6:5 sdk 8:160  [active][undef]
 \_ 4:0:4:5 sdq 65:0   [active][undef]
 \_ 4:0:5:5 sdt 65:48  [active][undef]
- To give the multipath pseudo-devices (mpath1, mpath2, mpath3) more meaningful names, we edit the multipath configuration file. For that, note down the WWID of each pseudo-device, e.g. 36000d31000eea500000000000000000008 shown in parentheses after mpath1. Also note that the path state shown is active undef, which means the paths have not been activated yet. A sketch for pulling out the WWIDs follows below.
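- A quick way to list just the WWID lines from the output (a sketch; grepping on the vendor string COMPELNT assumes the output format shown above):
linux001:/ # multipath -ll | grep COMPELNT
mpath1 (36000d31000eea500000000000000000008) dm-11 COMPELNT,Compellent Vol
mpath2 (36000d31000eea500000000000000000003) dm-6  COMPELNT,Compellent Vol
mpath3 (36000d31000eea500000000000000000007) dm-8  COMPELNT,Compellent Vol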
2.3 Create and edit the configuration file
- Create a multipath.conf configuration file; it is not created automatically at installation time, but a template is provided. Create it with: cp /usr/share/doc/packages/multipath-tools/multipath.conf.synthetic /etc/multipath.conf (this copies the bundled template multipath.conf.synthetic to /etc/multipath.conf, producing the new configuration file).
- Edit the /etc/multipath.conf file:
vi /etc/multipath.conf
blacklist {
        devnode "^sda"
        devnode "^sdb"
}                        (add a blacklist to exclude the local disks sda and sdb)
defaults {
        user_friendly_names yes
}                        (set user_friendly_names to yes)
multipaths {
        multipath {
                wwid 36000d31000eea500000000000000000008    (WWID of mpath1)
                alias hana-8.0T
                path_grouping_policy multibus
                path_selector "round-robin 0"
                failback manual
                rr_weight priorities
                no_path_retry 5
                rr_min_io 100
        }
        multipath {
                wwid 36000d31000eea500000000000000000003    (WWID of mpath2)
                alias hana-2.0T
        }
        multipath {
                wwid 36000d31000eea500000000000000000007    (WWID of mpath3)
                alias hana-500G
        }
}
- The edited multipath.conf can then be copied to the other three servers with scp:
[root@linux001 ~]# scp /etc/multipath.conf root@linux002:/etc/multipath.conf
[root@linux001 ~]# scp /etc/multipath.conf root@linux003:/etc/multipath.conf
[root@linux001 ~]# scp /etc/multipath.conf root@linux004:/etc/multipath.conf
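- To make sure all four nodes ended up with the same configuration file, comparing checksums is a simple sanity check (a sketch):
[root@linux001 ~]# md5sum /etc/multipath.conf
[root@linux001 ~]# ssh linux002 md5sum /etc/multipath.conf    (all four checksums should match)
[root@linux001 ~]# ssh linux003 md5sum /etc/multipath.conf
[root@linux001 ~]# ssh linux004 md5sum /etc/multipath.conf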
2.4 Restart the multipathd service
- Restart the service (on every node):
[root@linux001 ~]# /etc/init.d/multipathd stop
Stopping multipathd daemon:                     [ OK ]
[root@linux001 ~]# /etc/init.d/multipathd start
Starting multipathd daemon:                     [ OK ]    (OK means the service started normally)
- Enable the service at boot for runlevels 2, 3, and 5 (on every node):
[root@linux001 ~]# chkconfig --level 235 multipathd on
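- To confirm the service is registered for the intended runlevels, chkconfig can list its settings (a sketch):
[root@linux001 ~]# chkconfig --list multipathd
multipathd    0:off  1:off  2:on  3:on  4:off  5:on  6:off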
2.5 Check multipath status
linux001:/ # multipath -ll        (run on every node)
mpath1 (36000d31000eea500000000000000000008) dm-11 COMPELNT,Compellent Vol
[size=8.0T][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:5:6 sdh 8:112  active ready running
 \_ 4:0:7:6 sdn 8:208  active ready running
 \_ 4:0:7:6 sdw 65:96  active ready running
 \_ 4:0:7:6 sdz 65:144 active ready running

mpath2 (36000d31000eea500000000000000000003) dm-6 COMPELNT,Compellent Vol
[size=2.0T][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:4:1 sdc 8:32   active ready running
 \_ 4:0:6:1 sdi 8:128  active ready running
 \_ 4:0:4:1 sdo 8:224  active ready running
 \_ 4:0:5:1 sdr 65:16  active ready running

mpath3 (36000d31000eea500000000000000000007) dm-8 COMPELNT,Compellent Vol
[size=500G][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:4:5 sde 8:64   active ready running
 \_ 4:0:6:5 sdk 8:160  active ready running
 \_ 4:0:4:5 sdq 65:0   active ready running
 \_ 4:0:5:5 sdt 65:48  active ready running
- When the paths show active ready running as above, the multipath configuration has been applied successfully.
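- With the aliases from multipath.conf in effect, the friendly device names should also appear under /dev/mapper; checking for them before formatting is a cheap sanity check (a sketch):
linux001:/ # ls /dev/mapper/ | grep hana
hana-2.0T
hana-500G
hana-8.0T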
3 OCFS2 configuration
3.1 Required packages
ocfs2-tools-1.6.4-0.3.5
ocfs2-kmp-default-1.6_3.0.13_0.27-0.5.84
ocfs2-tools-o2cb-1.6.4-0.3.5
ocfs2console-1.6.4-0.3.5
- Check whether the packages are installed; install them if they are not (on all 4 machines):
[root@linux001 ~]# rpm -qa | grep ocfs2-tools
[root@linux001 ~]# rpm -qa | grep ocfs2-kmp-default
[root@linux001 ~]# rpm -qa | grep ocfs2-tools-o2cb
[root@linux001 ~]# rpm -qa | grep ocfs2console
- Installation commands:
[root@linux001 ~]# zypper install ocfs2-tools
[root@linux001 ~]# zypper install ocfs2-tools-o2cb
[root@linux001 ~]# zypper install ocfs2console
[root@linux001 ~]# zypper install ocfs2-kmp-default
3.2 Configure the cluster nodes
- Configure the nodes (on any one node):
ocfs2console --> Cluster --> Node Configuration --> add every node; the Name is the hostname, and the IP field takes the private-network IP.
- Propagate the configuration (on any one node):
ocfs2console --> Cluster --> Propagate Cluster Configuration pushes the configuration on linux001 to linux002, linux003, and linux004. The resulting /etc/ocfs2/cluster.conf looks like this:
[root@linux001 ~]# cat /etc/ocfs2/cluster.conf
node:
        ip_port = 7777
        ip_address = 172.23.176.103
        number = 0
        name = linux001
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.23.176.101
        number = 1
        name = linux002
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.23.176.104
        number = 2
        name = linux003
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.23.176.102
        number = 3
        name = linux004
        cluster = ocfs2

cluster:
        node_count = 4
        name = ocfs2
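- If the ocfs2console GUI is not available, the same result can be achieved by writing /etc/ocfs2/cluster.conf by hand on one node (with the contents shown above) and copying it to the other nodes, for example (a sketch):
[root@linux001 ~]# ssh linux002 mkdir -p /etc/ocfs2    (make sure the target directory exists; repeat for linux003 and linux004 if needed)
[root@linux001 ~]# scp /etc/ocfs2/cluster.conf root@linux002:/etc/ocfs2/cluster.conf
[root@linux001 ~]# scp /etc/ocfs2/cluster.conf root@linux003:/etc/ocfs2/cluster.conf
[root@linux001 ~]# scp /etc/ocfs2/cluster.conf root@linux004:/etc/ocfs2/cluster.conf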
3.3 Configure o2cb
- Run on every node:
[root@linux001 ~]# /etc/init.d/o2cb configure
Configuring the O2CB driver.
This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
<ENTER> without typing an answer will keep that current value. Ctrl-C
will abort.
Load O2CB driver on boot (y/n) [n]: y
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [31]:
Specify network idle timeout in ms (>=5000) [30000]:
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading filesystem "ocfs2_dlmfs": OK
Creating directory '/dlm': OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Checking O2CB cluster configuration : Failed
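- The final line "Checking O2CB cluster configuration : Failed" is typically seen when /etc/ocfs2/cluster.conf is not yet in place on that node or the cluster has not been brought online. Once the cluster configuration has been propagated, the cluster can be brought online and checked again (a sketch, assuming the cluster name ocfs2 used above):
[root@linux001 ~]# /etc/init.d/o2cb online ocfs2
[root@linux001 ~]# /etc/init.d/o2cb status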
3.4 Partition the filesystems
[root@linux001 ~]# fdisk /dev/mapper/hana-500G
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-261, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-261, default 261):
Using default value 261

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
- Partition the other devices in the same way. (In practice you can also format the whole device without partitioning it; that is what was done in this deployment, so the partitioning step was skipped.)
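- For reference, if a multipath device is partitioned, the partition usually has to be mapped explicitly before it appears under /dev/mapper; kpartx (part of the multipath tooling) can do this. A hedged sketch, not needed in this deployment because whole devices were formatted:
[root@linux001 ~]# kpartx -a /dev/mapper/hana-500G     (creates a partition mapping such as hana-500G_part1; the exact name can vary)
[root@linux001 ~]# ls /dev/mapper/ | grep hana-500G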
3.5 Format the filesystems
- Format the filesystems (on any one node):
[root@linux001 ~]# mkfs.ocfs2 /dev/mapper/hana-8.0T
[root@linux001 ~]# mkfs.ocfs2 /dev/mapper/hana-2.0T
[root@linux001 ~]# mkfs.ocfs2 /dev/mapper/hana-500G
- Create the data directories:
[root@linux001 ~]# mkdir /saphana
[root@linux001 ~]# mkdir -p /saphana/data
[root@linux001 ~]# mkdir -p /saphana/log
[root@linux001 ~]# mkdir -p /saphana/shared
- Mount the directories (these mounts do not survive a reboot):
[root@linux001 ~]# mount -t ocfs2 -o nointr /dev/mapper/hana-8.0T /saphana/data
[root@linux001 ~]# mount -t ocfs2 -o nointr /dev/mapper/hana-2.0T /saphana/log
[root@linux001 ~]# mount -t ocfs2 -o nointr /dev/mapper/hana-500G /saphana/shared
- Check the mounts (on every node):
[root@linux001 ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/system-root   30G  4.8G   24G  17% /
devtmpfs                 253G  320K  253G   1% /dev
tmpfs                    380G   88K  380G   1% /dev/shm
/dev/sda1                152M   36M  108M  25% /boot
/dev/dm-11               8.0T   13G  8.0T   1% /saphana/data
/dev/dm-6                2.0T   13G  2.0T   1% /saphana/log
/dev/dm-8                500G  2.9G  298G   1% /saphana/shared
- Add the mounts to /etc/fstab so they are not lost when the server reboots (on every node). Edit the file with vi /etc/fstab and append the following lines:
/dev/mapper/hana-8.0T  /saphana/data    ocfs2  _netdev  0 0
/dev/mapper/hana-2.0T  /saphana/log     ocfs2  _netdev  0 0
/dev/mapper/hana-500G  /saphana/shared  ocfs2  _netdev  0 0
- After saving, reboot the server; if df -h shows the volumes mounted as below, the configuration is successful:
[root@linux001 ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/system-root   30G  4.8G   24G  17% /
devtmpfs                 253G  320K  253G   1% /dev
tmpfs                    380G   88K  380G   1% /dev/shm
/dev/sda1                152M   36M  108M  25% /boot
/dev/dm-11               8.0T   13G  8.0T   1% /saphana/data
/dev/dm-6                2.0T   13G  2.0T   1% /saphana/log
/dev/dm-8                500G  2.9G  298G   1% /saphana/shared
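- Before relying on a full reboot, the /etc/fstab entries can be exercised directly: unmount the volumes and run mount -a, which tries to mount everything listed in fstab (a quick sanity check, sketch):
[root@linux001 ~]# umount /saphana/data /saphana/log /saphana/shared
[root@linux001 ~]# mount -a                      (should remount all three OCFS2 volumes from /etc/fstab)
[root@linux001 ~]# df -h | grep saphana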
3.6 Restart the related services
- Restart the services (on every node):
[root@linux001 ~]# service ocfs2 restart
[root@linux001 ~]# service o2cb restart
Unmounting ocfs2_dlmfs filesystem: OK
Unloading module "ocfs2_dlmfs": OK
Unmounting configfs filesystem: OK
Unloading module "configfs": OK
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading filesystem "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK
- Enable the services at boot (on every node):
[root@linux001 ~]# chkconfig --level 235 o2cb on
[root@linux001 ~]# chkconfig --level 235 ocfs2 on
3.7 Check service status
- Check the o2cb service (on every node):
[root@linux001 ~]# service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
- Check the ocfs2 service (on every node):
[root@linux001 ~]# service ocfs2 status
Configured OCFS2 mountpoints: /saphana/data /saphana/log /saphana/shared
Active OCFS2 mountpoints: /saphana/data /saphana/log /saphana/shared
4 Testing
4.1 Create a file and test read/write
- The test is simple: on any one server, for example linux001, create a file named linux under /saphana/data and write some content into it. If the other servers can also see the linux file under /saphana/data and can read and write it, the setup works. A minimal sketch follows.
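- A minimal command-line version of this test might look as follows (a sketch; the file name linux is just an example):
[root@linux001 ~]# echo "hello from linux001" > /saphana/data/linux
[root@linux002 ~]# cat /saphana/data/linux             (should print the line written on linux001)
[root@linux002 ~]# echo "hello from linux002" >> /saphana/data/linux
[root@linux001 ~]# cat /saphana/data/linux             (both lines should now be visible)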
5 Common errors
[root@linux001 ~]# mount -t ocfs2 -o nointr /dev/mapper/hana-500G /saphana/shared
mount.ocfs2: Device or resource busy while mounting /dev/sdb1 on /u02/ocfs_redo.
Check 'dmesg' for more information on this error.
- The device may already be mounted; unmount it and then mount it again:
[root@linux001 ~]# umount /dev/mapper/hana-500G
[root@linux001 ~]# umount /saphana/shared
[root@linux001 ~]# mount -t ocfs2 -o nointr /dev/mapper/hana-500G /saphana/shared
mount.ocfs2: Unable to access cluster service while trying to join the group
- df showed the volume still was not mounted; after restarting the ocfs2 service on that node and mounting again, the mount succeeded.
mount -t ocfs2 -o nointr /dev/mapper/hana-500G /saphana/shared
ocfs2_hb_ctl: Bad magic number in superblock while reading uuid
mount.ocfs2: Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted"
- This error occurs when the partition holding the OCFS2 filesystem has not been formatted; before mounting an OCFS2 filesystem, the partition used for it must be formatted with mkfs.ocfs2.
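- Whether a device already carries an OCFS2 filesystem can be checked before mounting, for example with mounted.ocfs2 from ocfs2-tools (a sketch; only format the device if it is not yet formatted, since mkfs.ocfs2 destroys existing data):
[root@linux001 ~]# mounted.ocfs2 -d                    (lists devices that contain an OCFS2 filesystem)
[root@linux001 ~]# mkfs.ocfs2 /dev/mapper/hana-500G    (only if the device does not yet contain an OCFS2 filesystem)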