1.1 Configure the hostname mapping

  • Edit the /etc/hosts file on each of the four nodes (linux001, linux002, linux003, linux004); the content must be identical on every node.
 
[root@linux001 ~]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1        linux001 localhost.localdomain localhost
::1              localhost6.localdomain6 localhost6
172.23.176.103   linux001
172.23.176.101   linux002
172.23.176.104   linux003
172.23.176.102   linux004
  • The lines 172.23.176.103 linux001, 172.23.176.101 linux002, 172.23.176.104 linux003 and 172.23.176.102 linux004 are the entries that were added; make the same edit on every server. When finished, press Esc, type :wq! and press Enter to save and exit. A quick name-resolution check is sketched below.
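As a quick sanity check, every node should be able to resolve and reach the other nodes by hostname; a minimal sketch (run on any node, hostnames as defined above):

[root@linux001 ~]# for h in linux001 linux002 linux003 linux004; do
>     ping -c 1 $h > /dev/null && echo "$h ok" || echo "$h FAILED"
> done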

1.2 Configure SSH mutual trust

  • Take linux001 and linux002 as an example. First set up passwordless login from linux001 to linux002. Run on linux001:
 
[root@linux001]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
[root@linux001]# cd /root/.ssh
[root@linux001]# mv id_dsa.pub linux001.pub
[root@linux001]# scp linux001.pub root@linux002:/root/.ssh
Run on linux002:
[root@linux002]# cd /root/.ssh
[root@linux002]# cat linux001.pub >> authorized_keys
[root@linux002]# chmod 600 authorized_keys
[root@linux002]# cd ..
[root@linux002]# chmod 700 .ssh
  • Next, set up passwordless login from linux002 to linux001. Run on linux002:
 
[root@linux002]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
[root@linux002]# cd /root/.ssh
[root@linux002]# mv id_dsa.pub linux002.pub
[root@linux002]# scp linux002.pub root@linux001:/root/.ssh
  • Run on linux001:
 
[root@linux001]# cd /root/.ssh
[root@linux001]# cat linux002.pub >> authorized_keys
[root@linux001]# chmod 600 authorized_keys
[root@linux001]# cd ..
[root@linux001]# chmod 700 .ssh
  • Configure the other pairs the same way: linux001 and linux003, linux001 and linux004, linux002 and linux003, linux002 and linux004, and linux003 and linux004 (12 one-way configurations in total). A verification loop is sketched below.
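With all key pairs in place, passwordless login can be verified in one pass; a minimal sketch (repeat on each node; BatchMode makes ssh fail instead of prompting for a password):

[root@linux001 ~]# for h in linux001 linux002 linux003 linux004; do
>     ssh -o BatchMode=yes root@$h hostname
> done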

2 Multipath configuration

2.1 Required packages

  • The required packages are:

device-mapper-1.02.13-6.9.i586.rpm      

This package works at the lower layer and performs device virtualization and mapping.

multipath-tools-0.4.7-34.18.i586.rpm

The multipath management and monitoring tools; this package detects path status and manages the paths.

  • Check whether the packages are installed, and install them if not (run on all four machines).

[root@linux001] rpm -qa | grep device-mapper    (checks whether device-mapper is installed; if nothing is printed, install it with zypper install <package>)

[root@linux001] rpm -qa | grep multipath-tools    (checks whether multipath-tools is installed; if nothing is printed, install it with zypper install <package>)

  • Installation commands:

[root@linux001] zypper install device-mapper

[root@linux001] zypper install multipath-tools
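The check and the installation can also be combined into one loop per node; a minimal sketch (assumes the zypper repositories are already configured):

[root@linux001 ~]# for p in device-mapper multipath-tools; do
>     rpm -q $p > /dev/null || zypper install -y $p
> done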

  • If the multipath modules are not loaded, initialize DM with the following commands, or reboot the system:
 
--- Use the following commands to initialize and start DM for the first time:
[root@linux001] modprobe dm-multipath
[root@linux001] modprobe dm-round-robin
[root@linux001] service multipathd start
[root@linux001] multipath -v2
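To confirm that the device-mapper multipath modules are actually loaded, check the kernel module list; both dm_multipath and dm_round_robin should appear:

[root@linux001 ~]# lsmod | grep -E 'dm_multipath|dm_round_robin'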

2.2 Check the WWIDs and path status

 
linux001:/ # multipath -ll     (on any one node)
mpath1 (36000d31000eea500000000000000000008) dm-11 COMPELNT,Compellent Vol
[size=8.0T][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:5:6 sdh 8:112   [active][undef]
 \_ 4:0:7:6 sdn 8:208   [active][undef]
 \_ 4:0:7:6 sdw 65:96   [active][undef]
 \_ 4:0:7:6 sdz 65:144  [active][undef]

mpath2 (36000d31000eea500000000000000000003) dm-6 COMPELNT,Compellent Vol
[size=2.0T][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:4:1 sdc 8:32    [active][undef]
 \_ 4:0:6:1 sdi 8:128   [active][undef]
 \_ 4:0:4:1 sdo 8:224   [active][undef]
 \_ 4:0:5:1 sdr 65:16   [active][undef]

mpath3 (36000d31000eea500000000000000000007) dm-8 COMPELNT,Compellent Vol
[size=500G][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:4:5 sde 8:64    [active][undef]
 \_ 4:0:6:5 sdk 8:160   [active][undef]
 \_ 4:0:4:5 sdq 65:0    [active][undef]
 \_ 4:0:5:5 sdt 65:48   [active][undef]
  • To make the multipath pseudo-device names (mpath1, mpath2, mpath3) more meaningful, we can change them through the configuration file. Note down the WWID of each pseudo-device, for example 36000d31000eea500000000000000000008 shown in parentheses after mpath1. Also note that the path state active undef means that the corresponding devices have not been activated yet. A quick way to list the names and WWIDs is sketched below.
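To pull out just the pseudo-device names and WWIDs needed for the next step, the listing can be filtered; a minimal sketch:

[root@linux001 ~]# multipath -ll | grep -Ei '^mpath'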

2.3 Creating and modifying the configuration file

  • Create the multipath.conf configuration file. It is not created automatically at installation time, but a template is shipped with the package; copy it with cp /usr/share/doc/packages/multipath-tools/multipath.conf.synthetic /etc/multipath.conf (this copies the bundled template multipath.conf.synthetic to /etc/multipath.conf, producing the new configuration file multipath.conf).
  • Edit the /etc/multipath.conf file:
 
vi /etc/multipath.conf
blacklist {
    devnode "^sda"
    devnode "^sdb"
}                               # blacklist the local disks sda and sdb
defaults {
    user_friendly_names yes
}                               # set user_friendly_names to yes
multipaths {
    multipath {
        wwid    36000d31000eea500000000000000000008    # WWID of mpath1
        alias   hana-8.0T
        path_grouping_policy    multibus
        path_selector           "round-robin 0"
        failback                manual
        rr_weight               priorities
        no_path_retry           5
        rr_min_io               100
    }
    multipath {
        wwid    36000d31000eea500000000000000000003    # WWID of mpath2
        alias   hana-2.0T
    }
    multipath {
        wwid    36000d31000eea500000000000000000007    # WWID of mpath3
        alias   hana-500G
    }
}
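Before distributing the file, the new configuration can be checked with a dry run, which prints the maps that would be created without changing anything; a minimal sketch:

[root@linux001 ~]# multipath -d -v2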
  • Copy the modified multipath.conf to the other three servers with scp:
 
[root@linux001]# scp /etc/multipath.conf root@linux002:/etc/multipath.conf
[root@linux001]# scp /etc/multipath.conf root@linux003:/etc/multipath.conf
[root@linux001]# scp /etc/multipath.conf root@linux004:/etc/multipath.conf

2.4 Restart the multipathd service

  • Restart the service (on every node):
 
[root@linux001]# /etc/init.d/multipathd stop
Stopping multipathd daemon:                       [  OK  ]
[root@linux001]# /etc/init.d/multipathd start
Starting multipathd daemon:                       [  OK  ]    ("OK" means the service started normally)
Enable the service at boot (run levels 2, 3, 5) on every node:
[root@linux001]# chkconfig --level 235 multipathd on

2.5 Check the multipath status

 
linux001:/ # multipath -ll     (run on every node)
mpath1 (36000d31000eea500000000000000000008) dm-11 COMPELNT,Compellent Vol
[size=8.0T][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:5:6 sdh 8:112   active ready running
 \_ 4:0:7:6 sdn 8:208   active ready running
 \_ 4:0:7:6 sdw 65:96   active ready running
 \_ 4:0:7:6 sdz 65:144  active ready running

mpath2 (36000d31000eea500000000000000000003) dm-6 COMPELNT,Compellent Vol
[size=2.0T][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:4:1 sdc 8:32    active ready running
 \_ 4:0:6:1 sdi 8:128   active ready running
 \_ 4:0:4:1 sdo 8:224   active ready running
 \_ 4:0:5:1 sdr 65:16   active ready running

mpath3 (36000d31000eea500000000000000000007) dm-8 COMPELNT,Compellent Vol
[size=500G][features=1 queue_if_no_path][hwhandler=0] wp=rw
\_ round-robin 0 [prio=1] status=active
 \_ 3:0:4:5 sde 8:64    active ready running
 \_ 4:0:6:5 sdk 8:160   active ready running
 \_ 4:0:4:5 sdq 65:0    active ready running
 \_ 4:0:5:5 sdt 65:48   active ready running
  • When every path shows active ready running as above, the multipath configuration has been applied successfully. The aliases can also be checked under /dev/mapper, as sketched below.
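The aliases defined in multipath.conf should also be visible as device-mapper nodes; a quick check:

[root@linux001 ~]# ls -l /dev/mapper/hana-*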

3 OCFS2 configuration

3.1 Required packages

ocfs2-tools-1.6.4.0.3.5

ocfs2-dmp-default-1.6_3.0.13_0.27-0.5.84

ocfs2-tools-o2cb-1.6.4.0.3.5

ocfs2console-1.6.4.0.3.5

  • Check whether the packages are installed, and install them if not (run on all four machines):
 
[root@linux001] rpm -qa | grep ocfs2-tools
[root@linux001] rpm -qa | grep ocfs2-dmp-default
[root@linux001] rpm -qa | grep ocfs2-tools-o2cb
[root@linux001] rpm -qa | grep ocfs2console
Installation commands (install whichever packages are missing):
[root@linux001] zypper install ocfs2-tools ocfs2-tools-o2cb
[root@linux001] zypper install ocfs2console ocfs2-dmp-default
  • Configure the nodes (on any one node):

ocfs2console --> Cluster --> Node Configuration --> add every node; the Name is the hostname, and the IP field takes the private-network IP.

  • Propagate the configuration to the other nodes (on any one node):

ocfs2console --> Cluster --> Propagate Cluster Configuration, which pushes the configuration on linux001 to linux002, linux003 and linux004. A manual alternative is sketched below.
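If ocfs2console (and therefore an X display) is not available, the same result can be achieved by copying the cluster file by hand; a minimal sketch that relies on the SSH trust configured earlier:

[root@linux001 ~]# for h in linux002 linux003 linux004; do
>     ssh root@$h mkdir -p /etc/ocfs2
>     scp /etc/ocfs2/cluster.conf root@$h:/etc/ocfs2/cluster.conf
> done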

 
  • The resulting cluster configuration file should look like this on every node:
[root@linux001]# cat /etc/ocfs2/cluster.conf
node:
        ip_port = 7777
        ip_address = 172.23.176.103
        number = 0
        name = linux001
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.23.176.101
        number = 1
        name = linux002
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.23.176.104
        number = 2
        name = linux003
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 172.23.176.102
        number = 3
        name = linux004
        cluster = ocfs2

cluster:
        node_count = 4
        name = ocfs2

3.3 Configure o2cb

  • Run on every node:
 
[root@linux001]# /etc/init.d/o2cb configure
Configuring the O2CB driver.
This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot.  The current values will be shown in brackets ('[]').  Hitting
<ENTER> without typing an answer will keep that current value.  Ctrl-C
will abort.
Load O2CB driver on boot (y/n) [n]: y
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [31]:
Specify network idle timeout in ms (>=5000) [30000]:
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading filesystem "ocfs2_dlmfs": OK
Creating directory '/dlm': OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Checking O2CB cluster configuration : Failed

3.4 Partition the file system devices

 
[root@linux001]# fdisk /dev/mapper/hana-500G
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-261, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-261, default 261):
Using default value 261

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
  • Partition the other devices in the same way. (In practice the whole device can also be formatted without partitioning; that is what was done in this deployment, so the partitioning step was skipped. If you do partition, see the note below on partition mappings.)
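If partitions are created on a multipathed device, the partition mappings have to be added to device-mapper before they can be formatted; a minimal sketch (alias and partition suffix may differ on your system):

[root@linux001 ~]# kpartx -a /dev/mapper/hana-500G     # add partition mappings, e.g. hana-500G_part1
[root@linux001 ~]# ls /dev/mapper/ | grep hana-500G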

3.5 Format the file systems

 

 
• Format the file systems (on any one node):
[root@linux001]# mkfs.ocfs2 /dev/mapper/hana-8.0T
[root@linux001]# mkfs.ocfs2 /dev/mapper/hana-2.0T
[root@linux001]# mkfs.ocfs2 /dev/mapper/hana-500G
• Create the data directories:
[root@linux001] mkdir /saphana
[root@linux001] mkdir -p /saphana/data
[root@linux001] mkdir -p /saphana/log
[root@linux001] mkdir -p /saphana/shared
• Mount the directories (these mounts do not survive a reboot):
[root@linux001]# mount -t ocfs2 -o nointr /dev/mapper/hana-8.0T /saphana/data
[root@linux001]# mount -t ocfs2 -o nointr /dev/mapper/hana-2.0T /saphana/log
[root@linux001]# mount -t ocfs2 -o nointr /dev/mapper/hana-500G /saphana/shared
• Check the mounts (on every node):
[root@linux001 ~]# df -h
Filesystem                  Size   Used  Avail  Use%  Mounted on
/dev/mapper/system-root     30G    4.8G  24G    17%   /
devtmpfs                    253G   320K  253G   1%    /dev
tmpfs                       380G   88K   380G   1%    /dev/shm
/dev/sda1                   152M   36M   108M   25%   /boot
/dev/dm-11                  8.0T   13G   8.0T   1%    /saphana/data
/dev/dm-6                   2.0T   13G   2.0T   1%    /saphana/log
/dev/dm-8                   500G   2.9G  298G   1%    /saphana/shared
• Add the mounts to /etc/fstab so that they persist across reboots (on every node).
Edit the file with vi /etc/fstab and append the following lines at the end:
/dev/mapper/hana-8.0T        /saphana/data    ocfs2   _netdev   0  0
/dev/mapper/hana-2.0T        /saphana/log     ocfs2   _netdev   0  0
/dev/mapper/hana-500G        /saphana/shared  ocfs2   _netdev   0  0
After saving, reboot the server; if df -h shows the same mounts as above, the configuration is correct.
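Before rebooting, the new fstab entries can be exercised in place with mount -a; a minimal sketch (run after unmounting the manual mounts):

[root@linux001 ~]# mount -a -t ocfs2     # mount every ocfs2 entry listed in /etc/fstab
[root@linux001 ~]# df -h | grep saphana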

3.6 Restart the related services

  • Restart the services (on every node):
 
[root@linux001 ~]# service ocfs2 restart
[root@linux001 ~]# service o2cb restart
Unmounting ocfs2_dlmfs filesystem: OK
Unloading module "ocfs2_dlmfs": OK
Unmounting configfs filesystem: OK
Unloading module "configfs": OK
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading filesystem "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK
• Enable the services at boot (on every node):
[root@linux001]# chkconfig --level 235 o2cb on
[root@linux001]# chkconfig --level 235 ocfs2 on

3.7 Check the service status

 
• Check the o2cb service (on every node):
[root@linux001 ~]# service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
  Network idle timeout: 30000
  Network keepalive delay: 2000
  Network reconnect delay: 2000
Checking O2CB heartbeat: active
• Check the ocfs2 service (on every node):
[root@linux001 ~]# service ocfs2 status
Configured OCFS2 mountpoints:  /saphana/data /saphana/log /saphana/shared
Active OCFS2 mountpoints:      /saphana/data /saphana/log /saphana/shared

4 Testing

4.1 Create a file and read/write it

  • The test is simple: on any one server, for example under /saphana/data on linux001, create a file named linux and write some content into it. If the other servers can see the linux file under their /saphana/data and can read and write it, the shared file system is working; one possible sequence is sketched below.
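One possible way to run the test, assuming the mounts above are active on all nodes:

[root@linux001 ~]# echo "hello ocfs2" > /saphana/data/linux           # create the file on linux001
[root@linux002 ~]# cat /saphana/data/linux                            # read it on another node
hello ocfs2
[root@linux002 ~]# echo "reply from linux002" >> /saphana/data/linux  # append from the second node
[root@linux001 ~]# cat /saphana/data/linux                            # both lines are visible everywhere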

5 Common errors

 
[root@linux001 ~]# mount -t ocfs2 -o nointr /dev/mapper/hana-500G /saphana/shared
mount.ocfs2: Device or resource busy while mounting /dev/sdb1 on /u02/ocfs_redo.
Check 'dmesg' for more information on this error.
• The device may already be mounted; unmount it and then mount it again.
[root@linux001 ~]# umount /dev/mapper/hana-500G
[root@linux001 ~]# umount /saphana/shared
[root@linux001 ~]# mount -t ocfs2 -o nointr /dev/mapper/hana-500G /saphana/shared
mount.ocfs2: Unable to access cluster service while trying to join the group
• df showed that the file system had not been mounted; after restarting the ocfs2 service on that node, mounting it again succeeded.
mount -t ocfs2 -o nointr /dev/mapper/hana-500G /saphana/shared
ocfs2_hb_ctl: Bad magic number in superblock while reading uuid
mount.ocfs2: Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted"
• This error is caused by an OCFS2 partition that has not been formatted; the partition must be formatted with mkfs.ocfs2 before the file system can be mounted.
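For the "Device or resource busy" case it can also help to see which cluster nodes currently have the volume mounted; ocfs2-tools ships a helper for this (a sketch, run on any node):

[root@linux001 ~]# mounted.ocfs2 -f /dev/mapper/hana-500G     # list the nodes that have this OCFS2 volume mounted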