RAC1: "My server needs a reboot. Hey RAC2, cover for me for a bit." How do we do it?!

2024-05-11 12:01:01

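【Introduction】
Today I picked up a small job: a RAC environment on Linux 7.x where, for some reason, the server of node 1 (RAC1) has to be rebooted, and the business must not be interrupted.

No business interruption? Isn't that exactly what RAC is for! The plan: stop the listener and the instances on node 1, then shut down the cluster services on node 1.

I had not touched the cluster for a long time, mainly because the RAC environment is so stable that I never get the chance. I no longer remember many of the commands well and got through mostly by leaning on -help. A good memory is no match for written notes, so this post records the steps for future reference.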

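Main text:
First, a quick note on the proper order for shutting down a whole RAC:
stop the listener on each node -->> shut down the database (instances) -->> stop the cluster services on each node -->> shut down the servers

Note:
Disconnect all client sessions first, such as SQL*Plus …

Because this post shuts down a single node only, the steps are:
stop the listener on node 1 -->> shut down the instance on node 1 -->> stop the cluster services (CRS) on node 1 -->> shut down the server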
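
The single-node order above can be written down as a small dry-run helper. This is only a sketch: the function name print_node_shutdown_plan and its two arguments are made up for illustration, and it prints the commands (the same srvctl and crsctl invocations used later in this post) rather than running them.

```shell
# Hypothetical helper: print, in order, the commands for taking one RAC node
# out of service. Nothing is executed; the lines are only echoed.
# $1 = node name (e.g. dbts-rac1), $2 = database unique name (e.g. dbts)
print_node_shutdown_plan() {
  printf '%s\n' \
    "srvctl stop listener -n $1" \
    "srvctl stop instance -d $2 -n $1 -o immediate" \
    "crsctl stop crs" \
    "# reboot the server once the stack is down"
}

# Example: print_node_shutdown_plan dbts-rac1 dbts
```

The srvctl lines are meant for the oracle user and crsctl stop crs for root on the node being stopped. If the node hosts instances of several databases, the instance step has to be repeated for each one.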

Note: in this case crsctl is on root's PATH, so every crsctl command below is run as root.
[root@dbts-rac1 ~]# which crsctl
/oracle/app/11.2.0/grid/bin/crsctl

1. First check the cluster status (as root or grid), using crs_stat -t -v or crsctl stat res -t

[root@dbts-rac1 ~]# crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host        
----------------------------------------------------------------------
ora....ER.lsnr ora....er.type 0/5    0/     ONLINE    ONLINE    dbts-rac1  
ora....N1.lsnr ora....er.type 0/5    0/0    ONLINE    ONLINE    dbts-rac1  
ora....VOTE.dg ora....up.type 0/5    0/     ONLINE    ONLINE    dbts-rac1  
ora....ARCH.dg ora....up.type 1/5    0/     ONLINE    ONLINE    dbts-rac1  
ora....ATA1.dg ora....up.type 1/5    0/     ONLINE    ONLINE    dbts-rac1  
ora....ATA2.dg ora....up.type 0/5    0/     ONLINE    ONLINE    dbts-rac1  
ora.asm        ora.asm.type   0/5    0/     ONLINE    ONLINE    dbts-rac1  
ora....dbdb.db ora....se.type 0/2    0/1    ONLINE    ONLINE    dbts-rac1  
ora....sdbt.db ora....se.type 0/2    0/1    ONLINE    ONLINE    dbts-rac1  
ora....SM1.asm application    0/5    0/0    ONLINE    ONLINE    dbts-rac1  
ora....C1.lsnr application    0/5    0/0    ONLINE    ONLINE    dbts-rac1  
ora....ac1.gsd application    0/5    0/0    OFFLINE   OFFLINE               
ora....ac1.ons application    0/3    0/0    ONLINE    ONLINE    dbts-rac1  
ora....ac1.vip ora....t1.type 0/0    1/0    ONLINE    ONLINE    dbts-rac1  
ora....SM2.asm application    0/5    0/0    ONLINE    ONLINE    dbts-rac2  
ora....C2.lsnr application    0/5    0/0    ONLINE    ONLINE    dbts-rac2  
ora....ac2.gsd application    0/5    0/0    OFFLINE   OFFLINE               
ora....ac2.ons application    0/3    0/0    ONLINE    ONLINE    dbts-rac2  
ora....ac2.vip ora....t1.type 0/0    1/0    ONLINE    ONLINE    dbts-rac2  
ora.dbts.db   ora....se.type 0/2    0/1    ONLINE    ONLINE    dbts-rac1  
ora....atbi.db ora....se.type 0/2    1/1    ONLINE    ONLINE    dbts-rac1  
ora....tetl.db ora....se.type 0/2    1/1    ONLINE    ONLINE    dbts-rac1  
ora.cvu        ora.cvu.type   0/5    0/0    ONLINE    ONLINE    dbts-rac1  
ora.gsd        ora.gsd.type   0/5    0/     OFFLINE   OFFLINE               
ora....network ora....rk.type 0/5    0/     ONLINE    ONLINE    dbts-rac1  
ora.oc4j       ora.oc4j.type  0/1    0/2    ONLINE    ONLINE    dbts-rac1  
ora.ons        ora.ons.type   0/3    0/     ONLINE    ONLINE    dbts-rac1  
ora.scan1.vip  ora....ip.type 0/0    0/0    ONLINE    ONLINE    dbts-rac1  
[root@dbts-rac1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       dbts-rac1                                   
               ONLINE  ONLINE       dbts-rac2                                   
ora.OCR_VOTE.dg
               ONLINE  ONLINE       dbts-rac1                                   
               ONLINE  ONLINE       dbts-rac2                                   
ora.ORA_ARCH.dg
               ONLINE  ONLINE       dbts-rac1                                   
               ONLINE  ONLINE       dbts-rac2                                   
ora.ORA_DATA1.dg
               ONLINE  ONLINE       dbts-rac1                                   
               ONLINE  ONLINE       dbts-rac2                                   
ora.ORA_DATA2.dg
               ONLINE  ONLINE       dbts-rac1                                   
               ONLINE  ONLINE       dbts-rac2                                   
ora.asm
               ONLINE  ONLINE       dbts-rac1               Started             
               ONLINE  ONLINE       dbts-rac2               Started             
ora.gsd
               OFFLINE OFFLINE      dbts-rac1                                   
               OFFLINE OFFLINE      dbts-rac2                                   
ora.net1.network
               ONLINE  ONLINE       dbts-rac1                                   
               ONLINE  ONLINE       dbts-rac2                                   
ora.ons
               ONLINE  ONLINE       dbts-rac1                                   
               ONLINE  ONLINE       dbts-rac2                                   
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       dbts-rac1                                              
ora.dbts-rac1.vip
      1        ONLINE  ONLINE       dbts-rac1                                   
ora.dbts-rac2.vip
      1        ONLINE  ONLINE       dbts-rac2                                   
ora.dbts.db
      1        ONLINE  ONLINE       dbts-rac1               Open                
      2        ONLINE  ONLINE       dbts-rac2               Open                            
ora.cvu
      1        ONLINE  ONLINE       dbts-rac1                                   
ora.oc4j
      1        ONLINE  ONLINE       dbts-rac1                                   
ora.scan1.vip
      1        ONLINE  ONLINE       dbts-rac1                                   
[root@dbts-rac1 ~]# 

Note:
I prefer crsctl stat res -t because its output is clear and easy to read, whereas crs_stat -t -v truncates the resource names and squeezes everything together.
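
Since the crsctl stat res -t layout is regular, it is also easy to filter. Below is a rough sketch, assuming the 11.2 layout shown above (a resource name line starting with "ora.", followed by one status line per server or per cardinality). The function name unhealthy_resources is hypothetical; it lists resources whose TARGET is ONLINE while their STATE is something else.

```shell
# Hypothetical filter: read `crsctl stat res -t` output on stdin and print
# "resource state" for every line whose TARGET is ONLINE but STATE is not.
unhealthy_resources() {
  awk '
    /^ora\./ { res = $1; next }          # remember the current resource name
    res != "" && NF >= 2 {
      # Cluster resources carry a leading cardinality number; local ones do not.
      i = ($1 ~ /^[0-9]+$/) ? 2 : 1
      if ($i == "ONLINE" && $(i + 1) != "ONLINE")
        print res, $(i + 1)
    }
  '
}

# Example (as root or grid): crsctl stat res -t | unhealthy_resources
```

Resources that are deliberately off (TARGET OFFLINE, like ora.gsd here) are skipped, so on a healthy cluster the filter should print nothing.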

2. Stop the listener on node 1 (as the oracle user)

[oracle@dbts-rac2 ~]$ srvctl stop listener -n dbts-rac1

Check the status after the stop

[oracle@dbts-rac1 ~]$ srvctl status listener -n dbts-rac1 
Listener LISTENER is enabled on node(s): dbts-rac1
Listener LISTENER is not running on node(s): dbts-rac1
3. Check the instance status, using the database dbts as an example
In this case db_unique_name is dbts, which can be confirmed as follows:
SYS@dbts2> show parameter unique

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_unique_name                       string      dbts

Check the status of the database instances on both nodes

[oracle@dbts-rac1 ~]$ srvctl status database -d dbts
Instance dbts1 is running on node dbts-rac1
Instance dbts2 is running on node dbts-rac2

Shut down the instance on node 1. There are two ways.
Method 1: specify the instance name dbts1

[oracle@dbts-rac1 ~]$ srvctl stop instance -d dbts -i dbts1 -o immediate

Method 2: specify the node name

[oracle@dbts-rac1 ~]$ srvctl stop instance -d dbts -n dbts-rac1 -o immediate

Check the instance status again: the instance on node 1 is now down

[oracle@dbts-rac2 dbs]$ srvctl status database -d dbts
Instance dbts1 is not running on node dbts-rac1
Instance dbts2 is running on node dbts-rac2
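
To check this in a script rather than by eye, the srvctl status database output can be filtered. A minimal sketch, assuming the English message format shown above; the function name instances_not_running is made up for illustration.

```shell
# Hypothetical filter: read `srvctl status database -d <db>` output on stdin
# and print "instance node" for every instance reported as not running.
instances_not_running() {
  # Matching lines look like: Instance dbts1 is not running on node dbts-rac1
  awk '$1 == "Instance" && $3 == "is" && $4 == "not" && $5 == "running" { print $2, $NF }'
}

# Example (as oracle): srvctl status database -d dbts | instances_not_running
```

Before the server is rebooted, the expectation is that only the node 1 instance shows up in this list.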
4. Check the cluster status
Method 1: check the cluster stack on the local node
[root@dbts-rac1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Method 2: check all nodes with crsctl check cluster -all

[root@dbts-rac1 ~]# crsctl check cluster -all
**************************************************************
dbts-rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
dbts-rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
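
The per-node blocks in this output can be reduced to a list of healthy nodes. The sketch below assumes the format shown above, where each node name sits on its own line ending in a colon and a healthy node reports three "is online" lines (Cluster Ready Services, Cluster Synchronization Services, Event Manager). The function name nodes_fully_online is hypothetical.

```shell
# Hypothetical filter: read `crsctl check cluster -all` output on stdin and
# print the nodes for which all three services are reported online.
nodes_fully_online() {
  awk '
    function emit() { if (node != "" && ok == 3) print node }
    NF == 1 && $1 ~ /:$/ {               # a line such as "dbts-rac1:"
      emit()
      node = substr($1, 1, length($1) - 1)
      ok = 0
      next
    }
    /^CRS-/ && /is online[ \t]*$/ { ok++ }
    END { emit() }
  '
}

# Example (as root): crsctl check cluster -all | nodes_fully_online
```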
5. Stop the cluster services on node 1
[root@dbts-rac1 ~]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'dbts-rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.OCR_VOTE.dg' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.ORA_DATA2.dg' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.ORA_ARCH.dg' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.ORA_DATA1.dg' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.cvu' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.dbts-rac1.vip' on 'dbts-rac1'
CRS-2677: Stop of 'ora.cvu' on 'dbts-rac1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'dbts-rac2'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'dbts-rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'dbts-rac1'
CRS-2677: Stop of 'ora.ORA_ARCH.dg' on 'dbts-rac1' succeeded
CRS-2676: Start of 'ora.cvu' on 'dbts-rac2' succeeded
CRS-2677: Stop of 'ora.ORA_DATA2.dg' on 'dbts-rac1' succeeded
CRS-2677: Stop of 'ora.ORA_DATA1.dg' on 'dbts-rac1' succeeded
CRS-2677: Stop of 'ora.dbts-rac1.vip' on 'dbts-rac1' succeeded
CRS-2672: Attempting to start 'ora.dbts-rac1.vip' on 'dbts-rac2'
CRS-2677: Stop of 'ora.scan1.vip' on 'dbts-rac1' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'dbts-rac2'
CRS-2676: Start of 'ora.dbts-rac1.vip' on 'dbts-rac2' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'dbts-rac1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'dbts-rac2'
CRS-2676: Start of 'ora.scan1.vip' on 'dbts-rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'dbts-rac2'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'dbts-rac2' succeeded
CRS-2677: Stop of 'ora.OCR_VOTE.dg' on 'dbts-rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'dbts-rac1'
CRS-2677: Stop of 'ora.asm' on 'dbts-rac1' succeeded
CRS-2676: Start of 'ora.oc4j' on 'dbts-rac2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'dbts-rac1'
CRS-2677: Stop of 'ora.ons' on 'dbts-rac1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'dbts-rac1'
CRS-2677: Stop of 'ora.net1.network' on 'dbts-rac1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'dbts-rac1' has completed
CRS-2677: Stop of 'ora.crsd' on 'dbts-rac1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'dbts-rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'dbts-rac1'
CRS-2677: Stop of 'ora.ctssd' on 'dbts-rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'dbts-rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'dbts-rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'dbts-rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'dbts-rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'dbts-rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'dbts-rac1'
CRS-2677: Stop of 'ora.cssd' on 'dbts-rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'dbts-rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'dbts-rac1'
CRS-2677: Stop of 'ora.gipcd' on 'dbts-rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'dbts-rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'dbts-rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'dbts-rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped
[root@dbts-rac1 ~]# crsctl check has
CRS-4639: Could not contact Oracle High Availability Services

[root@dbts-rac1 ~]# crsctl config has
CRS-4622: Oracle High Availability Services autostart is enabled.

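At this point you can disable HAS autostart so that the stack does not come up at boot, and enable it again later; disabling it is optional. If you do disable it, remember to run crsctl enable has afterwards, otherwise the stack will not start automatically when the server boots.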

[root@dbts-rac1 ~]# crsctl disable has
CRS-4621: Oracle High Availability Services autostart is disabled.
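
Because a forgotten autostart setting only shows up at the next boot, it can be worth checking it explicitly. A small sketch, assuming the CRS-4622 and CRS-4621 message texts shown above; the function name has_autostart_state is made up for illustration.

```shell
# Hypothetical filter: read `crsctl config has` output on stdin and print
# "enabled", "disabled", or "unknown" if neither message is present.
has_autostart_state() {
  awk '
    /autostart is enabled/  { state = "enabled" }
    /autostart is disabled/ { state = "disabled" }
    END { print (state == "" ? "unknown" : state) }
  '
}

# Example (as root): crsctl config has | has_autostart_state
```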

The cluster stack on node 1 has stopped successfully.

Check the cluster status on node 1 once more

[root@dbts-rac1 ~]# crsctl check cluster -all
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Check failed, or completed with errors.
[root@dbts-rac1 ~]# crsctl check has
CRS-4639: Could not contact Oracle High Availability Services

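This shows that the cluster stack on node 1 is down and crsctl can no longer contact it; from now on the cluster status can only be checked from node 2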

[root@dbts-rac2 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@dbts-rac2 ~]# crsctl check cluster -all
**************************************************************
dbts-rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

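As you can see, the cluster on node 2 is running normally, and crsctl check cluster -all no longer shows anything for node 1.

The following command shows that node 2 has taken over the cluster services, while the resources that belonged to node 1 are OFFLINE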

[root@dbts-rac2 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       dbts-rac2                                   
ora.OCR_VOTE.dg
               ONLINE  ONLINE       dbts-rac2                                   
ora.ORA_ARCH.dg
               ONLINE  ONLINE       dbts-rac2                                   
ora.ORA_DATA1.dg
               ONLINE  ONLINE       dbts-rac2                                   
ora.ORA_DATA2.dg
               ONLINE  ONLINE       dbts-rac2                                   
ora.asm
               ONLINE  ONLINE       dbts-rac2               Started             
ora.gsd
               OFFLINE OFFLINE      dbts-rac2                                   
ora.net1.network
               ONLINE  ONLINE       dbts-rac2                                   
ora.ons
               ONLINE  ONLINE       dbts-rac2                                   
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       dbts-rac2                                   
ora.crmprdbdb.db
      1        OFFLINE OFFLINE                               Instance Shutdown   
      2        ONLINE  ONLINE       dbts-rac2               Open                
ora.crmprdsdbt.db
      1        OFFLINE OFFLINE                               Instance Shutdown   
      2        ONLINE  ONLINE       dbts-rac2               Open                
ora.dbts-rac1.vip
      1        ONLINE  INTERMEDIATE dbts-rac2               FAILED OVER         
ora.dbts-rac2.vip
      1        ONLINE  ONLINE       dbts-rac2                                   
ora.dbts.db
      1        OFFLINE OFFLINE                               Instance Shutdown   
      2        ONLINE  ONLINE       dbts-rac2               Open                
ora.crmuatbi.db
      1        OFFLINE OFFLINE                               Instance Shutdown   
      2        ONLINE  ONLINE       dbts-rac2               Open                
ora.crmuatetl.db
      1        OFFLINE OFFLINE                               Instance Shutdown   
      2        ONLINE  ONLINE       dbts-rac2               Open                
ora.cvu
      1        ONLINE  ONLINE       dbts-rac2                                   
ora.oc4j
      1        ONLINE  ONLINE       dbts-rac2                                   
ora.scan1.vip
      1        ONLINE  ONLINE       dbts-rac2  

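Looking at the cluster resources from node 2, the services that ran on node 1 are shut down and the node 1 VIP has failed over to node 2.

OK, the cluster stack on node 1 has been shut down cleanly, which was the goal of this post: stopping the services of a single node.

One more question: the cluster on node 1 shut down cleanly, but will it come back up cleanly?
Not so sure? The way to find out is to start it by hand once. A rehearsal gives more peace of mind.

Use crsctl start crs to bring it back up.
Run it as root on node 1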

[root@dbts-rac1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started

If you check the cluster status with crsctl immediately after running the command above, it reports the following errors.

[root@dbts-rac1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[root@dbts-rac1 ~]# crsctl check cluster -all
**************************************************************
dbts-rac1:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
**************************************************************
dbts-rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

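Why do these cluster check errors appear?
Because the CRS stack is not up yet. Starting the cluster takes time, so just wait a little; in this case it took about one minute.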
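
Instead of re-running the check by hand, the wait can be scripted as a polling loop. This is a minimal sketch, not an Oracle-provided tool: the function name wait_for_stack, the attempt count and the interval are all assumptions, and "healthy" is taken to mean the four "is online" lines that crsctl check crs printed earlier in this post.

```shell
# Hypothetical helper: re-run a status command until it prints at least four
# "is online" lines, or give up after a number of attempts.
# $1 = max attempts, $2 = seconds between attempts, remaining args = command
wait_for_stack() {
  attempts=$1
  interval=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    # grep -c exits non-zero on zero matches, hence the "|| true"
    online=$("$@" 2>/dev/null | grep -c 'is online' || true)
    if [ "$online" -ge 4 ]; then
      echo "stack online after $n check(s)"
      return 0
    fi
    sleep "$interval"
    n=$((n + 1))
  done
  echo "stack still not fully online after $attempts check(s)" >&2
  return 1
}

# Example (as root on node 1): wait_for_stack 30 10 crsctl check crs
```

Passing the command as arguments keeps the loop independent of crsctl itself, so it can be tried out with a stub before being pointed at a real cluster.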

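A small complaint here:
crsctl stop crs prints every step while it shuts the cluster down, but crsctl start crs prints nothing beyond CRS-4123: Oracle High Availability Services has been started. Not very friendly.

One minute later, checking the cluster status again shows everything is OK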

[root@dbts-rac1 ~]# crsctl check cluster -all
**************************************************************
dbts-rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
dbts-rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

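And with that, stopping a single RAC node is done.

【Summary】
1. This post records one run of shutting down a single RAC node; to bring it back, perform the steps in reverse.
2. I rarely type cluster commands and have grown rusty, so regular practice is still needed.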

Follow my personal WeChat official account “一森咖记”