Disclaimer: The following information was gathered from various sources, and credit goes to them.
If a newer release adds functions not covered here, please test them at your own risk.
_________________________________________________________________________________
Was CAA introduced with AIX 7.1?
No. The Cluster Aware function of AIX was introduced with AIX 7.1 and also with AIX 6.1 TL6. This technology builds clustering into the AIX base operating system and provides the AIX kernel with heartbeating, health management and monitoring capabilities.
Using Cluster Aware AIX you can easily create a cluster of AIX nodes with the following capabilities (note that CAA by itself is not a High Availability (HA) solution):
- Clusterwide event management - The AIX Event Infrastructure allows event propagation across the cluster so that applications can monitor events from any node in the cluster:
  - Communication and storage events
  - Node UP and node DOWN
  - Network adapter UP and DOWN
  - Network address change
  - Point-of-contact UP and DOWN
  - Disk UP and DOWN
  - Predefined and user-defined events
- Clusterwide storage naming service - When a cluster is defined or modified, the AIX interfaces automatically create a consistent shared device view across the cluster. A global device name, such as cldisk5, refers to the same physical disk from any node in the cluster.
- Clusterwide command distribution - The clcmd command provides a facility to distribute a command to a set of nodes that are members of a cluster. For example, the command clcmd date returns the output of the date command from each of the nodes in the cluster.
- Clusterwide communication making use of networking and storage communication
Cluster Aware AIX taken alone is not a high availability solution and does not replace existing high availability products. It can be seen as a set of commands and services that cluster software can exploit to provide high availability and disaster recovery support to external applications. CAA does not provide the application monitoring and resource failover capabilities that PowerHA provides. In fact, IBM PowerHA SystemMirror 7.1 and even Reliable Scalable Cluster Technology (RSCT) use the built-in AIX clustering capabilities. The cluster functions were built into AIX to simplify the configuration and management of high availability clusters, and they lay a foundation for future AIX capabilities and the next generation of PowerHA SystemMirror.
Note: The Cluster Aware AIX capability is included in the AIX 7.1 Standard and Enterprise Editions, but not in the AIX 7.1 Express Edition.
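A quick way to confirm that the CAA filesets are present on a node (a minimal check; the fileset names are assumed from a typical AIX 7.1 installation):
# lslpp -l bos.cluster.rte bos.ahafs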
Creating the cluster
Before creating the cluster there are some things to consider.
CAA uses IP-based network communication and storage interface communication through Fibre Channel and SAS adapters. When both types of communication are used, all nodes in the cluster can always communicate with any other node in the cluster configuration, which eliminates "split brain" incidents.
Network
A multicast/unicast address is used for cluster
communications between the nodes in the cluster. Therefore, you need
to ensure proper network configuration on each node. Each node must
have at least one IP address configured on its network interface. The
IP address is used as a basis for creating an IP multicast/unicast address,
which the cluster uses for internal communications. Also check that entries for these IP addresses exist in every node's /etc/hosts file.
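For example, a quick pre-check on each node (nodeA and nodeB are the hypothetical node names used later in this post):
# host nodeA
# host nodeB
# grep -wE "nodeA|nodeB" /etc/hosts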
Storage
Each node of the cluster should have common
storage devices available, either SAN or SAS disks. These storage
devices are used for the cluster repository disk and for any
clustered shared disks. If Fibre Channel devices will be used, the
following procedure must be followed before creating the cluster (SAS
adapters do not require special setup):
- Run the following command:
rmdev -Rl fcsX
Note: X is the number of your adapter. If you booted from the Fibre Channel adapter, you do not need to complete this step.
- Run the following command:
chdev -l fcsX -a tme=yes
Note: If you booted from the Fibre Channel adapter, add the -P flag. The target mode enabled (tme) attribute is needed for the FC adapter to be supported.
- Run the following command:
chdev -l fscsiX -a dyntrk=yes -a fc_err_recov=fast_fail
- Run the cfgmgr command.
Note: If you booted from the Fibre Channel adapter and used the -P flag, you must reboot.
- Verify the configuration changes by running the following command:
lsdev -C | grep sfwcom
Example output:
sfwcomm0 Available 01-00-02-FF Fiber Channel Storage Framework Comm
sfwcomm1 Available 01-01-02-FF Fiber Channel Storage Framework Comm
The above procedure has to be performed on all nodes of the cluster.
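Before proceeding, the tme setting can be verified on each node, for example (fcs0 is a hypothetical adapter name):
# lsattr -El fcs0 -a tme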
The cluster repository disk is used as the central
repository for the cluster configuration data. A disk size of 10 GB
is recommended. The minimum is 1 GB.
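A candidate disk's size (in MB) can be checked before selecting it as the repository, for example (hdisk1 is hypothetical):
# bootinfo -s hdisk1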
The following commands can be used for creating
and managing the cluster:
lscluster Used to list cluster configuration information.
mkcluster Used to create a cluster.
chcluster Used to change a cluster configuration.
rmcluster Used to remove a cluster configuration.
clcmd Used to distribute a command to a set of nodes that are members of a cluster.
See the man pages for the different options.
Create the cluster:
Create the cluster with mkcluster
command:
# mkcluster -n mycluster -m nodeA,nodeB -r hdisk1 -d hdisk2,hdisk3
mkcluster: Cluster shared disks are automatically renamed to names such as
cldisk1, [cldisk2, ...] on all cluster nodes. However, this cannot take place while a disk is busy or on a node which is down or not reachable. If any disks cannot be renamed now, they will be renamed later by the clconfd daemon, when the node is available and the disks are not busy.
The name of the cluster is mycluster,
the nodes are nodeA and nodeB, the
repository disk is hdisk1 and the shared disks are
hdisk2 and hdisk3. Note that
repository disk and shared disks will be automatically renamed as
caa_private0 and cldisk1 and
cldisk2 respectively. These names will be the same on both nodes, regardless of their initial hdisk numbers (which could be different on each node).
Before mkcluster command:
# lspv
hdisk0 0050187a43833dc5 rootvg active
hdisk1 0050187a8de3af7d None active
hdisk2 0050187a70d8c6cf None
hdisk3 none None
After mkcluster command:
# lspv
hdisk0 0050187a43833dc5 rootvg active
caa_private0 0050187a8de3af7d caavg_private active
cldisk1 0050187a70d8c6cf None
cldisk2 none None
When the cluster is ready a special volume group
(caavg_private), new logical volumes and filesystems
are created.
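The new logical volumes and file systems on the repository disk can be inspected with standard LVM commands, for example (exact names vary by AIX level):
# lsvg caavg_private
# lsvg -l caavg_private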
When you create a cluster with the mkcluster command, the following actions are performed (taken from http://pic.dhe.ibm.com/infocenter/aix/v7r1/index.jsp?topic=%2Fcom.ibm.aix.clusteraware%2Fclaware_architecture.htm):
- The cluster is created using the mkcluster command.
- The cluster configuration is written to the raw section of the cluster repository disk.
- Special volume groups and logical volumes are created on the cluster repository disk.
- Cluster file systems are created on the special volume group.
- Cluster services are made available to other functions in the operating system, such as Reliable Scalable Cluster Technology (RSCT) and PowerHA SystemMirror.
- Storage framework register lists are created on the cluster repository disk.
- A global device namespace is created and interaction with LVM starts for handling associated volume group events.
- A clusterwide multicast address is established.
- The node discovers all of the available communication interfaces.
- The cluster interface monitoring starts.
- The cluster interacts with Autonomic Health Advisory File System (AHAFS) for clusterwide event distribution.
- The cluster exports cluster messaging and cluster socket services to other functions in the operating system, such as Reliable Scalable Cluster Technology (RSCT) and PowerHA SystemMirror.
To check the status of the cluster use lscluster
with the following options:
-c Lists the cluster configuration.
-d Lists the cluster storage interfaces.
-i Lists the cluster configuration interfaces on the local node.
-m Lists the cluster node configuration information.
To check if the cluster is operating properly
execute any clusterwide command:
#clcmd date
-------------------------------
NODE nodeA
-------------------------------
Wed Jun 6 11:19:44 EEST 2012
-------------------------------
NODE nodeB
-------------------------------
Wed Jun 6 11:19:44 EEST 2012
To remove the cluster just type:
# rmcluster -n mycluster
rmcluster: Removed cluster shared disks are automatically renamed to names such
as hdisk10, [hdisk11, ...] on all cluster nodes. However, this cannot
take place while a disk is busy or on a node which is down or not
reachable. If any disks cannot be renamed now, you must manually
rename them by removing them from the ODM database and then running
the cfgmgr command to recreate them with default names. For example:
rmdev -l cldisk1 -d
rmdev -l cldisk2 -d
cfgmgr
Cluster Aware AIX makes it very easy to create a cluster with a minimal set of commands and little user intervention. In our opinion, one of its best features is the common disk names used on all the participating nodes in the cluster.
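For example, the consistent naming can be verified across all nodes with clcmd (using the cluster created above):
# clcmd lspv | grep -E "cldisk|caa_private"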
Examples
- To list the cluster configuration for all nodes, enter:
lscluster -m
Sample output follows:
# lscluster -m
Calling node query for all nodes...
Node query number of nodes examined: 2
Node name: nodeA.ibm.com
Cluster shorthand id for node: 1
uuid for node: 84088524-b124-11e3-8210-32c8e74b1e02
State of node: UP NODE_LOCAL
Smoothed rtt to node: 0
Mean Deviation in network rtt to node: 0
Number of clusters node is a member in: 1
CLUSTER NAME TYPE SHID UUID
Sample local 84ee37f4-b124-11e3-8210-32c8e74b1e02
Number of points_of_contact for node: 0
Point-of-contact interface & contact state
n/a
------------------------------
Node name: nodeB.ibm.com
Cluster shorthand id for node: 2
uuid for node: 8492a5a6-b124-11e3-8210-32c8e74b1e02
State of node: UP
Smoothed rtt to node: 70
Mean Deviation in network rtt to node: 82
Number of clusters node is a member in: 1
CLUSTER NAME TYPE SHID UUID
Sample local 84ee37f4-b124-11e3-8210-32c8e74b1e02
Number of points_of_contact for node: 2
Point-of-contact interface & contact state
dpcom UP RESTRICTED
en0 UP
- To list the cluster network statistics for the local node, enter:
lscluster -s
Sample output follows:
# lscluster -s
Cluster Network Statistics:
pkts seen: 33861217 passed: 32052241
IP pkts: 5778096 UDP pkts: 1934943
gossip pkts sent: 1463320 gossip pkts recv: 688759
cluster address pkts: 0 CP pkts: 1808962
bad transmits: 5 bad posts: 4
Bad transmit (overflow - disk ): 0
Bad transmit (overflow - tcpsock): 0
Bad transmit (host unreachable): 0
Bad transmit (net unreachable): 0
Bad transmit (network down): 0
Bad transmit (no connection): 0
short pkts: 0 multicast pkts: 1808880
cluster wide errors: 0 bad pkts: 0
dup pkts: 0 dropped pkts: 14
pkt fragments: 1 fragments queued: 0
fragments freed: 0
pkts pulled: 0 no memory: 0
rxmit requests recv: 10 requests found: 3
requests missed: 7 ooo pkts: 0
requests reset sent: 7 reset recv: 0
remote tcpsock send: 0 tcpsock recv: 0
rxmit requests sent: 0
alive pkts sent: 0 alive pkts recv: 0
ahafs pkts sent: 2 ahafs pkts recv: 0
nodedown pkts sent: 0 nodedown pkts recv: 1
socket pkts sent: 62 socket pkts recv: 54
cwide pkts sent: 275321 cwide pkts recv: 275318
socket pkts no space: 0 pkts recv notforhere: 0
Pseudo socket pkts sent: 0 Pseudo socket pkts recv: 0
Pseudo socket pkts dropped: 0
arp pkts sent: 1 arp pkts recv: 2
stale pkts recv: 0 other cluster pkts: 4
storage pkts sent: 1 storage pkts recv: 1
disk pkts sent: 174 disk pkts recv: 0
unicast pkts sent: 275364 unicast pkts recv: 82
out-of-range pkts recv: 0
IPv6 pkts sent: 0 IPv6 pkts recv: 122
IPv6 frags sent: 0 IPv6 frags recv: 0
Unhandled large pkts: 0
mrxmit overflow : 0 urxmit overflow: 0
- To list the interface information for the local node, enter:
lscluster -i
Sample output follows:
# lscluster -i
Network/Storage Interface Query
Cluster Name: Sample
Cluster uuid: 84ee37f4-b124-11e3-8210-32c8e74b1e02
Number of nodes reporting = 2
Number of nodes expected = 2
Node nodeA.ibm.com
Node uuid = 84088524-b124-11e3-8210-32c8e74b1e02
Number of interfaces discovered = 2
Interface number 1 en0
ifnet type = 6 ndd type = 7
Mac address length = 6
Mac address = 32:C8:E7:4B:1E:02
Smoothed rrt across interface = 0
Mean Deviation in network rrt across interface = 0
Probe interval for interface = 100 ms
ifnet flags for interface = 0x1E080863
ndd flags for interface = 0x0021081B
Interface state UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 9.3.199.216 broadcast 9.3.199.255 netmask 255.255.254.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.3.199.216 broadcast 0.0.0.0 netmask 0.0.0.0
Interface number 2 dpcom
ifnet type = 0 ndd type = 305
Mac address length = 0
Mac address = 00:00:00:00:00:00
Smoothed rrt across interface = 750
Mean Deviation in network rrt across interface = 1500
Probe interval for interface = 22500 ms
ifnet flags for interface = 0x00000000
ndd flags for interface = 0x00000009
Interface state UP RESTRICTED AIX_CONTROLLED
Pseudo Interface
Interface State DOWN
Node nodeB.ibm.com
Node uuid = 8492a5a6-b124-11e3-8210-32c8e74b1e02
Number of interfaces discovered = 2
Interface number 1 en0
ifnet type = 6 ndd type = 7
Mac address length = 6
Mac address = 32:C8:EF:AD:7C:02
Smoothed rrt across interface = 0
Mean Deviation in network rrt across interface = 0
Probe interval for interface = 990 ms
ifnet flags for interface = 0x1E084863
ndd flags for interface = 0x0021081B
Interface state UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 9.3.199.128 broadcast 9.3.199.255 netmask 255.255.254.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.3.199.216 broadcast 0.0.0.0 netmask 0.0.0.0
Interface number 2 dpcom
ifnet type = 0 ndd type = 305
Mac address length = 0
Mac address = 00:00:00:00:00:00
Smoothed rrt across interface = 750
Mean Deviation in network rrt across interface = 1500
Probe interval for interface = 22500 ms
ifnet flags for interface = 0x00000000
ndd flags for interface = 0x00000009
Interface state UP RESTRICTED AIX_CONTROLLED
Pseudo Interface
Interface State DOWN
- To list the storage interface information for the cluster, enter:
lscluster -d
Sample output follows:
# lscluster -d
Storage Interface Query
Cluster Name: Sample
Cluster uuid: 84ee37f4-b124-11e3-8210-32c8e74b1e02
Number of nodes reporting = 2
Number of nodes expected = 2
Node nodeA.ibm.com
Node uuid = 84088524-b124-11e3-8210-32c8e74b1e02
Number of disk discovered = 1
hdisk4
state : UP
uDid :
uUid : 76c94719-7335-ded6-10e2-77d61ff7998c
type : REPDISK
Node nodeB.ibm.com
Node uuid = 8492a5a6-b124-11e3-8210-32c8e74b1e02
Number of disk discovered = 1
hdisk0
state : UP
uDid : 382300c4f4f700004c0000000140799c6e39.3105VDASD03AIXvscsi
uUid : 76c94719-7335-ded6-10e2-77d61ff7998c
type : REPDISK
- To list the cluster configuration, enter:
lscluster -c
Sample output follows:
# lscluster -c
Cluster Name: Sample
Cluster UUID: 8e1d89da-b39d-11e3-91e7-d24dc2d9d309
Number of nodes in cluster = 2
Cluster ID for node nodeA.ibm.com: 1
Primary IP address for node r5r3m25.aus.stglabs.ibm.com: 9.3.207.132
Cluster ID for node nodeB.ibm.com: 2
Primary IP address for node r5r3m26.aus.stglabs.ibm.com: 9.3.207.218
Number of disks in cluster = 1
Disk = hdisk6 UUID = 57208624-fda4-d404-a7c0-8e425e2941a4 cluster_major = 0 cluster_minor = 1
Multicast for site LOCAL: IPv4 228.3.207.132 IPv6 ff05::e403:cf84
Communication Mode: multicast
Local node maximum capabilities: HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: HNAME_CHG, UNICAST, IPV6, SITE
CAA (Cluster Aware AIX)
CAA is an AIX feature that gives the AIX kernel the capability to provide specific cluster services, such as heartbeating and node monitoring. Besides these, using Cluster Aware AIX you can easily create a cluster of AIX nodes. CAA does not replace PowerHA; it provides several services for PowerHA. PowerHA 7.1 and RSCT use the built-in AIX clustering capabilities, which simplifies the configuration and management of the cluster.
CAA needs the following ports on all nodes for network communication:
4098 (for multicast)
6181
16191
42112
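A simple way to check whether these ports are defined and in use on a node (a rough check only; exact output depends on which services are running):
# grep -E "caa|clcomd" /etc/services
# netstat -an | grep -E "4098|6181|16191|42112"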
These CAA commands can be used for managing clusters:
lscluster list cluster configuration information
-c cluster configuration
-d disk (storage) configuration
-i interfaces configuration
-m node configuration
mkcluster create a cluster
chcluster change a cluster configuration
rmcluster remove a cluster configuration
clcmd run a command on all nodes of a cluster
----------------
PowerHA uses a shared disk to store Cluster Aware AIX (CAA) information. At least a 512 MB (and no more than 460 GB) shared disk is needed for this cluster repository disk. (This disk cannot be used for application storage or any other purpose.)
CAA stores the repository disk related information in the ODM CuAt, as part of the cluster information.
# odmget CuAt | grep -p cluster
CuAt:
name = "cluster0"
attribute = "node_uuid"
value = "52a6b8be-fff8-11e5-8e37-56a1a7627864"
type = "R"
generic = "DU"
rep = "s"
nls_index = 3
CuAt:
name = "cluster0"
attribute = "clvdisk"
value = "d7063c81-3f64-b5f7-d82b-fa8ed99bfe61"
type = "R"
generic = "DU"
rep = "s"
nls_index = 2
If this ODM entry is missing (which can cause a node to fail to join the cluster), it can be repopulated (and the node forced to join the cluster) using the clusterconf command:
clusterconf -r hdiskX    # hdiskX is the repository disk
Important daemons related to PowerHA
- To display the status of all PowerHA SystemMirror and RSCT subsystems, enter:
clshowsrv -v
The command displays output similar to the following example:
Local node: "hadev11" ("hadev11.aus.stglabs.ibm.com", "hadev11.aus.stglabs.ibm.com")
Cluster services status: "OFFLINE" ("ST_INIT")
Remote communications: "UP"
Cluster-Aware AIX status: "UP"
Remote node: "hadev12" ("hadev12.aus.stglabs.ibm.com", "hadev12")
Cluster services status: "OFFLINE" ("ST_INIT")
Remote communications: "UP"
Cluster-Aware AIX status: "UP"
Status of the RSCT subsystems used by PowerHA SystemMirror:
Subsystem Group PID Status
cthags cthags 9371848 active
ctrmc rsct 11862036 active
Status of the PowerHA SystemMirror subsystems:
Subsystem Group PID Status
clstrmgrES cluster 12124406 active
Status of the CAA subsystems:
Subsystem Group PID Status
clconfd caa 10420354 active
clcomd caa 8912916 active
cthags (cluster high availability group services subsystem):
The cthags subsystem is associated with the hagsd daemon. The cthagsctrl control command controls the operation of the group services subsystem (cthags) under the control of the System Resource Controller (SRC).
ctrmc (cluster resource monitoring and control):
A subsystem component of RSCT. RMC services monitor and manage individual servers or nodes in a cluster. AIX also uses it to talk to products such as CSM, the HMC, and PowerHA.
Even though it is not absolutely necessary on a
standalone system, it logs informative messages into errpt. CTRMC
daemons must be active in order for DLPAR operations to work.
This service is started from /etc/inittab:
ctrmc:2:once:/usr/bin/startsrc -s ctrmc > /dev/console 2>&1
Note: In order for RMC to work, port 657 TCP/UDP
must be open in both directions between the HMC public interface and
the LPARs.
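For example, a quick check on an LPAR (this only shows the local RMC subsystem state and local sockets, not full HMC connectivity):
# lssrc -s ctrmc
# netstat -an | grep -w 657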
clstrmgrES (cluster manager daemon):
Shows whether the cluster is STABLE or not, the cluster version, and the Dynamic Node Priority values (pgspace free, disk busy, cpu idle).
This is the main PowerHA daemon. It maintains updated information about the health of the cluster and the status of the resource groups and interfaces. After PowerHA is installed, clstrmgrES is started from inittab and is always running, whether cluster services are started or not.
lssrc -ls clstrmgrES displays a lot of useful information about the cluster.
The daemon runs on each cluster node. It uses services provided by the RSCT subsystems to monitor the status of the nodes and their interfaces. It receives information from Topology Services and uses Group Services for inter-node communication. It invokes the appropriate scripts in response to node or network events (recovering from SW/HW failures, requests to bring a node online or offline, requests to move/online/offline a resource group), and it maintains updated information about the resource groups (status, location).
If clstrmgr hangs or is terminated, the default action taken by SRC is to issue halt -q, causing the system to crash. clstrmgr depends on RSCT; if topsvcs or grpsvcs has problems starting, clstrmgr will not start either.
root@nzapdb232 / > # lssrc -ls clstrmgrES
Current state: ST_STABLE
sccsid = "@(#)36 1.135.1.118 src/43haes/usr/sbin/cluster/hacmprd/main.C,hacmp.pe,61haes_r713,1343A_hacmp713 10/21/"
build = "Oct 27 2014 16:03:01 1433C_hacmp713"
i_local_nodeid 0, i_local_siteid -1, my_handle 2
ml_idx[1]=1
ml_idx[2]=0
There are 0 events on the Ibcast queue
There are 0 events on the RM Ibcast queue
CLversion: 15
local node vrmf is 7132
cluster fix level is "2"
The following timer(s) are currently active:
Current DNP values
DNP Values for NodeId - 2 NodeName - nzapdb232
PgSpFree = 9878917 PvPctBusy = 0 PctTotalTimeIdle = 79.272111
DNP Values for NodeId - 1 NodeName - nzapdb248
PgSpFree = 9918426 PvPctBusy = 0 PctTotalTimeIdle = 90.611423
CAA Cluster Capabilities
CAA Cluster services are active
There are 4 capabilities
Capability 0
id: 3 version: 1 flag: 1
Hostname Change capability is defined and globally available
Capability 1
id: 2 version: 1 flag: 1
Unicast capability is defined and globally available
Capability 2
id: 0 version: 1 flag: 1
IPV6 capability is defined and globally available
Capability 3
id: 1 version: 1 flag: 1
Site capability is defined and globally available
trcOn 0, kTraceOn 0, stopTraceOnExit 0, cdNodeOn 0
Last event run was X_RE_ST_CH_CO on node 2
clconfd (cluster configuration daemon):
This keeps the CAA cluster configuration information in sync. It wakes up every 10 minutes to synchronize any necessary cluster changes.
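Its state can be checked under SRC, for example:
# lssrc -s clconfd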
clcomd (cluster communications daemon):
All cluster communication goes through clcomd. It must be running before any cluster services can be started. The trusted IP addresses are stored in the /etc/cluster/rhosts file (owned by root.system, mode 0600). Nodes with an empty or missing rhosts file will refuse all PowerHA related communication. The real use of the /etc/cluster/rhosts file is before the cluster is first synchronized; after the first synchronization the ODM classes are populated and the rhosts file is no longer used (except when adding a new node to the cluster).
Clcomd is started via an /etc/inittab entry, which is created during PowerHA installation. Clcomd is managed by SRC (startsrc, stopsrc, refresh; refresh is useful to reread the /etc/cluster/rhosts file), and logs are in /var/hacmp/clcomd/clcomd.log. It uses port 6191.
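For example, to populate the file and make clcomd reread it (nodeA and nodeB are hypothetical host names; one name or IP per line):
# cat /etc/cluster/rhosts
nodeA
nodeB
# refresh -s clcomd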
Clcomd is used for these tasks:
- verification and synchronization
- global ODM changes and remote command execution (commands listed under /usr/es/sbin/cluster)
- File collections
- C-SPOC and User and password administration (c-spoc commands are in /usr/es/sbin/cluster/cspoc)
clinfoES (cluster information program daemon):
Clinfo obtains updated cluster information from the Cluster Manager. It makes information about the state of the cluster, nodes, networks and applications available. It is used by clstat, and it is optional on cluster nodes and clients.
startsrc -s clinfoES starts clinfo (the /usr/es/sbin/cluster/etc/rc.cluster script also starts everything)
stopsrc -s clinfoES stops clinfo
C-SPOC (Cluster Single Point of Control):
It helps manage the entire cluster from a single point, either through smitty hacmp or with the commands under /usr/es/sbin/cluster/cspoc.
C-SPOC uses clcomd for HACMP communication between nodes, so the /etc/rhosts file is no longer used.
If a C-SPOC function fails, it is logged in /tmp/cspoc.log on the node performing the operation.
cspoc.log also contains the commands that were used.
HACMP start-up:
When PowerHA is installed it creates this entry:
hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init >/dev/console 2>&1    <-- it starts syslogd, snmpd, clcomdES, clstrmgrES
If PowerHA is configured for IP Address Takeover:
harc:2:wait:/usr/es/sbin/cluster/etc/harc.net # HACMP for AIX network startup
When the "start at system restart" option is chosen in C-SPOC (Manage HACMP services):
hacmp6000:2:wait:/usr/es/sbin/cluster/etc/rc.cluster -boot -A # Bring up Cluster    <-- do not use this option, manual control is better