High Availability Cluster Plugin User Guide
v2.5x    3000-hac-v2.5-000023-B
Copyright © 2012 Nexenta® Systems, ALL RIGHTS RESERVED

Notice: No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or stored in a database or retrieval system for any purpose, without the express written permission of Nexenta Systems (hereinafter referred to as "Nexenta").

Nexenta reserves the right to make changes to this documentation at any time without notice and assumes no responsibility for its use. Nexenta products and services can be ordered only under the terms and conditions of Nexenta Systems' applicable agreements. Not all of the features described in this documentation may be currently available. Refer to the latest product announcement or contact your local Nexenta Systems sales office for information on feature and product availability. This documentation includes the latest information available at the time of publication.

Nexenta is a registered trademark of Nexenta Systems in the United States and other countries. All other trademarks, service marks, and company names in this documentation are properties of their respective owners.

Product Versions Applicable to this Documentation:
  Product: NexentaStor    Versions supported: NexentaStor v3.x

Contents

  Preface
  1  Introduction
  2  Installation & Setup
  3  Configuring the HA Cluster
  4  Configuring the Cluster's Shared Volumes
  5  Heartbeat and Network Interfaces
  6  Configuring Storage Failover
  7  Advanced Setup
  8  System Operations
  9  Testing and Troubleshooting
Preface

This documentation presents information specific to Nexenta Systems, Inc. products. The information is for reference purposes and is subject to change.

Intended Audience

The information in this document is confidential and may not be disclosed to any third parties without the prior written consent of Nexenta Systems, Inc. This documentation is intended for Network Storage Administrators. It assumes that you have experience with NexentaStor and with data storage concepts, such as NAS, SAN, NFS, and ZFS.

Documentation History

The following table lists the released revisions of this documentation.

Table 1: Documentation Revision History
  Revision: 3000-hac-v2.5-000023-B    Date: December, 2012    Description: GA

Contacting Support

Visit the Nexenta Systems, Inc. customer portal at http://www.nexenta.com/corp/support/support-overview/account-management. Log in and browse the customer knowledge base.

Choose a method for contacting support:
• Using the NexentaStor user interface, NMV (Nexenta Management View):
  a. Click Support.
  b. Complete the request form.
  c. Click Send Request.
• Using the NexentaStor command line, NMC (Nexenta Management Console):
  a. At the command line, type support.
  b. Complete the support wizard.

Comments

Your comments and suggestions to improve this documentation are greatly appreciated. Send any feedback to [email protected] and include the documentation title, number, and revision. Refer to specific pages, sections, and paragraphs whenever possible.

1  Introduction

This section includes the following topics:
• About Introduction
• Product Features
• Server Monitoring and Failover
• Storage Failover
• Exclusive Access to Storage
• Service Failover
• Additional Resources

About Introduction

The following section explains high-availability and failover concepts.

Product Features

The Nexenta High Availability Cluster (HAC) plugin provides a storage volume-sharing service. It makes one or more shared volumes highly available by detecting system failures and transferring ownership of the shared volumes to the other server in the cluster pair.

An HA Cluster consists of two NexentaStor appliances. Neither system is designated as the primary or secondary system. You can manage both systems actively for shared storage, although only one system owns each volume at a time.

HA Cluster is based on RSF-1 (Resilient Server Facility), an industry-leading high-availability and cluster middleware application that ensures critical applications and services continue to run in the event of system failures.

Server Monitoring and Failover

HA Cluster provides server monitoring and failover. Protection of services, such as iSCSI, involves cooperation with other modules such as the SCSI Target plugin. You can execute NMC commands on all appliances in the group.

An HA Cluster consists of:
• NexentaStor Appliances — Run a defined set of services and monitor each other for failures. HAC connects these NexentaStor appliances through various communication channels, through which they exchange heartbeats that provide information about their states and the services that run on them.
• RSF-1 Cluster Service — A transferable unit that consists of:
  • Application start-up and shutdown code
  • Network identity and appliance data

You can migrate services between cluster appliances manually, or automatically if one appliance fails.

To view the existing groups of appliances, using NMC:
  Type: nmc:/$ show group

To view the existing groups of appliances, using NMV:
  1. Click Status > General.
  2. In the Appliances panel, click <host_name>.

Storage Failover

The primary benefit of HA Cluster is to detect storage system failures and transfer ownership of shared volumes to the alternate NexentaStor appliance. All configured services fail over to the other server. HA Cluster ensures service continuity during exceptional events, including power outages, disk failures, appliances that run out of memory or crash, and other failures.

Currently, the minimum time to detect that an appliance has failed is approximately 10 seconds. The failover and recovery time is largely dependent on the amount of time it takes to re-import the data volume on the alternate appliance. Best practices to reduce the failover time include using fewer zvols and file systems for each data volume. When using fewer file systems, you may want to use other properties, such as reservations and quotas, to control resource contention between multiple applications.

In the default configuration, HA Cluster fails over storage services if network connectivity is lost. HA Cluster automatically determines which network device to monitor based on the services that are bound to an interface. It checks all nodes in the cluster, so even if a node is not running any services, HA Cluster continues to monitor the unused interfaces. If the state of one changes to offline, it prevents failover to this node for services that are bound to that interface. When the interface recovers, HA Cluster enables failover for that interface again. Other types of failure protection include link aggregation for network interfaces and MPxIO for protection against SAS link failures.

Note that HA Cluster does not fail over configuration that is local to an appliance. For example, it does not move local users that were configured for the NexentaStor appliance. Nexenta highly recommends that you use a directory service such as LDAP in that case.

Exclusive Access to Storage

You access a shared volume exclusively through the appliance that currently provides the corresponding volume-sharing service. To ensure this exclusivity, HA Cluster provides reliable fencing through the use of multiple types of heartbeats. Fencing is the process of isolating a node in an HA cluster, and/or protecting shared resources when a node malfunctions. Heartbeats, or pinging, allow for constant communication between the servers. The most important of these is the disk heartbeat, used in conjunction with any other type. Generally, additional heartbeat mechanisms increase the reliability of the cluster's fencing logic; the disk heartbeats, however, are essential.

HA Cluster can reboot the failed appliance in certain cases, for example on failure to export the shared volume from an appliance that has failed to provide the volume-sharing service. This functionality is analogous to STONITH, the technique for fencing in computer clusters.

In addition, the NexentaStor RSF-1 cluster provides a number of other fail-safe mechanisms:
• When you start a volume-sharing service, the IP address associated with that service must NOT already be attached to any interface. The cluster automatically detects and reports if an interface is using the IP address; if it is, the local service does not perform the start-up sequence. (A quick manual check follows this list.)
• On disk systems that support SCSI reservations, the cluster can place a reservation on a disk before accessing its file systems, and can be set to panic if it loses the reservation. This feature also serves to protect the data on such a disk system. HA Cluster also supports SCSI-2 reservations.
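As a quick manual check before starting a volume-sharing service, you can list the plumbed interfaces from a root shell on either node and confirm that the service's failover address is not already in use; the address below is a placeholder taken from the examples later in this guide.

  # ifconfig -a | grep 172.16.3.22

If the address appears in the output, it is already attached to an interface on that node and should be released before the service is started. The cluster performs the equivalent check automatically, as described above.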
SCSI-2 PGR for Additional Protection

HA Cluster employs SCSI-2 PGR (persistent group reservations) for additional protection. PGR enables access to a device for multiple nodes while simultaneously blocking access for other nodes. You can enable PGR by issuing SCSI reservations on the devices in a volume before you import them. This feature enforces data integrity by preventing the pool from being imported on two nodes at any one time.

Nexenta recommends that you always deploy the HA Cluster with a shared disk (quorum device) and at least one or more heartbeat channels (Ethernet or serial). This configuration ensures that the cluster always has exclusive access that is independent of the storage interconnects used in the cluster.

Note: SCSI-3 PGR is not supported for HA Cluster because it does not work with SATA drives and has certain other limitations.

Service Failover

As discussed previously, system failures result in the failover of ownership of the shared volume to the alternate node. As part of the failover process, HA Cluster migrates the storage services that are associated with the shared volume(s) and restarts the services on the alternate node.

Additional Resources

Nexenta has various professional services offerings to assist with managing HA Cluster. Nexenta strongly encourages a services engagement to plan and install the plugin. Nexenta also offers training courses on high availability and other features.

For service and training offerings, go to our website at: http://www.nexenta.com
For troubleshooting cluster issues, contact: [email protected]
For licensing questions, email: [email protected]

HA Cluster is based on the Resilient Server Facility from HighAvailability.com. For additional information on cluster concepts and theory of operations, visit their website at: http://www.high-availability.com

For more advanced questions that are related to the product, check our FAQ for the latest information: http://www.nexenta.com/corp/frequently-asked-questions

2  Installation & Setup

This section includes the following topics:
• About Installation and Setup
• Prerequisites
• Adding Plugins
• Sample Network Architecture

About Installation and Setup

The HA Cluster plugin provides high-availability functionality for NexentaStor. You must install this software on each NexentaStor appliance in the cluster. This section describes how to set up and install HAC on both appliances.

Prerequisites

HA Cluster requires shared storage between the clustered NexentaStor appliances. You must also set up:
• One IP address for each cluster service unit (zvol, NAS folder, or iSCSI LUN)
• Multiple NICs (Ethernet cards) on different subnets for cluster heartbeat and NMV management (a good practice, but not mandatory)
• A DNS entry for each service name in the cluster

NexentaStor supports the use of a separate device as a transaction log for committed writes. HA Cluster requires that you make the ZFS Intent Log (ZIL) part of the same storage system as the shared volume.

Note: SCSI and iSCSI failover services use the SCSI Target plugin, which is included with the NexentaStor software.
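To confirm that a separate log device meets the ZIL requirement above, you can inspect the pool from a root shell on the node that currently owns it; the pool name below is a placeholder. The devices listed under the logs section of the output must reside on the shared storage, not on disks that only one appliance can see.

  # zpool status ha-test

The output lists the pool's data devices, any log (ZIL) devices, and any cache devices, along with their states.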
Adding Plugins

The HAC plugin installs just as any other NexentaStor plugin installs.

Installing Plugins

You can install plugins through both NMC and NMV.

To install a plugin, using NMC:
  1. Type: nmc:/$ setup plugin <plugin_name>
     Example: nmc:/$ setup plugin rsf-cluster
  2. Confirm or cancel the installation.
  3. Repeat Step 1 — Step 2 for the other node.

To install a plugin, using NMV:
  1. Click Settings > Appliance.
  2. In the Administration panel, click Plugins.
  3. Click Add Plugin in the Remotely-available plugins section.
  4. Confirm or cancel the installation.
  5. Repeat Step 1 — Step 4 for the other node.

Note: The plugins may not be immediately available from your NexentaStor repository. It can take up to six hours before the plugins become available.

Sample Network Architecture

The cluster hardware setup needs:
• Two x86/64-bit systems with a SAS-connected JBOD
• Two network interface cards (not mandatory, but good practice)

The following illustration is an example of an HA Cluster deployment in a Nexenta iSCSI environment. The host server attaches to iSCSI LUNs which are connected to the Nexenta appliances node A and node B. The Nexenta appliances use the Active/Passive function of the HA cluster. Node A services one group of iSCSI LUNs while node B presents a NAS storage LUN.

Figure 2-1: High Availability Configuration (diagram not reproduced in this transcription)

3  Configuring the HA Cluster

This section includes the following topics:
• About Configuring the HA Cluster
• Binding the Nodes Together with SSH
• Configuring the HA Cluster

About Configuring the HA Cluster

You can configure and manage the HA cluster through the appliance's web interface, the Nexenta Management View (NMV), or the Nexenta Management Console (NMC).

Binding the Nodes Together with SSH

You must bind the two HA nodes together with the SSH protocol so that they can communicate.

To bind the two nodes, using NMC:
  1. Type the following on node A: nmc:/$ setup network ssh-bind root@<IP_address_nodeB>
  2. When prompted, type the Super User password.
  3. Type the following on node B: nmc:/$ setup network ssh-bind root@<IP_address_nodeA>
  4. When prompted, type the Super User password.

Note: If ssh-binding fails, you can manually configure the /etc/hosts file, which contains the Internet host table. (Type setup appliance hosts to access the file.)
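For illustration, a complete binding session with hypothetical addresses (172.16.3.20 for node A and 172.16.3.21 for node B, matching the example addresses used later in Testing and Troubleshooting) looks like this; each command prompts for the Super User password of the remote node.

  On node A:  nmc:/$ setup network ssh-bind [email protected]
  On node B:  nmc:/$ setup network ssh-bind [email protected]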
Configuring the HA Cluster

You need to configure multiple options for the HA cluster before you can use it successfully.

Note: You cannot assign a NexentaStor appliance to more than one HA cluster.

To configure an HA cluster, using NMV:
  1. Select Settings > HA Cluster.
  2. Type the Admin name and password.
  3. Click Initialize.
  4. Type or change the Cluster name.
  5. Type a description (optional).
  6. Select the following parameters:
     • Enable Network Monitoring — The cluster monitors the network for nodes.
     • Configure Serial Heartbeat — The nodes exchange serial heartbeat packets through a dedicated RS232 serial link between the appliances, using a custom protocol.
  7. Click Configure.
  8. Repeat Step 1 — Step 7 for the second node.

To configure an HA cluster, using NMC:
  Type: nmc:/$ create group rsf-cluster
  Use the following options when prompted:
  • Group name: <cluster_name>
  • Appliances: nodeA, nodeB
  • Description: <write a description>
  • Heartbeat disk: <disk_name>
  • Enable inter-appliance heartbeat through primary interfaces?: Yes
  • Enable inter-appliance heartbeat through serial ports?: No

4  Configuring the Cluster's Shared Volumes

This section includes the following topics:
• About Configuring the Cluster's Shared Volumes
• Configuring the Cluster's Shared Volumes

About Configuring the Cluster's Shared Volumes

After setting up the HA cluster, you must create one or more shared volumes.

Configuring the Cluster's Shared Volumes

After cluster initialization, NexentaStor automatically redirects you to the adding volume services page. The shared logical hostname is a name associated with the failover IP interface that is moved to the alternate node as part of the failover.

Note: If you receive an error indicating that the shared logical hostname is not resolvable, see Resolving Name Conflicts.

Once the volume is shared, in the event of a system failure the volume remains accessible to users as long as one of the systems continues to run.

Importing the Current Node

Although the appliances can access all of the shared volumes, only the volumes that have been imported to the current appliance display in the list. If you want to create a new cluster service with a specific shared volume, you must import this volume to the current node.

To import the volume to the current node, using NMC:
  1. Type: setup group rsf-cluster <cluster_name> shared volume add
     System response: Scanning for volumes accessible from all appliances
  2. Validate the cluster interconnect and share the volume:
     • ...verify appliances interconnect: Yes
     • Initial timeout: 60
     • Standard timeout: 60

Adding a Virtual IP or Hostname

You can add virtual IPs/hostnames per volume service.

To add a virtual IP address, using NMC:
  1. Type: nmc:/$ setup group rsf-cluster <service_name> vips add
     System response: nmc:/$ VIP____
  2. Type the virtual IP address.

To add a virtual IP address, using NMV:
  1. In the Cluster Settings panel, click Advanced.
  2. Click Additional Virtual Hostnames.
  3. Click Add a new virtual hostname.
  4. Type values for the following:
     • Virtual Hostname
     • Netmask
     • Interface on node A
     • Interface on node B
  5. Click Add.
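Once the volume service is running, a quick way to confirm that the new virtual IP or hostname answers on the network is an ICMP ping from a client or from the other node; rsf-data below stands in for your shared logical hostname.

  $ ping rsf-data

If the volume service is stopped, the shared logical hostname is not present on the network and the ping fails; this is expected behavior rather than a fault.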
5  Heartbeat and Network Interfaces

This section includes the following topics:
• About Heartbeat and Network Interfaces
• Heartbeat Mechanism
• Configuring the Cluster and Heartbeat Interfaces
• Serial Link

About Heartbeat and Network Interfaces

NexentaStor appliances in the HA Cluster constantly monitor the states and status of the other appliances in the cluster through heartbeats. Because HA Cluster servers must determine that an appliance (a member of the cluster) has failed before taking over its services, you configure the cluster to use several communication channels through which to exchange heartbeats.

Heartbeat Mechanism

In Nexenta, the VDEV labels of devices in the shared volume perform the heartbeat function. If a shared volume consists of several disks, NexentaStor uses the VDEV labels of two of those disks for the heartbeat mechanism; you can specify which disks. Though the quorum disk option still remains in the configuration file, Nexenta recommends using the shared volume's labels. The heartbeat mechanism uses sectors 512 and 518 in the blank 8K space of the VDEV label on each of the shared disks.

The loss of all heartbeat channels represents a failure. If an appliance wrongly detects a failure, it may attempt to start a service that is already running on another server, leading to so-called "split brain" syndrome. This can result in confusion and data corruption. Multiple, redundant heartbeats prevent this from occurring.

HA Cluster supports the following types of heartbeat communication channels:
• Shared Disk/Quorum Device — A device accessible and writable from all appliances in the cluster, or the VDEV labels of the devices in the shared volume.
• Network Interfaces — Including configured interfaces, unconfigured interfaces, and link aggregates.
• Serial Links

If two NexentaStor appliances do not share any services, then they do not require direct heartbeats between them. However, each member of a cluster must transmit at least one heartbeat to propagate control and monitoring requests.

The heartbeat monitoring logic is defined by two parameters, X and Y, where:
• X equals the number of failed heartbeats the interface monitors before taking any action
• Y represents the number of active heartbeats an interface monitors before making it available again to the cluster.
The current heartbeat defaults are 3 and 2 heartbeats, respectively.

NexentaStor also provides protection for network interfaces through link aggregation. You can set up aggregated network interfaces using NMC or NMV (a command-line sketch appears at the end of this section, before Serial Link).

Configuring the Cluster and Heartbeat Interfaces

When you define the cluster, note that a NexentaStor appliance cannot belong to more than one cluster.

To define the HA cluster, using NMC:
  Type: nmc:/$ create group rsf-cluster
  System response:
    Group name        : cluster-example
    Appliances        : nodeA, nodeB
    Description       : some description
    Scanning for disks accessible from all appliances ...
    Heartbeat disk    : c2t4d0
    Enable inter-appliance heartbeat through dedicated heartbeats disk?  No
    Enable inter-appliance heartbeat through primary interfaces?  Yes
    Enable inter-appliance heartbeat through serial ports?  No
    Custom properties :
    Bringing up the cluster nodes, please wait ...
    Jun 20 12:18:39 nodeA RSF-1[23402]: [ID 702911 local0.alert] RSF-1 cold restart: All services stopped.
    RSF-1 cluster 'cluster-example' created. Initializing ..... done.

To configure the cluster, using NMV:
  1. Click Settings > HA Cluster > Initialize.
  2. Type in a Cluster Name and Description.
  3. Select the following options:
     • Enable Network Monitoring
     • Configure Serial Heartbeat
  4. Click Yes to create the initial configuration. Click OK.
  The cluster is initialized. You can add shared volumes to the cluster.

To change heartbeat properties, using NMC:
  Type: nmc:/$ setup group rsf-cluster <cluster_name> hb_properties
  Respond to the prompts:
  • Enable inter-appliance heartbeat through primary interfaces?: Yes
  • Enable inter-appliance heartbeat through serial ports?: No
  • Proceed: Yes

To add additional hostnames to a volume service:
  1. Click Advanced > Additional Virtual Hostnames.
  2. Click Add a new virtual hostname.
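Link aggregation, mentioned earlier in this chapter as additional protection for network interfaces, is configured at the operating-system level rather than by the HA Cluster plugin. The sketch below is illustrative only: it assumes root shell access and the dladm utility of the illumos-based kernel underlying NexentaStor, the exact dladm syntax varies between releases, the link and aggregate names are placeholders, and the NMC/NMV network setup wizards remain the supported way to do this.

  # dladm create-aggr -l e1000g0 -l e1000g1 aggr1
  # dladm show-aggr

The resulting aggregate can then be used like any other network interface for cluster services and heartbeats.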
Serial Link

HAC exchanges serial heartbeat packets through a dedicated RS232 serial link between any two appliances, using a custom protocol. To prevent routing problems affecting this type of heartbeat, do not use IP on this link.

The serial link requires:
• Spare RS232 serial ports on each HA Cluster server
• A crossover, or null modem, RS232 cable with an appropriate connector on each end

You can use null modem cables to connect pieces of Data Terminal Equipment (DTE) together or to attach console terminals. On each server, enable the relevant serial port devices but disable any login, modem, or printer services running on them. Do not use the serial port for any other purpose.

To configure serial port heartbeats:
  Type Yes to the following question during HA cluster group creation:
  Enable inter-appliance heartbeat through serial ports?

6  Configuring Storage Failover

This section includes the following topics:
• About Configuring Storage Failover
• Cluster Configuration Data
• Mapping Information
• NFS/CIFS Failover
• Configuring iSCSI Targets for Failover
• Configuring Fibre Channel Targets for Failover

About Configuring Storage Failover

HA Cluster detects storage system failures and transfers ownership of shared volumes to the alternate NexentaStor appliance. HA Cluster ensures service continuity in the presence of service-level exceptional events, including power outages, disk failures, an appliance running out of memory or crashing, and so on.

Cluster Configuration Data

When you configure SCSI targets in a cluster environment, make sure that the configurations and mappings are consistent across the cluster members. HAC automatically propagates all SCSI Target operations. However, if the alternate node is not available at the time of the configuration change, problems can occur. By default, the operation results in a warning to the user that the remote update failed. You can also set HA Cluster to synchronous mode; in this case, the action fails completely if the remote update fails.

To protect local configuration information that did not migrate, periodically save this configuration to a remote site (perhaps the alternate node) and then use NMC commands to restore it in the event of a failover.

To save the cluster configuration, using NMC:
  Type: setup application configuration -save

To restore the cluster configuration, using NMC:
  Type: setup application configuration -restore

Following are examples of using NMC commands to synchronize the cluster configuration if one of the nodes is not current. In the following examples, node A contains the latest configuration and node B needs updating.

Example: To run this command from node A:
  Type: nmc:/$ setup iscsi config restore-to nodeB
  The above command:
  • Saves the configuration of node A
  • Copies it to node B
  • Restores it on node B

Note: Restore operations are destructive. Only perform them during a planned downtime window.

Example: To run this command from node B:
  Type: nmc:/$ setup iscsi config restore-from nodeA

The restore command saves key configuration data that includes:
• Target groups
• Host groups (stmf.config)
• Targets
• Initiators
• Target portal groups (iscsi.conf)

If you use CHAP authentication, and you configured CHAP through NMC or NMV, then you can safely save and restore the configuration.
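One way to make the periodic configuration save recommended above routine is to schedule it rather than rely on memory. The sketch below is an assumption-laden illustration: it presumes that your NMC release supports non-interactive execution with nmc -c "<command>" and that a root crontab entry is acceptable on the appliance; adjust or drop it if either assumption does not hold.

  # root crontab entry: save the appliance configuration every Sunday at 02:00
  0 2 * * 0 nmc -c "setup application configuration -save"

Copy the saved configuration to the alternate node or another remote site afterwards, as recommended above.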
Mapping Information

Use SCSI Target to map zvols from the cluster nodes to client systems. It is critical that the cluster nodes contain the same mapping information. Mapping information is specific to the volume and is stored with the volume itself. You can perform manual maintenance tasks on the mapping information using the mapmgr command.

NFS/CIFS Failover

You can use HA Cluster to ensure the availability of NFS shares to users. However, note that HA Cluster does not detect the failure of the NFS server software.

Configuring iSCSI Targets for Failover

You can use HA Cluster to fail over iSCSI volumes from one cluster node to another. The target IQN moves as part of the failover. Setting up iSCSI failover involves setting up a zvol in the shared volume. Note that you perform the process of creating a zvol and sharing it through iSCSI separately from the HA Cluster configuration.

If you create iSCSI zvols before marking the zvols' volume as a shared cluster volume, active iSCSI sessions may experience some delays when you then share the cluster volume. Depending on the network, application environment, and active workload, you may also see command-level failures or disconnects during this period.

When you add a shared volume to a cluster that has zvols created as backing storage for iSCSI targets, it is vital that you configure all client iSCSI initiators, regardless of the operating system, to access those targets using the shared logical hostname that was specified when the volume service was created, rather than a real hostname associated with one of the appliances.

Note that the cluster manages all aspects of the shared logical hostname configuration. Therefore, do not configure the shared logical hostname manually. Furthermore, unless the shared volume service is running, the shared logical hostname is not present on the network; you can verify this with the ICMP ping command.

To configure iSCSI targets on the active appliance, using NMV:
  1. Click Data Management > SCSI Target > zvols.
  2. Create a virtual block device using the shared volume. Make the virtual block device >200MB.
     HAC automatically migrates the newly created zvol to the other appliance on failover. Therefore, you do not have to duplicate it manually.
  3. From the iSCSI pane, click iSCSI > Target Portal Groups and define a target portal group.

Note: It is critical that the IPv4 portal address is the shared logical hostname specified when the volume service was created, instead of a real hostname associated with one of the appliances.

HAC automatically replicates the newly created target portal group to the other appliance.

To create an iSCSI target and add it to the target portal group, using NMV:
  1. Click iSCSI > Targets.
  2. Type a name and an alias, and add the target to the target portal group. Adding the target to the target portal group limits zvol visibility from client initiators to that group.
  The newly created iSCSI target displays in the Targets page and is automatically replicated to the other appliance.

To create a LUN mapping to the zvol, using NMV:
  1. From the SCSI Target pane, click Mappings and create a LUN mapping to the zvol for use as backing storage for the iSCSI target.
     The newly created LUN mapping is automatically migrated to the other appliance on failover.
  2. On the client, configure the iSCSI initiator to use both the IQN of the iSCSI target you created and the shared logical hostname associated with both the volume service and the target portal group to access the zvol through iSCSI.
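As a concrete illustration of the client-side step, the commands below assume a Linux client running the open-iscsi initiator, with rsf-data standing in for the shared logical hostname and a placeholder IQN for the target; other initiators (Windows, VMware, Solaris) have equivalent settings.

  # discover targets through the shared logical hostname, never a node's real hostname
  iscsiadm -m discovery -t sendtargets -p rsf-data
  # log in to the discovered target
  iscsiadm -m node -T iqn.1986-03.com.sun:02:example-target -p rsf-data --login

Because the initiator only ever refers to the shared logical hostname, its sessions follow the volume service when it fails over to the other appliance.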
Failover time varies depending on the environment. As an example, when initiating failover for a pool containing six zvols, the observed failover time is 32 seconds. Nodes may stall while the failover occurs, but otherwise recover quickly.

See Also:
• "Managing SCSI Targets" in the NexentaStor User Guide

Configuring Fibre Channel Targets for Failover

As a prerequisite for configuring Fibre Channel targets, change the HBA port modes of both appliances from Initiator mode to Target mode.

To change the HBA port mode, using NMV:
  1. Click Data Management > SCSI Target Plus > Fibre Channel > Ports.
  2. Select Target from the dropdown menu.
  3. Once you change the HBA port modes of both appliances from Initiator mode to Target mode, reboot both appliances so the Target mode changes can take effect.

To configure Fibre Channel targets on the appliance, in NMV:
  1. Click Data Management > SCSI Target on the appliance where the volume service is currently running.
  2. In the zvols pane, click Create.
  3. Create a zvol with the following characteristics:
     • Virtual block device: 200 MB
     • Use the shared volume: Yes
     The newly created zvol is automatically migrated to the other appliance on failover.
  4. From the SCSI Target pane, click Mappings and create a LUN mapping to the zvol for use as backing storage for the target.
     The newly created LUN mapping is automatically migrated to the other appliance on failover.

7  Advanced Setup

This section includes the following topics:
• About Advanced Setup
• Setting Failover Mode
• Adding Additional Virtual Hostnames

About Advanced Setup

This section describes advanced functions of HA Cluster, such as setting the failover mode, adding virtual hostnames and volumes, and other miscellaneous options.

Setting Failover Mode

The failover mode defines whether or not an appliance attempts to start a service when the service is not running. There are separate failover mode settings for each appliance that can run a service. You can set the failover modes to automatic or manual.

In automatic mode, the appliance attempts to start the service when it detects that there is no available parallel appliance running it in the cluster. In manual mode, it does not attempt to start the service, but it generates warnings when the service is not available. If the appliance cannot obtain a definitive answer about the state of the service, or the service is not running anywhere else, the appropriate timeout must expire before you can take any action.

The primary service failover modes are typically set to automatic to ensure that an appliance starts its primary service(s) on boot-up. Note that putting a service into manual mode when the service is already running does not stop that service; it only prevents the service from starting on that appliance.

To set the failover mode to manual, using NMV:
  1. Click Advanced Setup > Cluster Operations > Set all Manual.
  2. Click Yes to confirm.
Note: Until you select Set All Services to manual, automatic, or stopped, the Restore and Clear Saved State buttons do not display.

Before HAC performs an operation, it saves the state of the services in the cluster, which you can later re-apply to the cluster using the Restore button. Once HAC restores the service state, HAC clears the saved state.

To set the failover mode to manual, using NMC:
  Type: nmc:/$ setup group rsf-cluster <cluster_name> sharedvolume <volume_name> manual

To set the failover mode to automatic, using NMV:
  1. Click Advanced Setup > Cluster Operations > Set all Automatic.
  2. Click Yes to confirm.

To set the failover mode to automatic, using NMC:
  Type: nmc:/$ setup group rsf-cluster <cluster_name> sharedvolume <volume_name> automatic

To stop all services in the cluster, using NMV:
  1. Click Stop All Services.
  2. Click Yes to confirm.

Adding Additional Virtual Hostnames

After you have initialized the cluster, you can add shared volumes and virtual hostnames to the cluster.

See Also:
• Adding a Virtual IP or Hostname

8  System Operations

This section includes the following topics:
• About System Operations
• Checking Status of Cluster
• Checking Cluster Failover Mode
• Failure Events
• Service Repair
• Replacing a Faulted Node
• Maintenance
• System Upgrades

About System Operations

There are a variety of commands and GUI screens to help you with daily cluster operations, including a set of cluster-specific commands that supplement NMC.

Checking Status of Cluster

You can check the status of the cluster at any time.

To see a list of available commands, using NMC:
  Type: help keyword cluster

Note: Although both cluster nodes can access a shared volume, the show volume command only shows the volume on the node that currently owns it.

You can use NMV to check the overall cluster status.

To check the status of the cluster, using NMV:
  Click Settings > HA Cluster.

To check the status of the cluster, using NMC:
  Type: show group rsf-cluster

Checking Cluster Failover Mode

You can configure HA Cluster to detect failures and alert the user, or to fail over the shared volumes automatically.

To check the current appliance configuration mode:
  Type: nmc:/$ show group rsf-cluster <cluster_name>
  System response:
    PROPERTY          : VALUE
    name              : HA-Cluster
    appliances        : [nodeB nodeA]
    hbipifs           : nodeB:nodeA: nodeA:nodeB:
    netmon            : 1
    info              : Nexenta HA Cluster
    generation        : 1
    refresh_timestamp : 1297423688.31731
    hbdisks           : nodeA:c2t1d0 nodeB:c2t0d0
    type              : rsf-cluster
    creation          : Feb 11 03:28:08 2011

    SHARED VOLUME: ha-test
    svc-ha-test-ipdevs          : rsf-data nodeB:e1000g1 nodeA:e1000g1
    svc-ha-test-main-node       : nodeA
    svc-ha-test-shared-vol-name : ha-test

    HA CLUSTER STATUS: HA-Cluster
    nodeA: ha-test running auto unblocked rsf-data e1000g1 60 60
    nodeB: ha-test stopped auto unblocked rsf-data e1000g1 60 60

Failure Events

NexentaStor tracks various appliance components and their state. If and when failover occurs (or any service changes to a broken state), NexentaStor sends an email to the administrator describing the event.

Note: During the NexentaStor installation, you set up the SMTP configuration and test that you can receive emails from the appliance.
Service Repair

There are two broken states:
• Broken_Safe — A problem occurred while starting the service on the server, but it was stopped safely and you can run it elsewhere.
• Broken_Unsafe — A fatal problem occurred while starting or stopping the service on the server. The service cannot run on any other server in the cluster until it is repaired.

To repair a shared volume which is in a broken state, using NMC:
  nmc:/$ setup group rsf-cluster shared-volume repair
  This initiates and runs the repair process.

To mark a service as repaired, using NMV:
  1. Click Settings > HA Cluster.
  2. In the Action column, set the action to repaired.
  3. Click Confirm.

Replacing a Faulted Node

NexentaStor provides advanced capabilities to restore a node in a cluster in case its state changes to out of service. There is no need to delete the cluster group on the other node and reconfigure it and all of the cluster services.

To replace a faulted node, using NMC:
  nmc:/$ setup group rsf-cluster <group_name> replace_node

After executing the command, the system asks you to choose which node to exclude from the cluster and which to use instead. The system checks the host parameters of the new node and, if they match the requirements of the cluster group, replaces the old one.

Note: Before performing a replace-node operation, you must set up the identical configuration on the new or restored hardware, which HAC uses to replace the old faulted node. Otherwise, the operation fails. Make the serial port heartbeat configuration the same, as well.

Maintenance

You can stop HAC from triggering a failover so you can perform maintenance.

To stop HAC from triggering a failover, using NMC:
  Type: nmc:/$ setup group rsf-cluster <cluster_name> sharedvolume <volume_name> manual

System Upgrades

Occasionally, you may need to upgrade the NexentaStor software on the appliance. Since this may require a reboot, manage it carefully in a cluster environment. HAC reminds the user that the cluster service is not available during the upgrade.

Upgrade Procedure

When upgrading, you have an active and a passive node.

To upgrade both nodes, using NMC:
  1. Log in to the passive node and type:
     nmc:/$ setup appliance upgrade
  2. After the upgrade successfully finishes, log in to the active node and type the following to fail over to the passive node:
     nmc:/$ setup group rsf-cluster <group_name> <shared_volume_name> <passive_node_name> failover
  3. After the failover finishes, the nodes swap: the active node becomes the passive node and vice versa. Type the following command on the current passive node:
     nmc:/$ setup appliance upgrade
  4. Type the following to run the failover command on the current active node and thereby make it passive again:
     nmc:/$ setup group rsf-cluster <group name> <shared volume name> <active_node_name> failover
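After both nodes have been upgraded and failed back, it is worth confirming that they report the same software version before resuming normal operation. One way to do this is with NMC's show appliance version command, run on each node so the outputs can be compared; the exact sub-command name is an assumption here, and any appliance-version display in NMV serves the same purpose.

  nmc:/$ show appliance version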
9  Testing and Troubleshooting

This section includes the following topics:
• About Testing and Troubleshooting
• Resolving Name Conflicts
• Specifying Cache Devices
• Manually Triggering a Failover
• Verifying DNS Entries
• Verifying Moving Resources Between Nodes
• Verifying Failing Service Back to Original Node
• Gathering Support Logs

About Testing and Troubleshooting

The following section contains various testing and troubleshooting tips.

Resolving Name Conflicts

You must make the appliances in the HA cluster group resolvable to each other. This means they must be able to detect each other on the network and communicate. To achieve this, you can either configure your DNS server accordingly or add records to /etc/hosts. If you do not want to edit /etc/hosts manually, you can set this up when you create the shared volumes. (You have to enter a virtual shared service hostname and a virtual IP address.) Defining these parameters allows the software to modify the /etc/hosts tables automatically on all HA cluster group members.

Note: You can use a virtual IP address instead of a shared logical hostname.

These host records can be added automatically when you create the shared volumes. If a shared logical hostname cannot be resolved for a server, you need to define the IP address for that server; the record is then added on that server automatically. Alternatively, you can choose to manually configure your DNS server or the local hosts tables on the appliances.

To see more information, using NMC:
  Type: nmc:/$ setup appliance hosts -h

Alternatively, you can allow this cluster configuration logic to update your local hosts records automatically.

To create a shared service, using NMC:
  Type: nmc:/$ setup group rsf-cluster HA-Cluster shared-volume add
  System response: Scanning for volumes accessible from all appliances ...

To configure the shared cluster volume manually:
  1. Using a text editor, open the /etc/hosts file on node A.
  2. Add the IP addresses for each node:
     Example:
     172.16.3.20 nodeA nodeA.mydomain
     172.16.3.21 nodeB nodeB.mydomain
     172.16.3.22 rsf-data
     172.16.3.23 rsf-data2
  3. Repeat Step 1 — Step 2 for node B.

Specifying Cache Devices

NexentaStor allows you to configure specific devices in a data volume as cache devices. For example, using solid-state disks as cache devices can improve performance for random read workloads of mostly static content.

To specify cache devices when you create or grow the volume, using NMC:
  Type: nmc:/$ setup volume grow

Cache devices are also available for shared volumes in the HA Cluster. However, note that if you use local disks as cache, they cannot fail over, since they are not accessible by the alternate node. After failover, therefore, the volume is marked "Degraded" because of the missing devices.

If local cache is critical for performance, then set up local cache devices for the shared volume on each cluster node when you first configure the volume. This involves setting up local cache on one node, and then manually failing over the volume to the alternate node so that you can add local cache there as well. This ensures the volume has cache devices available automatically after a failover; however, the data volume changes to "Degraded" and remains in that state because one node's cache devices are always unavailable.

Additionally, users can control the cache settings for zvols within the data volume. In a cluster, set the zvol cache policy to "write-through", not "writeback".

To administer the cache policy, using NMC:
  Type: nmc:/$ setup volume <volume_name> zvol cache
        nmc:/$ setup zvol cache
        nmc:/$ show zvol cache
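For reference, adding a local read-cache device to a pool corresponds, underneath NMC, to adding a cache vdev with ZFS; the sketch below is illustrative only, with a placeholder pool name and disk, and the setup volume grow wizard described above remains the supported path.

  # zpool add ha-test cache c3t2d0

Repeat the equivalent step on the alternate node after manually failing the volume over, as described above, and expect the volume to report a degraded state after any failover because the other node's local cache devices are absent.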
Manually Triggering a Failover

You can manually trigger a failover between systems when needed. Performing a failover from the current appliance to the specified appliance causes the volume-sharing service to stop on the current appliance, while the opposite actions take place on the passive appliance. Additionally, the volume is exported to the other node.

To manually trigger a failover, using NMC:
  Type: nmc:/$ setup group rsf-cluster <cluster_name> <appliance_name> failover

Verifying DNS Entries

There is a name associated with the cluster that is referred to as a shared logical hostname. This hostname must be resolvable by the clients that access the cluster. One way to manage this is with DNS. Install a DNS management application; through this software, you can view the host record of the shared cluster hostname to verify that it was set up properly.

Verifying Moving Resources Between Nodes

Use a manual failover to move a resource from one NexentaStor node to another node in the cluster.

To move a shared volume from node A to node B:
  Type: nmc@nodeA:/$ setup group <group_name> <cluster_name> shared-volume show
  System response:
    HA CLUSTER STATUS: HA-Cluster
    nodeA: vol1-114 stopped manual unblocked 10.3.60.134 e1000g0 20 8
    nodeB: vol1-114 running auto unblocked 10.3.60.134 e1000g0 20 8

Verifying Failing Service Back to Original Node

The first system response below illustrates a failed failover. The second illustrates a successful failover.

To verify failover back to the original node:
  Type: nmc:/$ setup group <group_name> <cluster_name> sharedvolume <volume_name> failover <appliance_name>
  System response:
    SystemCallError: (HA Cluster HA-Cluster): cannot set mode for cluster node 'nodeA': Service ha-test2 is already running on nodeA (172.16.3.20)

  Type: nmc:/$ setup group <group_name> <cluster_name> sharedvolume <volume_name> failover <appliance_name>
  System response:
    Waiting for failover operation to complete......... done.
    nodeB: ha-test2 running auto unblocked rsf-data2 e1000g1 60 60

Gathering Support Logs

To view support logs, using NMV:
  Click View Log.

To view support logs, using NMC:
  Type: nmc:/$ show log