9.13P2
|================================================================================================================================|
 NetApp Storage Replication Adapter 9.13P2 for ONTAP for VMware vCenter Site Recovery Manager Readme |
|================================================================================================================================|


Audience
++++++++

 You should be familiar with ESXi servers, VMware vCenter Site Recovery Manager Appliance, and storage systems running ONTAP software.
 This README assumes that you are familiar with how to configure these systems and how protocols such as NFS, CIFS, HTTP, iSCSI, and FC are used for file sharing or transfers in SAN and NAS environments.
 This README does not cover basic system or network administration topics.
 The document describes the scripts and how they are used during failover with clustered storage systems running ONTAP.


Features of this release
++++++++++++++++++++++++

NetApp Storage Replication Adapter 9.13P2 for ONTAP for VMware vCenter Site Recovery Manager is a storage vendor-specific
 plug-in for VMware vCenter. The adapter enables communication between Site Recovery Manager and a storage controller at the
 Storage Virtual Machine (SVM) level as well as at the cluster level.

 The adapter interacts with the SVM to discover replicated datastores. Site Recovery Manager (SRM) uses the adapter to support SAN
 storage environments for VMFS (iSCSI and FC) and NAS storage environments for NFS.

 For more information about the new and enhanced features of Site Recovery Manager, see the Site Recovery Manager documentation.

 For more information regarding the bugs that are fixed in this release, please see the release notes for Storage Replication Adapter 9.13P2 for ONTAP.

 NetApp Storage Replication Adapter 9.13P2 for ONTAP supports all the workflows in Site Recovery Manager 6.5, 8.1, and 8.2 for 64-bit, such as
 discovery of arrays and replicated devices, test recovery, recovery (planned migration and disaster recovery), and reprotect. It also supports
 automatic creation of igroups, volume filtering, and display of the SRA signature and date/time stamp in logs.

 For more information about NetApp Storage Replication Adapter 9.13P2 for ONTAP, see the Installation and Administration Guide.

 Site Recovery Manager and NetApp Storage Replication Adapter 9.13P2 for ONTAP Overview
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 What Site Recovery Manager does:

 Site Recovery Manager uses virtualization to provide end-to-end disaster recovery management and automation across the entire data center. It uses the VMware infrastructure to build, automate, and test disaster recovery plans for the data center.
 It works with Storage Replication Adapter to discover arrays and replicated and exported datastores, and to fail over datastores or test their failover.

 What Site Recovery Manager supports:

 Site Recovery Manager with the storage replication scripts supports storage environments such as
 VMFS (iSCSI and FC), RDM (iSCSI and FC), NFS, and active/active configuration.

 For more information about Site Recovery Manager, see the Site Recovery Manager documentation
 on the VMware site.

 SnapMirror modes supported by NetApp Storage Replication Adapter 9.13P2 for Data ONTAP operating in Cluster-Mode

 The adapter supports SnapMirror relationships of type DP and XDP in clustered ONTAP (Cluster-Mode) with a policy type of async_mirror or mirror_vault.

 For more information about SnapMirror modes, see the Data ONTAP Data Protection Online Backup and Recovery Guide.


Components of the storage replication environment
++++++++++++++++++++++++++++++++++++++++++++++++++

 The adapter provides array-specific support for Site Recovery Manager:

 1. Array discovery
    The discoverArrays operation discovers the storage arrays that the administrator has selected
    as part of the Site Recovery Manager recovery policy during a disaster.
    The operation discovers the source and peer storage arrays and is executed to obtain information
    about the storage controllers. For every storage controller, the following information is returned:

	Hostname of the storage array (called the array ID)
	Array model and vendor
	Replication software name and version
	List of replication destination storage controller hostnames

	SRA gets the destination details from valid SnapMirror relationships. Users can include or exclude volumes by using the opaque field
	parameters provided in the "Add Array Manager" wizard of SRM.

 2. Replicated device discovery
    In a NAS environment, SRA discovers replicated and exported datastores on the primary storage array. To discover an exported volume,
    you should export the volume to an IP address and replicate the volume by using the SnapMirror technology. The operation discovers the datastores that
    are exported and then checks the datastores that are replicated.

    SRM executes discoverDevices for every pair of replicated storage arrays (based on the 'List of replication destination storage controller hostnames' output of discoverArrays).
    This operation is executed to obtain information about the replicated storage devices.
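
    The relationship between the two discovery operations can be sketched in Python. This is a
    minimal illustration, not the adapter's implementation: the ArrayInfo class, its field names,
    and the replicated_pairs helper are hypothetical names that mirror the information listed for
    array discovery above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ArrayInfo:
    """Hypothetical record of what array discovery returns per storage controller."""
    array_id: str                      # hostname of the storage array
    model: str
    vendor: str
    replication_software: str          # replication software name
    replication_version: str
    peer_array_ids: List[str] = field(default_factory=list)  # replication destinations


def replicated_pairs(arrays: List[ArrayInfo]) -> Iterator[Tuple[str, str]]:
    """Yield one (source, destination) pair per replication destination.

    Device discovery would then be run once for each yielded pair.
    """
    for array in arrays:
        for peer in array.peer_array_ids:
            yield (array.array_id, peer)
```

    An array that lists no replication destinations contributes no pairs, so no device discovery
    would be run for it in this sketch.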

    SRM uses the output of this command to identify the following types of devices:

	replication source: read-write device on the source array configured for replication to the target array
	replication target: read-only replica on the target array identified by a string key
	demoted source: replication source put in read-only mode in preparation for failover
	promoted target: read-write device on the target array created from the replication target during failover

    The output of this command also depends on the current state of the system, such as:
	testFailoverStart complete
	failover complete
	reverseReplication complete
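
    The four device types and their access modes can be modeled as a small lookup. The sketch
    below is illustrative only; DeviceRole, WRITABLE, and role_after_failover are hypothetical
    names, and the transitions encode only what the device type descriptions above state.

```python
from enum import Enum


class DeviceRole(Enum):
    REPLICATION_SOURCE = "replication source"
    REPLICATION_TARGET = "replication target"
    DEMOTED_SOURCE = "demoted source"
    PROMOTED_TARGET = "promoted target"


# Access mode of each role, taken from the device type descriptions.
WRITABLE = {
    DeviceRole.REPLICATION_SOURCE: True,
    DeviceRole.REPLICATION_TARGET: False,
    DeviceRole.DEMOTED_SOURCE: False,
    DeviceRole.PROMOTED_TARGET: True,
}

# Role changes in preparation for and during failover.
FAILOVER_TRANSITION = {
    DeviceRole.REPLICATION_SOURCE: DeviceRole.DEMOTED_SOURCE,
    DeviceRole.REPLICATION_TARGET: DeviceRole.PROMOTED_TARGET,
}


def role_after_failover(role: DeviceRole) -> DeviceRole:
    """Return the role a device takes on through failover (unchanged if none applies)."""
    return FAILOVER_TRANSITION.get(role, role)
```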

Support non-disruptive failover test using a writable copy of replicated data

  3. testFailoverStart
     testFailoverStart creates a temporary read-write copy of the replication target. It has no effect on replication because it does not break the ongoing SnapMirror relationship.

  4. testFailoverStop
     testFailoverStop is carried out after testFailoverStart. It deletes the temporary FlexClone replicas created by testFailoverStart.


Support emergency or planned failover

  5. failover
     failover is the actual operation that is carried out during a disaster. The high-level flow of the failover operation is as follows:

	 failover aborts any ongoing SnapMirror transfers and then quiesces the SnapMirror relationship.
	 It breaks the SnapMirror relationship and marks the destination replica as read-write.
	 For NAS: failover exports the input paths by adding the VMkernel IP address to the exports file.
	 For SAN: failover collects mapping-related information for each LUN, such as the igroup name and LUN serial number.
	 The access group and the respective initiator details are also collected for a storage device.
	 The operation brings the datastores online if they are not online already.
	 The operation then outputs the failed-over datastores to SRM for discovery.
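
     The order of the failover steps can be sketched with an in-memory model. This is a
     simplified illustration under stated assumptions, not the adapter's code: the Replica class,
     its fields, and the step labels are hypothetical, and no storage system is contacted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    """Hypothetical in-memory stand-in for one replicated datastore."""
    name: str
    protocol: str                      # "NAS" or "SAN"
    transfer_running: bool = False
    quiesced: bool = False
    broken: bool = False
    read_write: bool = False
    online: bool = False
    steps: List[str] = field(default_factory=list)   # order of actions taken


def failover(replica: Replica) -> Replica:
    """Apply the failover steps in the order this README lists them."""
    if replica.transfer_running:
        replica.transfer_running = False
        replica.steps.append("abort transfer")
    replica.quiesced = True
    replica.steps.append("quiesce")
    replica.broken = True
    replica.read_write = True          # destination replica becomes read-write
    replica.steps.append("break")
    if replica.protocol == "NAS":
        replica.steps.append("export paths")         # VMkernel IP added to exports
    else:
        replica.steps.append("collect LUN mapping")  # igroup name, LUN serial number
    if not replica.online:
        replica.online = True
        replica.steps.append("bring online")
    replica.steps.append("report to SRM")
    return replica
```

     The sketch skips the abort step when no transfer is running and the online step when the
     datastore is already online, matching the conditional wording of the flow above.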

  6. reprotect
     reprotect performs reverse replication: it reverses the direction of the SnapMirror relationship after a failover operation.

System requirements and installation information
++++++++++++++++++++++++++++++++++++++++++++++++

Prerequisites
-------------
 Before running the NetApp storage replication scripts, ensure that you meet certain clustered Data ONTAP and Site Recovery
 Manager requirements.

 Before you install Storage Replication Adapter, NetApp SRA Server must be installed and configured.

 The system requirements include the following:
 - ONTAP 9.1 or later is installed on the following storage systems:
	- NetApp FAS and AFF platforms
   For more information about supported platforms, see the supportability matrix on the NOW site.
 - (Optional) FlexClone license is enabled for test recovery.
 - SnapMirror license is enabled on the storage system.
 - In a NAS environment, NFS services are running.
 - In a SAN environment, iSCSI or FC or both services are running.
 - Site Recovery Manager Appliance is deployed.
   Site Recovery Manager supports VMFS (iSCSI and FC), RDM (iSCSI and FC), and NFS storage
   environments.



 If a firewall exists between ONTAP and Site Recovery Manager, configure it to keep the
 corresponding ports open, because SOAP, HTTP, or SSL is used for communication.



Features not supported in Storage Replication Adapter
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  Certain features and functions are not supported in Storage Replication Adapter.

 - Exports or LUNs that have multiple replication destinations cannot be failed over.

 - Exports or LUNs that have cascading SnapMirror relationships cannot be failed over.

 - The testFailover operation is not supported on striped volumes.

 - SnapMirror LS (load-sharing) is not supported because SRA does not support replication within a single storage cluster.

 - SyncMirror replication between Site Recovery Manager sites (as in MetroCluster) is not supported.

 - SMVI image consistency is not supported, including non-quiesced.

 - Localization is not supported by SRA. The default language is English.

 For specific information about recommended configurations and best practices when using different
 replication modes, see the Data ONTAP Data Protection Online Backup and Recovery Guide.

Recommendations
===============
 When you use the storage replication scripts with Site Recovery Manager, follow these recommendations:

 - Ensure that the primary and secondary ESX servers can discover devices on storage systems by using
   iSCSI, FC, or NFS.

 - On the secondary storage system, ensure that there are no volumes with names that match the name
   of the volume that TestFailover creates.

 - When adding active/active configuration systems to the Site Recovery Manager configuration, you
   should add each NetApp controller as an array in Array Managers.
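
 The volume name recommendation can be checked with a short script. The sketch below assumes
 that the volume names on the secondary storage system have already been collected into a list
 (for example, from the vol show output), and it uses the testfailover prefix described in the
 test recovery section of this README. The function name conflicting_volumes is hypothetical.

```python
from typing import Iterable, List

# Prefix of the cloned volumes that test recovery creates, per this README.
TEST_CLONE_PREFIX = "testfailover"


def conflicting_volumes(volume_names: Iterable[str]) -> List[str]:
    """Return the volume names that could clash with test recovery clones."""
    return [name for name in volume_names if name.startswith(TEST_CLONE_PREFIX)]
```

 An empty result means that no existing volume name starts with the prefix.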


Setting up the storage system before running NetApp Storage Replication Adapter 9.13P2 for ONTAP in a NAS environment
=================================================================================================================================================

You must set up the system before running Storage Replication Adapter for Site Recovery Manager.

Steps:

1. Install NetApp SRA Server

2. Install Site Recovery Manager.
   For information about installing Site Recovery Manager, see the Site Recovery Manager
   documentation on the VMware site

3. Install NetApp Storage Replication Adapter 9.13P2 for ONTAP on Site Recovery Manager server at the protected and
   recovery sites

4. Ensure that the datastores at the protected site contain virtual machines that are registered with
   vCenter Server

5. Ensure that the ESX hosts at the protected site have mounted the NFS exported volumes from the
   storage controller

6. Ensure that the valid IP addresses have been entered in the NFS IP Addresses field when adding
   arrays to the Site Recovery Manager using the Array Manager wizard

7. To ensure a successful test recovery operation, verify that the ESX hosts at the recovery site
   have a VMkernel port that can access the IP addresses used to serve NFS exports from the
   secondary storage controller (Vserver NFS LIF) by using the following command on the console of
   each ESX host:
   vmkping nfs_ip_address
   nfs_ip_address is one of the NFS IP addresses on the storage controller (Vserver NFS LIF)

8. Ensure that the NFS logical interface (LIF) is created on both the source and the destination, along with the corresponding routing groups.
   The LIFs must be associated with the same Vserver where the volumes (devices) to be protected exist

9. Ensure that the volume containing the exports is replicated to the secondary storage
   system

10. On the primary storage system, enter the snapmirror show command to confirm that the
   concerned export is involved in only one replication relationship and that the relationship status
   displays SnapMirrored

11. On the secondary storage system, enter the snapmirror show command to confirm that the
    relationship state is SnapMirrored

    For information about managing a SnapMirror relationship, see the Data ONTAP Data Protection
    Online Backup and Recovery Guide.
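
    When several NFS LIF addresses need to be verified, the vmkping check in step 7 can be
    repeated in a loop. The following Python sketch is one way to do that and is not part of the
    adapter. It assumes that a Python 3 interpreter is available where vmkping runs and that
    vmkping exits with status 0 when the address is reachable; both are assumptions, and the
    function name check_nfs_lifs is hypothetical.

```python
import subprocess
from typing import Callable, Dict, Iterable, List, Optional


def _run_command(cmd: List[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(cmd, capture_output=True).returncode


def check_nfs_lifs(addresses: Iterable[str],
                   run: Optional[Callable[[List[str]], int]] = None) -> Dict[str, bool]:
    """Run `vmkping <address>` for each NFS LIF address and record the result.

    `run` can be replaced with a stub so that the logic can be exercised
    without an ESX host. Exit status 0 is treated as reachable (assumption).
    """
    runner = run if run is not None else _run_command
    return {addr: runner(["vmkping", addr]) == 0 for addr in addresses}
```

    Any address that maps to False in the result should be investigated before a test recovery
    is attempted.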


Setting up the system before running NetApp Storage Replication Adapter 9.13P2 for ONTAP in a SAN environment
=========================================================================================================================================

You must set up the system before running Storage Replication Adapter for Site Recovery Manager.

Steps:

1. Install NetApp SRA Server

2. Install Site Recovery Manager.
   For information about installing Site Recovery Manager, see the Site Recovery Manager
   documentation on the VMware site

3. Install Storage Replication Adapter on Site Recovery Manager server at the protected and
   recovery sites

4. Ensure that the primary ESX hosts are connected to LUNs in the primary storage system

5. Ensure that the LUNs are in igroups, either iSCSI, FC, or both, of OS type vmware on the
   primary storage system

6. Ensure that the iSCSI/FCP logical interfaces (LIFs) are configured on both the protected and the recovery sites
   and that the respective routing groups are created. The LIFs must be associated with the same Vserver where the volumes (devices) to be protected exist

7. Ensure that the ESX hosts at the recovery site have appropriate FC or iSCSI connectivity to the
   secondary storage controllers. You can do this either by verifying that the ESX
   hosts have local LUNs connected on the secondary storage controllers or by using the fcp initiator show
   or iscsi initiator show commands on the secondary storage controllers

8. Ensure that the volume containing the LUNs is replicated to the secondary storage
   system

9. On the primary storage system, enter the snapmirror show command to confirm that the
   concerned LUN is involved in only one replication relationship and that the relationship status
   displays SnapMirrored

10. On the secondary storage system, enter the snapmirror show command to confirm that the
   relationship state is SnapMirrored.


11. Reprotect operations fail in some cases if they are run before the discovery of devices on the source and destination is complete.
    As a best practice, wait for the discovery of devices to complete on the source and destination before starting a new reprotect operation.
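
The two snapmirror show checks in the steps above (each LUN or export is involved in only one
replication relationship, and the relationship state is SnapMirrored) can be expressed as a small
validation routine. The sketch below does not parse command output. It assumes that the
relationships have already been collected into dictionaries with the hypothetical keys
'source', 'destination', and 'state', and it compares the state without regard to case.

```python
from collections import Counter
from typing import Dict, List


def replication_problems(relationships: List[Dict[str, str]]) -> List[str]:
    """Return a description of every violation of the two setup checks."""
    problems = []
    # Check 1: each source takes part in exactly one relationship.
    per_source = Counter(rel["source"] for rel in relationships)
    for source, count in per_source.items():
        if count > 1:
            problems.append(f"{source}: {count} replication relationships (expected 1)")
    # Check 2: every relationship reports the SnapMirrored state.
    for rel in relationships:
        if rel["state"].lower() != "snapmirrored":
            problems.append(
                f"{rel['source']} -> {rel['destination']}: state is {rel['state']}"
            )
    return problems
```

An empty list means that both checks passed for the relationships supplied.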


Configuration related to Include/Exclude list or the volume filtering in add array manager
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


The volume filtering operation includes or excludes a list of volumes. Users provide the include or exclude list of volumes when running the "Add Array Manager" wizard of SRM.
There are two opaque field options:

1. Include list: contains the volumes to be included.
2. Exclude list: contains the volumes to be excluded.

First, discoverDevices runs for the include list entries and finds all volumes that match the include list. Then, it runs for the exclude list entries and excludes the volumes that match the exclude list.
There are four possibilities:

- If both fields have data, the discoverDevices command returns all volumes that have one of the include sub-strings and none of the exclude sub-strings in their name.
- If both fields are blank, all volumes are returned, which is the same as the behavior without filtering.
- If the include field is empty and the exclude field has data, all volumes that do not have any exclude sub-string in their name are returned.
- If the exclude field is empty and the include field has data, all volumes that have an include sub-string in their name are returned.

For simplicity, both opaque parameters take a comma-separated list of entries as input.
For example, suppose the following volumes exist on the storage arrays:

- srm_sql_oracle_db2
- srm_exchange
- srm_1
- sql_2
- access

The customer provides:

Include opaque list : srm, oracle, sql
Exclude opaque list : db2, exchange

discoverDevices first selects all volumes whose names contain an include list entry, that is, srm, oracle, or sql. From that list, discoverDevices then removes the volumes whose names contain an exclude list entry, that is, db2 or exchange.
The response of discoverDevices contains the following volumes:

- srm_1
- sql_2

If the user provides non-alphanumeric or special characters, SRA returns an error in discoverArrays.
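
The include and exclude rules described in this section can be written as a short Python sketch.
It is an illustration of the sub-string matching rules only, not the adapter's implementation;
the function names parse_filter and filter_volumes are hypothetical, and the check for special
characters is left out.

```python
from typing import Iterable, List


def parse_filter(raw: str) -> List[str]:
    """Split a comma-separated opaque field into non-empty sub-strings."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def filter_volumes(volumes: Iterable[str], include: str = "", exclude: str = "") -> List[str]:
    """Apply the include list first, then the exclude list, by sub-string match."""
    includes = parse_filter(include)
    excludes = parse_filter(exclude)
    result = []
    for name in volumes:
        # An empty include list means that every volume is a candidate.
        if includes and not any(sub in name for sub in includes):
            continue
        # Any exclude sub-string in the name removes the volume.
        if any(sub in name for sub in excludes):
            continue
        result.append(name)
    return result


volumes = ["srm_sql_oracle_db2", "srm_exchange", "srm_1", "sql_2", "access"]
print(filter_volumes(volumes, include="srm, oracle, sql", exclude="db2, exchange"))
# prints ['srm_1', 'sql_2']
```

With both fields blank, the function returns every volume, which matches the second of the four
possibilities listed in this section.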


Configuration related to queryReplicationSettings in recovery and reprotect
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
queryReplicationSettings is designed to allow SRM to collect replication settings for replicated devices before failover and
pass these settings to SRA after failover, to ensure that replication settings are preserved when the replication direction is
reversed or restored. SRM passes this information as-is to the reverseReplication and restoreReplication commands.
SRM issues a queryReplicationSettings command periodically and stores the results in its database.


Failover testing or test recovery
+++++++++++++++++++++++++++++++++
 You can test a recovery plan to ensure that it works during a disaster or a planned migration. Test
 recovery operation does not affect the ongoing SnapMirror replication.
 During test recovery operation, Storage Replication Adapter creates a FlexClone volume from the
 latest Snapshot copy of the SnapMirror destination volume and displays the cloned export path or
 LUN path to Site Recovery Manager.
 The test recovery operation creates a cloned volume on the secondary storage system with the prefix
 testfailover_<volume_name>. For example, the
 cloned volume that is created for the volume - "secondary_volume" in SVM - "vs1" is named
 testfailover_vs1.
 Note:

 - At the end of the test recovery operation, the FlexClone volume is destroyed. You must ensure that
   you do not have user-created volumes with names prefixed with
   testfailover. You can verify the names of the volumes in the
   secondary storage system by running the vol show command.

 - By default, Storage Replication Adapter creates a cloned volume with no space guarantee
   because the cloned volume is not used during actual recovery and is destroyed.

 In a NAS environment, the test recovery operation exports the path of the FlexClone volume and creates a junction path.
 A clone of the mirrored datastore is then ready for discovery. The operation displays the cloned export path to Site Recovery Manager.

 In a SAN environment, the test recovery operation maps the LUNs of the FlexClone volume to an
 igroup of OS type vmware. The operation checks if an igroup is present for the specific initiator type
 and ID. If an igroup is present, the cloned LUN is mapped to that igroup. If an igroup does not exist
 for a particular initiator, Storage Replication Adapter automatically creates an igroup for that initiator
 and maps the cloned LUN to that igroup. The operation displays the cloned LUN path to Site
 Recovery Manager.

 Automatically created igroups have names prefixed with failover_igroup_domain_, followed by the access
 group ID. For example, an igroup
 created for an FC initiator that has an access group domain ID of "s19" has the name
 failover_igroup_domain-s19.
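
 The igroup handling for SAN test recovery follows a look-up-or-create pattern, which the
 following Python sketch illustrates with an in-memory dictionary. It is not the adapter's code:
 the map_cloned_lun function, the dictionary layout, and the caller-supplied name for a newly
 created igroup are all hypothetical, and no storage system is contacted.

```python
from typing import Dict, Tuple

InitiatorKey = Tuple[str, str]   # (initiator type, initiator ID)


def map_cloned_lun(igroups: Dict[InitiatorKey, dict],
                   initiator: InitiatorKey,
                   lun_path: str,
                   new_igroup_name: str) -> str:
    """Map a cloned LUN to the igroup for an initiator, creating the igroup if needed.

    Returns the name of the igroup that the LUN was mapped to.
    """
    igroup = igroups.get(initiator)
    if igroup is None:
        # No igroup exists for this initiator type and ID: create one of OS type vmware.
        igroup = {"name": new_igroup_name, "os_type": "vmware", "luns": []}
        igroups[initiator] = igroup
    igroup["luns"].append(lun_path)
    return igroup["name"]
```

 An existing igroup is reused as is, so the name passed for a new igroup only takes effect when
 no igroup is found for the initiator.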



Troubleshooting
===============
 The log files are available at /var/log/vmware/srm inside the SRM appliance.

 Note: The drconfig.log is the main log file in SRM to check for all the communication between SRA and the SRA server.

 If the adapter fails or behaves unexpectedly, set the log level to trivia in SRM and run the failed operation again.
 For details on running the failover operation, see the SRM documentation. You should also examine the log messages.

 Note: The solutions provided assume that you are using the storage replication scripts in a failover environment.


Last updated: June 2020
