Troubleshooting a Network Partition on vSAN 6.6 Post Upgrade

Here are the troubleshooting steps to resolve unicastagent issues on a vSAN cluster after an upgrade, usually from vSAN 6.1 / 6.2 / 6.5 to vSAN 6.6 (ESXi 6.5.0d and below). vSAN no longer uses multicast from version 6.6 onwards, so when the unicastagent list is not updated with the correct details on one or more hosts, you will also see a vSAN network partition on those hosts.

In these cases you will need to manually add the unicastagent address list on all hosts that are part of the cluster. Follow the steps listed below.

Note: From ESXi 6.5 Update 1 onwards, all unicastagent entries are controlled by vCenter Server, which pushes them to the hosts, so none of the steps below are required for unicastagent entries.

 

Step 1: Locate the isolated/network-partitioned host or hosts in the vSAN cluster. When you run the “esxcli vsan cluster get” command on such a host, you will see that it does not participate in cluster communication (in the output below, Sub-Cluster Member Count is 1).
[root@IS-EVO-01:~] vmware -lv
VMware ESXi 6.5.0 build-5310538
VMware ESXi 6.5.0 GA
[root@IS-EVO-01:~] esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2017-05-10T03:05:33Z
Local Node UUID: 5938de9a-e35b-d745-c9ff-ecf4bbec65d8
Local Node Type: NORMAL
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 548f2c9c-9491-deb8-34e6-3417ebe52102
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 5e4ae17c-5fa6-4b9e-a40d-007dd6e00664
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member Count: 1
Sub-Cluster Member UUIDs: 5938de9a-e35b-d745-c9ff-ecf4bbec65d8
Sub-Cluster Membership UUID: a1af6d59-1edb-e115-8456-ecf4bbec91a8
Unicast Mode Enabled: true
Maintenance Mode State: OFF
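
When checking several hosts, it can help to pull just the member count out of this output. The following is a minimal sketch, assuming the output format shown above; the helper name vsan_member_count is made up for illustration and is not part of esxcli.

```shell
# Print the value of "Sub-Cluster Member Count" from
# `esxcli vsan cluster get` output supplied on stdin.
# vsan_member_count is a hypothetical helper name.
vsan_member_count() {
    awk -F': *' '/Sub-Cluster Member Count/ { print $2 }'
}

# Sample lines copied from the output above. On a host you would run:
#   esxcli vsan cluster get | vsan_member_count
sample='Local Node State: MASTER
Sub-Cluster Member Count: 1
Unicast Mode Enabled: true'

count=$(printf '%s\n' "$sample" | vsan_member_count)
echo "member count: $count"
```

A count lower than the number of hosts in the cluster suggests the host is partitioned from the others.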
Step 2: Verify the current unicastagent list by running “esxcli vsan cluster unicastagent list”.

From the below output, it is evident that the Node UUID is all zeros. This is not correct.

esxcli vsan cluster unicastagent list
NodeUuid                              IsWitness  Supports Unicast  IP Address     Port  Iface Name
------------------------------------  ---------  ----------------  ------------  -----  ----------
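
To pick out the bad entries from a longer list, the all-zero UUID can be matched directly. This is a rough sketch, assuming the column order shown in the header above (NodeUuid first, IP Address fourth); the helper name zero_uuid_entries and the sample row are made up for illustration.

```shell
# Print the IP address of every entry whose NodeUuid is all zeros,
# reading `esxcli vsan cluster unicastagent list` output from stdin.
# zero_uuid_entries is a hypothetical helper name.
zero_uuid_entries() {
    awk '$1 == "00000000-0000-0000-0000-000000000000" { print $4 }'
}

# Hypothetical sample row (not taken from a real host). On a host:
#   esxcli vsan cluster unicastagent list | zero_uuid_entries
sample='00000000-0000-0000-0000-000000000000 0 true 192.168.10.4 12321'
printf '%s\n' "$sample" | zero_uuid_entries
```

Each address printed is a candidate for the per-address remove command in the next step.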

Step 3: Clear the unicastagent list. Either remove the entries one at a time with the first command below, where <IP Addr> comes from the “IP Address” column of the “esxcli vsan cluster unicastagent list” output in Step 2, or run the second command to clear the list completely.
esxcli vsan cluster unicastagent remove -a <IP Addr>

or

esxcli vsan cluster unicastagent clear
Step 4: Get the node UUID from every host using “esxcli vsan cluster get” and make a note of the Local Node UUID.
EX : esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2017-05-10T04:12:35Z
Local Node UUID: 5938de9a-e35b-d745-c9ff-ecf4bbec65d8
Local Node Type: NORMAL
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 548f2d1a-6d0a-c0bd-dc64-3417ebe525c4
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 5e4ae17c-5fa6-4b9e-a40d-007dd6e00664
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member Count: 1
Sub-Cluster Member UUIDs: 5938de9a-e35b-d745-c9ff-ecf4bbec65d8
Sub-Cluster Membership UUID: a1af6d59-1edb-e115-8456-ecf4bbec91a8
Unicast Mode Enabled: true
Maintenance Mode State: OFF

Step 5: Find the vmknic used for vSAN and its associated IP address on every node in the vSAN cluster by running the command “esxcfg-vmknic -l”.
EX : vmk3 Virtual SAN IPv4 192.168.30.3 255.255.255.0 192.168.30.255 00:50:56:63:6b:bc 1500 65535 true STATIC

Here “vmk3” is used for vSAN and the IP is 192.168.30.3
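
If a host has many vmkernel adapters, the interface name and IPv4 address can be extracted from each line of that output. The sketch below assumes the line layout shown in the example above, where the address follows the token "IPv4"; the helper name vmk_ipv4 is made up for illustration.

```shell
# Print "<vmk name> <IPv4 address>" for each IPv4 line of
# `esxcfg-vmknic -l` output supplied on stdin.
# vmk_ipv4 is a hypothetical helper name.
vmk_ipv4() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "IPv4") { print $1, $(i + 1); break }
    }'
}

# Sample line copied from the example above. On a host you would run:
#   esxcfg-vmknic -l | vmk_ipv4
line='vmk3 Virtual SAN IPv4 192.168.30.3 255.255.255.0 192.168.30.255 00:50:56:63:6b:bc 1500 65535 true STATIC'
printf '%s\n' "$line" | vmk_ipv4
```

The helper lists every adapter that has an IPv4 address, so you still need to pick the one that carries vSAN traffic.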

 

Step 6: Add the unicastagent addresses on all the hosts using the following command. You will need to run the command once for each of the other hosts, with that host's node UUID and vSAN IP address, along with the vSAN vmkernel adapter. A host does not get an entry for itself. For example, if the vSAN cluster contains 4 nodes, Host 1 will have unicastagent addresses added for Hosts 2, 3 and 4, and Host 2 will have unicastagent addresses added for Hosts 1, 3 and 4. Similarly, add the unicastagent addresses on Hosts 3 and 4.

esxcli vsan cluster unicastagent add -i vmk3 -t node -u 548f2d1a-6d0a-c0bd-dc64-3417ebe525c4 -a 192.168.10.3 -U 1
esxcli vsan cluster unicastagent add -i vmk3 -t node -u 5937c663-8cb8-3d48-d3ad-ecf4bbec91a8 -a 192.168.10.4 -U 1
esxcli vsan cluster unicastagent add -i vmk3 -t node -u 5937c679-f343-be43-49a3-ecf4bbec6050 -a 192.168.10.5 -U 1
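
Typing these commands by hand on every host is error-prone, so one option is to generate them from a single list of UUID and IP pairs. The following is a minimal sketch, not an official tool: the helper name gen_unicast_add and the inventory format are assumptions, and it only prints the commands (using the same flags as above) so they can be reviewed before being run on the host.

```shell
# Print one "unicastagent add" command per peer host.
# Usage: gen_unicast_add <local node UUID> <vSAN vmk>  < inventory
# Inventory lines on stdin: "<node UUID> <vSAN IP>", one per cluster host.
# gen_unicast_add and the inventory format are hypothetical.
gen_unicast_add() {
    local_uuid=$1
    vmk=$2
    while read -r uuid ip; do
        [ -z "$uuid" ] && continue                 # skip blank lines
        [ "$uuid" = "$local_uuid" ] && continue    # never add the local host
        echo "esxcli vsan cluster unicastagent add -i $vmk -t node -u $uuid -a $ip -U 1"
    done
}

# UUIDs and peer IPs are taken from the commands above; the IP on the
# last line (the local host) is a made-up placeholder.
inventory='548f2d1a-6d0a-c0bd-dc64-3417ebe525c4 192.168.10.3
5937c663-8cb8-3d48-d3ad-ecf4bbec91a8 192.168.10.4
5937c679-f343-be43-49a3-ecf4bbec6050 192.168.10.5
5938de9a-e35b-d745-c9ff-ecf4bbec65d8 192.168.10.6'

printf '%s\n' "$inventory" | gen_unicast_add 5938de9a-e35b-d745-c9ff-ecf4bbec65d8 vmk3
```

Running it once per host, each time with that host's own Local Node UUID as the first argument, yields the command set for that host.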

 

Step 7: Check whether the network partition is resolved after adding the unicastagent addresses on all the hosts in the vSAN cluster.
esxcli vsan cluster get
Cluster Information
 Enabled: true
 Current Local Time: 2017-07-27T15:45:35Z
 Local Node UUID: 5938de9a-e35b-d745-c9ff-ecf4bbec65d8
 Local Node Type: NORMAL
 Local Node State: AGENT
 Local Node Health State: HEALTHY
 Sub-Cluster Master UUID: 5937c663-8cb8-3d48-d3ad-ecf4bbec91a8
 Sub-Cluster Backup UUID: 5937c679-f343-be43-49a3-ecf4bbec6050
 Sub-Cluster UUID: 523aeb91-7fcb-7006-f032-55b051f733f0
 Sub-Cluster Membership Entry Revision: 9
 Sub-Cluster Member Count: 3
 Sub-Cluster Member UUIDs: 5937c663-8cb8-3d48-d3ad-ecf4bbec91a8, 5937c679-f343-be43-49a3-ecf4bbec6050, 5938de9a-e35b-d745-c9ff-ecf4bbec65d8
 Sub-Cluster Membership UUID: a1af6d59-1edb-e115-8456-ecf4bbec91a8
 Unicast Mode Enabled: true
 Maintenance Mode State: OFF

esxcli vsan cluster unicastagent list
NodeUuid                              IsWitness  Supports Unicast  IP Address     Port  Iface Name
------------------------------------  ---------  ----------------  ------------  -----  ----------
5937c679-f343-be43-49a3-ecf4bbec6050          0  true              192.168.10.4  12321
5937c663-8cb8-3d48-d3ad-ecf4bbec91a8          0  true              192.168.10.5  12321
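
As a final sanity check, each host should list one unicastagent entry per peer, that is, the number of hosts in the cluster minus one. The sketch below counts the entry rows in the list output, assuming every row starts with a node UUID as shown above; the helper name unicast_entry_count is made up for illustration.

```shell
# Count the entry rows in `esxcli vsan cluster unicastagent list`
# output supplied on stdin. Header and separator lines are ignored
# because they do not start with a UUID-like token.
# unicast_entry_count is a hypothetical helper name.
unicast_entry_count() {
    awk '/^[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+/ { n++ }
         END { print n + 0 }'
}

# Sample rows copied from the output above. On a host you would run:
#   esxcli vsan cluster unicastagent list | unicast_entry_count
sample='NodeUuid IsWitness Supports Unicast IP Address Port Iface Name
------------------------------------ --------- ---------------- ------------ ----- ----------
5937c679-f343-be43-49a3-ecf4bbec6050 0 true 192.168.10.4 12321
5937c663-8cb8-3d48-d3ad-ecf4bbec91a8 0 true 192.168.10.5 12321'

entries=$(printf '%s\n' "$sample" | unicast_entry_count)
echo "unicastagent entries: $entries"
```

For the three-member cluster shown above, two entries per host is the expected result.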

admin

I have been a Technical Support Engineer with VMware Global Support since 2015. My current focus is VMware vSAN® and VxRail™, and my overall expertise is around the storage and availability business unit (VMware vSAN®, VMware Site Recovery Manager® and vSphere Data Protection®). I started my career with EMC support for CLARiiON and VNX block storage in 2012.
