vSAN Scrubber changes in 7.0U1c

vSAN Scrubber is a background operation that runs at the DOM (Distributed Object Manager) owner. The scrubber detects (scrubs) and fixes (recovers) checksum and I/O errors.

Most checksum errors and I/O errors seen in a vSAN environment are due to medium errors on the physical capacity disks where the data is persisted. These errors don’t surface until we try to read the object from both mirrors (in the case of RAID-1 objects). Reads may always be served from a single mirror, especially in a stretched cluster where site affinity is set to either the preferred site or the secondary site to avoid cross-site traffic.

Maintenance tasks and hardware failures that result in a resync trigger reads against the secondary/surviving mirrors. Those reads may discover and report checksum errors or I/O failures caused by bad sectors backing the object’s component, and the resync can then get stuck.

vSAN 7.0U1 and earlier releases had the advanced scrubber setting “VSAN.ObjectScrubsPerYear” set to 1, so each object was scrubbed once per year. In vSAN 7.0U1c and later releases this setting is changed to 6 per year. This means that the vSAN Scrubber scrubs every object once every two months, so that components with unreadable blocks or incorrect checksums are relocated to different sectors/disks by rebuilding from neighboring components/mirrors. See the ESXi 7.0 Update 1c release notes for additional information.
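The "once every two months" figure follows directly from the setting value. A minimal Python sketch of that arithmetic is shown here; it assumes a 365-day year and an even spacing of scrub passes, which is a simplification and not a description of how vSAN actually schedules the scrubber. The function name `scrub_interval_days` is made up for illustration.

```python
# Assumption: scrub passes are spread evenly over a 365-day year.
# vSAN's real scheduling may differ; this only illustrates the ratio.
DAYS_PER_YEAR = 365


def scrub_interval_days(scrubs_per_year: int) -> float:
    """Approximate number of days between scrub passes of one object."""
    if scrubs_per_year <= 0:
        raise ValueError("scrubs_per_year must be a positive integer")
    return DAYS_PER_YEAR / scrubs_per_year


# Values of VSAN.ObjectScrubsPerYear mentioned in this post.
for release, value in (("7.0U1 and earlier", 1), ("7.0U1c and later", 6)):
    days = scrub_interval_days(value)
    print(f"{release}: ObjectScrubsPerYear={value}, about {days:.0f} days between scrubs")
```

With a value of 6 the interval comes out at roughly 61 days, which matches the two-month cadence described above.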

If necessary, we can also look at statistics for a specific VMDK to see whether any checksum errors were detected against that object and to get the ETA for scrubber completion (**generally used by VMware support).

[root@localhost] esxcli vsan debug object list -u 19433d5f-9dee-9f58-9b67-ecf4bbec65d8

Object UUID: 19433d5f-9dee-9f58-9b67-ecf4bbec65d8
   Version: 13
   Health: healthy
   Owner: is-tse-d155.vsensei.local
   Size: 50.00 GB
   Used: 8.90 GB
      objectVersion: 13
      hostFailuresToTolerate: 1
      SCSN: 92
      CSN: 94


         Component: 19433d5f-e068-605c-faf8-ecf4bbec65d8
           Component State: ACTIVE,  Address Space(B): 53687091200 (50.00GB),  Disk UUID: 52de9507-470b-8a3c-1916-3dc6c696bc5f,  Disk Name: naa.5002538c4044d881:2
           Votes: 1,  Capacity Used(B): 4827643904 (4.50GB),  Physical Capacity Used(B): 4777312256 (4.45GB),  Host Name: is-tse-d156.vsensei.local
         Component: 19433d5f-6436-635c-631f-ecf4bbec65d8
           Component State: ACTIVE,  Address Space(B): 53687091200 (50.00GB),  Disk UUID: 52deab75-cff7-87a7-1273-5a505995c264,  Disk Name: naa.5002538c4044d87f:2
           Votes: 1,  Capacity Used(B): 4827643904 (4.50GB),  Physical Capacity Used(B): 4777312256 (4.45GB),  Host Name: is-tse-d157.vsensei.local
      Witness: 19433d5f-7f0b-645c-55c6-ecf4bbec65d8
        Component State: ACTIVE,  Address Space(B): 0 (0.00GB),  Disk UUID: 525a9779-affc-46ec-5f59-70a0b6ae28d2,  Disk Name: naa.5002538c4044d6ad:2
        Votes: 1,  Capacity Used(B): 12582912 (0.01GB),  Physical Capacity Used(B): 4194304 (0.00GB),  Host Name: is-tse-d155.vsensei.local

   Type: vdisk
   Path: /vmfs/volumes/vsan:523d5e5605a4d751-0c3304ae7a42599b/14433d5f-ce0e-0a0c-8d9f-ecf4bbec65d8/VC-50_12.vmdk (Exists)
   Group UUID: 14433d5f-ce0e-0a0c-8d9f-ecf4bbec65d8
   Directory Name: N/A

[root@localhost] vsish -e get /vmkModules/vsan/dom/owners/19433d5f-9dee-9f58-9b67-ecf4bbec65d8/scrubStats

DOM Owner Scrub Stats
   Total bytes scrubbed during the last round:56224055296
   Total bytes allocated for this object:9554624512
   Total size of all components including parity if any:107374182400
   Number of rounds of scrubbing completed:0
   Number of scrub ops issued:10
   Total number of LSEs/checksum errors detected:0
   Total number of faked checksum errors detected:0
   Total number of OOB LSEs/checksum errors detected:0
   Total number of OOB faked checksum errors detected:0
   Total number of LSEs/checksum errors recovered:0
   Total number of faked checksum errors recovered:0
   Total number of OOB LSEs/checksum errors recovered:0
   Total number of OOB faked checksum errors recovered:0
   Total number of LSEs/checksum errors that failed to recover:0
   Total number of faked checksum errors that failed to recover:0
   Total number of OOB LSEs/checksum errors that failed to recover:0
   Total number of OOB faked checksum errors that failed to recover:0
   Seconds since the last round of scrubbing started:11501811
   Total bytes scrubbed so far in current round:0
   Initial estimation of current scrub ETA (seconds):36432237
   Current estimation of current scrub ETA (seconds):558692354
   Maximum estimation of current scrub ETA (seconds):3071998125
   Maximum estimation of last scrub ETA round (seconds):0
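When the scrubStats output has been saved as text, the counters can be summarized with a few lines of Python. The sketch below is an illustration only: it assumes every counter appears as `label:integer` on its own line, as in the sample above, and it reads the meaning of each counter from its label. The helper names `parse_scrub_stats` and `seconds_to_days` are hypothetical and not part of any VMware tool.

```python
import re

# A few lines copied from the scrubStats output shown above.
SAMPLE = """\
DOM Owner Scrub Stats
   Total bytes scrubbed during the last round:56224055296
   Number of rounds of scrubbing completed:0
   Total number of LSEs/checksum errors detected:0
   Total number of LSEs/checksum errors recovered:0
   Total number of LSEs/checksum errors that failed to recover:0
   Seconds since the last round of scrubbing started:11501811
   Current estimation of current scrub ETA (seconds):558692354
"""

SECONDS_PER_DAY = 86400


def parse_scrub_stats(text: str) -> dict:
    """Return {label: int} for every 'label:integer' line; skip other lines."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+):\s*(-?\d+)\s*$", line)
        if match:
            stats[match.group(1).strip()] = int(match.group(2))
    return stats


def seconds_to_days(seconds: int) -> float:
    """Convert a seconds counter to days for easier reading."""
    return seconds / SECONDS_PER_DAY


stats = parse_scrub_stats(SAMPLE)
detected = stats["Total number of LSEs/checksum errors detected"]
failed = stats["Total number of LSEs/checksum errors that failed to recover"]
since_start = stats["Seconds since the last round of scrubbing started"]
eta = stats["Current estimation of current scrub ETA (seconds)"]

print(f"Errors detected: {detected}, failed to recover: {failed}")
print(f"Days since last round started: {seconds_to_days(since_start):.1f}")
print(f"Current ETA estimate in days: {seconds_to_days(eta):.1f}")
```

A non-zero "failed to recover" counter is the value most worth raising with support, since it suggests the scrubber found errors it could not repair from another mirror.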

Also take a look at the new blog post on the Advanced Cross-vCenter Migration feature, which is included with vSphere 7.0U1c.


Hareesh K G is a Site Reliability Engineer with VMware vSAN Engineering. His current focus is VMware vSAN® on-premises, and his overall expertise is in the Storage Availability Business Unit products (VMware vSAN®, VMware Site Recovery Manager® and vSphere Data Protection®). He started his career with EMC support for CLARiiON and VNX block storage in 2012 and has been with VMware since 2015.
