Friday 15 August 2014

Notes from the INF-STO2564 session presented by Aboubacar Diare from HP

ATS - Atomic Test & Set primitive.
SCSI-2 reservation - locks the entire volume.
ATS - locks only the affected block.
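
To check whether a given device supports ATS and the other VAAI primitives, something like this should work (the naa.* device ID below is a placeholder):

# esxcli storage core device vaai status get -d naa.60060160a6b02d00c8a2f5e5e9e0e011

The output lists ATS Status, Clone Status, Zero Status and Delete Status for the device.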

VMFS3 layout - resources (inodes, blocks, sub-blocks etc.) are organized into clusters; each cluster has an associated lock and metadata. Clusters form cluster groups, and cluster groups repeat to make up the filesystem.

ATS Values:
* space locality - ESXi hosts strive for contiguous blocks for the objects they manage
* reduced storage resource contention
* larger VMFS datastore sizes
* higher VM density
* reduced datastore management

ATS Caveats:

* vMotion vs. Storage vMotion, e.g.:

1.) on esxi1 we create 10 VMs which occupy 500GB at the beginning of a 1TB volume
2.) host esxi1 accesses this contiguous area
3.) on esxi2 we create a VM which allocates 100GB on the same 1TB volume, right after the 500GB area accessed by esxi1. Both hosts work fine without any storage contention.
4.) now we vMotion 5 VMs to esxi2
5.) esxi2 gains access to the contiguous space previously accessed by esxi1
6.) now the hosts start contending for resources in different areas of the disk. Space locality is disturbed.

Do vMotion only when it is needed, e.g. on a DRS recommendation, instead of letting multiple hosts access disk regions that other hosts will need, which increases the potential for contention on those resources. In this respect Storage vMotion is preferable to classic vMotion.

It doesn't mean don't use vMotion !!! It means think when you scale out a vSphere cluster !!!

* free capacity (filling up VMFS) - you need free capacity on the datastore; ESXi doesn't like full datastores ;)
* Storage vMotion -> forces ESXi to find contiguous space.

UNMAP primitive:

# esxcli system settings advanced set -i 0 -o /VMFS3/EnableBlockDelete
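
To check the current value of this option before or after changing it (same option path as above), the following should work:

# esxcli system settings advanced list -o /VMFS3/EnableBlockDelete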

# vmkfstools -y X (X = 1-100%, default 60%)

VMware disabled automatic UNMAP - only manual reclaiming is possible.
Reclaiming space => huge impact on performance. The UNMAP operation creates a balloon file equal in size to the space to be de-allocated. The balloon file causes many write I/Os.

Avoid X > 90% !!!

You may not get back the space you expected to get back !
The percentage is derived from the free space available on the VMFS datastore, e.g.:

a 2TB datastore that is 50% full has 1TB free, so vmkfstools -y 50 will attempt to reclaim 50% of that 1TB = 0.5TB (512GB).
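
As a concrete run of the example above (the datastore name is a placeholder) - vmkfstools -y operates on the current directory, so change into the datastore root first:

# cd /vmfs/volumes/datastore1
# vmkfstools -y 50

The balloon file is created inside the datastore for the duration of the operation and removed afterwards.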

VAAI Considerations/Hidden Benefits:

* fewer commands sent to the storage array.

UNMAP is different: the number of commands depends on the array implementation, and it could be more commands than without the UNMAP primitive.

Some VAAI implementations will NOT work if the datastore is not aligned (a quick alignment check follows this list):

* VMFS3 datastores align to 64kB
* VMFS5 datastores align to 1MB
* some arrays may reject non-aligned VAAI operations
* concurrent clone operations:
-- may be throttled by the array at some threshold (current backend processing limit)
-- limit the number of concurrent clone/zero operations to 3 or 4.
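
A quick way to check where a datastore's partition starts (and hence its alignment) is partedUtil; the device ID is a placeholder:

# partedUtil getptbl /vmfs/devices/disks/naa.60060160a6b02d00c8a2f5e5e9e0e011

The start sector of the VMFS partition shows the alignment; datastores created by vSphere 5 start at sector 2048, i.e. 1MB with 512-byte sectors.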

VAAI and array replication - some performance degradation for VAAI.

The Fixed I/O path policy is NOT ALUA-aware and not recommended for ALUA arrays.
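
To see which SATP and path selection policy currently claim a device (and thus whether it is handled as ALUA), the following should work (placeholder device ID):

# esxcli storage nmp device list -d naa.60060160a6b02d00c8a2f5e5e9e0e011

Look for the Storage Array Type line (e.g. VMW_SATP_ALUA) and the Path Selection Policy line.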

Changing MPIO settings for multiple LUNs:

# for i in $(esxcli storage nmp device list | grep ^naa.6001) ; do esxcli storage nmp device set -P VMW_PSP_RR -d $i; done

# for i in $(esxcli storage nmp device list | grep ^naa.6001) ; do esxcli storage nmp psp roundrobin deviceconfig set -t iops -I 1 -d $i; done
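
To confirm that the Round Robin settings took effect on a given device (placeholder ID):

# esxcli storage nmp psp roundrobin deviceconfig get -d naa.60060160a6b02d00c8a2f5e5e9e0e011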


FIXED_AP - ALUA aware.
ESXi5 rolled the FIXED_AP functionality into the FIXED I/O path policy. FIXED_AP is only applicable to ALUA-capable arrays.

Is FIXED in ESXi5 recommended for ALUA arrays?
NO ! for general usage (risk of LUN thrashing). YES ! as a tool for quickly restoring balance in an unbalanced array LUN configuration.
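
If you use FIXED that way, the preferred path is set per device; a sketch with placeholder device and path names:

# esxcli storage nmp device set -P VMW_PSP_FIXED -d naa.60060160a6b02d00c8a2f5e5e9e0e011
# esxcli storage nmp psp fixed deviceconfig set -p vmhba1:C0:T0:L4 -d naa.60060160a6b02d00c8a2f5e5e9e0e011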

Round Robin - a lower IOPS setting is generally better; use a high IOPS value for sequential workloads.
  
 
