Saturday, December 10, 2016

VMFSsparse vs SEsparse

VMFSsparse is a virtual disk format used when a VM snapshot is taken or when linked clones are created off the VM. VMFSsparse is implemented on top of VMFS, and I/Os issued to a snapshot VM are processed by the VMFSsparse layer. VMFSsparse is essentially a redo-log that grows from empty (immediately after a VM snapshot is taken) to the size of its base VMDK (when the entire VMDK has been re-written with new data after the snapshot). This redo-log is just another file in the VMFS namespace, and upon snapshot creation the VMDK attached to the VM is switched from the base VMDK to the newly created sparse VMDK.

Because VMFSsparse is implemented above the VMFS layer, it maintains its own metadata structures in order to address the data blocks contained in the redo-log. The block size of a redo-log is one sector size (512 bytes). Therefore the granularity of read and write from redo-logs can be as small as one sector. When I/O is issued from a VM snapshot, vSphere determines whether the data resides in the base VMDK (if it was never written after a VM snapshot) or if it resides in the redo-log (if it was written after the VM snapshot operation) and the I/O is serviced from the right place. The I/O performance depends on various factors, such as I/O type (read vs. write), whether the data exists in the redo-log or the base VMDK, snapshot level, redo-log size, and type of base VMDK.

I/O type: After a VM snapshot takes place, a read I/O is serviced by either the base VMDK or the redo-log, depending on where the latest data resides. For a write I/O, if it is the first write to the block after the snapshot operation, new blocks are allocated in the redo-log file, and the data is written after the redo-log metadata has been updated to record that the data exists in the redo-log and where it is physically located. If the write I/O is issued to a block that is already present in the redo-log, that block is overwritten with the new data.
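The read and write paths described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class `RedoLogDisk`, its methods, and the use of a Python dict as the redo-log metadata are hypothetical and say nothing about the actual on-disk VMFSsparse format.

```python
SECTOR_SIZE = 512  # redo-log block size stated in the text


class RedoLogDisk:
    """Toy model (hypothetical): a read-only base disk plus a redo-log keyed by sector."""

    def __init__(self, base: bytes):
        if len(base) % SECTOR_SIZE != 0:
            raise ValueError("base size must be a multiple of the sector size")
        self.base = base
        # Assumed metadata structure: sector number -> data written after the snapshot.
        self.redo = {}

    def _check(self, sector: int) -> None:
        if not 0 <= sector < len(self.base) // SECTOR_SIZE:
            raise IndexError("sector out of range")

    def write_sector(self, sector: int, data: bytes) -> bool:
        """Write one sector. Returns True if this was the first write to the
        sector after the snapshot (a new redo-log block was allocated)."""
        self._check(sector)
        if len(data) != SECTOR_SIZE:
            raise ValueError("data must be exactly one sector")
        first_write = sector not in self.redo
        self.redo[sector] = data  # allocate or overwrite in the redo-log
        return first_write

    def read_sector(self, sector: int) -> bytes:
        """Serve the read from the redo-log if present, else from the base disk."""
        self._check(sector)
        if sector in self.redo:
            return self.redo[sector]
        start = sector * SECTOR_SIZE
        return self.base[start:start + SECTOR_SIZE]
```

In this model the base disk is never modified: a sector that has not been written since the snapshot is read from `base`, and every write lands in `redo`.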

Snapshot depth: When a VM snapshot is created for the first time, the snapshot depth is 1. If another snapshot is created for the same VM, the depth becomes 2, and the sparse virtual disks from snapshot depth 1 become the base virtual disks for snapshot depth 2. As the snapshot depth increases, performance decreases because multiple levels of metadata must be traversed to locate the latest version of a data block.
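To see why depth matters, here is a minimal sketch of a lookup that walks a snapshot chain from the newest layer down to the base. The `Layer` class and `lookup` function are hypothetical names used for illustration; the real metadata layout is not described here.

```python
from typing import Dict, Optional, Tuple


class Layer:
    """Hypothetical layer in a snapshot chain: the base disk or one sparse disk."""

    def __init__(self, parent: Optional["Layer"] = None,
                 blocks: Optional[Dict[int, bytes]] = None):
        self.parent = parent              # the layer this one was snapshotted from
        self.blocks = dict(blocks or {})  # blocks written while this layer was on top


def lookup(top: Layer, block: int) -> Tuple[Optional[bytes], int]:
    """Return (data, levels_checked) for a block, searching newest layer first."""
    levels = 0
    layer: Optional[Layer] = top
    while layer is not None:
        levels += 1
        if block in layer.blocks:
            return layer.blocks[block], levels
        layer = layer.parent
    return None, levels  # block was never written in any layer


# Base disk, then two snapshots (depth 2).
base = Layer(blocks={0: b"base"})
snap1 = Layer(parent=base, blocks={1: b"snap1"})
snap2 = Layer(parent=snap1)

print(lookup(snap2, 1))  # found one level down from the top
print(lookup(snap2, 0))  # has to fall through to the base
```

A block that still lives only in the base costs one metadata check per snapshot level before it is found, which is the traversal overhead the paragraph above refers to.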

I/O access pattern and physical location of data: The physical location of data is also a significant factor in snapshot performance. For sequential I/O, having all of the data in a single VMDK file performs better than aggregating data from multiple files, such as the base VMDK and the sparse VMDKs from one or more snapshot levels.

Base VMDK type: The base VMDK type affects the performance of certain I/O operations. After a snapshot, if the base VMDK is in thin format [4] and has not yet fully inflated, a write to an unallocated block in the thin base VMDK leads to two operations: (1) allocate and zero the blocks in the thin base VMDK, and (2) allocate and write the actual data in the snapshot VMDK. Performance degrades in these relatively rare scenarios.

SEsparse

SEsparse is a new virtual disk format that is similar to VMFSsparse (redo-logs) with some enhancements and new functionality. One difference from VMFSsparse is the block size: 4KB for SEsparse compared to 512 bytes for VMFSsparse. Most of the performance aspects of VMFSsparse discussed above (impact of I/O type, snapshot depth, physical location of data, base VMDK type, etc.) apply to the SEsparse format as well.
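The block-size difference can be made concrete with a little arithmetic. The sketch below counts how many blocks of a given size a single I/O touches; `blocks_touched` is a hypothetical helper, and treating "blocks touched" as a rough proxy for metadata work is an assumption, not a statement about how either format is implemented.

```python
VMFSSPARSE_BLOCK = 512   # bytes, from the text
SESPARSE_BLOCK = 4096    # bytes (4KB), from the text


def blocks_touched(offset: int, length: int, block_size: int) -> int:
    """Number of block_size-aligned blocks overlapped by the byte range
    [offset, offset + length)."""
    if length <= 0:
        return 0
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


# An aligned 8KB I/O starting at byte offset 4096.
print(blocks_touched(4096, 8192, VMFSSPARSE_BLOCK))  # 16
print(blocks_touched(4096, 8192, SESPARSE_BLOCK))    # 2
```

The same 8KB request spans 16 blocks at 512-byte granularity but only 2 at 4KB granularity. A small I/O that is not 4KB-aligned can still straddle two 4KB blocks, for example `blocks_touched(3584, 1024, SESPARSE_BLOCK)` is 2.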

In addition to the change in block size, the main distinction of the SEsparse virtual disk format is space efficiency. With support from VMware Tools running in the guest operating system, blocks that are deleted by the guest file system are marked, and commands are issued to the SEsparse layer in the hypervisor to unmap those blocks. This helps reclaim space allocated by SEsparse once the guest operating system has deleted that data. SEsparse also has some optimizations in vSphere 5.5, such as coalescing of I/Os, that improve its performance for certain operations compared to VMFSsparse.
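The space-reclamation idea can be illustrated with a toy model in which unmapping a block releases its allocation. `SparseLayer` and its methods are hypothetical and only mirror the behavior described above; they are not the hypervisor's interface.

```python
BLOCK_SIZE = 4096  # SEsparse block size stated in the text


class SparseLayer:
    """Toy model (hypothetical) of a sparse disk that supports unmap."""

    def __init__(self):
        self.blocks = {}  # block number -> data currently allocated

    def write(self, block: int, data: bytes) -> None:
        self.blocks[block] = data

    def unmap(self, block_numbers) -> int:
        """Release the given blocks; return the number of bytes reclaimed.
        Blocks that were never allocated are ignored."""
        reclaimed = 0
        for block in block_numbers:
            if self.blocks.pop(block, None) is not None:
                reclaimed += BLOCK_SIZE
        return reclaimed

    def allocated_bytes(self) -> int:
        return len(self.blocks) * BLOCK_SIZE


layer = SparseLayer()
for n in range(4):
    layer.write(n, b"x" * BLOCK_SIZE)

# The guest deletes a file that occupied blocks 1 and 2, and the blocks are unmapped.
freed = layer.unmap([1, 2])
print(freed, layer.allocated_bytes())  # 8192 8192
```

Without an unmap path, the layer would stay at four allocated blocks even after the guest deleted the data, which is the behavior of a plain redo-log.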
