
3 Real World vSphere Situations to Avoid

Written by Mike Solinap
Published on January 8, 2014

Almost a year ago, I wrote about 10 Pitfalls that Can Impact VMware Performance. I thought I’d revisit this topic, provide some specific situations I’ve encountered over the past year, and explain what I’ve learned from them. You may have already taken the advice of my previous article, but as we’re all aware, the real world will typically bring us unexpected issues.

Here are 3 real world vSphere situations that you will want to avoid:

1. Poor NFS performance

You’ve got powerful hosts, a fast network, and a decent number of spindles on your storage array, yet your IO performance is horrible. Are your datastores mounted via NFS? If so, that may be the culprit. Don’t get me wrong: after a decade of managing Unix systems as part of our engineering services, I’ve learned that NFS is something we can’t live without. For vSphere, NFS provides several key benefits over the alternative, block storage. Compared to Fibre Channel, there’s no storage area network investment, no LUNs to carve, easier access to individual VMDKs, and simpler troubleshooting (port mirror and fire up Wireshark). The list goes on.

However, there is one big limitation with NFS and vSphere: the NFS client in vSphere appears to issue only synchronous writes, regardless of how your storage array exports the NFS volume.

Why is this problematic? It may or may not be, depending on your storage array. Take a NetApp, for example. Synchronous writes are NOT a problem there, because NetApp arrays use NVRAM: writes are acknowledged as soon as they land in NVRAM, rather than waiting for the spindles to commit the data. A Linux or BSD machine running ZFS with an SSD-backed intent log is a similar situation; control returns to the application before the data reaches the physical disks.

Without the ability to quickly acknowledge the write requests from the vSphere hosts, your overall performance will suffer.
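To get a feel for how much the synchronous path costs when there is no NVRAM or SSD log absorbing the writes, here is a small, standalone Python timing sketch (not vSphere-specific). The test path is a placeholder you would point at a file on the NFS mount in question:

```python
# Rough, standalone comparison of buffered vs. synchronous write latency.
# TEST_PATH is a placeholder -- point it at a file on the NFS mount in question.
import os
import time

TEST_PATH = "/mnt/nfs-datastore/synctest.bin"
BLOCK = b"\0" * 4096      # one 4 KiB block per write
COUNT = 1000              # number of writes per run

def timed_writes(extra_flags):
    """Write COUNT blocks with the given extra open() flags; return elapsed seconds."""
    fd = os.open(TEST_PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags, 0o644)
    start = time.time()
    for _ in range(COUNT):
        os.write(fd, BLOCK)
    os.close(fd)
    return time.time() - start

buffered = timed_writes(0)             # acknowledged from the client-side cache
synchronous = timed_writes(os.O_SYNC)  # each write waits for stable storage
os.remove(TEST_PATH)

print(f"buffered:    {buffered:.3f}s")
print(f"synchronous: {synchronous:.3f}s ({synchronous / max(buffered, 1e-6):.0f}x slower)")
```

On an array with a fast acknowledgement path, the two numbers stay close; on plain spindles, the synchronous run falls off a cliff, which is exactly what your VMs experience.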

2. Snapshots Are Not Free

The beauty of virtualizing a machine is that we can take snapshots of the running state and revert back to them if needed. This comes in extremely handy for testing software or configuration changes, or as an easy fallback when migrating to production as part of your build and release management plan.

Unfortunately, this benefit isn’t “free”. When you initiate a snapshot of a virtual machine, vSphere stops writing changes to the original VMDK file. A new delta file is created, and subsequent changes are appended to it for the life of the snapshot. Chances are, we forget that we created the snapshot in the first place, and that can have some obvious, and some not-so-obvious, repercussions.

Obviously, as the snapshot ages, the delta file will grow. And grow. And grow. Even though you may think your virtual machine is mostly “idle”, things constantly get written to system logs, periodic tasks run, OS updates get applied, and so on. If your datastores are accessible via NFS (hint hint!), run a find for all delta files and see how much space they’ve accumulated. You may be surprised.
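If you’d rather not cobble together a find command, here is a minimal Python sketch that does the same thing: walk a datastore mount point and total up the snapshot delta files. The mount path is a placeholder for your own environment:

```python
# Walk an NFS-mounted datastore and total up snapshot delta files.
# DATASTORE_ROOT is a placeholder -- point it at your own mount.
import os

DATASTORE_ROOT = "/mnt/nfs-datastore"

total = 0
for dirpath, _dirnames, filenames in os.walk(DATASTORE_ROOT):
    for name in filenames:
        # Snapshot redo logs are usually named *-delta.vmdk (or *-sesparse.vmdk)
        if name.endswith("-delta.vmdk") or name.endswith("-sesparse.vmdk"):
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            total += size
            print(f"{size / 2**30:8.2f} GiB  {path}")

print(f"\nTotal snapshot delta space: {total / 2**30:.2f} GiB")
```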

Growing snapshots present another issue. At some point, you will want to commit (delete) them. If your snapshots have grown significantly, committing them will generate a HUGE amount of IO. Delete them and cross your fingers. I’ve encountered a situation where deleting a large snapshot generated enough IO that vSphere timed out the operation. The result: a corrupted VM, a saturated network, and a bogged-down storage array.
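To catch snapshots before they get to that point, it helps to inventory them regularly. Here is a minimal sketch using the pyVmomi SDK that lists every VM still carrying snapshots and how old each one is. The vCenter hostname and credentials are placeholders, and the SSL handling is simplified (older pyVmomi releases expect an explicit sslContext instead):

```python
# List every VM that still has snapshots, with snapshot name and age in days.
# Minimal pyVmomi sketch -- host/credentials are placeholders.
from datetime import datetime, timezone
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", disableSslCertValidation=True)
content = si.RetrieveContent()

def walk(tree, depth=0):
    """Recurse through a snapshot tree, printing each snapshot and its age."""
    for snap in tree:
        age_days = (datetime.now(timezone.utc) - snap.createTime).days
        print(f"    {'  ' * depth}{snap.name}: {age_days} days old")
        walk(snap.childSnapshotList, depth + 1)

view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
for vm in view.view:
    if vm.snapshot is not None:
        print(vm.name)
        walk(vm.snapshot.rootSnapshotList)
view.Destroy()
Disconnect(si)
```

Run it from a cron job and anything older than a couple of weeks is a good candidate for committing while it is still small.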

3. Under Utilized Memory

If you have an abundance of hardware and deep pockets, feel free to skip this section. Otherwise, you’re likely in the same situation as the rest of us: limited budgets and hardware sorely in need of upgrades.

You may be tempted to splurge on some additional RAM since it’s typically the lowest-hanging fruit. Or is it? What if your host has no free RAM slots left? With some simple analysis, you can determine whether additional RAM is needed at all.

Even if your host reports memory utilization at 80%, it may not actually need any more. vSphere employs several techniques to get the most out of the host (a quick way to check these counters per VM follows the list below):

  • Memory deduplication (Transparent Page Sharing): vSphere looks for identical memory pages and keeps only one copy. Since a host runs multiple similar machines running similar applications, there’s good potential for memory savings.
  • Ballooning: This is similar in spirit to the “swappiness” behavior of a Linux machine: if pages aren’t being accessed often, or at all, page them out to disk so the memory can be freed. vSphere goes a step further and forces this from the outside with a balloon driver. Installed with VMware Tools, the driver inflates inside the guest and starts consuming memory, which forces the guest OS to decide for itself what is best paged out to disk; the reclaimed memory can then go to other VMs.
  • Memory compression: vSphere checks whether a memory page can be compressed; if it can be shrunk by more than 50%, it is compressed rather than swapped. Compressing a page is still far cheaper than swapping it to disk.
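To see whether these techniques are already working hard on your behalf, the per-VM quickStats expose shared, ballooned, compressed, and swapped memory counters. Here is a rough pyVmomi sketch that prints them; the connection details are placeholders, and as I recall the values are reported in MB except compressedMemory, which comes back in KB:

```python
# Per-VM memory reclamation counters via pyVmomi quickStats.
# Connection details are placeholders; fields come from vim.vm.Summary.QuickStats
# (values in MB, except compressedMemory, which is in KB).
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", disableSslCertValidation=True)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)

print(f"{'VM':30}{'shared MB':>10}{'balloon MB':>12}{'compressed MB':>15}{'swapped MB':>12}")
for vm in view.view:
    q = vm.summary.quickStats
    print(f"{vm.name:30}{q.sharedMemory or 0:10}{q.balloonedMemory or 0:12}"
          f"{(q.compressedMemory or 0) // 1024:15}{q.swappedMemory or 0:12}")

view.Destroy()
Disconnect(si)
```

Non-zero ballooned or swapped figures across the board are the real signal that the host is under memory pressure, not the raw utilization percentage.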

If, after looking at the performance metrics for each of these techniques, the guest is still swapping to disk, then you likely do need more RAM. However, if new RAM is out of the question for whatever reason, vSphere can use an SSD as host swap space. With SSDs dropping in price month after month, this may be the biggest bang for your buck. And if you are licensed for Enterprise Plus, the SSD can also serve as a read cache, reducing IO on your storage array even further.

Hopefully you’re fortunate enough to avoid situations like these. Do you have any unique situations you’d like to share, or feedback on the ones I’ve mentioned? I’d like to hear them!


Michael Solinap
Sr. Systems Integrator
SPK & Associates
