Monday 12 October 2015

For those in the APJ region who couldn't make it to VMworld 2015


Hey guys,

I believe there are plenty of you who, for a billion different reasons, couldn't make it to VMworld 2015 in the US or Europe. However, you still have a chance to visit one of the regional "VMworlds". For the APJ region it is obviously the vForum 2015 event, which is held in Sydney, Australia on 21-22 October 2015.

In a way this event will be even more useful than any of the VMworlds, as it is a kind of wrap-up of all the VMworld 2015 news and highlights. Moreover, you don't have to take a 20-hour flight from Sydney to San Francisco or Barcelona. :)

It is a two-day event: the agenda of the first day covers business challenges and opportunities, while Day 2 focuses mostly on technical deep dives, which are my favourite type of VMware session. My personal picks would be the "Implementing a Network Virtualisation Strategy" session, which is a very hot topic nowadays, and the "Managing Your Data Centre Efficiently" session, which helps you improve your skillset in the never-ending challenge of datacentre optimisation.

Access to the first day is absolutely free - all you have to do is register here.

If you want to attend the technical deep dive sessions, you can get the All Access pass for about $600, and it comes with excellent benefits:

  • Complimentary entry to Airwatch Connect (valued at $275 per person) 
  • Up to 50% off VMware certifications* 
  • $600 in vCloud Air service credits for free 
  • Exclusive access to in-depth training. Choose two three-hour deep dive sessions and receive a certificate of completion/attendance
*Need to register for the exam within 2 weeks of vForum and sit for the exam before December 18th 2015.

On top of these benefits, it is a great chance for networking with industry experts.

To get more information on the event, follow this link and book the sessions you like.

Saturday 26 September 2015

An ESXi killer feature is under threat


I know this news has been around for a while, but I have only just learnt about it.

So, apparently Transparent Page Sharing (TPS) is disabled by default now. Here is the list of patches and ESXi builds where TPS was disabled:

  • ESXi 5.0 Patch ESXi500-201502001, released on February 26, 2015 
  • ESXi 5.1 Update 3 released on December 4, 2014 
  • ESXi 5.5, Patch ESXi550-201501001, released on January 27, 2015
  • ESXi 6.0 
This has been one of my favourite ESXi features, and I have always taken advantage of it. Even after Nehalem CPUs were released and Large Pages made TPS all but useless, I still preferred to disable Large Pages to have a better understanding of memory usage on my systems. Some VMware white papers stated that Large Pages give about a 15% to 20% CPU performance increase, but I could never reproduce those results in my environments.
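
For reference, here is a rough sketch of how Large Pages can be disabled host-wide from the ESXi shell (the same Mem.AllocGuestLargePage option is also available under Advanced Settings in the vSphere Client):

# Stop backing guest memory with large pages so TPS has small pages to work with
esxcli system settings advanced set -o /Mem/AllocGuestLargePage -i 0

# Verify the current value
esxcli system settings advanced list -o /Mem/AllocGuestLargePage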


So, why did VMware make this decision?

According to KB2080735, some academic researchers "have demonstrated that by forcing a flush and reload of cache memory, it is possible to measure memory timings to try and determine an AES encryption key in use on another virtual machine running on the same physical processor of the host server if Transparent Page Sharing is enabled between the two virtual machines". Sounds pretty dangerous, huh?

However, VMware then says that "VMware believes information being disclosed in real world conditions is unrealistic" and "This technique works only in a highly controlled system configured in a non-standard way that VMware believes would not be recreated in a production environment."

I understand that VMware prefers the "better safe than sorry" approach, and that is fair enough given that the reputation damage would be huge if this flaw were ever exploited in a real production environment.


What exactly was changed and how?

TPS is disabled only for inter-VM memory sharing. Memory pages within a single VM are still shared, though this provides significantly smaller savings from memory deduplication.

To be more specific, the memory sharing feature is not actually disabled. VMware introduced the so-called salting concept, which lets an ESXi host deduplicate two identical memory pages in different virtual machines only when their salt values are the same.

This new concept is enforced using the new configuration setting Mem.ShareForceSalting=1. Setting this option to 0 disables the salting requirement and allows inter-VM memory sharing to work as it did before the security patches.
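
Here is a quick sketch of checking and changing this option from the ESXi shell (the same setting is available under Advanced Settings in the vSphere Client):

# Check the current salting mode on the host
esxcli system settings advanced list -o /Mem/ShareForceSalting

# Revert to the pre-patch behaviour, i.e. allow inter-VM page sharing for all VMs
esxcli system settings advanced set -o /Mem/ShareForceSalting -i 0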

If you want to specify a salt value per VM, here are the steps from VMware KB2091682 (a quick command-line sketch follows the list):

  1. Log in to ESXi or vCenter with the VI Client.
  2. Select the relevant ESXi host.
  3. In the Configuration tab, click Advanced Settings under the Software section.
  4. In the Advanced Settings window, click Mem.
  5. Look for Mem.ShareForceSalting and set the value to 1.
  6. Click OK.
  7. Power off the VM for which you want to set the salt value.
  8. Right-click the VM and click Edit Settings.
  9. Select the Options tab and click General under the Advanced section.
  10. Click Configuration Parameters…
  11. Click Add Row; a new row will be added.
  12. On the left side add the text sched.mem.pshare.salt and on the right side specify a unique string.
  13. Power on the VM for the salting to take effect.
  14. Repeat steps 7 to 13 to set the salt value for individual VMs.
  15. The same salt value can be specified on multiple VMs to allow page sharing across them.
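
The same per-VM salt can also be added from the ESXi shell. This is just a rough sketch - the datastore path, VM name and salt string below are made-up examples, and the VM must be powered off first:

# Hypothetical VM path and salt value - adjust for your environment
VMX=/vmfs/volumes/datastore1/ClusterNode1/ClusterNode1.vmx

# Append the per-VM salt; only VMs configured with the same salt can share pages with each other
echo 'sched.mem.pshare.salt = "tenantA"' >> "$VMX"

# Find the VM id and reload its configuration so the host picks up the new parameter
vim-cmd vmsvc/getallvms | grep ClusterNode1
vim-cmd vmsvc/reload <vmid>   # replace <vmid> with the id returned by the previous command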

What impact may it have on your environment?

If you take advantage of TPS to overprovision your environment and your performance stats show that the assigned virtual memory is larger than your physical memory, be really careful and make a decision on TPS before you update your hosts.

Otherwise you risk seeing all the other VMware memory management features in action - ballooning, compression and swapping. These are definitely pretty cool features, but you don't want to see them in your production environment.
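
A rough way to see how much you currently rely on page sharing before patching, assuming ESXi shell access (in interactive esxtop the same numbers sit on the memory screen under PSHARE):

# Physical memory installed in the host
esxcli hardware memory get

# One-shot esxtop capture in batch mode; the memory counters include the
# page sharing savings, i.e. roughly what you stand to lose after the patch
esxtop -b -n 1 > /tmp/esxtop-snapshot.csv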


What should I do now?

I am not an IT security guy, but as far as I understand, this security risk mostly applies to multitenant environments where virtual machines belong to different companies. It can also be a risk where the security requirements for the vSphere farm are significantly higher, e.g. in the banking and defence industries. So you should probably check your security policies before re-enabling TPS.

However, for most other companies re-enabling TPS doesn't seem to be a big issue, in my opinion. Just make sure it is an educated choice.

Monday 21 September 2015

A use case for Route based on source MAC hash load balancing

It is a big pleasure to work with experienced clients and there is always something to learn from them.

We all know the main load balancing options - based on source port, source MAC, IP hash, etc.
Very often people stick with load balancing based on source port, just because it provides sufficient distribution of the traffic across all physical NICs assigned to the port group and doesn't require any configuration on the physical switch.

As far as I knew, source MAC address load balancing does exactly the same thing, but burns extra CPU cycles computing the MAC address hash, so there seemed to be no point in using it.

However, as I learnt today, there is a significant difference in load balancing behaviour between the two methods mentioned above when a VM has more than one virtual NIC.

So, when source port load balancing is used, the ESXi switch uses the port ID of the VM's first virtual NIC to pick the uplink, and the same port ID (and hence the same uplink) is applied to the traffic sent and received by all the other virtual NICs of that VM.

However, with source MAC address load balancing the uplink is selected using the MAC address of each of the VM's virtual NICs, so each virtual NIC can end up on a different uplink.
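
For reference, here is a sketch of switching a standard vSwitch port group between the two policies from the ESXi shell ("VM Network" is just an example port group name):

# Show the current teaming policy of the port group
esxcli network vswitch standard portgroup policy failover get -p "VM Network"

# Route based on originating virtual port ID
esxcli network vswitch standard portgroup policy failover set -p "VM Network" -l portid

# Route based on source MAC hash
esxcli network vswitch standard portgroup policy failover set -p "VM Network" -l mac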

That's not a very common use case, but it is still good to learn a use case for a feature nobody pays much attention to.

PS I haven't tested whether this is true myself yet, but I definitely will once I get access to my home lab.

Tuesday 15 September 2015

Windows Failover Cluster Migration with vSphere Replication


Recently I was helping a colleague of mine with the migration of one of our clients to a new datacenter. Most of the VMs were planned to be migrated using vSphere Replication. However, the customer couldn't decide how to migrate its numerous Windows Failover Clusters (WFC) with physical RDM disks, for the simple reason that vSphere Replication doesn't support replication of RDM disks in physical compatibility mode.

Yes, you can still replicate a virtual RDM disk, but it will be automatically converted to a VMDK file at the destination, so you won't be able to use cross-host WFC. At least, that's what I thought before I found this excellent article, which contains a very interesting note:

"If you wish to maintain the use of a virtual RDM at the target location, it is possible to create a virtual RDM at the target location using the same size LUN, unregister (not delete from disk) the virtual machine to which the virtual RDM is attached from the vCenter Server inventory, and then use that virtual RDM as a seed for replication."

That's when I realised it should still be possible to use vSphere Replication to move a WFC to another datacenter with zero impact on clustered services and zero changes at the OS/application level.


Sunday 6 September 2015

ESXi and Guest VM time sync - learning from a mistake

Today I was browsing some interesting blogs while getting ready for the VCAP5 exam and stumbled upon an excellent post about time syncing in guest VMs.

The most interesting part of the post for me was the following:

"Even if you have your guests configured NOT to do periodic time syncs with VMware Tools, it will still force NTP to sync to the host on snapshot operations, suspend/resume, or vMotion."

That was a pretty big surprise for me, as I have always had all my VMs synced with NTP at the OS level.
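
If I ever decide to hand timekeeping over to NTP completely, these are the .vmx options commonly quoted for disabling the event-driven sync as well (a sketch only - worth double-checking against the VMware KB for your Tools version):

tools.syncTime = "0"
time.synchronize.continue = "0"
time.synchronize.restore = "0"
time.synchronize.resume.disk = "0"
time.synchronize.shrink = "0"
time.synchronize.tools.startup = "0"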

Monday 31 August 2015

Another budget vSphere home lab post – Part 2


The first part of this post can be found here.

Finally I got some time to write a second part about my home lab.

We'll start with some pictures of the build process, but if you find them boring just scroll down to the next section.





I just have to mention that the assembly process was very simple and straightforward, even for a guy who built his last PC about 15 years ago. It took me only a couple of hours to get both servers powered on.

Saturday 15 August 2015

Preparing a RHEL/CentOS 7 VM template for vRealize Automation, Site Recovery Manager or plain Guest Customization

Yesterday I was deploying my first vRA 6.2 in my home lab, and the first blueprint I thought I would build was a Linux CentOS one. The reason I chose CentOS is pretty simple - this OS has been used in some of the VMware Hands-on Labs I played with recently.

So I downloaded the CentOS 7 Minimal install ISO and thought I would have my first blueprint ready to deploy in no time. However, it took me 3 hours to figure out how to properly prepare a Linux VM for Guest OS customization. The same customization is also used by Site Recovery Manager to customize VMs after failover and by vRealize Automation when provisioning VMs.

I am gonna be working with vRA and blueprints for a while, so I thought I would document all the steps of the process and share them.

It is pretty simple so there will be no screenshots.


1. Enable networking (I think I missed that step in the GUI installation wizard)

run

vi /etc/sysconfig/network-scripts/ifcfg-enXXXXX

and change this line as follows:

ONBOOT=yes


Alternatively, you can enable the Ethernet card using the nmtui command.


2. Install Perl (it is also needed for customization and doesn't come with Minimal CentOS)

yum install perl gcc make kernel-headers kernel-devel -y
Check that it is installed with the whereis perl command.

3. Install open-vm-tools - yep, not the standard VMware Tools, but the open source VM tools.

yum install open-vm-tools

4. Install the deployPkg Tools Plug-in. 


This is an important addition to open-vm-tools; it is responsible for the actual customization of the Linux VM.

Create a repository file vmware-tools.repo in /etc/yum.repos.d/ and add the following lines to it.

[vmware-tool]
name = VMware Tools
baseurl = http://packages.vmware.com/packages/rhel7/x86_64/
enabled = 1
gpgcheck = 0


And then run 


 yum install open-vm-tools-deploypkg


5. Change the release name 

This helps vSphere recognise the CentOS guest as RHEL and do the customization properly.
If you don't run this command, the customization will be applied to the wrong files, e.g. you will end up with an eth0 file containing the IP address you set up in the customization profile, instead of the actual NIC file (e.g. eno16780032) being updated.


rm -f /etc/redhat-release && touch /etc/redhat-release && echo "Red Hat Enterprise Linux Server release 7.0 (Maipo)" > /etc/redhat-release


6. Run a preparation script that deletes unique and temporary data.

#!/bin/bash
#clean yum cache
/usr/bin/yum clean all
#remove udev hardware rules
/bin/rm -f /etc/udev/rules.d/70*
#remove nic mac addr and uuid from ifcfg scripts
/bin/sed -i '/^\(HWADDR\|UUID\)=/d' /etc/sysconfig/network-scripts/ifcfg-eth0
#remove host keys (important step security wise. similar to system GUID in Windows)
/bin/rm -f /etc/ssh/*key*
#engage logrotate to shrink logspace used
/usr/sbin/logrotate -f /etc/logrotate.conf
#and lets shutdown
init 0 
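
Once a VM has been deployed from this template with a customization spec, a quick sanity check inside the guest looks something like this (the log path is the one I have seen used for Linux guest customization; it may differ between Tools versions):

# Check that the customization actually ran
cat /var/log/vmware-imc/toolsDeployPkg.log

# Confirm the real NIC file (not eth0) received the IP address from the spec
cat /etc/sysconfig/network-scripts/ifcfg-en*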


Here are some links I used when compiling this short guide - thanks a lot, guys.


http://serverfault.com/questions/653052/from-vsphere-5-5-deploying-centos-7-from-template-ignores-customizations

https://lonesysadmin.net/2013/03/26/preparing-linux-template-vms/

http://www.boche.net/blog/index.php/2015/08/09/rhel-7-open-vm-tools-and-guest-customization/


Update

I tried to re-create another Linux template today using the steps in this procedure, and for some reason deploying a VM from vRA was failing during the guest customisation. However, when I cloned the VM manually from vCenter and used the same customisation, everything went fine.

Then I repointed the vRA blueprint to the freshly cloned VM, and now I am able to deploy VMs from vRA.

I have no idea what caused the problem... :(

Wednesday 5 August 2015

Another budget vSphere home lab post - Part 1


A little more than a year ago I bought a top-spec MacBook Pro (i7, 16 GB, 512 GB), totally sure it would be sufficient to run a simple nested vSphere lab. Honestly speaking, I didn't think through the requirements and what exactly I wanted to run on it - I just wanted a new toy. When the vSphere 6 beta was released, it was a big surprise for me that 16 GB of RAM is only just enough to meet the minimum requirements for 2 hosts and vCenter, not to mention attempts to fit a vSphere Horizon or vCloud Director setup into my nested lab. I used one quite powerful server at work for a while, but the good times passed very quickly and it was definitely time to come up with a new permanent solution.

Tuesday 13 January 2015

First experience with VMware vCloud Air

This was my first experience with vCloud Air, and I am afraid I can't call it pleasant.

I applied for Early Access program back in November 2014 and VMware promised $1000 credit on my account.

A couple of weeks later I received an invitation from VMware to this program and went through the registration forms, filling in all the details, including my credit card number. Once the registration was finished I received another email from VMware which contained generic VMware links and confirmation that my account had been successfully created, but no links or instructions on how to use vCloud Air.

When I checked my account I noticed that the status of the VMware vCloud Air service was "provision pending". I contacted VMware Support and was told that I needed to wait a couple of days as it takes time to get the service arranged.

I waited for two weeks, but the service stayed in the same "provision pending" status.
I opened another support ticket and we started the usual email ping-pong game. The case was finally escalated, but it took several emails and direct contact with the support manager to get things going.

45 days after I had applied for this service I was finally given the root cause of the problem, but not the fix. Apparently, VMware couldn't provision vCloud Air for me because my currency at the registration step was set to Australian dollars, which seemed pretty logical to me given that I live in Australia.

Another 2 weeks later VMware managed to fix the problem and provided access to vCloud Air.

The first bug I noticed was that I couldn't see the $1000 credit in my account. Another email to support, and I was told that it was a temporary issue that would be fixed soon.

The very next thing I did was deploy a new VM from one of the preconfigured templates in the public catalog.
I chose W2012 R2, but once it was powered on I couldn't log in using the password I found in the Customization section of the VM properties.

I thought I might have done something wrong. I killed the VM and deployed a new one, this time manually setting the password for the VM. Well, that didn't work either, but this time I noticed that the VM didn't reboot as it was supposed to once the customization was over. I also tried the "Power on and force customization" option on the vApp, but that attempt failed too - the VM didn't want to reboot after the first power on.

I gave up trying to solve this problem myself and contacted VMware support for the 4th time. It turned out it was another bug. When you deploy a new Windows VM from a VMware catalog it is provisioned with an E1000 NIC, which has a driver conflict. This prevents the system customization from completing, and thus the admin password is never set.

Here is the VMware KB which explains how to resolve the issue - http://kb.vmware.com/kb/2098020

Lastly, I was going to test VM performance with a simple Excel macro. I had already tested it on AWS and Azure VMs and in our hosted environment. Unfortunately, none of the cloud providers were able to get close to the results I got on my vSphere farm, and vCloud Air was my last hope.
However, the results were disappointing - Amazon and Microsoft showed way better results.

Not much fun, is it?

I really hope all these problems will be fixed soon and that the cheap and old-fashioned GUI will be replaced with a decent portal.