Intro
Before you begin reading, I want to warn you that running Hyper-V and Azure Stack in a nested hypervisor setup on VMware is not supported in any way by Microsoft. However, Microsoft is open to feedback and experience reports on nested virtualization, which is why I’m sharing my findings and the tweaks I needed to get TP2 going in a nested hypervisor setup on VMware.
I deployed Azure Stack on VMware ESXi 6.0.0 (4192238) running in a virtual datacenter (vCloud Director), which creates multiple inceptions: a hypervisor (Hyper-V) on a hypervisor (ESXi); software-defined storage virtualization (S2D) on software-defined storage virtualization (VSAN); software-defined network virtualization (NC/Switch) on software-defined network virtualization (vSphere/NSX); and a virtual datacenter (Azure Stack) on a virtual datacenter (vCloud).
A nested hypervisor scenario like this also allows you to run a Windows container in a VM on a hypervisor (Hyper-V) nested in another hypervisor (VMware ESXi), spanning three virtualization layers. Pretty neat, but these kinds of nested scenarios are certainly not meant or ready for production environments yet. Neither Microsoft nor any other vendor will support it, for obvious reasons: mixing all these unaligned, cutting-edge virtualized components at this premature stage of nesting means trouble. Still, I’m convinced that this will be the way to test your VMware or Hyper-V PoC environments, and that nested hypervisor setups will eventually become trustworthy enough for production workloads. Microsoft already started the nested hypervisor revolution in production by releasing Hyper-V Containers as part of ‘Windows Server 2016’. Hyper-V Containers make use of a nested Hyper-V hypervisor setup and are officially supported by Microsoft and Docker.
VMware tweaks
I followed the Azure Stack deployment guide, booted the ‘cloudbuilder.vhdx’ and installed VMware Tools. I immediately created a dummy VM and tried to start it to see whether Hyper-V was working as expected. Unfortunately, it wasn’t. Looking at the startup events in the system event log, I spotted the usual hypervisor initialization sequence and found the following message.
Source: ‘Hyper-V-Hypervisor’
Message: ‘Hypervisor launch failed; Processor does not support the minimum features required to run the hypervisor (MSR index 0x48F, allowed bits 0x2BEFFF0003EFFF, required bits 0x33FFFF00036DFB).’
The exposed virtual processor doesn’t have the virtualization features that Windows Server 2016 Hyper-V requires. Luckily, I could solve this by upgrading the VM from hardware version 10 to 11.
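If you want to see what the error is actually complaining about, you can compare the two masks from the event message. The snippet below is only a minimal Python sketch. It assumes that every bit set in ‘required bits’ must also be set in ‘allowed bits’, which is my reading of the message rather than documented behavior, and the function name ‘missing_bits’ is made up for illustration.

```python
def missing_bits(required: int, allowed: int) -> list[int]:
    """Return the positions of bits that are set in `required` but not in `allowed`."""
    missing = required & ~allowed
    return [bit for bit in range(missing.bit_length()) if (missing >> bit) & 1]


# Values copied from the 'Hypervisor launch failed' event message.
ALLOWED = 0x2BEFFF0003EFFF
REQUIRED = 0x33FFFF00036DFB

print(missing_bits(REQUIRED, ALLOWED))  # [44, 52]
```

Under that assumption, two of the bits Hyper-V asks for (44 and 52) are not offered by the virtual CPU of a hardware version 10 VM, which fits with the launch failure.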
Also make sure that hardware-assisted CPU virtualization is exposed to the guest OS. See the hardware virtualization settings in the screenshot below.
Running hardware version 11 with the right virtualization settings should give you a ‘Hypervisor successfully started.’ message in the system event log. Create a dummy VM in Hyper-V Manager and try to start it. If it’s up and running, then congratulations: inception part one succeeded, and you’re running a nested Hyper-V hypervisor on VMware.
I also recommend creating the VM with the VMXNET3 network adapter. Any other adapter (certainly an emulated one) is unstable and won’t work nicely with the network virtualization on top.
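The VM-level settings described so far (hardware version, exposed hardware virtualization and the VMXNET3 adapter) can also be checked in the VM’s .vmx file. The following is a rough Python sketch, not an official tool. It assumes the usual .vmx keys ‘virtualHW.version’, ‘vhv.enable’ and ‘ethernet0.virtualDev’, and that the first network adapter is the one that matters; the function names are made up for illustration.

```python
def parse_vmx(text: str) -> dict[str, str]:
    """Parse 'key = "value"' lines of a .vmx file into a dict (keys lowercased)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


def nested_hyperv_problems(text: str) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    settings = parse_vmx(text)
    problems = []
    # Hardware version 11 or later is what fixed the launch failure in this post.
    version = settings.get("virtualhw.version", "0")
    if not version.isdigit() or int(version) < 11:
        problems.append(f"virtualHW.version is {version}, expected 11 or higher")
    # Assumption: vhv.enable is the key behind the 'expose hardware-assisted
    # virtualization to the guest OS' setting.
    if settings.get("vhv.enable", "FALSE").upper() != "TRUE":
        problems.append("vhv.enable is not TRUE")
    # Assumption: only the first adapter (ethernet0) is checked.
    if settings.get("ethernet0.virtualdev", "").lower() != "vmxnet3":
        problems.append("ethernet0 is not a VMXNET3 adapter")
    return problems


# Hypothetical .vmx fragment of a VM that still needs two of the tweaks.
example = '''
virtualHW.version = "10"
vhv.enable = "TRUE"
ethernet0.virtualDev = "e1000"
'''

for problem in nested_hyperv_problems(example):
    print(problem)
```

For the example fragment this reports the old hardware version and the emulated adapter. The port group security settings are not part of the .vmx file, so they are not covered by this check.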
Since the Azure Stack infrastructure VMs use NAT to connect to the outside world, MAS-BGPNAT01 has to be able to reach the assigned gateway. Just as you enable ‘MAC spoofing’ on the Hyper-V VM network adapter when nesting Hyper-V in Hyper-V, you have to do the equivalent on VMware to allow guest network traffic to go outside. The adapter or port on the virtual switch must let through MAC addresses other than the host’s own MAC address. To change this on VMware, open vSphere, go to the ‘distributed port group’ associated with the VM, choose edit, go to the security tab and accept ‘promiscuous mode’, ‘MAC address changes’ and ‘forged transmits’.
The VMware processor and network virtualization settings are now in place, so let’s start with the actual Azure Stack deployment.
Azure Stack tweaks
I came across two problems when running the Azure Stack PowerShell deployment script ‘.\InstallAzureStackPOC.ps1’. The Azure Stack tests reported no cores and no supported virtualization features for my virtual VMware CPUs.
It also checked if I was running a physical machine or not and failed when it detected the VM.
I dived into the PowerShell script folders and quickly found and changed the two lines below in ‘C:\CloudDeployment\Roles\PhysicalMachines\Tests\BareMetal.Tests.ps1’.
VM : Line 376: Change ‘Should Be $false’ TO ‘Should Be $true’
Cores : Line 453: Change ‘Should Not BeLessThan $minimumNumberOfCoresPerMachine’ TO ‘Should Not BeLessThan 0’
I reran the setup with the ‘-rerun’ parameter and the validation tests continued without a problem. I encountered several errors but was able to continue the installation using ‘Invoke-EceAction -RolePath Cloud -ActionType Deployment -Verbose -Start NUMBER’, where NUMBER is the step at which the deployment halted. You can find it in the ‘Invocation of step NUMBER failed.’ error message; it looks like ‘60.61’ or ‘60.61.100’. If restarting from that step doesn’t go through, try going back to 60 and see if that works. For more information about rerunning the deployment, see https://azure.microsoft.com/en-us/documentation/articles/azure-stack-rerun-deploy/
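The ‘go back one level’ approach can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the Azure Stack tooling: the function names are made up, and the script just prints the ‘Invoke-EceAction’ command lines to try in order, starting from the failed step and ending at its top-level step.

```python
import re
from typing import Optional


def failed_step(message: str) -> Optional[str]:
    """Extract NUMBER from an 'Invocation of step NUMBER failed.' message."""
    match = re.search(r"Invocation of step (\d+(?:\.\d+)*) failed", message)
    return match.group(1) if match else None


def restart_candidates(step: str) -> list[str]:
    """List the step itself, then each parent step, e.g. 60.61.100 -> 60.61 -> 60."""
    parts = step.split(".")
    return [".".join(parts[:n]) for n in range(len(parts), 0, -1)]


def rerun_commands(message: str) -> list[str]:
    """Build the command lines to try, most specific step first."""
    step = failed_step(message)
    if step is None:
        return []
    return [
        f"Invoke-EceAction -RolePath Cloud -ActionType Deployment -Verbose -Start {candidate}"
        for candidate in restart_candidates(step)
    ]


for command in rerun_commands("Invocation of step 60.61.100 failed."):
    print(command)
```

For the message in the example, the commands end in ‘-Start 60.61.100’, ‘-Start 60.61’ and ‘-Start 60’, in that order.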
That is my experience so far with deploying TP2 in a nested hypervisor setup on VMware. I still have to try different networking scenarios and see whether the system remains stable. I will update this blog with my findings.
Happy Stacking,
Ruud
Installing Azure Stack on VMware Workstation? Read Gisli Gudmundsson’s blog here.
Update: I’ve been running the same setup for three weeks now. The host remains stable, no issues there. However, I do experience occasional crashes and reboots of the infrastructure VMs. Luckily it doesn’t happen that often, and the Azure Stack services and VMs always recover very fast and without a problem, which is quite amazing. Kudos to the team for that! I tried every Hyper-V and network setting in the book to resolve this issue, but unfortunately without success. I suspect it has something to do with the prerelease code of ‘Windows Server 2016’ that Azure Stack TP2 uses (which you can’t update with a CU) in combination with a nested VMware hypervisor setup. Please reach out to me if you managed to resolve these issues, so I can update my setup and this blog with your findings.