Recently there was a YouTube video with AMD and Red Hat showing a live migration from Intel to AMD processors. This is something that often comes up in customer meetings. Customers don’t want to be locked into a particular processor vendor as they continue to grow their virtualization farm. Personally, I can’t argue much with that. I also think the video was pretty well done. A bunch of people emailed me and asked why VMware hadn’t come up with this. Well, it’s because it can be pretty dangerous in current environments. Let me explain.
What Was Shown
Red Hat and AMD got together and produced a video of a live migration going between an AMD host and an Intel host. VMware (and other vendors) have prevented customers from doing this for a long time and still don’t condone the activity today. Intel also doesn’t condone doing this. Red Hat and AMD see this as an opportunity to gain a competitive advantage and introduce this technique to market, so they produced the video and started to distribute collateral around the technique. The only downside is that they do nothing to tell customers what they are doing to get this to work and why it might be dangerous to an environment. It’s believed that the companies used CPUID masking to mask out incompatible feature sets in the processors in order to move the VMs around. If so, then pay close attention to point number 2 below in the “Cross Platform with VMware” section. If there’s anyone from AMD or Red Hat reading who cares to share what they did, then please leave a comment or email me directly.
The Problem
Ever since VMware created VMotion (live migration) nearly 6 years ago, they’ve worked very hard to make sure customers don’t shoot themselves in the foot. VMware does this through processor checks. The tricky part of live migration has to do with what the hypervisor can actually see and manipulate. There are certain instructions that run in kernel mode and do get seen by the hypervisor, so various techniques such as binary translation or paravirtualization can be used to deal with these instructions. The hypervisor could also mask these operations out or emulate them to some degree. There are other instructions that run in user mode. These instructions do not get seen by the hypervisor, so there’s no masking and no emulating that can be done.
One such group of user mode instructions is SSE (Streaming SIMD Extensions). SSE is used largely for multimedia work such as advanced video processing. It’s currently at SSE 4.1. As an example of what can go wrong with live migration when an application relies on user mode instructions, take a look at this video.
In the video, VMware migrated a VM from a host that understands SSE 4.1 instructions to a host that doesn’t. The application they were running just happened to use SSE 4.1. As you can see in the video, the application crashes. Now you understand the problem.
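To make that failure concrete, here is a minimal Python sketch of the sequence. It’s a toy model, not VMware code: the `Host` and `App` classes and the feature names are invented for illustration. The key assumption is that the application checks for SSE 4.1 once at startup and caches the answer, which is a common pattern but not something the video confirms.

```python
class IllegalInstruction(Exception):
    """Stands in for the fault a real CPU raises on an unknown opcode."""


class Host:
    """Hypothetical physical host with a fixed set of CPU features."""

    def __init__(self, name, features):
        self.name = name
        self.features = frozenset(features)

    def execute(self, instruction_set):
        # User mode instructions go straight to the CPU. The hypervisor
        # never sees them, so it cannot emulate a missing one.
        if instruction_set not in self.features:
            raise IllegalInstruction(f"{instruction_set} on {self.name}")
        return "ok"


class App:
    """Hypothetical application that picks a code path at startup."""

    def __init__(self, host):
        self.host = host
        # Assumption: feature detection happens once and is cached.
        self.use_sse41 = "sse4.1" in host.features

    def migrate(self, new_host):
        # Live migration: the app keeps running and never re-checks.
        self.host = new_host

    def render_frame(self):
        path = "sse4.1" if self.use_sse41 else "sse2"
        return self.host.execute(path)


newer = Host("newer-host", {"sse2", "sse3", "sse4.1"})
older = Host("older-host", {"sse2", "sse3"})

app = App(newer)
print(app.render_frame())          # runs the SSE 4.1 path on the newer host
app.migrate(older)
try:
    app.render_frame()             # same code path, but the CPU changed
except IllegalInstruction as err:
    print("crashed:", err)
```

An app started fresh on the older host would pick the SSE2 path and run fine. It’s only the app that was moved while running that breaks, because nothing told it the processor underneath had changed.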
Cross Platform with VMware
Why can’t we do the same thing with VMware? Actually, we can. There are a few different ways to get a VM to move between platforms today.
1) Turn off CPU checks globally. Yes, this will have the desired outcome of moving VMs between platforms. Here’s a nice little video showing just that.
For anyone who wants to play along at home, all you have to do is add some XML to your global VirtualCenter settings file and restart the VirtualCenter Server service. The file you need to edit is at C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\vpxd.cfg. NOTE!!! This is not a supported configuration. The code you need to add is the chunk between <migrate> and </migrate> shown below.
There’s a major problem with solution number 1. With over 6.5 million Windows applications out there today, you’re basically gambling that none of your apps use those nasty user mode instructions. The good news is that most applications in the datacenter aren’t going to be using these instructions. However, more and more desktops are finding their way into the datacenter environment. All of this is why it’s downright dangerous to move applications around without any CPU checking.
2) We could selectively mask out CPUID feature bits. This is probably what Red Hat and AMD are doing. This is also what VMware Enhanced VMotion Compatibility does in cooperation with the processor. If we do a normal VMotion between incompatible processors, we’ll get a warning like the one below.
We can use the advanced CPU features for a VM to mask out the CPUID bits causing the conflict.
This will enable us to do cross-platform VMotions/live migrations, but then we run across another problem: not all processors are created equal. It’s a well-known fact that different processors run instructions at different speeds. The memory architecture, for example, is very different between AMD and Intel processors. It’s so different that Windows and Linux load different kernels depending on which processor they’re running on. So what happens if you move your app that is highly tuned for the current memory architecture and has an Intel kernel loaded in the OS to an AMD box? You’ll probably end up with a performance issue. This is really the tip of the iceberg for what could happen.
3) VMware has looked at a third option: mask to a common set of features. Basically, look at what runs the same and is implemented the same in both Intel and AMD processors. After doing some comparisons, the commonalities come out to a Pentium II feature set. Of course, in order to get to that common feature set, a lot of the other processor functions would have to be emulated in software. When I’ve explained this to customers interested in cross-platform migration, they’ve said this is pretty useless. It’s a lot like taking the engine out of your car, cutting a hole in the floor, and moving around Fred Flintstone style.
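Options 2 and 3 both come down to bit arithmetic on what CPUID reports. The sketch below is illustrative Python, not how VMware or Red Hat implement it. The bit positions are the real ones from CPUID leaf 1, register ECX, but the two host feature words are made up for the example and every function name is hypothetical.

```python
# Feature bits from CPUID leaf 1, register ECX.
FEATURE_BITS = {
    "sse3": 0,
    "ssse3": 9,
    "sse4.1": 19,
    "sse4.2": 20,
    "popcnt": 23,
}


def feature_word(*names):
    """Build an ECX-style bit field from feature names."""
    word = 0
    for name in names:
        word |= 1 << FEATURE_BITS[name]
    return word


def feature_names(word):
    """List the known features that are set in a bit field."""
    return sorted(n for n, bit in FEATURE_BITS.items() if word & (1 << bit))


def can_migrate(guest_visible, dest_host):
    """The destination must offer everything the guest was told it has."""
    return guest_visible & ~dest_host == 0


def common_baseline(hosts):
    """Option 3: AND every host together to get the shared feature set."""
    baseline = ~0
    for word in hosts:
        baseline &= word
    return baseline


# Made-up hosts, for illustration only.
host_a = feature_word("sse3", "ssse3", "sse4.1")
host_b = feature_word("sse3", "popcnt")

# Unmasked, the guest sees all of host_a, so the processor check fails.
print(can_migrate(host_a, host_b))             # False

# Option 2: hide the conflicting bits from the guest.
mask = ~(host_a & ~host_b)
guest_visible = host_a & mask
print(feature_names(guest_visible))            # ['sse3']
print(can_migrate(guest_visible, host_b))      # True

# Option 3: anything outside the baseline is lost or must be emulated.
baseline = common_baseline([host_a, host_b])
print(feature_names(host_a & ~baseline))       # ['sse4.1', 'ssse3']
```

Keep in mind that the mask only changes what the guest is told. It does nothing to stop an application from issuing a user mode instruction without checking first, and it says nothing about how fast the surviving instructions run on each vendor’s processor.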
The Bottom Line
After all is said and done, you end up with option #1, which will probably crash your apps; option #2, which will probably cause all sorts of unforeseen performance and other issues; and option #3, which is just plain useless. While I can appreciate Red Hat and AMD trying to push technology forward, the demonstration is downright reckless without the proper explanations and pretty useless after you do explain things. Meanwhile, you’ve been able to do the same thing with VMware for the past 6 years. VMware, however, will continue to push safe, reliable technologies such as VMotion and Enhanced VMotion Compatibility to provide customers with the most mature and stable virtualization technology on the planet.
Mike DiPetrillo



