Frequently Given Misleading Answers

The other day I came across this FGA item describing how to detect virtualized environments. It includes interesting comments which make Microsoft, Intel, and AMD sound stupid, but perhaps only reflect on the author being either deliberately misleading, or trying far too hard to sound smarter than everyone else.

Quoting the FGA:

According to Microsoft, a flag bit in the ECX register (bit #31, “Hypervisor present”), after executing CPUID with the EAX register set to 0x000000001, will be set to 1 in a (Microsoft) virtual machine and set to 0 on real hardware. This is indeed the official Hypervisor detection mechanism. It’s also the official detection mechanism for VMWare.

But here Microsoft and VMWare are incorrectly relying upon an accident of hardware implementation. Both Intel’s and AMD’s CPUID specifications state that bit #31 of the ECX register is reserved. Intel’s specification even explicitly states that one should not count on the value of the bit. That includes not counting on the fact of it being zero on real hardware. As such, Microsoft’s “official” detection mechanism is bogus.

Sadly, only the FGA itself is bogus. It makes several bold assumptions: Microsoft has absolutely no idea how to design software, Microsoft has zero influence on development of future CPUs, Intel and AMD have no idea how to design CPUs, and Intel and AMD have no idea how their existing CPUs work. Let’s take a look at the claims in detail.

The FGA was written around 2010. At that point, the evolution of CPUID information was well understood. Unused bits are typically documented as “reserved”. Yes, Intel might say that “one should not count on the value of the bit”, and of course that is the case: if a 2009 CPU does not use the bit and a future 2012 CPU defines it for some purpose, software written in 2010 has no business relying on the bit’s value, because it cannot assume the value will remain constant and doesn’t know what it might one day mean.

In other words, the FGA makes it sound like reserved CPUID bits have random, unpredictable values, which is obvious nonsense, because it would make future extensions impossible. Whatever one may think of Intel, their engineers are not that stupid. There are sometimes undocumented CPUID bits set, but by and large undefined bits are zero… because that is the only behavior which is of any use to Intel/AMD!

While difficult to prove, it is reasonable to assume that companies like Microsoft and VMware have a very good idea what the CPUID values of existing CPUs are. They decided to use the bit because it was consistently set to zero on CPUs existing at the time, or at least on the CPUs relevant to their products.

The FGA further assumes that companies like Intel/AMD and Microsoft operate in a vacuum, never heard of each other, don’t talk to each other, and don’t care about interoperability of their products. That assumption is more than a little bold. In fact, some evidence of the opposite was already available when the FGA was written.

AMD’s CPUID Specification Rev. 2.16 from September 2005 (page 10) defines bits 31:14 of CPUID Fn0000_0001_ECX (Feature Identifiers) as “Reserved”. But revision 2.18 from January 2006 is subtly and significantly different (again page 10). Bits 30:14 are still “reserved”, but bit 31 is now “RAZ”, or read as zero, and no longer reserved. That notably applies to all (then) existing AMD CPUs, as the CPUID specification makes no mention that the bit would only be zero on specific models. It also applies to future CPUs (unless and until the specification is changed).

Why would that be? Did Microsoft and/or VMware talk to AMD or something? Is this some sort of collusion? The FGA said the mechanism was bogus, but this doesn’t sound all that bogus.

In fact the current (December 2017) AMD documentation goes one step further and not only says bit 31 is RAZ, but also mentions that it is “reserved for use by hypervisor to indicate guest status”. The explicit mention of hypervisors was added sometime between revision 2.28 (April 2008) and revision 2.34 (from September 2010, page 11) of AMD’s CPUID Specification.

To be fair to the FGA author, it took Intel a lot longer to update their documentation. Intel’s Application Note 485 (Intel Processor Identification and the CPUID Instruction), order number 241618-036 from August 2009 still says (page 25) that bits 31:28 of ECX feature flags are reserved, with the comment “do not count on their value” that is apparently so open to misinterpretation. The next revision, order number 241618-037 from January 2011, changed things. While bits 30:29 were still listed as reserved, bit 31 was now “not used”, with the comment “always returns 0”. Again, that applied to all existing Intel CPUs and until and unless the documentation is changed, all future ones as well.

As of December 2017, Intel’s CPUID documentation is unchanged. It does not mention what bit 31 might be used for, but does say that it is always zero and not “reserved”. (Intel likes to leave it as an exercise for the reader to figure out what any given aspect of x86 architecture might possibly be good for.)

The FGA article was originally written in 2009. At that point AMD’s CPUID documentation already defined the bit as zero (not “reserved”) but possibly did not yet explain its purpose. The FGA article also references a VMware document from March 2011; by that point AMD had already explicitly documented the bit’s purpose, and even Intel had “un-reserved” ECX bit 31 and changed its documentation to define the bit as zero.
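For reference, the check everyone is arguing about is tiny. Here is a minimal sketch in C, assuming a GCC or Clang compiler targeting x86 (it relies on the __get_cpuid helper from the compiler’s cpuid.h header). The function and macro names are made up for illustration and are not taken from any Microsoft or VMware code. It executes CPUID with EAX set to 1 and tests bit 31 of ECX.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: GCC or Clang on x86, which ship <cpuid.h> with __get_cpuid. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/* Bit 31 of ECX returned by CPUID leaf 1: the "hypervisor present" flag. */
#define CPUID_1_ECX_HYPERVISOR (UINT32_C(1) << 31)

/* Hypothetical helper: test the flag in an already-fetched ECX value. */
static bool ecx_has_hypervisor_bit(uint32_t ecx)
{
    return (ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/*
 * Hypothetical helper: query the running CPU.
 * Returns 1 if the bit is set, 0 if it is clear,
 * and -1 if CPUID leaf 1 cannot be queried in this build or on this CPU.
 */
static int hypervisor_bit_reported(void)
{
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return ecx_has_hypervisor_bit(ecx) ? 1 : 0;
#else
    return -1;
#endif
}
```

A caller would treat a return value of 1 as “running under a hypervisor”. A return value of 0 only means that no hypervisor is advertising itself, since a hypervisor can be configured not to set the bit.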

All those archived CPU documents are actually useful sometimes. Who knew!

This entry was posted in Corrections, Documentation, Virtualization.

9 Responses to Frequently Given Misleading Answers

  1. ender says:

    But it’s Micro$oft! They can’t do anything right, ever!

  2. Rugxulo says:

    “0x000000001” … so, just “1”? (Gotta love extra zeros, even an extra leading one, even with no following alpha hex chars.)

    “According to Microsoft, a flag bit in the ECX register (bit #31, ‘Hypervisor present’), after executing CPUID with the EAX register set to 0x000000001, will be set to 1 in a (Microsoft) virtual machine and set to 0 on real hardware. This is indeed the official Hypervisor detection mechanism. It’s also the official detection mechanism for VMWare.”

    Did I miss the point? So it only works for Hyper-V and VMware? Doesn’t seem to work under VBox. Can’t this be misused? Sure, like all version detection, it can bring benefits, but it also seems like a way to block or forbid certain features, OSes, etc. Yuck.

  3. Michal Necasek says:

    No, it does not work with only VMware and Hyper-V. In VirtualBox, whether the CPUID bit is set depends on the VM settings. Sometimes (often) it’s useful to set the bit, sometimes it’s useful to not set it. KVM also sets this bit.

    Yes, it can be misused. Like everything else. Current operating systems use this bit to check for and enable various VM-specific optimizations and features.

  4. Richard Wells says:

    Under VMWare, it is possible to change the reported values of all the CPUID registers in the VMX file making the check of bit 31 of ECX an unreliable method.

    https://github.com/spender-sandbox/cuckoo-modified/issues/459 shows an example of how to hide the values.

  5. Michal Necasek says:

    Right. Basically if the bit is set, you’re definitely in a VM. If it’s clear, you might or might not be in a VM.

  6. Mr. Argent says:

    I kind of wonder whether or not forcing a physical computer to identify as a virtual machine would have any perks beyond psyching out some malware designed to deliberately not activate in a lab environment. Doubt it, but the thought is interesting.

  7. Michal Necasek says:

    It’s kind of hard to do 🙂 Not technically impossible, I’m sure if you knew exactly how Intel’s microcode update works, you could change the CPUID signature. I don’t think it would have any advantage other than, as you say, confusing malware into not activating itself.

  8. MiaM says:

    Virtualisation is so common in server environments that I see no reason for malware to only activate on physical hardware.

    On desktop and home user hardware it’s another thing.

  9. @MiaM: …what about malware designed to attack home computers? There might well be reasons for acting differently on physical as opposed to virtual machines (although, in that case, I’d expect it to be the other way, activating on virtual machines but not physical machines, so as to allow one to safely develop malware without accidentally wrecking the computer they’re using to develop it – and, even then, only for the alpha/beta versions, for testing purposes, not for the malware’s gold release, unless one were wanting to develop VM-specific malware for some reason [which is almost certainly possible, and, now that I think about it, might have some use relating to the wide usage of virtualisation on servers, although I can’t think of what that use might be]).
