Andrew Cooper
2018-10-16 18:54:25 UTC
Hello,
I realise this is an old CPU, but I've encountered a weird failure
on it.
Specifically:
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 23 (0x17), Stepping 6
(raw 00010676)
[***@harpertown ~]# head /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
stepping : 6
microcode : 0x60f
cpu MHz : 2493.756
cache size : 6144 KB
physical id : 0
In Xen, we use an MSR load list to update EFER on vmentry/exit, when
hardware doesn't support the EFER field in the VMCS itself. This is a
change I made in 4.11 to fix a bug with NX handling on context switching.
After some investigation, it turns out that after vmentry, while the
load list has the value 0xd01 (NXE, LMA, LME, SCE), the value loaded
into hardware is 0xd00 (NXE, LMA, LME).
I.e. when an MSR load list is used for EFER, we resume the guest with
SCE cleared. This is rather terminal for 64-bit guests, as
syscall/sysret instructions take a #UD fault.
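For reference, the two values above can be decoded with a small sketch like the one below. The bit positions are the architectural EFER layout (SCE bit 0, LME bit 8, LMA bit 10, NXE bit 11); the helper names decode_efer and lost_bits are made up for illustration and are not anything in Xen.

```python
# Architectural EFER bit positions (SCE=0, LME=8, LMA=10, NXE=11).
# The helper names below are hypothetical, for illustration only.
EFER_BITS = {
    "SCE": 1 << 0,   # SYSCALL/SYSRET enable
    "LME": 1 << 8,   # long mode enable
    "LMA": 1 << 10,  # long mode active
    "NXE": 1 << 11,  # no-execute enable
}


def decode_efer(value):
    """Return the names of the known EFER bits set in value, highest bit first."""
    ordered = sorted(EFER_BITS.items(), key=lambda item: item[1], reverse=True)
    return [name for name, mask in ordered if value & mask]


def lost_bits(expected, observed):
    """Return the names of bits set in expected but clear in observed."""
    return decode_efer(expected & ~observed)


if __name__ == "__main__":
    print(decode_efer(0xd01))       # value in the MSR load list
    print(decode_efer(0xd00))       # value seen in hardware after vmentry
    print(lost_bits(0xd01, 0xd00))  # the bit that went missing
```

The only difference between the two values is bit 0, which matches the SCE loss described above.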
I can't see anything relevant in the Specification Update for this
processor.
I've confirmed that by not using a load list, the current value in EFER
is preserved once the vmentry is complete, and by disabling the EFER
intercept, I can re-set SCE in non-root context and have syscall/sysret
work correctly.
However, given this behaviour, I can't think of any way to context
switch NX properly, and leave 64bit guests in a working state.
Do you have any suggestions?
Thanks,
~Andrew