[RFH]: AMD CR intercept for lmsw/clts

Post by Mukesh Rathor
Hi,
On AMD, clts/lmsw will cause "mov cr" vmexit, but unlike intel, they
can't be handled via svm_vmexit_do_cr_access and are emulated thru
..
ASSERT(!is_pvh_vcpu(curr));
if ( cpu_has_svm_decode && (vmcb->exitinfo1 & (1ULL << 63)) )
svm_vmexit_do_cr_access(vmcb, regs);
else if ( !handle_mmio() ) <==========
hvm_inject_hw_exception(TRAP_gp_fault, 0);
break;
Soooo, this leaves no choice but to make the ASSERT conditional
for intel only, and let handle_mmio go thru x86_emulate and let
x86_emulate fail for anything other than lmsw/clts? I was thinking
x86_emulate()
int fail_pvh_emul = 1;
...
.....
fail_pvh_emul = 0;
then
if (fail_pvh_emul)
rc = X86EMUL_UNHANDLEABLE;
return rc;

I'd strongly recommend against adding any such to x86_emulate():
There's nothing precluding the emulator to be used for PVH guests.

Post by Mukesh Rathor
Or, should I just create a new function for clts/lmsw and call it
directly from vmexit switch itself?

I'd prefer this - it seems pretty ugly to me that handle_mmio()/
x86_emulate() gets used for this purpose - but am not certain this
will actually work out nicely for other than CLTS: All the
instructions currently handled specially are ones with fixed
operands, and only CLTS fits that.

You'll btw have the same problem with SMSW and DRx accesses,
string I/O instructions, as well as (on older CPUs) with moves to/from
CRx and INVLPG.

In the case this doesn't turn out reasonable, rather than
manipulating handle_mmio() any further, I'd suggest to investigate
bypassing that function in favor of a more direct access to the x86
emulator. After all you're not after any MMIO emulation here, just
bare instructions (many of which without memory operands at all).

Jan

Andrew Cooper

2014-08-05 11:16:16 UTC

There's nothing precluding the emulator to be used for PVH guests.

Post by Mukesh Rathor
Or, should I just create a new function for clts/lmsw and call it
directly from vmexit switch itself?

SMSW certainly, but what is wrong with the others? For those, the exit
info appears to give fully decoded information.

Post by Jan Beulich
In the case this doesn't turn out reasonable, rather than
manipulating handle_mmio() any further, I'd suggest to investigate
bypassing that function in favor of a more direct access to the x86
emulator. After all you're not after any MMIO emulation here, just
bare instructions (many of which without memory operands at all).

In the AMD System manual (vol 2), there are special cases listed for
lmsw, clts and smsw in section 15.8.1. In each case, bit 63 of
exitinfo1 will be clear (which is checked using magic numbers in the
code). It would appear that under these circumstances, the only
information provided is "it wasn't a mov instruction". As a result, I
think x86_emulate() is the best way go to, rather than hacking something
up locally, although I would agree that handle_mmio() is not really
appropriate.

~Andrew

Jan Beulich

2014-08-05 12:11:43 UTC

Post by Jan Beulich
I'd prefer this - it seems pretty ugly to me that handle_mmio()/
x86_emulate() gets used for this purpose - but am not certain this
will actually work out nicely for other than CLTS: All the
instructions currently handled specially are ones with fixed
operands, and only CLTS fits that.
You'll btw have the same problem with SMSW and DRx accesses,
string I/O instructions, as well as (on older CPUs) with moves to/from
CRx and INVLPG.

SMSW certainly, but what is wrong with the others? For those, the exit
info appears to give fully decoded information.

All the decoding info is conditional upon decoding assists being
available. And for string I/O handle_mmio() is being used kind of
unconditionally.

Right, except that at least for CLTS (not having any operands) the
light weight decoding could still be used. I'd even think decoding the
CRx/DRx moves would be okay as they only permit register
operands. But with LMSW and SMSW having memory operands it
would indeed start to become like re-implementing x86_emulate().
Question is whether one couldn't simply forbid PVH guests to use
them, as moves to/from CR0 provide all the needed functionality.

Jan

Andrew Cooper

2014-08-05 13:00:25 UTC

SMSW certainly, but what is wrong with the others? For those, the exit
info appears to give fully decoded information.

All the decoding info is conditional upon decoding assists being
available. And for string I/O handle_mmio() is being used kind of
unconditionally.

The decode assists apply strictly to mov CRx/DRx, INTn, INVLPG and
(nested) #PF. In contrast, the IOIO intercept appears to
unconditionally fill the decode information in exitinfo1, so I believe
the current code is correct.

I think that is a very dangerous route to take.

Despite the current limitations, I firmly believe that PVH should be HVM
- device model, rather than PV + VMX/SVM. Restricting the use of
certain instructions would certainly move PVH into the latter category.

Fundamentally, the end goal of PVH needs deciding ASAP, and documenting,
to help guide decisions like this.

~Andrew

Jan Beulich

2014-08-05 13:15:21 UTC

SMSW certainly, but what is wrong with the others? For those, the exit
info appears to give fully decoded information.

All the decoding info is conditional upon decoding assists being
available. And for string I/O handle_mmio() is being used kind of
unconditionally.

And I didn't say anything to the contrary. All I said was that there is
another use of handle_mmio() lurking.

Generally I would agree. For those two specific instructions though,
I don't think any reasonably modern OS would be using them
anyway outside of 16-bit boot code. After adding to this the fact
that PVH, just like PV, implies a modified OS, putting on such a
restriction doesn't seem very harmful to me.

Jan

Mukesh Rathor

2014-08-05 22:30:25 UTC

On Tue, 05 Aug 2014 14:00:25 +0100
...

Post by Andrew Cooper
Despite the current limitations, I firmly believe that PVH should be HVM
- device model, rather than PV + VMX/SVM.

I think that might be a dangerous route to take, classifying upfront
whether it's that way or the other. Eg, if we say it's former, then
anyone adding any feature would not examine the best approach, but just
take hvm approach.

Post by Andrew Cooper
Fundamentally, the end goal of PVH needs deciding ASAP, and
documenting, to help guide decisions like this.

I think it's decided somewhat. Evolve to one of three approaches: PV,
HVM, or alternate, picking the easiest and fastest. IMO, at the very
least, pvh should retain "guest modified" characteristic, that would be
good for xen future imho.

Mukesh

Andrew Cooper

2014-08-06 09:34:21 UTC

Post by Mukesh Rathor
On Tue, 05 Aug 2014 14:00:25 +0100
...

Post by Andrew Cooper
Despite the current limitations, I firmly believe that PVH should be HVM
- device model, rather than PV + VMX/SVM.

There are many PV-isms which already exist for HVM. Saying "HVM -
device model" does not preclude further PVism from being introduced and
used. It does however means that PV-aware HVM guests get equal
opportunity at these improvements. Fundamentally, having PVH closer to
HVM than PV means fewer modifications required to turn a native kernel
into a PVH kernel, which is a *very* good thing from the point of view
of the kernel authors.

But as I said, this is only my opinion.

Post by Mukesh Rathor

Post by Andrew Cooper
Fundamentally, the end goal of PVH needs deciding ASAP, and
documenting, to help guide decisions like this.

It clearly is not decided, or even semi-certain, by virtue of having
this conversation.

There are currently many opinions (some of which certainly can't
coexist, many which can), a lot of semi-baked code with many
restrictions (and repeated breaking of PVH/PVHdom0 by making seemingly
innocent code changes elsewhere), and no concrete plan of what PVH is or
what it should be.

What needs to happen urgently is for someone to make a firm decision,
and prepare a document for /docs/specs/pvh. A document like that is not
immutable in the future if hindsight shows otherwise, but it will
provide solid guidance as to how to proceed in matters like this.

~Andrew

Konrad Rzeszutek Wilk

2014-08-15 21:04:20 UTC

Post by Mukesh Rathor
On Tue, 05 Aug 2014 14:00:25 +0100
...

Post by Andrew Cooper
Despite the current limitations, I firmly believe that PVH should be HVM
- device model, rather than PV + VMX/SVM.

Right. I would like to stress that the x86 maintainers are excited
about this as it would remove the pvops that don't have clear
semantic.

Post by Andrew Cooper
But as I said, this is only my opinion.

Post by Mukesh Rathor

Post by Andrew Cooper
Fundamentally, the end goal of PVH needs deciding ASAP, and
documenting, to help guide decisions like this.

It clearly is not decided, or even semi-certain, by virtue of having
this conversation.

HA!

Post by Andrew Cooper
There are currently many opinions (some of which certainly can't
coexist, many which can), a lot of semi-baked code with many
restrictions (and repeated breaking of PVH/PVHdom0 by making seemingly
innocent code changes elsewhere), and no concrete plan of what PVH is or
what it should be.
What needs to happen urgently is for someone to make a firm decision,
and prepare a document for /docs/specs/pvh. A document like that is not
immutable in the future if hindsight shows otherwise, but it will
provide solid guidance as to how to proceed in matters like this.

That could certainly be done but I think we are all tied in fixing
code and trying to get features in Xen 4.5 before the feature
freeze gates are shut.

It should be fairly easy as most of it is 'runs like HVM' with
some HVM-ism disabled (so point to Intel SDM and AMD). And then
going through the hypercalls and seeing which are enabled.

Then there is the business of the startup which is complex, but
fortunatly there is a Wiki page to rip:
http://wiki.xenproject.org/wiki/X86_Paravirtualised_Memory_Management

Andrew, that nice template you used for the migrationv2 - where can
one find it?

Post by Andrew Cooper
~Andrew
_______________________________________________
Xen-devel mailing list
http://lists.xen.org/xen-devel

Andrew Cooper

2014-08-15 21:48:14 UTC

Post by Konrad Rzeszutek Wilk

Post by Mukesh Rathor
On Tue, 05 Aug 2014 14:00:25 +0100
...

Post by Andrew Cooper
Despite the current limitations, I firmly believe that PVH should be HVM
- device model, rather than PV + VMX/SVM.

Right. I would like to stress that the x86 maintainers are excited
about this as it would remove the pvops that don't have clear
semantic.

Post by Andrew Cooper
But as I said, this is only my opinion.

Post by Mukesh Rathor

Post by Andrew Cooper
Fundamentally, the end goal of PVH needs deciding ASAP, and
documenting, to help guide decisions like this.

It clearly is not decided, or even semi-certain, by virtue of having
this conversation.

HA!

That could certainly be done but I think we are all tied in fixing
code and trying to get features in Xen 4.5 before the feature
freeze gates are shut.
It should be fairly easy as most of it is 'runs like HVM' with
some HVM-ism disabled (so point to Intel SDM and AMD). And then
going through the hypercalls and seeing which are enabled.
Then there is the business of the startup which is complex, but
http://wiki.xenproject.org/wiki/X86_Paravirtualised_Memory_Management
Andrew, that nice template you used for the migrationv2 - where can
one find it?

For the pdf spec? That is just completely standard pandoc.

See
http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=commitdiff;h=ba4c1c9072c623ffb795310e538ea6eed81bd658
for how I expect it to be committed when migration v2 is accepted.

~Andrew

Mukesh Rathor

2014-08-05 22:22:10 UTC

On Tue, 05 Aug 2014 08:46:20 +0100

Post by Mukesh Rathor
Hi,
On AMD, clts/lmsw will cause "mov cr" vmexit, but unlike intel, they
can't be handled via svm_vmexit_do_cr_access and are emulated thru

.....