Andrew Cooper
2012-02-13 16:03:23 UTC
Hello,
XenServer 6.0 (Xen 4.1.1) has had a support escalation raised against it for
Cisco C210 M2 servers. I do not have access to any of these servers, so I
cannot debug the issue myself.
The pcpu LAPICs support EOI Broadcast suppression and Xen enabled it.
In arch/x86/apic.c:verify_local_APIC, there is a comment stating that
directed EOI support must use the old IO-APIC ack method.
A hypervisor with this check disabled (i.e. never checking for, or
enabling directed EOI) seems to make the system stable again (5 days
stable now, as opposed to a hang due to lost interrupts once every few
hours before).
First of all, I have discovered that forcing "ioapic_ack=new" does not
have the intended effect, because verify_local_APIC overwrites it, even if
the user has specified the ack method. I intend to send a patch to fix
this in due course.
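To illustrate the override problem, here is a minimal standalone sketch of
one way the user's choice could be respected. It is not the Xen code and not
the patch itself: the struct, its fields and both function names are
hypothetical, and the policy shown (leave directed EOI off when the user has
explicitly asked for the new ack method) is only an assumption about what a
fix might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant state; not Xen's actual definitions. */
struct ack_config {
    bool ack_new;       /* true: "new" IO-APIC ack method, false: "old" */
    bool ack_forced;    /* true if the user chose the ack method explicitly */
    bool directed_eoi;  /* true once directed EOI has been enabled */
};

/* Models command-line handling: record the choice and that it was explicit. */
static void parse_ioapic_ack(struct ack_config *cfg, bool want_new)
{
    assert(cfg != NULL);
    cfg->ack_new = want_new;
    cfg->ack_forced = true;
}

/*
 * Models the LAPIC capability check. Directed EOI is paired with the old
 * ack method, so it is only enabled when that does not contradict an
 * explicit request for the new ack method.
 */
static void probe_directed_eoi(struct ack_config *cfg, bool lapic_has_directed_eoi)
{
    assert(cfg != NULL);

    if (!lapic_has_directed_eoi)
        return;

    /* Assumed policy: an explicit "new" request wins over directed EOI. */
    if (cfg->ack_forced && cfg->ack_new)
        return;

    cfg->ack_new = false;
    cfg->directed_eoi = true;
}
```

With no explicit choice, the probe still switches to the old ack method and
enables directed EOI; with an explicit request for the new method, both
settings are left alone.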
However, as for the main issue, I cannot work out any logical reason why
directed EOI would not work with the new ack mode. I am still trying to
work out the differences in the code path in case I have missed something
subtle, but I wondered if anyone on the list has more knowledge of these
intricacies than I do? Either way, it appears that there is a bug on the
code path with directed EOI and the old ack method.
Thanks in advance,
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com