Post by Milan Boberic
Hi,
Post by Stefano Stabellini
The device tree with everything seems to be system.dts, that was enough
:-) I don't need the dtsi files you used to build the final dts, I only
need the one you use in uboot and for your guest.
I wasn't sure, so I sent everything; sorry for bombarding you with
all those files. :-)
Post by Stefano Stabellini
It looks like you set xen,passthrough correctly in system.dts for
Thank you for taking a look, now we are sure that passthrough works
correctly because there is no error during guest creation and there
are no prints of "DEBUG irq slow path".
Great!
Post by Milan Boberic
Post by Stefano Stabellini
If you are not getting any errors anymore when creating your baremetal
guest, then yes, it should be working passthrough. I would double-check
that everything is working as expected using the DEBUG patch for Xen I
suggested to you in the other email. You might even want to remove the
"if" check and always print something for every interrupt of your guest
just to get an idea of what's going on. See the attached patch.
I get non-stop prints of:
(XEN) DEBUG virq=68 local=1
which is a good thing, I guess, because interrupts are being generated non-stop.
Yes, local=1 means that the interrupt is injected to the local vcpu,
which is exactly what we want.
Post by Milan Boberic
Post by Stefano Stabellini
Once everything is as expected I would change the frequency of the
timer, because 1u is way too frequent. I think it should be at least
3us, more like 5us.
Okay, about this... I double-checked my bare-metal application and it
looks like interrupts weren't generated every 1 us after all. The
shortest interrupt period is 8 us. I checked the interrupt frequency
with an oscilloscope just to be sure (toggling an LED on/off when
interrupts occur). So, when I set:
- interrupts to be generated every 8 us, I get a jitter of 6 us
- interrupts to be generated every 10 us, I get a jitter of 3 us (after
2-3 mins it jumps to 6 us)
- interrupts to be generated every 15 us, the jitter is the same as when
only the bare-metal application runs on the board (without Xen or any OS)
These are very interesting numbers! Thanks again for running these
experiments. I don't want to jump to conclusions but they seem to verify
the theory that if the interrupt frequency is too high, we end up
spending too much time handling interrupts; the system cannot cope,
hence the jitter increases.
However, I would have thought that the threshold would be lower than
15 us, given that it takes 2.5 us to inject an interrupt. I have a
couple of experiment suggestions below.
Post by Milan Boberic
I want to remind you that a bare-metal application that only blinks an
LED at high speed gives 1 us jitter; somehow, introducing frequent
interrupts causes this jitter, which is why I was unsure about this
timer passthrough. Taking into consideration that you measured a Xen
overhead of 1 us, I have a feeling that I'm missing something. Is there
anything else I could do to get better results besides sched=null,
vwfi=native, hard vCPU pinning (1 vCPU on 1 pCPU) and passthrough (not
sure if it affects the jitter)?
I'm forcing frequent interrupts because I'm testing to see if this
board with Xen on it could be used for real-time simulations,
real-time signal processing, etc. If I could get results like yours (1
us Xen overhead) or even better, that would be great! BTW, how did you
measure Xen's overhead?
When I said overhead, I meant compared to Linux. The overall IRQ latency
with Xen on the Xilinx Zynq MPSoC is 2.5us. When I say "overall", I mean
from the moment the interrupt is generated to the point the interrupt
service routine is run in the baremetal guest. I measure the overhead
using TBM (https://github.com/sstabellini/tbm phys-timer) and a modified
version of Xen that injects the generic physical timer interrupts to the
guest. I think you should be able to reproduce the same number using
the TTC timer like you are doing.
In addition to sched=null and vwfi=native, I also passed
serrors=panic. This last option further reduces context switch times and
should be safe on your board. You might want to add it, and run the
numbers again.
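For completeness, assuming you pass Xen's arguments through the device tree chosen node (as is typical with u-boot on these boards), the combined options would look something like the fragment below. This is only a sketch: sched=null, vwfi=native and serrors=panic are the options discussed here, while the console/dtuart values are placeholders for whatever your setup already uses:

```dts
/* Sketch only: console/dtuart values are placeholders. */
chosen {
    xen,xen-bootargs = "console=dtuart dtuart=serial0 sched=null vwfi=native serrors=panic";
};
```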
Post by Milan Boberic
Post by Stefano Stabellini
Keep in mind that jitter is about having
deterministic IRQ latency, not about having extremely frequent
interrupts.
Yes, but I want to see exactly where I will lose deterministic IRQ
latency, which is extremely important in real-time signal processing.
So, what causes this jitter: is it a Xen limit, an ARM limit, etc.? It
would be nice to know; I'll share all the results I get.
Post by Stefano Stabellini
I would also double check that you are not using any other devices or
virtual interfaces in your baremetal app because that could negatively
affect the numbers.
I checked the bare-metal app, and I think there are no other devices
that the bm app is using.
This should also be confirmed by the fact that you are only getting
"DEBUG virq=68 local=1" messages and nothing else. If other interrupts
were to be injected you should see other lines such as
DEBUG virq=27 local=1
I have an idea to verify this, see below.
Post by Milan Boberic
Post by Stefano Stabellini
Linux by default uses the virtual timer interface ("arm,armv8-timer").
I would double check that the baremetal app is not doing the same --
you don't want to be using two timers when doing your measurements.
Hmm, I'm not sure how to check that. I could send the bare-metal app if
that helps; it's created in Xilinx SDK 2017.4.
Also, should I move to Xilinx SDK 2018.2, since I'm using PetaLinux 2018.2?
I'm also using a hardware description file for the SDK that was created
in Vivado 2017.4.
Could all of this be a version mismatch problem (I don't think so,
because the bm app works)?
Post by Stefano Stabellini
Even though the app is the only one running on the CPU, the CPU may
be used to handle other interrupts and its context (such as TLB and
cache) might be flushed by other components. When these happen, the
interrupt handling latency can vary a lot.
What do you think about this? I don't know how I would check this.
I think we want to fully understand how many other interrupts the
baremetal guest is receiving. To do that, we can modify my previous
patch to suppress any debug messages for virq=68. That way, we should
only see the other interrupts. Ideally there would be none.
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 5a4f082..b7a8e17 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -577,7 +577,11 @@ void vgic_inject_irq(struct domain *d, struct vcpu *v, unsigned int virq,
     /* the irq is enabled */
     if ( test_bit(GIC_IRQ_GUEST_ENABLED, &n->status) )
+    {
         gic_raise_guest_irq(v, virq, priority);
+        if ( d->domain_id != 0 && virq != 68 )
+            printk("DEBUG virq=%d local=%d\n",virq,v == current);
+    }
     list_for_each_entry ( iter, &v->arch.vgic.inflight_irqs, inflight )
     {
Next step would be to verify that there are no other physical interrupts
interrupting the vcpu execution other than irq=68. We should be able to
check that with the following debug patch:
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index e524ad5..b34c3e4 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -381,6 +381,13 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
         /* Reading IRQ will ACK it */
         irq = gic_hw_ops->read_irq();
+        if ( current->domain->domain_id > 0 && irq != 68 )
+        {
+            local_irq_enable();
+            printk("DEBUG irq=%d\n",irq);
+            local_irq_disable();
+        }
+
         if ( likely(irq >= 16 && irq < 1020) )
         {
             local_irq_enable();