Discussion:
Xen balloon driver discuss
tinnycloud
2010-11-21 06:26:01 UTC
Permalink
Hi:
Greetings!

I was trying to run about 24 HVM guests (currently only Linux; Windows
will be involved later) on one physical server with 24GB memory and 16 CPUs.
Each VM is configured with 2GB of memory, and I reserved 8GB of memory for
dom0.
For safety reasons, only domain U's memory is allowed to balloon.

Inside domain U, I used the xenballoond provided by XenSource, which
periodically writes /proc/meminfo into xenstore in dom0
(/local/domain/<domid>/memory/meminfo).
In domain 0, I wrote a Python script that reads the meminfo and, like
the strategy Xen provides, uses Committed_AS to calculate each domain U's
balloon target.
The polling interval is 1 second.
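To make the calculation concrete, here is a rough sketch of what such a dom0 monitor could look like. This is not the actual script: the xenstore read is replaced by a literal string, and the helper names (`parse_meminfo`, `balloon_target_kb`) and the 10% headroom are assumptions of mine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest:
            info[key.strip()] = int(rest.split()[0])
    return info

def balloon_target_kb(meminfo, headroom=0.10,
                      min_kb=256 * 1024, max_kb=2 * 1024 * 1024):
    """Balloon target: Committed_AS plus a safety headroom, clamped."""
    target = int(meminfo["Committed_AS"] * (1 + headroom))
    return max(min_kb, min(max_kb, target))

# In the real monitor this text would come from
# /local/domain/<domid>/memory/meminfo in xenstore.
sample = "MemTotal: 2097152 kB\nCommitted_AS: 524288 kB\n"
print(balloon_target_kb(parse_meminfo(sample)))  # 576716
```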

Inside each VM I set up an Apache server for testing. Well, I have
to say the result is not so good.
There is too much read/write traffic on xenstore: when I put some
stress on the guest domains (using ab),
the CPU usage of xenstored goes up to 100%, and the monitor running in
dom0 also responds quite slowly.
Also, under the ab test, Committed_AS grows very fast and reaches maxmem
in a short time, while in fact the guest really needs only a small
amount of memory, so I guess ballooning should take more than
Committed_AS into consideration.

For the xenstore issue, I first plan to write a C program inside domain
U to replace xenballoond and see whether the situation
improves. If not, how about setting up an event channel directly between
domU and dom0; would that be faster?

Regarding the balloon strategy, I would do it like this: when there is
enough memory, simply fulfill each guest's balloon request, and when memory
is short, distribute the memory evenly among the guests that request
inflation.
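That even-split rule can be sketched as a small function (hypothetical names; it deliberately ignores per-guest minimums and any remainder left over after capping a small request):

```python
def distribute(free_kb, requests):
    """Grant every request in full when free memory suffices;
    otherwise split the free memory evenly among the requesting
    guests, capped at each guest's own request."""
    if sum(requests.values()) <= free_kb:
        return dict(requests)
    share = free_kb // max(len(requests), 1)
    return {dom: min(share, want) for dom, want in requests.items()}

print(distribute(3000, {"vm1": 1000, "vm2": 1000}))  # {'vm1': 1000, 'vm2': 1000}
print(distribute(1000, {"vm1": 800, "vm2": 800}))    # {'vm1': 500, 'vm2': 500}
```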

Does anyone have a better suggestion? Thanks in advance.
MaoXiaoyun
2010-11-22 04:33:13 UTC
Permalink
Since /proc/meminfo is currently sent to domain 0 via xenstore, which in my opinion is slow,
what I want to do is this: there is a shared page between domU and dom0, and domU periodically
updates the meminfo in that page, while on the other side dom0 retrieves the updated data to
calculate the target, which the guest then uses for ballooning.

The problem I've met is that I currently don't know how to implement a shared page between
dom0 and domU.
Would dom0 allocate an unbound event channel, wait for the guest to connect, and transfer
the data through the grant table?
Or does someone have a more efficient way?
Many thanks.
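For what it's worth, the usual PV pattern is: the guest grants one page to dom0 via the grant table, advertises the grant reference (and optionally an event-channel port) through xenstore once at startup, and dom0 maps the page; after that no xenstore traffic is needed. Whatever the transport, the reader must guard against torn reads while the writer is mid-update. A sequence-counter (seqlock-style) layout is one common choice; the sketch below simulates it in Python with a `bytearray` standing in for the mapped page (the layout and names are illustrative assumptions, not an existing Xen protocol):

```python
import struct

PAGE = bytearray(4096)      # stands in for the granted, mapped page
HDR = struct.Struct("<II")  # (sequence counter, Committed_AS in kB)

def write_meminfo(page, committed_kb):
    """domU side: bump the counter to odd, update, bump to even."""
    seq, _ = HDR.unpack_from(page)
    HDR.pack_into(page, 0, seq + 1, 0)             # odd: update in progress
    HDR.pack_into(page, 0, seq + 2, committed_kb)  # even: stable again

def read_meminfo(page):
    """dom0 side: retry until the counter is even and unchanged."""
    while True:
        seq1, committed = HDR.unpack_from(page)
        seq2, _ = HDR.unpack_from(page)
        if seq1 == seq2 and seq1 % 2 == 0:
            return committed

write_meminfo(PAGE, 524288)
print(read_meminfo(PAGE))  # 524288
```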

Dan Magenheimer
2010-11-28 02:36:29 UTC
Permalink
Am I understanding correctly that you are running each linux-2.6.18 as HVM (not PV)? I didn't think that the linux-2.6.18 balloon driver worked at all in an HVM guest.



You also didn't say what version of Xen you are using. If you are running xen-unstable, you should also provide the changeset number.



In any case, any load of HVM guests should never crash Xen itself, but if you are running HVM guests, I probably can't help much as I almost never run HVM guests.



From: cloudroot [mailto:***@sina.com]
Sent: Friday, November 26, 2010 11:55 PM
To: tinnycloud; Dan Magenheimer; xen devel
Cc: ***@eu.citrix.com
Subject: re: Xen balloon driver discuss



Hi Dan:



I have set up a benchmark to test the balloon driver, but unfortunately Xen crashed with a memory panic.

Before I attach the detailed output from the serial port (which will take time, on the next run), I'm afraid I might have missed something in the test environment.



My dom0 kernel is 2.6.31, pvops.

Currently there is no drivers/xen/balloon.c in this kernel source tree, so I built
xen-balloon.ko and xen-platform-pci.ko from linux-2.6.18.x86_64 and installed
them in dom U, which is Red Hat 5.4.



What I did is put a C program in each dom U (24 HVM guests in total); the program
allocates memory and fills it with random strings repeatedly.
In dom0, a Python monitor collects the meminfo from xenstore and calculates the
balloon target from Committed_AS.
The panic happens when the program is running in just one dom U.



I am writing to ask whether my balloon driver is out of date, and where I can get the latest source code;
I've googled a lot but am still quite confused by the various source trees.



Many thanks.





Dan Magenheimer
2010-11-22 17:46:31 UTC
Permalink
Xenstore IS slow and you could improve xenballoond performance by only sending the single CommittedAS value from xenballoond in domU to dom0 instead of all of /proc/meminfo. But you are making an assumption that getting memory utilization information from domU to dom0 FASTER (e.g. with a shared page) will provide better ballooning results. I have not found this to be the case, which is what led to my investigation into self-ballooning, which led to Transcendent Memory. See the 2010 Xen Summit for more information.



In your last paragraph below "Regards balloon strategy", the problem is it is not easy to define "enough memory" and "shortage of memory" within any guest and almost impossible to define it and effectively load balance across many guests. See my Linux Plumber's Conference presentation (with complete speaker notes) here:



http://oss.oracle.com/projects/tmem/dist/documentation/presentations/MemMgmtVirtEnv-LPC2010-Final.pdf



http://oss.oracle.com/projects/tmem/dist/documentation/presentations/MemMgmtVirtEnv-LPC2010-SpkNotes.pdf



tinnycloud
2010-11-23 14:58:26 UTC
Permalink
Hi Dan:

Thank you for your presentation summarizing memory overcommit; it was
really vivid and a great help.

I guess the strategy I have in mind these days falls into solution
Set C in the PDF.

The tmem solution you worked out for memory overcommit is both
efficient and effective; I think I will give it a try on Linux guests.

The real situation I have is that most of the VMs running on the host are
Windows, so I had to come up with those policies to balance the memory.
Although such policies are all workload dependent, the good news is that
the host workload is configurable and not very heavy, so I will try to
figure out a favorable policy. The policies referred to in the PDF are a
good start for me.

Today, instead of trying to implement "/proc/meminfo" with shared
pages, I hacked the balloon driver to add another workqueue that
periodically writes meminfo into xenstore through xenbus, which
solves the problem of xenstored's high CPU utilization.

Later I will try to google more about how Citrix does it.

Thanks for your help. Or do you have any better idea for Windows
guests?





tinnycloud
2011-01-12 14:41:03 UTC
Permalink
Hi George:

We see quite strange CPU usage behavior in one of our domUs (a Windows
2008 HVM guest).
In total, our host has 16 physical CPUs and 9 VMs.

Most of the time all the VMs work fine; CPU usage is low and
reasonable.
But at every high-workload period (say 9:00-11:00 AM; 8 of the VMs are
web servers, and customers access their pages at that time), we log into
the 9th VM, which is idle, and find its CPU usage at 85%. That doesn't
make any sense, since we have no tasks running; also, the usage is
distributed evenly across most of the processes.

I wonder if it relates to the CPU scheduling algorithm in Xen.
After going through
http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00414.html

I can't come up with any assumption that explains our situation.
What do you think?

Many thanks.
George Dunlap
2011-01-12 16:41:07 UTC
Permalink
Where is that 85% number coming from -- is this from within the VM, or
from xentop?

If it's Windows reporting from within the VM, one hypothesis is that
it has to do with processing and running with virtual time. It may
simply be a side effect of the VM only getting a small percentage of
the cpu.

If it's xentop, it's probably the vm reacting somehow to getting only
a small percentage of the CPU. We saw something like this with early
versions of Windows 2k3, but that problem was addressed in later
service packs. At any rate, to find out what Windows is doing would
require a bit more investigation. :-)

-George

MaoXiaoyun
2011-01-13 04:29:05 UTC
Permalink
The 85% is from within the VM.
I forgot to mention that each of the 8 VMs has 2 VCPUs, while the 9th VM, the 2008 guest,
has 8 VCPUs. We are still trying to reproduce the scenario.

I have questions about VM idleness. How does Xen know a VM is idle, and when a VM is idle,
what is its VCPU state in Xen, blocked or runnable, and how is the CPU utilization
calculated?
(I assume that an idle VM finishes its physical CPU use before the time slice ends,
its state becomes blocked, and it is then put into the *inactive* queue, right?
But is it possible for the VM's VCPU to come back to the *active* queue while the VM
is still idle, so that we see the VCPU shifting between the two queues?)

Also, when a VM's load comes up, will its priority be set to BOOST, putting it at
the head of the *active* queue to be scheduled earlier?


MaoXiaoyun
2011-01-17 03:52:22 UTC
Permalink
Hi George:

I've been looking into the credit scheduler again and again;
I'm not smart enough to understand it fully yet.
Could you help clarify the points below?

1. From the algorithm, since a domain's credit is directly proportional to its weight,
I think that if there are two CPU-bound domains with the same weight, they will
accumulate the same CPU time no matter how many VCPUs they have, right?
2. If 1 is true, what is the difference between domains with the same weight but
different VCPU counts (say one has 4 VCPUs and another has 8)?
3. I don't fully understand the problems of the "credit 1 scheduler" in your
"Xenschedulerstatus" slides:

(1) Client hypervisors and audio/video
 Audio VM: 5% CPU
 2x kernel-build VMs: 97% CPU
 30-40 audio skips over 5 minutes

Do you mean the kernel-build VMs have a great impact on the audio VM, and does
the CSCHED_PRI_TS_BOOST priority solve this?

(2) Not fair to latency-sensitive workloads
 Network scp: "fair share" 50%, usage 20-30%
(3) Load balancing 64 threads (4 x 8 x 2)
 Unpredictable
 Not scalable
 Power management, hyperthreads

Could you help explain these in more detail?

Many, many thanks; these confusions really give me a headache. I am a bit slow.





George Dunlap
2011-01-17 10:41:04 UTC
Permalink
On Mon, Jan 17, 2011 at 3:52 AM, MaoXiaoyun <***@hotmail.com> wrote:
> Hi George:
>        1.  From the algorithm, since domains credits is  direct proportion
> to its weight,
> I think if there are two cpu-bound domains with same weight, no matter how
> many
> vcpus they have, they will have the same CPU times accmulated, right?

It used to be the case, yes. But since that is very
counter-intuitive, some months ago I introduced a change such that the
weight is calculated on a per-vcpu basis. If you look in
csched_acct(), when accounting credit, weight of a domain is
multiplied by sdom->active_vcpu_count.

>        2. if 1 is true, what the different between domains with same
> weight but have
> different VCPUS(say one has 4 vcpus, another has 8)?

If two domains have the same number of "active" vcpus (4 each, for
example) they'll get the same amount of CPU time. But if the 8-vcpu
domain has 8 vcpus in "active" mode, it will get twice as much time.

But this is a recent change; in earlier versions of Xen (before 3.4
for sure, and possibly 4.0, I can't remember), if two VMs are given
the same weight, they'll get the same cpu time.
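George's description of the newer csched_acct() behaviour can be modelled with a toy calculation (illustrative only; the real accounting also handles caps, credit clipping, and domains going inactive):

```python
def credit_shares(domains, credit_total=300):
    """Split one accounting period's credit in proportion to
    weight * active_vcpu_count, per the per-vcpu weighting change."""
    total = sum(d["weight"] * d["active_vcpus"] for d in domains.values())
    return {name: credit_total * d["weight"] * d["active_vcpus"] / total
            for name, d in domains.items()}

doms = {"four_vcpus":  {"weight": 256, "active_vcpus": 4},
        "eight_vcpus": {"weight": 256, "active_vcpus": 8}}
shares = credit_shares(doms)
print(shares["eight_vcpus"] / shares["four_vcpus"])  # 2.0
```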

>        3. I am fully understand the problems of "credit 1 schedule "in your
> ppt of "Xenschedulerstatus"
>
> (1) Client hypervisors and audio/video
>     Audio VM: 5% CPU
>  2x Kernel-build VMs: 97% cpu
>  30-40 audio skips over 5 minutes
>
> Do you mean "kernel-build VMs" has great impact on "Audio VM", and does
> priority CSCHED_PRI_TS_BOOST
> solve this?

BOOST does not solve this problem. I think I described the problem in
the paper: BOOST is an unstable place to be -- you can't stay there
very long. The way BOOST works is this:
* You are put into BOOST if your credits reach a certain threshold
(30ms worth of credit)
* You are taken out of BOOST if you are interrupted by a scheduler "tick"

If you run at about 5% (or about 1/20 of the time), you can expect to
be running on average every 20 ticks. Since timer ticks happen every
10ms, that means you can expect to stay in BOOST for an average of
200ms.

So no matter how little cpu you use, you'll flip back and forth
between BOOST and normal, often several times per second.
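Making that arithmetic explicit (numbers taken from the description above):

```python
tick_ms = 10       # scheduler tick period
cpu_share = 0.05   # VM consuming ~5% of a CPU

# A tick demotes a vcpu out of BOOST only if it lands while the vcpu
# is running, i.e. roughly once every 1/cpu_share ticks on average.
expected_boost_ms = tick_ms / cpu_share
print(expected_boost_ms)  # 200.0
```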

> many many thanks, those confusions really makes me headache, I am a bit of
> silly.

Not at all! Understanding scheduling is very hard. It probably took me about
six months to really understand what was going on. :-)

-George
kim.jin
2011-01-17 10:51:31 UTC
Permalink
Then, how about the frequency of the CPUs? E.g., one VM has 1GHz CPUs, but the other has 2GHz CPUs.

------------------
Best Regards!

Kim King
2011-01-17

George Dunlap
2011-01-17 10:56:11 UTC
Permalink
On Mon, Jan 17, 2011 at 10:51 AM, kim.jin <***@stromasys.com> wrote:
> Then, how about the frequency of CPU? e.g., one VM have 1GHz CPUs, but the other have 2GHz CPUs.

Do you mean, if someone is using CPU frequency scaling?

-George
kim.jin
2011-01-17 11:30:49 UTC
Permalink
>On Mon, Jan 17, 2011 at 10:51 AM, kim.jin <***@stromasys.com> wrote:
>> Then, how about the frequency of CPU? e.g., one VM have 1GHz CPUs, but the other have 2GHz CPUs.
>
>Do you mean, if someone is using CPU frequency scaling?
Something similar. Does the new algorithm take the frequency of each vCPU into account?
> -George

Best Regards!

Kim King
2011-01-17