SLES SP2 slows down Windows VMs

Hi Suse,

We have a performance problem with Xen-virtualized Windows VMs.

After very good experiences with Windows under Xen and the VMDP drivers, we used the winter to update and rebuild three SLES servers.
The Xen 4.1 hypervisor looked so promising that we wanted to upgrade our servers for new services coming in 2013.

The old, well-running configuration on SLES 10 SP3 and SLES 11 used sparse files for the VM block devices.

So we upgraded our HP DL360 servers with more RAM and rebuilt the RAID with new SAS2 disks in RAID 10.

The local storage was now configured with LVM, and the images were moved with dd into new LVM volumes.
Everything worked fine. At the end I installed the new VMDP drivers 2.02 in the VMs and everything looked good.

The next business day the telephones didn't stop ringing. The applications (a web server with a separate SQL server) ran so slowly that the users could no longer work with them. Each SQL request was processed very quickly, but the connection between the web server and the SQL server had a delay of up to 10 seconds.

So we split the servers across SLES 11 SP2 hosts and the users were happy again.

Afterwards we changed the VMDP driver back to the older version 1.7. Application performance was then better than with 2.02, but never as good as with SLES 10 SP3 or SLES 11.

We do not understand why Xen 4.1, with more RAM, more CPU, and more disks in a better configuration as a phy device, performs so dramatically slower.
We see this on different HP DL180 and DL360 servers.

Do you have any idea?

Greetings
Thomas

ThomasZ wrote:
[color=blue]

We do not understand why Xen 4.1, with more RAM, more CPU, and more
disks in a better configuration as a phy device, performs so
dramatically slower.
We see this on different HP DL180 and DL360 servers.[/color]

You have made a lot of changes which makes it difficult to know exactly
what is causing your performance issues. The correct way to isolate the
issue is to change one thing at a time and see if that change resolves
the performance problem. I realize that is a lot of work and can take a
lot of time.

There have been various performance issues related to TCP Offloading.
Some examples are described in these TIDs: TID7000478, TID3344651,
TID7007604.

TID7005304 describes a workaround:
Howto change network specific settings using ethtool in combination
with NetworkManager
http://www.novell.com/support/kb/doc.php?id=7005304

It’s a simple matter to disable TCP Offloading to see if it resolves
your issue. If it doesn’t, one possible cause has been eliminated.
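
For anyone who wants to try this, here is a minimal sketch of the idea. It only builds the `ethtool -K` command lines and does not run them. The interface name `eth0`, the feature list, and the helper name `build_offload_commands` are assumptions for illustration; check `ethtool -k <interface>` on your own host to see which offload features your driver actually exposes.

```python
# Offload features commonly toggled with `ethtool -K`. Which of them
# matter on a given host is an assumption to verify, not a recommendation.
OFFLOAD_FEATURES = ("tso", "gso", "gro", "tx", "rx", "sg")


def build_offload_commands(interface, features=OFFLOAD_FEATURES, state="off"):
    """Return one `ethtool -K` argument list per feature (nothing is executed)."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    return [["ethtool", "-K", interface, feature, state] for feature in features]


if __name__ == "__main__":
    # "eth0" is a placeholder interface name.
    for cmd in build_offload_commands("eth0"):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to review them, run them by hand as root, and re-enable a feature later by passing `state="on"`.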


Kevin Boyle - Knowledge Partner
If you find this post helpful and are logged into the web interface,
show your appreciation and click on the star below…

Hello Kevin,

Thank you for that promising TID, but it did not improve the performance.
It looks as if the situation with TSO off is worse than before.

But I have some other interesting results.
Under minimal CPU and memory load, the VM performs well over the VNC viewer.
When RDP is used at the same time, the monitor and its applications sometimes hang.
The connection between the SQL server and the web server pauses for 6-8 seconds on new connections.
Loading a nonexistent dummy file takes 600 ms for the application to intercept it.
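
A pause on new connections like the one above can be measured independently of the application with a small script. This is only a sketch: the helper names `time_tcp_connect` and `sample_connects` are made up for illustration, and the demo at the bottom connects to a throwaway local listener so that it runs anywhere. To test the real path, pass the SQL server VM's address and port instead.

```python
import socket
import time


def time_tcp_connect(host, port, timeout=15.0):
    """Return the seconds needed to open (and close) one TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.perf_counter() - start


def sample_connects(host, port, attempts=5):
    """Time several fresh connections in a row and return the durations."""
    return [time_tcp_connect(host, port) for _ in range(attempts)]


if __name__ == "__main__":
    # Self-contained demo against a local listener; replace host and port
    # with the SQL server VM's address to measure the real connection path.
    with socket.socket() as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(5)
        host, port = listener.getsockname()
        for seconds in sample_connects(host, port, attempts=3):
            print(f"{seconds * 1000:.2f} ms")
```

Running it once from the web server VM and once from a machine outside the host would show whether the delay only appears on the host-internal path.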

I believe that the host-internal IP transport between the VMs is the source of the problem.

SLES is fully installed with GNOME but without AppArmor.

On another forum I read that TLS should be deactivated. Could this be what is slowing down the communication between the two VMs?

ThomasZ wrote:
[color=blue]

Could this be
what is slowing down the communication between the two VMs?[/color]

Performance issues can be very difficult to troubleshoot and there are
limits to the kind of help we can provide via these forums. Others may
have some suggestions to offer you if they have experience with issues
like yours.

I would suggest you search the knowledge base. Perhaps you will find
something that can help. If not, then consider opening a Service
Request to have someone look at your system.


Kevin Boyle - Knowledge Partner

Hello Kevin,

It looks like support has found the problem. Here is their answer:

Please go to the Windows VM's adapter Properties and, under the Advanced tab, find the following 2 properties:

  • IPv4 Checksum Offload
  • IPv4 Large Send Offload

Make sure both are set to “Disable”.
The second one (LSO) in particular, if enabled, will cause slowness in most environments. LSO was not available in VMDP 1.7, so you wouldn’t have seen the issue there.

I did this on the “Suse network device” adapter of all Windows VMs with VMDP 2.02, and we reduced the load time of our project website from 21 seconds to 600 ms.
That means that disabling these parameters sped it up about 35 times.
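
As a quick sanity check on the two load times (21 seconds before, 600 ms after), the ratio can be worked out as follows. The helper name `speedup` is made up for illustration.

```python
def speedup(before_seconds, after_seconds):
    """Return how many times faster the 'after' measurement is."""
    if after_seconds <= 0:
        raise ValueError("after_seconds must be positive")
    return before_seconds / after_seconds


before = 21.0  # page load time before the change, in seconds
after = 0.6    # page load time after the change (600 ms)
print(f"speedup: {speedup(before, after):.0f}x")  # prints "speedup: 35x"
```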

Thank you very much for your help
Best regards
Thomas

I appreciate your providing this information. I was aware that
performance suffered with TCP Offload enabled, but I had no idea that the
impact was that significant.

By sharing this information you may help others avoid the same problems.


Kevin Boyle - Knowledge Partner