SLES12 in HyperV

Hi,

there is a SLES12 SP2 guest on a HyperV host, and this host has a RAW-SAN connection.

But under heavy load on the SAN store the wait states increase far too
much. (Sometimes they reach 80!)

On the old system (SLES11), a physical host but with the same SAN, we
never had such high wait states.

What should be the next steps to improve the performance?

Bernd

Hi Bernd,

I’m not familiar with your setup: is “raw-san” the same as “pass-through disks”? Or is the (Fibre Channel? iSCSI?) adapter passed through to the guest system? It would help to know a bit more about the actual “device stack”, including how you’re using those disks inside the SLES12 guest: some details about the access pattern, raw disk device access or a file system on top, virtual disks or LUNs behind a passed-through SAN adapter, what the setup looks like at the HyperV level, and what the actual SAN technology is.

Can you tell if your SLES12 guest is creating lots of I/O (especially compared to the replaced SLES11 host), or if it is waiting for I/O even with not much I/O occurring at the guest level?
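
If it helps, a couple of samples taken inside the guest while the load is high would already tell us a lot. Something along these lines (plain sysstat/procps tools, nothing HyperV-specific) should do:

    iostat -x 2 6    # extended per-device statistics, 2-second interval, 6 samples
    vmstat 2 6       # overall CPU view; the "wa" column is the iowait percentage

That would show whether the guest itself is issuing lots of requests, or whether it is mostly sitting in iowait with little actual I/O going on.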

Regards,
J

On 05.12.18 at 13:34, jmozdzen wrote: (…)

> I’m not familiar with your setup: is “raw-san” the same as “pass-through
> disks”? Or is the (Fibre Channel? iSCSI?) adapter passed through to the
> guest system?

There are LUNs from a SAN. They are presented to the VM, but they are connected to the HyperV host (qlogic driver), and the SLES guest uses the 'hv_storvsc' kernel module from the HyperV guest tools to access the LUNs. The LUNs are raw access, with a strange file system (NSS).

> Can you tell if your SLES12 guest is creating lots of I/O (especially
> compared to the replaced SLES11 host), or if it is waiting for I/O even
> with not much I/O occurring at the guest level?

I have no data from the SLES11 right now (because the LUN is already in use on the new host). But iostat says (6 samples of 2 seconds each):

 rrqm/s  wrqm/s      r/s     w/s     rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await  w_await  svctm  %util
 267,00   28,00  1040,50  226,50   9322,00  1102,00    16,45   150,59  151,93   12,75   791,33   0,79 100,00
 256,50   10,50   983,50  106,00   7618,00   474,00    14,85   152,99   97,95   11,16   903,19   0,92 100,00
 132,50   12,50   962,50   37,00   9072,00   154,00    18,46   165,22   68,43   14,29  1476,65   1,00 100,00
 160,00   23,50  1765,00   28,00  23980,00   114,00    26,88   176,77   83,40   13,31  4501,50   0,56 100,00
 242,50  174,50  1593,00  117,00  13032,00   622,00    15,97   164,29  158,40    8,88  2194,17   0,58 100,00
 179,50    7,00  1272,50   14,50  10894,00   378,00    17,52   168,67   36,34   11,26  2236,83   0,78 100,00

From my point of view this is not extremely slow, but in 'top' I see
I/O wait states (the "wa" value) around 30-50.

But maybe top is wrong?

Bernd

Hi Bernd,

to me it looks like a problem with the interaction with the hypervisor. I haven’t toyed with HyperV yet, so have no practical experience with it. But those numbers don’t look very impressive to me, except for “the disk is 100% busy” :wink:

If the LUNs aren’t as busy on the host system, too, then maybe some optimization regarding that driver interface could improve the situation, but that’s beyond what I can be of help with. Do you have a vendor for the HyperV system you could ask for details on the hypervisor side of things?
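
One generic thing that might be worth a quick look inside the guest, independent of the HyperV specifics (and this is just a guess on my side): the I/O scheduler and the queue depth of the affected disk. “sdX” below is only a placeholder for whatever device name the LUN gets in your guest:

    cat /sys/block/sdX/queue/scheduler            # active scheduler is shown in [brackets]
    echo noop > /sys/block/sdX/queue/scheduler    # as root: switch to the simple noop scheduler
    cat /sys/block/sdX/device/queue_depth         # current SCSI queue depth of the device

On virtual disks, noop is often a better fit than cfq, since the hypervisor and the SAN already do their own request ordering.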

Regards,
J