SLES12SP2 KVM guest: Uhhuh. NMI received for unknown reason 30 on CPU 0.

Hi,

the complete message is:

[ 3411.029025] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[ 3411.029028] Do you have a strange power saving mode enabled?
[ 3411.029028] Dazed and confused, but trying to continue
[ 3441.028990] Uhhuh. NMI received for unknown reason 20 on CPU 0.
[ 3441.028996] Do you have a strange power saving mode enabled?
[ 3441.028996] Dazed and confused, but trying to continue
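
For reference, the two reason values can be decoded with a minimal sketch like the one below. It assumes the guest kernel takes the reason byte from the legacy NMI status port 0x61, which is the default on x86 as far as I know; the helper name `decode_nmi_reason` and the bit descriptions are illustrative, not taken from the kernel source. Under that assumption neither 30 nor 20 has the SERR or IOCHK bit set, which would be why the kernel calls the reason "unknown".

```python
# Upper-nibble bits of the legacy NMI status/control port 0x61.
# Assumption: the reason in the message is the value read from this port.
PORT_61_BITS = {
    0x80: "SERR (PCI system error)",
    0x40: "IOCHK (I/O channel check)",
    0x20: "timer 2 output",
    0x10: "refresh toggle",
}

# Only these two bits identify an NMI source the kernel knows how to handle.
KNOWN_NMI_SOURCES = 0x80 | 0x40


def decode_nmi_reason(reason_hex):
    """Decode the two-digit hex reason from the 'unknown reason' message."""
    value = int(reason_hex, 16)
    set_bits = [name for mask, name in PORT_61_BITS.items() if value & mask]
    return {
        "value": value,
        "bits": set_bits,
        "has_known_source": bool(value & KNOWN_NMI_SOURCES),
    }


if __name__ == "__main__":
    for reason in ("30", "20"):
        print(reason, decode_nmi_reason(reason))
```
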

Once it starts, the message repeats regularly (about every 30 seconds),
and the guest typically locks up completely after 1-2 days.
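
The 30-second spacing can be checked directly from the kernel log. The following is only a sketch: the function names `nmi_events` and `intervals` are made up for this example, and the regular expression assumes the exact message format shown above.

```python
import re

# Matches only the first line of each three-line NMI report.
NMI_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+Uhhuh\. NMI received for unknown reason "
    r"(?P<reason>[0-9a-fA-F]{2}) on CPU (?P<cpu>\d+)\."
)


def nmi_events(log_text):
    """Return (timestamp, reason, cpu) tuples for every unknown-NMI line."""
    events = []
    for line in log_text.splitlines():
        match = NMI_LINE.match(line.strip())
        if match:
            events.append(
                (float(match.group("ts")), match.group("reason"), int(match.group("cpu")))
            )
    return events


def intervals(events):
    """Seconds between consecutive NMI reports, rounded to milliseconds."""
    return [round(b[0] - a[0], 3) for a, b in zip(events, events[1:])]


SAMPLE = """\
[ 3411.029025] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[ 3411.029028] Do you have a strange power saving mode enabled?
[ 3411.029028] Dazed and confused, but trying to continue
[ 3441.028990] Uhhuh. NMI received for unknown reason 20 on CPU 0.
"""

if __name__ == "__main__":
    print(intervals(nmi_events(SAMPLE)))
```

Feeding it the output of `dmesg` from an affected guest should show whether the interval stays constant over time.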

Host and guest are both SLES12SP2 with kernel-4.4.21-84/88. There is no
problem when the guest is freshly started (virsh start guest), but as
soon as I issue a ‘reboot’ within the guest, the messages start after a
few seconds.

Has anyone seen this, and does anyone know a workaround?

Franz.

On 16/12/16 13:48, Franz Sirl wrote:
[color=blue]

Anyone seen this and knows a workaround?[/color]

Can you try updating the guest (and possibly the host) to the latest
kernel available for SLES12 SP2, 4.4.21-90.1?

HTH.

Simon
SUSE Knowledge Partner


On 2016-12-16 at 15:31, Simon Flood wrote:[color=blue]

Can you try updating a guest (and possibly the host) to latest kernel
available for SLES12 SP2 - 4.4.21-90.1 ?[/color]

Hi Simon,

I misremembered: it was with kernel-4.4.21-84/90. The host is still on
kernel-4.4.21-84, although qemu and xen-libs on the host have the latest
updates. I just retried the reboot on the guest, and after ~10min the
messages appeared again.

I’ll try to fully update the host too over the weekend, but I don’t have
much hope because only 2 CVEs are listed for the -90 kernel update.

Franz.

On 2016-12-16 at 19:31, Franz Sirl wrote:[color=blue]

I’ll try to fully update the host too over the weekend, but I don’t have
much hope because only 2 CVEs are listed for the -90 kernel update.[/color]

I was right: the -90 kernel update doesn't change anything. But I think
I have found a hint now. It only happens with the Q35 emulation; another
VM with the I440FX emulation reboots fine. So there is likely a problem
in qemu with the Q35 emulation. I'll try to find out more when there is
time.

Franz
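
To check which emulation a given guest uses, the machine type can be read from the libvirt domain XML (the `machine` attribute of `<os><type>`). The sketch below is a rough illustration: the sample XML and the version suffix `2.6` are invented for the example, and the helper names `machine_type` and `chipset_family` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened domain XML; real `virsh dumpxml` output
# contains many more elements.
SAMPLE_Q35 = """
<domain type='kvm'>
  <name>guest</name>
  <os>
    <type arch='x86_64' machine='pc-q35-2.6'>hvm</type>
  </os>
</domain>
"""


def machine_type(domain_xml):
    """Return the machine attribute of <os><type>, or None if absent."""
    root = ET.fromstring(domain_xml)
    os_type = root.find("./os/type")
    return None if os_type is None else os_type.get("machine")


def chipset_family(machine):
    """Classify a machine type string as 'q35', 'i440fx' or 'other'."""
    if machine is None:
        return "other"
    name = machine.lower()
    if "q35" in name:
        return "q35"
    if "i440fx" in name:
        return "i440fx"
    return "other"


if __name__ == "__main__":
    machine = machine_type(SAMPLE_Q35)
    print(machine, chipset_family(machine))
```

In practice the XML would come from `virsh dumpxml guest` instead of the sample string, which makes it easy to compare the VM that shows the messages with the one that reboots fine.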