Kernel BUG in 3.12.67-60.64.18-default when using nfsd

It looks like there’s a crash bug in the most recent kernel, most probably related to NFS.

Rolling back to 3.12.62-60.64.8-default fixed the problem, although the users are grumpy now.

So if you use NFS, be wary of this kernel.
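
For anyone who wants to check a host quickly, here is a minimal Python sketch. It assumes that only the exact release string from this report is affected, which nobody has confirmed; `AFFECTED_RELEASES` and `is_affected` are names made up for this illustration.

```python
import platform

# Release string taken from this report. Whether other releases are
# affected is unknown, so this set is an assumption.
AFFECTED_RELEASES = {"3.12.67-60.64.18-default"}


def is_affected(release: str) -> bool:
    """Return True if the kernel release matches the one reported as crashing."""
    return release.strip() in AFFECTED_RELEASES


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    if is_affected(running):
        print(f"{running}: matches the kernel from this report")
    else:
        print(f"{running}: not the kernel from this report")
```

A match only says that the host runs the same kernel build. It does not prove that the NFS server on that host will crash.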

Unfortunately I don’t know whether SUSE is already aware of this bug. I couldn’t find it in Bugzilla, and SRs are scarce…

2016-11-29T11:22:20.320676+01:00 plato kernel: [   88.574649] kernel BUG at ../fs/dcache.c:268!
2016-11-29T11:22:20.320677+01:00 plato kernel: [   88.575635] invalid opcode: 0000 [#1] SMP
2016-11-29T11:22:20.320678+01:00 plato kernel: [   88.576622] Modules linked in: binfmt_misc mptctl mptbase tcp_diag inet_diag xt_pktty
2016-11-29T11:22:20.320680+01:00 plato kernel: [   88.586386] Supported: Yes
2016-11-29T11:22:20.320693+01:00 plato kernel: [   88.587491] CPU: 10 PID: 3863 Comm: nfsd Not tainted 3.12.67-60.64.18-default #1
2016-11-29T11:22:20.320694+01:00 plato kernel: [   88.588597] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 07/01/2015
2016-11-29T11:22:20.320695+01:00 plato kernel: [   88.589698] task: ffff88042b449700 ti: ffff8800b7012000 task.ti: ffff8800b7012000
2016-11-29T11:22:20.320695+01:00 plato kernel: [   88.590793] RIP: 0010:[<ffffffff8151e480>]  [<ffffffff8151e480>] dentry_rcuwalk_barri
2016-11-29T11:22:20.320696+01:00 plato kernel: [   88.591911] RSP: 0018:ffff8800b7013b10  EFLAGS: 00010246
2016-11-29T11:22:20.320697+01:00 plato kernel: [   88.592993] RAX: 0000000000030003 RBX: ffff8803ade58e40 RCX: ffff8804146cf8c0
2016-11-29T11:22:20.320698+01:00 plato kernel: [   88.594102] RDX: 0000000000000003 RSI: ffff8803adf05eb8 RDI: ffffffff82170434
2016-11-29T11:22:20.320699+01:00 plato kernel: [   88.595219] RBP: ffff8804146cf8c0 R08: ffff8800b7013a98 R09: 66bc1adcb35c9ebb
2016-11-29T11:22:20.320699+01:00 plato kernel: [   88.596351] R10: ffff880414705000 R11: 0000000000000000 R12: ffff880398896600
2016-11-29T11:22:20.320700+01:00 plato kernel: [   88.597478] R13: ffff8803ade58e40 R14: ffff8804295be720 R15: ffff8803ade58e40
2016-11-29T11:22:20.320701+01:00 plato kernel: [   88.598601] FS:  0000000000000000(0000) GS:ffff88043f540000(0000) knlGS:0000000000000
2016-11-29T11:22:20.320702+01:00 plato kernel: [   88.599712] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2016-11-29T11:22:20.320702+01:00 plato kernel: [   88.600793] CR2: 000000000069c530 CR3: 0000000001c0c000 CR4: 00000000000407e0
2016-11-29T11:22:20.320703+01:00 plato kernel: [   88.601883] Stack:
2016-11-29T11:22:20.320704+01:00 plato kernel: [   88.602952]  ffffffff811be9ce ffff8803adf05da0 ffff8803ade58e40 ffffffff811c02db
2016-11-29T11:22:20.320704+01:00 plato kernel: [   88.604067]  ffff8804130e6b40 ffff880398896600 ffff8804130e6b40 ffff8804130e6b40
2016-11-29T11:22:20.320705+01:00 plato kernel: [   88.605175]  ffff8803ade58e40 ffff8804295be720 ffff8803ade58e40 ffffffff811b2549
2016-11-29T11:22:20.320706+01:00 plato kernel: [   88.606292] Call Trace:
2016-11-29T11:22:20.320706+01:00 plato kernel: [   88.607403]  [<ffffffff811be9ce>] __d_drop+0xee/0xf0
2016-11-29T11:22:20.320707+01:00 plato kernel: [   88.608510]  [<ffffffff811c02db>] d_materialise_unique+0x25b/0x3d0
2016-11-29T11:22:20.320708+01:00 plato kernel: [   88.609605]  [<ffffffff811b2549>] lookup_real+0x19/0x50
2016-11-29T11:22:20.320709+01:00 plato kernel: [   88.610696]  [<ffffffff811b2e0f>] __lookup_hash+0x2f/0x40
2016-11-29T11:22:20.320709+01:00 plato kernel: [   88.611802]  [<ffffffff811b3c1d>] lookup_one_len+0xcd/0x120
2016-11-29T11:22:20.320710+01:00 plato kernel: [   88.612926]  [<ffffffff8121d7d2>] reconnect_path+0x1c2/0x2e0
2016-11-29T11:22:20.320711+01:00 plato kernel: [   88.614058]  [<ffffffff8121dc1f>] exportfs_decode_fh+0xef/0x2c0
2016-11-29T11:22:20.320711+01:00 plato kernel: [   88.615194]  [<ffffffffa05e5485>] fh_verify+0x2f5/0x5e0 [nfsd]
2016-11-29T11:22:20.320712+01:00 plato kernel: [   88.616346]  [<ffffffffa05f4cad>] nfsd4_proc_compound+0x55d/0x7b0 [nfsd]
2016-11-29T11:22:20.320713+01:00 plato kernel: [   88.617522]  [<ffffffffa05e1d22>] nfsd_dispatch+0xb2/0x200 [nfsd]
2016-11-29T11:22:20.320714+01:00 plato kernel: [   88.618703]  [<ffffffffa033df46>] svc_process_common+0x476/0x6e0 [sunrpc]
2016-11-29T11:22:20.320715+01:00 plato kernel: [   88.619911]  [<ffffffffa033e2bc>] svc_process+0x10c/0x160 [sunrpc]
2016-11-29T11:22:20.320715+01:00 plato kernel: [   88.621113]  [<ffffffffa05e16df>] nfsd+0xaf/0x120 [nfsd]
2016-11-29T11:22:20.320716+01:00 plato kernel: [   88.622329]  [<ffffffff8107b4c4>] kthread+0xb4/0xc0
2016-11-29T11:22:20.320717+01:00 plato kernel: [   88.624019]  [<ffffffff8152f158>] ret_from_fork+0x58/0x90
2016-11-29T11:22:20.320718+01:00 plato kernel: [   88.625237] Code: 66 90 89 d0 49 89 c8 8b 56 24 48 8b 4e 28 3b 46 04 74 08 f3 90 b8 0
2016-11-29T11:22:20.320718+01:00 plato kernel: [   88.627824] RIP  [<ffffffff8151e480>] dentry_rcuwalk_barrier.part.10+0x0/0x2
2016-11-29T11:22:20.320719+01:00 plato kernel: [   88.629071]  RSP <ffff8800b7013b10>
2016-11-29T11:22:20.320720+01:00 plato kernel: [   88.630344] ---[ end trace 1fa9279381ad17fa ]---
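
To search a bug tracker for similar reports, it can help to reduce a log like the one above to its function names. The following Python sketch does that under the assumption that frames look like `[<address>] symbol+0xOFF/0xSIZE [module]`, as they do in this log; `FRAME_RE` and `extract_frames` are hypothetical names, not part of any kernel tooling.

```python
import re

# Matches frames such as "[<ffffffffa05e5485>] fh_verify+0x2f5/0x5e0 [nfsd]".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+"           # return address
    r"(?P<symbol>[\w.]+)"           # function name, may contain dots
    r"\+0x[0-9a-f]+/0x[0-9a-f]+"    # offset into the function / function size
    r"(?:\s+\[(?P<module>\w+)\])?"  # optional module name
)


def extract_frames(log_text):
    """Return (symbol, module) pairs for the frames following 'Call Trace:'."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first non-frame line ends the trace
        frames.append((match.group("symbol"), match.group("module")))
    return frames
```

Called on the log text, this yields the frames in order, starting with `__d_drop` and `d_materialise_unique`. Together with `fs/dcache.c:268`, those names make reasonable search terms.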

I’ve asked contacts within SUSE to see what they think about this; I’ll
let you know if they confirm either way.

In the meantime, could you share some details on your SLES version and
patch level, and on what you are doing with NFS in particular (perhaps
the contents of /etc/exports or /etc/fstab)?
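
If it is easier, those details could be gathered with a small script along these lines. This is only a sketch: the file list is an assumption about where the relevant settings live on your system, and `DEFAULT_FILES` and `collect_details` are illustrative names.

```python
import platform
from pathlib import Path

# Files that commonly hold the requested details; adjust as needed.
DEFAULT_FILES = ("/etc/os-release", "/etc/exports", "/etc/fstab")


def collect_details(paths=DEFAULT_FILES):
    """Return a text report with the kernel release and the given files."""
    parts = [f"kernel release: {platform.release()}"]
    for name in paths:
        try:
            body = Path(name).read_text(errors="replace")
        except OSError as exc:
            # Missing or unreadable files are noted instead of aborting.
            body = f"<not readable: {exc.strerror}>"
        parts.append(f"--- {name} ---\n{body.rstrip()}")
    return "\n".join(parts)


if __name__ == "__main__":
    print(collect_details())
```

Please review the output before posting it, since /etc/fstab and /etc/exports can contain host names or other details you may not want to publish.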


Good luck.

If you find this post helpful and are logged into the web interface,
show your appreciation and click on the star below…

Could you open a Service Request (SR) and cite Bug# 984194?

Are you able to get a core dump for this?

Apparently you are not the first to see a core that may be related, and
the bug above MAY have the same root cause, but it is still under
investigation. Since it is a kernel bug, it is probably best to work
directly with SUSE on getting a fix for you.


Good luck.


Ignore me; unless I’m very mistaken, this is your bug report already. I
was thrown off since it is not very new.

If you have some details of your environment that you can share to try to
reproduce this, I have a SLES 12 SP1 box with the same kernel version
where I could do some NFS testing.


Good luck.
