Running SLES 11 SP2 with latest patches (as of 15/7/2013), SLES HAE SP2 OCFS2 Version 1.6_3.0.13, XEN
The latest kernel update breaks OCFS2 on our 5-node cluster.
All other SLES nodes are running kernel 3.0.58.
I updated one server to kernel 3.0.80 and OCFS2 will no longer mount:
ocfs2 Internal logic failure while trying to join the group
Jul 15 12:48:44 sl-bne-hs23-01 kernel: [ 3496.524201] (mount.ocfs2,21615,11):o2hb_map_slot_data:1638 ERROR: status = -12
Jul 15 12:48:44 sl-bne-hs23-01 kernel: [ 3496.524208] (mount.ocfs2,21615,11):o2hb_region_dev_write:1768 ERROR: status = -12
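A side note on the log lines above: by the usual Linux kernel convention, functions report failures as negative errno values, so `status = -12` corresponds to errno 12, ENOMEM (out of memory). The following is only a minimal sketch for looking up such codes, assuming a Python interpreter is available on the host; the helper name `decode_kernel_status` is hypothetical and not part of any OCFS2 tooling.

```python
import errno
import os


def decode_kernel_status(status):
    """Map a (negative) kernel status code to its errno name and message.

    Hypothetical helper for illustration; kernel code conventionally
    returns -ERRNO, so the sign is dropped before the lookup.
    """
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The value reported by o2hb_map_slot_data and o2hb_region_dev_write above.
name, message = decode_kernel_status(-12)
print("status = -12 -> {}: {}".format(name, message))
```

The exact message text returned by `os.strerror` depends on the platform's C library, so only the symbolic name should be relied on.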
After rolling the kernel back to 3.0.58, the system starts up again with no problems and the OCFS2 mounts come up correctly.
Version 3.0.74 was not tested because we needed to get a live system back online. No other patches had to be downgraded on the 5th server, just kernel-xen and kernel-xen-base.
This kernel version has been marked protected on our SLES cluster servers until a resolution is found.
Thanks
Eric
I just checked the patch announcement, which doesn’t mention “rolling updates”. I for one would therefore expect that rolling updates are supported, but have asked my contacts at SUSE if I’m reading that correctly. I’ll let you know once I’ve received a response.
I just double-checked to confirm yesterday’s status: on my machines I see 3.0.80-7.1. We’re running against our own SMT server, which receives the packages from Novell:
As you can see from the “zypper lu” output above, it’s from the SLES11-SP2 repository, so no, I didn’t confuse it with SP3. Enough coffee this time.
Looking at the repository directory on our SMT server, I see that this update is two weeks old:
Okay, my SMT server says it’s working but I suspect nobody is home. I’ve killed the SMT service and started it again, and a whole bunch of updates are starting to show up. These are production machines, so it will be a while before I can get back to this. As soon as the opportunity presents itself, I’ll try the later kernel.
Thanks
Eric.
If the problem persists after applying the latest updates, please let me know (I’m monitoring this thread) so I can try to get some feedback from SUSE.
Regards,
Jens
Hi Jens,
After resolving my SMT issue (for some reason a file was missing write permissions), I now have the latest kernel patches. However, these machines are in production and, now that other hardware issues have been resolved, they are very stable, so it’s unlikely that I’ll get the opportunity to test the later kernel. We are also preparing to move to SP3, which will mean that our whole virtual infrastructure has to be updated at the same time. This is likely to be the next time the machines are taken down.
I can confirm from our SP3 test environment that there are no problems with OCFS2 under SLES SP3 and HAE SP3.
If an opportunity arises to test the later SP2 kernel, I’ll post a follow-up. Otherwise, I’ll have to leave this issue as resolved.
Thanks
Eric.