unable to boot after SLES11 SP1 to SP2 update

Please, I need help!

Using yast2 wagon (and a couple of reboots), I upgraded from SLES11 SP1 to SP2.
Everything seemed to be OK, but now my machine (HP DL585G2 with 70 TB of data) doesn’t boot anymore.

At boot the green SUSE splash image is shown properly, but after a while the screen goes black and drops to an interactive text console waiting for keyboard input. If I press Return, I get a dollar-sign prompt. I can type a few commands such as ls, pwd, or cd /etc, but to let the boot process continue I have to type exit or Ctrl-D. After I exit the shell, the boot sequence hangs forever because of disk problems (though I hope these are a consequence of the first problem).

Here are some worrying (to me!) boot messages printed just before the console starts waiting for user input:

……
Trying manual resume from /dev/disk/by-id/cciss-xxxxxxxxxxxxxxx-part1
Invoking user space resume from /dev/disk/by-id/cciss-xxxxxxxxxxxxxxx-part1
resume: libcrypt version 1.5.0
Trying manual resume from /dev/disk/by-id/cciss-xxxxxxxxxxxxxxx-part1
[10.673421] PM starting manual resume from disk
Invoking in-kernel resume from /dev/disk/by-id/cciss-xxxxxxxxxxxxxxx-part1
Waiting for device /dev/disk/by-id/cciss-xxxxxxxxxxxxxxx-part2 to appear: ok
Invalid root file system: exiting to /bin/sh
[64.223233] rport-0:0-4: blocked FC remote port time out: removing rprot

After the above messages the console waits for keyboard input.
If I press Return, a dollar sign appears as the new prompt.
When I type exit the boot sequence resumes, but it hangs after a lot of complaints
about devices not being ready.

As you can imagine, my users will be really upset on Monday morning!

Can anybody help me?

If I boot from the SP2 DVD and choose REPAIR, can I get my system back as it was before?

Thanks a lot for any help,
Emanuele Lombardi

It’s me again.
The worst is now over and the system has booted, but it is not yet able to boot without user intervention.

After the “Invalid root file system: exiting to /bin/sh” message I played around at the $ prompt:

$ ls
bin bootsplash dev init lib64 proc run_all.sh sys usr
boot config etc lib mkinitrd.config root sbin tmp var

$ mkdir /lele

$ mount /dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2 /lele

$ ls /lele

HERE IT IS MY OLD /ROOT !!!

$ umount /lele

If I now exit from the shell, the boot process goes on like a charm!

$ exit
Mounting root /dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2
mount -o rw,acl,user_xattr /dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2 /root
Boot logging started on /dev/tty1((no tty)) at Mon Mar 26 10:15:56 2012

The fact that the boot sequence halts when unattended looks like some kind of “boot device not ready” problem.
In fact, if I exit from the shell right away, the mount of /root is not performed and the boot sequence halts a few seconds later,
but if I play around for a few seconds before exiting the shell, the boot continues as it should.
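
One way to test this timing theory from the rescue shell would be to retry the manual mount a few times with a pause in between, instead of exiting right away. The sketch below is only an illustration: `retry` is a hypothetical helper (it is not part of the SLES initrd), and it assumes the shell in use provides `sleep` and POSIX arithmetic.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND up to ATTEMPTS times, sleeping DELAY
# seconds between tries. Returns 0 on the first success, 1 if every
# attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example use (same device and mount point as in the session above):
#   retry 10 2 mount /dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2 /lele
```

If the mount only succeeds after several attempts, that would support the timing explanation; if it succeeds on the first try, the cause is more likely elsewhere.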

The boot device is an HP Smart Array P400 controller and its firmware is up to date.

Thus I still need your kind help!

Ciao from Italy,
Emanuele Lombardi

What kind of filesystem do you use?

Check your /etc/fstab: the filesystem type and the last two columns of the root entry (they should be 1 1).

If your filesystem’s kernel module is missing from the initrd, you have to include it. Boot with this workaround, then create a new initrd: see the INITRD_MODULES and MODULES_LOADED_ON_BOOT lines in /etc/sysconfig/kernel, then run mkinitrd.
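
The fstab check above can be done by eye, but a small helper makes it quicker to report the relevant fields. This is a minimal sketch: `root_fstab_entry` is a made-up name, and it assumes the usual six whitespace-separated fstab columns.

```shell
#!/bin/sh
# root_fstab_entry FILE
# Hypothetical helper: prints the filesystem type, dump flag and fsck pass
# number (columns 3, 5 and 6) of the entry mounted on "/" in FILE,
# skipping comment lines.
root_fstab_entry() {
    awk '$1 !~ /^#/ && $2 == "/" { print $3, $5, $6 }' "$1"
}

# Example use:
#   root_fstab_entry /etc/fstab
```

For a root entry that follows the advice above, the last two fields printed would be 1 1.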

For / I use ext3
This is the relevant line from /etc/fstab
/dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2 / ext3 acl,user_xattr 1 1

Thanks for your help

It’s me again.
Here is the output of:

ls -l /dev/disk/by-id/cciss-3600508b100104a395354583056420005*
lrwxrwxrwx 1 root root 16 Mar 26 15:08 /dev/disk/by-id/cciss-3600508b100104a395354583056420005 -> ../../cciss/c0d0
lrwxrwxrwx 1 root root 18 Mar 26 10:13 /dev/disk/by-id/cciss-3600508b100104a395354583056420005-part1 -> ../../cciss/c0d0p1
lrwxrwxrwx 1 root root 18 Mar 26 10:13 /dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2 -> ../../cciss/c0d0p2

Here are some files relevant to the boot process:

=================== /etc/fstab
/dev/disk/by-id/cciss-3600508b100104a395354583056420005-part1 swap swap defaults 0 0
/dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2 / ext3 acl,user_xattr 1 1

=================== /boot/grub/menu.lst

# Modified by YaST2. Last modification on Sat Mar 24 10:46:23 CET 2012

default 0
timeout 18
##YaST - generic_mbr
gfxmenu (hd0,1)/boot/message
##YaST - activate

###Don’t change this comment - YaST2 identifier: Original name: linux###
title SUSE Linux Enterprise Server 11 SP2SUSE Linux Enterprise Server 11 SP1 - 3.0.13-0.27
root (hd0,1)
kernel /boot/vmlinuz-3.0.13-0.27-default root=/dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2 resume=/dev/disk/by-id/cciss-3600508b100104a395354583056420005-part1 splash=silent showopts vga=0x317
initrd /boot/initrd-3.0.13-0.27-default

###Don’t change this comment - YaST2 identifier: Original name: failsafe###
title Failsafe -- SUSE Linux Enterprise Server 11 SP2SUSE Linux Enterprise Server 11 SP1 - 3.0.13-0.27
root (hd0,1)
kernel /boot/vmlinuz-3.0.13-0.27-default root=/dev/disk/by-id/cciss-3600508b100104a395354583056420005-part2 showopts ide=nodma apm=off noresume edd=off powersaved=off nohz=off highres=off processor.max_cstate=1 nomodeset x11failsafe vga=0x317
initrd /boot/initrd-3.0.13-0.27-default

###Don’t change this comment - YaST2 identifier: Original name: floppy###
title Floppy
rootnoverify (fd0)
chainloader +1

=========================== /boot/grub/device.map
(hd5) /dev/disk/by-id/scsi-3600601600a50250088e562e1277dde11
(fd0) /dev/fd0
(hd1) /dev/disk/by-id/scsi-3600805f300174190c5df1d95d1000084
(hd3) /dev/disk/by-id/scsi-3600805f3001741901b2f730a8fb00082
(hd0) /dev/disk/by-id/cciss-3600508b100104a395354583056420005
(hd6) /dev/sdb
(hd2) /dev/disk/by-id/scsi-3600805f300174190ff40afb90ece0085
(hd4) /dev/disk/by-id/scsi-3600805f300174190317f3c204eee0083

[QUOTE=emanuele_lombardi;3527]Please, I need help!

Using yast2 wagon (and a couple of reboots), I upgraded from SLES11 SP1 to SP2.
Everything seemed to be OK, but now my machine (HP DL585G2 with 70 TB of data) doesn’t boot anymore.

At boot the green SUSE splash image is shown properly, but after a while the screen goes black and drops to an interactive text console waiting for keyboard input. If I press Return, I get a dollar-sign prompt. I can type a few commands such as ls, pwd, or cd /etc, but to let the boot process continue I have to type exit or Ctrl-D. After I exit the shell, the boot sequence hangs forever because of disk problems (though I hope these are a consequence of the first problem).

Here are some worrying (to me!) boot messages printed just before the console starts waiting for user input:
[/QUOTE]

Hello, I had the same problem, so I booted from the DVD and used the repair option. After that it worked …

Kind regards, Meike

Interesting…

I have just built a SLES11 SP1 server and applied all updates.

SP2 came down and it also broke with the same symptoms - “invalid root file system”
Also booted from DVD and used the repair option, which did fix it…something strange here?

Regards

Lewis

itdlrt wrote:
[QUOTE]
SP2 came down and it also broke with the same symptoms - “invalid root
file system”
[/QUOTE]

You aren’t the only one. The same happened to me.
My SLES11-SP2 iso image is on the downed server which is my Internet
gateway. I’m still trying to burn a DVD. :(


Kevin Boyle - Knowledge Partner
If you find this post helpful and are using the web interface,
show your appreciation and click on the star below…

KBOYLE wrote:
[QUOTE]
I’m still trying to burn a DVD.
[/QUOTE]

In the meantime, I went to the console and typed “exit”.

The boot process continued and the server booted successfully. I
immediately uploaded a supportconfig to attach to my opened Service
Request. When I have more information, I’ll update this thread.

If you can, I suggest you open a Service Request. The more information
SUSE/Novell have the quicker they will be able to zero in on the
problem.


Kevin Boyle - Knowledge Partner

I just had the exact same problem with a SLES11 SP2 system that was recently upgraded from SP1. After getting it to boot by exiting the shell, I ran “mkinitrd” to recreate the initrd file. I rebooted and it looks good now!

andy

emanuele lombardi wrote:
[QUOTE]
Please, I need help!
[/QUOTE]

These comments are a little late for you (and for me) but they may help
others.

When installing SLES11-SP2 (or SLES11-SP1) under certain conditions,
mkinitrd may not have been run before the system is rebooted. The initrd
may then be out of sync with the updated packages, causing the subsequent
boot to fail.

Workaround:
The workaround is to simply run mkinitrd (as root)!

su - root
mkinitrd

If you do this after the upgrade has completed but before the system is
rebooted, you may avoid the reboot issues.
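
A rough way to judge whether the initrd might be stale is to compare its modification time with a file that the kernel update rewrites. This is only a heuristic sketch, not an official check: `initrd_older_than` is a hypothetical helper, and using the kernel's modules.dep as the reference file is an assumption.

```shell
#!/bin/sh
# initrd_older_than INITRD REFERENCE
# Hypothetical helper: succeeds (exit 0) if REFERENCE was modified more
# recently than INITRD, which hints that the initrd was not regenerated
# after the update.
initrd_older_than() {
    initrd=$1
    reference=$2
    # find prints REFERENCE only if it is newer than INITRD;
    # -prune keeps find from descending if REFERENCE is a directory.
    [ -n "$(find "$reference" -prune -newer "$initrd" 2>/dev/null)" ]
}

# Example use (initrd name taken from the menu.lst earlier in the thread;
# the modules.dep path is an assumed reference file):
#   initrd_older_than /boot/initrd-3.0.13-0.27-default \
#       /lib/modules/3.0.13-0.27-default/modules.dep \
#       && echo "initrd may be stale; consider running mkinitrd"
```

Running mkinitrd is harmless even when the check is inconclusive, so treat a negative result with caution.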

Many of the systems affected by this issue display the following on the
console:

invalid root filesystem -- exiting to /bin/sh
$

If you find yourself in this situation, you may be able to resume the
boot process by typing “exit” .

invalid root filesystem -- exiting to /bin/sh
$ exit
exit
Mounting root /dev/disk/by-id/................................

Once the system boots, you can then run mkinitrd.

If you are unable to get your system to boot, you may have to open a
Service Request.


Kevin Boyle - Knowledge Partner

I’d check the INITRD_MODULES entry in /etc/sysconfig/kernel,
especially whether it includes the cciss driver. If not, correct this entry
to include all the drivers you need for booting. Then run mkinitrd to
create a new boot image. You can also do that via the repair option on a SLES
11 SP2 DVD.
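
As a quick illustration of that check, the sketch below tests whether a given module name is listed in INITRD_MODULES. `has_initrd_module` is a hypothetical helper, and it assumes the entry has the usual single-line form INITRD_MODULES="mod1 mod2 ...".

```shell
#!/bin/sh
# has_initrd_module FILE MODULE
# Hypothetical helper: returns 0 if MODULE is one of the space-separated
# names in the INITRD_MODULES="..." line of FILE, 1 otherwise.
has_initrd_module() {
    file=$1
    module=$2
    # Extract the value between the double quotes on the INITRD_MODULES line.
    modules=$(sed -n 's/^INITRD_MODULES="\(.*\)"[[:space:]]*$/\1/p' "$file")
    for m in $modules; do
        [ "$m" = "$module" ] && return 0
    done
    return 1
}

# Example use:
#   has_initrd_module /etc/sysconfig/kernel cciss || echo "cciss is not listed"
```

If the module is missing, edit the entry by hand and then run mkinitrd as described above.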

If you have different HP Smart Array controllers in your machine, check whether
some of them need the hpsa driver. If that’s the case, consult the
documentation.

It could also be that the hpsa driver is taking over the Smart
Array P400 too, as it is the newer driver. If you want to use that,
you will have to rearrange quite a lot.

See
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02677069/c02677069.pdf
for further information.


W. Prindl
