Boot takes a very long time

Hello Team,
I am running SLES 15 SP2 on my host, and I get NVMe devices as storage from a SAN box. With very few devices (<50) the boot time is reasonable, under 1 minute. But as I increase the number of devices (>1000), the boot time increases significantly. With about 6000 devices configured, boot takes about 21 minutes, and at 8000 devices it goes up to 33 minutes. I checked systemd-analyze blame, and the service taking the most time is lvm2-monitor, but I couldn’t figure out why it takes so long. Moreover, dev-ttyS0 times out.
Any clue on how I can speed up the boot time?
I am attaching the boot logs with 6000 devices.
Regards

@Smash Hi, if possible can you upload the boot log to a paste site, e.g. https://paste.opensuse.org? Unverified tarballs wouldn’t (or shouldn’t) be downloaded from the forum :wink:
Is it just lvm2-monitor, or other lvm2 services as well? One wonders if it’s hitting a race condition.

I have put the boot logs at https://paste.opensuse.org/46113668

@Smash can you add the following to the GRUB kernel options? crashkernel=162M,high crashkernel=72M,low

Since you have multipath disabled, have you modified /etc/lvm/lvm.conf?

If not, can you read and modify the following lines in the above file:

- multipath_component_detection = 1
+ multipath_component_detection = 0

- md_component_detection = 1
+ md_component_detection = 0

- udev_sync = 1
+ udev_sync = 0

- udev_rules = 1
+ udev_rules = 0
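
To double-check that the four edits above actually landed in /etc/lvm/lvm.conf, something like the following rough sketch could be used. It is only an illustration: the helper names `read_settings` and `mismatches` are made up here, and the parser is deliberately simplistic (it only understands plain `name = value` lines and ignores section nesting). In a stock lvm.conf the first two settings live in the `devices` section and the last two in the `activation` section.

```python
import re

# Values recommended in the post above (names as they appear in lvm.conf).
WANTED = {
    "multipath_component_detection": "0",
    "md_component_detection": "0",
    "udev_sync": "0",
    "udev_rules": "0",
}


def read_settings(text):
    """Return {name: value} for uncommented 'name = value' lines.

    Simplified on purpose: section braces and multi-token values
    (lists, quoted strings with spaces) are skipped.
    """
    found = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        m = re.match(r"^(\w+)\s*=\s*(\S+)$", line)
        if m:
            found[m.group(1)] = m.group(2)
    return found


def mismatches(text, wanted=WANTED):
    """Return the wanted settings whose current value differs (None = not set)."""
    found = read_settings(text)
    return {k: found.get(k) for k, v in wanted.items() if found.get(k) != v}
```

Calling `mismatches(open("/etc/lvm/lvm.conf").read())` on the host should return an empty dict once all four lines are changed; anything it does return is a setting that still has the old value or is commented out.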

Hello Malcom, thanks a lot for this tip. This greatly improved the boot time, reducing it from 33 minutes to 5 minutes. But I see that udev is still configuring the devices even after boot has completed. Is there any way to increase the number of udev threads or speed up the creation of those udev files?
I have added blame and boot logs for 8000 devices.
blame - https://paste.opensuse.org/39558159
boot log - https://paste.opensuse.org/56897874
Apart from that, I am still trying to find out how to stop ttyS0 from timing out. It starts timing out as soon as 2000 devices are added. Any clues?

@Smash Hi, the maintenance service is transient. For postfix, are you using IPv6? If not, /etc/postfix/main.cf needs an edit to change inet_protocols from all to ipv4.
Since you’re not using plymouth, I would remove all the installed plymouth packages (about 10), add a zypper lock and rebuild the initrd. See how that goes for the moment.

Hello Malcom,
I have made the required changes, and also incorporated some changes for my own needs (like adding docker). I have added new logs for 8000 devices. systemd-analyze now reports:
Startup finished in 4.029s (kernel) + 2min 30.783s (initrd) + 3min 13.203s (userspace) = 5min 48.016s
Blame[8000]: https://paste.opensuse.org/64440302
Boot log[8000]: https://paste.opensuse.org/86475707
But when I had 16K devices configured, the boot time shot up to 12 minutes:
Startup finished in 4.029s (kernel) + 9min 9.478s (initrd) + 3min 21.455s (userspace) = 12min 34.962s
Blame [16000]: https://paste.opensuse.org/570656
Boot log [16000]: https://paste.opensuse.org/99398385
Any idea why “dracut-initqueue.service” is taking so long with 16K devices?
Overall, things are in much better shape now, except that I am not able to log in to the console because ttyS0 times out, and I don’t have a solution for that yet.
Thanks
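
For anyone comparing the two systemd-analyze summaries above, here is a small sketch that splits each "Startup finished in ..." line into its phases and shows how much each one grew between 8000 and 16K devices. The function names (`to_seconds`, `phases`, `growth`) are made up for illustration, and the parser only handles the h/min/s/ms units that show up in lines like these. With the numbers from this thread, the initrd phase goes from about 151 s to about 549 s, while kernel and userspace barely move, which fits the observation about dracut-initqueue.

```python
import re

UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001}


def to_seconds(span):
    """Convert a time span such as '2min 30.783s' to seconds."""
    total = 0.0
    for value, unit in re.findall(r"([\d.]+)(h|min|ms|s)\b", span):
        total += float(value) * UNIT_SECONDS[unit]
    return total


def phases(line):
    """Return {phase: seconds} from a 'Startup finished in ...' line."""
    pattern = r"((?:[\d.]+(?:h|min|ms|s) ?)+) \((\w+)\)"
    return {name: to_seconds(span) for span, name in re.findall(pattern, line)}


def growth(before, after):
    """Per-phase ratio after/before for phases present in both."""
    return {k: after[k] / before[k] for k in before if k in after}


# The two summary lines quoted in the post above.
LINE_8K = ("Startup finished in 4.029s (kernel) + 2min 30.783s (initrd) "
           "+ 3min 13.203s (userspace) = 5min 48.016s")
LINE_16K = ("Startup finished in 4.029s (kernel) + 9min 9.478s (initrd) "
            "+ 3min 21.455s (userspace) = 12min 34.962s")

for phase, ratio in growth(phases(LINE_8K), phases(LINE_16K)).items():
    print(f"{phase}: x{ratio:.2f}")
```

Only the initrd ratio is well above 1, so the extra time with 16K devices is spent almost entirely before the switch to the real root filesystem.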