Hi,
I have a problem with multipath on some servers.
HW setup:
Dell R720 with 64 GB RAM
2x qla2xxx FC controllers with 2 ports each
2x Dell MD3620f storage arrays
Installed drivers:
qla2xxx (from Dell)
scsi_dh_rdac (from Dell)
After a path failure I get the following errors:
[ 2513.720533] rport-10:0-12: blocked FC remote port time out: removing rport
[ 2513.720542] rport-10:0-13: blocked FC remote port time out: removing rport
[ 2513.720547] rport-10:0-11: blocked FC remote port time out: removing rport
[ 2514.104142] rport-9:0-11: blocked FC remote port time out: removing rport
[ 2514.110747] rport-11:0-11: blocked FC remote port time out: removing rport
[ 2514.619173] rport-8:0-11: blocked FC remote port time out: removing rport
[ 2571.599640] device-mapper: multipath: Failing path 70:48.
[ 2571.599952] device-mapper: multipath: Failing path 70:80.
[ 2571.600215] device-mapper: multipath: Failing path 70:112.
[ 2571.600479] device-mapper: multipath: Failing path 70:128.
[ 2571.600737] device-mapper: multipath: Failing path 70:144.
[ 2571.601067] device-mapper: multipath: Failing path 70:160.
[ 2571.601325] device-mapper: multipath: Failing path 70:192.
[ 2571.601716] device-mapper: multipath: Failing path 70:240.
[ 2571.601978] device-mapper: multipath: Failing path 71:16.
[ 2571.602237] device-mapper: multipath: Failing path 71:48.
[ 2571.602502] device-mapper: multipath: Failing path 71:64.
[ 2571.602750] device-mapper: multipath: Failing path 71:80.
[ 2571.603007] device-mapper: multipath: Failing path 71:96.
[ 2571.603265] device-mapper: multipath: Failing path 71:128.
[ 2571.603543] device-mapper: multipath: Failing path 71:176.
[ 2571.603786] device-mapper: multipath: Failing path 71:208.
[ 2571.604036] device-mapper: multipath: Failing path 71:240.
[ 2571.604282] device-mapper: multipath: Failing path 128:0.
[ 2571.604527] device-mapper: multipath: Failing path 128:32.
[ 2571.604869] device-mapper: multipath: Failing path 128:64.
[ 2571.605131] device-mapper: multipath: Failing path 128:16.
[ 2571.605395] device-mapper: multipath: Failing path 128:112.
[ 2571.605724] device-mapper: multipath: Failing path 128:176.
[ 2571.605977] device-mapper: multipath: Failing path 128:144.
[ 2571.606227] device-mapper: multipath: Failing path 128:192.
[ 2571.606484] device-mapper: multipath: Failing path 128:224.
[ 2571.606730] device-mapper: multipath: Failing path 128:208.
[ 2571.607052] device-mapper: multipath: Failing path 129:0.
[ 2603.690284] rport-10:0-7: blocked FC remote port time out: removing target and saving binding
[ 2603.690307] rport-10:0-1: blocked FC remote port time out: removing target and saving binding
[ 2603.690317] rport-10:0-2: blocked FC remote port time out: removing target and saving binding
[ 2603.690325] rport-10:0-3: blocked FC remote port time out: removing target and saving binding
[ 2603.690334] rport-10:0-8: blocked FC remote port time out: removing target and saving binding
[ 2603.691765] rport-10:0-4: blocked FC remote port time out: removing target and saving binding
[ 2603.691781] rport-10:0-5: blocked FC remote port time out: removing target and saving binding
[ 2603.691790] rport-10:0-6: blocked FC remote port time out: removing target and saving binding
[ 2603.692404] sd 10:0:6:0: rdac: Detached
[ 2603.703358] sd 10:0:6:0: [sdee] Synchronizing SCSI cache
[ 2603.703399] sd 10:0:6:0: [sdee] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2603.704404] sd 10:0:6:1: rdac: Detached
[ 2603.706457] sd 10:0:6:1: [sdef] Synchronizing SCSI cache
[ 2603.706504] sd 10:0:6:1: [sdef] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2603.707020] sd 10:0:6:2: rdac: Detached
[ 2603.719430] sd 10:0:6:2: [sdeg] Synchronizing SCSI cache
[ 2603.719479] sd 10:0:6:2: [sdeg] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2603.720086] sd 10:0:6:3: rdac: Detached
[ 2603.726895] sd 10:0:6:3: [sdeh] Synchronizing SCSI cache
[ 2603.726985] sd 10:0:6:3: [sdeh] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2603.727822] sd 10:0:6:4: rdac: Detached
[ 2603.735006] sd 10:0:6:4: [sdei] Synchronizing SCSI cache
[ 2603.735053] sd 10:0:6:4: [sdei] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2603.735617] sd 10:0:6:5: rdac: Detached
[ 2603.741101] sd 10:0:6:5: [sdej] Synchronizing SCSI cache
[ 2603.741149] sd 10:0:6:5: [sdej] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2628.151416] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:3:2154]
[ 2628.151419] Modules linked in: ipmi_si iptable_filter ip_tables x_tables xfs binfmt_misc edd mpt2sas scsi_transport_sas raid_class mptctl mptbase ipmi_devintf ipmi_msghandler dell_rbu(X) bonding mperf microcode fuse nls_utf8 loop pciehp qla2xxx(X) joydev usbhid hid usb_storage ixgbe dca scsi_transport_fc scsi_tgt tg3 shpchp pci_hotplug mdio sg sr_mod cdrom ipv6 ipv6_lib wmi dcdbas(X) pcspkr acpi_power_meter acpi_pad button iTCO_wdt iTCO_vendor_support rtc_cmos dm_round_robin ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw dm_snapshot dm_multipath dm_mod scsi_dh_rdac scsi_dh ext3 mbcache jbd ahci libahci libata megaraid_sas scsi_mod [last unloaded: ipmi_si]
[ 2628.151467] Supported: Yes
[ 2628.151468] CPU 0
[ 2628.151469] Modules linked in: ipmi_si iptable_filter ip_tables x_tables xfs binfmt_misc edd mpt2sas scsi_transport_sas raid_class mptctl mptbase ipmi_devintf ipmi_msghandler dell_rbu(X) bonding mperf microcode fuse nls_utf8 loop pciehp qla2xxx(X) joydev usbhid hid usb_storage ixgbe dca scsi_transport_fc scsi_tgt tg3 shpchp pci_hotplug mdio sg sr_mod cdrom ipv6 ipv6_lib wmi dcdbas(X) pcspkr acpi_power_meter acpi_pad button iTCO_wdt iTCO_vendor_support rtc_cmos dm_round_robin ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw dm_snapshot dm_multipath dm_mod scsi_dh_rdac scsi_dh ext3 mbcache jbd ahci libahci libata megaraid_sas scsi_mod [last unloaded: ipmi_si]
[ 2628.151498] Supported: Yes
[ 2628.151500]
[ 2628.151502] Pid: 2154, comm: kworker/0:3 Tainted: G X 3.0.42-0.7-default #1 Dell Inc. PowerEdge R720/0VWT90
[ 2628.151506] RIP: 0010:[] [] _raw_spin_unlock_irqrestore+0x8/0x10
[ 2628.151514] RSP: 0018:ffff8807f65e3df8 EFLAGS: 00000202
[ 2628.151516] RAX: ffff881000375000 RBX: 0000000000001654 RCX: 000000000000948c
[ 2628.151518] RDX: 000000000000948c RSI: 0000000000000202 RDI: 0000000000000202
[ 2628.151519] RBP: ffff8807f7d22060 R08: ffffc900060e6000 R09: 0000000000000594
[ 2628.151521] R10: 0000000000001654 R11: 00000000fffffffc R12: ffffffff8144b66e
[ 2628.151523] R13: ffff8807f7d22060 R14: ffffffff8144b66e R15: ffff8807f5696800
[ 2628.151525] FS: 0000000000000000(0000) GS:ffff88082fc00000(0000) knlGS:0000000000000000
[ 2628.151527] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2628.151529] CR2: 00007f1cea4af214 CR3: 00000007fd5c9000 CR4: 00000000000406f0
[ 2628.151530] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2628.151532] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2628.151534] Process kworker/0:3 (pid: 2154, threadinfo ffff8807f65e2000, task ffff8807fdc560c0)
[ 2628.151536] Stack:
[ 2628.151539] ffffffffa000f46d ffff8807f7d223c8 0000000000000000 ffff8807f7d223c8
[ 2628.151544] ffff8807fdfabb40 ffff88082fc0cf80 ffffffff8107426c ffffe8f7ffa02e00
[ 2628.151547] 0000000000000000 ffffe8f7ffc0b300 ffff8807fdfabb40 ffff88082fc0cf80
[ 2628.151550] Call Trace:
[ 2628.151568] [] scsi_remove_target+0xbd/0xf0 [scsi_mod]
[ 2628.151587] [] process_one_work+0x16c/0x350
[ 2628.151593] [] worker_thread+0x17a/0x410
[ 2628.151597] [] kthread+0x96/0xa0
[ 2628.151602] [] kernel_thread_helper+0x4/0x10
[ 2628.151605] Code: 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 07 f3 90 0f b7 17 eb f5 c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 ff 07 48 89 f7 57 9d
[ 2628.151622] 66 90 66 90 c3 66 90 b8 ff ff ff ff f0 0f c1 07 83 e8 01 ba
[ 2628.151629] Call Trace:
[ 2628.151637] [] scsi_remove_target+0xbd/0xf0 [scsi_mod]
[ 2628.151650] [] process_one_work+0x16c/0x350
[ 2628.151654] [] worker_thread+0x17a/0x410
[ 2628.151657] [] kthread+0x96/0xa0
[ 2628.151661] [] kernel_thread_helper+0x4/0x10
[ 2656.107213] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:3:2154]
[ 2656.107215] Modules linked in: ipmi_si iptable_filter ip_tables x_tables xfs binfmt_misc edd mpt2sas scsi_transport_sas raid_class mptctl mptbase ipmi_devintf ipmi_msghandler dell_rbu(X) bonding mperf microcode fuse nls_utf8 loop pciehp qla2xxx(X) joydev usbhid hid usb_storage ixgbe dca scsi_transport_fc scsi_tgt tg3 shpchp pci_hotplug mdio sg sr_mod cdrom ipv6 ipv6_lib wmi dcdbas(X) pcspkr acpi_power_meter acpi_pad button iTCO_wdt iTCO_vendor_support rtc_cmos dm_round_robin ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw dm_snapshot dm_multipath dm_mod scsi_dh_rdac scsi_dh ext3 mbcache jbd ahci libahci libata megaraid_sas scsi_mod [last unloaded: ipmi_si]
[ 2656.107264] Supported: Yes
[ 2656.107265] CPU 0
[ 2656.107266] Modules linked in: ipmi_si iptable_filter ip_tables x_tables xfs binfmt_misc edd mpt2sas scsi_transport_sas raid_class mptctl mptbase ipmi_devintf ipmi_msghandler dell_rbu(X) bonding mperf microcode fuse nls_utf8 loop pciehp qla2xxx(X) joydev usbhid hid usb_storage ixgbe dca scsi_transport_fc scsi_tgt tg3 shpchp pci_hotplug mdio sg sr_mod cdrom ipv6 ipv6_lib wmi dcdbas(X) pcspkr acpi_power_meter acpi_pad button iTCO_wdt iTCO_vendor_support rtc_cmos dm_round_robin ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw dm_snapshot dm_multipath dm_mod scsi_dh_rdac scsi_dh ext3 mbcache jbd ahci libahci libata megaraid_sas scsi_mod [last unloaded: ipmi_si]
[ 2656.107295] Supported: Yes
[ 2656.107296]
[ 2656.107298] Pid: 2154, comm: kworker/0:3 Tainted: G X 3.0.42-0.7-default #1 Dell Inc. PowerEdge R720/0VWT90
[ 2656.107301] RIP: 0010:[] [] _raw_spin_unlock_irqrestore+0x8/0x10
[ 2656.107310] RSP: 0018:ffff8807f65e3df8 EFLAGS: 00000202
[ 2656.107312] RAX: ffff881000375000 RBX: 000000000000674f RCX: 000000000000c76b
[ 2656.107313] RDX: 000000000000c76b RSI: 0000000000000202 RDI: 0000000000000202
[ 2656.107315] RBP: ffff8807f7d22060 R08: ffffc900060e6000 R09: 0000000000000594
[ 2656.107317] R10: 0000000000001654 R11: 00000000fffffffc R12: ffffffff8144b66e
[ 2656.107319] R13: 0000000000000594 R14: 0000000000001654 R15: 00000000fffffffc
[ 2656.107321] FS: 0000000000000000(0000) GS:ffff88082fc00000(0000) knlGS:0000000000000000
[ 2656.107323] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2656.107324] CR2: 00007f1cea4af214 CR3: 00000007fd5c9000 CR4: 00000000000406f0
[ 2656.107326] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2656.107328] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2656.107330] Process kworker/0:3 (pid: 2154, threadinfo ffff8807f65e2000, task ffff8807fdc560c0)
[ 2656.107331] Stack:
[ 2656.107335] ffffffffa000f46d ffff8807f7d223c8 0000000000000000 ffff8807f7d223c8
.
.
.
.
I tried both with and without transparent_hugepage=never; it made no difference.
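
In case it helps with reading the log: the "Failing path" lines give major:minor numbers, while the later sd lines give names like sdee. Below is a minimal sketch for translating one into the other. It assumes the standard static sd numbering (majors 8, 65-71 and 128-135, 16 minors per disk) and default kernel names, so it only covers the first 256 disks and will be wrong if udev renamed anything. The function name sd_name is my own, not from any tool.

```python
# Block majors reserved for SCSI disks, in allocation order (assumed static layout).
SD_MAJORS = [8] + list(range(65, 72)) + list(range(128, 136))


def sd_name(major, minor):
    """Return the default sd device name for a major:minor pair.

    Only valid for the first 256 disks (minor < 256); raises ValueError
    if the major is not one of the SCSI disk majors.
    """
    if minor < 0 or minor > 255:
        raise ValueError("extended minors are not handled by this sketch")
    block = SD_MAJORS.index(major)    # which group of 16 disks
    index = block * 16 + minor // 16  # zero-based disk index
    # Disk names count a..z, aa..zz, ... (bijective base 26).
    n = index + 1
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters
    partition = minor % 16            # 0 means the whole disk
    return "sd" + letters + (str(partition) if partition else "")
```

With this, sd_name(70, 48) gives sdcv for the first failing path in the log, and sd_name(128, 96) gives sdee. On a live system, ls -l /dev/sd* or lsblk shows the same mapping directly.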