group_by_prio vs multibus - recommended multipath configuration

OS: SLES 11 SP 3

/dev/dm-0 consists of the paths sdg and sdq plus sdb and sdl.

If “path_grouping_policy” is set to “group_by_prio”, then I observe (via iostat -xk /dev/dm-0) that all I/O goes only to /dev/sdg and /dev/sdq, while sdb and sdl remain 100% idle.
But when “path_grouping_policy” is set to “multibus”, I can see that I/O is distributed across all the disks (sdg, sdq, sdb, sdl).
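
For reference, per-path utilization can be watched with something along these
lines (only a sketch; the device names are the ones from my setup above and the
2-second interval is arbitrary):

    iostat -xk /dev/sdb /dev/sdg /dev/sdl /dev/sdq /dev/dm-0 2

The %util column then makes it obvious which paths are actually carrying I/O.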

When “path_grouping_policy” is set to “group_by_prio”, “multipath -ll” shows:

mpatha (3600c0ff0001e6435600e9b5401000000) dm-0 HP,MSA 2040 SAN
size=2.2T features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 3:0:1:0  sdg 8:96  active ready running
| `- 4:0:1:0  sdq 65:0  active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 3:0:0:0  sdb 8:16  active ready running
  `- 4:0:0:0  sdl 8:176 active ready running

When “path_grouping_policy” is set to “multibus”, “multipath -ll” shows:

mpatha (3600c0ff0001e6435600e9b5401000000) dm-0 HP,MSA 2040 SAN
size=2.2T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=30 status=active
  |- 3:0:0:0  sdb 8:16  active ready running
  |- 3:0:1:0  sdg 8:96  active ready running
  |- 4:0:0:0  sdl 8:176 active ready running
  `- 4:0:1:0  sdq 65:0  active ready running

Please recommend which of the two (group_by_prio vs multibus) is better for performance. Also, /etc/multipath.conf on my system looks like:

defaults {
        polling_interval        10
        path_selector           "round-robin 0"
        path_grouping_policy    group_by_prio
        uid_attribute           "ID_SERIAL"
        prio                    alua
        path_checker            tur
        rr_min_io_rq            100
        flush_on_last_del       no
        max_fds                 "max"
        rr_weight               uniform
        failback                immediate
        no_path_retry           18
        queue_without_daemon    no
        user_friendly_names     yes
        mode                    644
        uid                     0
        gid                     disk
}
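
For reference, the policy change between the two tests can be applied without a
reboot, e.g. (this is only a sketch of how I understand it works on SLES 11):

    multipathd -k"reconfigure"

and then “multipath -ll” confirms the new grouping.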

Hi sharfuddin,

definitely not a subject that I can cover from experience - but logic tells me that only you can decide which one is better:

If all devices are available at equal path cost, then “round-robin” would nicely distribute the work-load across all devices.

If some devices are more expensive than others (e.g. because they’re not “local” or use a different technology), then you’d prioritize accordingly.

Again, I’ve never dealt with multi-pathing, so I may easily be missing some important aspect.

Regards,
Jens

On 02/10/2015 05:44 AM, sharfuddin wrote:
> Please recommend which of the two (group_by_prio vs multibus) is better for
> performance.

Really depends on your hardware setup.

If you have a good storage subsystem and multiple controllers, then round
robining across all active connections could gain you some performance.

But if you have limited controllers (it looks like you have 2?), and your storage
really can’t saturate your connections (just some examples), then last-used or
even priority groups work just fine. Priority groups could work best when
there are different types of controllers and/or topologies in use, in which
case you’d assign priority to the better paths…

Regardless, if you decide to trial and benchmark each way, make sure you are
driving read/write loads from multiple clients. Otherwise, you might not see a
difference at all (of course, depending on config, you still might not see a
big difference).
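
As a sketch of such a benchmark (fio is just one option here, the device name
/dev/mapper/mpatha is taken from your multipath -ll output, and all the numbers
are only illustrative starting points):

    # read-only random I/O against the multipath device, run from each client
    fio --name=mpatha-read --filename=/dev/mapper/mpatha --ioengine=libaio \
        --direct=1 --rw=randread --bs=64k --iodepth=32 --numjobs=4 \
        --runtime=60 --time_based --group_reporting

Watching iostat -xk on the paths while this runs under each policy should show
whether the extra paths actually buy you anything.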

On 02/11/2015 09:45 PM, cjcox wrote:
> [...]

HP has something to say about your storage unit and preferred multipathing
scenario, see:

http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c01476873-1
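
If the result is to keep group_by_prio for this array only (rather than in
defaults), one way to express that is a per-device section. The stanza below is
just a sketch assembled from the settings already in your defaults and the
vendor/product strings from your multipath -ll output; it is not taken from the
HP document, so verify it against what HP actually recommends:

    devices {
        device {
            vendor                "HP"
            product               "MSA 2040 SAN"
            path_grouping_policy  group_by_prio
            prio                  alua
            path_checker          tur
            failback              immediate
            rr_weight             uniform
            rr_min_io_rq          100
            no_path_retry         18
        }
    }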

[QUOTE=cjcox;26325]HP has something to say about your storage unit and preferred multipathing
scenario, see:

http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c01476873-1[/QUOTE]

Gentlemen. Nice help/explanation.