We have a global file system (Quantum StorNEXT) up and running in an HA config.
Two MDCs are controlling the file systems. The hardware is two DELL 1950s, each with 16 GB RAM, a 3 GHz CPU and a 4 Gb FC HBA.
Both are running SLES 10 SP4 (x86_64).
We exchanged one machine. The new one is a DELL R620 with 32 GB RAM, a 3.3 GHz E5 CPU and an 8 Gb FC HBA.
On this node we installed SLES 11 SP1 (higher is not supported for the software).
The node is working well, but its performance as metadata server for the file system
is 10-20% SLOWER compared to the 1950!
In both cases we are using only the default drivers; QLogic HBAs are installed in both machines.
Any idea what's going on? We never set any kernel-specific parameters …
It's the first time I have seen a new machine performing slower than an old one!
Have you already tried to identify the current bottleneck, e.g. CPU, network, FC, local disks? Is running mixed OS releases supported with your HA solution?
It might be the setup of the new machine, but it might as well be some incompatibility between the two HA nodes leading to massive overhead. Has the network access changed (e.g. 100 Mbps to Gigabit)? Is the link properly configured (auto-negotiation sometimes does funny things to your link configuration)?
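If you want a quick first look, something like this snapshot sketch (Python with the third-party psutil package; generic OS counters, nothing StorNEXT-specific) shows where the time is going:

[CODE]
#!/usr/bin/env python3
"""Snapshot the usual suspects over a short interval and print deltas.
Sketch only; requires the third-party psutil package."""
import time
import psutil

INTERVAL = 5  # seconds per sample window

# cpu_times_percent() blocks for INTERVAL and reports the split itself:
# high iowait hints at storage, high system time at drivers/interrupts.
cpu = psutil.cpu_times_percent(interval=INTERVAL)
print(f"CPU: user={cpu.user}% system={cpu.system}% iowait={cpu.iowait}%")

d0, n0 = psutil.disk_io_counters(perdisk=True), psutil.net_io_counters(pernic=True)
time.sleep(INTERVAL)
d1, n1 = psutil.disk_io_counters(perdisk=True), psutil.net_io_counters(pernic=True)

for disk in sorted(d0.keys() & d1.keys()):
    rd = (d1[disk].read_bytes - d0[disk].read_bytes) / INTERVAL / 2**20
    wr = (d1[disk].write_bytes - d0[disk].write_bytes) / INTERVAL / 2**20
    print(f"{disk}: read {rd:.1f} MiB/s, write {wr:.1f} MiB/s")

for nic in sorted(n0.keys() & n1.keys()):
    rx = (n1[nic].bytes_recv - n0[nic].bytes_recv) / INTERVAL / 2**20
    tx = (n1[nic].bytes_sent - n0[nic].bytes_sent) / INTERVAL / 2**20
    errs = (n1[nic].errin + n1[nic].errout) - (n0[nic].errin + n0[nic].errout)
    print(f"{nic}: rx {rx:.1f} MiB/s, tx {tx:.1f} MiB/s, new errors {errs}")
[/CODE]

High iowait points at the SAN/disk path, high system time at drivers or interrupts, and climbing NIC error counters at the link itself.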
Regards,
Jens
Yes,
I did some basic tests. StorNEXT needs a SAN connection and a network connection: the SAN to get at the metadata, the network to
talk to the clients. First I tested the raw read performance using different block sizes.
The system seems to use the full 8 Gb HBA bandwidth. Small I/Os are not easy to test, because I have no free
devices to test with and cannot overwrite parts of the live file system.
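The sweep looked roughly like this sketch (modern Python 3 for readability, so not literally what ran on the SLES boxes; the device name is a placeholder, and dd with varying bs= values does the same job):

[CODE]
#!/usr/bin/env python3
# Sketch of the block-size sweep -- not the exact tool used. DEVICE is
# a hypothetical placeholder; point it at a LUN or file you may read
# from (reads are non-destructive, but run this on an idle path).
import os
import time

DEVICE = "/dev/sdX"     # placeholder -- adjust!
TOTAL = 1 * 2**30       # read 1 GiB per block size

for bs in (4096, 65536, 1 << 20, 4 << 20):
    fd = os.open(DEVICE, os.O_RDONLY)
    # Ask the kernel to drop cached pages so we measure the device,
    # not the page cache (Linux-only hint, Python 3.3+).
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    done = 0
    t0 = time.monotonic()
    while done < TOTAL:
        buf = os.read(fd, bs)
        if not buf:          # end of device/file
            break
        done += len(buf)
    elapsed = time.monotonic() - t0
    os.close(fd)
    print(f"bs={bs:>8}: {done / elapsed / 2**20:7.1f} MiB/s")
[/CODE]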
Network: I did some small tests transferring smaller files using rsync. The tg3 module (new hardware) seems to be slower than
the bnx module on the old hardware. But no errors …
I'll check this.
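To take rsync's disk and checksum overhead out of the picture, I plan to measure the raw TCP path between the two nodes, either with iperf or a throwaway script like this sketch (modern Python, hypothetical file name and port):

[CODE]
#!/usr/bin/env python3
# Bare TCP pusher to isolate the NIC/driver path (iperf does the same,
# only better). Hypothetical helper: run 'nic_test.py server' on one
# node, then 'nic_test.py client <host>' on the other.
import socket
import sys
import time

PORT = 50007              # arbitrary unused port -- adjust
CHUNK = b"\0" * 65536     # 64 KiB per send
TOTAL = 2 * 2**30         # push 2 GiB

def server():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", PORT))
    srv.listen(1)
    conn, addr = srv.accept()
    got, t0 = 0, time.monotonic()
    while True:
        data = conn.recv(1 << 20)
        if not data:
            break
        got += len(data)
    print(f"{got / (time.monotonic() - t0) / 2**20:.1f} MiB/s from {addr[0]}")
    conn.close()

def client(host):
    sock = socket.create_connection((host, PORT))
    sent = 0
    while sent < TOTAL:
        sock.sendall(CHUNK)
        sent += len(CHUNK)
    sock.close()

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[2])
[/CODE]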
It generally is a good idea to monitor production servers and log historical data. That way you can identify changes in usage and their impact (e.g. reduced/increased network usage, distribution of CPU cycles, response times, …), and of course it provides availability monitoring, too. Maybe someone is already running such a tool, so you could get an actual picture and check it against pre-hardware-swap values?
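Even without a full monitoring stack, a trivial logger along these lines (a sketch using the psutil package; the log path is a placeholder) already gives you comparable before/after numbers:

[CODE]
#!/usr/bin/env python3
# Append one CSV row per minute with cumulative CPU/disk/net counters,
# so pre- and post-change values can be diffed later. Sketch only; a
# real setup would use a proper monitoring tool instead.
import csv
import time
import psutil

LOGFILE = "/var/log/perf-history.csv"   # hypothetical path

with open(LOGFILE, "a", newline="") as f:
    writer = csv.writer(f)
    # Header is re-written on every start -- good enough for a sketch.
    writer.writerow(["epoch", "cpu_busy_pct", "iowait_pct",
                     "disk_read_bytes", "disk_write_bytes",
                     "net_rx_bytes", "net_tx_bytes"])
    while True:
        cpu = psutil.cpu_times_percent(interval=60)  # 1-minute sample
        disk = psutil.disk_io_counters()
        net = psutil.net_io_counters()
        # Raw cumulative counters are logged; rates come from diffing
        # consecutive rows when analysing the file.
        writer.writerow([int(time.time()), 100 - cpu.idle, cpu.iowait,
                         disk.read_bytes, disk.write_bytes,
                         net.bytes_recv, net.bytes_sent])
        f.flush()
[/CODE]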
We are monitoring quite a lot of things, but first you have to know where to look (CPU, memory, disk, HBA, network, …)!
However, we could track it down to the Broadcom NIC! It's hard to believe, but the
dumb BIOS setting "BEST PERFORMANCE per watt" was active. We changed this setting to "MAX PERFORMANCE"
and doubled the performance of the NIC!
So it was at least partly my fault: I thought the BIOS was already set to performance, but I never expected such an influence.
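For anyone hitting something similar: such a BIOS profile can sometimes be spotted from inside Linux as well, e.g. as a throttling cpufreq governor. A quick check I should have run earlier (a sketch reading standard Linux sysfs paths; whether they exist depends on kernel and driver):

[CODE]
#!/usr/bin/env python3
# Print the cpufreq governor per core plus current/max frequency for
# cpu0. A BIOS "performance per watt" profile may show up here as
# 'ondemand' or 'powersave' instead of 'performance'.
import glob

for path in sorted(glob.glob(
        "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor")):
    core = path.split("/")[5]              # e.g. 'cpu0'
    with open(path) as f:
        print(core, f.read().strip())

for name in ("scaling_cur_freq", "scaling_max_freq"):
    try:
        with open("/sys/devices/system/cpu/cpu0/cpufreq/" + name) as f:
            print(name, f.read().strip(), "kHz")
    except OSError:
        pass  # cpufreq interface not exposed by this kernel/driver
[/CODE]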
Oh so true! And even when you look in the right place, you still have to interpret it right… in your case, you'd probably have seen less network traffic than before the hardware change, but no errors or the like, seemingly implying that fewer users than before are using the system…
[QUOTE=mpibgc;14629] However, we could track it down to the Broadcom NIC! It's hard to believe, but the
dumb BIOS setting "BEST PERFORMANCE per watt" was active. We changed this setting to "MAX PERFORMANCE"
and doubled the performance of the NIC!
So it was at least partly my fault: I thought the BIOS was already set to performance, but I never expected such an influence.
Always double check the BIOS settings …
Thanks and bye, Peer[/QUOTE]
Thank you for the feedback - I wouldn’t have come up with such a setting :o, glad you found it!