Possible regression in PERC H710P Mini firmware version 21.3.0-0009 or above (21.3.1-0004 or 21.3.2-0005)
Hello,
We have a few PowerEdge R720(xd) hosts running RHEL 6.6 and 6.7, and I am wondering if anyone else has seen the following.
With the firmware versions mentioned, the megasas driver only shows activity on the first of its interrupts. irqbalance (with the latest RHEL6 NUMA fix) is spreading the megasas interrupts over all the CPUs, but with only one interrupt actually in use, all of the interrupt load lands on one (or sometimes two) CPUs. On hosts with a high IO load these CPUs become the limiting factor, as they are saturated handling the interrupts.
I suspect the PERC H710P Mini firmware version 21.3.0-0009 or above (21.3.1-0004 or 21.3.2-0005) is responsible. A few hosts display this behaviour (only the first interrupt in use), and every host with the problem runs one of these firmware versions, while every host without it runs an older version. I have compared hosts that do and do not show the behaviour, and they are identical in all respects (hardware, BIOS, OS daemons, kernel and drivers) apart from the PERC H710P Mini firmware version.
This is seen by running:
> cat /proc/interrupts | egrep 'CPU|mega'
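For a quicker read than the raw table, the per-CPU counts can be summed for each megasas vector. The following is only a sketch: the function name megasas_irq_totals is made up for illustration, and it assumes the usual /proc/interrupts layout (a header row with one column per CPU, then one row per IRQ with the per-CPU counts first).

```shell
#!/bin/sh
# Hypothetical helper (not part of any Dell or Red Hat tool).
# Prints the total interrupt count for each megasas vector and how many
# vectors have seen any interrupts at all.
# Usage: megasas_irq_totals [file]   (file defaults to /proc/interrupts)
megasas_irq_totals() {
    awk '
        NR == 1 { ncpu = NF; next }            # header row: one field per CPU
        /megasas/ {
            irq = $1
            sub(/:$/, "", irq)                 # "88:" -> "88"
            total = 0
            for (i = 2; i <= ncpu + 1; i++)    # fields 2..ncpu+1 are per-CPU counts
                total += $i
            printf "IRQ %s total=%d\n", irq, total
            if (total > 0) active++
            vectors++
        }
        END { printf "%d of %d megasas vectors active\n", active, vectors }
    ' "${1:-/proc/interrupts}"
}
```

Calling megasas_irq_totals with no argument reads /proc/interrupts on the local host. On an affected host the last line would be expected to report only one active vector.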
I have been checking hosts using:
> echo -n goodhost{1..16} badhost{1..16} | xargs -d' ' -I {} ssh {} "awk -v hostname=\$HOSTNAME '/:.*[1-9].*megasas/{count++} END {printf \"Host %16s has %2d megasas interrupts in use, \",hostname,count;if(count==1){printf \"NOT OK. \"}else{printf \"OK. \"}}' /proc/interrupts;/opt/dell/srvadmin/bin/omreport storage controller | grep '^Firm';" | sort -nk 4
Host badhost1 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost2 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost3 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost4 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost5 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost6 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost7 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.0-0009
Host badhost8 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.0-0009
Host badhost9 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.0-0009
Host badhost10 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.0-0009
Host badhost11 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost12 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost13 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost14 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost15 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.2-0005
Host badhost16 has 1 megasas interrupts in use, NOT OK. Firmware Version : 21.3.1-0004
Host goodhost1 has 12 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost2 has 12 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost3 has 12 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost4 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost5 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost6 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost7 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost8 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost9 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost10 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost11 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost12 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost13 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost14 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost15 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
Host goodhost16 has 16 megasas interrupts in use, OK. Firmware Version : 21.2.0-0007
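The correlation with firmware version is easier to see when the report lines are tallied per version. A rough sketch, assuming the one-line-per-host format shown above with the firmware version as the last field; the function name summarise_by_firmware is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: reads "Host ... OK./NOT OK. Firmware Version : X" lines
# on stdin and prints, per firmware version, how many hosts were OK / NOT OK.
summarise_by_firmware() {
    awk '
        /megasas interrupts in use/ {
            fw = $NF                                   # last field is the version
            status = ($0 ~ /NOT OK/) ? "NOT_OK" : "OK"
            n[fw, status]++
            seen[fw] = 1
        }
        END {
            for (fw in seen)
                printf "%s OK=%d NOT_OK=%d\n", fw, n[fw, "OK"], n[fw, "NOT_OK"]
        }
    ' | sort
}
```

The output of the check command can be piped straight into it, or a saved copy can be fed in with summarise_by_firmware < report.txt (report.txt being whatever file the report was saved to). A firmware version that shows up with both OK and NOT_OK hosts would argue against the firmware being the cause.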
Thanks,
Peter (Stig) Edwards