Degraded IO performance on kernel 2.6.32-504.3.3

Issues related to hardware problems

Post by Kirys » 2014/12/21 14:19:50

Hi all
It seems that somehow I'm haunted by performance degradation issues ;) so here is my latest one.
Since kernel 2.6.32-504.3.3 I've had a drastic drop in disk I/O performance.
For example,
dd if=/dev/md1 of=/dev/null bs=64k count=1000000
gives me about 210 MB/s under 2.6.32-431.29.2,
while it gives me less than 120 MB/s under 2.6.32-504.3.3.
This has a big impact on tape backup time. On 431 dump used to reach the top tape speed, while on 2.6.32-504.3.3 it never exceeds 90 MB/s, and the backup takes twice as long!
I also see a drop in VM I/O performance (and, of course, in boot times).
All my HDDs are under software RAID 5 (mdadm).

What's happening?
Thank You
K.
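
For anyone who wants to repeat the dd-style measurement from a script, here is a minimal sketch in Python. It only approximates what dd does: the function name and its parameters are made up for illustration, the block size defaults to the 64k used above, and the kernel page cache still applies, so repeated runs on the same data can look faster than the disks really are.

```python
import time


def sequential_read_rate(path, block_size=64 * 1024, max_blocks=None):
    """Read `path` sequentially and return (bytes_read, seconds, mb_per_s).

    Hypothetical helper, not part of dd or any other tool.
    """
    bytes_read = 0
    blocks = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the kernel page cache still applies.
    with open(path, "rb", buffering=0) as src:
        while max_blocks is None or blocks < max_blocks:
            chunk = src.read(block_size)
            if not chunk:
                break  # end of file or device
            bytes_read += len(chunk)
            blocks += 1
    seconds = time.perf_counter() - start
    # Decimal megabytes per second (1 MB = 1,000,000 bytes).
    mb_per_s = bytes_read / 1e6 / seconds if seconds > 0 else float("inf")
    return bytes_read, seconds, mb_per_s
```

Calling `sequential_read_rate("/dev/md1", max_blocks=1000000)` as root on each kernel would give roughly comparable numbers, assuming the device path matches your setup.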

Post by Kirys » 2015/01/25 17:40:03

So am I the only one this time?

Post by Kirys » 2015/02/24 08:34:47

Kernel 2.6.32-504.8.1.el6.x86_64 still has the same issue for me.

Post by gulikoza » 2015/02/24 21:02:00

I'm following this thread but unfortunately I don't have anything relevant to add.
I still experience slower I/O than expected from time to time, but quite frankly I don't feel like debugging it again. I couldn't find any definitive test case to reproduce the problem, I don't even know where to start... I don't have time for that, so I upgraded a few servers to SSDs and hope for the best.

Just the other day I got a (host to VM) ping timeout warning from one of my VMs (now residing on SSD) while copying an (unrelated) large file over the network to the internal md-RAID6 array. Either the machine is still swapping when the page cache is full or something else has to be going wrong... but as I've said, I haven't checked it in detail.

Post by Kirys » 2015/02/24 21:43:53

I had a problem that looked similar to the one you were talking about; take a look at this topic:
viewtopic.php?f=14&t=6838&start=20

For the degraded performance, the check that really shows the difference for me is dump+pv.
I currently use dump on an ext4 partition to back up an rsnapshot backup volume, which resides on a dedicated 3-disk RAID 5 array, to LTO5.
It is a good benchmark because:
the partition is unmounted during the dump process (it is mounted only during the rsnapshot process at night)
almost everything happens on dedicated hardware (the RAID is on 3 disks that host only that volume; those disks are on a SAS controller that has only them and the LTO tape drive)

So this is how dump behaves with 2.6.32-504.8.1:

DUMP: Date of this level 0 dump: Tue Feb 24 09:05:54 2015
DUMP: Dumping /dev/md1p1 (/mnt/OnlineBackup) to standard output
DUMP: Label: OnlineBackup
DUMP: Writing 64 Kilobyte records
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 1337167688 blocks.
DUMP: Volume 1 started with block 1 at: Tue Feb 24 09:06:45 2015
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: 0.82% done at 36439 kB/s, finished in 10:06
DUMP: 2.53% done at 56476 kB/s, finished in 6:24
DUMP: 4.37% done at 64944 kB/s, finished in 5:28
DUMP: 6.40% done at 71297 kB/s, finished in 4:52
DUMP: 8.47% done at 75510 kB/s, finished in 4:30

It seems better than the previous kernel, but the tape drive still has a huge number of slowdowns and at least one stop every minute! (That is really too stressful for the LTO tape.)
pv shows that the speed never goes above 90-95 MB/s and averages about 80 MB/s.

This is how dump behaves with 2.6.32-431.29.2:

DUMP: Date of this level 0 dump: Tue Feb 24 09:56:56 2015
DUMP: Dumping /dev/md1p1 (/mnt/OnlineBackup) to standard output
DUMP: Label: OnlineBackup
DUMP: Writing 64 Kilobyte records
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 1337167688 blocks.
DUMP: Volume 1 started with block 1 at: Tue Feb 24 09:57:52 2015
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: 1.03% done at 45972 kB/s, finished in 7:59
DUMP: 3.56% done at 79339 kB/s, finished in 4:30
DUMP: 6.54% done at 97188 kB/s, finished in 3:34
DUMP: 9.46% done at 105427 kB/s, finished in 3:11
DUMP: 12.72% done at 113380 kB/s, finished in 2:51
DUMP: 16.37% done at 121606 kB/s, finished in 2:33

and pv says that the speed is mostly a stable 135 MB/s, but sometimes it goes up to 195 MB/s (this should happen when compressible data is encountered).
So it is about 30% slower due to the speed limit of LTO5, but actually it is about 50% slower! And I have no clue WHY!

I know that CentOS 7 is out now, but I cannot upgrade in the short term and I don't know if upgrading would solve this issue :(
So I hope that somehow the issue will be solved (like the one I had with iSCSI).
Cya
K.
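
To put a number on the two dump logs above, the progress lines can be parsed and compared. The sketch below is only a rough illustration: the function names are made up, and it compares the last rate each log reports, which were printed at different progress points (8.47% vs 16.37%), so the result is an approximation rather than a proper benchmark.

```python
import re

# Matches dump progress lines such as
# "DUMP: 8.47% done at 75510 kB/s, finished in 4:30".
PROGRESS = re.compile(r"DUMP:\s+([\d.]+)% done at (\d+) kB/s")

# Last two progress lines of each log quoted in the post above.
LOG_504 = """DUMP: 6.40% done at 71297 kB/s, finished in 4:52
DUMP: 8.47% done at 75510 kB/s, finished in 4:30"""
LOG_431 = """DUMP: 12.72% done at 113380 kB/s, finished in 2:51
DUMP: 16.37% done at 121606 kB/s, finished in 2:33"""


def progress_samples(log_text):
    """Return a list of (percent_done, kb_per_s) tuples from dump output."""
    return [(float(pct), int(rate)) for pct, rate in PROGRESS.findall(log_text)]


def relative_slowdown(slow_log, fast_log):
    """Fraction by which the last reported rate in slow_log trails fast_log."""
    slow_rate = progress_samples(slow_log)[-1][1]
    fast_rate = progress_samples(fast_log)[-1][1]
    return 1 - slow_rate / fast_rate
```

With these inputs, `relative_slowdown(LOG_504, LOG_431)` comes out at roughly 0.38, i.e. the rate dump reports on 504.8.1 is about 38% below the one on 431.29.2 at the last logged line.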

Post by BackAndi » 2015/03/13 10:22:42

Hi all,

I am experiencing the same problem and will have to switch back to the old kernel. At first I thought I had lost some tuning options, but nothing helped.
The storage is used as a disk buffer for backup to tape, and it was able to write 10 streams at a combined 200 MB/s while one stream was reading at 140 MB/s or more.
Now I can read at only 60 MB/s while writing. Without write I/O it is still fast enough to feed the tape effectively.
The write performance seems unchanged.

I am using ext4 on a hardware RAID 50.
It does not seem to be related to the filesystem; what are you using?

Thank you

Post by Kirys » 2015/03/13 10:59:08

ext4 on mdraid

Post by gulikoza » 2015/04/05 08:18:00

Is it possible the interrupts are not being properly balanced?

https://bugzilla.redhat.com/show_bug.cgi?id=911649

My megasas driver is all on cpu0 judging from /proc/interrupts.
It would seem that 6.7 better be out fast...

edit: I have downgraded to irqbalance-1.0.4-9.el6_5.x86_64 and the counters are now rising on other cpus...I'll see if io performance is improved...
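
A quick way to check the same thing on another box is to sum the per-CPU counters in /proc/interrupts for one driver. This is a minimal sketch under assumptions: the function name is made up, and it assumes the usual layout (a header row of CPU names, then one row per IRQ with one counter per CPU followed by a description).

```python
def per_cpu_irq_totals(interrupts_text, needle):
    """Sum interrupt counts per CPU for rows whose description contains `needle`.

    Hypothetical helper; `interrupts_text` is the content of /proc/interrupts.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        _label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an IRQ row
        fields = rest.split()
        counts = fields[:len(cpus)]
        # Rows such as ERR or MIS carry a single counter, not one per CPU.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        description = " ".join(fields[len(cpus):])
        if needle not in description:
            continue
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals
```

For example, `per_cpu_irq_totals(open("/proc/interrupts").read(), "megasas")` returns a dict keyed by CPU name. If nearly all of the total sits on CPU0, the interrupts for that driver are probably not being spread across cores.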

Post by Kirys » 2015/04/05 10:47:51

I have three controllers:
one is an Intel SATA, another is an ASMedia SATA,
and the last is an ATTO SAS HBA.
At the moment I cannot test on the latest kernel.
I hope to have the time to run a test soon.
cya
K.

Post by Kirys » 2015/08/08 07:22:01

Does anyone know if the regression has been resolved?
