#1
Writing to block device is *slower* than writing to the filesystem!?
Hi all,
we have a new machine with 3ware 9650SE controllers and I am testing hardware RAID versus Linux software MD RAID performance. For now I am on hardware RAID: I have set up a RAID-0 with 14 drives.

If I create an XFS filesystem on it (whole device, no partitioning, stripe-aligned during mkfs, etc.) and then write to a file with dd (or with bonnie++) like this:

  sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/mnt/tmp/ddtry bs=1M count=6000 conv=fsync ; time sync

about 540 MB/sec comes out (the final sync takes 0 seconds). This is close to 3ware's declared performance of 561 MB/sec: http://www.3ware.com/KB/Article.aspx?id=15300

However, if I instead write directly to the block device like this:

  sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/dev/sdc bs=1M count=6000 conv=fsync ; time sync

performance is 260 MB/sec!? (The final sync again takes 0 seconds.)

I tried many times and this is the absolute fastest I could obtain. I tweaked the bs and the count, I removed the conv=fsync, I ensured the 3ware caches are ON on the block device, I set the anticipatory scheduler... No way. I am positive that creating the XFS filesystem and writing to it is definitely faster than writing to the block device directly. How can that be!? Does anyone know what's happening?

Please note that the machine is absolutely clean and there is no other workload. I am running kernel 2.6.31 (Ubuntu 9.10 alpha live).

Thank you
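For reference, the stripe-aligned mkfs was along these lines (a sketch only: the su/sw values are an assumption matching a 256K chunk across 14 data disks, not necessarily the exact command used):

  # assumed geometry: 256 KiB chunk (su) striped across 14 data disks (sw)
  mkfs.xfs -d su=256k,sw=14 /dev/sdc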
#2
Writing to block device is *slower* than writing to the filesystem!?
On Fri, 07 Aug 2009 14:30:11 +0200, kkkk wrote:
> Hi all, we have a new machine with 3ware 9650SE controllers and I am
> testing hardware RAID versus Linux software MD RAID performance. For now
> I am on hardware RAID: I have set up a RAID-0 with 14 drives. If I
> create an XFS filesystem on it (whole device, no partitioning,
> stripe-aligned during mkfs, etc.) and then write to a file with dd (or
> with bonnie++) like this:
>   sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/mnt/tmp/ddtry bs=1M count=6000 conv=fsync ; time sync
> about 540 MB/sec comes out (the final sync takes 0 seconds). This is
> close to 3ware's declared performance of 561 MB/sec:
> http://www.3ware.com/KB/Article.aspx?id=15300
> However, if I instead write directly to the block device like this:
>   sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/dev/sdc bs=1M count=6000 conv=fsync ; time sync
> performance is 260 MB/sec!? (The final sync again takes 0 seconds.)

I haven't played with UNIX since I retired in 1994, but here are some suggestions:

- Does dd buffer correctly? (It probably does, but it is good to check.)

- Compare the I/O counts for both methods, and compare the CPU time for both methods. I remember issues where the dummy devices used a small block size (record size, or some such), so that there were high I/O counts and therefore high CPU use when we didn't expect it.

- Can you report the results for each of the various bs values that you used? You say that you tried various values and are just reporting the best result, but it would be nice to see how things are affected as the block size changes.

- Perhaps the filesystem is smart enough to avoid some movement between buffers that writing to a block device has to do. A difference in CPU use might be an indication of this, but a difference could also have other causes.

(Don't laugh at my experience being so old: I've seen a couple of problems reported this year [2009] that were the same as ones I saw before 1975. And that doesn't count all of the buffer overflow crap that was solved in hardware before 1961.)

> I tried many times and this is the absolute fastest I could obtain. I
> tweaked the bs and the count, I removed the conv=fsync, I ensured the
> 3ware caches are ON on the block device, I set the anticipatory
> scheduler... No way. I am positive that creating the XFS filesystem and
> writing to it is definitely faster than writing to the block device
> directly. How can that be!? Does anyone know what's happening? Please
> note that the machine is absolutely clean and there is no other
> workload. I am running kernel 2.6.31 (Ubuntu 9.10 alpha live). Thank you
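Something like the following would show both, as a sketch (it assumes /dev/sdc is the test array and that GNU time and sysstat's iostat are installed):

  # CPU time for each variant: compare the 'sys' figures reported by time
  /usr/bin/time -v dd if=/dev/zero of=/mnt/tmp/ddtry bs=1M count=6000 conv=fsync
  /usr/bin/time -v dd if=/dev/zero of=/dev/sdc bs=1M count=6000 conv=fsync
  # in another terminal while each dd runs: per-device request counts and sizes
  iostat -x sdc 1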
#3
Writing to block device is *slower* than writing to the filesystem!?
On Aug 7, 5:30 am, kkkk wrote:
> If I create an XFS filesystem on it (whole device, no partitioning,
> stripe-aligned during mkfs, etc.) and then write to a file with dd (or
> with bonnie++) like this:
>   sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/mnt/tmp/ddtry bs=1M count=6000 conv=fsync ; time sync
> about 540 MB/sec comes out (the final sync takes 0 seconds). This is
> close to 3ware's declared performance of 561 MB/sec:
> http://www.3ware.com/KB/Article.aspx?id=15300
> However, if I instead write directly to the block device like this:
>   sync ; echo 3 > /proc/sys/vm/drop_caches ; dd if=/dev/zero of=/dev/sdc bs=1M count=6000 conv=fsync ; time sync
> performance is 260 MB/sec!? (The final sync again takes 0 seconds.) I
> tried many times and this is the absolute fastest I could obtain. I
> tweaked the bs and the count, I removed the conv=fsync, I ensured the
> 3ware caches are ON on the block device, I set the anticipatory
> scheduler... No way. I am positive that creating the XFS filesystem and
> writing to it is definitely faster than writing to the block device
> directly. How can that be!? Does anyone know what's happening?

There could be a lot of reasons, but the most likely is that they're writing to opposite ends of the drive. To test, put a 'seek=' in your 'dd' to the block device and see if larger seeks result in higher speeds.

DS
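Something like this, for instance (only a sketch: SEEK_MB is a placeholder you have to fill in yourself, it must leave at least 6000 MiB before the end of the device, and the write is of course destructive):

  # with bs=1M, seek= is counted in MiB-sized output blocks
  dd if=/dev/zero of=/dev/sdc bs=1M seek=SEEK_MB count=6000 conv=fsync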
#4
Writing to block device is *slower* than writing to the filesystem!?
David Schwartz wrote:
> There could be a lot of reasons, but the most likely is that they're
> writing to opposite ends of the drive. To test, put a 'seek=' in your
> 'dd' to the block device and see if larger seeks result in higher
> speeds.

Nope, it's not that. I seeked, as you suggested, to the end of the device, and the speed is not significantly different: writing to the device goes from 239 MB/sec down to 233 MB/sec (it is actually a bit faster at the beginning). I am positive that the seek value I used for dd is correct, because when I tried to raise it a bit further it gave me an error:

  dd: `/dev/sdc': cannot seek: Invalid argument

Next idea...? Thank you!
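For what it's worth, the largest valid seek can be derived from the device size, something like this (a sketch; 1048576 because bs=1M):

  # device size in bytes, and in whole 1 MiB blocks; seeking past that block
  # count is what produces the "cannot seek: Invalid argument" error above
  blockdev --getsize64 /dev/sdc
  echo $(( $(blockdev --getsize64 /dev/sdc) / 1048576 ))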
#5
Writing to block device is *slower* than writing to the filesystem!?
kkkk wrote:
> Hi all,
> we have a new machine with 3ware 9650SE controllers and I am testing ...

I found it! I found it!

dd apparently does not buffer writes correctly (good catch, Mark): it apparently disregards the bs value and submits very small writes. It needs oflag=direct to really honour bs, and even then there is a limit. Also, elevator merging of the small writes does not try hard enough and cannot achieve good throughput. More details tomorrow.
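In other words, the variant that behaves as expected here is roughly this (a sketch of the direct-I/O form of the original test):

  # O_DIRECT bypasses the page cache, so each 1 MiB write() from dd reaches the
  # block layer directly (it may still be split to the queue's maximum request size)
  sync ; echo 3 > /proc/sys/vm/drop_caches
  dd if=/dev/zero of=/dev/sdc bs=1M count=6000 oflag=direct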
#6
Writing to block device is *slower* than writing to the filesystem!?
In article <...>, kkkk wrote:

> kkkk wrote:
> > Hi all,
> > we have a new machine with 3ware 9650SE controllers and I am testing ...
>
> I found it! I found it!
>
> dd apparently does not buffer writes correctly (good catch, Mark): it
> apparently disregards the bs value and submits very small writes. It
> needs oflag=direct to really honour bs, and even then there is a limit.
> Also, elevator merging of the small writes does not try hard enough and
> cannot achieve good throughput. More details tomorrow.

Curious. I'm not seeing that behavior in either CentOS 5 or Fedora 11 (coreutils-5.97-19.el5, coreutils-7.2-2.fc11). In both of those, when I run:

  strace dd if=/dev/zero bs=1M count=1 of=somefile conv=fsync

I see exactly one read and one write, each of size 1048576.

--
Bob Nichols AT comcast.net I am "RNichols42"
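The same check can be pointed at the raw device (a sketch; note it overwrites the first MiB of whatever is on sdc):

  # show the sizes of the read()/write() calls dd actually issues
  strace -e trace=read,write dd if=/dev/zero of=/dev/sdc bs=1M count=1 conv=fsync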
#7
Writing to block device is *slower* than writing to the filesystem!?
Robert Nichols wrote:
> Curious. I'm not seeing that behavior in either CentOS 5 or Fedora 11
> (coreutils-5.97-19.el5, coreutils-7.2-2.fc11). In both of those, when I
> run:
>   strace dd if=/dev/zero bs=1M count=1 of=somefile conv=fsync
> I see exactly one read and one write, each of size 1048576.

I haven't straced it, but this is what appears from iostat -x 1 (grabbed from a live iostat).

Without direct (bs=1M):

  Device:  rrqm/s     wrqm/s  r/s      w/s     rsec/s     wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
  sdc        0.00  559294.00  0.00 14384.00     0.00  570550.00      39.67    143.98   9.96   0.07 100.00

With direct (bs=1M):

  Device:  rrqm/s     wrqm/s  r/s      w/s     rsec/s     wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
  sdc        0.00       0.00  0.00  3478.00     0.00  890368.00     256.00      5.77   1.66   0.28  98.40

You see, without direct there are a whole lot of wrqm/s (= probably lots of wasted CPU cycles), and the average submitted size is still 143.98 < 256.0 (I suppose 143.98 is after the merges, correct?). With direct there are no wrqm/s, and the submitted request size is 256 sectors exactly.

With oflag=direct, performance increases with increasing bs, like this (3ware 9650SE-16ML hardware RAID-0, 256K chunk size, 14 disks [1TB 7200RPM SATA]):

  bs     - speed
  512B   -   4.9 MB/sec
  1K     -  13.3 MB/sec
  2K     -  26.6 MB/sec
  4K     -  54.1 MB/sec
  8K     -    96 MB/sec
  16K    -   157 MB/sec
  32K    -   231 MB/sec
  64K    -   300 MB/sec
  128K   -   359 MB/sec   (from this point on avgrq-sz does not increase any more, but performance still does)
  256K   -   404 MB/sec
  512K   -   430 MB/sec
  1M     -   456 MB/sec
  2M     -   466 MB/sec
  4M     -   473 MB/sec
  3584K  -   494 MB/sec   (the stripe size)
  8M     -   542 MB/sec   !! A big performance jump !!
  16M    -   543 MB/sec
  32M    -   568 MB/sec   ! Another big performance jump
  64M    -   603 MB/sec   ! Again !! (CPU here: real 0m11.213s, user 0m0.004s, sys 0m3.880s)
  128M   -   641 MB/sec
  256M   -   676 MB/sec
  512M   -   645 MB/sec   (performance starts dropping)
  1G     -   620 MB/sec

avgrq-sz apparently cannot go over 256 sectors. Is this a hardware limit imposed by the device, i.e. by the 3ware? Notwithstanding this, performance still increases up to bs=256M. From iostat the only apparent change (apart from the obviously increasing wsec/s) is avgqu-sz, which stays at 1.0 up to bs=128K and then rises to about 20.0 at bs=256M. Do you think this can be the reason for the performance increase up to 256M?

Thanks for any thoughts.
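In case anyone wants to reproduce the sweep, it was essentially a loop like this (a rough sketch; it assumes /dev/sdc is the array and that destroying its contents is acceptable):

  #!/bin/sh
  # write ~6 GiB at each block size with O_DIRECT and keep dd's throughput line
  total=$((6 * 1024 * 1024 * 1024))
  for bs in 4096 65536 1048576 8388608 67108864 268435456; do
      sync; echo 3 > /proc/sys/vm/drop_caches
      echo "bs=$bs:"
      dd if=/dev/zero of=/dev/sdc bs=$bs count=$((total / bs)) oflag=direct 2>&1 | tail -n 1
  done

As for the 256-sector ceiling, one thing I could still check (just a guess, not a known answer) is the block queue's request-size limit:

  # maximum request size the kernel builds for this queue, in KiB;
  # 128 KiB would correspond exactly to the 256-sector avgrq-sz ceiling
  cat /sys/block/sdc/queue/max_sectors_kb /sys/block/sdc/queue/max_hw_sectors_kb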