
This page describes an unresolved (early 2014) performance issue that Diego is having with Intel 530 SSDs. Feedback and ideas are welcome: ongaro at cs stanford edu.

Update Tue, 25 Feb 2014 00:21:06 -0800: added "Effects of Write-Caching" section.

Update Tue, 25 Feb 2014 13:12:01 -0800: added "Larger Writes" section.

Update Sat, 01 Mar 2014 19:31:34 -0800: updated "Larger Writes" section with big graph, updated microbenchmark section to link to improved benchmark code.


Summary

I'm optimizing a system in which synchronous disk writes are a key factor in performance. My current drives are too slow. I have a single Intel 520 drive that performs well enough (<1ms), but I need five drives. I bought five Intel 530 SSDs in hopes they would perform like the Intel 520, but instead they take 10ms to write a single byte to disk under various versions of Linux. Curiously, if I connect the Intel 530 drives over a USB-to-SATA adapter instead of using SATA directly, they're much faster (~200us per write). If I turn off write-caching, on some machines they take 1.6ms per write. What's wrong?

 

Microbenchmark

Here's the original microbenchmark, which calls write() and fdatasync() on a single byte of data 1000 times: https://gist.github.com/ongardie/9177853 . I run this as "time ./bench" and divide the wall time by 1000 to get the approximate average time per write.
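For readers who don't want to open the gist, here is a minimal Python sketch of the same idea. It is not the original benchmark code: the function name, the file path argument, and the choice to append each byte (rather than overwrite the same one) are assumptions made for illustration. It times the loop directly instead of using "time", but the arithmetic is the same: total wall time divided by the number of writes.

```python
import os
import time

# fdatasync is not available on every platform (e.g. macOS); fall back to fsync.
_sync = getattr(os, "fdatasync", os.fsync)

def average_sync_write_latency(path, count=1000):
    """Write one byte and sync it `count` times; return mean seconds per write."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, b"x")  # one byte, appended at the current offset
            _sync(fd)           # returns once the kernel reports the data flushed
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / count
```

Calling average_sync_write_latency("bench.dat") on a file that lives on the drive under test returns the per-write time in seconds; multiply by 1e6 for microseconds.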

A much improved and extended version of the benchmark is at https://github.com/ongardie/diskbenchmark .

I normally run this on ext4. (I've tried it on a raw device as well; see the "Negative Results" section below.)

Machines

These are the machines I've tried. The Intel 530 over SATA takes 10ms per write on each of these, and other SSDs take <1ms per write on each of these.

| name | type | description | model | distro | kernel | SATA | USB | purchased |
|---|---|---|---|---|---|---|---|---|
| rc66 | server | This is the machine type I'd like to use (we have 80 of them). | Colfax CX1180-X4 / Supermicro 5016TI-TF | RHEL6 | 2.6.32 | 3 Gbps | 2.0 | 2011 |
| rcmonster | server | This is a newer, beefier machine with Intel 520 drives, but also stuck on RHEL6. | Colfax CX1265i-X5 | RHEL6 | 2.6.32 | 6 Gbps | 3.0 | 2013 |
| flygecko | desktop | This is a newer desktop machine with a newer version of Linux. | | Arch | 3.12 | 3 Gbps | 2.0 | 2013 |
| x1 | laptop | This is a laptop with a newer version of Linux. | Lenovo Thinkpad X1 Carbon | Debian Jessie | 3.12 | mSATA 6 Gbps | 3.0 | 2012 |

 

Disks

These are the disks I've tried and their performance:

| model | qty | capacitor | performance |
|---|---|---|---|
| Crucial M4 | 160 | | 3.7ms per write on rc66 |
| Intel 320 (SSDSC2CW120A3) | 1 | yes | ~200us per write on rcmonster |
| Intel 520 (SSDSC2CW120A3) | 2 | no | <1ms per write on rc66, rcmonster (440us) |
| Intel X25-M (SSDSA2M120G2GC) | 1 | | 210us per write on flygecko |
| Intel 530 (SSDSC2BW120A4) | 5 | no | 10ms per write on rc66 (9.8ms), rcmonster (10.1ms), and flygecko (9.7ms) |
| Intel 530 attached over USB-to-SATA adapter | 1 | no | <1ms per write on rc66, rcmonster, flygecko, and x1 |
| SanDisk X100 (SD5SG2128G1052E) | 1 | | 830us per write on x1 |
| Cheap USB thumb drive | n | | 7.6ms per write on x1 |

My goal is to get the Intel 530 drives to run fast on rc66 and similar machines.

Negative Results

  • The first thing I did was update from the DC22 firmware to the current DC33 firmware. No effect.
  • I tried a machine with 6gbps SATA (rcmonster). No effect.
  • I tried a machine with a newer kernel (flygecko). No effect.
  • I tried disabling APM power saving with hdparm -B. No effect.
  • I tried different NCQ sizes, but I wasn't able to change them with /sys/block/sdc/device: the queue depth is stuck at 1 on flygecko and 31 on rcmonster.
  • I tried changing the I/O scheduler. This shouldn't have an effect since there's only one I/O outstanding at a time. No effect.
  • I tried running the benchmark on the raw block device rather than an ext4 partition. This helped but only slightly, reducing latency per write from 10ms to about 9ms.
  • I tried doing bigger writes to see if they would be faster. See "Larger Writes" section below. Didn't help.
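Two of the items above (NCQ sizes and the I/O scheduler) come down to values that Linux exposes through sysfs. As a minimal sketch for recording them on each machine, the helper below reads the queue depth and the scheduler line for a device. The function name and the sysfs_root parameter are hypothetical; the file locations (device/queue_depth and queue/scheduler under /sys/block/<dev>) are the usual ones for SATA disks on Linux, but treat them as an assumption for any particular kernel.

```python
from pathlib import Path

def read_queue_settings(device, sysfs_root="/sys/block"):
    """Return the queue depth and I/O scheduler line for a block device.

    A value is None when the corresponding sysfs file cannot be read.
    """
    base = Path(sysfs_root) / device
    files = {
        "queue_depth": base / "device" / "queue_depth",  # NCQ depth for SATA disks
        "scheduler": base / "queue" / "scheduler",       # active scheduler is in [brackets]
    }
    settings = {}
    for name, path in files.items():
        try:
            settings[name] = path.read_text().strip()
        except OSError:
            settings[name] = None
    return settings
```

For example, read_queue_settings("sdc") returns a dictionary with the two values as strings, which makes it easy to compare machines side by side.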

Additional Questions

  • One of the differences between the Intel 520 and Intel 530 is that the 530 does more aggressive power saving. Is there a way to turn that off?
  • Are the software paths under Linux for the Intel 520 and the Intel 530 identical?
  • Would Intel be willing to trade my five 530s in exchange for five 520s or similarly performing drives?

blktrace measurements

I ran blktrace while running the benchmark on flygecko with the Intel X25-M and the Intel 530 SSDs. Here's the summary produced by btt:

...

So that raises two questions: why does the Intel 530 take so much longer in the driver/device, and why does it take even longer end-to-end?

Effects of Write-Caching

Tom Lyon suggested I try enabling write-caching (https://twitter.com/aka_pugs/status/438148846476996608). Write-caching was already enabled on all the machines I tried, but disabling it had some interesting effects. I used hdparm -W 0 /dev/sdx to disable write-caching (and -W 1 to turn it back on).

| host | disk | write caching off | write caching on |
|---|---|---|---|
| rcmonster | 520 | 100us per write | 440us per write |
| rcmonster | 530 | 10.4ms per write | 10.1ms per write |
| rc66 | M4 | 1.5ms per write | 2.7ms per write |
| rc66 | 530 | 1.7ms per write | 9.8ms per write |
| flygecko | X25-M | 1.1ms per write | 210us per write |
| flygecko | 530 | 1.5ms per write | 9.5ms per write |

Each drive seems to behave differently. For the 530, disabling write-caching seems to have no effect on rcmonster but improves latency for one-byte writes by about a factor of six on rc66 and flygecko. This opens up a lot of questions... See also the "Larger Writes" section below.
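To make the comparison concrete, the short script below computes, for each row of the table above, the latency with write-caching on divided by the latency with it off. The numbers are copied from the table and converted to microseconds; the names are just illustrative.

```python
# Latencies in microseconds from the table above: (caching off, caching on).
LATENCY_US = {
    ("rcmonster", "520"): (100, 440),
    ("rcmonster", "530"): (10400, 10100),
    ("rc66", "M4"): (1500, 2700),
    ("rc66", "530"): (1700, 9800),
    ("flygecko", "X25-M"): (1100, 210),
    ("flygecko", "530"): (1500, 9500),
}

def slowdown_with_caching(latencies):
    """Return latency with caching on divided by latency with caching off."""
    return {key: on / off for key, (off, on) in latencies.items()}

for (host, disk), ratio in slowdown_with_caching(LATENCY_US).items():
    print(f"{host:10s} {disk:6s} {ratio:5.2f}x")
```

A ratio above 1 means the drive is slower with write-caching on. The 530 comes out at roughly 5.8x on rc66 and 6.3x on flygecko but just under 1x on rcmonster, while the X25-M is the only drive that clearly benefits from caching.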

Larger Writes

Many people have suggested that writing 1 byte is not efficient and writing more bytes should be faster. I tried it both with write-caching on and off. 

[Image: graph of write latency for larger writes, with write-caching on and off]

The writes were done at 0-byte and 512-byte offsets into the file, and the whole experiment was repeated 5 times. The best time is shown for each point. Each data point represents only a small number of writes, though, so the error may be high. Also, the "x1" machine is my laptop, so it wasn't always idle.
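The procedure described above can be sketched roughly as follows. This is not the code behind the graph (that lives in the diskbenchmark repository); the function name and parameters are assumptions, and it performs only one write per offset per repetition. It writes `size` bytes at each offset, syncs, and keeps the best time over the repetitions.

```python
import os
import time

# fdatasync is not available on every platform (e.g. macOS); fall back to fsync.
_sync = getattr(os, "fdatasync", os.fsync)

def best_write_latency(path, size, offsets=(0, 512), repeats=5):
    """Return the best time in seconds for one synced write of `size` bytes per offset."""
    data = b"x" * size
    best = {offset: float("inf") for offset in offsets}
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        for _ in range(repeats):
            for offset in offsets:
                start = time.perf_counter()
                os.pwrite(fd, data, offset)  # positional write, no seek needed
                _sync(fd)
                best[offset] = min(best[offset], time.perf_counter() - start)
    finally:
        os.close(fd)
    return best
```

Sweeping `size` over a range of values (for example powers of two, which is an arbitrary choice here) and plotting the result for each offset gives one curve per drive and cache setting.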