LSI MegaRAID CacheCade Pro 2.0 Testing

For a while now I have been running a virtualization environment on XenServer, using the following hardware:

Standard virtualization hardware:

Supermicro SNSC416: max 4 x SATA/SAS hot-swap, 96GB DDR-3, redundant PSU
Supermicro X8DTU-F server board, KVM-over-IP and software Intel ICH10R 6x SATA RAID
Supermicro SC815TQ-R700U, hot-swappable, 1U, redundant PSU, rails included
6 x Kingston 8192MB DDR-3 1333MHz Reg. ECC (3 x 8GB and 3 x 4GB per CPU)
6 x Kingston 4096MB DDR-3 1333MHz Reg. ECC (3 x 8GB and 3 x 4GB per CPU)
2 x Intel Xeon X5690, 3.46GHz, 12M cache, 6.4GT/s, LGA1366 (BX80614X5690)
1 x LSI MegaRAID 9266-4i
4 x Western Digital 1000GB, SATA II, 64MB cache, 7200rpm, RAID edition

Because we do not use shared storage, the limit you will (almost) always hit first is local disk I/O.

Some time ago I found out that LSI offers a software option called CacheCade Pro, which enables an LSI RAID controller to use SSDs as a cache in front of the spinning disks.

To test this out, I asked my supplier to send me a test model to play around with. This week I finally received the controller, the license and the SSDs, so I could build the test box.

Test box:

Supermicro SNSC416: max 4 x SATA/SAS hot-swap, 96GB DDR-3, redundant PSU
Supermicro X8DTU-F server board, KVM-over-IP and software Intel ICH10R 6x SATA RAID
Supermicro SC815TQ-R700U, hot-swappable, 1U, redundant PSU, rails included
6 x Kingston 8192MB DDR-3 1333MHz Reg. ECC (3 x 8GB and 3 x 4GB per CPU)
6 x Kingston 4096MB DDR-3 1333MHz Reg. ECC (3 x 8GB and 3 x 4GB per CPU)
4 x Western Digital 1000GB, SATA II, 64MB cache, 7200rpm, RAID edition
2 x Intel Xeon X5690, 3.46GHz, 12M cache, 6.4GT/s, LGA1366 (BX80614X5690)
1 x LSI MegaRAID 9271-4i
2 x Seagate Constellation 3TB SAS, 64MB cache, 7200rpm
2 x Intel DC S3700 Series, 200GB, 2.5in SATA, MLC, read 500MB/s / 75k IOPS, write 365MB/s / 32k IOPS
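Before any benchmarks can be run, the two Intel SSDs have to be combined into a CacheCade virtual drive and SSD caching has to be enabled on the data volume. As a rough sketch in StorCLI syntax (the controller number, enclosure:slot IDs and write policy below are placeholders for illustration; check the StorCLI documentation for your own setup):

storcli /c0 show                                            # locate the enclosure:slot IDs of the two SSDs
storcli /c0 add vd cachecade type=raid1 drives=252:2-3 wb   # create a RAID1 CacheCade drive with write-back caching
storcli /c0/v0 set ssdcaching=on                            # enable SSD caching on the existing virtual drive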

Additional Information

To be clear: all tests are run on virtual machines on top of the XenServer 6.5 hypervisor.

Since this hardware is not used for bare-metal deployments, I have no particular interest in testing it without the hypervisor in between.

All the virtual machines I used run Ubuntu 14.04 LTS in PVHVM mode.

Each VM has 4 vCPUs, 4GB of memory and a separate 15GB data disk on which the tests are run, formatted ext4 with the default mount options.
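Preparing the data disk inside each VM is nothing special; assuming the extra disk shows up as /dev/xvdb (that device name is an assumption and will differ per setup), it comes down to:

mkfs.ext4 /dev/xvdb        # format the 15GB data disk
mkdir -p /mnt/test         # create the mount point used by the benchmarks
mount /dev/xvdb /mnt/test  # mount it with the default options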

All the bonnie++ graphs were made using bonnie2gchart.

Test Method 1 – Bonnie++ single instance

bonnie++ -d /mnt/test -s 9000 -n 100 -m standard -u nobody -q
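The -q flag makes bonnie++ write machine-readable CSV to stdout, which is what bonnie2gchart charts, so repeating the run is just a small loop; something along these lines (the output file name is only an example):

for i in 1 2 3 4; do
    # append the CSV result of each run to one file per box
    bonnie++ -d /mnt/test -s 9000 -n 100 -m standard -u nobody -q >> bonnie-standard.csv
done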

I ran the test four times on each box. Here are the results:

Block IO

single-blockio

Block IO Latency

single-block-latency

Block IO CPU

single-cpu

File metadata

single-meta

File metadata Latency

single-meta-latency

Test Method 1 – Bonnie++ Multiple instances

Since running a single VM on a server will hardly ever happen in practice, I have also run the same test on 5 VMs simultaneously. The averages of all the tests are below:
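To start the benchmark on all five VMs at roughly the same moment, something like the following does the job (the hostnames vm1 to vm5 are placeholders for this sketch):

for vm in vm1 vm2 vm3 vm4 vm5; do
    # kick off bonnie++ on each VM in the background, labelled and logged per VM
    ssh root@$vm "bonnie++ -d /mnt/test -s 9000 -n 100 -m $vm -u nobody -q >> /root/bonnie-$vm.csv" &
done
wait   # wait until all five runs have finished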

Block IO

multi-blockio

Seq block output: 51% slower
Block rewrite: 3% slower
Block input: 239% faster

Block IO Latency

multi-block-latency

Block output: 24% lower latency
Block rewrite: 32% lower latency
Block input: 50% lower latency

Block IO CPU

multi-block-cpu

Seq block output CPU: 71% lower
Block rewrite CPU: 66% lower
Block input CPU: 66% higher
Random seek CPU: 1700% higher

File metadata

multi-meta

Seq create: 20% faster
Seq delete: 32% faster
Random create: 6% faster
Random delete: 25% faster

File metadata Latency

multi-meta-latency

Seq create: 43% lower latency
Seq delete: 5% lower latency
Random create: 6% higher latency
Random delete: 75% lower latency

Test Method 2 – sysbench MySQL

Running a stock mysql-server, inserting 10 million test rows and then running some read-only tests on them.

The test is done using a tool called sysbench. I'm using the following commands to prepare and run the test:


sysbench --test=oltp --oltp-table-size=10000000 --mysql-db=test --mysql-user=root --mysql-password=yourpwhere prepare
sysbench --test=oltp --oltp-table-size=10000000 --mysql-db=test --mysql-user=root --mysql-password=yourpwhere --max-time=300 --oltp-read-only=on --max-requests=0 --num-threads=8 run
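The test database has to exist before the prepare step, and between runs the generated table can be dropped again with sysbench's cleanup mode; roughly:

mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS test;"
sysbench --test=oltp --mysql-db=test --mysql-user=root --mysql-password=yourpwhere cleanup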

Standard

OLTP test statistics:
    queries performed:
        read:                            3915170
        write:                           0
        other:                           559310
        total:                           4474480
    transactions:                        279655 (932.17 per sec.)
    deadlocks:                           0 (0.00 per sec.)
    read/write requests:                 3915170 (13050.42 per sec.)
    other operations:                    559310 (1864.35 per sec.)

Test execution summary:
    total time:                          300.0033s
    total number of events:              279655
    total time taken by event execution: 2398.6410
    per-request statistics:
        min:                             3.31ms
        avg:                             8.58ms
        max:                             80.49ms
        approx. 95 percentile:           10.55ms

Threads fairness:
    events (avg/stddev):                 34956.8750/109.37
    execution time (avg/stddev):         299.8301/0.00

SSD CacheCade

OLTP test statistics:
    queries performed:
        read:                            4191152
        write:                           0
        other:                           598736
        total:                           4789888
    transactions:                        299368 (997.88 per sec.)
    deadlocks:                           0 (0.00 per sec.)
    read/write requests:                 4191152 (13970.37 per sec.)
    other operations:                    598736 (1995.77 per sec.)

Test execution summary:
    total time:                          300.0029s
    total number of events:              299368
    total time taken by event execution: 2398.6331
    per-request statistics:
        min:                             3.53ms
        avg:                             8.01ms
        max:                             160.17ms
        approx. 95 percentile:           9.82ms

Threads fairness:
    events (avg/stddev):                 37421.0000/83.58
    execution time (avg/stddev):         299.8291/0.00

Conclusion
I've run the test multiple times and the outcome is always roughly the same.

The box running with SSD CacheCade is about 5-7% faster.

Test Method 3 – winsat disk


Run CMD as Administrator
winsat disk -drive c:
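To keep the raw numbers around for comparing the two boxes, the output can be redirected to a file; winsat also accepts an -xml option if you prefer structured output (the file names here are arbitrary):

winsat disk -drive c: > C:\winsat-disk.txt
winsat disk -drive c: -xml C:\winsat-disk.xml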

winstat-disk-perf

Conclusion
Since the Windows scoring output is not the most readable, I've written down the differences below so they are easy to compare.

Disk:

  • Sequential 64.0 Read – 129% faster
  • Random 16.0 Read – 38.7% faster
  • Sequential 64.0 Write – 65% faster
  • Responsiveness:
      • Average IO Rate – 80% faster
      • Grouped IOs – 85% faster
      • Long IOs – 656% faster
      • Overall – 1291% faster
      • PenaltyFactor – n/a
  • Latency:
      • 95th Percentile – 954% lower
      • Maximum – 19% lower
  • Average Read Time:
      • Sequential Writes – 3158% faster
      • Random Writes – 277% faster

Test Method 4 – Atto Disk Benchmark

atto-standard

atto-ssdcache

Conclusion

Write (transfer size in KB):
0.5 – 3.9% faster
1 – 2.4% faster
2 – 1.2% faster
4 – 0.5% faster
8 – 2.1% faster
16 – 19% slower
32 – 24.5% faster
64 – 41.6% faster
128 – 52% faster
256 – 21% faster
512 – 18% faster
1024 – 33% faster
2048 – 55% faster
4096 – 32% faster
8192 – 45% faster

Read (transfer size in KB):
0.5 – 10% slower
1 – 8.8% slower
2 – 15.8% slower
4 – 8.6% slower
8 – 19% faster
16 – 6% slower
32 – 58.8% faster
64 – 74.8% faster
128 – 217% faster
256 – 406% faster
512 – 449% faster
1024 – 330% faster
2048 – 243% faster
4096 – 128% faster
8192 – 224% faster

Test Method 5 – Single CrystalDiskMark

SSD

crystaldiskmark-ssd

Standard

crystaldiskmask-standard

Test Method 5 – Multi CrystalDiskMark

Running the test on 5 virtual machines simultaneously gave the following result:

SSD
Read / Write
209.3 / 167.5
339 / 515.9
17.192 / 20.112

Standard
Read / Write
88.706 / 97.086
437.26 / 319
21.8 / 13.504
