LSI RAID card suggestions for ESXi

Hi,

Looking to get opinions on a good RAID controller for IR-mode RAID 1, and to hear your experiences with which ones to use.

I'm currently running ESXi 6.5/6.7 on a 2700X/X470 machine, and a RAID 1 array of Samsung 860 EVO 1TB drives on an LSI 9211-8i is very slow.

CrystalDiskMark (Windows), hdparm (on Linux), and some real-world tests (restoring a MySQL dump) are all significantly slower than a single 850 EVO 250GB on a motherboard SATA port.

I know the RAID card will make the drives somewhat slower, but I was not expecting 50% slower.

Some questions:
a. Which RAID controller will work well for ESXi RAID 1/10 for database use? I'm currently looking at either 9440-8i's or 9361-8i's, both of which are on the ESXi compatibility list.
b. Why is the RAID so slow? When I run CrystalDiskMark on a bare-metal Windows 10 install with the LSI 9211 RAID 1 formatted as NTFS, the benchmarks are not far off those of a SATA drive plugged into the board. That is more in line with what I expected.

Thanks!

Thinking of trying some Lenovo 530-8i's, which are MegaRAID 9440-8i cards, to be run purely in IR mode for RAID 1/10.
 
This is probably related to the secure write behaviour of ESXi, which avoids any write cache: a write commit must mean that the write is actually on disk. The 9211 is an HBA without cache, mainly intended for software RAID.

Your main options:
- Use a RAID 1/5/6 controller with cache and BBU/flash protection, https://www.broadcom.com/products/storage/raid-controllers
The cache gives performance, and the BBU/flash protection guarantees that committed writes survive a crash.

- Use a ZFS storage VM with enough RAM for read/write caching and sync enabled to protect the write cache.
In that case an HBA like a 9211 or 9300 is perfect; see my howto https://napp-it.org/doc/downloads/napp-in-one.pdf

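To get a feel for how big the no-write-cache penalty is, here is a minimal sketch you could run inside a Linux guest; the test path, block size and total size are just placeholders, and the O_DSYNC run approximates the commit-per-write behaviour described above.

import os, time

PATH = "/mnt/test/bench.bin"        # placeholder: a file on the datastore under test
BLOCK = b"\0" * (1 << 20)           # 1 MiB per write
COUNT = 512                         # 512 MiB total

def bench(extra_flags, label):
    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags, 0o644)
    t0 = time.time()
    for _ in range(COUNT):
        os.write(fd, BLOCK)
    os.fsync(fd)                    # flush once so the buffered run also reaches disk
    os.close(fd)
    print(f"{label}: {COUNT / (time.time() - t0):.0f} MiB/s")

bench(0, "buffered writes (RAM write cache allowed)")
bench(os.O_DSYNC, "O_DSYNC writes (every write committed to disk)")
os.remove(PATH)
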

Btw, the EVO is not well suited for VM storage.
Under steady writes its performance drops after some time, and without powerloss protection a VM filesystem may be corrupted by a crash during a write. That may be OK at home, but "not for production use".
 
Thanks for this reply.

Unfortunately, I think that the issue runs deeper than just the write cache.

I performed a side-by-side CrystalDiskMark run on the same ESXi box with two Windows 10 VMs: one installed on an 850 EVO 250GB running off a motherboard port, and another on the 860 EVO 1TB RAID 1 array.

The result is that the Win10 VM on the 850 EVO / motherboard port performs much like a bare-metal Windows 10 install (at least in CrystalDiskMark), while the VM on the 860 EVO array gives decent reads but very, very bad writes.

Not trying to shoot down what you've posted, but if the write cache were the only issue, wouldn't it also show up to some extent on the VM installed on the drive running off the motherboard port?

Both are VMs on the same ESXi box.
Left benchmark: 860 EVO 1TB RAID 1; right: 850 EVO 250GB attached to the motherboard.
Note the large discrepancy in writes, which is noticeable in day-to-day use.

I've been reading a bit about ZFS, and I'd prefer not to go that route because of the complexity / nesting of all the storage layers, especially since we'll eventually want to move to something similar to our production configuration. Since I have an excess of 9211 cards, however, I'll give it a look once the less complex options are exhausted. How much RAM would I need for ZFS for maybe 1-2TB of database work? Only planning RAID 1, to reduce complexity.

** Also, thanks for the info on the EVO drives. Yeah, these are strictly for dev only. For production we'll use something a little better, although for mostly read-only databases these *may* actually do fine and will likely beat the pants off our current production WD 2TB Black drives from 2011 :D

Still not sure why the 9211 is giving such bad write speeds. Will a newer-generation 9440 boost the speed somewhat?

Thank you!
 

ZFS is basically quite a simple concept. You create a datapool from a single disk, a mirror or a raidZ, and you can optionally stripe several of these to increase capacity and performance. On the datapool you then create filesystems. Unlike partitions on a traditional filesystem they have no fixed size; they can grow dynamically up to the pool size. You can limit usage with quotas and guarantee capacity with reservations. All ZFS properties (on Solaris even SMB and NFS shares) are properties of filesystems. Besides RAID management, ZFS offers volume management and, on Solarish, share management.

In combination with ESXi, you share such a filesystem via NFS3 and use it as a datastore within ESXi, and additionally share it via SMB for simple, fast access from Windows for VM copy/clone and for unlimited ZFS snaps via Windows "Previous Versions". A ZFS filer VM on ESXi gives each ESXi server its own local SAN storage that you can access from that machine or from others. The only price is enough RAM.

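To make that concrete, here is a rough sketch of the command sequence, driven from Python on the storage VM. The pool name, disk names and sizes are only examples, and the sharenfs property is the OpenZFS/OmniOS spelling (Solaris uses share.nfs), so adapt it to your platform.

import subprocess

def zcmd(*args):
    print("+", " ".join(args))
    subprocess.run(args, check=True)     # needs root and the zfs/zpool tools installed

zcmd("zpool", "create", "tank", "mirror", "c1t0d0", "c1t1d0")   # RAID 1 style mirror vdev
zcmd("zfs", "create", "tank/vmstore")                           # a filesystem with no fixed size
zcmd("zfs", "set", "quota=800G", "tank/vmstore")                # optional cap on growth
zcmd("zfs", "set", "reservation=200G", "tank/vmstore")          # guaranteed capacity
zcmd("zfs", "set", "sync=always", "tank/vmstore")               # protect the RAM write cache
zcmd("zfs", "set", "sharenfs=on", "tank/vmstore")               # export for use as an ESXi NFS3 datastore
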

ZFS is the best of all filesystems with regard to data security, but because it processes checksums, writes metadata twice and uses copy-on-write to stay crash resistant, it cannot be as fast as filesystems with less data security. You need enough RAM to (over)compensate for this with advanced RAM-based read and write caches.

The minimum RAM for stable use of a ZFS filer is 2-4 GB on Solaris/OmniOS and 4-8 GB on FreeBSD or Linux; RAM beyond that is for performance. A good value for a production filer is 12-32 GB, more if you add dedup or have a use case with many users and volatile data, e.g. a mail server. Then even 128 GB+ can make sense.

With up to 8 disks the 9211 is very good for ZFS, but slow with SSDs or outside a good software RAID. With SSDs you may want a 12G HBA like a 9300 or 9305. The 9440 is a very bad choice for ZFS (it is a hardware RAID 5/6 controller), though fine if you want to use it with ESXi directly. You can use a 9400 for ZFS, but the 94xx only adds NVMe capability over the other 12G cards; I see little point in that, as you can simply use M.2, OCuLink or a cheap PCIe card for NVMe.
 
Thanks! Based on what I read a few years ago when I was considering building my own NAS, ECC RAM was a requirement for ZFS. The limited local availability of ECC in retail channels made me go the Synology route. (I was told rather tersely on the FreeNAS forums that it would be stupid to run ZFS on non-ECC RAM.)

Based on what I'm reading now, that's not really the line of thinking anymore?

I may consider trying ZFS with my 9211s (I have four of them that I otherwise have no use for, but I purchased a 9440-8i for some modern RAID 1 for simplicity).

Thank you so far for the information! No plans for RAID 5/6 at the moment, just RAID 1 to keep things simple; I just want to get SSD levels of speed from the SSD RAID 1.
 
Undetected RAM errors due to missing ECC can corrupt files, and even a high-security filesystem like ZFS, but this is no different from any other filesystem. Only the intensive RAM usage was special in the past, when other systems were proud of "90% free RAM" under average use while ZFS used all RAM for caching. Any modern OS does the same today, so in the end you should use ECC anyway if you love your data, no matter the filesystem. The more RAM you use, the higher the probability of RAM errors and the stronger the case for ECC.

In the end, other filesystems are more exposed to RAM errors, as only ZFS has a chance to detect them through checksums. When errors rise above a certain level, ZFS sets a disk to "offline due to too many errors". The claim that ZFS kills a filesystem through ongoing "false repairs on checksum errors" is exactly that, a myth, despite some comments on a certain forum. In most cases ZFS can limit and report any damage from RAM errors to individual files, even without RAID redundancy (thanks to checksums on data and metadata, and duplicated metadata).

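As a toy illustration of that verify-on-read idea (this is only the principle, not ZFS's real on-disk format): every block is stored together with a checksum, and a block whose data no longer matches its checksum is reported instead of being silently returned.

import hashlib

store = {}  # block_id -> (data, checksum)

def write_block(block_id, data):
    store[block_id] = (data, hashlib.sha256(data).hexdigest())

def read_block(block_id):
    data, stored_sum = store[block_id]
    if hashlib.sha256(data).hexdigest() != stored_sum:
        # real ZFS would try a redundant copy first, then report the affected file
        raise IOError(f"checksum error on block {block_id}")
    return data

write_block(1, b"database page")
store[1] = (b"databasf page", store[1][1])   # simulate silent corruption (bad RAM or disk)
try:
    read_block(1)
except IOError as e:
    print(e)
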
 
A question, if you've run ESXi with FreeNAS providing the filesystem: over iSCSI, how are the file transfer speeds? Is this going to be limited by the gigabit network or something, or will I be able to use the full speed of the SSD?

Thanks again for the info. I have one 9211 still in IT mode, so I think I'll start there with ZFS soon ;D
 
Just in case anyone is interested --

I ran some benchmarks on VMs on RAID 10 with 1TB SSDs / ESXi 6.7, and the write speeds are still quite bad, considering I can get 500 MB/s from an SSD running off a board SATA port. The reads look impressive, though.

Looks like there's a write bottleneck somewhere. The RAID 10 writes perform about the same as the RAID 1 writes, which in turn underperform a single-disk SSD write. The RAID card, perhaps? If it were PCIe 2.0 alone, the reads wouldn't be so good either.

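A quick back-of-the-envelope check (nominal numbers only) supports that: PCIe 2.0 gives roughly 500 MB/s of usable bandwidth per lane after 8b/10b encoding, so the card's x8 slot has about 4 GB/s available, far more than a handful of SATA SSDs can write.

lanes = 8
per_lane_mb_s = 500                   # ~500 MB/s usable per PCIe 2.0 lane after 8b/10b
bus_mb_s = lanes * per_lane_mb_s      # ~4000 MB/s for an x8 card like the 9211-8i
ssd_write_mb_s = 500                  # roughly one SATA SSD writing sequentially
print(f"PCIe 2.0 x8 budget: {bus_mb_s} MB/s")
print(f"Four SSDs writing flat out: {4 * ssd_write_mb_s} MB/s")   # still well under the bus limit
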

I wish ESXi had some form of mildly robust software RAID like Linux does.
 

Attachments: WindowsVM.JPG, CentOSVM.JPG (benchmark screenshots)
Any decent sort of software RAID, no matter whether it is mdadm or ZFS, uses mainboard RAM for read/write caching. Caching is a key method for performance when you connect something fast (CPU/RAM) to something slow (disk, SSD). ESXi itself does not use RAM for caching, as that would be critical for VMs. Any RAM-based write cache, no matter whether it sits on the disk/SSD, on the RAID controller or in the operating system, leads to data loss or data corruption on a crash during a write unless there is powerloss protection.

A hardware RAID 1 without cache cannot be as fast as a software RAID with cache support from mainboard RAM.

Btw:
If you use a ZFS storage VM to provide storage for ESXi, you can have a cache and cache protection (sync write) at reduced performance (security costs performance anyway). Connectivity between a storage VM and ESXi is "internal, in software" and depends on the type of vNIC. E1000 is quite slow; vmxnet3 is fast, although not as fast as a directly connected SSD because of the network stack in between.

I would not use iSCSI for ESXi storage, but NFS. Performance should be quite similar with the same sync settings, but NFS is by far easier: you can use ZFS snaps for a file-based rollback, or SMB for easy and fast access for copy/move/clone/backup or recovery via Windows "Previous Versions".

I would not prefer FreeNAS, as its RAM need is quite high. The most RAM-efficient choices are Solaris-based storage VMs (Oracle Solaris, where NFS and ZFS come from and are native, or a Solaris fork like the minimalistic OmniOS with OpenZFS). On FreeBSD, XigmaNAS should be more RAM-efficient than FreeNAS.

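For completeness, here is a rough sketch of mounting such an NFS export as an ESXi datastore. The IP, export path and datastore name are placeholders, and the esxcli options shown are what I would expect for an NFS3 mount on ESXi 6.x, so verify them against your version before relying on this.

import subprocess

NFS_HOST = "192.168.1.50"        # placeholder: IP of the ZFS storage VM
NFS_SHARE = "/tank/vmstore"      # placeholder: NFS export from the storage VM
DATASTORE = "zfs-vmstore"        # placeholder: datastore name to show in ESXi

# Run on the ESXi host (it ships with a Python interpreter) or wrap the commands in ssh.
subprocess.run(["esxcli", "storage", "nfs", "add",
                "--host=" + NFS_HOST,
                "--share=" + NFS_SHARE,
                "--volume-name=" + DATASTORE], check=True)

# Confirm that the datastore is mounted.
subprocess.run(["esxcli", "storage", "nfs", "list"], check=True)
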
 