Petabyte Storage

They use the ZFS filesystem, so it doesn't need an underlying HW RAID. For caching you can add SSD or RAM as a ZIL (write) or L2ARC (read) cache...
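For illustration, adding cache devices to an existing pool is a one-liner each way; a rough sketch, with made-up pool and device names:

# mirrored SLOG (ZIL) on two small SSDs -- pool/device names are hypothetical
zpool add tank log mirror c2t0d0 c2t1d0
# L2ARC read cache on a larger SSD
zpool add tank cache c2t2d0
# confirm the new layout
zpool status tank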
 
aberdeen sells nexenta powered solutions so zfs is doing all the 'raid'. you can add write or read cache as you need. that particular system doesn't have any by default but you can add whatever you need. however, the raw storage number will drop below 1PB.

also, i'm not a huge fan of how they cabled their system. daisy chain is fine in some scenarios however i'm not a fan of daisy chain and high availability together. mix in a switch for that use case IMO.
 
@levak

Thanks for that. I didn't know they sold Nexenta software, otherwise I would not have asked such a silly question. :p

@madrebel

I agree with you, daisy chaining and HA shouldn't even be in the same sentence (oops!!), let alone the same system. As for adding cache, if the cache was an ECC RAM disk backed by a local battery and a UPS, it wouldn't impact raw storage now, would it? :p
 
you can't do HA with that type of ram disk though. well, you 'can' but your pool will always be degraded because you have to create the pool, add the local ramdisk, export/import, add the other ramdisk. that 'works' but your pool always shows as degraded since you can't mount the local drive from the opposing head.
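roughly the dance i mean, with made-up names and sizes (on solaris/illumos the ramdisk would come from ramdiskadm) - a sketch of the workaround, not a recommendation:

# on head A: create the pool and add A's local ramdisk as log (names/sizes made up)
ramdiskadm -a slogA 4g
zpool create tank mirror c1t0d0 c1t1d0
zpool add tank log /dev/ramdisk/slogA
# export, import on head B, add B's local ramdisk as a second log device
zpool export tank
zpool import tank
ramdiskadm -a slogB 4g
zpool add tank log /dev/ramdisk/slogB
# whichever head has the pool imported, the other head's ramdisk is unreachable,
# so zpool status reports the pool as DEGRADED forever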

you also can't create multiple pools without slicing up the ram into many ramdisks which would get very problematic to manage.

zeusRAMs are expensive, but they're awesome at what they do.
 
You mentioned that you're not even fully utilizing the IOPS of your current setup due to network saturation. Have you thought about adding additional network cards and teaming/bonding them? This would split the workload over multiple network cards and lower the overall saturation of a single card, thus enabling you to reach the IOPS limit of your current server.
We could do that. Anyway, we need more space and will not reuse this server for this setup; it will probably do other tasks. Right now we're still in the process of finding out how much throughput we have. Our setup wasn't properly monitored. I want to have solid numbers so I can judge better.
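For reference, if we do go that way, a minimal Linux bonding sketch with iproute2 (interface names and the 802.3ad/LACP mode are assumptions; the switch would have to support LACP):

# create an LACP bond and enslave two NICs -- eth0/eth1 are placeholders
ip link add bond0 type bond mode 802.3ad
ip link set eth0 down
ip link set eth1 down
ip link set eth0 master bond0
ip link set eth1 master bond0
ip link set bond0 up
ip addr add 192.168.10.5/24 dev bond0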
 
I've had some dealings with Compellent; they are very aggressive in going after EMC customers, even buying back old EMC equipment when you purchase new Compellent hardware. The warranty, service, and support are similar to EMC's, with the monitoring and phone-home equipment.
I will be in contact with Compellent. Thank you for that information.
 
use LSI only. LSI HBAs and LSI controllers in the JBODs. if you mix in another vendor in the SAS chain you're asking for trouble.
Ty. LSI was always our only seriously considered option.

you will want nexenta, the things they have coming for 4.0 are absolutely amazing.
We still have no quote from Nexenta. It boils down to how much we would pay for licenses.

I also get a lot of use from a product called SANTools,
This is very useful. TY for that tip.
Thank you all and especially madrebel for this insight. I presented my newly found knowledge to my bosses.

Everything points towards a self-built setup. I have some more questions I need to ask:

With ZFS as the filesystem, how would we organize vdevs? I understood there's a practical limit to connecting more than X drives in one pool. This probably also depends a lot on workload. We will use 3.5 inch 2 or 3 TB SAS disks and some SSDs/ZeusRAMs for ZIL and L2ARC. We still don't know how many servers + disks per server, and how to power these in an appropriate way.

XFS + mdadm or XFS + hardware RAID is still not ruled out; we see Linux as an option, and ZoL is maybe too young.
This has the potential of starting a flame war, but I would like to ask about your experience in this direction. Please stay focused.

As madrebel and others pointed out, it all depends on what we need in throughput, latency, etc. Is there a common way to find all the metrics we need, to properly decide which route to go?
TY for your time
 
With ZFS as the filesystem, how would we organize vdevs? I understood there's a practical limit to connecting more than X drives in one pool. This probably also depends a lot on workload. We will use 3.5 inch 2 or 3 TB SAS disks and some SSDs/ZeusRAMs for ZIL and L2ARC. We still don't know how many servers + disks per server, and how to power these in an appropriate way.
For IOPS, mirrors are recommended. Build a zpool out of lots of mirrors. If you only need throughput, then raidz2 suffices. Maybe 8 disks in each raidz2. So you can connect many vdevs, each consisting of a raidz2.

For instance, the Sun X4500 Thumper had 48 SATA disks connected to 6 HBAs. So typically you had vdevs consisting of six disks: the first disk connected to the first HBA, the next disk to the next HBA, etc. This way a 6-disk vdev was spread across unique HBAs, which means that if one HBA died it did not matter, because all the other HBAs were still active. So you could have 8 vdevs, each consisting of 6 disks, in one huge zpool. That makes 48 disks. (Not really true, because you needed some disks to host the OS.)
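A hedged sketch of that kind of layout in zpool terms, with made-up device names (c0..c5 standing in for the six HBAs, t0..t3 for the disk slots):

# each raidz2 vdev takes one disk from each of the six HBAs
zpool create bigtank \
  raidz2 c0t0d0 c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 \
  raidz2 c0t1d0 c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 \
  raidz2 c0t2d0 c1t2d0 c2t2d0 c3t2d0 c4t2d0 c5t2d0
# more vdevs can be appended later in the same pattern
zpool add bigtank raidz2 c0t3d0 c1t3d0 c2t3d0 c3t3d0 c4t3d0 c5t3d0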



XFS + mdadm or XFS + hardware RAID is still not ruled out; we see Linux as an option, and ZoL is maybe too young.
This has the potential of starting a flame war, but I would like to ask about your experience in this direction. Please stay focused.
XFS is not safe with respect to data corruption. Here are a lot of research papers if you want credible sources when you talk to your boss:
http://en.wikipedia.org/wiki/ZFS#Data_Integrity

If you store a lot of media, then data corruption might not be a great concern: a single pixel might end up blue instead of red, and that is no big deal. But if you run a database, then randomly flipped bits - bit rot - are a real problem.
 
Is there a common way to find all the metrics we need, to properly decide which route to go?
TY for your time
not really. you can ballpark by counting up the drives in your environment and estimating what the physical drives can deliver. be careful to account for current raid levels etc.

vmware has some survey tools that get you close .... ish.

you can run perfmon on windows boxes and capture use stats. sometimes you have an app that you know needs to push/pull X MB/s or requires Y IOPs or you would like it to have Y IOPs.

sizing up storage requirements is as much art as science though unless you already have solid monitoring in place which very few do.
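as a crude illustration of the kind of ballpark math (every figure below is a made-up assumption, not a measurement):

# say 120 x 7.2k nearline SAS at ~75 IOPS each, sitting behind RAID-6 (write penalty 6),
# with a workload that's roughly 70% read / 30% write
awk 'BEGIN { b = 120*75; r = 0.7; w = 0.3; p = 6;
             printf "backend ~%d IOPS -> ~%d host-visible IOPS at that mix\n", b, b/(r + w*p) }'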
 
@OP



This would be your first mistake. Right now, the performance of ZFS on Linux simply isn't there, and since the project isn't very widespread, it hasn't really been put to the test. I would NOT trust my data to it.

Having said that, I can see nothing wrong with using a number of ZFS nodes running an OpenSolaris variant like NexentaStor with the zvols exposed via iSCSI and then perhaps using CentOS nodes in front of them running Lustre for a clustered file system. You could also look at RedHat Storage Server, which uses GlusterFS.

I'd get some expert advice though, if only to broaden your awareness of what's involved. :)
I would like to add some ZoL information:
ZoL is not ready for prime time. It has many hidden holes that you need to avoid manually (as long as you read the release notes and the dev/discussion mailing list, you should be OK), and some functions are not fully supported in ZoL, which can turn ZFS into a disaster :). I would say it is "ready for SOHO only".

I am using ZoL on CentOS 6.3 as a back-up server, and the performance is pretty good compared with OI ZFS :) ehem, actually a bit better ... using old SAT2-MV8 cards.
I'm actually moving off a legacy Adaptec RAID card....
One big concern: when you roll back via snapshot without unmounting, ZoL would be S.O.L :(.
 
With ZFS as the filesystem, how would we organize vdevs? I understood there's a practical limit to connecting more than X drives in one pool. This probably also depends a lot on workload. We will use 3.5 inch 2 or 3 TB SAS disks and some SSDs/ZeusRAMs for ZIL and L2ARC. We still don't know how many servers + disks per server, and how to power these in an appropriate way.

There is no technical limit on the number of vdevs/disks in a pool, or on the reachable capacity. The practical limit is that with many disks you have more, and more frequent, disk faults, which degrades availability or performance.

Mostly it is better to have not one big server but several smaller ones - each with just enough sequential and I/O performance and network connectivity.

Without databases or ESXi/NFS use, you do not need a ZIL. First and foremost you need RAM, to serve nearly all reads from RAM.
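A quick way to sanity-check that on an illumos/OpenIndiana box is the ARC kstats; a small sketch (the arcstats counters are standard, the one-liner itself is just an example):

# overall ARC hit ratio since boot
kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses | \
  awk '/:hits/{h=$2} /:misses/{m=$2} END {printf "ARC hit ratio: %.1f%%\n", 100*h/(h+m)}'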

XFS + mdadm or XFS + hardware RAID is still not ruled out; we see Linux as an option, and ZoL is maybe too young.
This has the potential of starting a flame war, but I would like to ask about your experience in this direction. Please stay focused.

With large storage, you have the following problems:

- You always have silent data errors, and they grow with the amount of storage.
With large storage you need checksums to discover these errors at all, and to repair them on reads from the RAID redundancy. Beside that, you must check your whole storage regularly for silent errors. This must be done in the background, without a dismount, even for open files.
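With ZFS that regular check is simply a scrub: it walks every block, verifies checksums, and repairs from redundancy while the pool stays online. For example a monthly cron entry (pool name made up):

# root crontab: scrub on the 1st of every month at 03:00
0 3 1 * * /usr/sbin/zpool scrub tank
# check progress and results at any time
zpool status tank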

- You will always have filesystem inconsistencies that usually can only be repaired with a fsck run. This can last days with a PB of storage and must be done offline, without any guarantee of success. A suddenly needed chkdsk on my mail server some years ago resulted in thousands of unusable data chunks.

Only a filesystem with copy-on-write ensures that writes are either completed or not done at all, so that these errors cannot occur.

With a PB of storage, I would go ZFS.
It is called the last word in filesystems, has all these features, and it is open source. No vendor or OS lock-in.
Only the Linux port is maybe not yet ready. ZFS on Nexenta/OI/OmniOS/Solaris and BSD is interoperable and stable enough.

Real data checksums and copy-on-write are the two major key features for modern storage.
You need them more than any other feature.
 
The only problem with ZFS is when you grow your zpool by adding more vdevs, there is no reallocate to balance existing data.
 
The only problem with ZFS is when you grow your zpool by adding more vdevs, there is no reallocate to balance existing data.

Why should that be a problem?
All modified data is striped over all disks and is rebalanced automatically due to copy-on-write. The other data is read from the disks where it already sits. Disk read performance is not a problem and reads are mostly served from the ARC cache. Automatic rebalancing would only disturb operations, because it would add the same load as a resilver after a disk failure.
 
i agree with him gea_, somewhat. it would be nice to rebalance. required? no. especially since there is no scheduler to prioritize user/app-facing IO over background IO.

maybe one day bp_rewrite or something better will make it into the code base. till then, it is what it is.
 
Why should that be a problem?
All modified data is striped over all disks and is rebalanced automatically due to copy-on-write. The other data is read from the disks where it already sits. Disk read performance is not a problem and reads are mostly served from the ARC cache. Automatic rebalancing would only disturb operations, because it would add the same load as a resilver after a disk failure.

Is it absolutely required? No
Did I say automatic rebalancing? No

You have a vdev at 70% full, and now you add a new vdev. ZFS is smart enough to distribute IO among the vdevs, but how does it distribute it in this case? With COW, when free capacity drops below a certain threshold, write performance suffers.

When you add multiple vdevs at a time, this may not be a big issue. But if you only add 1 vdev, the performance is not very predictable.

I love ZFS, but for such massive-scale storage, bp_rewrite should be a standard feature and you should be able to schedule it.
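At least you can watch how unevenly the vdevs fill; zpool iostat -v breaks capacity and IO down per vdev (pool name made up):

# per-vdev allocation and IO -- the old, fuller vdev will show far less free space
zpool iostat -v tank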
 
Animators and compositors can and do love to produce a lot of data. RenderLayers_v01 to v10 is pretty common, where only one version of the render is finally used. It depends on the workflow, but surely the OP needs to operate with a LOT of temporary data which will be deleted after the project ends. In this case rebalancing is not a big deal; also, the backup solution doesn't need to be petabyte-sized, most probably a quarter of it.
 
The only problem with ZFS is when you grow your zpool by adding more vdevs, there is no reallocate to balance existing data.
Just move all data to a newly created zfs filesystem and the data will be rebalanced. Yes, it can be a pain to move data when you have added a vdev, but it can be done.

Alternatively, move data off your zpool, create a new zfs filesystem and move data back to your zpool. But it is easier to just create a new filesystem and move data to it.
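A hedged sketch of that approach with made-up dataset names; the copy lands striped across all current vdevs, so the new dataset ends up balanced:

# snapshot the old dataset and copy it into a new one on the same pool
zfs snapshot tank/data@rebalance
zfs send tank/data@rebalance | zfs receive tank/data_new
# once the copy is verified, drop the old dataset and rename the new one into place
zfs destroy -r tank/data
zfs rename tank/data_new tank/data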
 
Yeah, anyone who is suggesting that EMC could possibly come in for this is living in fairyland. Not to mention their low-end stuff is actually crap.

Compellent is pretty interesting: ZFS based, but also with tiering rather than a standard L2ARC implementation. Plus you get the support, and as you grow you can continually add shelves of cheap storage and let the heads do the work of shifting your old data to the slow discs when it becomes outdated or unused, while it still stays instantly available.

The Compellent product is not ZFS based at all. The Dell Compellent zNAS product uses ZFS; it sits in front of the Compellent Storage Center and provides NFS and CIFS. The most recent is the Dell Compellent FS8600, which does not use any ZFS.
 
I'm a little late to this thread.

EMC Isilon is your answer.

Isilon was originally built to support giant media files on simple-to-support nodes. It got its start in the media industry.

I am running 8 X nodes, running NFS, and CIFS and sharing the same filesystems between Centos/Redhat/Oracle and Windows 2k3, and 2k8 with zero issues. The stuff just works.

The nodes are reasonable as well. Just depends on budget.
 
he said he wants 1PB of storage. the cost of this with the letters EMC anywhere involved rapidly approaches 9 figures.

Not true in regards to Isilon. I'm at 48TB of space for WAY less than you might think.

You'd hit 9 figures running a V-Max 40k, with all SSD drives.
 
As shown by a poster above, getting even close to the required amount would require nearly doubling the budget.

I've had some dealings with Compellent; they are very aggressive in going after EMC customers, even buying back old EMC equipment when you purchase new Compellent hardware. The warranty, service, and support are similar to EMC's, with the monitoring and phone-home equipment.

Prices are going to come in lower than EMC for sure with the Tiered support everyone is mentioning above. My company has 1/2PB of storage on 2 EMCs and we are actively looking to move off this platform due to $$$.

The only reason Compellent is that aggressive is that Dell and EMC used to be best buds, and Dell resold EMC. They had a nasty break-up about 2 years ago when Dell bought out Compellent. It's Dell's way of giving EMC the finger. I still have CX4-240s that I have to call Dell for support on.

With that said, Compellent isn't a bad product, and if it works for someone then perfect. But it's definitely no match for a V-Max 10k, or even something like a VNX5700.
 
you can ballpark by counting up the drives in your environment and estimating what the physical drives can deliver. be careful to account for current raid levels etc.

You are 100% correct with this statement. But it almost becomes null and void if you go with a modern storage subsystem that can incorporate EFD/SSD drives and move only the busiest blocks of data to those disks.

No one's data runs at full tilt 100% of the time. You can leverage dump trucks of SATA disks and then have smart caching with EFDs. Counting drives becomes a dead way of thinking.

As an example, I've got 4 apps in my environment pushing a combined 80K IOPS, almost 85% random IO. I would need roughly 960 15K FC drives to pull that off. How do I do it? Fewer than 360 FC drives, 64 SATA drives, and 16 EFDs.
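To make the arithmetic behind that concrete (every number below is a made-up assumption, not my actual config): the IO that the EFD/cache layer absorbs simply never needs a spindle.

# 80,000 front-end IOPS, 85% random; assume flash/cache absorbs ~2/3 of the random IO
# and a 15K drive is good for ~180 random IOPS -- all assumed figures
awk 'BEGIN { total = 80000; random = total*0.85; seq = total - random;
             absorbed = random*2/3;                 # served from EFD/cache
             spindle = random - absorbed + seq;     # what the spinning disks still see
             printf "spindle IOPS: %d -> ~%d x 15K drives\n", spindle, spindle/180 }'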

When you start looking at the physical footprint, it's not just about storage; you have to start thinking about power, cooling, and how much space in your DC things will take. If you rent DC space, your DC will charge you per month on the WIPS you use. Not to mention the man-hours for the care and feeding of something that's really large. Cost goes up in other ways.

Isilon will do it, and they also have an optional onboard fast cache accelerator, so you can have internal SSDs and a cache accelerator. One of my nodes was DOA and I had to take it apart to replace some of the guts, and saw it. Same with VNX and V-Max.

Unless you already have 1PB of space used, I would start out with a few Isilon nodes, but go with a 16-port InfiniBand switch, knowing you're going to grow into it. Adding a node is stupid easy: cable it up, power it on, tell it to find and join a cluster, and you are done.
 
all true however you still need a ballpark or you're just throwing SSDs at the problem until it goes away. which isn't the 'wrong' answer however it isn't the ideal answer either. pre-planning goes a long way. if pre-planning fails, throw SSD at it :).
 
Every array should have SSDs as a shared resource. Maybe not at install time, but down the road adding them will save a lot of headache.

But again, it depends on the goal of the array, the direction you plan to go, and budget. OP said his budget was 130K. I think he could pull off a good chunk of what he's planning on doing with Isilon and accelerator cards.

If someone gave me 130k and said build a ZFS-based system using Linux and some SAS cards on some no-name server? I'd walk out the door; that ain't the job for me.
 
nobody runs production zfs on linux yet.

he could do the same scale out infiniband with nexenta/zfs too.
 
nobody runs production zfs on linux yet.

he could do the same scale out infiniband with nexenta/zfs too.

I do like your infiniband idea, but I'm not trying to thread-crap on your solution, so don't take it this way. Are you talking about this: www.nexenta.com or the NexentaOS? The NexentaOS looks like it was retired, so I'd be very wary of running it. The product on the nexenta.com website looks neat, but it's only a few years old, and is based off of OpenSolaris.

I know I'm going to ruffle some feathers with this next statement: now that Oracle owns OpenSolaris, it'll be a dead OS in a few years. I'm not trying to be a d-bag, but really... how long has/was Solaris 11/OpenSolaris 11 been in development? 9 years? If I had to put money on it, Oracle will close OpenSolaris and make it go away. Anyone who's worked with Oracle knows they are very proud of their products and give nothing away for free.

On top of that, the problem I see with most software-based storage products, such as Nexenta, is that they are software based and you are limited by the hardware you're running them on. Unlimited combos of hardware could easily throw wrenches into this. This might be a great fit for a small to medium install, less than 100TB of space, but outside of that it becomes an animal that'll be too unwieldy to manage. If your box takes a dump, your storage takes a dump. This is where a hardware-based solution would be a good fit. Not just EMC, but NetApp, Isilon, Compellent, or even HDS.

1PB of space is no joke and nothing to laugh at. It requires a solution that is easy to manage, easy to back up (NDMP?), and has a vendor with a proven track record that will support it and has its own dedicated hardware.
 
current production nexenta runs off opensolaris kernels, yes. the pending 4.x branch switches to an illumos kernel.

my personal prod install of nexenta has 720TB raw and isn't difficult to manage at all. you do have a point though, you really need to work with a nexenta hardware partner or do your own homework. you can't just buy any hardware and expect it to work. stick with intel and lsi and for the most part you're fine.

i know of a lot of large installs (20+ PB) running nexenta and some even larger ones running opensolaris and ZFS.

it does work quite well and is a LOT less expensive than the emc, netapp, compellent, et al.
 
current production nexenta runs off opensolaris kernels, yes. the pending 4.x branch switches to an illumos kernel.

my personal prod install of nexenta has 720TB raw and isn't difficult to manage at all. you do have a point though, you really need to work with a nexenta hardware partner or do your own homework. you can't just buy any hardware and expect it to work. stick with intel and lsi and for the most part you're fine.

i know of a lot of large installs (20+ PB) running nexenta and some even larger ones running opensolaris and ZFS.

it does work quite well and is a LOT less expensive than the emc, netapp, compellent, et al.

Wow, that's pretty amazing. What's the use case for such large data stores? Oracle, VMware, filesystem data, staging space, temp space?
 
DIY is nice and all, but wouldn't you want a tier 1 vendor to design it and support it so you don't have the headache when something fails?
 
Wow, that's pretty amazing. What's the use case for such large data stores? Oracle, VMware, filesystem data, staging space, temp space?
a lot of vmware type stuff and back end for cloud compute etc.

DIY is nice and all, but wouldn't you want a tier 1 vendor to design it and support it so you don't have the headache when something fails?
i have news for you, if your EMC fails, and it can, you have a headache. there are no failure scenarios where you're not really upset.

in my space margin rules. my job is to drive margin while delivering feature rich solutions. EMC makes great solutions but they demand a ridiculous margin with their crap. $1000 for a 3TB SAS drive that costs $350, wtf?

DIY also isn't entirely fair. there is a nice eco system growing now with many solid hardware partners. and that is another thing, you're more flexible in the build when you work with a hardware vendor. the way i build clusters is NOT required for most other use cases however i had very specific requirements so i built my system a very specific way.

idk, imo you'll go broke buying emc/netapp/dell. virtualization has allowed us to completely commoditize the compute layer. it's time to do the same with storage.
 
that's a fair point. If the input costs are such that the overall cost is greater than the cost saved by the third party supporting the system then forget it.

i'm really interested in what the op chooses as SANs have fascinated me a lot lately

what if we as a community came together to offer advice on a DIY solution (hardware: raid controllers, hard drives, etc., and software: nexenta, linux, etc.) and then the op found vendors that would support them on a more enterprise level, i.e. when a drive failed the drive vendor would promise to replace it within the business day or something, and then leave IT to manage the system?

that is of course if the highly specialised storage solutions, like the enterprise version of apple's corestorage, offer performance that offsets the price - then the DIY solution is moot

in the end I think we need more info and personally I'd like to be witness to the community-comes-together-offers-solutions-and-sees-how-it-works idea

it's like crowdsourced business consulting
 
a lot of vmware type stuff and back end for cloud compute etc.

i have news for you, if your EMC fails, and it can, you have a headache. there are no failure scenarios where you're not really upset.

in my space margin rules. my job is to drive margin while delivering feature rich solutions. EMC makes great solutions but they demand a ridiculous margin with their crap. $1000 for a 3TB SAS drive that costs $350, wtf?

DIY also isn't entirely fair. there is a nice eco system growing now with many solid hardware partners. and that is another thing, you're more flexible in the build when you work with a hardware vendor. the way i build clusters is NOT required for most other use cases however i had very specific requirements so i built my system a very specific way.

idk, imo you'll go broke buying emc/netapp/dell. virtualization has allowed us to completely commoditize the compute layer. it's time to do the same with storage.

While I agree with some of your points, because they are very valid, there are some flaws with them - more so in the shops I've worked in and currently work in. So in other words, you are 100% correct in your situation, but it doesn't apply to everyone else. With that said, there are some flaws with what I'm posting below. I'm not thread-crapping, and I hope I'm not coming off that way. Just adding a different view.

The reason a shop will run a tier 1 provider of storage is not because of cost but because of the proven uptime. I've worked in fortune 500, and currently in the financial sector. In both areas, when your storage takes a dump, you lose millions in a very short amount of time(hours). Those millions lost could have easily paid for a tier 1 solution.

With that said, yes, EMC charges a ridiculous price, but you get what you pay for. Five 9's of uptime is real with my solutions. I've never had a complete meltdown of a CX/NS/VNX array. Sure, I've had an SP reboot, or a FLARE code upgrade go sideways, but because PowerPath was installed on all my hosts I never had a complete outage - maybe a performance hit while SP cache rebooted. I've done entire cache upgrades on my V-Max 20ks, which require a director outage, and never had a meltdown. There are very few ways an EMC array would take a complete dump - complete power loss (preventable), multiple hardware failures (preventable) - it's most likely caused by operator error. In the unlikely event my gear does take a dump, I have a 1-800-it's-broke number to call to get someone to help me. With any DIY solution, you are limited to forums, mailing lists, and praying to the storage gods that you've also got good backups and a place to restore them to.

You can DIY a storage system all day long, and even work with multiple hardware vendors to do it. But when things go sideways, everyone points fingers at everyone else and the only person that suffers is the storage admin who built that DIY solution. What happens when you leave your current gig? How hard would it be for someone to jump into your shoes and pick up where you left off? While you think your DIY solution is simple, it's a one-off solution that only you know the quirks of. Let me ask it another way: have you ever jumped into a new gig, picked up someone's bastard "solution", and been like... WTF have I gotten myself into?

Storage virtualization is already there: you can use a V-Max 40k/20k to virtualize CX/NS/VNX and HDS storage behind it. I'm currently doing it. You can also do the same with Hitachi's USP-V/VSP; I'm doing that as well. But I do not agree (which doesn't mean you're wrong) with the way you envision storage virtualization. Too many ways to do it, and too many variables to go sideways. It would become an unwieldy, complicated animal to manage.

I've reread my response above a few times, and tried to adjust for tone, and to not come off as a d-bag. I Apologize up front if I seem sideways.
 
that's a fair point. If the input costs are such that the overall cost is greater than the cost saved by the third party supporting the system then forget it.

i'm really interested in what the op chooses as SANs have fascinated me a lot lately

what if we as a community came together to offer advice on a DIY solution (hardware: raid controllers, hard drives, etc., and software: nexenta, linux, etc.) and then the op found vendors that would support them on a more enterprise level, i.e. when a drive failed the drive vendor would promise to replace it within the business day or something, and then leave IT to manage the system?

that is of course if the highly specialised storage solutions, like the enterprise version of apple's corestorage, offer performance that offsets the price - then the DIY solution is moot

in the end I think we need more info and personally I'd like to be witness to the community-comes-together-offers-solutions-and-sees-how-it-works idea

it's like crowdsourced business consulting

I can see this for small shops that just don't have the cash flow. But if your shop can't be down for more than a few minutes, then the above doesn't work. Here is why: lack of set standards. When you piecemeal a system together, regardless of how simple it is, someone else who comes along is going to scratch their head for quite a while till they figure it out. When you buy a storage subsystem package (I'm not talking just EMC), you buy a stack of hardware and software that has been tested and is known to operate the way it's expected to. It only has a small number of variables, and you can pretty much predict how it's going to behave. It also means that if you quit and someone steps into your role, they can keep on keeping on without much fuss.

the above is all just my opinion, and doesn't mean it applies to everyone or is the set standard way to go.
 
in the past, there was no choice. sure we've all had that one box that was linux + mdraid or whatever but it did nothing important. you had netapp or emc, end of story.

that isn't the case now. nexenta has a 1-800 number for support. they're staffed with some really great support technicians. if you were to, say, build something from openindiana or freebsd, yes, you're on your own - however you don't have to be.

as for design, if you're not 2N don't talk about uptime. if you are 2N, and you sound like you are, then uptime really comes down to operator error and/or things like superstorm sandy taking out your data center. emc and netapp both had plenty of systems go down in that storm. the name on the box and the extreme margin really didn't matter.

sure you can build DC redundancy with emc/netapp but you can do the same thing with nexenta or a DIY zfs setup.

as for what happens when i leave, not my problem. i have no plans to leave so that pretty much means what happens if i die ... again ... not my problem :D. however, again, nexenta has a support line just like emc and netapp. any competent storage admin can figure it out really rather quickly.

you sound like a knowledgeable guy however you also sound a bit like some admins i've known in the past who absolutely have to have 'the name' or it just won't work. much of this is handed down from on high because the C levels just don't understand that there may be a better and less expensive alternative. however i've known many more admins who, if there wasn't a support number to help with every little thing, wouldn't get anything done at all, ever. many of these guys go and get certifications and the whole lot but are terrified of actually doing anything, fearing they won't be able to point a finger. maybe i've been around too long, but there used to be a time when folks became experts with their tools - learned everything there was to learn and only called support when shit was smoking or something extremely complicated needed to be done.


idk, i think your view on anything off-brand should be ... adjusted. massive sites you go to every day, or that millions of people go to every day, that never go offline run something other than emc/netapp. take youtube for example - you really think they're housing hundreds of petabytes worth of video on EMC? they would have gone bankrupt years ago trying that. all the really exciting 'big data' efforts right now are focused not on emc/netapp technologies but on software-driven technology backed by commodity hardware.

there is no reason enterprise can't or shouldn't take notice. the reason they move so glacially slowly on this stuff, IMO, is terrified admins. admins that could be saving their companies millions of dollars but in most cases lack the understanding or the ability to communicate it up the chain. however, there are some cases too where, according to company charter, it is 'illegal' to use anything that isn't at least 5 years old. that is a very real thing in some environments - not many, but some (really large, too).

i'm obviously biased, but i have ... crap, getting close to 20 years in this space, and most of that was spent watching 7-8 figure POs get signed for hard drives that are marked up 400%, just absolutely mind-boggled as to wtf we were getting for that 400% margin ... obviously i'm very excited about the past few years' advancements and the software-defined/driven focus for storage.
 
I can see this for small shops that just don't have the cash flow. But if your shop can't be down for more then a few minutes, then the above doesn't work.
but it does. all the major high-traffic sites use something that is off the shelf to one degree or another.
Here is why: lack of set standards.
iscsi, nfs, cifs, sas, ses, FC ... only emc/netapp have access to these?
When you buy a storage subsystem package(i'm not talking just emc), you buy a stack of hardware and software that has been tested and is known to operate the way its expected to. It only has a small amount of variables, and you can pretty much predict how its going to behave.
emc/netapp use fairly standard stuff really. in fact you used to be able to buy the exact same JBODs they use. they're just regular old JBODs with LSI or PMC-Sierra controllers, the same thing you can buy off the shelf. they use regular SAS or FC drives from seagate and wd/hitachi. the only difference is they flash a special firmware that allows the drive to function in their JBODs. you did know that they flash the commodity JBODs with a certain firmware that only allows specially flashed drives to function, right? you can't just use any old drive in the JBOD - you have to buy their marked-up drives that are made from unicorn horn.

It also means that if you quit and someone steps into your role, they can keep on keeping on without much fuss.
this is a business decision that may or may not be your concern. in one respect it is to your advantage as you're 'the only one who can do it'. if the savings are large enough and you do a good job of letting finance know how much you're saving the company, they would be fools to not keep you financially satisfied. if they don't, not your fault.

at some point enterprise needs to value people/expertise more than names on boxes. thats just my opinion.
the above is all just my opinion, and doesn't mean it applies to everyone or is the set standard way to go.
 
In the past there were other choices, but they were either gobbled up by NetApp or EMC, or are only now finally growing - 3PAR, Compellent, Xiotech, to name a few.

I'm 2N for most things, but my super critical stuff is N3: 1 DC in town, a 2nd in Vegas. But when you're talking about natural disasters, I get what you're saying. But if you're taking natural disasters into the design of your data center, and you have the money for 2 data centers, then you should have the money to buy a branded hardware solution if you really care about your data and business that much.

I totally get what you're saying about Nexenta, but at some point you'll have an issue that impacts your Nexenta install but is outside the realm of Nexenta support (hardware). Nothing is forever. When you do leave your company, you seem to think that any admin could step in and do what you did. I doubt it. You built a custom hardware and software solution; unless there is a training class that has the exact config you have, it would be impossible for someone to pick it up right out of the chute.

I never said anything about C-level folks handing down things to my level or saying it's not going to work. On top of that, my C-level folks don't care what I use, as long as it doesn't ever tank on me. I know without a doubt I can get five 9's of uptime with my gear and deliver the performance my business partners need. What I've said is: you have a proven hardware and software stack that will be predictable, and you can pretty much expect it to perform the same way every time. You can't do that with a DIY solution. It's impossible. Your hardware/software stack could completely break because your server manufacturer did a firmware update that busted the way your SAS cards work. I don't have that problem with a solution that my vendor tested for me when I do a code upgrade on it. The number of hardware variables is pretty small. I don't fit the mold of most storage admins - I pretty much do everything on my own without the support of EMC.

YouTube, while a good example, is also a bad example. If a YouTube video doesn't load, people go to the next video. With the work that I do, if the service my company provides doesn't work, we go out of business. I would almost guess the same for you: if one of your large data stores took a dump, there would be a fairly large impact to your business.

While I do agree, yes, there are some terrified admins, that's a small view of things. I think your view of the storage landscape is a bit jaded. Anyone and any solution can provide giant dumping grounds of space. Not everyone can provide the level of performance, ease of use, and level of uptime except a few key players. I have zero problems presenting a 7-figure quote to my leadership when I know it's aligned with their goals and the level of service they expect from me. I just went to Nexenta's website and did a search for "uptime" - found 1 hit, in regards to how things are monitored. I did the same with EMC's front page, and the first document, dated 2007, talking about CX3 CLARiiONs, has five 9's of uptime right there on page 4. If a storage vendor doesn't brag about uptime, then it's a storage vendor I would be wary of. A 400% markup is silly, and I rarely if ever see that. If my sales guy tried to pull those kinds of shenanigans, then we'd just get a different sales guy.
 
but it does. all the major high traffic sites use something that is off the shelf to one degree or another.

All depends on your business needs. Do I think what Google uses for YouTube is a good fit for the work I do? Hell no. Do I think YouTube's model for storage is a good fit for, say, Oracle? Hell no.

iscsi, nfs, cifs, sas, ses, FC ... only emc/netapp have access to these?

When I said standards, I meant hardware stacks. An NS480 built 2 years ago will be the same hardware stack as an NS480 built a year ago. A DIY solution built 2 years ago will be different today.

emc/netapp use fairly standard stuff really.

To some extent, yes - disk drives, CPUs, RAM, fans, etc. But everything else? Most of it may have a known name on it, but it's built for that specific vendor. When I upgraded my V-Max 20k directors, they had EMC stamped all over them - not stickers, but stamps in the PCB. Same as my VNXs. My DataDomain boxes, same thing: DD stamped all over the PCBs.

in fact you used to be able to buy the exact same JBODs they use. they're just regular old JBODs with LSI or PMC Sierra controllers. same thing you can buy off the shelf

In EMC terms: nope, not even close. The DAEs in both the CX/NS/VNX and V-Max are made for EMC. Provide a link to prove me wrong. NetApp I can't answer for.

they use regular SAS or FC drives from seagate and wd/hitachi. only difference is they flash a special firmware that allows the drive to function in their JBODs. you did know that they flash the commodity JBODs with a certain firmware that only allows specially flashed drives to function right? you can't just use any old drive in the JBOD you have to buy their marked up drives that are made from unicorn horn.

From a mechanical standpoint, they are close to the same. But there are differences. Try looking up the part numbers from an EMC-provided drive with a Seagate label - you can't buy that model off the shelf.

this is a business decision that may or may not be your concern. in one respect it is to your advantage as you're 'the only one who can do it'. if the savings is large enough and you do a good job of letting finance know how much youre saving the company they would be fools to not keep you financially satisfied. if they dont, not your fault.

Is the amount of money saved on a cheap solution worth it when your cheap solution takes a dump and wipes out the whole company? I don't think so.

at some point enterprise needs to value people/expertise more than names on boxes. thats just my opinion.

I agree with the above statement, but there are times when the enterprise also needs to buck up and get real compute solutions instead of one-off solutions. One-off solutions will always cause problems down the road. You, as someone who has stated being in the business for the last 20 years - you can't tell me that some random one-off solution didn't punch you in the IT face at some point in time.
 