WD Green drives in an Areca RAID
Before anyone explodes: I know that Green drives are a no-no, but the damage is done, and there is no possibility of buying Reds or enterprise drives until some point in the future.
My system, quickly: a 5,1 Mac Pro with a Small Tree 10GbE card into an HP switch with 10GbE and 1GbE ports, and an Areca 1882 with 60-odd TB of storage in RAID6 (3TB Green drives). 3 or 4 FCP clients each playing a stream of ProRes 1080i50 or so connect over gigabit ethernet through the switch (max 100Mb/s all up, really...)
I have spent hours going through the plethora of information that is the CC / Bob Zelin Finalshare goldmine, and nothing that I am doing except for having green drives is out of whack with the setup.
The drives were bought for just storing raw data, and did not have performance in mind, and they filled that capacity admirably I must say.
Now they are being edited from, and unless it is some other basic setup problem, I feel it comes down to the drives. Dropped frames seem to occur mainly when media from the array that is NOT currently being accessed comes up or begins to play. I get the feeling that the Green drives need to spin up or unpark their heads (or whatever IntelliPower thing it is that they do) to access the media, and can't do it soon enough.
My question is, does anyone have any experience with the wdidle3 program that I have come across repeatedly in my travels? It is my last ditch hope at a solution.
If anyone else can think of anything, i'd be keen to try it and to offer up any extra information required.
Thanks in advance :)
Too late. I just heard Bob's head explode from across the country.
BTW, it's not performance that makes green drives a no-no in RAIDs. When you're talking total MB/s, all SATA HDDs are pretty similar, no matter the RPMs. RAID drives have to work in teams and green drives are not designed to do that. They will make the RAID randomly go offline because the RAID controller will see drives taking too long to do certain operations and will think the drives have failed.
Production Workflow Designer / Consultant / Colorist / DIT
No matter what it costs,
you have to replace the eco drives!
The whole array will fail in the near future.
Mac pro 8core
several raid systems
Actually, Green drives have a power-saving feature: if not in use, they'll spin down. At least that's how I understand them.
Vertical Sales Manager
Proavio Storage by Enhance Technology Inc.
12221 Florence Ave.
Santa Fe Springs, CA 90670
Main: 562-777-3488 X106
It's amazing that people treat an article I wrote in early 2008 as "the gospel".
What HP switch are you using ?
Rescue 1, Inc.
Oh and the switch is
HP PROCURVE 2910AL-48G SWITCH 48-port 10/ 100/ 1000 basic L3 fixed port switch
[Murray North] "My question is, does anyone have any experience with the wdidle3 program that I have come across repeatedly in my travels? It is my last ditch hope at a solution."
WDIdle3 sounds like a possible solution. Haven't tried it myself - never worked with Greens in RAID sets. Would love to hear if it works for you.
Ideally you'd want to do the voodoo on just a few and test them out before changing the timers on production drives.
There're also warnings to increase the timers rather than disable them.
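For anyone finding this thread later, here's a rough sketch of how wdidle3 is typically run from a DOS boot disk, one drive at a time. The flags and the 300-second maximum are from memory of the WD tool's documented usage, so verify against the version you actually download before touching any production drives:

```shell
# Boot from a DOS USB stick with WDIDLE3.EXE on it, with one Green attached.

# Report the current idle3 (head-park) timer:
wdidle3 /r

# The safer option: raise the timer to its 300-second maximum
# rather than disabling it outright:
wdidle3 /s300

# The riskier option some guides suggest - disable the timer entirely:
# wdidle3 /d

# Power-cycle the drive, then re-run "wdidle3 /r" to confirm the change stuck.
```

As Alex says, do this on one or two spares first and watch their behavior before committing all 48.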
Unless I'm mistaken, aren't the green drives all running at 5400RPM?
Yet another reason to avoid them.
David Roth Weiss
Sales | Integration | Support
David is a Creative COW contributing editor and a forum host of the Apple Final Cut Pro forum.
I'm intrigued by your open-mindedness. Perhaps we should listen to this unique experiment.
Mac pro 8core
several raid systems
This is the same story as the post above this one. This gentleman could have purchased the correct drives, but he found the absolute lowest price possible, and this was MUCH more important than spending two minutes to read the specs on the drives. Why ? Because it costs SO MUCH LESS MONEY. And now he will suffer.
Don't worry, there will be plenty more posts like this soon.
Rescue 1, Inc.
I actually like the idea of sticking it to the man: instead of paying $50 extra per drive just for the firmware upgrade and maybe $5 worth of hardware (on Red drives), try to use cheap stuff. Google, MS, FB, Amazon - they all get away with it - they use dirt cheap drives with triple redundancy, backed by really good risk management science. A small potato like me wants to get away with that too, and it's a noble pursuit, in a way. Stick it to the man.
That said, I am a VAR. If a client asks me for an 8-spindle parity RAID, I have to go by the book and use enterprise class drives unless I get a waiver from the client that he understands the risks of using non-enterprise drives.
Haha I love all these responses! Bob I am suffering and will likely continue to, but that's all part of the fun! Okay so, whilst in theory I understand that perhaps my green raid should be falling to pieces and failing, it and others like it are kicking ass at backing up our editshares and storing raw camera data etc.
The drive failure rate is unremarkable, though certainly higher than RE drives, and the RAIDs do not degrade every 10 minutes from drives spinning down. We have ~300TB of this kind of storage and I'd go so far as to recommend the ol' Green drives for this purpose.
However that is neither here nor there as this particular raid is being used for editing and seems to be causing the problem.
Running this wdidle on 48-odd drives one by one on some DOS system is a daunting task, and I hope to avoid it. But yes, Alex, I plan on trying to find a way to test this first, although I don't think I really have the hardware to do so.
Does anyone have any input as to whether my symptoms can be found on RE drive systems, or perhaps if Red or Black drives are up to the task, or if THEY exhibit similar symptoms as well?
Thanks all :)
Please explain the setup a bit more. The dropped frames occur where? On your machine (which, I assume, has the storage attached, i.e. is the host) or on the machines accessing the storage via 1Gb Ethernet?
While green drives are very likely to drop out of a raid sooner or later, their performance should be good enough for a couple of ProRes streams, especially in larger raid setups.
What are you connecting to with the 10Gb card? Have you checked your clients' network bandwidth? Are you running Qmaster on the same network? That's where I would start digging. Jumbo frames on?
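A few quick checks along those lines, runnable from a client Mac's Terminal. The hostname is a placeholder, and the 8972-byte figure assumes a 9000-byte MTU (9000 minus 20 bytes of IP header and 8 bytes of ICMP header) - adjust for whatever MTU your switch and NICs are actually configured with:

```shell
# Confirm the MTU the client interface is actually using:
ifconfig en0 | grep mtu

# Test that jumbo frames survive the whole path: send a near-max-size
# ping with the don't-fragment flag set. If jumbo isn't enabled end to
# end, these fail while ordinary pings succeed.
ping -D -s 8972 -c 4 server.local      # macOS: -D sets don't-fragment
# Linux equivalent: ping -M do -s 8972 -c 4 server.local

# Measure raw client-to-server bandwidth (requires iperf3 installed,
# with "iperf3 -s" already running on the server):
iperf3 -c server.local
```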
"You also agree that you will not use these products for... the development, design, manufacture or production of nuclear, missiles, or chemical or biological weapons."
iTunes End User Licence Agreement
Frank writes -
Please explain the setup a bit more.
REPLY - I have asked Murray what switch he is using, but he has not replied. Does he have an HP 2910al ProCurve like EditShare uses, or is it some piece of crap ? Who knows.
Frank writes -
What are you connecting to with the 10GB card?
REPLY - he has already stated that he has a Small Tree 10GbE card in his server computer. Now, is it configured correctly? Does he have the correct driver? Should we assume that it is going to the SFP+ port on the HP ProCurve (is it a ProCurve, and does he even have an SFP+ - maybe it's an SFP!)? Has he checked the CLI to even see if he is getting proper communication?
Frank writes -
Have you checked your client's network bandwidth?
REPLY - does he know how to do this ?
Frank writes - Are you running Qmaster on the same network? That's where I would start digging.
REPLY - very good point.
Frank writes - Jumbo frames on?
REPLY - if he doesn't know how to configure the switch, and it's at MTU1500 by default, then even if Jumbo is on, it's doing nothing.
He bought a Small Tree 10GbE card - why didn't he rely on Small Tree to provide a solution for him? Frank, if you've been doing this for a while, you know the answer. He got a DEAL on the HP switch. It's probably used. In his research he probably spoke with Small Tree and passed out when they told him what they wanted to help him (and you would have been too expensive too, Frank). So of course, there are lots of variables, and from this brief description, the WD Green drives are an obvious first possible issue, but there could be countless issues, as you have just pointed out in your post.
This is what happens when you try to do it yourself. Close - but no cigar.
Rescue 1, Inc.
Thanks for the responses.
I didn't want to get too bogged down in the network setup as my hunch is that it is a problem with the drives. Once I have eliminated the drives as a problem, if any of you have the patience I will outline my setup and see if I can find any problems there, but at this stage I don't want to waste anyone's time.
Thanks for those suggestions Frank, after reading extensively through these forums however, I feel as though I have ticked all those boxes unfortunately.
For your interest the switch is a HP PROCURVE 2910AL-48G SWITCH 48-port 10/ 100/ 1000 basic L3 fixed port switch.
Currently, however, I am just keen on working out whether the drives are presenting an insurmountable hurdle to my shared storage system here, and whether maybe wdidle3 or something else can magically save the day.
Will get back soon, thanks all :)
you have an excellent switch. This is the same switch that EditShare often uses for their systems (they also use the Fujitsu XG0224). So, it's unlikely that you have an issue with your switch. And of course, the Small Tree 10GbE card is excellent as well.
Rescue 1, Inc.
The symptoms you describe could be network or drive related. While you have good network hardware, the cabling quality is unknown, and cabling is a top source of network problems at even much lower speeds than 10GigE, which is all the more finicky about the IEEE rules on cable lengths, bend radii, proximity to ballasts and other electrical equipment, pinching, and pressure (think stapling cables to a wall, or a table leg squishing one: that's BAD). You should be able to isolate this by putting a single USB drive on the server and doing some basic (large) file copies with curl or rsync, while checking performance in iotop or equivalent, to see whether you're getting source read stalls from the array or whether it only happens over the network. If it's sourced at the array, then you'll need to figure out why that's happening, and it wouldn't surprise me if it's bad sectors on new green drives.
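A minimal version of that isolation test from the server's own Terminal, assuming the array mounts at /Volumes/RAID and a spare USB drive at /Volumes/USB (both paths and the filename are placeholders). If local reads stall, the array is suspect; if local reads are clean but network playback still hiccups, look at the network instead:

```shell
# In a second Terminal window, watch per-disk throughput as you test:
iostat -d -w 1

# Read a large file straight off the array, bypassing the network.
# Stalls or wildly uneven throughput here point at the array itself.
dd if=/Volumes/RAID/bigclip.mov of=/dev/null bs=1m

# Copy the same file to a local USB drive as a sustained read test:
rsync --progress /Volumes/RAID/bigclip.mov /Volumes/USB/
```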
WDC explicitly proscribes the use of Green, Blue, and Black drives in anything other than raid1 or raid0. The Red's are proscribed in arrays comprised of more than 4 disks. The RE's are recommended for 5+ drive arrays.
Thanks for the input, Chris. We addressed any cabling issues very early in the piece and had high-grade Cat6 installed to eliminate the chance of this being a problem.
Just to sign off on this whole saga, we discovered that the problem wasn't just random dropped frames; it was specifically FCP having problems as soon as it had to access media from a different raid array than the one it was currently looking at. That is, if you were playing back media from only raid A or only raid B, it would be fine, but if you had a sequence with media on raid A, and the playhead approached media from raid B, Final Cut would drop frames and potentially not recover for 5 or 10 seconds at worst.
We've installed a new raid box, media managed all the important media onto it, and now the editors work exclusively from it, and it works capably. To anyone who says green drives can't work in a raid, and work reliably and well, respectfully, you are wrong. That isn't to say that we aren't buying red drives from here on in, but whatever the problem is, and it may still be a green drive thing, it can still work in certain circumstances.
Thanks everyone for their help though, and I hope that this can be of help to someone else. Feel free to message me if you are having similar problems.
Yeah, about the green drives and raid thing. I don't know that anyone said you can't do it. Just that use in raid isn't recommended, including by WDC. In fact WDC says these drives are for secondary usage, implying they don't recommend them for boot drives either. I don't see the point in arguing with a manufacturer who is basically saying in a marketing data sheet "we really don't want your money for your intended use case."
Further, it's just a matter of time before there will be problems with these drives. Forums everywhere are full of such stories of raid5's collapsing when green drives are used. The common sequence is: one drive dies or takes too long in error recovery for the controller, controller kicks out the drive or resets the bus, user replaces the bad drive (which may or may not be bad) and then all it takes is a single bad sector to cause either another kicked drive, bus reset, or an actual sector read error. In all of those scenarios the raid5 rebuild halts, and it's no longer merely degraded it has collapsed. So then people freak out because only one drive died and this isn't supposed to happen, blah blah blah.
The real problem with the drive is that the ERC is too long, and it can't be configured with any of the recent Greens. If you want to play with fire, set the controller error time out so that it's at least 121 seconds to give the drive enough time to actually report a read error, so that the bad sector is repaired by the raid controller. And also do regular scrubs.
Of course, in the meantime, your application must be able to gracefully contend with up to 2 minute hangs while the drive sorts out whether or not the data on that sector can be read or recovered. Many applications get pretty pissy (let alone the user) when there's an IO delay of 30 seconds, let alone 2 minutes.
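On drives that support it, the ERC timeout Chris describes can be read and set with smartmontools. This is only a sketch: /dev/sdb is a placeholder, the value 70 means 7.0 seconds (units of 100 ms), and on most recent Greens the command will simply report that SCT ERC is unsupported - which is exactly the problem:

```shell
# Read the drive's current error-recovery-control timeouts:
smartctl -l scterc /dev/sdb

# Try to set read and write ERC to 7.0 seconds (70 x 100 ms),
# comfortably inside a typical RAID controller's timeout window:
smartctl -l scterc,70,70 /dev/sdb

# Note: on many drives the setting does not survive a power cycle,
# so it has to be reapplied at every boot (e.g. from a startup script).
```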
And it's not like it's a whole lot better on the Seagate consumer side, where they now have in their marketing spec sheet, under Reliability, a 2400 hour power-on spec. That's 100 days at 24x7. A Google or Amazon, if they were even to use such a drive, would bust through that spec on day 101, and exceed it by a factor of 7 before the warranty was up.
There's no good reason for these companies to honor warranties at all when drives are used in situations that are plainly proscribed.
Sorry Chris, my post wasn't a stab at you. I think somewhere earlier in the thread someone had said that it was impossible and the worst idea ever, so it was more a nod to that. At any rate, you clearly have more drive and RAID knowledge than me, and I'm sure everything you say is true. But all I can say is that, for better or worse, we have a few hundred TBs of green drive raids, and we follow every precaution we can, regular verifies and the rest of it, and to date we have lost no data. I haven't had to deal with random hangs, and rebuilds go without issue. I might just be very lucky, but I am just reporting what has happened to me.
Thanks for the tips though :)
I didn't take it as a stab, and even if it was, I'm fairly immune. I think the issue you're likely to see is marginally bad sectors creeping in that aren't detected during normal or scrub operations because the drive firmware is designed to mask such problems. The point at which they're unrecoverable is when the drive times out and finally reports a read error. And only on a read error can the controller rebuild the affected chunk from parity and cause the bad sector to be overwritten, at which point the firmware will determine if the bad sector is transient (it just needed to be rewritten) or if it's persistent, and if it is persistent then the firmware will remap the LBA(s) to a reserve sector and write the data. The ability for parity raid to "self-heal" in normal read operations and in scrubs in the described manner is thwarted with Green drives. The use case and design goal are incongruent, and that is what technically nullifies the warranty.
With ~300TB of Green drives I think you'll see untimely collapse of an array rather than the normal degrade and rebuild. With this many drives the proper drive from WDC is the Se. Even the Red is limited to 5 drives per array, so if you're over that, technically they could deny warranty because the use case and design goals aren't compatible. They say this rather plainly on the marketing spec sheet.
So that's the extra long version of what "impossible" and "worst idea ever" probably translate into.
Also note that the WDC Blue and Black also are not meant for anything other than raid0 or 1. The first applicable drive is the Red, but that's for 5 or fewer disk arrays. More than that and it's the Se.
[Chris Murphy] "WDC explicitly proscribes the use of Green, Blue, and Black drives in anything other than raid1 or raid0. The Red's are proscribed in arrays comprised of more than 4 disks. The RE's are recommended for 5+ drive arrays."
..and I thought WDC "proscribes" this (and cripples the firmware on Greens, Blues and Blacks) primarily because they want to sell pricier models, not because there is some other scientifically sound reason to? :)
Is Red anything but a Green with a slightly modified firmware that somehow costs $50 extra to the end user?
WDC Red 2TB = $119
WDC Green 2TB = $109
WDC Red 3TB = $150
WDC Green 3TB = $128
It's $10 and $22 respectively. Most people are going to save a couple hundred bucks. At least Murray has "saved" in the realm of $2k-5k, but I think they're going to end up being more hassle than they're worth.
But if you look at the specs, they match up very much the same, although the power consumption and acoustics are slightly different and I doubt that comes from firmware alone. If they are identical, WDC could be sorting them based on test performance data after assembly, before labeling and setting a key in firmware that makes the drive behave according to its model. So it might even be the same firmware. The real question is, why does it matter?
Back in the late 90's I consulted with a TV station about to supplement their Quantels with a Mac-based video editing station. I can't for the life of me remember the company/brand, but what I do remember is that the various products were in fact a single software and hardware base, with feature sets unlocked by firmware in the hardware card and a serial number (or possibly a dongle) for the software. The difference in price was in the realm of $5000 for the low end and $30,000 for what the customer ended up buying. Same hardware and software installed, enabled by firmware and a code.
Today I work primarily in color management and the same thing is common, you get identical hardware, but features are unlocked depending on what you paid for, all of it being dongle protected.
And this was extended several years ago to CPUs. Intel disables some of the cache and the cores and sells those chips as a different brand (Core i7) than the ones with all cores and cache enabled (Xeon). You get what you paid for.
So I don't see why it matters. You want to save $10-50 and get the exact wrong firmware behavior? Good luck with that, but I don't see it as sticking it to the man when you make that choice, you're sticking it to yourself.
[Chris Murphy] "You want to save $10-50 and get the exact wrong firmware behavior? Good luck with that, but I don't see it as sticking it to the man when you make that choice, you're sticking it to yourself."
Sticking it to the man doesn't mean shooting yourself in the foot. I never used Green drives in RAID and don't plan to.
Sticking it to the man means you have to be smart about it and mitigate the adverse effects of using eco drives - smartly, not stupidly. All I said was: there are ways to do it. Did you read my post? :)
Yes, there are also smart ways to use a pistachio shell as a water filter.
So you aren't using or recommending Green's, and yet in the previous post your two questions insinuate that either the difference in firmware behavior is unimportant/meaningless, or that somehow WDC is acting immorally by charging more for different firmware behavior. And you're also discounting the 3 year versus 2 year warranty, and the fact that the Green's 2 year warranty is technically invalid per the marketing spec sheet if you use it in raid. So it's a three year versus 0 year warranty, for a $22 difference. And you think this is sticking it to the man. Yes I have read your post and the point escapes me, other than something in between "ill advised" and "rather tedious to mitigate", can still be done. Actually the pistachio shell as a water filter makes vastly more sense, at least it's compostable.
Meanwhile I'm informed off-list that bin sorting is not used for mechanical drives. So it seems more likely there is in fact a physical difference between Green and Red models, no matter how similar they look from the outside or on paper. Obviously WDC trusts the Reds for an additional year of usage on top of the fact it's qualified for 24x7 use and the Green is not.
[Chris Murphy] "yet in the previous post your two questions insinuate that either the difference in firmware behavior is unimportant/meaningless, or that somehow WDC is acting immorally by charging more for different firmware behavior"
Point: green drives can be used where the difference in their behavior (with reds or other similar drives) is either irrelevant or well mitigated. Didn't I say it three or four times already? If it still escapes you, perhaps contact me off list?
At the price differences you mentioned - no point. When the price difference was to the tune of 30-50% (a year ago or so, I think) - there clearly is a point.
Oh, and the fact that using green drives wasn't the culprit in OP's issue: did that point escape you too, or are you still out to make compostable water filters?
USB stick RAID can be used where differences in their behavior is either irrelevant or well mitigated.
If you have a mitigation example for using Green drives in raid, that might be a useful qualification.
Even at a 50% premium for Red vs Green, the Green's warranty is invalidated by using it in any raid, so is that really worth 1/2 off, in addition to having the wrong ERC timeout?
Yes I did notice the OPs problem wasn't related to the use of Green drives, because his thread update was in reply to my belated comment, in which I didn't focus on Green drives being the culprit.
[Chris Murphy] "If you have a mitigation example for using Green drives in raid, that might be a useful qualification."
MSS, RAIDZ, Linux RAID(6), Drobo BeyondRAID. I believe big cloud players use similar schemes (perhaps geographically distributed and with triple redundancy) that would well mitigate the use of green drives - if they don't already. I do know they use desktop drives - which of course invalidates their warranty, right?
[Chris Murphy] "USB stick RAID can be used where differences in their behavior is either irrelevant or well mitigated."
First compostable water filters, now USB sticks. I take it you're trying to make the idea of using green drives in RAID sound outlandish? :)
You do have a point about potentially invalidating drives' warranty and about the difference in cost being too minor. That changes nothing about mine.
I can't speak to MSS, but I can the other three. None by themselves mitigate the consequences of the WDC Green's high SCT ERC timeout. RAIDZ can of course detect errors that the drive cannot, but before ZFS (or Btrfs or ReFS) kick in with their own ECC, the drive ECC is in charge. When it encounters a problem, it will attempt error recovery itself for up to 2 minutes before it sends the data to the controller. If you're lucky, the drive errors out much sooner and ZFS can then correct for that, as can the linux md driver. But likewise, the md driver can't work around the high time out either. It simply must wait. The linux SCSI layer by default has a lower timeout than these drives (as do many hardware raid controllers), so if the SCSI layer timeout is reached before the drive timeout, the drive is reset. The drive then can't report a read error or what sector was experiencing a problem to the kernel, and thus it isn't fixed by either RAIDZ or the md driver.
With some luck, the drive errors out sooner, and then either ZFS or the md driver will receive the read error from the drive which includes the sector LBA, and the data can be reconstructed from parity, handed off upstream to your application and downstream back to the drive to overwrite the bad sector. This is far less certain and common when the drive has high ERC timeouts though. And that timeout isn't configurable on the Green. Regular scrubs are likewise adversely impacted by the high ERC timeout of the drive. The timeout is critical to getting the correct data for the bad sector and causing it to be overwritten.
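For what it's worth, on Linux md setups the usual workaround for a drive with non-configurable ERC is the opposite adjustment: raise the kernel's per-device command timeout above the drive's worst-case internal recovery time, roughly along the lines of the 121-second figure mentioned earlier. A sketch, with the device name as a placeholder:

```shell
# The Linux SCSI layer's default command timeout is 30 seconds:
cat /sys/block/sdb/device/timeout

# Raise it past the drive's worst-case ~120 s internal error recovery,
# so the drive can report the failing sector instead of being reset:
echo 180 > /sys/block/sdb/device/timeout

# The value resets at reboot, so apply it per-boot, e.g. via a udev rule:
# ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd[a-z]", \
#   ATTR{device/timeout}="180"
```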
As for Drobo, well, I think it's weird they'd want to support something that WDC themselves do not. And after Scott Kelby's Drobo experience, I wouldn't trust it for photos, let alone video I cared about, regardless of the drives it uses.
Now, I'll grant that using a Green drive in raid can be done, and can work, but with such huge PITA caveats that it's just not worth it in the likely context of this forum. Google, Amazon, I don't know if they use these drives, but they do use consumer drives for certain use cases, I wouldn't be surprised if they get a great deal for a DOA only warranty. But they also use distributed file systems. So they can lose not just a whole array, not just a whole rack of storage, but an entire data center, and keep plugging along. They don't use distributed file systems to mitigate using cheapo drives. They use cheapo drives in some cases because they have a distributed file system in place for many reasons other than storage failure mitigation. And when Google have problems with this system the delays aren't just a few seconds, so I'm skeptical if it's adequate enough for the demands of video work.
A mere mortal could pre-emptively mitigate bad sectors by periodically zeroing the drive (or writing anything; even faster would be ATA Secure Erase), which forces the firmware to remap any sectors with persistent write failures, and also by doing a weekly SMART extended offline test using smartmontools. Both probably qualify as PITA for most people. And then, moving data around to free up drives for their regular wiping, there's the increased risk of getting hit with silent data corruption.
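The self-test half of that is straightforward with smartmontools; the zeroing half is destructive, so the dd line below assumes /dev/sdb is a placeholder for a drive you have already rotated out of the array and whose contents you mean to destroy:

```shell
# Kick off a SMART extended (long) offline self-test:
smartctl -t long /dev/sdb

# Check the results once it finishes (hours later on a 3TB drive):
smartctl -l selftest /dev/sdb

# DESTRUCTIVE: overwrite the entire drive with zeros, forcing the
# firmware to remap any sector with a persistent write failure:
dd if=/dev/zero of=/dev/sdb bs=1M status=progress
```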
So yes, USB stick raid is probably more ridiculous than using Green drives in raid. But compostable pistachio shell water filters are a better idea.
Bringing this thread back from the dead =)... I just wanted to add a few things related to WD Green drives and Areca 1880 cards.
I've been using an Areca 1880i with 5 x 2TB Green drives in RAID5. I got them a couple of years back when they still had a 3-year warranty. Started with 4 drives and migrated to 5 drives a few months back (long story, but I needed drives for other purposes). Monthly scrubs. The array is constantly accessed, so it rarely "parks".
I have just recently attempted to replace my 5 x 2TB Greens with 5 x WD Se 4TB drives, and it turned out a failure. Aside from one DOA, two more failed from too many sector errors after a few days' use. Only two survived. Returning all.
Notes related to Hard drives:
Notes on Areca 188x cards:
Hope someone might find this useful.