
Hitachi SMART status temps are way wrong

COW Forums : RAID Set-Up

John Davidson
Hitachi SMART status temps are way wrong
on Feb 1, 2012 at 2:33:02 am

Hi all,

Testing out a new system with ATTO R680/ ProAvio 8MS and 3TB Hitachi Deskstar drives. ATTO Config's SMART status is saying the drives are 170-211 degrees, which ATTO says might be why we're getting erratic RAID System Test speeds. Using the finger method, these drives are not that hot at all. It's reporting these temps even when the drives are cold and freshly turned on.
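For scale, here is a minimal sketch of what those readings would mean if the tool were displaying Fahrenheit. The unit is an assumption (the thread never states it), and `fahrenheit_to_celsius` is only an illustrative helper:

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Range reported in the post, ASSUMED (not confirmed) to be Fahrenheit.
for reading in (170, 211):
    print(f"{reading} F = {fahrenheit_to_celsius(reading):.1f} C")
```

Under that assumption the range works out to roughly 77 to 99 °C, which is hard to square with drives that are cool to the touch.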

Does anyone have 3TB Hitachi drives installed? If so, can you tell me what SMART status says their temperature is?

ATTO says the temperature might be fine and that Hitachi might be formatting the SMART temperature in a way where the drives are 100 degrees less than SMART says, but I can't find anywhere online whether that is accurate. If that is the case, I may have a faulty drive, because something is causing erratic speeds out of our R680. We have an R380 w/ 2TB drives in another room, and the AJA System Test graph shows beautiful even lines for read and write, but the R680 graph looks like a seismologist's worst nightmare.

My RAID is configured using settings given to me by ATTO. Initially, we got one crap drive that kept faulting the initialization and we had to RMA it. Perhaps there's a second? Sometimes the RAID just unmounts in the middle of a system test, which is not good.

Thanks in advance!

(* Hitachi claims these drives are excellent for arrays, the 3TB Deskstars are recommended on ProAvio's website for use in the 8MS enclosure, and I was personally recommended to them by the awesome ProAvio tech support.)



Petros Kolyvas
Re: Hitachi SMART status temps are way wrong
on Feb 1, 2012 at 5:04:56 am

ATTO is correct regarding the temperature possibilities:

100 - temperature is a possible SMART temp readout format. (This would, however, put your drives below zero.) Having said that, we get similar readings from WD RE drives on our R680. I wouldn't pay much attention to SMART ... it's often useless; we've seen disks fail that were reporting nothing but "in the clear" with SMART. Enclosure temperatures are a much better signpost in my very humble (and not so experienced) opinion.

From the SMART entry on Wikipedia (http://en.wikipedia.org/wiki/S.M.A.R.T.#Information_Provided):
ID 190 (0xBE), Temperature Difference from 100:
Value is equal to (100 − temp. °C), allowing manufacturer to set a minimum threshold which corresponds to a maximum temperature.
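
To make the two candidate formats concrete, here is a minimal sketch applied to the 170 and 211 readings from the original post. The function names are made up for illustration, and neither formula is confirmed to be what ATTO Config actually displays:

```python
def attr190_to_celsius(value: int) -> int:
    """Attribute 190 convention quoted above: value = 100 - temp_c,
    so temp_c = 100 - value."""
    return 100 - value


def reading_minus_100(reading: int) -> int:
    """ATTO's suggestion as relayed in the thread: the drive is
    100 degrees less than the displayed reading."""
    return reading - 100


for reading in (170, 211):
    print(reading, attr190_to_celsius(reading), reading_minus_100(reading))
```

The first formula gives negative temperatures for both readings, which is the "below zero" problem mentioned above.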


I can't speak to the inconsistencies in performance however. I hope you find the cause!

--
There is no intuitive interface, not even the nipple. It's all learned. - Bruce Ediger



John Davidson
Re: Hitachi SMART status temps are way wrong
on Feb 1, 2012 at 10:31:29 pm

Interesting, that appears to be for code 190. Code 194 is the one giving us warnings, and it doesn't seem to have a 100-temp label:

0019 10:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now 127
0020 11:48:13 WARN Disk [MK0311YHGLNS9A] SMART attribute 194 worst is now 125
0021 11:48:13 WARN Disk [MK0301YHGM9WKD] SMART attribute 194 worst is now 122
0022 11:48:13 WARN Disk [MK0311YHGLW9ZA] SMART attribute 194 worst is now 162
0023 11:48:14 WARN Disk [MK0301YHGLZ0PA] SMART attribute 194 worst is now 150
0024 12:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now 125
0025 12:48:13 WARN Disk [MK0301YHGLBLPA] SMART attribute 194 worst is now 136
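
For anyone wanting to track these warnings per drive, here is a rough sketch that groups the reported values by serial number. The regular expression is inferred only from the lines pasted above, so the real ATTO log format may differ, and the code makes no claim about whether the numbers are raw temperatures or normalized SMART values:

```python
import re

# Pattern inferred from the pasted log lines (an assumption, not ATTO's spec).
# The \s+ before the value tolerates the number wrapping onto the next line.
ENTRY = re.compile(
    r"(?P<seq>\d{4})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+WARN\s+Disk\s+"
    r"\[(?P<serial>[^\]]+)\]\s+SMART attribute\s+(?P<attr>\d+)\s+"
    r"worst is now\s+(?P<value>\d+)"
)


def worst_values_by_disk(log_text: str) -> dict[str, list[int]]:
    """Map each disk serial to its reported 'worst' values, in log order."""
    values: dict[str, list[int]] = {}
    for match in ENTRY.finditer(log_text):
        values.setdefault(match["serial"], []).append(int(match["value"]))
    return values


# Three entries copied from the log above, one of them already unwrapped.
sample = """0019 10:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now
127
0022 11:48:13 WARN Disk [MK0311YHGLW9ZA] SMART attribute 194 worst is now
162
0024 12:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now 125
"""

for serial, readings in worst_values_by_disk(sample).items():
    print(serial, readings)
```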

I have two additional drives coming in to replace one that was bad and another that seems to have a higher temperature than the others. We'll see what happens....

Thanks again Petros!

John Davidson | President / Creative Director | Magic Feather Inc.




Bob Zelin
Re: Hitachi SMART status temps are way wrong
on Feb 2, 2012 at 1:39:26 am

Hi John -

I just had a similar thing happen with the R680. And I called up ATTO, after saying "I installed a second card, and the same thing is happening - what the hell is wrong with your card".

It turned out that nothing was wrong with either R680 card. I have never seen this, but the fans in the Mac Pro stopped working, and within 15 minutes, the temperature warnings came up.

Open the side of your Mac Pro, and make sure ALL your fans are running, while you have power on.

Bob Zelin




John Davidson
Re: Hitachi SMART status temps are way wrong
on Feb 2, 2012 at 2:21:14 am

My understanding is that the ATTO cards are supposed to crank up the internal fans to full blast to compensate for the lack of an onboard fan. My issue is related to drives in a RAID 5 ProAvio enclosure. As we had a second room with WD drives and a second identical 8MS enclosure, this is what we've done to isolate the issue:

1. Swapped enclosures. It's not a faulty enclosure issue, drives still claim to be hot.
2. Plugged the RAID into an R380 card. Drives still look hot (ATTO Config SMART says 211 degrees on some drives, even when cold).
3. Swapped the drives for 2TB WD Caviar Blacks on the R680, all chilling at a nice 111 degrees. It's not an ATTO issue.

At best I can isolate the issue to being specific to how Hitachi Drives report SMART status temperature, as in real life they are not hot. With this in mind, I'm just disabling SMART status monitoring, as it seems to be only really good at giving incorrect information.
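
The elimination above can be sketched as a small lookup: find the part that appears in every run that looked hot and in none that looked normal. The enclosure labels "A" and "B" are placeholders, the first row is the original setup implied by the thread, and rows where the thread does not say which enclosure was used simply leave it out:

```python
# (parts in the chain, did ATTO Config report the drives as hot?)
observations = [
    ({"enclosure": "A", "card": "R680", "drives": "Hitachi"}, True),   # original setup
    ({"enclosure": "B", "card": "R680", "drives": "Hitachi"}, True),   # step 1
    ({"card": "R380", "drives": "Hitachi"}, True),                     # step 2
    ({"card": "R680", "drives": "WD"}, False),                         # step 3
]


def suspects(runs):
    """Return (slot, part) pairs present in every hot run and in no normal run."""
    hot = [set(parts.items()) for parts, is_hot in runs if is_hot]
    normal = [set(parts.items()) for parts, is_hot in runs if not is_hot]
    common = set.intersection(*hot) if hot else set()
    for parts in normal:
        common -= parts
    return common


print(suspects(observations))
```

This only shows which part travels with the odd readings; it does not say whether that part is faulty or merely reported differently.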

We just got two new drives in. I'll be removing the drive claiming to be the hottest and rebuilding the array tonight. With a little luck, we'll be back to nice even read / write lines and can move on to deal with other looming disasters.

John Davidson | President / Creative Director | Magic Feather Inc.



Jon Schilling
Re: Hitachi SMART status temps are way wrong
on Feb 2, 2012 at 9:40:20 pm

We did some testing and are seeing the same results, indication of "High Heat", while in actuality the drives were well within the correct operating temperature parameters with both the R380 & R680 cards.

Latest as of 2/2/12 at 2:04PM PST:

I just got wind of some new drivers. I believe John Davidson will post on an update to his original post.


Jonathan Schilling
Vertical Sales Manager
ProAvio
Main: 562-777-3488 X106
Fax: 562-777-3499
Email: jon@proavio.com










John Davidson
Update R680, Hitachi Deskstar 3Tb, ProAvio 8ms
on Feb 3, 2012 at 12:28:05 am

So, new drivers came out, I installed them, my RAID faulted, then rebuilding crashed, and now I'm just reinitializing the entire RAID and hoping that it doesn't report 'faulted' during initialization and require system restart over and over again as it did last night.

I installed the 64-bit flash firmware, and then noticed that my Lion kernel was at 32-bit, so I changed that and am now officially running the 64-bit kernel. I was not aware that Lion still ran a 32-bit kernel.

ATTO said some people had problems with the firmware released in December. The new firmware is dated 1/24/12.

Hitachi has no idea how their drives report smart temperature. None. I'm not actually surprised about this. Big corporations are awesome.

Luis and Jon at ProAvio are awesome, as always. Thank you for helping me!!! Steven from ATTO has also been really good about directing me to potential issues. Hopefully their new firmware fixes all.

As a last resort I still have the old Highpoint RAID card lying around. If I get more faults on the current initialization (right now at 12% with no faults yet) I'm going to rebuild the RAID using that card to make sure it's not a bad drive (I've replaced three already, but I don't even know if they were really bad). If it weren't for the fact that ProAvio gets such excellent performance from the combo (Read/write is 1000 and the graph line is beautiful) I'd have run for the hills.

At this point it's entirely possible that I have a wonky R680. While hard drives are definitely in short supply, I just can't imagine that I'd get so many bad drives in a single delivery. After the next initialization and running the RAID Maintenance App from ATTO, if there are STILL issues, I'll take out each drive and put them inside a Mac Pro. At least that way I'll be able to officially rule out the drives as the culprit.

Here's a grab of the AJA speed test results I was getting. If I used the 16GB test, sometimes it would just knock the RAID offline. Lame.


This is what Luis at ProAvio is getting with the same setup. Look at that gorgeous line!


Forgive the length of this, I'm merely trying to document as much as possible for other readers who encounter issues in the future. I'll continue documenting further developments.



John Davidson
Re: Update R680, Hitachi Deskstar 3Tb, ProAvio 8ms
on Feb 4, 2012 at 4:38:48 am

Last night at 2am the initialization completed with only a single error requiring a restart to continue. I then ran the RAID maintenance utility from ATTO to confirm it was healthy. The utility said the RAID was healthy. Then I ran AJA System Test. On the 2, 4, 6 and 8 gig tests, results were somewhat erratic, but halfway through every single 16 gig test the RAID would unmount, I'd get a warning about an improperly ejected disk, and then a minute or two later it would remount.

So I went into the office at 2am to swap out the RAID card with the older Highpoint 4322 that had been in the system for a year until last week's "upgrade".

Initializing didn't work on the Highpoint for some odd reason, but all drives were seen and reported excellent temperatures and healthy SMART status. So today I took out each drive, dropped them into a Mac Pro one at a time, and formatted/tested them individually to see if there was a problem. Each drive has its own 'personality' on AJA System Test, but all were essentially normal. I have screen grabs of each.

I put everything back in, went back into the Highpoint manager to create a new RAID, skipped initialization and used old data, and instantly the drive was there and ready to rock. Speed tests were slower than the R680 when it works, but were more consistent than I've actually ever gotten from this Highpoint card. For whatever reason, the Highpoint wouldn't complete the initialization, but at least I knew the drives were fine. I'm not interested in using the Highpoint any more anyway.

I also switched the cables with another room and added a new one, just to rule out the cables.

The resolution to this snipe hunt is that I grabbed the ProAvio box w/ drives, plugged it into the other edit suite with an R380, and lo and behold all my drives look good and it easily and swiftly began the initialization process via Express Initialization, which lets you mount a drive and use it at a slower rate while initializing. The RAID works great now. No dropping. No faults.

Obviously, I got a bad R680. It's a bit annoying that "do I have a bad card" was the first thing I asked ATTO before they told me my problem was possibly related to the heat warnings. Ironically, their software isn't capable of telling the actual temperature of a drive (even Highpoint seems to have that down). These 3TB Hitachis have been out a while, are recommended for RAID by Hitachi, and ATTO should know that their system reports wrong temperatures for this very common drive. Further, I got no answer when I asked what could possibly be knocking the RAID offline, presumably because the answer is "You got a bad card". This could have also been a bit easier if the download link for the R680 driver and maintenance utility had actually worked on the ATTO website last weekend. I understand faulty products get put out all the time by every manufacturer, but if I made RAID cards, I'd test them with virtually every drive in the world - there are only 2 or 3 companies that make hard drives, after all.

The process of building a DIY RAID is always a bit challenging, but I did my homework and still got hammered by it. Areca, here I come.

For what it's worth, our editing iMac w/ Promise Pegasus RAID5 came pre-striped and worked right out of the gate, no configuration required. I hope ProAvio has a Thunderbolt enclosure coming down the pipe soon. Their tech support is fantastic.



John Davidson
Resolved
on Feb 7, 2012 at 10:06:04 pm

Yesterday we put in the Areca 1882x, which is a little overkill, but we're getting a pretty awesome solid line on our AJA System Test graph at read/write of about 930Mb/s each. I'll post a grab to show it off tomorrow when all our media is done copying back to it, because it's awesome.

I really have to thank Jon and Luis at ProAvio again for their help. Their responsibility was only with the enclosure, which was never the problem, but they were still incredibly helpful in getting me up to speed and even offered to configure the system at their offices! Be sure to hit them up at NAB!

The bright side of all this is that I've learned Areca, Highpoint and ATTO setup procedures for RAID cards.

John Davidson | President / Creative Director | Magic Feather Inc.




John Davidson
Postmortem - ATTO isn't providing RMA
on Feb 9, 2012 at 8:18:11 pm

ATTO is not providing me with an RMA. The reseller doesn't want to take it back without one, but as this was an Amazon purchase, it looks like I'm going to have to eat a 20% restocking fee.

I think the worst thing about this is that I was a big ATTO fan. http://forums.creativecow.net/thread/71/861027

So let me show you the performance of the Areca 1882x card, which installed, initialized and worked right out of the gate with no problem.
This is a little off of the 900 Mb/s I was getting on an empty RAID. The RAID now has about 10 TB of data on it.




Petros Kolyvas
Re: Postmortem - ATTO isn't providing RMA
on Feb 9, 2012 at 8:31:09 pm

That's sad to hear. Apparently Areca has very highly regarded support as well!

--
There is no intuitive interface, not even the nipple. It's all learned. - Bruce Ediger



Bob Zelin
Re: Postmortem - ATTO isn't providing RMA
on Feb 12, 2012 at 8:57:21 pm

Hi John -
I am going to bash you now. I am sure that you are a very nice guy.
You bought a product, and you expected it to work.

So what has happened here? You did not buy from a VALUE ADDED RESELLER that would help you every step of the way. You wanted to buy from Amazon, get the cheapest possible price, with NO SUPPORT, and when things didn't work, you became frustrated. If you had spent more money with a VALUE ADDED RESELLER that would have helped you with the install, then you would not have suffered as you did.

The ATTO R680 card is a great card, but requires setup. So does the Areca ARC-1882x. These products just don't plug in and work. The Areca cards are notorious for having difficulty getting into the web gui. Only recently has the MRAID Utility been fixed for Areca.

What I have learned is that NONE OF THESE COMPANIES (ATTO, Areca, Highpoint, LSI Logic) are anything like AJA, who makes products that just plug in and work, and have extensive EASY TO READ AND FOLLOW documentation, and wonderful FREE tech support. AJA is the exception. To get a painless experience with drive arrays and host adaptor cards, you MUST go thru a VALUE ADDED RESELLER that knows the product.

For years I have seen people say "I bought this on Amazon" or "I bought this on B&H Photo", and they get frustrated and angry.
That is why value added resellers exist. And they charge money for this service.

I am glad you have the ARC-1882x working. It's a great card.
But so is the ATTO R680.

Bob Zelin





John Davidson
Re: Postmortem - ATTO isn't providing RMA
on Feb 13, 2012 at 3:02:11 am

I've built quite a few RAIDs at this point so I wasn't going into it green. I actually enjoy the process usually, but this time I was that one in a million guy who got a defective card. ATTO pretty much wrote me off as of last Wednesday, so believe it or not I'm thankful that I have Amazon's protections in place to return this card.

You're right in that people who are completely new to it should go through a VAR, at least until Mac Pros get Thunderbolt. After that happens I can't imagine why anyone would need to buy these types of cards, aside from large server setups.



John Davidson
Re: Postmortem - ATTO isn't providing RMA
on Dec 29, 2012 at 1:21:48 am

Update: We had an old R380 that broke when moving our server. To access the media on the old R380 RAID, I got another R680 - it works flawlessly.

I really did have a crap card.

John Davidson | President / Creative Director | Magic Feather Inc.



© 2017 CreativeCOW.net All Rights Reserved