
All fibre channel SAN users - read this if your SAN starts acting strange.

COW Forums : SAN - Storage Area Networks

Francois Stark
All fibre channel SAN users - read this if your SAN starts acting strange.
on Sep 9, 2006 at 3:27:32 pm

Hi

We've been running SanMP for about 17 months now, and this week we had our first relatively large problem. The problem worsened at the same time as I upgraded SanMP from V1.5 to V1.6 to get the sync working again, so I'm not quite sure whether this version had anything to do with it.

After upgrading the software, some random machines would unmount SAN volumes. And you know when you unmount a media drive while FCP is running, it crashes... That machine could be rebooted by itself. After this, one specific edit suite, Edit 3, would normally also drop a SAN volume. When we rebooted Edit 3, EVERY TIME the whole SAN would crash. Dead - all drives unmount. 4 FCP suites and 3 FCP suites. All clients storm out and act all upset. Boo hoo. Stress. Lots of it...

This gradually got worse. I uninstalled sanMP V1.6 and went back down to V1.5. The problem worsened, but seemed to be isolated to Edit 3. The ADTX drive array's software did not show any issues - all volumes and drives were functioning normally.

Eventually I figured out that the problem was on Edit 3, which runs dual fibre links to the QLogic switch. Our QLogic fibre switch's performance monitor showed me that the two fibre channels were not running symmetrically. MMMmmmmm.

So I pulled out the one fibre cable. Problem gone. I swapped cables, LC converters and ports on the Apple (LSI) fibre channel card, and positively isolated the problem to one port on the Apple fibre card.

So, if your SAN starts acting strange, watch the fibre ports' data throughput carefully. It can show you where the problem is.
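For anyone who wants to automate that kind of check, here is a rough Python sketch. It assumes you can log per-port throughput readings for the two links of a dual-link machine, and that the two links are expected to carry roughly equal load. The function names, the 0.3 threshold, the idle cutoff and the sample numbers are all made up for illustration; none of them come from the QLogic switch or its performance monitor.

```python
def link_imbalance(rate_a, rate_b):
    """Relative difference between two link rates.

    Returns 0.0 when both links carry the same load and 1.0 when
    one link carries everything and the other is idle.
    """
    total = rate_a + rate_b
    if total == 0:
        return 0.0  # both links idle: nothing to compare
    return abs(rate_a - rate_b) / total


def flag_asymmetric_samples(samples, threshold=0.3, min_total=10.0):
    """Return the indices of samples where the two links look unbalanced.

    samples   -- list of (rate_a, rate_b) readings in MB/s
    threshold -- hypothetical imbalance cutoff (assumption, tune to taste)
    min_total -- skip readings below this combined rate, since near-idle
                 links are naturally lopsided (also an assumption)
    """
    flagged = []
    for index, (rate_a, rate_b) in enumerate(samples):
        if rate_a + rate_b < min_total:
            continue
        if link_imbalance(rate_a, rate_b) > threshold:
            flagged.append(index)
    return flagged


# Illustrative readings only, not real measurements.
samples = [(95.0, 92.0), (110.0, 12.0), (0.5, 0.0), (80.0, 5.0)]
print(flag_asymmetric_samples(samples))  # prints [1, 3]
```

A machine that keeps showing up in the flagged list while under load is a candidate for the cable, converter and port swapping described above.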

Now I must just pick up the courage to try sanMP V1.6 again...

Luckily we did not lose any data this time, or damage any volumes. We have in the past had some damaged SanMP volumes, which we could no longer mount with write access. Eventually we had to copy all data off, re-format and copy all data back to solve it. This has happened twice in the past 17 months, but it could have been Edit 3's faulty fibre port that caused it...

Regards
Francois







chrispy
Re: All fibre channel SAN users - read this if your SAN starts acting strange.
on Sep 11, 2006 at 7:09:21 am

Hey there Francois...sorry to hear about your woes...I have my own share with one site, for which we still have not found a solution.

SANmp 1.5 is running on an Apple XServeRAID (14 drives), and there are 2 x FCP, 1 x Avid and 1 x Nuendo audio systems attached. On the Avid and Nuendo it works fine, but on the 2 FCPs we kept getting dropped frames during ingest and loss of sync during playback, and we tried all codecs, right down to DV. The Kona benchmark shows the throughput is more than sufficient for even 10bit SD uncompressed. So we tried all sorts of things ...update to SANmp 1.6, unconvert-->initialize-->convert back to SANmp volume, swap FC HBAs (Apple & ATTO), new OSX installations, bypass FC switch...nothing worked. It kept giving the same dropped frame error. We even bypassed FCP and used the AJA VTR Xchange...same problem. On the last attempt, we bypassed everything and connected the G5 straight to the XServeRAID as a standard HFS+ volume. Same problem...so we gave up...for now at least. The only thing we have not tried is to redo the LUN on the XServeRAID, but I'm hesitant because it takes ages to finish. Unfortunately they already have the XServeRAID there; if not, I'd sell them an ADTX, which I've installed a few times, and all of those are working fine.

If you have any idea on the above, I'd like to hear your opinion. Thanks.

-chrispy



Francois Stark
Re: All fibre channel SAN users - read this if your SAN starts acting strange.
on Sep 11, 2006 at 4:56:55 pm

Hi Chrispy

The only time I had intermittent problems with dropped frames was when sanmp's autosync feature was switched on. That was my first thought as I read your description.

But if you ran the FCP systems direct from the Apple drive array, hfs+, you're not using sanmp, which means you were not using autosync. Were the other two machines still connected to the san at that stage?

If not, it seems like there is a serious problem with the Apple XServeRAID array, because you were running it as a pure standalone storage unit and it still dropped frames! In that case, your problem has nothing to do with the san.

regards
Francois




chrispy
Re: All fibre channel SAN users - read this if your SAN starts acting strange.
on Sep 11, 2006 at 10:15:57 pm

Hi Francois,

We eliminated one item at a time, to the point where SANmp was no longer involved, and although we do not have an answer at this stage, it does sound like an XServeRAID issue. It happened on two different FCP systems with identical results. However, the other system, which is an Avid Media Composer, ran without any problem at all...so it was difficult for us to fault the XServeRAID unit.

Initially we thought Autosync was turned on, but we checked and it was off all the while. We did some research on the XServeRAID and were advised to update the firmware and check the controller settings, and they were all set the way it was recommended. Also, when we capture to the internal SATA drive, there is no problem. Given all the combinations of hardware and software we've tested, the only thing that is constant is the XServeRAID, so it is likely there is something wrong with it.

-chrispy



© 2017 CreativeCOW.net All Rights Reserved