
File opened via AFP Xsan re-share causes XSAN restart/lockup

by Ryan Berdinka on May 29, 2012 at 2:13:45 pm

Posted this on Xsanity a while ago but didn't really get any bites. Hoping someone here might have some insights...

Wondering if anyone has seen this before or has any ideas on this…

Having an issue when a Microsoft PowerPoint or Word document (version 2011 for Mac) is opened on a client that is accessing the Xsan through an AFP re-share. I know, I know - these kinds of files should not be stored on the SAN… but scripts and things tend to get put with the rest of the project media files. Anyway, when the Microsoft document is opened it causes the computer to freeze (spinning beach ball) and then the Xsan volume disconnects from all the other clients (both AFP and Fibre connected). From what I can tell the Xsan volume tries to fail over to the Backup MDC, but the Backup MDC also freezes, which locks up all network accounts.

This has happened multiple times now - each time it is triggered by opening a Microsoft document. The only way I have been able to get things going again is to shut down/hard power both servers and bring them up again… testing the failover between both Primary MDC and Backup MDC. Wondering if this is some issue with the AFP re-share that is causing larger issues overall. I wouldn't think that a simple PowerPoint document could cause something so major but I can't seem to find any other explanation.


Here is the setup:

• 2 Xserves running 10.6.8
    - Primary Xsan MDC, plus the following services:
        - Open Directory Replica
        - Secondary DNS
    - Backup Xsan MDC, plus the following services:
        - Open Directory Master
        - DHCP
        - Primary DNS
        - AFP
        - SMB
        - Groupware (iCal, iChat…)
• Xsan 2.2.2
• Separate Ethernet networks for Public & Metadata
• QLogic SANbox 5600 & 5602 Fibre switches with stacking ports
• Promise VTrak E/J-Class Storage Units - about 25TB storage
• 9 Fibre clients - all 10.6.8
• 5 AFP clients - all 10.6.8



Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Jordan Woods on May 29, 2012 at 3:54:31 pm

You should pull the logs on the MDCs for the exact moment this happens. Or, if you feel like reproducing it, you can open Terminal or an SSH connection to each MDC and run "tail" on the system log to watch what leads up to the hang. The command might look like: tail -f /var/log/system.log
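
A minimal sketch of narrowing that tail down, assuming the Xsan daemons log under the names fsm and fsmpm and that KernelEventAgent reports the mount state. `xsan_lines` is a made-up helper name, not an Xsan tool, and the output file name is only a placeholder:

```shell
# Hypothetical helper: keep only syslog lines from the Xsan daemons
# (fsm, fsmpm) and from KernelEventAgent, which reports unresponsive mounts.
# --line-buffered makes grep pass each line on immediately when piped.
xsan_lines() {
  grep --line-buffered -E ' (fsm|fsmpm|KernelEventAgent)\[[0-9]+\]: '
}

# Usage on an MDC (runs until interrupted with Ctrl-C):
#   tail -f /var/log/system.log | xsan_lines | tee ~/xsan-hang.log
```

Saving the filtered lines with tee means the lead-up to the hang survives even if the Terminal session has to be killed.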

Also, is this triggered by a single document or by any Microsoft document?


-Jordan



Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Ryan Berdinka on May 29, 2012 at 8:06:09 pm

Hello,

Thanks Jordan for replying. No, this does not seem to come from a single Microsoft document, as it has happened multiple times with various Microsoft Office 2011 documents (Word & PowerPoint). I will also add that if a document is opened on a workstation with a Fibre connection, it does not cause any problems.

Below is a short export from the All Messages log on the Backup Xsan MDC (the server that is re-sharing the volume via AFP). There is plenty more information of course, but this is a good start.

May 10 13:37:16 dc1 KernelEventAgent[79]: tid 00000000 received event(s) VQ_NOTRESP (1)
May 10 13:37:16 dc1 KernelEventAgent[79]: tid 00000000 type 'acfs', mounted on '/Volumes/XSAN', from '/dev/disk8', not responding
May 10 13:37:16 dc1 KernelEventAgent[79]: tid 00000000 found 1 filesystem(s) with problem(s)
May 10 13:37:17 dc1 fsmpm[288]: PortMapper: Initiating activation vote for FSS 'XSAN'.
May 10 13:37:22 dc1 fsm[292]: Xsan FSS 'XSAN[1]': Windows Security has been turned off in config file but clients have been requested to enforce ACLs. Windows Security remains in effect.
May 10 13:37:22 dc1 fsmpm[288]: PortMapper: Reconnect Event for /Volumes/XSAN
May 10 13:37:22 dc1 fsmpm[288]: PortMapper: Requesting MDS recycle of /Volumes/XSAN
May 10 13:37:22 dc1 KernelEventAgent[79]: tid 00000000 received event(s) VQ_NOTRESP (1)
May 10 13:37:22 dc1 fsm[292]: Xsan FSS 'XSAN[1]': SNFS Client 'edit2.comp.san' (10.0.0.17) disconnected unexpectedly from file system 'XSAN', reason: client socket shut down
May 10 13:37:23 dc1 com.apple.xsan[62]: xsan:perfDispatchMicroseconds = 791875
May 10 13:37:23 dc1 com.apple.xsan[62]: xsan:perfFunctionMicroseconds = 792145
May 10 13:37:23 dc1 fsm[292]: Xsan FSS 'XSAN[1]': SNFS Client 'edit2.comp.san' (10.0.0.17) disconnected unexpectedly from file system 'XSAN', reason: client socket shut down
May 10 13:37:27: --- last message repeated 4 times ---
May 10 13:37:28 dc1 fsm[292]: Xsan FSS 'XSAN[1]': SNFS Client 'edit2.comp.san' (10.0.0.17) disconnected unexpectedly from file system 'XSAN', reason: client socket shut down
May 10 13:37:46: --- last message repeated 17 times ---
May 10 13:37:46 dc1 fsm[292]: Xsan FSS 'XSAN[1]': PANIC: /Library/Filesystems/Xsan/bin/fsm ASSERT failed "rangep->headp == NULL" file range_ops.c, line 387
May 10 13:37:46 dc1 fsm[292]: PANIC: /Library/Filesystems/Xsan/bin/fsm ASSERT failed "rangep->headp == NULL" file range_ops.c, line 387
May 10 13:37:46 dc1 KernelEventAgent[79]: tid 00000000 received event(s) VQ_NOTRESP (1)
May 10 13:37:46 dc1 KernelEventAgent[79]: tid 00000000 type 'acfs', mounted on '/Volumes/VMG_XSAN', from '/dev/disk8', not responding
May 10 13:37:46 dc1 KernelEventAgent[79]: tid 00000000 found 1 filesystem(s) with problem(s)
May 10 13:37:46 dc1 fsm[292]: Xsan FSS 'XSAN[1]': PANIC: wait 3 secs for journal to flush
May 10 13:37:46 dc1 fsm[292]: Xsan FSS 'XSAN[1]': PANIC: aborting threads now.
May 10 13:37:49 dc1 fsmpm[288]: PortMapper: Initiating activation vote for FSS 'XSAN'.
May 10 13:37:52 dc1 fsmpm[288]: PortMapper: Reconnect Event for /Volumes/XSAN
May 10 13:37:52 dc1 fsmpm[288]: PortMapper: Requesting MDS recycle of /Volumes/XSAN
May 10 13:37:52 dc1 KernelEventAgent[79]: tid 00000000 received event(s) VQ_NOTRESP (1)




Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Steve Modica on May 29, 2012 at 11:31:03 pm

You should have keepsyms set to 1 in the boot options on these two MDC systems. That will capture a reasonable stack trace on a panic. You should also check the filesystem. This is probably a corrupt inode causing the panic. It could be one of the parent directories.
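
For reference, a sketch of one way that boot argument might be set on a Mac using nvram. `add_keepsyms` is a hypothetical helper (not an Apple tool) that appends keepsyms=1 to whatever boot-args are already present so existing arguments are not overwritten. It assumes `nvram boot-args` prints the variable name, a tab, then the value, and a reboot is needed before the setting takes effect:

```shell
# Hypothetical helper: return the given boot-args string with keepsyms=1
# appended, unless it is already present.
add_keepsyms() {
  current="$1"
  case " $current " in
    *" keepsyms=1 "*) printf '%s\n' "$current" ;;
    *) printf '%s\n' "${current:+$current }keepsyms=1" ;;
  esac
}

# Usage on each MDC (assumption: run from an admin account, then reboot):
#   current=$(nvram boot-args 2>/dev/null | cut -f2-)
#   sudo nvram boot-args="$(add_keepsyms "$current")"
```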

Steve Modica
CTO, Small Tree Communications



Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Jordan Woods on May 30, 2012 at 12:00:41 am

I agree with Steve, you are probably looking at a corrupt inode that is getting kicked off here. Did you check cvfsck? If you are unfamiliar with this:

http://support.apple.com/kb/HT1081


---always remember to back up before choosing any potentially destructive options in cvfsck
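
A sketch of the non-destructive part of that procedure, written as a dry run that only prints the commands. The flags (-j to replay the journal, -nv for a verbose read-only check) are given as best recalled from the Apple article linked above and should be confirmed against it or the cvfsck man page before use. `cvfsck_plan` is a made-up helper name, and the repair option is deliberately left out:

```shell
# Hypothetical helper: print (do not run) the read-only cvfsck steps
# for a volume. Assumes the volume has been stopped first.
cvfsck_plan() {
  vol="$1"
  printf 'sudo cvfsck -j %s\n' "$vol"    # replay the journal
  printf 'sudo cvfsck -nv %s\n' "$vol"   # read-only, verbose check
}

# Usage: review the printed commands, then run them by hand on the MDC.
#   cvfsck_plan XSAN
```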



-Jordan



Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Ryan Berdinka on May 30, 2012 at 3:57:11 pm

I ran cvfsck about 2 weeks ago when I had this same problem happen, and everything came back clean. I will go ahead and run it again tonight and see if anything has changed. Anything else I should look for?




Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Ryan Berdinka on Jun 3, 2012 at 5:19:20 pm

Finally was able to run cvfsck and everything came back clean again. I am at a loss, any other suggestions?



Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Steve Modica on Jun 3, 2012 at 9:13:46 pm

[Ryan Berdinka] "Having an issue when a Microsoft PowerPoint or Word document (version 2011 for Mac) is opened on a client that is accessing the Xsan through an AFP re-share."

A few ideas:

1. What if you run "cat filename.doc" or "cat filename.ppt" in a terminal? Do you have the same issue?
2. If not, you might want to dtruss Word or PowerPoint as it opens the file. I think it's not the file itself, but some library that MS Word or PowerPoint is trying to open along with it. (You can even dtruss just the opens.)
3. What if you open them with something else, like OpenOffice?
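
The first idea can be sketched as a small read test run from the AFP client. `read_test` is a hypothetical helper and the path in the usage line is a placeholder. Note that if the volume hangs, this command will hang too, which is itself the answer to the question:

```shell
# Hypothetical helper: read a file end to end, discarding the data,
# and report whether the read completed.
read_test() {
  if cat -- "$1" > /dev/null; then
    echo "read ok: $1"
  else
    echo "read FAILED: $1"
    return 1
  fi
}

# Usage on an AFP client (path is a placeholder):
#   read_test "/Volumes/XSAN/path/to/filename.doc"
```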

For some reason, when I first read this, I thought you had one bad file. Now that I read it again, I realize it's more related to the apps. So we should be tracing the app's open process.

Steve

Steve Modica
CTO, Small Tree Communications



Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Ryan Berdinka on Jun 3, 2012 at 9:32:58 pm

Thanks for your ideas Steve, I will give them a shot. I am not familiar with using the dtruss command, can you give me some pointers on how I should use it?

I do want to mention that besides the issue with opening any Microsoft Office documents, there also is an issue with saving. I had a user who had created a new Microsoft Word Document, and then tried to save it to a folder on the SAN volume (via AFP re-share). This caused the same exact problems as before.




Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Steve Modica on Jun 3, 2012 at 9:39:40 pm

I have started MS Word.
I'll find it with ps command in the terminal:

Spongebob:~ modica$ ps -ef | grep Micro
502 18422 163 0 4:34PM ?? 0:02.57 /Applications/Microsoft Office 2011/Microsoft Word.app/Contents/MacOS/Microsoft Word -psn_0_3306279

Its PID is 18422.

So now I can grab it with dtruss:

$ sudo dtruss -p 18422 -f

Then this happens when I hit save (it goes fast, so you might want to capture this in a file):

18422/0x27468c: munmap(0x24482000, 0x8CB000) = 0 0
^C18422/0x27468c: pwrite(0x3D, "", 0x1, 0x238) = 1 0
18422/0x27468c: pwrite(0x3D, "264224MO302@20206357&376207f257246]360`214241pP<*21130317353vn253335217354,_377336i20106I241(zi322N337367}f332235336`251213h16362255)353&35262012216Sf222262327361c|313"f302d24226006R26602d203376345Eo274r20021251r246l322022733433434524264300304:0T311255327"320255237p'344247230", 0x18B, 0x239) = 395 0
18422/0x27468c: pwrite(0x3D, "03", 0x2, 0x3C4) = 2 0
18422/0x27468c: pwrite(0x3D, "PK030424", 0x239, 0x0) = 569 0
18422/0x27468c: pwrite(0x3D, "", 0x1, 0x5F6) = 1 0
18422/0x27468c: pwrite(0x3D, "254222301N3030f206357H274C344373232n 204320322]20322nb225a021132706332$J<330336236260303J245QMb307304316237357367357345j327w342223b262336)23027%br33233353320525736532335436Dbt06;357H30123622254252353253345vu310371QjmH"253270244240e1617R&335R217251360201\256l|354221363162240376300206344242,357d37425501325HS25421520227067 352}310?377G[366304h220Qj37i26b&213l26327Qcl2102530257237363u:t2423132344i240333363201374fc5=z275355311361t317222vL316220231F30220246210346227$32317363371362321310<244203225)2323053714177/303203026733337631524135506224cT307Z36136250371t", 0x10A, 0x5F7) = 266 0
18422/0x27468c: pwrite(0x3D, "03", 0x2, 0x701) = 2 0
18422/0x27468c: pwrite(0x3D, "PK030424", 0x231, 0x3C6) = 561 0
18422/0x27468c: pwrite(0x3D, "", 0x1, 0x844) = 1 0

You can also tell dtruss to capture just opens, or writes, or whatever, to whittle down the traffic.
My sense is that something other than the file (like MS Office support files) is the culprit here. Are any of those shared in any way from the SAN?
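
One way to act on that, sketched under assumptions: capture the whole trace to a file, then pull out only the open calls whose path starts with /Volumes/, which is where a SAN-hosted support file would show up. `volume_opens` is a hypothetical helper, the trace file name is a placeholder, and the pattern assumes dtruss prints opens as open("path...") or open_nocancel("path..."):

```shell
# Hypothetical helper: from a saved dtruss trace on stdin, keep only
# open/open_nocancel calls on paths under /Volumes/.
volume_opens() {
  grep -E 'open(_nocancel)?\("/Volumes/'
}

# Usage (PID from the ps output above; stop dtruss with Ctrl-C after the save):
#   sudo dtruss -f -p 18422 > word_trace.txt 2>&1
#   volume_opens < word_trace.txt
```

Filtering after the fact, rather than limiting dtruss to one syscall, keeps the full trace around in case the interesting call turns out to be something other than an open.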

Steve

Steve Modica
CTO, Small Tree Communications



Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Ian Liuzzi-Fedun on Jul 8, 2012 at 7:47:48 pm

Ever figure this out?




Re: File opened via AFP Xsan re-share causes XSAN restart/lockup
by Ryan Berdinka on Jul 24, 2012 at 6:02:07 pm

Unfortunately no - I haven't had much of a chance to do any further investigating. Luckily it hasn't been an issue again as I've pretty much banned any MS documents from the SAN.

