Issues with Xsan 4.1 & Open Directory Master/Replicas

Ryan Berdinka
Issues with Xsan 4.1 & Open Directory Master/Replicas
on May 19, 2017 at 5:25:50 pm

I understand this is a complex configuration and likely beyond the knowledge of most, but I'm hoping someone can provide some insight or recommendations.

I have an office environment of many El Capitan (10.11.6) clients and 3 Mac mini servers running El Capitan (10.11.6), the macOS Server app (5.2) & Xsan (4.1). As recommended/detailed for a typical Xsan configuration, we have separate physical public (internet) and private (Xsan metadata) Ethernet networks as well as a switched Fibre Channel network.

"Server C"
• Primary Xsan Metadata Controller
• Open Directory Master (as recommended/detailed in the Xsan setup information)
• DNS (secondary zone)
"Server B"
• Secondary (failover) Xsan Metadata Controller
• Open Directory Replica
• DNS (secondary zone)
"Server A"
• Connected to Xsan Volume as a Client, not a controller
• Open Directory Replica
• File Sharing (re-sharing Xsan volume via SMB/AFP as well as another attached Thunderbolt storage device for general network storage)
• DNS (primary zone)

In our previous Mountain Lion (10.8.5) configuration, which used the exact same hardware, the Open Directory Master was hosted on one of the servers that WAS NOT serving as the Primary Xsan metadata controller. I understand there have been numerous under-the-hood OS & Server changes since Mountain Lion and that the OD Master is now required to run on the machine that serves as the Primary Xsan metadata controller. Please correct me if that is not the case!

There seem to be issues with the replication of Open Directory information from the OD Master to the Replicas. This has resulted in various problems such as:
• File Permissions - Group ACL permissions are not always properly applied resulting in staff not being able to read/write to various folders or files. Temporary fix has been to propagate permissions weekly. This mostly happens on the network storage that is hosted by "Server A".
• User Connectivity - After new users are added to the directory, they are not always able to access various services or connect to shared volumes. Usually this resolves within a day. Currently I have a user who is not able to connect to a share via SMB, but AFP works fine.
• Crashes & Rebuild - There was a random issue where deleting a user from Open Directory caused all Users and Groups to disappear, and for Open Directory & Xsan to revert to an un-configured state. There appeared to be IntermediateCA certificates and keys that did not match between servers. This was sorted out by another tech - I believe they had to restore from backup and repair the directory configuration. I am not certain what all was done.

I received a recommendation that, in order to resolve some of the issues (the permissions in particular), two separate directories should be configured. "Server C" would remain an OD Master with "Server B" as a replica specifically for the Xsan side of things. "Server A" would then become a separate OD Master serving only the file shares it is hosting. The downside would be that I would have to duplicate work when making any changes. For example, if I need to add a new user I would have to do so in both directories in order for them to have access to all the various services.

From my somewhat limited knowledge this seems like an unconventional way to go, and definitely not preferred. I feel like we are missing something here and that our system should work with the OD Master/Replica tree as configured.

If you have any thoughts, insights or recommendations I would be thrilled to hear them!

Thank you!

Chris Collum
Re: Issues with Xsan 4.1 & Open Directory Master/Replicas
on May 24, 2017 at 5:26:02 pm


This does seem a bit unusual. Have you looked into any of the slapd logs for replication errors?

Ryan Berdinka
Re: Issues with Xsan 4.1 & Open Directory Master/Replicas
on Jun 7, 2017 at 9:31:01 pm

Hello Chris,

Thanks for your response. On what I'm calling "Server A", the OD replica that shares volumes via SMB & AFP, I'm seeing the following recurring message: "slapd[231]: conn=1489517 op=1 do_extended: unsupported operation """. I'm not seeing any unusual messages on either the other replica or the master.
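
In case it helps anyone comparing logs across the three servers, here is a rough Python sketch for tallying how often each distinct slapd message shows up in an exported log file. It only assumes lines shaped like the one quoted above (slapd[PID]: conn=N op=N message); the function name and the regular expression are my own, nothing here is an Apple or OpenLDAP tool, and any line with a different shape is simply skipped.

```python
import re
import sys
from collections import Counter

# Assumed line shape, taken from the message quoted above:
#   slapd[231]: conn=1489517 op=1 do_extended: unsupported operation ""
# Anything that does not match this shape is ignored by this sketch.
SLAPD_LINE = re.compile(
    r'slapd\[(?P<pid>\d+)\]:\s+conn=(?P<conn>\d+)\s+op=(?P<op>\d+)\s+(?P<msg>.*\S)'
)


def summarize_slapd_messages(lines):
    """Count each distinct message text, ignoring the conn/op numbers."""
    counts = Counter()
    for line in lines:
        match = SLAPD_LINE.search(line)
        if match:
            counts[match.group("msg")] += 1
    return counts


if __name__ == "__main__":
    # Pass the path of a log file you have exported from one of the servers.
    with open(sys.argv[1], errors="replace") as handle:
        summary = summarize_slapd_messages(handle)
    for message, count in summary.most_common():
        print(f"{count:6d}  {message}")
```

Running it against a log from each server and comparing the tallies should show whether a message like the do_extended one is specific to "Server A" or also appears, less often, on the master and the other replica.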



