Microsoft Cluster Server anomaly


Two weeks ago I encountered a very weird issue when using Microsoft Cluster Server.

What happened?
We had built an MS Cluster Server environment with two servers (NodeA and NodeB; the virtual cluster node was named NodeV). Each of them had its own boot disk (C:) and a set of shared disks (D: for data and Q: as the quorum resource).
All of this was configured within VMware ESX.

Someone came along and made some changes to the SAN; another LUN was added.
After this was done, a decision was made to move the MSCS virtual machines to this new LUN. This was done through the Migrate option in VirtualCenter.

After this migration was done, everything seemed to work perfectly. The cluster came up nicely, we were able to move cluster resource groups between the nodes, and everything appeared to be in perfect order.

so we thought…

....

One day, about a week after the migration, we discovered something weird. Our system manager came to me and told me that the quorum disk (Q:) on NodeA seemed to be backed by a different disk file (.vmdk) than the quorum disk on NodeB. According to him, this should have been the same disk file across the cluster. I agreed; however, you never know with VMware ESX, so we investigated the disk files at the ESX level (from the command prompt). To our astonishment, we found two separate disk files for the quorum disk: one called NodeA_Disk3.vmdk, the other NodeB_Disk3.vmdk… No symlinks or anything of the sort.
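The check we did by hand can be sketched in a few lines: given the .vmdk path each node has attached on a given shared SCSI target, flag the targets where the nodes point at different backing files. This is a minimal illustration only; the helper function and the SCSI target labels are invented, and the file names simply mirror the ones from our cluster.

```python
def find_divergent_shared_disks(node_disks):
    """node_disks maps node name -> {scsi_target: vmdk_path}.

    Returns the SCSI targets where the nodes disagree on the backing
    file. On a healthy MSCS setup this should come back empty, because
    a shared disk is supposed to be one and the same .vmdk everywhere.
    """
    targets = set()
    for disks in node_disks.values():
        targets.update(disks)

    divergent = {}
    for target in targets:
        paths = {node: disks.get(target) for node, disks in node_disks.items()}
        if len(set(paths.values())) > 1:
            divergent[target] = paths
    return divergent


# What we actually found after the migration: every "shared" target
# resolved to a node-private file.
cluster = {
    "NodeA": {"scsi1:0": "NodeA_Disk2.vmdk", "scsi1:1": "NodeA_Disk3.vmdk"},
    "NodeB": {"scsi1:0": "NodeB_Disk2.vmdk", "scsi1:1": "NodeB_Disk3.vmdk"},
}
print(find_divergent_shared_disks(cluster))
```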

We went back to Windows and connected to the cluster machine (NodeV), which was running perfectly. To test our suspicions, we created a folder on Q:\ named "I am lost", after which we moved the resource group containing Q: to the other cluster member.

Et voila! The folder was gone…

Nonetheless, the cluster service started without reporting any problems. The application (on the cluster) was available, and no issues had been found so far.

However, the application was installed on D:, the other shared disk. You may guess what we were thinking at that moment.
Indeed, there were also two diskfiles for D:, NodeA_Disk2.vmdk and NodeB_Disk2.vmdk.

I had been changing various things in the application settings (on the D: disk), but on which machine? Or machines…?

We were lucky in the end, because from the moment the migration was performed, the cluster had stayed up and no groups had been moved between the nodes. We were therefore able to determine which disk was the most recent one, containing all of the changes, but you don't want to know how much work it would have been to trace all of those changes back if they had been scattered across both disks…

OK, so far for the story.

What really happened?
Migrating virtual machines through VirtualCenter causes VMware to look at the configuration of the machine involved and copy all of the disks attached to that VM to the new location.
VMware does not take into account that a shared disk should exist only once: it copies the shared disk to the new location, but it does the same for every other VM that uses that same shared disk. You therefore end up with multiple copies of the shared disk, each connected only to its respective node (still marked as shared, though).
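The failure mode above can be sketched as a naive per-VM migrate that copies every attached disk to the target datastore with no notion of "this disk is shared". The function and names below are illustrative only, not the actual VirtualCenter logic:

```python
def migrate_vm(vm, datastore):
    """Naive migrate: copy each of the VM's disks into the new
    datastore, renaming them per VM. There is no check for whether a
    disk is also attached to another VM."""
    vm["disks"] = {slot: f"{datastore}/{vm['name']}_{slot}.vmdk"
                   for slot in vm["disks"]}


# Before migration, both nodes point at one and the same quorum file.
shared = "old_lun/Shared_Disk3.vmdk"
node_a = {"name": "NodeA", "disks": {"Disk1": "old_lun/NodeA_Disk1.vmdk",
                                     "Disk3": shared}}
node_b = {"name": "NodeB", "disks": {"Disk1": "old_lun/NodeB_Disk1.vmdk",
                                     "Disk3": shared}}

migrate_vm(node_a, "new_lun")
migrate_vm(node_b, "new_lun")

# Afterwards, each node has its own private copy of the "shared" disk.
print(node_a["disks"]["Disk3"])  # new_lun/NodeA_Disk3.vmdk
print(node_b["disks"]["Disk3"])  # new_lun/NodeB_Disk3.vmdk
```

The fix a migration tool would need is a dedup step: track which source disk files it has already copied and attach the single new copy to every VM that referenced the original.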

I can live with the fact that VMware is not able to determine that a shared disk should be copied only once, although I think there is some room for improvement there.
What worries me more is that Microsoft apparently can be tricked this easily. You do not want to think about the consequences when a frequently updated and queried database sits on your shared disk and it turns out not to be as shared as you expected… (especially if you find that out after a problem, a failover, and a failback…). Personally, I would expect some kind of mechanism (don't ask me which!) that would prevent this kind of misconfiguration.

Personally, I think the same applies to physical hardware. Why not? The only thing you need to do is create a clone of your shared disk(s), attach one to each node, and off you go…

By the way, this could open up some interesting possibilities (e.g. a MSCS across two physically separated sites, with data on both sites, perhaps?).



2 Comments

  1. For me this is expected behaviour. The first node of the cluster starts and checks whether it can take control of the assigned quorum disk. It can, since no other node has contacted that quorum disk. So there you have cluster number 1. The second node tries to do exactly the same thing, but instead of contacting the same quorum disk, it contacts a new quorum disk. So the second node also "thinks" that the other node is down and starts running a cluster.

    In Disk Management you could (and should) have checked whether the disks were mapped to the same LUN. It seems that clustering on Windows is made a bit too easy ;)