Remember the chicken-and-egg problem of not having a physical Domain Controller?
It means the cluster has to be started in a specific order. The primary dependency is that DC1 must be up and running before the cluster service starts; otherwise all kinds of weird stuff happens.
Preparing the cluster-start process
First, to prevent an automatic start of the cluster service on the nodes, I’ve set the Cluster Service to start manually instead of automatically on each node.
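If you would rather script this change than click through the Services console on every node, here is a minimal sketch. It wraps `sc config ClusSvc start= demand`, which sets a service's start type to manual. The helper names `manual_start_command` and `set_manual_start` are made up for illustration, and the command has to run elevated on each node.

```python
import subprocess


def manual_start_command(service="ClusSvc"):
    """Build the `sc config` command that sets a service to manual start."""
    # sc expects the option name and its value as separate tokens:
    # "start=" followed by "demand" (demand = manual start).
    return ["sc", "config", service, "start=", "demand"]


def set_manual_start(service="ClusSvc"):
    """Run the command on the local node (needs an elevated session)."""
    subprocess.run(manual_start_command(service), check=True)
```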
There are two ways of approaching this problem:
- from DC1: if this VM is running, at least one of the cluster nodes is online. DC1 then checks the status of all nodes and starts the cluster service on each node
- from each cluster node: the node checks whether DC1 is up and then starts its own cluster service.
As I want the cluster to be as simple as possible, I’m going for approach #2. This means adding a generic script (the domain does not change) to the deploy template of a cluster node once, which makes changing or adding nodes easier than customizing and maintaining a script on DC1.
It is possible to work with delays and timers to give the Hyper-V service and DC1 enough time to start up, and then start the cluster service on each node. I believe it’s better to work with dependencies and service status checks: if DC1 is still unavailable when the timer runs out, you don’t want the cluster service trying to start and producing only errors.
Let’s draft the steps from a cluster node perspective:
- node starts
- query the domain for DC1 service status (e.g. Netlogon): sc \\tartarus.intranet query Netlogon | find "STATE"
- if the status is not running, loop / wait for 1 min
- if the status is running, next
- Start the cluster service on the node: sc start ClusSvc
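The steps above could be sketched roughly as follows. This is a draft under assumptions, not a finished startup script: it uses Python where a batch file would do just as well, the function names are made up, and the optional `max_attempts` limit is my own addition (the default keeps looping until DC1 answers). The `sc` commands, host name and service names are the ones from the list.

```python
import subprocess
import time

DC_HOST = "tartarus.intranet"  # host queried in the steps above


def service_is_running(sc_output):
    """Return True if `sc query` output has a STATE line reporting RUNNING."""
    for line in sc_output.splitlines():
        if "STATE" in line:
            return "RUNNING" in line.split()
    return False  # no STATE line, e.g. DC1 is unreachable


def query_netlogon():
    """Run `sc \\\\<host> query Netlogon` and return its text output."""
    result = subprocess.run(
        ["sc", rf"\\{DC_HOST}", "query", "Netlogon"],
        capture_output=True,
        text=True,
    )
    return result.stdout


def start_cluster_service():
    """Start the local cluster service: `sc start ClusSvc`."""
    subprocess.run(["sc", "start", "ClusSvc"], check=True)


def start_cluster_when_dc_ready(query=query_netlogon,
                                start=start_cluster_service,
                                sleep=time.sleep,
                                interval=60,
                                max_attempts=None):
    """Poll DC1 until Netlogon runs, then start the cluster service.

    Returns True once the service start was issued, or False if
    max_attempts (an assumption, off by default) was reached first.
    """
    attempts = 0
    while not service_is_running(query()):
        attempts += 1
        if max_attempts is not None and attempts >= max_attempts:
            return False  # give up without touching the cluster service
        sleep(interval)  # wait 1 min by default, then check again
    start()
    return True
```

Calling `start_cluster_when_dc_ready()` from a startup task on each node would follow the list as written. The `query`, `start` and `sleep` parameters exist only so the loop can be tried out without a real DC1.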
<still working out the script details .. will post here shortly .. any ideas or script samples are welcome>