Site Server High Availability or Clustering
Configuration Manager is becoming more mission critical every day. The ability to cluster the Site Server role (Inboxes & SMS Provider) is increasingly important, because that role is the single point of failure for a Primary Site.
If the Site Server role goes down, the Primary Site is down, no matter how many Management Points, SMS Providers, etc. exist.
If we could cluster the Site Server, or at least have 2 systems share that role for High Availability, this would no longer be an issue.
Even though we can designate multiple SMS Providers, if the Site Server itself goes down, consoles will not connect to ANY of these other providers. That's because the console ALWAYS connects to the Site Server for a list of the SMS Providers every time a console is opened. So it's not true SMS Provider redundancy: if the Site Server goes down, no new consoles can connect.
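To make the point concrete, here is a toy Python model of the lookup described above. It is only a sketch and not Configuration Manager code: the class and function names (SiteServer, open_console, SiteServerDown) and the provider names are made up for illustration. It shows that a healthy SMS Provider does not help if the list of providers can only come from the Site Server.

```python
class SiteServerDown(Exception):
    """Raised when the (hypothetical) site server cannot be reached."""


class SiteServer:
    """Toy stand-in for the Site Server, which holds the provider list."""

    def __init__(self, providers, online=True):
        self.providers = providers
        self.online = online

    def list_providers(self):
        # The console has to ask the Site Server first, every time it opens.
        if not self.online:
            raise SiteServerDown("site server unreachable")
        return list(self.providers)


def open_console(site_server, provider_status):
    """Return the first healthy provider name, as a console might pick one."""
    providers = site_server.list_providers()  # single point of failure
    for name in providers:
        if provider_status.get(name, False):
            return name
    raise RuntimeError("no SMS Provider reachable")


if __name__ == "__main__":
    site = SiteServer(["provider-a", "provider-b"])
    status = {"provider-a": False, "provider-b": True}

    # Site Server up: the console falls through to the healthy provider.
    print(open_console(site, status))

    # Site Server down: the healthy provider is never even tried.
    site.online = False
    try:
        open_console(site, status)
    except SiteServerDown as err:
        print("console cannot open:", err)
```

In this model the second provider only adds redundancy for as long as the Site Server itself stays up, which is the gap this request is about.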
We have now added hierarchy support with #sccm 1810. There is still some planned work outstanding (represented by UV items in the comments), but the core work for this is now complete in #sccm 1810.
The big piece of this work is done. There is still some planned work in these areas, represented by these UV items:
1) Max of 2 machines simple config:
2) Active/Active mode:
3) Simplify movement:
Quentin Gerlach commented
The features that Microsoft has been releasing to support this are great, but they still don't support hierarchies. It would be extremely helpful if these HA considerations started supporting multi-level hierarchies, with a CAS and multiple primaries distributed geographically.
Jay Tuckey commented
Good work, @djam and team, I'm looking forward to having this fully implemented.
Kerim Hanif commented
We have been improving this feature in every TP. In the 1805 TP we added site server install folder customization and more. Try it out.
Building on the above, Remote Content Library was added in 1804 TP: https://docs.microsoft.com/en-us/sccm/core/get-started/capabilities-in-technical-preview-1804
Has anyone actually tried this in a lab? I've just installed 1710 TP in my lab (FileServer, SQL Cluster, Active PS, Passive PS), but HA does not act as described in the docs.
There is no way to customize the passive server's installation; everything is copied to and installed on the c:\ drive. The active server does not keep the passive server synced, and the manual failover takes ages to complete. It would be faster to recover the botched site via site recovery. What is the point of high availability in that case? The current implementation leaves much to be desired.
Philip Webb commented
Is there any suggestion as to when this might hit production, since it didn't make the 1706 production release? On course for 1710?
Lenny Caputo commented
This is great news and a big win for us. We can now present an acceptable solution to servers in our Hosted environment.
Finally, Primary Sites will have an acceptable HA solution. I'm assuming we will see the CAS included in later releases of SCCM 17xx.
Richard van Nuland commented
In the 1706 tech preview, only the primary site server can be made highly available. But when the CAS goes down we still have a major issue. When will HA be supported for the CAS as well?
It's coming very soon... ;-)
It's coming soon....
Is this idea still planned? Yes, give us three new Windows 10 popup balloon options and leave high availability for the server and the whole of SCCM for the year 2147.
That would be super cool. In our company, patching a primary site server means getting an approval from all the teams that rely on SCCM functionality. Sometimes it's quite a challenge to find the perfect window for an outage.
Lenny Caputo commented
I'm glad to see that this is now a Planned addition to SCCM. This will be a big WIN in the datacenter!
James Mymryk commented
The key business case for this is Business Continuity return-to-service times. If you are using Configuration Manager as your server build infrastructure component and it has no redundancy in a different datacenter, you cannot start recovering other systems until it has been recovered, which extends recovery times.
Nash Pherson (MVP) commented
I'd tweak this one to say "Support Hyper-V Replicas for Site Server High Availability", as Hyper-V and VMware high availability should be a tested scenario.