Fix collection update engine
The way collection membership updates are handled is completely flawed. Should this be fixed, the (soft) limit of collections that can have incremental updates enabled should also be changed.
We all know that when a collection is updated, all collections limited to it are also updated. And should any of the child collections be updated, then so will their children, and so on.
However, there is a massive design flaw in this system, which means a huge number of collections that should *never* be updated are being updated.
Imagine these scenarios:
CollA is created and limited to All Systems
CollA has *no* collection membership rules at all.
Whether CollA has incremental updates or not is irrelevant, as the result will be the same.
When All Systems updates and something changes within All Systems, CollA will also be updated. Yet in this particular scenario, where no membership rules are specified on CollA, it's impossible for anything to have changed, and evaluating its membership is an exercise in futility.
CollB is created limited to All Systems
CollB *only* has direct membership rules
All Systems is updated and at least one resource has changed, so CollB is also updated.
Again, where a collection uses only direct membership rules, that collection should not be updated when its parent updates and finds that resources changed.
CollA is limited to All Systems
CollB is limited to All Systems
CollB contains a single collection rule - include collection CollA
In this scenario, if All Systems updates and at least one resource changed, both CollA and CollB will be updated as well - even if CollA's membership hasn't changed. What should happen is that the system identifies that CollB contains a single include collection rule pointing at CollA and processes CollA first. If CollA's membership changed, then update CollB; otherwise leave CollB alone.
CollA is created with no membership rules.
CollA has an update scheduled of every 7 days (default).
In this scenario, despite the fact that CollA has no collection membership rules at all, its membership will still be evaluated.
In an ideal scenario, SCCM would ignore collections with no collection membership rules from full or incremental evaluation schedules.
I could keep going with the scenarios as this applies to exclude collections, collections with incremental updates enabled, etc etc etc.
Basically the bottom line is:
- If a collection has no membership rules, it should never be updated.
- If a collection only has direct membership rules, it should never be updated (I believe a collection update is not required when a device is deleted from the database?)
- If a collection only contains include/exclude membership rules, it should only be updated on a schedule (if set to update on a schedule) or if one of the include/exclude collections it references has changed - as opposed to when the parent collection has changed.
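To make those three rules concrete, here is a minimal Python sketch of the decision they describe. It is not how SCCM is implemented; `RuleKind`, `Trigger`, `Collection` and `should_evaluate` are hypothetical names made up for illustration. Any rule mix the list above doesn't cover (query rules, or direct rules mixed with include rules) is assumed to keep today's behaviour and always be evaluated.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class RuleKind(Enum):
    DIRECT = auto()
    QUERY = auto()
    INCLUDE = auto()
    EXCLUDE = auto()


class Trigger(Enum):
    PARENT_CHANGED = auto()      # the limiting collection's membership changed
    REFERENCED_CHANGED = auto()  # an include/exclude target's membership changed
    SCHEDULE = auto()            # the collection's own update schedule fired


@dataclass
class Collection:
    name: str
    rule_kinds: set = field(default_factory=set)  # kinds of rules present


def should_evaluate(coll: Collection, trigger: Trigger) -> bool:
    """Apply the three bottom-line rules proposed in the post."""
    kinds = coll.rule_kinds
    if not kinds:
        return False  # no membership rules: never update
    if kinds == {RuleKind.DIRECT}:
        return False  # direct rules only: never update
    if kinds <= {RuleKind.INCLUDE, RuleKind.EXCLUDE}:
        # include/exclude only: schedule or referenced collection, not parent
        return trigger in (Trigger.SCHEDULE, Trigger.REFERENCED_CHANGED)
    return True  # assumption: anything else behaves as it does today


coll_a = Collection("CollA")                      # no rules at all
coll_b = Collection("CollB", {RuleKind.INCLUDE})  # include CollA only
print(should_evaluate(coll_a, Trigger.PARENT_CHANGED))      # False
print(should_evaluate(coll_b, Trigger.PARENT_CHANGED))      # False
print(should_evaluate(coll_b, Trigger.REFERENCED_CHANGED))  # True
```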
Where I work we have over 1300 collections limited to 'All Systems'. Over 90% of those collections only contain direct membership rules, or no rules at all (because no direct membership rules were added yet).
Yet, every time something changes in 'All Systems', all 1300 collections are updated. This takes well over 5 minutes to do.
As mentioned above, if this is changed I think the soft limit of 200 collections with incremental updates can be raised, as far fewer collections will be updating overall.
Fausto Nascimento commented
Well this removed my formatting, so the collection structure I meant below is this:
--All Company Devices (basically excludes unknown computers)
------All Windows 7
------All Windows 10
--------All Windows 10 1703
--------All Windows 10 1709
------All Windows Server 2012 R2
------All Windows Server 2016
Fausto Nascimento commented
@Paul Actually the problem is much bigger than what is described here (last I had a look at it at least).
Incremental updates work by keeping track of the resource types that have changed since the last incremental update cycle ran (there are 4 resource types I know of: device, user, group and unknown device).
When it runs, the first thing it does is determine which of the collections marked for incremental updates *should* be evaluated, based on the resource types known to have changed and the types of queries those collections contain.
Once it's done that, it evaluates those collections and if those collections had any changes, it does the same for its children.
There's a massive bug here: if you create a new user in AD and it gets picked up by SCCM, it will say that a resource of type user has changed since the last incremental evaluation cycle. When that cycle next runs, it will check which collections potentially need to be evaluated. It will mark "All Users and User Groups" as needing to be evaluated (since it has a user resource query) and also "All Users" (for the same reason), but it will *not* mark "All User Groups", because that collection contains no user resource queries, just group resource queries.
So far so good. However... when it evaluates "All Users and User Groups" the collection membership will change because a new user got added and as part of that it will force an evaluation on all child collections (including "All User Groups", even though it had already been marked as not requiring an update).
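Here is a small Python model of that cycle, going only by the behaviour described above rather than by SCCM's actual code. `Collection`, `incremental_cycle` and the `filter_children` switch are hypothetical; the switch shows one possible fix (re-applying the resource-type check to children), which is my assumption and not something the product offers.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    name: str
    query_types: set  # resource types targeted by its membership queries
    children: list = field(default_factory=list)


def incremental_cycle(collections, changed_types, membership_changed, filter_children):
    """Return the names of collections evaluated in one cycle, in order.

    changed_types: resource types that changed since the last cycle.
    membership_changed: names whose membership actually changes when evaluated.
    filter_children: False mimics the behaviour described above (children are
        always forced); True skips children with no matching query type.
    """
    evaluated = []

    def evaluate(coll):
        if coll.name in evaluated:
            return
        evaluated.append(coll.name)
        if coll.name not in membership_changed:
            return  # nothing changed, so no cascade to children
        for child in coll.children:
            if filter_children and not (child.query_types & changed_types):
                continue
            evaluate(child)

    # Step 1: only collections whose queries match a changed resource type.
    for coll in collections:
        if coll.query_types & changed_types:
            evaluate(coll)
    return evaluated


all_users = Collection("All Users", {"user"})
all_groups = Collection("All User Groups", {"group"})
root = Collection("All Users and User Groups", {"user", "group"}, [all_users, all_groups])
colls = [root, all_users, all_groups]
grew = {"All Users and User Groups", "All Users"}  # a new user was added

described = incremental_cycle(colls, {"user"}, grew, filter_children=False)
filtered = incremental_cycle(colls, {"user"}, grew, filter_children=True)
print(described)  # ['All Users and User Groups', 'All Users', 'All User Groups']
print(filtered)   # ['All Users and User Groups', 'All Users']
```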
Yes, limiting all group resource related queries to "All Groups" will minimise the extent of the problem somewhat. But that's false security.
If I have 5 OSs in my organisation (Windows 7, Windows 10 1703, Windows 10 1709, Windows Server 2012 R2 and Windows Server 2016), it makes sense to create the following collection structure to begin with:
All Company Devices (basically excludes unknown computers)
All Windows 7
All Windows 10
All Windows 10 1703
All Windows 10 1709
All Windows Server 2012 R2
All Windows Server 2016
(as opposed to a flat structure with everything limited to All Systems)
But here's what the trade-off is:
With a completely flat structure (no All Company Devices, All Workstations, All Windows 10 and All Server collections) if All Systems detects a change, 5 collections will be evaluated - always.
With the layered structure as above, if All Systems detects a change then in a best case scenario 1 collection is evaluated. In a worst case scenario 9 collections will be evaluated (All Systems detects a change and marks its children as needing to be evaluated. All Company Devices detects a change and marks its children as needing to be evaluated. All Workstations detects a change and marks its children as needing to be evaluated, and the same for All Servers).
In this particular case it's definitely worth having the layered approach despite the worst case scenario. But the choice is not always that simple...
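The 5 / 1 / 9 counts can be reproduced with a short Python sketch. The exact tree is an assumption: I've placed "All Workstations" and "All Servers" between "All Company Devices" and the OS collections, as the worst case description implies, and `evaluations` is a hypothetical helper that simply evaluates every child of any collection whose membership changed.

```python
# Assumed layered hierarchy (limiting collection -> collections limited to it).
LAYERED = {
    "All Systems": ["All Company Devices"],
    "All Company Devices": ["All Workstations", "All Servers"],
    "All Workstations": ["All Windows 7", "All Windows 10"],
    "All Windows 10": ["All Windows 10 1703", "All Windows 10 1709"],
    "All Servers": ["All Windows Server 2012 R2", "All Windows Server 2016"],
}

# Flat alternative: every OS collection limited directly to All Systems.
FLAT = {
    "All Systems": [
        "All Windows 7",
        "All Windows 10 1703",
        "All Windows 10 1709",
        "All Windows Server 2012 R2",
        "All Windows Server 2016",
    ],
}


def evaluations(tree, root, changed):
    """Count collections evaluated after `root` detects a change.

    Every child of a changed collection is evaluated; a child passes the
    cascade on to its own children only if its name is in `changed`.
    """
    count = 0
    pending = list(tree.get(root, []))
    while pending:
        name = pending.pop()
        count += 1
        if name in changed:
            pending.extend(tree.get(name, []))
    return count


flat = evaluations(FLAT, "All Systems", set())
best = evaluations(LAYERED, "All Systems", set())
worst = evaluations(
    LAYERED,
    "All Systems",
    {"All Company Devices", "All Workstations", "All Servers", "All Windows 10"},
)
print(flat, best, worst)  # 5 1 9
```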
So the problems mentioned above are just the tip of the iceberg really...
Paul Faulkner commented
I do think they should look at overhauling how collection updates work. In your scenario, though, I don't think you're helped by using All Systems as your main limiting collection. The best practice advice says not to (otherwise you will definitely get slow collection evaluations).
Having said that, even taking that out of the equation as I said I do think collection evaluations could be improved. Even when following best practice to the letter they can be slow in certain situations.