Put HostManager._instance_info on a diet
This is a specless blueprint to clean up some technical debt between the compute manager and host manager in the scheduler.
The default scheduler driver is the filter scheduler, and the affinity/
Those rely on the _instance_info cache in the HostManager which is either populated by the ComputeManager sending updates (configured via the CONF.filter_
The only in-tree filters/weighers that use the _instance_info cache are the affinity ones.
There are a few issues here:
1. By default, the instance info stuff is always sent from compute to scheduler regardless of whether or not the filter scheduler is being used. If I'm using the caching scheduler, then this is just unnecessary RPC traffic.
2. The full instance objects are being sent over RPC but the only thing that the affinity filter/weigher needs is the uuid, because that's what InstanceGroup.
3. There is a _sync_scheduler
4. As in #1 on the compute, the HostManager in the scheduler by default tracks instance changes, even if not using the FilterScheduler. If mis-configured, and the computes aren't sending the full instance lists on startup, the scheduler will pull them from the database.
This is all obviously problematic and wasteful if (1) you're not even using the filter scheduler or the affinity filter/weigher, and (2) even if you are, you don't need the full instance object and all of its sub-objects sent over RPC every 2 minutes from all computes in the deployment. In any reasonably sized deployment this is a lot of overhead.
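To illustrate why uuids alone are enough for an affinity-style check, here is a minimal sketch. It is not Nova's filter code: `HostInfo`, `host_passes_anti_affinity` and `host_passes_affinity` are hypothetical names made up for this example, and the check is reduced to a set comparison between the uuids on a host and the uuids of the group members.

```python
from dataclasses import dataclass, field


@dataclass
class HostInfo:
    """Hypothetical stand-in for per-host scheduler state (not Nova's HostState)."""
    name: str
    # Only uuids are tracked; no full instance objects are needed for the checks below.
    instance_uuids: set = field(default_factory=set)


def host_passes_anti_affinity(host, group_member_uuids):
    """Pass when no member of the group is already on this host."""
    return host.instance_uuids.isdisjoint(group_member_uuids)


def host_passes_affinity(host, group_member_uuids):
    """Pass when the group is empty or at least one member is already on this host."""
    if not group_member_uuids:
        return True
    return not host.instance_uuids.isdisjoint(group_member_uuids)
```

Both checks only ever compare uuids, so in this simplified model nothing else from the instance would have to travel over RPC.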
In this blueprint we'll tackle the main issues by simply sending uuids instead of full instance objects, and by not doing this work at all if not configured for the filter scheduler.
Note that any out of tree filters/weighers relying on the HostManager.
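The approach described above could look roughly like the following on the compute side. This is only a sketch under stated assumptions: `build_instance_update`, the `FILTER_SCHEDULER` constant, the driver name strings and the payload layout are all hypothetical and are not taken from Nova.

```python
# Assumed driver name for this sketch only; the real option name and value may differ.
FILTER_SCHEDULER = "filter_scheduler"


def build_instance_update(host, instances, scheduler_driver):
    """Build a uuid-only update for the scheduler.

    Returns None when the configured driver is not the filter scheduler,
    so the caller can skip the RPC call entirely.
    """
    if scheduler_driver != FILTER_SCHEDULER:
        return None
    # Keep only the uuid of each instance; drop flavor, metadata and other sub-objects.
    uuids = sorted(instance["uuid"] for instance in instances)
    return {"host": host, "instance_uuids": uuids}
```

With a payload like this, the size of each periodic update grows with the number of instances on the host rather than with the size of each instance object.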
Blueprint information
- Status: Started
- Approver: Dan Smith
- Priority: Low
- Drafter: Matt Riedemann
- Direction: Needs approval
- Assignee: Matt Riedemann
- Definition: Pending Approval
- Series goal: None
- Implementation: Started
- Milestone target: None
- Started by: Matt Riedemann
- Completed by:
Related bugs
Bug #1680616: instance_get_all_by_host joins tables even if told not to | Fix Released
Bug #1737465: [cellv2] the performance issue of cellv2 when creating 500 instances concurrently | Confirmed
Whiteboard
Gerrit topic: https:/
Addressed by: https:/
Don't send instance updates from compute if not using filter scheduler
I've hit one snag here, which is the TypeAffinityFilter relies on the instance.
--
https:/
Addressed by: https:/
Deprecate TypeAffinityFilter
I'm going to move this to Queens. If we can deprecate the TypeAffinityFilter in Pike and remove it in Queens, it will make the changes needed for this quite a bit easier, as all we'll need to pass between nova-compute and nova-scheduler is instance uuids rather than full objects. -- mriedem 20170517
I didn't get the time to work on this in Queens so I'm deferring to Rocky. -- mriedem 20171120
Addressed by: https:/
Delete the TypeAffinityFilter
Gerrit topic: https:/
Addressed by: https:/
Avoid unnecessary joins in HostManager.
Addressed by: https:/
WIP: Trim the fat on HostState.instances
Addressed by: https:/
Avoid unnecessary joins in HostManager.
Deferring from Rocky, but this is being pursued, albeit piecemeal. -- mriedem 20180607