Scheduler: Introduce lightweight transactional model for HostState
Nova's FilterScheduler implementation, even though inherently multi-threaded,
uses no locking for access to the in-memory HostState data structures that are
shared between all active threads. Even though this means that most of the
decisions the scheduler makes under load are not internally consistent, this
is not necessarily a huge issue for the basic use case, as Nova uses the retry
mechanism to make sure that the configured resource usage policy is maintained
despite races. This can, however, cause issues in several more complex use cases.
A non-exhaustive list of examples: high resource utilization, high load, and
specific types of hosts and resources (e.g. Ironic nodes and complex resources
such as NUMA topology or PCI devices).
We propose to change the scheduler code to use a lightweight transactional
approach that avoids full-blown locking while still mitigating some of the race
conditions.
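A minimal sketch of what such an approach could look like, assuming a per-host lock held only for a short check-and-consume step. This is not Nova's actual code: SketchHostState, ClaimFailed, claim and select_host are hypothetical names invented for illustration, and the real HostState tracks far more than three counters.

```python
import threading


class ClaimFailed(Exception):
    """Raised when a host no longer has room for the request (hypothetical)."""


class SketchHostState:
    """Hypothetical stand-in for the scheduler's in-memory view of one host."""

    def __init__(self, name, free_ram_mb, free_disk_gb, free_vcpus):
        self.name = name
        self.free_ram_mb = free_ram_mb
        self.free_disk_gb = free_disk_gb
        self.free_vcpus = free_vcpus
        self._lock = threading.Lock()

    def claim(self, ram_mb, disk_gb, vcpus):
        # Check and consume in one short critical section, so two threads
        # cannot both pass the check against the same free capacity.
        with self._lock:
            if (ram_mb > self.free_ram_mb
                    or disk_gb > self.free_disk_gb
                    or vcpus > self.free_vcpus):
                raise ClaimFailed(self.name)
            self.free_ram_mb -= ram_mb
            self.free_disk_gb -= disk_gb
            self.free_vcpus -= vcpus


def select_host(weighed_hosts, ram_mb, disk_gb, vcpus):
    """Return the first host, in weighed order, whose claim succeeds."""
    for host in weighed_hosts:
        try:
            host.claim(ram_mb, disk_gb, vcpus)
        except ClaimFailed:
            # Another request consumed this host first; try the next one
            # instead of sending a request that the compute node would reject.
            continue
        return host
    return None


hosts = [SketchHostState("a", 1024, 40, 2), SketchHostState("b", 4096, 80, 8)]
first = select_host(hosts, ram_mb=768, disk_gb=10, vcpus=1)
second = select_host(hosts, ram_mb=768, disk_gb=10, vcpus=1)
print(first.name, second.name)
```

The lock covers only the comparison and the three subtractions, not filtering or weighing, which is the sense in which the sketch stays lightweight: a request that loses the race fails its claim and moves on to the next candidate host.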
Blueprint information
- Status:
- Started
- Approver:
- John Garbutt
- Priority:
- Medium
- Drafter:
- Nikola Đipanov
- Direction:
- Needs approval
- Assignee:
- Yingxin
- Definition:
- Pending Approval
- Series goal:
- None
- Implementation:
- Needs Code Review
- Milestone target:
- None
- Started by
- John Garbutt
- Completed by
Whiteboard
Gerrit topic: https:/
Addressed by: https:/
Scheduler: Introduce lightweight transactional model for HostState
Please note this blueprint will be delayed until the M release if it is not in the NeedsCodeReview state (with all the code up for review) before July 16th, and merged by July 30th. We expect to re-open master for the M release in September. For more information, please see: https:/
--johnthetubaguy 15th July 2015
Unapproved for liberty due to the Non-Priority Feature Proposal Freeze. --johnthetubaguy 16th July 2015
Addressed by: https:/
Add lock to scheduler host state updating
Addressed by: https:/
Add lock to host-state consumption
Addressed by: https:/
Claim cpu ram and disk in scheduler host state
Addressed by: https:/
Handle claim failure in FilterScheduler
Addressed by: https:/
Claim numa topology in scheduler host state
Addressed by: https:/
Claim pci in scheduler host state
Addressed by: https:/
Remove the horrible catch-all during consumption
Addressed by: https:/
Scheduler use Claim to check resource consumption
Addressed by: https:/
Refactor claim code to eliminate duplication
Addressed by: https:/
Correct MoveClaim in getting pci requests
We have to hit Feature Freeze today, please resubmit this for Newton. --johnthetubaguy 3rd March 2016
This looks more or less stalled/abandoned and we're nearly at non-priority feature freeze (6/30) so I'm going to defer this from Newton. -- mriedem 20160629