Controlling Disk Use and Handling Overflow
The goal here is to handle, as gracefully as possible, the case of a shard exceeding its storage capacity. Two mechanisms come to mind for handling overflow:
- operator-set per-queue maximum storage size
- automatic partitioning of set of messages for a given queue across multiple shards
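The first mechanism can be sketched in a few lines. This is a minimal, hypothetical illustration (the names QueueStore, max_bytes, and QueueOverflow are made up here, not Marconi's API): the store tracks bytes used per queue and rejects a post that would exceed the operator-set cap, rather than silently dropping messages.

```python
# Hypothetical sketch of an operator-set per-queue storage cap.
# All names here are illustrative, not part of Marconi's actual API.

class QueueOverflow(Exception):
    pass

class QueueStore:
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes   # operator-configured cap for this queue
        self.used_bytes = 0
        self.messages = []

    def post(self, body):
        size = len(body)
        # Reject the post rather than silently evicting older messages.
        if self.used_bytes + size > self.max_bytes:
            raise QueueOverflow("queue storage cap exceeded")
        self.messages.append(body)
        self.used_bytes += size

store = QueueStore(max_bytes=10)
store.post(b"hello")        # 5 bytes, fits under the cap
try:
    store.post(b"world!!")  # 7 more bytes would exceed the cap
except QueueOverflow:
    pass                    # caller gets an explicit overflow error
```

Whether to reject new posts (as above) or evict the oldest messages is a policy question the blueprint would need to settle.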
Blueprint information
- Status: Not started
- Approver: None
- Priority: Medium
- Drafter: Allele Dev
- Direction: Needs approval
- Assignee: None
- Definition: New
- Series goal: None
- Implementation: Not started
- Milestone target: None
- Started by
- Completed by
Whiteboard
Schedule for Juno.
We have two issues to address:
- Noisy neighbor
- Unlimited queue length
This BP addresses the latter. See https:/
I think infinite storage can be addressed using MongoDB's built-in sharding: you could hash on the marker, but this would require a marker proxy, because we can no longer depend on the unique index.
If you shard the queue across two different backend db nodes, no matter how you look at it, you will need to have a marker proxy.
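To make the marker-proxy idea concrete, here is an illustrative sketch (all names are hypothetical, and this glosses over durability and concurrency): a single proxy hands out unique, monotonically increasing markers for a queue, and each message is routed to a backend shard by hashing its marker. Listing then has to merge shards back into marker order, which is exactly why the proxy becomes unavoidable once a queue spans backends.

```python
import hashlib
import itertools

# Hypothetical marker proxy: one counter issues unique, monotonic
# markers for a queue, replacing the per-collection unique index.
class MarkerProxy:
    def __init__(self):
        self._counter = itertools.count(1)

    def next_marker(self):
        return next(self._counter)

def shard_for(marker, num_shards):
    # Hash the marker so messages spread evenly across shards.
    digest = hashlib.md5(str(marker).encode()).hexdigest()
    return int(digest, 16) % num_shards

proxy = MarkerProxy()
shards = [[] for _ in range(2)]
for body in ("a", "b", "c", "d"):
    marker = proxy.next_marker()
    shards[shard_for(marker, len(shards))].append((marker, body))

# Listing a sharded queue means merging shards back into marker order.
merged = sorted(msg for shard in shards for msg in shard)
```

The merge step is the hidden cost of this design: every list operation touches all shards holding the queue.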
Anyway, I could be wrong, but this is my current thinking. Flavio is in favor of not sharding a queue across multiple backends, period (FWIW).
An alternative to "infinite queue length" is to solve that in the filesystem layer using something like Ceph, Gluster, SolidFire, etc.
We should also consider using filesystems that offer realtime compression, or using snappy to compress message bodies before even sending them to storage - see also this bp: https:/
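The app-level compression idea is straightforward to sketch. The note above mentions snappy; zlib stands in here only because it ships with Python, and the actual codec choice is open.

```python
import zlib

# Sketch: compress message bodies before they reach storage.
# zlib is a stand-in for snappy, which the blueprint actually suggests.

def pack(body: bytes) -> bytes:
    return zlib.compress(body)

def unpack(stored: bytes) -> bytes:
    return zlib.decompress(stored)

body = b"event-payload " * 100   # repetitive bodies compress well
stored = pack(body)
assert unpack(stored) == body    # lossless round trip
assert len(stored) < len(body)   # storage footprint shrinks
```

Snappy trades a lower compression ratio for much lower CPU cost than zlib, which matters on the hot post/claim path.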
Another possibility: if we can rebalance queues based on CPU and network usage, why not also on queue length? Move long queues to nodes with more disk. Combined with realtime compression (either in the app or in the filesystem), this would likely give an upper ceiling that is impractical for any queue to ever reach, given that clients can only post so many messages per second anyway.
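A length-based rebalancer could look something like the following. This is a toy heuristic with made-up names and thresholds, not a proposed implementation: oversized queues are considered largest-first and assigned to whichever node currently has the most free disk.

```python
# Hypothetical rebalancing heuristic: move the longest queues to the
# nodes with the most free disk. Names and thresholds are illustrative.

def plan_moves(queues, nodes, threshold_bytes):
    """Return (queue, target_node) moves for oversized queues.

    queues: {name: (current_node, size_bytes)}
    nodes:  {name: free_bytes}
    """
    moves = []
    free = dict(nodes)
    # Largest queues first, so the biggest offenders land on big disks.
    for name, (current, size) in sorted(
            queues.items(), key=lambda kv: kv[1][1], reverse=True):
        if size < threshold_bytes:
            continue  # small queues stay put
        target = max(free, key=free.get)
        if target != current and free[target] >= size:
            moves.append((name, target))
            free[target] -= size
            free[current] = free.get(current, 0) + size
    return moves

moves = plan_moves(
    queues={"q1": ("node-a", 900), "q2": ("node-a", 100)},
    nodes={"node-a": 50, "node-b": 2000},
    threshold_bytes=500,
)
# Only q1 exceeds the threshold, so it is planned onto node-b.
```

A real rebalancer would also need to weigh migration cost against the disk pressure being relieved, since moving a long queue is itself expensive.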
See also:
http://
(starting at: 2014-01-