Add Alarm Grouping Functionality
Alarm grouping combines alarms of a similar nature into a single notification. This is especially useful during large outages, when many systems fail at once and thousands of alarms go off simultaneously. For example, when a network partition occurs in your cluster, half of your service instances can no longer reach the database. In this case, hundreds of alarms will fire for different services. Receiving hundreds of notifications caused by the same problem is not a good experience for a user. Being able to group alarms by specific fields, such as cluster name or alarm name, and send one compact notification instead would be very helpful.
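The idea of grouping by fields can be illustrated with a minimal Python sketch. It is not the Monasca implementation: the function names (group_alarms, summarize), the dictionary layout of an alarm transition, and the sample values are all assumptions made for illustration only.

```python
from collections import defaultdict


def group_alarms(transitions, group_by):
    """Bucket alarm transitions by the values of the fields in group_by.

    Hypothetical helper: each transition is assumed to be a plain dict.
    A field missing from a transition is treated as None.
    """
    groups = defaultdict(list)
    for transition in transitions:
        key = tuple(transition.get(field) for field in group_by)
        groups[key].append(transition)
    return dict(groups)


def summarize(groups, group_by):
    """Build one compact, human-readable line per group."""
    lines = []
    for key, members in groups.items():
        labels = ", ".join(f"{f}={v}" for f, v in zip(group_by, key))
        lines.append(f"{len(members)} alarm(s) for {labels}")
    return lines


# Illustrative data only: three services alarming in two clusters.
transitions = [
    {"cluster": "cluster-a", "alarm_name": "db_unreachable", "service": "api"},
    {"cluster": "cluster-a", "alarm_name": "db_unreachable", "service": "worker"},
    {"cluster": "cluster-b", "alarm_name": "db_unreachable", "service": "api"},
]

fields = ["cluster", "alarm_name"]
groups = group_alarms(transitions, fields)
for line in summarize(groups, fields):
    print(line)
```

With these sample values, the three transitions collapse into two groups, so two notifications would be sent instead of three. In a real outage the reduction would be from hundreds of notifications to a handful.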
To implement alarm grouping, a new resource called the alarm grouping manager will be added to the Monasca API. Grouping rules are created through this resource, and they can also be queried, updated, patched, or deleted. For more details, see the blueprint: Monasca API Alarm Managers.
Inside Monasca-
Grouping example:
GroupingRule1 = '{"alarm-
Three alarm transitions: AT1, AT2 and AT3
AT1_hostname = host1
AT2_hostname = host1
AT3_hostname = host2
AT1_alarm_name = cpu_percent_high
AT2_alarm_name = cpu_system_
AT3_alarm_name = cpu_percent_high
AT1_state = ALARM
AT2_state = ALARM
AT3_state = ALARM
Output:
AT1 and AT3 match exclusions and send notifications immediately.
Generate a grouped notification “group_
Note: There are no alarm_actions, ok_actions or undetermined_actions associated with the AT1, AT2 and AT3 alarm definitions.
For more examples, see the Monasca wiki page: https:/
Whiteboard
Gerrit topic: https:/
Addressed by: https:/
[WIP]Modify Notification Engine to allow inhibit, silence, and group
Addressed by: https:/
Documentation for alarm state transition flow
Addressed by: https:/
Add alarm rule table in mysql for querying
Gerrit topic: https:/