Reboot ElastiCache clusters
Using Site24x7's IT automation framework, you can create an action profile that automatically reboots your monitored Redis- and Memcached-compatible cache nodes or clusters and flushes all keys. You can reboot some or all cache nodes in a cluster, or reboot the entire cluster.
Required Permissions
Make sure the IAM role assumed by Site24x7, or the IAM user created for Site24x7, has the following write-level action in its attached policy document.
- elasticache:RebootCacheCluster
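As an illustration, the permission above could be expressed as a standalone IAM policy statement. The sketch below builds such a policy document in Python; the function name `build_reboot_policy`, the `Sid` value, and the default `"*"` resource are assumptions made for this example, and you may prefer to narrow the resource to specific cluster ARNs.

```python
import json


def build_reboot_policy(resource_arn: str = "*") -> dict:
    """Return an IAM policy document that allows only the reboot action.

    resource_arn defaults to "*" (all clusters); this default is an
    assumption for illustration, not a Site24x7 requirement.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowElastiCacheReboot",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": ["elasticache:RebootCacheCluster"],
                "Resource": resource_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the policy attached
    # to the Site24x7 IAM role or user.
    print(json.dumps(build_reboot_policy(), indent=2))
```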
Constraints
- Rebooting cache clusters is only supported for the Memcached cache engine type.
- To perform the action, the cache nodes or cluster must be in a running state and must be monitored by Site24x7.
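For context, the reboot maps onto the AWS `RebootCacheCluster` API that the permission above grants. The following is a minimal sketch of that call with a state check first, not Site24x7's own implementation. It assumes that the "running" state corresponds to the `available` status reported by AWS, and that `client` behaves like `boto3.client("elasticache")`; the helper name `reboot_cache_nodes` is hypothetical.

```python
from typing import Any, List, Optional, Sequence


def reboot_cache_nodes(
    client: Any,
    cluster_id: str,
    node_ids: Optional[Sequence[str]] = None,
) -> List[str]:
    """Reboot some or all nodes of one cache cluster.

    client is expected to expose describe_cache_clusters and
    reboot_cache_cluster, as boto3's ElastiCache client does.
    When node_ids is None, every node in the cluster is rebooted.
    """
    response = client.describe_cache_clusters(
        CacheClusterId=cluster_id, ShowCacheNodeInfo=True
    )
    cluster = response["CacheClusters"][0]

    # Assumption: "running" in the constraint means AWS status "available".
    status = cluster["CacheClusterStatus"]
    if status != "available":
        raise RuntimeError(f"cluster {cluster_id} is '{status}', not 'available'")

    if node_ids is None:
        node_ids = [node["CacheNodeId"] for node in cluster.get("CacheNodes", [])]

    targets = list(node_ids)
    client.reboot_cache_cluster(
        CacheClusterId=cluster_id, CacheNodeIdsToReboot=targets
    )
    return targets
```

With boto3 installed and credentials configured, a call might look like `reboot_cache_nodes(boto3.client("elasticache"), "my-memcached-cluster")`, where the cluster ID is a placeholder.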
Create an action profile
- Log in to the Site24x7 web console and select Admin > IT Automation Templates.
- Click the drop-down and select the action to be performed: cluster-level or node-level reboot for the Memcached engine type, or node-level reboot for the Redis engine type.
- Provide an appropriate display name for identification.
- The action to be performed is prepopulated in the field below.
- Next, click the drop-down to select the cache nodes or clusters that require a reboot. (You can choose the $LOCALHOST option to execute the action on all mapped cache clusters or nodes.)
- Max Allowed Action Execution Time: The maximum number of seconds Site24x7 waits before the request times out. The default execution time is 15 seconds. You can define an execution time between 1 and 90 seconds.
- Send the Automation Result via Email: Toggle to Yes to receive the automation result by email. The email is sent to the User Alert Group configured in the Notification Profile and contains parameters including the automation name, type of automation, incident reason, destination hosts, and more.
- Save the profile.
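To summarize the settings collected in the steps above, here is a small sketch that models them and enforces the documented limits (a default of 15 seconds and a range of 1 to 90 seconds). The class `RebootActionProfile` and all of its field names are hypothetical and exist only for illustration; they are not part of any Site24x7 API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RebootActionProfile:
    """Hypothetical in-memory model of the action profile form."""

    display_name: str
    action: str                      # e.g. the reboot action picked in the drop-down
    targets: List[str]               # cache nodes/clusters, or ["$LOCALHOST"]
    max_execution_seconds: int = 15  # documented default
    email_result: bool = False       # "Send the Automation Result via Email"

    def __post_init__(self) -> None:
        if not self.display_name.strip():
            raise ValueError("display_name must not be empty")
        if not self.targets:
            raise ValueError("select at least one target or $LOCALHOST")
        # The documented range is 1 to 90 seconds, inclusive.
        if not 1 <= self.max_execution_seconds <= 90:
            raise ValueError("max_execution_seconds must be between 1 and 90")
```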
Simulate the Automation
Before mapping the action profile, you can test it by invoking the operational task manually within the Site24x7 console or by using our REST APIs. This checks whether the write-level permissions required to execute the reboot action are in place. To test, navigate back to the IT Automation summary page and execute a dry run of the profile from there.
Map the Action Profile
To execute the automation, map the action profile to a desired alert event. You can map the profile either to a predefined monitor-level event type or to a custom attribute-level event type.
Monitor level mapping
Navigate to the Edit Monitor page of the monitored ElastiCache node or cluster and map the action profile to any of the following monitor status changes.
- Execute on Down
- Execute on Up
- Execute on Trouble
- Execute on any Status Change
Attribute level mapping
You can also associate the action profile with any metric data point of a monitored AWS resource or application, such as CPU usage, connections, or read/write IOPS. Navigate to the Edit threshold profile page of the monitored AWS resource or application service (open the Edit Monitor page of the resource, then click the Pencil icon adjacent to the Threshold and Availability field) and map the profile to any desired attribute by clicking the "Select Automation to Execute" field.