Some time ago I published a post where I took a test drive with OCI Autoscaling. Now, after an eventful summer, I think it’s time to retest Autoscaling, mainly because of some enhancements we have seen. If you want more details on Autoscaling, please read the older post and the remarks I made there first, then continue with this post.
What made me write this post is the release of Terraform 0.12, which required me to partly rewrite the earlier code. In addition, the Oracle Events service has gone GA, and just last week several Compute-related events were added to it. Now we can get notified during an autoscaling event! I also want to see whether there are any performance improvements on the monitoring side of scaling events.
My Terraform code for this test can be found here and can be freely used in your own tests.
It deploys a few resources, so if you plan to run a similar test, please note there is a marginal cost involved. The deployed resources are almost the same as in the first post:
- Public load balancer and a backend set which has instance pool as destination
- Two public subnets for the load balancer and jump server including route table and security list (ports 22 and 80 are open)
- One private subnet for the instance pool instances with route table and security list (ports 22 and 80 are open)
- Instance configuration with a standard Linux image, an instance pool with a minimum of two and a maximum of four servers, and an autoscaling configuration which acts when there is a certain CPU load on the servers
- Notification topic and an Events rule that matches instance launch, instance termination and autoscaling events (NEW)
Notifications and Events
The Terraform package creates a notification topic, “MyScalingNotification”, which I need to subscribe to through the Console once the Terraform script has finished. After confirming the subscription through my email, I will receive notifications on this topic.
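For reference, the topic and an email subscription could also be declared in Terraform. This is only a minimal sketch and not necessarily what my repository does: the resource labels and the variables compartment_ocid and notification_email are assumptions made for illustration. As far as I know, an email subscription created this way still has to be confirmed from the mailbox before notifications are delivered.

```hcl
variable "compartment_ocid" {}    # assumed variable name
variable "notification_email" {}  # assumed variable name

# Notification topic with the name used in this post
resource "oci_ons_notification_topic" "scaling" {
  compartment_id = var.compartment_ocid
  name           = "MyScalingNotification"
}

# Email subscription to the topic; it stays pending until the
# confirmation link in the email has been clicked
resource "oci_ons_subscription" "email" {
  compartment_id = var.compartment_ocid
  topic_id       = oci_ons_notification_topic.scaling.id
  protocol       = "EMAIL"
  endpoint       = var.notification_email
}
```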
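The script also creates an Events rule that matches the following event types:
- com.oraclecloud.computeapi.terminateinstance.begin
- com.oraclecloud.autoscaling.scalingaction
- com.oraclecloud.computeapi.launchinstance.begin
This way we get notified not only on a scaling action but also when an instance in the instance pool gets terminated. You can also filter events with more specific rules, as described in the documentation.
One thing to note is that in Terraform the rule condition is a single JSON string, so multiple event types are listed together in one eventType array, like this: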
variable "rule_condition" {
default = "{\"eventType\": [\"com.oraclecloud.computeapi.terminateinstance.begin\", \"com.oraclecloud.autoscaling.scalingaction\", \"com.oraclecloud.computeapi.launchinstance.begin\"]}"
}
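With Terraform 0.12 the hand-escaped string can be avoided by building the same condition with the built-in jsonencode function. The following is a minimal sketch under a few assumptions: it uses a local value instead of a variable (variable defaults cannot call functions), and the rule display name and the variables compartment_ocid and topic_ocid are hypothetical names chosen for illustration.

```hcl
variable "compartment_ocid" {}  # assumed variable name
variable "topic_ocid" {}        # assumed: OCID of the notification topic

locals {
  # Same three event types as in the escaped string, without the backslashes
  rule_condition = jsonencode({
    eventType = [
      "com.oraclecloud.computeapi.terminateinstance.begin",
      "com.oraclecloud.autoscaling.scalingaction",
      "com.oraclecloud.computeapi.launchinstance.begin",
    ]
  })
}

resource "oci_events_rule" "scaling_events" {
  compartment_id = var.compartment_ocid
  display_name   = "MyScalingEventsRule"  # hypothetical name
  is_enabled     = true
  condition      = local.rule_condition

  actions {
    actions {
      action_type = "ONS"  # deliver matching events to a Notifications topic
      is_enabled  = true
      topic_id    = var.topic_ocid
    }
  }
}
```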
Testing notifications
As a first action I want to terminate an instance and see if I get notified about it. In the Console, under Compute – Instance Pools, I pick one of the two instances I have running and terminate it. Within a few minutes I get an email with a subject of “OCI Event Notification :com.oraclecloud.computeapi.terminateinstance.begin”.
After this I wait: the instance pool has its minimum set to two servers, so I should see an event when a replacement instance is launched. After 12 minutes I see an instance being launched and at the same time I receive an email with a subject of “OCI Event Notification :com.oraclecloud.computeapi.launchinstance.begin”.
This seems to be as slow as it was earlier. I would expect customers to want a replacement instance spawned faster than in 12 minutes. The delay is at least partly caused by the load balancer health check, where the minimum time is 300 seconds, but even so it seems to take two rounds of health checks to get an instance up and running.
Next I test autoscaling by running the “stress” utility (stress -c 2) from the command line on one of the hosts in the load balancer backend. After a while I should see a scaling event and get a notification for it. Autoscaling is set to trigger a scale-out event when the CPU threshold of 10 is reached and a scale-in event when CPU goes lower than 10.
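For context, an autoscaling configuration along these lines can be sketched in Terraform as follows. Treat it as an illustration under assumptions rather than the exact code from my repository: the policy and rule names, pool limits, cooldown and threshold value match this post, while the resource label, the display name MyScalingConfig, the variables compartment_ocid and instance_pool_ocid, the GT/LT operators and the step size of one instance are assumptions.

```hcl
variable "compartment_ocid" {}    # assumed variable name
variable "instance_pool_ocid" {}  # assumed: OCID of the instance pool

resource "oci_autoscaling_auto_scaling_configuration" "scaling" {
  compartment_id       = var.compartment_ocid
  display_name         = "MyScalingConfig"  # hypothetical name
  cool_down_in_seconds = 300
  is_enabled           = true

  auto_scaling_resources {
    id   = var.instance_pool_ocid
    type = "instancePool"
  }

  policies {
    display_name = "MyScalingPolicy"
    policy_type  = "threshold"

    capacity {
      initial = 2
      min     = 2
      max     = 4
    }

    # Add one instance when CPU utilization is above the threshold
    rules {
      display_name = "MyScaleOutRule"
      action {
        type  = "CHANGE_COUNT_BY"
        value = 1
      }
      metric {
        metric_type = "CPU_UTILIZATION"
        threshold {
          operator = "GT"  # assumed operator
          value    = 10
        }
      }
    }

    # Remove one instance when CPU utilization is below the threshold
    rules {
      display_name = "MyScaleInRule"
      action {
        type  = "CHANGE_COUNT_BY"
        value = -1
      }
      metric {
        metric_type = "CPU_UTILIZATION"
        threshold {
          operator = "LT"  # assumed operator
          value    = 10
        }
      }
    }
  }
}
```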
After around 6 minutes of running stress I get an email with a subject of “OCI Event Notification :com.oraclecloud.autoscaling.scalingaction”, and in the email content I can see the following:
"additionalDetails" : {
"policyName" : "MyScalingPolicy",
"ruleName" : "MyScaleOutRule",
"actionType" : "SCALE_OUT",
"previousSize" : 2,
"newSize" : 3
}
I stop running stress and wait for the 300 second cooldown period to finish. The cooldown period is important: it prevents additional scaling events from firing right away and gives the instance pool time to observe what the right capacity is with the new servers running.
After the cooldown period I get an email with a subject of “OCI Event Notification :com.oraclecloud.autoscaling.scalingaction”, and in the content I now see the action type SCALE_IN.
"additionalDetails" : {
"policyName" : "MyScalingPolicy",
"ruleName" : "MyScaleInRule",
"actionType" : "SCALE_IN",
"previousSize" : 3,
"newSize" : 2
}
Summary
This was a really simple addition to the existing autoscaling case I had, and it’s great that Oracle has made integrating services so easy. Rewriting the initial Terraform script to support version 0.12 wasn’t too complicated either. You can get the full code from my repository through the link at the start of the post. It doesn’t have all the latest things I’ve done with dynamic blocks, but I will try to add those eventually as well.
Overall I think scaling events work in a similar fashion as they did earlier. They seem to respond to thresholds within the 300 second limit, but replacing a suddenly terminated instance still easily takes more than 10 minutes. It would be great if we could set health check and cooldown times below 300 seconds for a faster response; I would assume customers want this.
The action in an Events rule can also be a Function, so instead of just getting notified you can run additional configuration or processes using the Functions service. This would be really good for advanced requirements!
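As a rough sketch of that idea, an Events rule can point at a function instead of a topic by using the FAAS action type. I have not tested this as part of this post. The resource label, display name and the variables compartment_ocid and function_ocid are hypothetical, and the function itself would have to be deployed separately.

```hcl
variable "compartment_ocid" {}  # assumed variable name
variable "function_ocid" {}     # assumed: OCID of an already deployed function

resource "oci_events_rule" "scaling_to_function" {
  compartment_id = var.compartment_ocid
  display_name   = "MyScalingFunctionRule"  # hypothetical name
  is_enabled     = true

  # Match only autoscaling actions
  condition = jsonencode({
    eventType = ["com.oraclecloud.autoscaling.scalingaction"]
  })

  actions {
    actions {
      action_type = "FAAS"  # invoke a function instead of sending a notification
      is_enabled  = true
      function_id = var.function_ocid
    }
  }
}
```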