This will be a short post, mainly to point out where to look if you ever run into this issue.
Recently I was tasked with adding a database node to Exadata Cloud Service X8M. The new X8M has dynamic deployment options, so if you need more storage or more compute power for your database nodes, you can just add it.
Sve has already written good posts on provisioning X8M and on scaling X8M, so read the details from there!
My issue with scaling was that after the scaling event started, it basically hung! Nothing happened and the work requests seemed to be stalled. I had done the same action on another X8M just a few days prior, so I thought something was odd.
Since debugging options are fairly limited (and at that point I wasn’t sure which log file to look at), I created an SR to have this looked into. Around the same time we received an email from OCI:
But it can’t be the security lists, since I had looked through them multiple times! Or can it? Going through them one more time, I noticed a typo in the CIDR block, which I then corrected. The network requirements are defined in the OCI documentation, which I always use as a reference.
The general rules cover ICMP, the Service Gateway, and ports 22, 6200 and 1521. But there is also a note about X8M and scaling:
For X8M systems, Oracle recommends that all ports on the client subnet need to be open for ingress and egress traffic. This is a requirement for adding additional database servers to the system.
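A typo in a CIDR block is easy to miss by eye, so a small check can help before starting a scaling event. Below is a minimal sketch using Python's standard ipaddress module. The function name check_rule and the example addresses are made up for illustration, and it assumes you have already copied the rule's CIDR and the client subnet's CIDR out of the console by hand. It does not talk to OCI at all.

```python
import ipaddress


def check_rule(rule_cidr: str, subnet_cidr: str) -> str:
    """Compare a security rule's CIDR block against the client subnet.

    Returns "ok" when the rule's block fully contains the subnet,
    otherwise a short description of what looks wrong.
    """
    try:
        # strict=True rejects blocks with host bits set, e.g. 10.0.1.0/16,
        # which is a common result of mistyping the prefix length.
        rule = ipaddress.ip_network(rule_cidr, strict=True)
    except ValueError as exc:
        return f"invalid CIDR {rule_cidr!r}: {exc}"

    subnet = ipaddress.ip_network(subnet_cidr, strict=True)

    # A valid but mistyped block (wrong octet) parses fine, so also
    # verify that it really covers the client subnet.
    if rule.version != subnet.version or not subnet.subnet_of(rule):
        return f"{rule_cidr} does not cover {subnet_cidr}"
    return "ok"


if __name__ == "__main__":
    client_subnet = "10.0.1.0/24"  # hypothetical client subnet
    for cidr in ("10.0.1.0/24", "10.0.11.0/24", "10.0.1.0/16"):
        print(cidr, "->", check_rule(cidr, client_subnet))
```

This only catches the two kinds of typo it tests for: a block that is not a valid network, and a valid block that does not contain the subnet. Protocol and port settings still need to be checked against the documentation.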
This is in general what I’ve seen done in many implementations anyway: size the subnet for Exadata only, as per the network requirements, and then open all traffic within the subnet. This time the typo had made that fail. The only problem was that nothing happened after I opened the ports, and the work request continued to hang!
We had a rather long conversation with support, as they didn’t believe me. Luckily, you can get the log file addNodeActions*.log under /u01/app/oraInventory/logs, which showed the error AND also showed that nothing was running at the moment.
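If you want to check that log quickly, something like the following sketch can help. The directory and file pattern come from above, but the keywords are my assumption about what an error line might contain, so adjust them to what you actually see in the file. The helper names are also just for illustration.

```python
import glob
import os
import time

LOG_DIR = "/u01/app/oraInventory/logs"    # location mentioned in this post
PATTERN = "addNodeActions*.log"           # file name pattern from this post
KEYWORDS = ("ERROR", "SEVERE", "FATAL")   # assumption: adjust to your log


def latest_log(log_dir=LOG_DIR, pattern=PATTERN):
    """Return the most recently modified matching log, or None if none exist."""
    paths = glob.glob(os.path.join(log_dir, pattern))
    return max(paths, key=os.path.getmtime) if paths else None


def error_lines(path, keywords=KEYWORDS):
    """Return (line number, text) for each line containing one of the keywords."""
    hits = []
    with open(path, errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if any(word in line.upper() for word in keywords):
                hits.append((number, line.rstrip("\n")))
    return hits


def idle_seconds(path):
    """Seconds since the file was last written to."""
    return time.time() - os.path.getmtime(path)


if __name__ == "__main__":
    log = latest_log()
    if log is None:
        print("no addNodeActions log found")
    else:
        print(f"{log} last written {idle_seconds(log):.0f} seconds ago")
        for number, text in error_lines(log):
            print(f"{number}: {text}")
```

A log that has not been written to for a long time is only a hint that nothing is running, not proof, but together with the error lines it is useful evidence to attach to the SR.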
Once that was confirmed, support restarted the workflow and everything completed smoothly within the normal timeframe.
Summary
A small mistake, but it took some time to resolve, so always double-check the network rules before scaling! Also, additional log files for the scaling event itself can be found on node 1 under /u01/app/oraInventory/logs.
Apart from that, the experience was positive, both with the email we received about the scaling event and with how fast adding a node is overall, as it takes only 4-5 hours!