It is tempting to consider the cloud a limitless pool of resources that is always available for provisioning our services. However, that is of course too simplistic. The cloud is someone else’s computer. It consists of real hardware: servers, storage components, a lot of networking, power and cooling. It lives in a real building with actual people taking care of it. It is optimized, professionalized, consolidated, automated. But not limitless, as we are experiencing very concretely at this very moment.
The resource pool of the cloud is exhausted: for several customers we could not create new resources, scale existing resources up or out, or upgrade our zonal implementation to a zone-redundant (cross availability zone) setup. We have also received word from other parties who are struggling with the same limitations (in the Azure West Europe Region). We have logged the incident with Microsoft; the problem is acknowledged and we are promised monthly updates on the progress towards a resolution.
More recently, Microsoft has suggested we can engage in conversations regarding “cloud resource capacity management” to discuss how we expect to provision new resources in the coming period.
One workaround is to create resources in other regions. For various reasons – latency chief among them – this is less than perfect. Clearly this is far from the ideal situation where new resources can be created at any moment, without lead time. Our process and our cloud strategy and design need to cater for this eventuality (and yours too). To prevent innovation from coming to a halt we need ways to circumvent the freeze in this one Azure Region. The main options we are considering, in addition to our current region, are: use another region, use another public cloud, or use on-premises resources.
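To make the "also use another region" option a bit more concrete, here is a minimal sketch of a fallback strategy: try the preferred region first and, when it reports no capacity, move on to the next-closest region that still meets a latency budget. Everything in it is an assumption for illustration: `CapacityError`, `provision_with_fallback` and `fake_create` are hypothetical names, the latency figures are placeholders rather than measurements, and no real Azure API is called. In practice `create` would wrap whatever provisioning tooling you use.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


class CapacityError(Exception):
    """Hypothetical error: the region cannot allocate the requested resource."""


@dataclass(frozen=True)
class Region:
    name: str
    latency_ms: float  # placeholder value; measure this from where your users are


def provision_with_fallback(
    regions: Iterable[Region],
    create: Callable[[str], str],
    max_latency_ms: float,
) -> Optional[str]:
    """Try acceptable regions from lowest to highest latency.

    Returns the id of the first resource created, or None if every
    acceptable region is out of capacity.
    """
    candidates = sorted(
        (r for r in regions if r.latency_ms <= max_latency_ms),
        key=lambda r: r.latency_ms,
    )
    for region in candidates:
        try:
            return create(region.name)
        except CapacityError:
            continue  # region is full, try the next-closest one
    return None


def fake_create(region_name: str) -> str:
    # Stand-in for a real provisioning call; pretends West Europe is full.
    if region_name == "westeurope":
        raise CapacityError(region_name)
    return f"vm-in-{region_name}"


regions = [
    Region("westeurope", 5.0),
    Region("northeurope", 18.0),
    Region("germanywestcentral", 12.0),
    Region("eastus", 85.0),
]
result = provision_with_fallback(regions, fake_create, max_latency_ms=50.0)
print(result)  # vm-in-germanywestcentral
```

The latency budget is what keeps this honest: a region on another continent may have capacity, but if it exceeds the budget the function returns nothing rather than silently placing the workload somewhere unsuitable.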
Details on the current situation – that we first ran into in December 2023
When the cloud does not allow creation or extension of resources due to lack of capacity, you get a strong reminder of the fact that the cloud is not this magical, infinite pool of resources you can easily make yourself believe it is. It is a physical building with power and cooling, racks and servers and people operating them. And it is finite. At some point the building is full or at least the servers are all in use. Or the staff is at the limit of what it can handle.
Microsoft obviously is larger than your own IT department ever was, with more automation presumably than you ever had in place. But in the end, physical resources are required. They need to be bought, shipped, installed, connected, operated. And there are physical constraints. I am sure the popularity of the West Europe Region is a factor and perhaps Microsoft has been overwhelmed by demand. However understandable, it has now become our problem (and our customers’ problem). And we cannot go into the Azure data center and set up our own resources. We have to wait for Microsoft to resolve their issues. It is their computer that we rely on.
When I tweeted about our issue, I received the response shown below.
Recently I came across this article that describes a similar situation where new resources could not be created because some of the Availability Zones in West Europe are full. This image of a very revealing Reddit-thread is from that article:
The problem is quite widespread and has existed for some time. Azure West Europe is not the only region suffering. Other regions mentioned in the thread, though I do not have personal experience with these, are UK South (Europe) and Canada Central, East US and East US 2 (North America). Using other regions is the only workaround available within the Azure environment – although not all regions offer the same features (yet) and some are more expensive than others.
As I understand it, the problems are expected to continue into the Summer of 2024. Meanwhile, construction is ongoing for a huge extension of the current Microsoft data center facility in North Holland (permission was granted by the province in April 2023). This Microsoft webpage describes the Data Centers in Hollands Kroon (Middenmeer).
“Microsoft datacenters in the Netherlands currently employ 334 people. Since construction started in 2013, more than 12.3 million hours have been worked on construction projects, with an average of 687 construction jobs per year. We anticipate it will take more than 3.7 million hours across an estimated 600 annual construction jobs to complete the new datacenter facilities. By the end of 2026, we project 382 full-time employees and contractors will work across all operational facilities.”