AI Supply Chain - Different Types of Data Centers & Neoclouds
- brencronin
- May 14
- 10 min read
Updated: May 15
Understanding Cloud Providers and Their Relationship to Data Centers
The term "The Cloud" often refers to the vast network of computers distributed across global data centers, delivering compute power, storage, and services over the internet. However, "the cloud" encompasses a wide range of compute & services delivery, all underpinned by large-scale data center infrastructure.
At a foundational level, cloud service delivery is typically categorized into a few primary models:
IaaS (Infrastructure as a Service) – Provides virtualized hardware resources over the internet, such as servers, networking, and storage (e.g., Amazon EC2, Microsoft Azure VMs).
CaaS (Containers as a Service) - Delivers and manages all the hardware and software resources to develop and deploy applications using containers. Sometimes viewed as a subset of IaaS.
PaaS (Platform as a Service) – Delivers development platforms and tools to build, test, and deploy applications without managing the underlying infrastructure.
SaaS (Software as a Service) – Offers fully managed software applications accessible via the internet (e.g., Office 365, Google Workspace).
There are several models used to explain cloud computing responsibilities, with one of the most common being the Shared Responsibility Model. This model outlines how responsibilities are divided between the cloud provider and the customer, depending on the type of compute service being used. It essentially breaks down the technology stack required to deliver a function and clarifies which party is responsible for each layer, whether it's the customer managing it directly or the cloud provider handling it as a paid service.
To illustrate, consider a scenario where a customer builds and manages their own compute environment. They would begin by acquiring, installing, and configuring physical servers, which would also require ongoing maintenance. These servers would need to be networked and connected to storage infrastructure.
On top of this physical layer, virtualization technologies such as Xen, VMware, or Nutanix might be added. Above the virtualization layer sits the operating system (OS), commonly a Linux distribution. The OS supports a runtime environment, such as Java Virtual Machines (JVM), which in turn runs applications. Finally, these applications interact with data and configuration settings essential to delivering business functions.
The diagram below, from the article “PaaS vs. IaaS vs. SaaS vs. CaaS: How are they different?” by ThecloudGirl.dev, illustrates this concept clearly. On the far left, the Traditional On-Premise column represents a scenario where an organization uses no cloud services. In this model, the organization is responsible for acquiring, installing, configuring, and maintaining the entire stack of compute resources and services.
As you move to the right, the next column, Infrastructure as a Service (IaaS), shows a shift in responsibility. In the IaaS model, the cloud provider manages the physical server infrastructure, while the customer handles everything from the operating system upward. This approach primarily eliminates the capital expense and time required to procure and deploy server hardware.
Progressing through each service model, such as Platform as a Service (PaaS) and Software as a Service (SaaS), the cloud provider assumes responsibility for more layers of the stack. With each additional layer managed, customers benefit from reduced operational burden but incur higher service fees in return.
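To make the responsibility split concrete, here is a minimal sketch in Python. The layer names and boundary indices are simplified for this article and are not an official mapping from any provider:

```python
# Illustrative only: layer names and model boundaries are simplified
# for this article and do not come from any provider's documentation.
STACK = [  # ordered from physical infrastructure up to data
    "physical servers", "networking/storage", "virtualization",
    "operating system", "runtime", "application", "data/configuration",
]

# Index of the first layer the customer still manages in each model.
CUSTOMER_MANAGES_FROM = {
    "on-premise": 0,  # customer manages the entire stack
    "IaaS": 3,        # provider handles hardware and virtualization
    "PaaS": 5,        # provider also handles the OS and runtime
    "SaaS": 7,        # provider manages everything
}

def responsibilities(model: str) -> tuple[list[str], list[str]]:
    """Split the stack into (provider-managed, customer-managed) layers."""
    split = CUSTOMER_MANAGES_FROM[model]
    return STACK[:split], STACK[split:]

for model in ("on-premise", "IaaS", "PaaS", "SaaS"):
    _, customer = responsibilities(model)
    print(f"{model:>10} customer manages: {customer or 'nothing'}")
```

Running it prints how the customer's share of the stack shrinks as you move from on-premise toward SaaS, which is exactly the left-to-right progression in the diagram above.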

How do these cloud models relate to AI and Data Centers?
AI workloads vary widely in how they are deployed, which directly impacts the underlying data center infrastructure. Some AI providers build and manage their own dedicated compute infrastructure, while others rent GPU capacity through Infrastructure-as-a-Service (IaaS) or Containers-as-a-Service (CaaS) offerings from major cloud providers. Additionally, a broad ecosystem of AI platforms and software services operates on top of this infrastructure.
The type of AI service and its provisioning model, whether it's self-hosted, cloud-native, or delivered as a managed service, largely determines which data center providers are involved and what kind of infrastructure supports it. This, in turn, introduces varying levels of security considerations, all of which must be assessed when evaluating AI services for risk.
Major Cloud Providers
The largest global cloud providers that offer these services include:
Amazon Web Services (AWS)
Microsoft Azure
Google Cloud Platform (GCP)
Additionally, there are many other large cloud service providers that offer different types of cloud compute services. Some of the other major cloud providers include:
IBM
Oracle Cloud
Rackspace
Digital Ocean
Linode (Akamai)
In China, the leading cloud providers are:
Alibaba Cloud (think of this as China's Amazon)
Tencent Cloud (think of this as China's Facebook/Instagram)
Huawei Cloud (think of this as China's Azure)
Baidu (think of this as China's Google)
To ensure high availability, performance, and global reach, these providers operate massive networks of data centers around the world. For instance, Microsoft Azure alone maintains over 300 data centers across its global footprint. These data centers form the physical backbone of cloud computing, hosting everything from simple websites to complex AI training workloads.
Example of Hyperscaler Data Center Network: Microsoft Azure
Microsoft Azure maintains a globally resilient infrastructure, starting with Geographies: large-scale areas designed to meet specific data residency and compliance requirements. Each Geography contains one or more Regions, which are physically separate locations connected by low-latency networks. Regions serve as fixed data boundaries, making them critical when meeting data residency obligations.
Azure Geographies
The diagram below illustrates Azure geographies, represented by white dots. For example, the United States geography appears on the left and the France geography on the right, with other distinct geographies similarly marked. It’s important to note that these white dots do not represent specific data center locations. Instead, each dot is centrally placed within a country or region to signify that Microsoft recognizes it as a distinct geography. Within each geography, Azure deploys multiple regions and availability zones distributed across the area.

Azure geographies are especially important for data residency and compliance. Microsoft defines a geography as a discrete market, typically containing two or more regions, that preserves data residency and compliance boundaries. This ensures that customer data remains within the specified geography to meet local regulatory, legal, and sovereignty requirements. For organizations operating in regulated industries or under strict compliance mandates (such as GDPR in the EU or HIPAA in the U.S.), selecting the appropriate geography is a critical consideration in their cloud strategy.
Azure Regions and Availability Zones
Azure currently operates more than 60 regions worldwide, many of which comprise multiple Availability Zones.
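As a rough mental model of this hierarchy, the sketch below nests zones inside regions inside geographies. The geography and region names are real Azure examples, but the structure and zone counts are simplified for illustration:

```python
# A toy model of Azure's resiliency hierarchy, for illustration only.
# Geography and region names are real Azure examples; zone counts are
# simplified (regions with zone support typically offer three zones).
AZURE_HIERARCHY = {
    "United States": {                  # geography = residency/compliance boundary
        "West US 2": ["1", "2", "3"],   # region = physically separate location
        "East US": ["1", "2", "3"],     # zones = isolated datacenter groups
    },
    "France": {
        "France Central": ["1", "2", "3"],
    },
}

def zone_count(geography: str) -> int:
    """Count availability zones reachable without data leaving the geography."""
    return sum(len(zones) for zones in AZURE_HIERARCHY[geography].values())

print(zone_count("United States"))  # -> 6
```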

Azure Availability Zones
Azure Availability Zones are physically separate groups of datacenters within a region, each with independent power, cooling, and networking. This isolation ensures that if one zone experiences an outage, the other zones can continue operating, enhancing overall fault tolerance and high availability.
Within a region, Availability Zones are interconnected via a high-performance, low-latency network. Microsoft targets round-trip latency of less than 2 milliseconds between zones, which enables services that require extremely low-latency replication.
Azure supports two types of Availability Zone deployments (a short sketch of the difference appears below):
Zone-redundant: Resources are automatically distributed across multiple zones. For example, zone-redundant storage replicates data across zones so that a failure in one does not impact data availability.
Zonal: Resources are pinned to a specific Availability Zone selected by the user. While this can provide slightly lower latency due to localized deployment, it offers less resilience against zone-wide failures such as power outages or natural disasters.
Not all Azure services support Availability Zones or zone-redundant deployments. Microsoft maintains an up-to-date list of services with zone support, including whether they support zonal or zone-redundant configurations. For details, refer to: Azure Services with Availability Zone Support
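To make the zonal versus zone-redundant distinction concrete, here is a toy simulation in Python. The failure model is invented purely for illustration; it is not how Azure actually places or fails over resources:

```python
import random

ZONES = ["1", "2", "3"]

def survives_outage(deployment: str, pinned_zone: str = "1") -> bool:
    """Toy check: is the resource still available after one random zone fails?"""
    failed = random.choice(ZONES)
    if deployment == "zone-redundant":
        return True                 # replicas remain in the surviving zones
    return pinned_zone != failed    # zonal: survives only if another zone failed

# Estimate availability across many simulated single-zone outages.
TRIALS = 10_000
for deployment in ("zone-redundant", "zonal"):
    survived = sum(survives_outage(deployment) for _ in range(TRIALS))
    print(f"{deployment:>15}: {survived / TRIALS:.0%} of outages survived")
```

With three zones, the zonal deployment survives roughly two-thirds of simulated single-zone outages, while the zone-redundant deployment survives all of them, which is the resilience trade-off described above.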
The diagram below illustrates the Azure US West 2 region, which is primarily located in Quincy, Washington, a small town in central Washington State. Microsoft operates expansive, multi-building data center campuses in the area and may also lease space from other data center providers to meet the infrastructure diversity required for three separate Availability Zones within the region. Quincy was chosen as a strategic location for several compelling reasons, including low-cost power, tax incentives to build in rural areas, land availability, and proximity to high-speed fiber routes.

Building Data Centers and Different Deployment Models
Data center construction is a large-scale, specialized industry. Many construction firms focus exclusively on building data centers, following a common model in which they develop the facilities and lease capacity to hyperscalers or other enterprise clients. Several variations of this model exist:
Full Turnover: The construction company builds the entire data center and transfers full ownership and operational control to the hyperscaler.
Partial Operational Control: The construction firm builds the facility and hands over the interior (all data halls) to the hyperscaler or client, but continues to operate critical infrastructure such as power, cooling, and building management systems.
Leased Space: The client or hyperscaler leases only a portion of the data center (e.g., specific data halls), while the rest remains under the control of the data center provider.
This distinction is important in the context of the AI infrastructure supply chain. While an AI company may operate the compute resources inside a data center, the underlying facility, along with its environmental controls and physical infrastructure, may be managed by a completely separate entity. This separation introduces potential vulnerabilities and risks at the facility layer, and with the operator running it, both of which may be outside the direct control of the AI company.
Other AI Data Center Players
Beyond traditional cloud service providers, several major AI firms also require massive compute capacity and supporting infrastructure for their AI workloads. These include companies like OpenAI, xAI, Anthropic, and others.
These organizations typically face three main infrastructure options:
Build their own data centers for full control and customization.
Lease physical space from existing data center providers.
Consume Infrastructure as a Service (IaaS) by renting GPU-based compute from cloud platforms that offer AI-optimized infrastructure.
A prime example is OpenAI, which initially partnered with Microsoft. As part of Microsoft's investment, OpenAI utilized Microsoft’s GPU infrastructure hosted within Azure data centers. More recently, OpenAI has begun diversifying its infrastructure strategy by tapping into Oracle's GPU compute capabilities and reportedly planning to build its own large-scale data centers.
The explosive demand for AI compute has also given rise to a new category of providers known as "Neoclouds": cloud service providers focused specifically on delivering high-performance infrastructure for AI workloads.
What Are AI Neoclouds?
The term "Necloud" combines "neo", meaning "new" or "young" (from the Greek neos), with "cloud", to describe a new generation of cloud providers purpose-built for AI and high-performance computing (HPC) workloads. The name may also evoke the character Neo from The Matrix, symbolizing disruption and transformation, both fitting metaphors for these emerging players.
Neoclouds differ from traditional hyperscalers by offering specialized infrastructure optimized for AI performance, efficiency, and cost-effectiveness. These providers are built from the ground up to handle the unique demands of training and running large AI models.
Their origins vary, but many evolved from adjacent industries:
CoreWeave and Crusoe began as data center operators serving the crypto mining industry. As the demand for high-performance compute shifted to AI, they pivoted their business models to offer GPU-backed AI infrastructure.
Nebius, a European Neocloud provider, originated as a spinoff of Yandex, the Russian search engine company, and now delivers AI cloud services primarily in Europe.
Lambda Labs foresaw the surge in AI demand and positioned itself early with a focus on providing scalable GPU compute for deep learning workloads.
NVIDIA, the dominant manufacturer of GPUs used in AI, launched its own IaaS platform, NVIDIA DGX Cloud (formerly referred to as NVIDIA Neo), to capture more of the value being generated by its hardware.
Even venture capital firms have begun backing or launching their own Neocloud platforms to accelerate AI startup growth. Examples include Andromeda and ComputeFunds.ai, which provide targeted infrastructure to their AI-focused portfolios.
Understanding Different AI Workloads and Data Center Location Strategies
The physical location of AI data centers varies significantly depending on the type of AI workload being supported and data center design. Two primary categories of workloads, AI Training and AI Inference, have different performance and infrastructure requirements, which influence where they are deployed.
AI Training
AI training is the process of teaching models to recognize patterns and relationships by processing large datasets. In the case of large language models (LLMs), this involves consuming vast volumes of data, both open-source and proprietary, to develop models with billions or even trillions of parameters. Training is compute-intensive and sustained, making it ideal for deployment in hyperscale data centers located in rural or remote areas. These sites are preferred for several reasons (a toy site-scoring sketch follows this list):
They offer ample physical space for large-scale campuses.
Power costs tend to be lower due to proximity to renewable energy sources like hydro or solar.
There are fewer constraints related to latency, since training does not require real-time interaction with end users.
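One way to see why these factors point toward rural sites is a toy scoring function. Everything below, including the weights and candidate-site numbers, is invented for illustration; real site selection involves far more variables:

```python
# Hypothetical illustration: the factor names mirror the list above, but
# every weight and candidate-site number below is invented for this sketch.
# Each factor is scored 0..1, where higher is better for the operator
# (e.g., "power" is high where electricity is cheap and reliable).
TRAINING_WEIGHTS = {"power": 0.45, "land": 0.35, "proximity_to_users": 0.20}
INFERENCE_WEIGHTS = {"power": 0.15, "land": 0.10, "proximity_to_users": 0.75}

def site_score(site: dict, weights: dict) -> float:
    """Weighted sum of factor scores; higher means a better site fit."""
    return sum(weights[factor] * site[factor] for factor in weights)

rural = {"power": 0.9, "land": 0.9, "proximity_to_users": 0.2}
urban = {"power": 0.3, "land": 0.2, "proximity_to_users": 0.9}

for name, site in (("rural", rural), ("urban", urban)):
    print(f"{name}: training={site_score(site, TRAINING_WEIGHTS):.2f}, "
          f"inference={site_score(site, INFERENCE_WEIGHTS):.2f}")
```

Under these made-up numbers the rural site wins for training while the urban site wins for inference, which mirrors the deployment patterns described in the rest of this section.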
Examples of model training locations
Microsoft: Microsoft’s partnership with OpenAI includes housing massive training workloads in large Azure data center campuses such as West Des Moines, Iowa, and Quincy, Washington. These rural locations offer inexpensive electricity (hydro power in Quincy's case), reliable power infrastructure, and large tracts of land for expansive GPU clusters, ideal for long-duration model training.
OpenAI: OpenAI’s early GPT training efforts were largely hosted on Azure infrastructure. However, it has since begun to diversify by leveraging Oracle’s GPU Cloud and is reportedly planning its own hyperscale data centers, including a project codenamed “Stargate”, a multibillion-dollar initiative expected to be dedicated to training future frontier models.
Google: Google uses its TPU (Tensor Processing Unit) clusters in locations such as Council Bluffs, Iowa, to train its large AI models. These data centers are massive and power-efficient, optimized for intense training workloads. Google's deep integration of custom silicon also helps maximize performance per watt.
Example: the large number of Microsoft data centers in West Des Moines, Iowa, used for AI training.

A single large data center campus in West Des Moines, Iowa, used for AI training.

AI Inference
AI inference, by contrast, is the process where trained models generate outputs, such as answering questions or making predictions, based on new, unseen data. Inference workloads are typically short, bursty, and latency-sensitive, especially when accessed via chatbots or APIs by globally distributed users and systems.
To reduce latency and improve responsiveness, AI inference workloads are increasingly being deployed through Content Delivery Networks (CDNs) or edge nodes, placing the AI compute as close to the end users as possible. This geographical distribution is critical for delivering fast, seamless AI experiences at scale.
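A simplified picture of that routing decision is sketched below. The node names and latencies are made up, and real CDNs typically steer traffic with anycast or DNS-based load balancing rather than an explicit lookup like this:

```python
# Toy edge-routing sketch: node names and latencies (ms) are invented.
EDGE_LATENCY_MS = {"us-west": 12, "us-east": 70, "eu-west": 140}

def pick_inference_node(latencies: dict) -> str:
    """Send the request to the edge node with the lowest observed latency."""
    return min(latencies, key=latencies.get)

print(pick_inference_node(EDGE_LATENCY_MS))  # -> "us-west"
```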
While an AI training run requires immense compute power compared to an individual inference transaction, inference generates a far higher volume of requests over time. Because each inference task is cheap but usage is widespread across consumer and enterprise applications, cumulative inference demand often surpasses training demand in the long run.
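A back-of-envelope calculation makes the shape of that argument visible; every number in this sketch is invented purely for illustration:

```python
# Back-of-envelope only: every number below is invented to show the shape
# of the argument, not to describe any real model or provider.
training_gpu_hours = 5_000_000     # one-time compute cost of a training run
gpu_seconds_per_request = 2        # compute cost of a single inference call
requests_per_day = 50_000_000      # sustained request volume at scale

inference_gpu_hours_per_day = requests_per_day * gpu_seconds_per_request / 3600
days_to_match = training_gpu_hours / inference_gpu_hours_per_day

print(f"inference burn: {inference_gpu_hours_per_day:,.0f} GPU-hours/day")
print(f"inference matches the training run after ~{days_to_match:.0f} days")
```

At these made-up rates, cumulative inference compute overtakes the one-time training run in roughly six months.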
Evolving Data Center Design Considerations for AI Training
Another key factor in choosing data center locations for AI training is the efficiency of compute power, which is directly influenced by data center design. As AI workloads become more demanding, newer data centers are being engineered to support higher power densities, improved cooling systems, and optimized layouts that enable more effective deployment of high-performance compute (HPC) infrastructure.
Organizations that rely on massive AI training workloads are increasingly shifting their operations to next-generation data centers to take advantage of these efficiency gains. In many cases, the performance improvements are significant enough to justify moving key AI workloads to newly built facilities.
The AI industry's rapid pace of advancement is also driving major changes mid-construction. For instance:
Meta (Facebook’s parent company) reportedly demolished recently built data center structures to make way for redesigned, more AI-optimized facilities.
Microsoft has delayed or paused construction on several data centers to re-evaluate architectural designs, ensuring the buildings can support the latest HPC and AI workloads more efficiently.
References
PaaS vs. IaaS vs. SaaS vs. CaaS: How are they different?
Azure global infrastructure
Azure geographies
What are Azure regions?
Datacenter architecture and infrastructure
Microsoft Datacenters
What are availability zones?
Azure Availability Zones
Azure services with availability zone support
Azure Availability Zone Peering
Regions and Availability Zones: AWS vs Azure vs GCP
AWS Global Infrastructure
Google Cloud locations
Data Centers: Build vs. Lease
Neoclouds: a cost-effective AI infrastructure alternative
AI Neocloud Playbook and Anatomy
The GPU Cloud ClusterMAX™ Rating System | How to Rent GPUs
Deep Data Center: Neoclouds as the ‘Picks and Shovels’ of the AI Gold Rush
CoreWeave tops new GPU cloud rankings from SemiAnalysis
Nebius Part 1: the Smooth (Developer) Operator
What runs ChatGPT? Inside Microsoft's AI supercomputer | Featuring Mark Russinovich
Meta details AI data center redesign that led to facilities being scrapped