Powering Distributed AI/ML at Scale with Azure and Anyscale
The journey from AI and ML prototypes to production deployments is often complex. As data pipelines grow and models become more intricate, teams frequently find themselves spending more time managing distributed computing than improving the intelligence behind their products. Moving from a laptop experiment to a production workload can feel like starting from scratch. But what if scaling AI workloads could be as seamless as coding in Python? This is where Ray comes in. Ray is an open-source distributed computing framework that originated at UC Berkeley’s RISELab, and it is now coming to Azure in a new form.
Today, at Ray Summit, Microsoft and Anyscale unveiled a new collaboration. Anyscale, the company founded by the creators of Ray, will offer its managed Ray service as an Azure-native solution, currently in private preview. The managed service combines Anyscale’s developer experience with Azure’s Kubernetes infrastructure, so users can run distributed Python workloads with integrated governance and streamlined operations, all under their Azure subscriptions.
Ray: Open-Source Distributed Computing for Python
Ray lets Python developers scale code from a single laptop to large clusters with only minor changes. Instead of rewriting applications for distributed environments, you use Ray’s Python-friendly APIs to turn functions into distributed tasks and classes into actors without changing the underlying logic. Ray’s scheduler places this work across CPUs, GPUs, and varied environments to make efficient use of the available resources.
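To make the tasks-and-actors idea concrete, here is a minimal sketch using open-source Ray's core API (`ray.init`, `ray.remote`, `ray.get`). The names `square`, `Counter`, `run_locally`, and `run_with_ray` are hypothetical and chosen only for illustration, and the Ray path assumes the `ray` package is installed. It is not specific to Anyscale or Azure; it only shows that the same plain function and class can run in one process or be wrapped as a Ray task and actor.

```python
"""Sketch: the same plain Python logic run locally or as Ray tasks and actors."""


def square(x):
    """Plain function; nothing here is Ray-specific."""
    return x * x


class Counter:
    """Plain stateful class; it becomes an actor only when wrapped by Ray."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


def run_locally(values):
    """Baseline: ordinary single-process execution."""
    counter = Counter()
    squares = [square(v) for v in values]
    for s in squares:
        counter.add(s)
    return squares, counter.total


def run_with_ray(values):
    """Same logic executed through Ray (assumes `pip install ray`)."""
    import ray  # imported lazily so this module loads without Ray installed

    # With no address configured, this starts a local Ray instance.
    ray.init(ignore_reinit_error=True)
    try:
        remote_square = ray.remote(square)   # function -> task
        RemoteCounter = ray.remote(Counter)  # class -> actor

        # Each .remote() call returns a future immediately; tasks run in parallel.
        futures = [remote_square.remote(v) for v in values]
        squares = ray.get(futures)  # block until every task has finished

        counter = RemoteCounter.remote()  # actor state lives in a worker process
        total = 0
        for s in squares:
            total = ray.get(counter.add.remote(s))
        return squares, total
    finally:
        ray.shutdown()
```

Calling `run_locally([1, 2, 3])` and `run_with_ray([1, 2, 3])` should produce the same result; the difference is where the work runs. In everyday Ray code the more common spelling is the `@ray.remote` decorator; the function-call form is used here so the undecorated originals stay usable for the local baseline.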
With Ray’s native libraries, developers can create comprehensive AI solutions—using Ray Train for distributed training, Ray Data for data processing, Ray Serve for model deployment, and Ray Tune for hyperparameter tuning—all of which work seamlessly with popular frameworks like PyTorch and TensorFlow. By eliminating infrastructure complexities, Ray enables teams to concentrate on enhancing model performance and fostering innovation.
Anyscale: Enterprise Ray on Azure
Ray makes distributed computing more accessible, and with Anyscale on Azure, it reaches new heights of enterprise readiness. At the core of this offering is the Anyscale Runtime, a high-performance runtime designed specifically for Ray. Anyscale Runtime optimizes cluster efficiency and speeds up Python workloads, allowing Azure users to:
- Launch Ray clusters in minutes, with no need for Kubernetes expertise, directly from the Azure portal or CLI.
- Allocate tasks dynamically across CPUs, GPUs, and heterogeneous nodes to use resources effectively and reduce idle time.
- Conduct large experiments swiftly and cost-effectively, benefiting from elastic scaling, GPU packing, and native support for Azure spot VMs.
- Operate reliably at production scale with built-in fault recovery, zero-downtime updates, and integrated monitoring features.
- Maintain control and governance; all clusters operate within your Azure subscription, ensuring that data, models, and computing remain secure, with simplified billing and compliance measures aligned with Azure standards.
By combining Ray’s versatile APIs with Anyscale’s managed platform and performance capabilities, Python developers can transition from prototype to production more rapidly, with reduced operational burdens, and scale efficiently in the Azure environment.
Kubernetes for Distributed Computing
Azure Kubernetes Service (AKS) underpins this new managed solution, providing the infrastructure needed to run Ray in production. AKS handles the orchestration of distributed workloads while providing the scalability, resilience, and governance that enterprise AI applications require.
AKS features:
- Dynamic resource orchestration: Automatically allocate and scale clusters across CPUs, GPUs, and mixed configurations in response to changing demands.
- High availability: Self-healing nodes and failover capabilities ensure uninterrupted workload operations.
- Elastic scaling: Scale from development clusters to production deployments spanning hundreds of nodes.
- Integrated Azure services: Seamless connections to Azure Monitor, Microsoft Entra ID, Blob Storage, and policy tools help streamline governance for IT and data science teams.
AKS gives Ray and Anyscale a strong foundation: a system already trusted for enterprise workloads and ready to scale from small projects to global operations.
Empowering Teams with Anyscale on Azure
Through this collaboration, Microsoft and Anyscale bring together the strengths of open-source Ray, managed cloud infrastructure, and Kubernetes orchestration. This combination allows Azure customers to flexibly scale their AI workloads. Whether you aim to experiment quickly on a small scale or to operate critical systems on a global level, this offering allows you to embrace distributed computing without the hassles of building and managing infrastructure independently.
You can use Ray’s open-source ecosystem, adopt Anyscale’s managed service, or blend both with Azure-native resources, all within your governance framework. This flexibility lets teams choose the approach that suits them best: rapid prototyping, cost and performance optimization, or enterprise compliance.
Together, Microsoft and Anyscale are breaking down operational barriers, giving developers more freedom to build with Python on Azure. This collaboration allows teams to move quickly, scale intelligently, and focus on delivering innovative solutions. For detailed information, see the full release.
Getting Started
To learn more about the private preview and how to request access, visit https://aka.ms/anyscale or subscribe to Anyscale via the Azure Marketplace.


