Public cloud has, in the last few years, broken into the mainstream enterprise consciousness. Even if enterprises are not betting everything on cloud, cloud adoption is, in one form or another, an integral part of most enterprises’ infrastructure strategy and roadmap. Enterprises about to embark on this journey ask, “Which cloud platform should I adopt?”, “Which cloud platform provides cost effective services that are a fit for me?”, and “How do I go about my cloud adoption journey?” This blog attempts to answer the first two questions.
To that goal, I have compared the core infrastructure services across the most popular cloud providers, which are Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP). In addition to the core infrastructure services, each cloud provider brings their unique proprietary offerings in the NoSQL, Big Data, Analytics, ML and other such areas.
This blog calls out each cloud’s unique aspects that may influence an enterprise’s choice of cloud based on their specific business and technical requirements.
Amazon Web Services (AWS)
AWS is the oldest public cloud provider with the widest range of products, compute and data storage options and managed services. AWS Marketplace is also the largest marketplace for third party applications and appliances. They also iterate rapidly, continuously adding a substantial set of product features. Highly driven by customer feedback, their new services provide close integration with their core services like IAM, KMS etc. They have a strong focus on security and architecture best practices. Their enterprise frameworks such as the Well-Architected Framework and Cloud Adoption Framework have been developed from their experience with large enterprise customers. Besides their mainstream services, they are also known to release unconventional services such as Snowmobile (data transfer appliance in a truck), RoboMaker (robotics framework) and Ground Station-as-a-Service (for managed satellite data download). This keeps customer interest piqued and has the potential to open up entire industries. Their 51% market share is proof of that. That said, they aren’t the cheapest cloud on the market. They also don’t seem to be worried about providing deeper container offerings. Their EKS (managed Kubernetes) service was relatively late to the market. Instead, they seem to be placing bets on MicroVMs (Firecracker) and managed functions.
They seem to have lately realigned their focus towards hybrid cloud, and have announced offerings such as Outposts (in partnership with VMware) that will enable customers to use well known AWS services and APIs on infrastructure running in their private data centers.
When to choose AWS
AWS is a great choice for startups and enterprises alike. From web and analytical workloads to large scale data center migrations, AWS provides a breadth of services that customers can leverage. To help get customers of all shapes and sizes started on the platform, AWS has released supposedly niche services such as RoboMaker on one end, while also building simpler services such as Lightsail (virtual private server) so that even the smallest single server workloads can be onboarded without much overhead.
When it comes to compute, AWS provides the widest range of VM types. AWS also currently has the largest compute and storage configurations available in the market. Their wide range of VM types (136 VM types over 26 VM families) enables customers to run everything from small web workloads to the biggest HPC and SAP workloads.
For machine learning and AI workloads, AWS also provides the highest configurations of GPU enabled VM types. For workloads that require single tenancy for compliance and regulatory reasons, AWS also now provides Bare-Metal-as-a-Service. For virtualized workloads that might need it, AWS provides features such as placement groups to ensure that the workloads run on designated underlying hardware.
AWS hopes that the various t-shirt VM sizes will match your workload requirements. It therefore doesn’t support creation of custom VM sizes (vCPU, RAM). Unlike other cloud service providers (CSPs), it only provides a specific set of VM families which come with GPUs. It does not allow attaching GPUs to any or all VM types in its portfolio.
Block storage comes with a variety of options, such as dynamic resizing and different disk types (magnetic and SSD). Unlike other CSPs, AWS does not restrict IOPS by volume size. You can provision IOPS, at extra charge, even on small disks.
On the managed relational database front, AWS supports managed databases for MySQL, PostgreSQL, MariaDB, Oracle (both SE and EE) and MS SQL (Web and Enterprise editions) under their RDS offering. In addition, they have their own MySQL and PostgreSQL compatible database offering (Aurora) that boasts Oracle like performance at a fraction of the cost. They are investing in this heavily and have also announced Multi-Master and Serverless versions.
For NoSQL databases, AWS has had their DynamoDB product available for over half a decade. This evolved from their SimpleDB offering. AWS is a proponent of purpose-built NoSQL databases and provides a range of them. These include DynamoDB (key value and document), Neptune (graph), and ElastiCache (key value caching).
AWS has improved its networking services portfolio over the last decade. It started with VPC and related network features such as security groups, network ACLs and Internet Gateways. At the time, users still had to configure their own NAT servers, bastion hosts etc. AWS has listened to customer feedback and gradually added these as managed network services to its portfolio. AWS now provides a managed NAT gateway, VPN Gateway, Transit Gateway, Direct Connect Gateway etc. They recently also announced a managed Client VPN service. This removes the need for customers to deploy OpenVPN servers to manage access to cloud VMs.
For network security, AWS has launched managed services for DDoS protection (AWS Shield) and Web Application Firewall (WAF), along with AWS Inspector, AWS Config and CloudTrail for inventory and policy management and auditing. GuardDuty provides threat detection.
For data security, AWS provides encryption at rest for most of its storage services. AWS also has KMS and CloudHSM services for key management. Macie provides an AI driven data loss prevention (DLP) service.
On the queueing, messaging and notification front, AWS provides a managed AMQP compatible queue service (Amazon MQ) in addition to its SQS offering. For Pub/Sub, AWS has offered Kinesis and has recently added a managed Kafka offering. SNS provides a multi-channel notification service that delivers customer notifications over SMS, mobile push and email. Internally, it also connects with other AWS services to enable event driven, loosely coupled architectures.
AWS serves US Government workloads in separate GovCloud regions in the US. Customers who need to provide services to customers in China can rely on AWS’s China region, which is provided via partnership with third party providers.
All in all, AWS provides a breadth and depth of services and features that are suitable for a substantial number of enterprises.
Microsoft Azure
Microsoft lagged behind AWS in the public cloud game, but it focused first on SaaS and PaaS offerings, as its strengths lie in both enterprise and consumer software. Microsoft initially focused on PaaS services for Azure, aimed at its existing base of Microsoft developers. Over time, Microsoft expanded its focus to both Linux and IaaS services. This was also reflected in the rebranding of Azure from Windows Azure to Microsoft Azure, and in the “Microsoft loves Linux” campaigns. Over time, Microsoft has also made Azure more startup friendly and built out API support for its various services. However, despite the breadth of its services, Microsoft lags substantially behind AWS in enterprise adoption. Large enterprises that already have existing Microsoft relationships remain a large part of the user base, though Azure is seeing robust growth in year-on-year revenue.
When to choose Azure
Azure is a mature cloud platform with a breadth of features, and may be the preferred platform for customers that are already using Microsoft products in some way. While Azure supports a number of services based on open source products, the Microsoft portfolio on cloud is what sets it apart for customers.
Azure has over 151 VM types over 26 VM families that support everything from small web workloads to the HPC, Oracle and SAP workloads. Azure has both Windows and multiple flavors of Linux (RHEL, CentOS, SUSE, Ubuntu). Azure has a separate family of instances for ML/AI workloads.
If you need to run high-end workloads that require up to 128 vCPU and 3.5 TB memory, Azure is a good bet. If you have existing licenses for Windows OS and MS-SQL and want to bring them to cloud (BYOL) via the Microsoft License Mobility Program, Azure is the cloud to choose. License costs form a substantial part of infrastructure expenses, and will be a major consideration for customers who run large deployments of MS-SQL etc.
Azure was also the first cloud player to recognize the trend towards hybrid cloud, and had one of the first hybrid cloud and Cloud-in-your-Datacenter offerings (Azure Stack). Customers who wanted the interface of Azure but wanted to run services in their own data centers could use Azure Stack. Other cloud players are only catching up with Azure in this space. Azure also provided support for hybrid storage appliances like StorSimple, which was unique in the public cloud space.
If you have a data center with predominantly Microsoft workloads and need to do a large scale data center migration to cloud, while taking advantage of familiar tools, Azure provides tooling and services, such as Azure Site Recovery.
When it comes to SQL and NoSQL databases, Azure has a fairly well rounded set of services. It provides managed MS SQL Server and SQL Datawarehouse. Azure also provides managed databases for MySQL, PostgreSQL, and MariaDB. Azure Table is a managed key value store, whereas CosmosDB provides a multi-model, globally distributed NoSQL database with multiple consistency models. It provides APIs compatible with MongoDB, Cassandra, Gremlin (graph) and Azure Table Storage. If you need to run multiple managed data models, including document, graph, key-value, table, and column-family data models in a single cloud, Cosmos may be the way to go. Azure Cache for Redis rounds off the offerings with a managed cache.
In addition to the pay-as-you-go (PAYG) billing model with credit card and invoicing modes, customers with existing enterprise accounts may pre-purchase Azure subscriptions as part of their annual renewals. This is useful for customers who want to be able to budget the annual cloud spend upfront. It avoids the uncertainty and additional mid-year budget approvals that are usually associated with PAYG models. When doing this, enterprises should size their projected workloads with some accuracy so that no pre-paid credits are wasted at the end of the year.
License mobility to cloud for Microsoft products is also relatively easy for customers with multiple Microsoft products running on-prem.
Google Cloud Platform (GCP)
Google Cloud Platform (GCP), while late to the game and having the smallest market share of the public cloud providers compared here (current market share at about 4%), is showing robust percentage growth. It boasts several features that put it ahead of its competitors in certain areas. GCP is also riding a wave of not only new customers who are already part of the Google ecosystem, but also early cloud adopters who are looking to expand their landscape to Google as part of a multi-cloud strategy. Google also started with PaaS services but has been steadily expanding its product portfolio.
Along with innovative features, Google boasts the lowest list price on infrastructure compared to all the other cloud providers. Of course, total expenditure for any enterprise depends on services used and cost governance measures in place.
When to choose GCP
From a compute perspective, Google has the smallest number of t-shirt VM sizes (28 instance types over 4 categories). However, it has one feature which makes these numbers largely irrelevant. Google allows users to create their own custom sizes (CPU, memory) so that customers can match their cloud workload sizing to their on-prem sizing. They also bill customers based on the total CPU and memory used, rather than individual VMs. This reduces wastage of unused capacity.
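To make the t-shirt versus custom sizing trade-off concrete, here is a minimal Python sketch. The catalog of predefined sizes, the function names and the example workload are all hypothetical and are not taken from any provider's actual instance list. It picks the smallest predefined size that covers a workload and reports the capacity that would be paid for but left unused.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSize:
    name: str
    vcpus: int
    ram_gb: int


# Hypothetical catalog of predefined ("t-shirt") sizes, for illustration only.
CATALOG = [
    VmSize("small", 2, 8),
    VmSize("medium", 4, 16),
    VmSize("large", 8, 32),
    VmSize("xlarge", 16, 64),
]


def smallest_fit(vcpus, ram_gb, catalog=CATALOG):
    """Return the smallest predefined size that covers the request, or None."""
    candidates = [s for s in catalog if s.vcpus >= vcpus and s.ram_gb >= ram_gb]
    return min(candidates, key=lambda s: (s.vcpus, s.ram_gb), default=None)


def unused_capacity(vcpus, ram_gb, size):
    """Capacity paid for but not needed when a predefined size is used."""
    return size.vcpus - vcpus, size.ram_gb - ram_gb


# Example workload: an on-prem server sized at 6 vCPUs and 20 GB RAM.
need_vcpus, need_ram = 6, 20
fit = smallest_fit(need_vcpus, need_ram)
spare_vcpus, spare_ram = unused_capacity(need_vcpus, need_ram, fit)
print(fit.name, spare_vcpus, spare_ram)
```

With a custom size, the same workload could in principle be provisioned at 6 vCPUs and 20 GB, subject to whatever sizing constraints the provider applies to custom machine types.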
Another unique feature is that GCP allows almost all instance types to attach GPUs. This can turn any standard or custom instance into an ML ready VM. Google was also a leader in per-second billing, which forced other CSPs to follow suit. Compared to the previous norm of per hour billing, per-second billing greatly reduces any capacity wastage. This results in up to a 40% savings overall, compared to relying on standard VM t-shirt sizes and per hour billing.
VM startup times in GCP are phenomenally fast, and leave other CSPs in the dust. This makes scaling out especially responsive. GCP also allows dynamic resizing of disks, so that you don’t have to do sysops acrobatics when your disks fill up. IOPS are assigned based on disk sizes and cannot be provisioned separately. This might be problematic for customers who want high IOPS on a small data set, and result in wasted dollars for unwanted storage.
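The effect of billing granularity can be shown with a few lines of arithmetic. The sketch below is a simplified model: the hourly price is a made-up number, and it ignores any minimum charge a provider may apply. It compares a short-lived VM billed per started hour against the same VM billed per second.

```python
import math


def hourly_billed_cost(runtime_seconds, price_per_hour):
    """Cost when every started hour is billed in full."""
    return math.ceil(runtime_seconds / 3600) * price_per_hour


def per_second_billed_cost(runtime_seconds, price_per_hour):
    """Cost when only the seconds actually used are billed."""
    return runtime_seconds * price_per_hour / 3600


# Hypothetical example: a batch VM that runs for 10 minutes at $0.12 per hour.
runtime = 10 * 60
price = 0.12

hourly = hourly_billed_cost(runtime, price)
per_second = per_second_billed_cost(runtime, price)
print(f"per-hour billing:   ${hourly:.4f}")
print(f"per-second billing: ${per_second:.4f}")
```

The gap is largest for short or bursty workloads. For a VM that runs all month, the two models converge.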
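The cost of tying IOPS to disk size can be estimated with a short calculation. In this sketch the IOPS-per-GB ratio is a hypothetical placeholder rather than a published figure, and the function names are illustrative. It computes how large a disk must be to reach a target IOPS level and how much of that space goes beyond the actual data set.

```python
import math

# Hypothetical ratio of IOPS granted per provisioned GB; real ratios vary
# by provider and disk type.
IOPS_PER_GB = 30


def size_needed_for_iops(target_iops, iops_per_gb=IOPS_PER_GB):
    """Smallest disk size (GB) that yields at least target_iops."""
    return math.ceil(target_iops / iops_per_gb)


def overprovisioned_gb(data_gb, target_iops, iops_per_gb=IOPS_PER_GB):
    """Storage bought only to reach the IOPS target, beyond the data size."""
    required = max(data_gb, size_needed_for_iops(target_iops, iops_per_gb))
    return required - data_gb


# Example: a 100 GB data set that needs 15,000 IOPS.
print(size_needed_for_iops(15_000))      # disk size that must be provisioned
print(overprovisioned_gb(100, 15_000))   # GB paid for but not needed for data
```

On a platform that lets IOPS be provisioned separately, the same workload could stay on a 100 GB disk and pay an IOPS charge instead.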
Google has also tied up with or purchased third party cloud migration tools. These tools, such as CloudEndure, Velostrata and CloudPhysics, help customers assess, plan and live-migrate their VMs to GCP essentially for free. On other cloud providers, some of these tools cost several hundred dollars per VM. Google is clearly making migration to GCP as easy as possible.
Networking is where GCP shines. They have a global low latency network. Even from a customer’s perspective, a VPC network spans all their regions. Other CSPs limit VPC networks to a region. This makes it easy for GCP customers to build applications that serve customers globally, without building complex cross region infrastructure designs and data replication mechanisms.
Object storage also supports a multi-regional mode, where data is replicated across regions automatically. For customers considering a migration from AWS, or considering a multi-cloud strategy, GCP supports importing object storage from AWS.
For relational databases, GCP provides support for managed MySQL and PostgreSQL databases. For customers who want a globally distributed database that still supports immediate consistency and ACID properties, GCP has built Spanner. Spanner uses consensus algorithms and atomic clocks to synchronize transactions between nodes. This offering is unique to GCP and makes Spanner very attractive to large enterprise customers who have these requirements from their relational data store. In fact, another open source database, CockroachDB, is based on the Spanner paper that was published by Google.
From a NoSQL perspective, GCP has a product called BigTable. BigTable is a petabyte scale, managed wide-column NoSQL database that is used by Google in its own products such as Gmail.
From a billing perspective, Google provides automatic discounts such as sustained use discounts which reduce the on-demand price if a VM runs more than a certain number of hours in a month. If you want the most cost effective cloud provider in the market today, GCP is a good choice.
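A sustained use discount of this kind can be modeled as a tiered price, where each successive portion of the month is billed at a lower rate. The sketch below is an illustration only: the tier boundaries and multipliers are assumptions chosen for the example and should be checked against the provider's current pricing.

```python
# Assumed tiers: each quarter of the month is billed at a lower multiplier
# of the on-demand price. These values are illustrative, not authoritative.
TIERS = [(0.25, 1.0), (0.25, 0.8), (0.25, 0.6), (0.25, 0.4)]


def sustained_use_cost(hours_run, hours_in_month, price_per_hour, tiers=TIERS):
    """Monthly cost of one VM under the assumed tiered discount."""
    remaining = min(hours_run, hours_in_month)
    cost = 0.0
    for share, multiplier in tiers:
        tier_hours = min(remaining, share * hours_in_month)
        cost += tier_hours * price_per_hour * multiplier
        remaining -= tier_hours
        if remaining <= 0:
            break
    return cost


def effective_discount(hours_run, hours_in_month, price_per_hour):
    """Fraction saved relative to the undiscounted on-demand price."""
    list_cost = min(hours_run, hours_in_month) * price_per_hour
    return 1 - sustained_use_cost(hours_run, hours_in_month, price_per_hour) / list_cost


# Hypothetical VM at $0.10 per hour in a 720-hour month.
print(f"{effective_discount(720, 720, 0.10):.0%}")  # runs the full month
print(f"{effective_discount(360, 720, 0.10):.0%}")  # runs half the month
```

Under these assumed tiers the discount grows with usage, so always-on workloads benefit most without any upfront commitment.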
While it may not have the sheer depth of features of some of the other CSPs, it has some unique products in its portfolio, and is an attractive option being a price leader in the market.
Conclusion and Future Outlook
As detailed above, each cloud has features and advantages that appeal to specific customer needs. While all the cloud providers will continue to provide certain common services (such as a managed MySQL database), each CSP will continue to build out unique, differentiated services (e.g. Aurora, Cosmos, Spanner) that are purpose built to solve very specific customer needs. CSPs hope that this will increase customer stickiness and create lock-in.
From a customer perspective, these services will also become a driver to adopt a multi-cloud strategy. As an example, a customer might want to use GCP for one app that needs Spanner’s features, while using AWS for its AI services and Azure for specific Windows workloads.
Even for forward looking services like computer vision and speech recognition, customer needs might drive them to mix and match services across cloud platforms to meet their applications’ requirements. Customers will likely use one cloud as their primary platform, while using services from others for specific applications.