Intel vs. AMD: Comparing Instance Types for Big Data Workloads

Conclusion

Our summary findings from TPCDS benchmarks are as follows:

TPCDS queries are not particularly sensitive to local disk performance (and hence to EBS volume sizes)
r5 (Intel) instances are consistently faster than r5a (AMD) instances. However, the speedup depends on the engine: it is lower for Spark (10%) than for Hive (25%).
r5 instances are also either cheaper (by about 10% for Hive) or the same cost (for Spark)

So the net conclusion for Hive and Spark, in the context of TPCDS benchmarks, seems to be that r5 (Intel) instances are superior. They are consistently faster than r5a (AMD) instances, and sometimes even cheaper (for a given performance level) at list price. However, some important additional points could affect these results:

Results may vary, as speedups and costs are workload-sensitive
Spot market prices are variable, so the cost numbers here do not apply to Spot instances
Given how close the performance and cost numbers are, and the fact that r5a and r5 instances are otherwise identical in memory, vCPU count, and other configuration, r5a may be a good choice for pairing with r5 instances in heterogeneous cluster configurations. This can potentially increase the odds of getting Spot instances.

Appendix

Cost Calculations

To calculate the cost of a benchmark run, we used the following formulae:

Total Cost = Instance Cost + EBS Cost
Instance Cost = (num_instances * instance_cost_per_hour * runtime_seconds) / 3600
EBS Cost = (num_instances * ebs_size_in_GB * runtime_seconds * ebs_cost_per_month) / (86400 * 30)

Please refer to AWS EC2 Pricing and AWS EBS Cost for details on AWS pricing.
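
The cost formulae above can be sketched in Python as follows. This is a minimal illustration rather than the code used for the benchmarks: the function names are hypothetical, ebs_cost_per_month is assumed to be the EBS price per GB per month, and the numbers in the usage example are made up for illustration and are not taken from our benchmark runs or from AWS price lists.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_MONTH = 86400 * 30  # 30-day month, as in the EBS formula above


def instance_cost(num_instances, instance_cost_per_hour, runtime_seconds):
    """Instance cost of one run: hourly price prorated to the runtime."""
    return (num_instances * instance_cost_per_hour * runtime_seconds) / SECONDS_PER_HOUR


def ebs_cost(num_instances, ebs_size_in_gb, runtime_seconds, ebs_cost_per_month):
    """EBS cost of one run; ebs_cost_per_month is assumed to be per GB-month."""
    return (
        num_instances * ebs_size_in_gb * runtime_seconds * ebs_cost_per_month
    ) / SECONDS_PER_MONTH


def total_cost(num_instances, instance_cost_per_hour, ebs_size_in_gb,
               ebs_cost_per_month, runtime_seconds):
    """Total Cost = Instance Cost + EBS Cost."""
    return (
        instance_cost(num_instances, instance_cost_per_hour, runtime_seconds)
        + ebs_cost(num_instances, ebs_size_in_gb, runtime_seconds, ebs_cost_per_month)
    )


if __name__ == "__main__":
    # Hypothetical inputs: 10 nodes at $0.25/hour, 100 GB of EBS per node
    # at $0.10 per GB-month, and a 30-minute (1800 s) benchmark run.
    cost = total_cost(
        num_instances=10,
        instance_cost_per_hour=0.25,
        ebs_size_in_gb=100,
        ebs_cost_per_month=0.10,
        runtime_seconds=1800,
    )
    print(f"Total cost: ${cost:.4f}")
```

With these made-up inputs the instance cost dominates the total, since EBS is billed per month and a benchmark run lasts only a fraction of one.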

TPCDS Query Selection

Below are the query categories and the set of queries from each category that we picked for our analysis (16 queries in total across all categories):