Amazon Redshift vs Google BigQuery: Which is better for your company?
On one hand, Amazon has been said to be leading the way in columnar storage; on the other, Google has been gaining market share through continuous investment in cloud storage. It can be incredibly difficult to tell the two apart, especially when new releases are pushed out on a weekly basis. Still, it's a choice you have to make.
With Kloudio's integrations, we didn't make a choice: we wanted to provide easy data sourcing and reporting to both Redshift and BigQuery users. As our platform delivers fully automated data syncing, it was crucial to us that even the largest datasets in your warehouses could be analyzed in minutes. While we didn't have to choose between Redshift and BigQuery, we understand that it's an important decision you have to make.
That's why today, we'll walk you through a comparison of Redshift and BigQuery, so you can see which best fits your team's use case.
Which provides better performance?
You might be thinking that with Amazon and Google, neither one should have performance issues. You're right, but there's one catch: when it comes to processing hundreds of gigabytes of data, performance is about more than how fast a few simple queries run.
A stall in performance could directly impact the productivity of your entire team. In business, we simply cannot afford to run a query for hours on end when it comes to high volume datasets.
Here's a test that has been run to compare the two, along with its results:
Amazon Redshift outperformed BigQuery on 18 of 22 TPC-H benchmark queries by an average of 3.6X
When AWS ran the entire 22-query TPC-H benchmark, it reported that Redshift outperformed BigQuery on 18 of the 22 queries, by 3.6X on average.
When considering the relative performance for entire datasets, Redshift outperforms BigQuery by 2X.
Please note: Google has conducted a similar test using a different TPC-H benchmark.
Which is more affordable?
Comparing cost per GB, Redshift is considerably more expensive at $0.08, versus $0.02 per GB for BigQuery.
Unfortunately, BigQuery's price point covers only storage, not queries. Once you take into account that BigQuery charges separately for queries at $5 per TB, it suddenly doesn't seem to be the best deal anymore.
Looking at these two pricing models, we can see that Redshift's pricing is much more predictable for teams with a restricted budget. With BigQuery, it's challenging to predict how much you'll end up forking over at the end of the month without using BigQuery's Cost Control feature.
Which pricing model works better for your team? If your usage is low and you want to save money, BigQuery may be the way to go. Overall, at the enterprise level, Redshift is the better deal.
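To make the trade-off between the two pricing models concrete, here is a rough sketch in Python using only the figures quoted above. It assumes the per-GB prices are monthly rates and that query cost scales linearly with the terabytes scanned; the function names are ours, purely for illustration, and real bills depend on details this sketch ignores.

```python
# Prices quoted in this article. Treating the per-GB figures as
# monthly rates is an assumption made for illustration.
REDSHIFT_PER_GB = 0.08          # USD per GB
BIGQUERY_STORAGE_PER_GB = 0.02  # USD per GB (storage only)
BIGQUERY_QUERY_PER_TB = 5.00    # USD per TB scanned by queries


def redshift_monthly_cost(stored_gb):
    """Flat cost: depends only on how much data is stored."""
    return stored_gb * REDSHIFT_PER_GB


def bigquery_monthly_cost(stored_gb, scanned_tb):
    """Storage cost plus a usage-based charge for data scanned."""
    return (stored_gb * BIGQUERY_STORAGE_PER_GB
            + scanned_tb * BIGQUERY_QUERY_PER_TB)


def break_even_scanned_tb(stored_gb):
    """TB scanned per month at which both bills are equal."""
    per_gb_gap = REDSHIFT_PER_GB - BIGQUERY_STORAGE_PER_GB
    return stored_gb * per_gb_gap / BIGQUERY_QUERY_PER_TB


stored_gb = 1000  # hypothetical warehouse size
print(f"Redshift: ${redshift_monthly_cost(stored_gb):.2f}")
for scanned_tb in (2, 12, 40):  # light, break-even, heavy usage
    cost = bigquery_monthly_cost(stored_gb, scanned_tb)
    print(f"BigQuery at {scanned_tb} TB scanned: ${cost:.2f}")
print(f"Break-even: {break_even_scanned_tb(stored_gb):.1f} TB scanned")
```

Under these assumptions, the Redshift figure stays fixed while the BigQuery figure moves with query volume, which is exactly why the second is harder to budget for.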
Which has higher usability?
Unlike performance and pricing, it's more challenging for us to put an objective number on usability to compare the two. Because it's so intangible and subjective to the user, it really comes down to your personal use case. So let's look at several determining factors:
Simple Presentation is where BigQuery reveals itself as the big winner by far. BigQuery abstracts your data so it appears simple and easy to use right away. Unfortunately, this comes at the cost of doing away with important details.
Data Types are pretty comparable for both data warehouses, which both include text, integers, floats, booleans and more. Redshift does support additional data types, including those used for financial data. Where BigQuery falls short is user-defined precision, the lack of which may result in inaccuracies.
Loading data is more complex for Redshift users because the variety of formats like CSV, JSON and Avro, along with compression, complicates the process. On the other hand, it's simpler for BigQuery users, but again more limited as a result of its simplicity.
Expansion potential is huge for both Redshift and BigQuery. BigQuery focuses on this with huge communities for support and implementation. That said, AWS is growing fast and Redshift is built on top of Postgres, for which great tools already exist.
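On the data types point above, it may help to see why precision matters for financial data at all. The snippet below is plain Python, not either warehouse's behavior: it only illustrates the general difference between binary floating-point values and fixed-precision decimals.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so small errors creep into sums of currency amounts.
float_total = 0.10 + 0.20

# A decimal type keeps the digits exactly as written.
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
```

A warehouse column type with explicit precision and scale plays the same role as the decimal value here, which is why it matters for money.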
What other usability factors do you consider when assessing data warehouses?
The final verdict
As you can see from the above, we've found Amazon Redshift to have the upper hand in performance, cost and usability overall. This is particularly true at enterprise scale. However, if simple accessibility and ease of use are important to you, BigQuery no doubt wins that round.
Which do you currently use: Redshift or BigQuery?