Heroku PostgreSQL vs. Amazon RDS for PostgreSQL

PostgreSQL is becoming the relational database of choice for web development for a whole host of good reasons. That means that development teams have to make a decision on whether to host their own or use a database as a service provider.

The two biggest players in the world of PostgreSQL are Heroku PostgreSQL and Amazon RDS for PostgreSQL. Today I’m going to compare both platforms.

Heroku was the first big provider to make a push for PostgreSQL instead of MySQL for application development. They launched their Heroku PostgreSQL platform back in 2007. Amazon Web Services first announced their RDS for PostgreSQL service in November 2013 during the AWS re:Invent conference to an overwhelming ovation by the programmers in attendance.

Pricing Comparison

Before I get too far into the features, let’s cover the pricing differences up front. Of course, both services have areas with different value propositions for productivity and maintenance that go beyond these direct costs. However, it’s worth it to understand the basic costs so you can weigh those values against your needs later.

Heroku PostgreSQL has the simplest pricing. The rates and what you get for them are very clearly set at a simple per-month rate that includes the database, storage, data transfer, I/O, backups, SLA, and any other features built into the pricing tier.

With RDS for PostgreSQL, pricing is broken down into smaller units of individual resource usage. That means there are more factors involved in estimating the price, so it’s a little tougher to draw an exact comparison to Heroku PostgreSQL.

You have the price per hour for the instance type (higher if it’s a Multi-AZ instance, cheaper if you pay an upfront cost to reserve the instance for one to three years); storage cost and storage class (both single- and Multi-AZ); the provisioned IOPS rate; backup storage; and data transfer. Then there are a whole lot of special cases to consider. Also, keep in mind that you get one year free of the cheapest plan when you sign up.

Here is a comparison of an RDS plan to the Heroku Premium 4 plan:

Heroku Premium 4

  • $1,200 / Month
  • 15 GB RAM
  • 512 GB storage
  • 500 connections
  • High Availability
  • Max 15 minutes downtime/month
  • 1 week rollback
  • Point in time recovery
  • Encryption at rest
  • Continuous protection (offsite Write-Ahead-Log)

RDS for PostgreSQL

  • $1,156/month on demand or $756/month 1 year reserved
  • db.m3.xlarge Multi-AZ at $0.780/hr ($580)
  • 4 vCPU, 15GB RAM
  • Encryption at rest
  • 512 GB provisioned (SSD) at $0.250/GB ($128)
  • 2000 provisioned IOPS at $0.20/IOPS ($400)
  • Estimated backup storage in excess of free for 1 week rollback, 512 GB at $0.095/GB ($48)
  • Data transfer estimated at $0 for most use cases
  • 22 minutes downtime/month (based on AWS RDS SLA 99.95% uptime)
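To make the RDS arithmetic easier to check, here is a rough sketch that recomputes the estimate from the rates quoted above. The 744-hour month is an assumption (a 31-day month, which is what the $580 instance figure implies), and the function names are purely illustrative.

```python
# Rates and quantities are the ones quoted in the list above.
# HOURS_PER_MONTH is an assumption: 24 hours x 31 days.
HOURS_PER_MONTH = 24 * 31

line_items = {
    "instance (db.m3.xlarge Multi-AZ)": 0.780 * HOURS_PER_MONTH,
    "storage (512 GB provisioned SSD)": 512 * 0.250,
    "provisioned IOPS (2000)": 2000 * 0.20,
    "backup storage (512 GB)": 512 * 0.095,
}


def monthly_total(items):
    """Sum the per-month cost of every line item."""
    return sum(items.values())


def allowed_downtime_minutes(uptime_pct, days=30):
    """Minutes of downtime per month permitted by an uptime percentage."""
    return (1 - uptime_pct / 100) * days * 24 * 60


for name, cost in line_items.items():
    print(f"{name:35s} ${cost:8.2f}")
print(f"{'total':35s} ${monthly_total(line_items):8.2f}")
print(f"downtime budget: {allowed_downtime_minutes(99.95):.1f} min/month")
```

Unrounded, the total comes to about $1,156.96; the list rounds each item down to the dollar, which gives $1,156. A 99.95% SLA over a 30-day month works out to 21.6 minutes, rounded to 22 above.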

Now, here are the caveats with such a comparison:

  • Heroku isn’t disclosing the number of CPUs associated with their plans.
  • Heroku’s High Availability is roughly equivalent to AWS RDS Multi-AZ. In both setups, a standby copy of the database is maintained in a different availability zone specifically for the purpose of automatic failover in the event of an outage.
  • With Heroku, your storage is fully allocated, and you do not pay for IOPS. As such, we don’t know what the limits are for IOPS, but they are very high performance databases. I allocated the minimum IOPS that AWS would allow for 512 GB, which was 2,000. We could go as high as 5,000 IOPS which would increase the price by $600/month.
  • The AWS RDS backups may cost nothing depending on how much of the provisioned storage is actually being used. Backup storage is free up to the level of provisioned storage, and backups are generally smaller, incremental, and do not include the significant space used by indexes. This estimate was based on the seven days of storage needed to allow for one week rollback.
  • AWS RDS storage can be scaled up on the fly, so your specific needs for RAM versus storage could create a wildly different pricing pattern. This comparison is aiming to draw an equivalent.
  • AWS only charges for data transfers out of your availability zone (not including multi-AZ transfers), so transfer rates will not apply in most cases.

Clear as mud.

Setup Complexity

Heroku PostgreSQL setup is dead simple.

Whenever you create a PostgreSQL project, a free dev plan is created along with it, with a connection already waiting. Upgrading the database simply gives you a new connection string with a set username, password, hostname, and database identifier that are all randomly generated by their system. The database connection must be secure but is accessible anywhere on the internet, including directly from your home computer. You can also choose whether to deploy it in the US East region or in the European region.
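As a minimal sketch of what consuming that connection string might look like, the snippet below splits it into its parts using only the Python standard library. Heroku exposes the string to the application as the `DATABASE_URL` environment variable; the example URL and the `parse_database_url` helper are hypothetical placeholders.

```python
import os
from urllib.parse import urlparse


def parse_database_url(url):
    """Split a postgres:// connection string into keyword arguments."""
    parts = urlparse(url)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,
        "dbname": parts.path.lstrip("/"),
        # The connection must be encrypted, so ask for SSL explicitly.
        "sslmode": "require",
    }


# Hypothetical placeholder; a real string is generated by Heroku.
EXAMPLE_URL = "postgres://uabc123:s3cret@ec2-0-0-0-0.compute-1.amazonaws.com:5432/d9xyz"

settings = parse_database_url(os.environ.get("DATABASE_URL", EXAMPLE_URL))
# Print everything except the password.
print({key: value for key, value in settings.items() if key != "password"})
```

The keys match the usual libpq keyword names, so with a driver such as psycopg2 installed the dictionary could be passed along as `psycopg2.connect(**settings)`.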

RDS for PostgreSQL setup is slightly more involved. You must select the various options outlined in the pricing section: the instance type, whether or not it should be Multi-AZ, whether to enable encryption at rest, the type of storage, how much to provision, the IOPS to provision (if any), the backup retention period, whether or not to enable automatic minor version upgrades, the backup and maintenance windows, the database identifier, name, port, master user and password, the availability zone you want it created in, your VPC group and subnet group, and your database configuration.

Obviously, RDS gives you significantly more control over the details. Depending on your point of view, that could be good or bad.

The database configuration, for example, has a set of defaults for each database version for each instance type. You can take these defaults and make modifications to them with your own custom settings and then save those as your own parameter group to assign to this and any future databases that you may choose to create. The initial setup time can be slightly more involved because of the various factors like VPC, subnet groups, and public accessibility. However, once these have been defined the first time for your account, everything gets much closer to a point-and-click experience.
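For a sense of how the RDS setup choices map onto a scripted setup, here is a sketch that collects them as keyword arguments for boto3’s `create_db_instance` call. Every value (identifiers, windows, group names, the security group ID) is a made-up placeholder, and the helper names are illustrative, not part of any AWS tooling.

```python
def build_rds_request(identifier, password, multi_az=True, availability_zone=None):
    """Collect the setup choices into keyword arguments for create_db_instance."""
    params = {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",
        "DBInstanceClass": "db.m3.xlarge",
        "MultiAZ": multi_az,
        "StorageEncrypted": True,
        "StorageType": "io1",          # provisioned IOPS SSD
        "AllocatedStorage": 512,       # GB
        "Iops": 2000,
        "BackupRetentionPeriod": 7,    # days
        "AutoMinorVersionUpgrade": True,
        "PreferredBackupWindow": "03:00-04:00",
        "PreferredMaintenanceWindow": "sun:05:00-sun:06:00",
        "DBName": "app",
        "Port": 5432,
        "MasterUsername": "app_admin",
        "MasterUserPassword": password,
        "DBSubnetGroupName": "private-db-subnets",    # placeholder name
        "VpcSecurityGroupIds": ["sg-00000000"],       # placeholder ID
        "DBParameterGroupName": "custom-postgres",    # placeholder name
        "PubliclyAccessible": False,
    }
    # With Multi-AZ, AWS places the instances itself, so only pin a zone
    # for single-AZ instances.
    if availability_zone and not multi_az:
        params["AvailabilityZone"] = availability_zone
    return params


def create_instance(params, region="us-east-1"):
    """Send the request. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("rds", region_name=region).create_db_instance(**params)


request = build_rds_request("demo-db", "change-me")
print(sorted(request))
```

Keeping the request as a plain dictionary means the choices can be reviewed or version controlled before anything is actually created.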

Host Locations, Regional Restrictions

Heroku operates within the AWS US East region (us-east-1) and Europe (eu-west-1). This also means that your database will be restricted to these regions. Availability Zones are managed internally.

If you choose to use Heroku PostgreSQL with something hosted in a different AWS region than those two, you should expect more latency between database requests and transfer rates may apply. Likewise, if you wish to use AWS RDS for PostgreSQL with a Heroku application, just ensure that it is set up in the appropriate region.

Security and Access Considerations

Within Heroku PostgreSQL, you’re given a randomized username with a randomized password and a randomized database name that must be connected to over SSL. Their network (as well as Amazon’s) has built-in protections against scanners that could potentially brute-force access to such a database. That is fairly secure.

The downside is that anybody who needs access to the database and has the connection information can do so from anywhere in the world. This is more of a Human Resources-level risk from departed programmers on a project than anything else, but it is something to be aware of nonetheless. Swapping out the database credentials after having a programmer leave the team will generally alleviate this concern.

On the other hand, AWS RDS for PostgreSQL has a much more comprehensive security policy. The ability to set and define a VPC and private subnet groups will allow you to restrict database access to only the servers and people who need it. You have the ability to create as many database users with various permission levels as you like in order to more easily manage multiple users or applications accessing the database with different permission levels, while providing a log trail. Thanks to VPC, even if somebody did have the connection information, they still couldn’t access the database without being able to get inside the VPC.

For stricter (although more complex) security, RDS wins hands down. Depending on complexity, team, and the development state of your application, this level of security paranoia may not yet make sense and could be more of a headache than you want to manage. You can also configure it with the same access rules used by Heroku PostgreSQL.

Backup/Restore/Upgrade

Both platforms offer very similar options for backup and restore. Both have scheduled backups, point-in-time recovery, restoration to a new copy, and the ability to create snapshots.

Upgrades are more involved. On both platforms, major version upgrades will involve some downtime, which can’t be avoided.

Heroku provides three options that all involve some manual steps to complete: copying data, promoting an upgraded follower, or using the pg:upgrade command for an in-place upgrade of larger databases. The pg:upgrade command most closely resembles the upgrade process on RDS.

With RDS, you select the Modify option for your instance and change the version. It will create pre- and post-snapshots around the in-place upgrade while maintaining the exact same connection string.

RDS will allow you to schedule the database upgrade automatically within your set maintenance window. Heroku PostgreSQL will automatically apply minor upgrades and security patches, while RDS allows you to choose whether or not you want them to do that automatically within your maintenance window.

Both are fairly straightforward processes, although the RDS process is a little more hands-off in this case.

Feature/Extension Availability

As of this writing, AWS RDS for PostgreSQL has versions 9.3.1–9.3.6 and 9.4.1, while Heroku PostgreSQL has 9.1, 9.2, 9.3, and 9.4.

Minor version upgrades are automatic with Heroku, so there is no need to list its point releases. Heroku PostgreSQL has been around longer and because of that has more legacy versions available for their existing users. RDS launched with 9.3 and does not appear to have any intention to support older versions.

In addition to all of the functionality built into PostgreSQL, there’s a constantly growing set of extensions.

Both platforms have these extensions in common:

  • hstore
  • citext
  • ltree
  • isn
  • cube
  • dict_int
  • unaccent
  • PostGIS
  • dblink
  • earthdistance
  • fuzzystrmatch
  • intarray
  • pg_stat_statements
  • pgcrypto
  • pg_trgm
  • tablefunc
  • uuid-ossp
  • pgrowlocks
  • btree_gist
  • PL/pgSQL
  • PL/Tcl
  • PL/Perl
  • PL/V8

Available on Heroku PostgreSQL:

  • pgstattuple

Available on AWS RDS for PostgreSQL:

  • postgres_fdw
  • chkpass
  • intagg
  • tsearch2
  • sslinfo

Here are the full lists for both Heroku PostgreSQL and AWS RDS for PostgreSQL.

Scaling Options

“Scaling” is a tricky word with databases because it means different things depending on the needs of your application. Whether you need to scale for reads or for writes depends on whether your workload is low intensity and high volume (web traffic) or low volume and high intensity (analytics).

The most common scaling case on the web is scaling for read traffic. Both Heroku and RDS address this need with the ability to create read replicas. RDS calls them read replicas and Heroku calls them followers, but they’re essentially the same thing: a copy of the database, receiving live updates via the write-ahead log over the wire, that allows you to spread read traffic over multiple servers. This is commonly referred to as horizontal scaling. Creating read replicas on either platform is a point-and-click operation.
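Creating the replica is the easy part; the application still has to decide which server each query goes to. The sketch below shows one simple way to do that, sending writes to the primary and rotating reads across replicas. The `ReplicaRouter` class and the connection names are hypothetical, and the verb check is deliberately naive.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and rotate reads across replicas."""

    WRITE_VERBS = ("insert", "update", "delete", "create", "alter", "drop")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def target_for(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.split(None, 1)
        verb = words[0].lower() if words else ""
        if verb in self.WRITE_VERBS:
            return self.primary
        return next(self._readers)


router = ReplicaRouter("primary-db", ["replica-1", "replica-2"])
print(router.target_for("SELECT * FROM users"))
print(router.target_for("UPDATE users SET name = 'x'"))
print(router.target_for("SELECT 1"))
```

Because replicas receive changes asynchronously, a read issued right after a write may not see that write yet, so anything that must read its own writes should go to the primary. Statements such as `SELECT ... FOR UPDATE` would also need special handling that this sketch leaves out.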

Vertical scaling refers to increasing or decreasing the power of the hardware of your database in place. AWS and Heroku each handle this scenario differently.

Heroku instructs users to create a follower of the newly desired database class and then promote it to the primary database once it’s caught up, destroying the original afterwards. Your application will need to update its database connection information to use the new database.

If your RDS database is a Multi-AZ database, then the failover database will be upgraded first. Once ready, the connection will automatically fail over to that instance while the primary is then upgraded, switching back to the primary afterwards. Without Multi-AZ, you can do the upgrade in place, but downtime will vary depending on the size of the database. Your other option is to create a read replica with the newly desired stats and then promote it to primary when it is ready, just as Heroku recommends.

To scale beyond the standard vertical and horizontal options for something that can handle distributed write scaling, neither option is a particularly good fit. It will probably be necessary to either manage your own Postgres-XC installation or restructure your application to isolate the write-heavy traffic into a more use-case specific data source.

Monitoring

AWS RDS for PostgreSQL comes with all of the standard AWS monitoring options via CloudWatch.

CloudWatch provides extensive metrics with history you can track, plus granular alerts via email or SNS notifications (basically webhooks). These are great for integrating with tools like PagerDuty.
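As an illustration of the alerting side, this sketch builds the arguments for a CPU alarm on an RDS instance using boto3’s `put_metric_alarm`. The threshold, periods, alarm name, and SNS topic ARN are assumptions chosen for the example, not recommendations.

```python
def build_cpu_alarm(db_identifier, sns_topic_arn, threshold_pct=80.0):
    """Keyword arguments for CloudWatch put_metric_alarm on an RDS instance."""
    return {
        "AlarmName": f"{db_identifier}-high-cpu",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_identifier}],
        "Statistic": "Average",
        "Period": 300,              # seconds per datapoint
        "EvaluationPeriods": 3,     # three datapoints in a row, i.e. 15 minutes
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(params, region="us-east-1"):
    """Register the alarm. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("cloudwatch", region_name=region).put_metric_alarm(**params)


# Hypothetical topic ARN; a real one comes from your own SNS setup.
alarm = build_cpu_alarm("demo-db", "arn:aws:sns:us-east-1:123456789012:db-alerts")
print(alarm["AlarmName"])
```

The SNS topic is where the email or webhook-style delivery gets attached, which is how a tool like PagerDuty would typically be wired in.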

Heroku PostgreSQL monitoring relies more on logs and command line tools. Their pg-extras command line tool will show current information about what’s going on in the database, including bloat, blocking queries, cache and index hit ratios, identification of unused indexes, and the ability to kill specific queries.

These tools, while not involving the stat tracking over time that you get from CloudWatch, provide extremely valuable insights into what’s going on with your database that you don’t come close to getting from RDS. You can see more examples of pg-extras on GitHub. These types of insights are invaluable in tuning your application and database to avoid the problems you’d need a monitor to catch in the first place.
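To give a feel for the kind of number these tools report, here is a sketch of a table cache hit ratio calculation. The SQL reads PostgreSQL’s standard `pg_statio_user_tables` statistics view and is written for illustration; it is not copied from pg-extras, and the sample counters are invented.

```python
# Query that could be run against any PostgreSQL database. It is similar in
# spirit to a cache hit report, but it is an illustrative query, not the
# exact one pg-extras uses.
CACHE_HIT_SQL = """
SELECT sum(heap_blks_hit)  AS hits,
       sum(heap_blks_read) AS disk_reads
FROM pg_statio_user_tables;
"""


def hit_ratio(hits, disk_reads):
    """Fraction of table block requests served from PostgreSQL's buffer cache."""
    total = hits + disk_reads
    return hits / total if total else None


# Invented sample counters standing in for the query result.
print(hit_ratio(hits=990_000, disk_reads=10_000))
```

A ratio that stays well below 1 on a read-heavy application usually suggests the working set no longer fits in memory, which is exactly the sort of signal that helps when deciding whether to move up a plan.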

Other historical data is available in the logs, although Heroku recommends trying out Librato (which can work with any PostgreSQL database but has a Heroku plugin available for automatic configuration). Additionally, free New Relic plans will provide a wealth of insight into what’s going on with your application and database.

While CloudWatch provides more detailed insight as to what’s going on within the machine, Heroku uses the metrics seen within pg-extras to monitor and notify you of the various problems they see that require correction on your end. If data corruption happens, Heroku identifies and fixes it. Security problems? They’ll handle those too. A DBA or a DevOps position will care significantly more about the CloudWatch metrics. Heroku PostgreSQL tries to focus on making sure you don’t have to worry about it.

Dataclips

One bonus feature that you get from Heroku PostgreSQL is Dataclips. Dataclips are basically a method for storing and sharing read-only queries among your team for the sake of reporting without having to grant access to every person who may need to see them. Just type in a query and view the results right there on the page. The queries are version controlled; if your team is passing them around and tweaking them, you’ll be able to see the changes over time.

In my personal experience, I’ve found dataclips to be a lifesaver, specifically for working with non-programmer teams.

When business or support staff need information on sales, fraud, user behavior, account activity, or anything else we happen to have in there, I’ve always had the ability to write up a query to get at the information. Before dataclips, this meant that I needed to write up the query, save it somewhere, usually export the result set to a CSV or spreadsheet, and then email it to whomever was requesting it. Eventually, this becomes a routine activity that you’re having to handle at every request.

Enter dataclips. Now I can take that query and just send the random hashed link over to whoever requested the information. If they want more up-to-date information the next day, week, or month, they need only refresh the page. I write the query, then never hear that request again. That is a developer time-saver right there. You can save them and name them, as well as manage stricter access if need be.

Summary and Recommendation

Overall, AWS RDS for PostgreSQL will usually be cheaper and more tightly tailorable to exactly what your application’s needs are. You’ll have much more granular control over access, security, monitoring, alerts, geographic location, and maintenance plans.

With Heroku PostgreSQL, you’ll pay a little bit more on a simplified pricing structure, although all of your development databases will be free. You won’t be able to control a lot of the details that RDS gives you access to, but that’s partially by design so that you don’t have to deal with managing those details. With Heroku, you’ll get insights directly into how your database is performing and using the internal resources to help you catch, tune, and improve your setup before it becomes a problem.

If I had to choose, I’d probably go with Heroku and Heroku PostgreSQL as a startup while I focused on actually getting my application developed and getting customers in the door. The value proposition of saving time to focus on business goals so we can build a revenue stream would be of the greatest importance. Then when things grew to a point that the database was no longer changing as much, it might make sense to start migrating things over to RDS as we focus on locking things down to focus on stability, long-term maintenance, and security.

In the end, it really boils down to what costs you more: time or infrastructure. If time costs you more, go with Heroku PostgreSQL. If infrastructure costs you more, go with RDS. Having both platforms living within the AWS datacenters makes switching between the two a lot easier as your needs change.

Join the Discussion

  • Flexic

    Hey, are you guys intentionally truncating your RSS feed for some reason? Or is it just left that way since it’s the WordPress default…

    https://www.dropbox.com/s/3nucpkslan4gx12/Screenshot%202015-06-17%2008.44.23.png?dl=0

    • Hey, thanks for pointing this out! This is not intentional. Let me check if we can resolve this!
      Have a great day,
      Manuel

    • Roman Kuba

      Hi, this is a WordPress default. I switched the settings to full text. Hopefully that works for you :)
      Thanks for reaching out.
      Best,
      Roman

  • Good article and well-written

  • Felipe

    Please do not rely on dataclips. They are very buggy. I use them in my dashboards and often they just freeze. I need to edit and save to make them back to normal

  • Pericles Theodorou

    Really good analysis! Thanks

  • Thanks for the writeup. We’ve generally been happy about our experience using Heroku PG, especially after they moved everything to SSD backed machines and reduced their prices somewhat. The ability to rollback to a certain point in time has saved us at least once.

    postgres_fdw is available on heroku, we have been using foreign data tables for a while.

  • Good comparison!
    One difference between Heroku’s high availability and Amazon’s RDS is though that Heroku is using asynchronous replication (https://devcenter.heroku.com/articles/heroku-postgres-ha). Amazon’s RDS is using “synchronous physical replication” (http://aws.amazon.com/rds/details/multi-az/). As an AWS engineer explained to me this means that IO write operations are executed synchronously to storage located in two different availability zones. This makes Amazon RDS potentially a bit slower (though you still have your guaranteed IOPS if you need them), but increases reliability.

  • Heroku actually have description of their machines here:
    https://devcenter.heroku.com/articles/heroku-postgres-production-tier-technical-characterization

    You are currently comparing Heroku pricing which offers half the IOPS, and is more expensive than the same alternative from Amazon with double the IOPS

    • Thanks for finding that. Based on that, it looks like the equivalent AWS machine to the Premium 4 plan would be the r3.large Multi-AZ at $0.50 / hr or $372 / month instead of the $580 / month for the m3.xlarge, making the on demand rate $948 instead.

      Provisioned IOPs wouldn’t let me go below 2000 for 512 GB provisioned storage though.

      I’ll see about getting the article updated.

    • To his point, I get his comparison.
      $300 or even $500 would be nothing compared to spending hours (consider from $50 to $100/hr for a dev) having to deal with configuration, backups, etc.

      For small teams, it might be even more cost effective having Heroku dealing with everything (I would prefer paying $300/month for Heroku than $150/month for Amazon and having to waste several precious hours).

      But for larger teams – that can afford to optimize and architect well their infrastructure and can pay upfront for most services for 3 years (saving them more than 50%) – it’s definitely a no-brainer to choose AWS.

      • Amazon RDS is “just like heroku”. It’s managed, you don’t have to deal with config/backup/etc plus you have full control on what to scale.

        When things are slow on heroku, your only options are: “opening a ticket support and upgrading for a more expensive plan”.

        Don’t get me wrong, it works for most of the teams, but when you are doing something out of the regular CRUD and need more control, you will be pissed off to discover that in Amazon RDS you could have solved that performance problem by increasing IOPS with a click, or adding more RAM, without having to pay for all the other resources you don’t need on heroku + the premium price for nothing different.

        • That is a trade-off.
          If you don’t have a team to be just taking care of infrastructure, I believe Heroku can save you tons of time. If it doesn’t work you can, yes, upgrade and it will cost more (you will probably still save money from a ops hours).
          And plus, you can always come back to RDS if you think it’s not worth it anymore.

          Not saying everyone should choose one or the other. Just saying that no one can say this is the best solution. And this article is another point of view that shows that sometimes what is more expensive at first could end up saving you precious money later.

  • Great article @brightball:disqus! I like to have more granular control on my infrastructure so will prefer RDS over Heroku but ya as you said for startups initially Heroku is better choice.
    Thanks for the article.

  • Sergio

    Nice article. I would like to ask you more about horizontal scaling (going little bit out of topic)…
    Does Heroku OR AWS RDS offer this solution out of the box? Which “native” solution would you instead advise for a write-ahead scaling solution? I’ve tried pg-pool2 but it is extremely slow and tough to configure. Is it simpler using Postgres-XC?

    • It’s a tough question to answer. I’ve never had to implement either because I largely avoid the problem.

      1. App just needed read scaling so simple read replicas were enough
      2. App was customer-per-subdomain and lent itself to simple sharding using a schema-per-client approach
      3. The heavy write traffic was isolated to one or two tables, so just moving that specific data to a dedicated database solution (Mongo/Couchbase/DynamoDB, etc) fit the bill.

      In order to get to a point where PGXC was needed I’d have to have heavy writes to multiple different tables, that weren’t easy to separate logically (by customer, etc), needed to be queried several different ways (full text, multi-index, etc), or needed to be query-able across multiple machines when it wasn’t purely for analytics (which lends to Hadoop or RedShift as a solution).

      I just haven’t been on a project that justified it yet when other solutions were simpler unfortunately, so I can’t really give you a good answer to that. XC looks to be the most promising from everything I’ve read to this point though.

  • TheGoat

    You do realize, according to Gabriel’s link, that Heroku is running on AWS EC2 infrastructure, correct? Since RDS instances are also based on EC2, the hardware is pretty much apples to apples.

  • Great article!

    I’m spotting a trend of people realizing that Heroku might be very cost efficient in a small to medium scale. As servers grew complex, dealing with load balancers, DB’s, multiple servers start to become costly and time consuming.

    Whereas in the past, many complained about Heroku costs and tried to switch from it to Amazon.

  • andreasalicetti

    On a Heroku PostgreSQL DB, according to their support, it is not possible to enable the plperl extension.

    Keep that in mind if you are planning to use it inside your functions, or if you want to use a tool like bucardo to keep a replica of the DB on Heroku, since Heroku won’t allow that.