Scaling Postgres at Affinity with Citus

Arzav Jain
Published in Affinity
Jun 20, 2018 · 4 min read

In a previous blog post, we discussed how we at Affinity migrated a large column in our biggest Postgres table to DynamoDB. Below we outline why we took the next step to scale to a rapidly growing user base: migrating the entire table to a sharded Postgres cluster managed by Citus.

Background

Each user that joins Affinity brings along a plethora of emails that are synced into our emails table. This email data is the backbone of how Affinity powers relationship intelligence for our customers. There are two main clients that read and write to the emails table: syncers and consumers.

Single-node architecture with all tables residing in the same Postgres database

In our original single-node Postgres architecture, the emails table resided alongside all the other tables in the same database. The syncers constantly fetched emails from Gmail and Exchange web servers and inserted them into the primary Postgres database. These writes were streamed to the read replica, from which our consumers read the emails. The consumers processed each email to extract the information that powers most of Affinity’s features and wrote the resulting modifications back to the primary database. Given the large volume of emails Affinity ingests, our reads and writes were dominated by the emails table, and that load was growing fast:

Emails processed by Affinity over time

To give some perspective, in the architecture described above, the emails table accounted for approximately 75% of our entire database size on disk, while the next biggest table made up a mere 8%.
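
For anyone who wants to sanity-check a similar breakdown on their own database, Postgres’s catalog functions make it easy to see how much of the total size each table accounts for. The sketch below is illustrative rather than our actual tooling; it assumes psycopg2 is installed, and the connection string is a placeholder.

```python
# Sketch: report each user table's share of the total database size on disk.
# Assumes psycopg2 is installed; the DSN passed at the bottom is a placeholder.
import psycopg2

SHARES_SQL = """
    SELECT relname,
           pg_total_relation_size(relid)::numeric
             / pg_database_size(current_database()) AS share
    FROM pg_catalog.pg_statio_user_tables
    ORDER BY share DESC
    LIMIT 10;
"""

def print_table_shares(dsn: str) -> None:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(SHARES_SQL)
        for relname, share in cur.fetchall():
            print(f"{relname}: {share:.1%} of database size")

if __name__ == "__main__":
    print_table_shares("postgresql://app:secret@localhost:5432/appdb")  # placeholder DSN
```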

Why shard?

Some consequences of such a read-write pattern were:

  1. The cache hit rate for the emails table was 79% on the read replica and slowly decreasing. As more users joined the platform, the table and its indexes grew, and less and less of the frequently accessed data fit in memory. (A query for checking this ratio is sketched just after this list.)
  2. Not surprisingly, the decreasing hit rate was accompanied by increasing database swap usage.
  3. Both of the above led to degraded performance for queries not just to the emails table but also to other tables, since the other tables and their indexes were fighting for space in the same cache as the emails table.
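
For context on where a number like the 79% above comes from: Postgres tracks per-table buffer cache hits and misses in the pg_statio_user_tables view. Here is a minimal sketch, not our production monitoring, assuming psycopg2 and a placeholder connection string, that computes the ratio for a single table.

```python
# Sketch: compute the buffer-cache hit rate for one table from pg_statio_user_tables.
# Run it against the replica to mirror the read-side numbers quoted above.
# Assumes psycopg2; the DSN and the hard-coded table name are placeholders.
import psycopg2

HIT_RATE_SQL = """
    SELECT heap_blks_hit,
           heap_blks_read,
           coalesce(idx_blks_hit, 0),
           coalesce(idx_blks_read, 0)
    FROM pg_catalog.pg_statio_user_tables
    WHERE relname = %s;
"""

def table_cache_hit_rate(dsn: str, table: str) -> float:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(HIT_RATE_SQL, (table,))
        heap_hit, heap_read, idx_hit, idx_read = cur.fetchone()
        hits = heap_hit + idx_hit
        reads = heap_read + idx_read
        return hits / (hits + reads) if (hits + reads) else 1.0

if __name__ == "__main__":
    rate = table_cache_hit_rate("postgresql://app:secret@localhost:5432/appdb", "emails")
    print(f"emails cache hit rate: {rate:.1%}")
```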

For a while, we combated these problems by vertically scaling to bigger and better Postgres RDS instances. But we were soon approaching the upper limit on RDS Postgres memory, not to mention the increasingly prohibitive cost of continuing to scale vertically. So we considered scaling horizontally instead.

We began by sharding only our emails table and leaving the other tables in the Postgres RDS setup (shown above). The emails table was the best first candidate because:

  1. As described above, it dominated the read-write pattern and suffered the lowest cache hit rate of all our tables.
  2. It was involved in few joins with other tables and so moving it to a different datastore meant rewriting only a few queries.
  3. By migrating only our biggest table instead of all the tables, we would get most of the wins we were looking for while keeping a short feedback loop to encounter pitfalls and get familiar with the new datastore. Equipped with that experience, we can migrate the rest of our database over to the new datastore in the future.

Why Citus?

We chose Citus as the new datastore for a few reasons:

  1. Citus uses Postgres under the hood and is simply an extension to plan and distribute queries. Consequently, our developers can continue to work with the same database engine and query language, allowing us to make use of and continue to build on our in-house Postgres expertise.
  2. Citus has great documentation and tooling to help us migrate our data from Postgres with no downtime (namely, a feature called Warp, which takes advantage of the logical decoding feature in Postgres 9.4 and up).
  3. Citus abstracts out the sharding so that application developers writing queries against the Citus database (mostly) don’t have to think about it. The coordinator node in the Citus cluster rewrites the queries and fetches data from the appropriate shards before aggregating the results and returning them to the client. (A rough sketch of what this looks like from the application’s side follows this list.)
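
To make the third point a bit more concrete, here is a rough sketch of what distributing a table looks like from the application’s side. The column list and the choice of user_id as the distribution column are illustrative assumptions rather than our actual schema; create_distributed_table is the standard Citus call for hash-distributing a table across worker nodes.

```python
# Sketch: create a table and ask Citus to shard it across worker nodes.
# The schema and the user_id distribution column are illustrative assumptions;
# create_distributed_table() is Citus's standard API for this.
import psycopg2

EMAILS_DDL = """
    CREATE TABLE IF NOT EXISTS emails (
        id      bigserial,
        user_id bigint NOT NULL,
        subject text,
        body    text,
        sent_at timestamptz,
        PRIMARY KEY (user_id, id)  -- the key must include the distribution column
    );
"""

def create_and_distribute_emails(coordinator_dsn: str) -> None:
    with psycopg2.connect(coordinator_dsn) as conn, conn.cursor() as cur:
        cur.execute(EMAILS_DDL)
        # Hash-distribute the table by user_id; from here on the coordinator
        # routes queries to the shard(s) that hold the matching rows.
        cur.execute("SELECT create_distributed_table('emails', 'user_id');")

if __name__ == "__main__":
    create_and_distribute_emails("postgresql://app:secret@coordinator:5432/appdb")  # placeholder DSN
```

Once the table is distributed, a query that filters on the distribution column (for example, SELECT * FROM emails WHERE user_id = 42) is routed to the single shard holding that user’s rows, while queries that don’t filter on it fan out across the workers and have their results aggregated on the coordinator.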

Results

After migrating the emails table to Citus, our new read-write pattern looks like this:

Sharded architecture with the emails table in Citus and all other tables in Postgres

We were pleased to see our cache hit rate for the emails table go from 79% to 99.9%. This helped reduce query latencies across the board; as an example, the chart below shows the 99th percentile latencies for a frequently hit application server endpoint that reads from both our Postgres RDS setup and Citus. Notice the sharp drop in latencies on May 12th, when we made the final switch over to Citus.

p99 query latencies for a single endpoint

Huge thanks to Rohan Sahai and Adam Perelman who worked together to implement this migration and read drafts of this post. We’d also like to give a shoutout to the Citus team for being available and answering all our questions promptly.
