How to increase your Baserow's performance in 1.17.2 using SQL caching

We’ll be releasing a more comprehensive performance optimization guide at some point, however I wanted to post an early version letting people know about the new SQL caching which can be used to speed up Baserow.

In 1.17.0 we turned on this redis SQL caching by default, in 1.17.2 we turned it off by default and instead put it behind some env vars.

When should you turn this on

In 1.17.2 using the env vars below you can make Baserow cache some/close to all SQL statements it runs using redis. This caching is done per table, and can have a massive performance increase depending on a number of factors:

  • Your redis server is fast / much faster than your PostgreSQL

  • The tables being cached are being edited/updated/changed less than 50 times a minute.

    • This is because every change causes a cache invalidation, and if this is happening to a specific table more than 50 times a minute the performance overhead of the invalidation is worse than the performance gains for that specific table.
  • The overall overhead of having to check every SQL statement to see if cache invalidation needs to happen is less than the performance gain from actually caching the SQL statements.

    • In our own production we don’t see this to be the case due to the sheer number of constant changes to tables, but in self-hosted instances with much fewer users and relatively static schemas it can instead be a huge boost.

How does it work

It works using this library but essentially it looks at every SQL statement Baserow runs and that statement only involves tables that can be cached, is a cachable statement like SELECT it will first check redis to see if the result of that query is already known.

How to enable and tune it

To turn on and control/tune this library you can use the following env vars. I’d recommend experimenting with both the full and default mode in a staging/test env.

  • BASEROW_CACHALOT_ENABLED=true will turn on redis based caching of some of Baserow’s SQL queries. We limit it to specific tables we think are a conservative choice.
  • BASEROW_CACHALOT_ENABLED=true and BASEROW_CACHALOT_MODE=full will turn on redis based SQL caching of pretty much all SQL that Baserow runs. `
  • BASEROW_CACHALOT_UNCACHABLE_TABLES=comma separated list of table names never to cache. This env var lets you disable in both modes when cachalot is enabled caching for a specific table.
    • For example, say i know that my Baserow tables with ids 147157 and 147160 are updated on average more than 50 times per minute. I should then probably experiment with and turn off caching for those tables by setting BASEROW_CACHALOT_UNCACHABLE_TABLES=database_table_​147157,database_table_147160 .
    • In BASEROW_CACHALOT_MODE=default mode we only cache one or two queries involving your actual Baserow tables and so this setting is less important.
    • However in BASEROW_CACHALOT_MODE=full all SQL involving your actual user tables will be attempted to be cached and so this setting becomes more important to tune for tables you edit more than 50 times a minute.
2 Likes

Can this newer update 1.17.2 comparing to previous 1.17.0 have negative impact to the performance in terms of scrolling the table down and getting slower response (displaying data in table takes more time after I scroll down)? If yes, what setting should I change?

Yes, enabling cachalot makes scrolling tables much smoother for tables with many rows.
Of course, it depends on the hardware, but I’d say the difference is barely noticeable on tables with less than 100k rows.

You can easily re-enable cachalot setting the env var BASEROW_CACHALOT_ENABLED=true