Using Baserow hosted for a headless CMS?

We are thinking of using the Baserow hosted version as a headless CMS for a Next.js site. I am curious about the API limits, as I can't find any information on rate limits or maximum API calls. With Next.js you usually create a bunch of static pages (static generation) by hitting the database at build time to get post information etc. Each build therefore requires extensive connections to the database to create the posts. For example, with 3,000 posts you will hit the database 3,000 times in the span of a minute or so to create the posts on build. Will the Baserow API time out under these circumstances? What is the max connections limit? API rate limit? Thanks.

Hello @osseonews, first of all, welcome to the Baserow community :wave:

You should check this thread first: Baserow free tier limits - #12 by bram.

We have decided not to implement any API usage limits in the hosted version for now. We do of course expect fair usage of the API and will reach out if someone is using it excessively, but there is no limit for now. This might change at some point in the future. The self-hosted version will never have API usage limits.

@bram can you please check the case, and answer these questions:

Will the Baserow API time out under these circumstances? What is the max connections limit? API rate limit?

Morning :wave:

I have just given this a test on my self-hosted instance: I fired off 3,000 requests in chunks of 8 at a time and didn't hit any errors, so I would say this is all going to depend on your setup. If you have a self-hosted instance, what you could do for science is use GNU parallel with curl to test.

The command below will run 3,000 times in batches of 8. It only fetches the field types, so you would need to adjust it to get your own data, but it is always worth testing so that you know what your environment can handle.

```bash
seq 3000 | parallel -n0 -j8 "curl -X GET -H 'Authorization: Token API-TOKEN-HERE' https://baserow.mydomain.com/api/database/fields/table/xxxxx/"
```

If you are using the hosted offering, it may be a bit different :slight_smile:

This doesn't sound right. You should be able to fetch multiple rows/posts in a single request using, e.g., https://api.baserow.io/api/redoc/#tag/Database-table-rows/operation/list_database_table_rows.

If the static site builder doesn't allow for efficient API calls like this, I'd suggest downloading all the needed rows first and reusing them during the build, as in the sketch below. This can be handy during development too, so you only download new data when it is actually needed.
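For illustration, here is a minimal sketch of that prefetch step for a Next.js project (not from this thread): it pages through Baserow's list rows endpoint at 200 rows per request and caches everything in a local JSON file for the build to read. The table ID, token variable, and output path are placeholders, and it assumes a Node version with a global fetch.

```ts
// scripts/fetch-posts.ts: hypothetical prefetch step run before `next build`.
// Assumes BASEROW_TOKEN is set and TABLE_ID points at your posts table.
import { writeFileSync } from "fs";

const TABLE_ID = "12345"; // placeholder
const BASE_URL = `https://api.baserow.io/api/database/rows/table/${TABLE_ID}/`;

async function fetchAllRows(): Promise<unknown[]> {
  const rows: unknown[] = [];
  let page = 1;
  // The list endpoint returns at most 200 rows per request, so page through it.
  while (true) {
    const res = await fetch(
      `${BASE_URL}?user_field_names=true&size=200&page=${page}`,
      { headers: { Authorization: `Token ${process.env.BASEROW_TOKEN}` } }
    );
    if (!res.ok) throw new Error(`Baserow request failed: ${res.status}`);
    const data = await res.json();
    rows.push(...data.results);
    if (!data.next) break; // no more pages
    page += 1;
  }
  return rows;
}

fetchAllRows().then((rows) => {
  // Cache locally so the build (and local development) can reuse it.
  writeFileSync("posts.json", JSON.stringify(rows, null, 2));
  console.log(`Fetched ${rows.length} rows`);
});
```

With 3,000 posts that is 15 list requests per build instead of 3,000 single-row requests.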

Thanks for all the comments. Yes, we plan to use the hosted version, as we don't want to bother with backend infrastructure anymore.

Anyway, we are using Next.js on the frontend. The way it works at build time is that you have a getStaticPaths function which queries the database to get all your paths (let's say we have 1,000 paths or slugs). This is one query. It then feeds those paths into another function, getStaticProps, to get the data for each path individually. So getStaticProps runs one query per path, which in our example is 1,000 queries, and this happens very quickly during the build. So yes, with 1,000 paths there will be 1,000 queries against the database in quick succession. We currently use Postgres as a backend and routinely run into the classic "too many clients" error, which we solve on the backend with connection pooling (e.g. pgBouncer). So I guess my question is: since Baserow uses Postgres on the backend, what kind of pooling is done to prevent the "too many clients" error, and what is the pool size?
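As a rough illustration (not from the thread itself), here is how that getStaticPaths/getStaticProps pair could consume a prefetched posts.json, like the one produced by the earlier script, instead of querying Baserow once per path. The slug and title field names are assumptions about the table layout.

```tsx
// pages/posts/[slug].tsx: hypothetical sketch; assumes posts.json was
// downloaded beforehand and each row has `slug` and `title` fields.
import { readFileSync } from "fs";
import type { GetStaticPaths, GetStaticProps } from "next";

type Post = { slug: string; title: string };

function loadPosts(): Post[] {
  return JSON.parse(readFileSync("posts.json", "utf8"));
}

export const getStaticPaths: GetStaticPaths = async () => {
  // One read of the local cache instead of one API call for the path list.
  const paths = loadPosts().map((p) => ({ params: { slug: p.slug } }));
  return { paths, fallback: false };
};

export const getStaticProps: GetStaticProps = async ({ params }) => {
  // One lookup per page, but from disk rather than one API call per path.
  const post = loadPosts().find((p) => p.slug === params?.slug);
  return post ? { props: { post } } : { notFound: true };
};

export default function PostPage({ post }: { post: Post }) {
  return <h1>{post.title}</h1>;
}
```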

Also, as a quick follow-up question: I noticed that there is a 100,000-row limit on Baserow. This seemed strange to me as Postgres can easily handle far more records than that. So I'm wondering why there is a 100,000 limit and what will be done to lift that limit in the future. Just a higher monthly fee? Thanks.

Will the Baserow API time out under these circumstances? What is the max connections limit? API rate limit?

The self-hosted version of Baserow doesn't have rate limiting at all. As long as your servers can handle it, you can fire off as many API requests as you want. The hosted version of Baserow currently has a rate limit of 20 requests per second, with a fair use policy of course. If it slows down our servers, we might temporarily block your account and reach out to you via email. A row GET request on a table that doesn't have that many fields should be super fast. I would kindly ask you to limit the number of requests to a maximum of about 8 per second, or to make consecutive requests one at a time, to make sure everything keeps working properly. It would be even better to do what @petrs suggested and use the row list endpoint to fetch multiple rows per request; that endpoint returns up to 200 rows at a time.
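If you do need to fetch rows one at a time, one simple way to stay around that ceiling is to fire the requests in chunks of 8 and pause roughly a second between chunks. A minimal sketch, with the table ID and row IDs as placeholders and assuming a global fetch:

```ts
// Hypothetical throttle: fetch single rows in batches of 8, roughly 8 req/s.
const TABLE_ID = "12345"; // placeholder

async function fetchRow(rowId: number): Promise<unknown> {
  const res = await fetch(
    `https://api.baserow.io/api/database/rows/table/${TABLE_ID}/${rowId}/?user_field_names=true`,
    { headers: { Authorization: `Token ${process.env.BASEROW_TOKEN}` } }
  );
  if (!res.ok) throw new Error(`Row ${rowId} failed: ${res.status}`);
  return res.json();
}

async function fetchRowsThrottled(rowIds: number[]): Promise<unknown[]> {
  const results: unknown[] = [];
  for (let i = 0; i < rowIds.length; i += 8) {
    // At most 8 requests in flight, then wait about a second before the next batch.
    const chunk = rowIds.slice(i, i + 8);
    results.push(...(await Promise.all(chunk.map(fetchRow))));
    if (i + 8 < rowIds.length) {
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
  return results;
}
```

That said, the list endpoint with up to 200 rows per request will always need far fewer calls in total.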

So I guess my question is: since Baserow uses Postgres on the backend, what kind of pooling is done to prevent the "too many clients" error, and what is the pool size?

In the hosted version of Baserow, we can have about 80 simultaneous connections to the PostgreSQL database. We currently have about 30 gunicorn workers running; each worker handles a single request at a time and keeps one connection open to the database. If there are more than 30 simultaneous API calls, the next one has to wait until one of the workers finishes. We've never needed more than this, and we can easily scale up horizontally if needed. We monitor this constantly.

Also, as a quick follow-up question: I noticed that there is a 100,000-row limit on Baserow. This seemed strange to me as Postgres can easily handle far more records than that. So I'm wondering why there is a 100,000 limit and what will be done to lift that limit in the future. Just a higher monthly fee? Thanks.

The self-hosted version will of course never have any data limitations; if your servers can handle it, it won't be a problem. The hosted version currently has a limit of 100,000 rows. Even though PostgreSQL can easily handle more than 100k rows in a single table, there are other factors at play. Some users, for example, have 100 fields in a single table, which results in 100 columns being created; this can have an impact on performance. There are also field types that depend on other field types, for example the formula and lookup fields. Depending on how your formulas and links to other tables are set up, updating a single cell could mean that 1,000 other rows need to be updated as well, which also affects performance.

In most cases everything will perform well, but there are a couple of exceptions that won't in combination with 100k rows. We value performance a lot and realize there are lots of opportunities to improve; these improvements will land in the upcoming months. Once we're comfortable that everything performs well with 100k rows, even in combination with lots of formulas and other fields, we will probably increase the data limits as well. Alternatively, you can always use the self-hosted version.
