Are you using our SaaS platform (Baserow.io) or self-hosting Baserow?
Self-hosted
If you are self-hosting, what version of Baserow are you running?
1.31
If you are self-hosting, which installation method do you use to run Baserow?
Docker Container
What are the exact steps to reproduce this issue?
This is not an issue.
I'm just wondering what to consider when setting the page size limit in Baserow.
I need to fetch up to 5000 rows as quickly as possible, and I assume that fetching them with a higher page size limit (1000 or 5000) would be quicker. What should I consider in terms of system load etc.?
Additionally: what other ways are there to speed up a single API call? Would requesting only particular "include" fields help? More RAM/CPU? More workers?
Increasing that variable allows more rows to be requested per call, but the same limit is also used by the notification system and row history. As a result, those will return more rows by default as well, which can make initial requests slower when dealing with large amounts of data.
How much slower depends on many factors, so it’s difficult to give a precise estimate. You can try increasing it to 500 first, then 1000, and observe the impact. In production, we found that 200 was a good balance since tables can have many fields, but your case might be different.
Another approach is to send multiple requests in parallel for different “pages” using limit and offset. It won’t be as fast as a single request for all 5000 rows, but if your system can handle multiple concurrent requests, you can achieve similar response times without making drastic changes to that variable.
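As a rough sketch of the parallel approach: the host URL, table id and token below are placeholders, and it assumes the list-rows endpoint is paginated with `page`/`size` query parameters (the pagination mentioned above); check the API docs generated for your own table and adapt accordingly.

```python
# Sketch: fetch ~5000 rows as several concurrent page requests.
# BASE_URL, TABLE_ID and API_TOKEN are placeholders for your own setup.
import concurrent.futures
import requests

BASE_URL = "https://baserow.example.com/api/database/rows/table/{table_id}/"
TABLE_ID = 123          # placeholder table id
API_TOKEN = "..."       # placeholder database token
PAGE_SIZE = 200         # stay within the configured page size limit
TOTAL_ROWS = 5000

HEADERS = {"Authorization": f"Token {API_TOKEN}"}


def fetch_page(page: int) -> list[dict]:
    """Fetch a single page of rows and return its result list."""
    response = requests.get(
        BASE_URL.format(table_id=TABLE_ID),
        headers=HEADERS,
        params={"page": page, "size": PAGE_SIZE, "user_field_names": "true"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["results"]


# Request all pages concurrently; a handful of workers is usually enough
# and avoids overloading the backend with too many simultaneous queries.
pages = range(1, TOTAL_ROWS // PAGE_SIZE + 1)
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    rows = [row for page_rows in executor.map(fetch_page, pages) for row in page_rows]

print(f"Fetched {len(rows)} rows")
```

Keep the number of workers modest; the goal is to overlap network and serialization time, not to saturate the server.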
The biggest performance impact when dealing with large datasets comes from serializing fields in the response. The fewer fields included, the less data needs to be serialized and transmitted. Using include selectively would likely have the most significant effect when handling many rows.
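For illustration, a minimal sketch of restricting the response to a few fields via the "include" query parameter; the field names ("Name", "Email") and the table id are hypothetical, and with `user_field_names=true` you can pass the visible field names (otherwise use the `field_<id>` identifiers your API docs list).

```python
# Sketch: reduce serialization work by only requesting the fields you need.
import requests

response = requests.get(
    "https://baserow.example.com/api/database/rows/table/123/",
    headers={"Authorization": "Token ..."},  # placeholder token
    params={
        "user_field_names": "true",
        "size": 200,
        # Only the listed fields are serialized and returned:
        "include": "Name,Email",
    },
    timeout=30,
)
response.raise_for_status()
rows = response.json()["results"]
```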