API with large files - how to make it faster?

Hello,

I am using this machine:
[image: screenshot of the machine]

It is very slow to create rows with the API (we are talking about 9000 rows). It creates one row every 3-4 seconds.

I tried the same thing on your hosted website and it is much faster.

Do you know what I can do to make my own installation faster?

I followed your "install on Ubuntu" guide.

Hey @ozzz ,

How many fields are in the table you are creating rows in? Are there lots of the more complicated field types like lookup, formula and link row fields?

Additionally, could you install a process monitoring tool like htop on your machine and check how much CPU and memory is being used whilst you are inserting rows? A screenshot would help, perhaps sent privately via PM.

I just saw you mentioned large files in the title. Are you uploading lots of files with each row? It would also be useful to know the specific endpoint you are using to create rows. Is it also slow when you create rows manually using the GUI?
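For scale: at 3-4 seconds per row, 9000 rows works out to 27,000-36,000 seconds, roughly 7.5 to 10 hours. If you want to put exact numbers on it from your side, here is a minimal timing sketch. It is not Baserow-specific: `create_row` is a hypothetical stand-in for whatever function in your script sends the request for one row, and the sleep in the demo only simulates a request.

```python
import statistics
import time
from typing import Callable, Iterable


def time_calls(create_row: Callable[[dict], object], rows: Iterable[dict]) -> dict:
    """Call create_row once per row and summarise how long each call took."""
    durations = []
    for row in rows:
        start = time.perf_counter()
        create_row(row)  # hypothetical: your own function that sends one row
        durations.append(time.perf_counter() - start)
    if not durations:
        return {"count": 0, "total": 0.0, "mean": 0.0, "max": 0.0}
    return {
        "count": len(durations),
        "total": sum(durations),            # seconds spent across all calls
        "mean": statistics.mean(durations),  # average seconds per row
        "max": max(durations),               # slowest single call
    }


if __name__ == "__main__":
    # Stand-in that only sleeps; replace it with the real request function.
    def fake_create_row(row: dict) -> None:
        time.sleep(0.01)

    sample = [{"name": f"row {i}"} for i in range(5)]
    print(time_calls(fake_create_row, sample))
```

Running this over a small sample of rows (say 20) should be enough to see whether every request is slow or only some of them.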

Hello,

Thanks for your answer.

Sorry for not being more precise.

I am uploading 9000 rows into a table. Each row has 26 text fields plus 1 link row field with only one relation in it.

I will install htop and keep you posted.

Go ahead, I’ve downloaded the image!

And so when you say you are uploading 9000 rows, are you doing that via the Import table modal, or are you using the API programmatically somehow?

I am using the API.

To be honest, I am not importing a file; I am migrating data from a database to Baserow.

Your data doesn’t look unreasonable at all, feel free to remove the link. We recently merged a profiling tool called django-silk onto our develop branch; it can be enabled in dev mode. This might be useful to figure out what is causing the slowdown. Are you comfortable with temporarily switching your Baserow onto the develop branch and into dev mode to debug this? If so, I’ll post some instructions on how to do this tomorrow morning!

I am ok to join the beta program.

What are the minimum technical requirements you use for your live version?

Another question: how can I upload CSV, JSON or another format in a way that creates a link row?

Thank you.

Hey ozzz,

We don’t yet have official technical requirements.

Here is a quick and dirty guide on how to enable the profiling for your Baserow environment. I’ve not tested this thoroughly and so whilst it should be safe, please only do this if you are OK with potentially breaking your Baserow environment and/or data loss.

$ sudo -i
$ cd /baserow
$ source env/bin/activate
$ pip3 install django-silk==4.2.0

# Ensure you copy and paste and run this entire command including the final EOT all in one go.
$ cat <<EOT >> /baserow/baserow/backend/src/baserow/config/settings/base.py 
# Delete these following lines to disable the silk profiling program once done
INSTALLED_APPS += ["silk"]  # noqa: F405

MIDDLEWARE += [  # noqa: F405
    "silk.middleware.SilkyMiddleware",
]

SILKY_ANALYZE_QUERIES = True
DEBUG = True
EOT

$ cat <<EOT >> /baserow/baserow/backend/src/baserow/config/urls.py 
from django.contrib.staticfiles.urls import staticfiles_urlpatterns

urlpatterns += [re_path("^silk/", include("silk.urls", namespace="silk"))]
urlpatterns += staticfiles_urlpatterns()
EOT

$ export DJANGO_SETTINGS_MODULE='baserow.config.settings.base'
$ export DATABASE_PASSWORD='yourpassword'
$ export DATABASE_HOST='localhost'
$ export REDIS_HOST='localhost'
$ baserow migrate

$ supervisorctl restart all

Now you should be able to visit http://localhost:8000/silk/ and see a dashboard of per-request performance. If you are accessing your Baserow server remotely then you’ll need to change http://localhost:8000 to whatever you’ve set the PUBLIC_BACKEND_URL to be.

Once you can see the /silk/ page, make the requests you find to be slow, then visit that page again and you should see the slowest requests listed. Could you then send screenshots of the slowest requests and, importantly, the SQL sub-page showing each query and how long it took?

Here’s a gif showing this:
[gif: silky]

Once we are done profiling your Baserow, you can revert the changes above by running:

$ sudo -i
$ cd /baserow/baserow
$ git checkout -- . 
$ source /baserow/env/bin/activate
$ pip uninstall -y django-silk
$ supervisorctl restart all

To answer your question about uploading CSVs to create link rows: this is not yet possible.

Hello, I got an error here. It says ‘path’ is not defined.

File "/baserow/baserow/backend/src/baserow/config/urls.py", line 24, in <module>
    urlpatterns += [path("silk/", include("silk.urls", namespace="silk"))]
NameError: name 'path' is not defined

Should it be re_path instead of path?

Ah, my bad. Could you reset by doing:

cd /baserow/baserow
git checkout -- . 
cd /baserow

And then try running the commands above again? I’ve updated them to fix the error.


Here it is.

The problem is that I have a lot of requests like this to make.

Could you show me the SQL tab on that page?

Here is the picture you asked for.

I have 13 pages like that.