Can the Baserow cluster share the same Redis?

Because the servers’ network is restricted, they cannot use Docker. I want to run Baserow on several servers under supervisor to form a cluster, and then use nginx for load balancing.

I currently have only one Redis server, and there are about five servers in the Baserow cluster. If they all use the same Redis, will it cause problems with some of Baserow’s functions, for example “duplicate database” and “export table” (celery tasks)?

What should I do to avoid these problems?

Hi @Chase, Baserow has been developed with horizontal scalability in mind. It is designed so that all gunicorn workers and celery workers share the same single Redis and PostgreSQL database. Celery distributes the tasks between workers out of the box, so duplicating and exporting should work as expected.
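For example, a minimal sketch of the environment every node would share (hypothetical IPs; all other settings omitted):

[supervisord]
environment =
    DATABASE_HOST='10.0.0.10',
    DATABASE_NAME='baserow',
    REDIS_HOST='10.0.0.11',

As long as these point at the same PostgreSQL and Redis on every machine, the celery workers on all servers consume from the same queues. Please let me know if you run into any problems.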

Hi @bram, I’d like to share my problems, and thank you for your kind reply. Hope your work is pleasant and everything goes well!

Hi @bram, when I started the Baserow cluster, I found this warning in the log:

“DuplicateNodenameWarning: Received multiple replies from node names: celery@exportworker, celery@worker.
Please make sure you give each node a unique nodename using the celery worker -n option.”

Do I need to pay attention to this warning? Should I add “-n” to celery’s startup command?

Here is what we do in production: when running multiple normal celery workers and export workers, we give each one a unique name. If the config below doesn’t work, would you mind sharing your supervisor config?

[program:worker]
command=/baserow/env/bin/celery -A baserow worker -l INFO -Q celery -n default-worker@%h

[program:exportworker]
command=/baserow/env/bin/celery -A baserow worker -l INFO -Q export -n export-worker@%h
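The %h in the node name is expanded by celery to the machine’s hostname, so every server automatically ends up with a unique node name.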

Hi @bram, I’ve shared my current supervisor config below. Please check it and point out any errors, thanks!

[supervisord]
nodaemon = true
environment =
    DJANGO_SETTINGS_MODULE='baserow.config.settings.base',
    MEDIA_ROOT='/home/baserow/media',
    DATABASE_HOST='142.468.0.00',      
    DATABASE_PASSWORD='',
    SECRET_KEY='0ifhVlGHfxSVTY6mn0lzIwAuKsJNm9Xrdm1box10xnaaY4EklR8JFnVrEMp3CsWe',
    DATABASE_NAME="baserow",
    DATABASE_USER="baserow",
    PRIVATE_BACKEND_URL='http://backend:8000',  ---> (private IP of each server)
    PUBLIC_WEB_FRONTEND_URL='http://10.416.77.000:3000', ---> (public web-frontend nginx url)
    PUBLIC_BACKEND_URL='http://10.416.77.000:8000', ---> (public backend nginx url)
    MEDIA_URL='http://10.416.77.000:8080/media/',  ---> (media  nginx url)
    REDIS_HOST='142.328.0.00', 
    REDIS_PASSWORD='',

[program:gunicorn]
command = /home/baserow/python/virtual/env/bin/gunicorn -w 5 -b 0.0.0.0:8000 -k uvicorn.workers.UvicornWorker baserow.config.asgi:application --log-level=debug --chdir=/home/baserow/
stdout_logfile=/home/baserow/logs/backend.log
stderr_logfile=/home/baserow/logs/backend.error
user=baserow

[program:worker]
command=/home/baserow/python/virtual/env/bin/celery -A baserow worker -l INFO -Q celery -n worker@%%h
stdout_logfile=/home/baserow/logs/worker.log
stderr_logfile=/home/baserow/logs/worker.error
user=baserow

[program:exportworker]
command=/home/baserow/python/virtual/env/bin/celery -A baserow worker -l INFO -Q export -n exportworker@%%h
stdout_logfile=/home/baserow/logs/exportworker.log
stderr_logfile=/home/baserow/logs/exportworker.error
user=baserow

[program:beatworker]
directory=/home/baserow/python
command=/home/baserow/python/virtual/env/bin/celery -A baserow beat -l INFO -S redbeat.RedBeatScheduler
stdout_logfile=/home/baserow/logs/beatworker.log
stderr_logfile=/home/baserow/logs/beatworker.error
user=baserow

[program:nuxt]
directory = /home/baserow/web-frontend
command = node ./node_modules/.bin/nuxt start --hostname 196.542.15.00 --port 3000 --config-file ./config/nuxt.config.dev.js
stdout_logfile = /home/baserow/logs/frontend.log
stderr_logfile = /home/baserow/logs/frontend.error
user=baserow


Also, a literal “@%h” raises a format error in supervisor, because supervisor reserves “%” for its own string expressions, so it has to be escaped as “@%%h”.

I would also like to share two other problems I encountered when using the cluster; others may run into similar issues. I haven’t modified any Baserow code.

export:
When I use the Baserow cluster behind nginx, my export request is forwarded to one server and the file is generated in that server’s MEDIA_ROOT, but the follow-up request to download the file may be forwarded to a different server, which makes the download fail.
Solutions:
I guess that by configuring nginx, the export and download requests could be forwarded to the same server?
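Something like this might work (a sketch with hypothetical IPs; ip_hash pins each client IP to one backend, so the export request and the follow-up download land on the same server):

upstream baserow_backend {
    ip_hash;                 # pin each client IP to one backend server
    server 10.0.0.1:8000;    # hypothetical private IPs of the Baserow nodes
    server 10.0.0.2:8000;
    server 10.0.0.3:8000;
}

server {
    listen 8000;
    location / {
        proxy_pass http://baserow_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

That said, a MEDIA_ROOT shared by all servers would avoid the problem entirely, regardless of which server handles which request.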

import:
When I import a file to create a table, it often fails; the page shows “Something went wrong during the file_import job execution.” and the backend log contains this error:


==> exportworker.error <==
[2022-12-05 11:41:05,313: INFO/MainProcess] Task baserow.core.jobs.tasks.run_async_job[4da34f10-9faa-4c14-bc98-1437e718e840] received
[2022-12-05 11:41:05,338: ERROR/ForkPoolWorker-6] Task baserow.core.jobs.tasks.run_async_job[4da34f10-9faa-4c14-bc98-1437e718e840] raised unexpected: FileNotFoundError(2, 'No such file or directory')
Traceback (most recent call last):
  File "/home/baserow/python/virtual/env/lib/python3.8/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/baserow/python/virtual/env/lib/python3.8/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/baserow/backend/src/baserow/core/jobs/tasks.py", line 73, in run_async_job
    raise e
  File "/home/baserow/backend/src/baserow/core/jobs/tasks.py", line 36, in run_async_job
    JobHandler().run(job)
  File "/home/baserow/backend/src/baserow/core/jobs/handler.py", line 62, in run
    return job_type.run(job, progress)
  File "/home/baserow/backend/src/baserow/contrib/database/file_import/job_type.py", line 136, in run
    with job.data_file.open("r") as fin:
  File "/home/baserow/python/virtual/env/lib/python3.8/site-packages/django/db/models/fields/files.py", line 76, in open
    self.file = self.storage.open(self.name, mode)
  File "/home/baserow/python/virtual/env/lib/python3.8/site-packages/django/core/files/storage.py", line 38, in open
    return self._open(name, mode)
  File "/home/baserow/python/virtual/env/lib/python3.8/site-packages/django/core/files/storage.py", line 243, in _open
    return File(open(self.path(name), mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/baserow/media/user_1/file_import/job__176.json'

It seems that Django’s FileField did not create the file under “MEDIA_ROOT/user_1/file_import”. When I check the server, the “job__176.json” file was not generated, but after a few more attempts the import succeeds.
Solutions:
What puzzles me the most is why it sometimes succeeds and sometimes fails; after a failure, retrying a few times eventually works. I have no solution yet. :face_exhaling:
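One untested idea: if the cause is the same as with exports, the uploaded file lands in one server’s local MEDIA_ROOT while the celery task runs on another server, which would explain why it only fails some of the time. Sharing MEDIA_ROOT across all nodes might then fix both problems; a sketch, assuming a hypothetical NFS export at 10.0.0.12:/exports/baserow-media:

# on every Baserow server, mount the same shared directory at MEDIA_ROOT
# (10.0.0.12:/exports/baserow-media is a hypothetical NFS export)
sudo mount -t nfs 10.0.0.12:/exports/baserow-media /home/baserow/media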

Hi @bram, I seem to have found the reason; the problem is explained here:

Baserow Community

I am trying to solve the problem. If you have time to share your ideas, I would be very grateful. :pray: