Backend connection closes after 10 - 20 minutes

Hello :grinning:,

I have been running baserow for a while now without any issues. Due to the limited resources on my current server I am planning to move my baserow instance over to a more powerful machine.

On my new machine I am facing a strange problem. Although Baserow seems to run perfectly, I get disconnected after working with it for 5-20 minutes.

The Backend Logs show this:

[BACKEND][2023-01-17 09:24:49] 127.0.0.1:56582 - "GET /_health/ HTTP/1.1" 200
[BACKEND][2023-01-17 09:24:56] [2023-01-17 09:24:49 +0000] [417] [INFO] connection closed
[BACKEND][2023-01-17 09:25:00] 127.0.0.1:34638 - "GET /_health/ HTTP/1.1" 200
[BACKEND][2023-01-17 09:55:55] 127.0.0.1:36048 - "GET /_health/ HTTP/1.1" 200
**[BACKEND][2023-01-17 09:56:13] [2023-01-17 09:55:55 +0000] [416] [INFO] connection closed**
[BACKEND][2023-01-17 09:56:27] 127.0.0.1:50664 - "GET /_health/ HTTP/1.1" 200

I am running the docker image baserow:1.13.3 behind a Caddy Reverse Proxy. Port 80 from the container is mapped to 3001 on the host.

Caddyfile entry:

    myurl.com {
        reverse_proxy localhost:3001
    }

I am running the same docker image without problems on my old host. There are differences between the systems, but I cannot see how these could cause the issue:

Old server:
Arch Linux
Docker
Traefik

New Server:
NixOS
Podman
Caddy

Does anyone have an idea what could cause this issue?

Hi @m3tam3re,

So when you say you get disconnected after 20 minutes, do you see anything in the UI?

The only persistent connection we keep open during use of Baserow is the websocket one for realtime.

Are you able to provide more logs, or a full set? Feel free to email them to me at nigel@baserow.io. No worries if not.

Thanks for helping out. I have sent you the logfile from the container.

I do not remember the URL exactly, but it was some sort of redirect url (sorry I am at work and unable to reproduce this right now)

This happens when I am actively working and when baserow is just open doing nothing. Basically you click something and get to this redirect url. If the backend is available you are redirected to the login, otherwise you get a 404.

I will update this for the exact url later on.

Hey @m3tam3re ,

This might have been caused by a bug in 1.13.X which has been fixed in 1.14.0. Could you upgrade and see if that helps?

Additionally, I am seeing errors in your logs which look like you had already upgraded to 1.14.0 but then went back down to 1.13.3?

Hey @nigel

I had to switch NixOS generations back, that’s why the container version was reverted to 1.13.
I have Baserow running on 1.14.0 now and the problem seems to be gone :grinning:

Thanks for your help :pray:

@nigel it seems like the error is still there on 1.14.0

Instead of the blank redirect page followed by the 404 in container version 1.13.3 I get:

followed by:

When I try to enter the dashboard I get a 404:

[BACKEND][2023-01-20 10:22:38] 127.0.0.1:35900 - "GET /_health/ HTTP/1.1" 200
[BACKEND][2023-01-20 10:22:48] 127.0.0.1:41750 - "GET /_health/ HTTP/1.1" 200
**[BACKEND][2023-01-20 10:22:58] [2023-01-20 10:22:48 +0000] [417] [INFO] connection closed**
[BACKEND][2023-01-20 10:23:04] 127.0.0.1:53024 - "GET /_health/ HTTP/1.1" 200
[BACKEND][2023-01-20 10:23:18] 127.0.0.1:53034 - "GET /_health/ HTTP/1.1" 200

Hi @m3tam3re ,

So to confirm/clarify:

  1. You use Baserow normally for a period of time
  2. Suddenly, after doing something for about 20 minutes or just randomly, you get redirected to a 404 page
  3. You then have to log back in

Could you provide/email me:

  1. The latest logs from your docker container once again
  2. The command with anonymized environment variables or .env file you are using to launch and run your Baserow server
  3. If you know how, in browsers you can right click on the page and click “inspect” to open a debug window. In this window it would be very useful for me to see the contents of the “network” tab and “console” tab immediately after you hit this issue. I believe these tabs need to be open and visible before the error occurs for it to show up in them.

I have a suspicion this is happening when your login token expires, but the token refreshing process fails and you get logged out somehow.
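To check whether the logout lines up with token expiry, something like the following could help. It is only a rough sketch and assumes the login token is a standard JWT carrying an `exp` claim; the function names are made up for illustration, and the signature is not verified, so this is for debugging only.

```python
import base64
import json
from datetime import datetime, timezone


def jwt_expiry(token: str) -> datetime:
    """Return the expiry time (UTC) stored in a JWT's `exp` claim.

    Assumption: the token is a standard three-part JWT.
    The signature is NOT verified; debugging use only.
    """
    # A JWT has three base64url segments: header.payload.signature
    payload_b64 = token.split(".")[1]
    # base64url segments omit padding, so restore it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return datetime.fromtimestamp(payload["exp"], tz=timezone.utc)


def seconds_left(token: str, now: datetime) -> float:
    """Seconds until the token expires (negative if already expired)."""
    return (jwt_expiry(token) - now).total_seconds()
```

If the time of the disconnect matches the expiry time printed for a token copied from the browser, that would point at the refresh request failing rather than at the websocket.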

Sorry for all the hassle!

Thanks,
Nigel

Hey @nigel

For your clarification question:
I have run my Baserow installation on another server for months without problems. This problem seems to happen only on the new server, which my instance will be migrated to as soon as I have fixed this problem.

There are 3 key differences between the servers:
OS: new system runs NixOS / old system runs Arch Linux
Reverse proxy: new system runs Caddy / old system runs Traefik
Docker: new system runs the container with podman as a systemd service / old system runs regular docker

I have also sent you the logs.

here is my docker/podman command:

podman run \
  --rm \
  --name='baserow' \
  --log-driver=journald \
  --cidfile=/run/podman-'baserow'.ctr-id \
  --cgroups=no-conmon \
  --sdnotify=conmon \
  -d \
  --replace \
  -e 'BASEROW_PUBLIC_URL'='https://myurl.com' \
  -e 'EMAIL_SMTP'='in-v3.mailjet.com' \
  -e 'EMAIL_SMTP_HOST'='in-v3.mailjet.com' \
  -e 'EMAIL_SMTP_PASSWORD'='/run/agenix/mj-smtp-pass' \
  -e 'EMAIL_SMTP_PORT'='587' \
  -e 'EMAIL_SMTP_USER'='/run/agenix/mj-smtp-user' \
  -p '3001:80' \
  -v 'baserow_data:/baserow/data' \
  '--add-host=postgres:10.88.0.1' \
  docker.io/baserow/baserow:1.14.0

The command is run as a systemd service.

I have kept the network tab open. It seems to be an issue with the token, as you said:

Hm yeh it definitely shouldn’t be responding with 404. Could you select the red 404 token-refresh row in the network log and provide the contents of the “response” tab that appears on the right?

Edit: And also in the Headers tab the “Response Headers” section?

I’m trying to figure out if it’s a reverse proxy between Baserow and your browser that is returning 404, or Baserow’s API server itself.
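One way to narrow that down outside the browser is to request the same path once directly against the container port and once through the public URL, then compare who answers. The sketch below uses only the Python standard library; `probe` and `same_responder` are hypothetical helper names, and the URLs in the comments are placeholders based on the setup described in this thread.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 10.0) -> dict:
    """Return the status code and identifying headers for a GET to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status, headers = resp.status, resp.headers
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry the headers we want to inspect
        status, headers = err.code, err.headers
    return {
        "status": status,
        "server": headers.get("Server"),
        "via": headers.get("Via"),
    }


def same_responder(a: dict, b: dict) -> bool:
    """True when both probes report the same status and Server header."""
    return a["status"] == b["status"] and a["server"] == b["server"]


# Example usage (placeholders; substitute the path of the failing request
# taken from the browser's network tab):
#   direct = probe("http://localhost:3001/<failing-path>")
#   public = probe("https://myurl.com/<failing-path>")
#   print(direct, public, same_responder(direct, public))
```

If the direct request succeeds while the public one returns 404 with a different `Server` header, the 404 is most likely coming from something in front of Baserow rather than from Baserow itself.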

Hi @m3tam3re, so that 404 response does not look like it is coming from Baserow’s API server or its internal Caddy, but from some other HTTP server. Is there a chance that your own Caddyfile, which routes requests to Baserow, is matching incorrectly and misrouting some of them?

@nigel I have found the problem. Unfortunately I have completely wasted your time with this issue.
I have been digging deeper into the requests made from my browser. The problem was that there was a conflicting DNS record on the subdomain the new baserow instance was running on.
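For anyone hitting something similar, a quick check along these lines might surface a conflicting record. It is a minimal sketch using the standard library: `resolve_all` and `unexpected_addresses` are illustrative names, the IPs in the usage comment are documentation placeholders, and the result only reflects what the local resolver returns at that moment.

```python
import socket


def resolve_all(hostname: str) -> set[str]:
    """Return every IPv4/IPv6 address the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    return {info[4][0] for info in infos}


def unexpected_addresses(resolved: set[str], expected: set[str]) -> set[str]:
    """Addresses the name resolves to that do not belong to the intended server."""
    return resolved - expected


# Example usage (placeholder hostname and IP):
#   extra = unexpected_addresses(resolve_all("myurl.com"), {"203.0.113.10"})
#   print(extra or "only the expected server is returned")
```

Any address in the result that is not the new server could explain why some requests end up at a different HTTP server and come back as 404.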

Thank you very much for your help and sorry for the wasted time.

No worries at all, glad you got it all working. These are always useful data points for helping future users and for figuring out how to make our error handling, display and debugging tools work better.