Webhook not called when importing a CSV

Hi everyone!

I have configured a webhook to be called whenever a row is created. The webhook is indeed called when I use a form to create a new row, or when I create a row directly in the grid.

However, when I use the “import from CSV” feature, all the rows are created in the grid, but the webhook is never called. I have checked the webhook call log and nothing appears there.

About the webhook configuration: I use event_type: rows.created
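For context, a rows.created delivery carries the created rows in its body, roughly like this (the field names are approximate and can differ between Baserow versions; check your own call log for the exact shape):

```python
# Approximate shape of a Baserow "rows.created" webhook payload.
# Field names are illustrative only.
example_payload = {
    "table_id": 1,
    "event_id": "a1b2c3d4",        # unique per delivery
    "event_type": "rows.created",
    "items": [                     # one entry per created row
        {"id": 42, "order": "1.00", "Name": "First row"},
    ],
}
```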

Have I missed something in the webhook configuration?

I use a self-hosted instance.

Thanks for your help!

Hey there @hivewapp,

We don’t always trigger a webhook when a row is created, even if you have set it up to do that.

We only send webhook requests when we also send realtime updates.

What are realtime updates?

Any time you change a row manually, for example, Baserow will send that update to everyone else in your group who is currently online. This ensures that the data is always up to date when two people are working on the same table.

Why does it make sense to couple that to webhooks?

Generally speaking, realtime updates should be sent whenever something changes that another user currently has access to. In the case of an import, it is impossible for another user to have access to that table before or during the import, and therefore we don’t need realtime updates.

So the assumption here is: if we need to inform users about a change in realtime, we should also trigger a webhook.
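Roughly speaking, the coupling looks like one signal fanning out to both mechanisms, so skipping the signal during an import skips both. This is an illustration only, not Baserow’s actual code; the signal name and handlers are made up:

```python
# Illustration only: a single "rows created" signal fans out to both the
# realtime broadcaster and the webhook dispatcher, so an import that never
# sends the signal produces neither realtime updates nor webhook calls.
from django.dispatch import Signal, receiver

rows_created = Signal()  # hypothetical signal, sent with table=..., rows=...

def send_realtime_update(table, rows):      # stand-in for the websocket broadcast
    print(f"realtime: {len(rows)} new rows in table {table}")

def call_registered_webhooks(table, rows):  # stand-in for the webhook dispatcher
    print(f"webhooks: {len(rows)} new rows in table {table}")

@receiver(rows_created)
def broadcast_realtime(sender, table, rows, **kwargs):
    send_realtime_update(table, rows)

@receiver(rows_created)
def trigger_webhooks(sender, table, rows, **kwargs):
    call_registered_webhooks(table, rows)

if __name__ == "__main__":
    # Both handlers fire from the one signal.
    rows_created.send(sender=None, table="demo", rows=[{"id": 1}])
```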

Now, looking at your post, I understand the confusion. Clearly it doesn’t make sense to send realtime updates for an import, but you could make the case that a webhook should be triggered for an import, given that a webhook can be used for all sorts of things, not just to avoid conflicts between concurrent users.

@nigel how do you feel about that? Just an oversight on our end or do we have good reasons to not trigger the webhook here? (apart from the code being very tightly coupled with signals)

Hi @Alex, thanks for your detailed answer.

To give you a broader view of my use case: I’m using Baserow as a database, but I also use other apps to handle some automation (e.g. sending emails).
So, for example, each time there is a change in my Baserow data, I plan to use a webhook to kick off some automation around the received data.
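For instance, a small receiver along these lines could turn each rows.created event into an e-mail (the route, addresses, and SMTP host are placeholders, purely a sketch):

```python
# Hypothetical automation receiver: send one notification e-mail per
# newly created row. Route, addresses, and SMTP host are placeholders.
import smtplib
from email.message import EmailMessage

from flask import Flask, request

app = Flask(__name__)

def notify(row: dict) -> None:
    msg = EmailMessage()
    msg["Subject"] = f"New Baserow row {row.get('id')}"
    msg["From"] = "bot@example.com"
    msg["To"] = "team@example.com"
    msg.set_content(str(row))
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)

@app.route("/automation", methods=["POST"])
def automation():
    payload = request.get_json(force=True)
    if payload.get("event_type") == "rows.created":
        for row in payload.get("items", []):
            notify(row)
    return "", 204
```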

Yeah, that makes perfect sense. I agree that in your case (and in general) it would be useful to trigger the webhook in the case of an import.

Hi @Alex! Any news on this topic, please? Has it been released, or is it included on the official roadmap?

Hey @hivewapp, I am not actively involved in the roadmap planning, so I would need to defer to either @olgatrykush or @bram on that matter. 🙂

Hi @hivewapp, this feature is currently not on the roadmap. We’re a bit worried about what happens if someone uploads a huge CSV file. That can result in a massive payload with hundreds of megabytes of data, and most web servers don’t accept payloads that big.

Would it be acceptable to you if we introduced a “refresh” webhook event? This would be an event without the newly created rows; when you receive it, it indicates that you must use the API to refetch all the rows again.

What do you mean by “all the rows again”? Does it mean all the rows in the table, or only the new rows?

The refresh event would be acceptable if it contains at least the new/updated row IDs. If not, it means the API client would have to refetch all the existing (and mostly useless) rows and keep a synced state just to find out which rows are new or updated, no?

I understand the concern about huge CSV files. But how do you currently manage to insert the rows into the Baserow database if the CSV file is that huge? I mean with the current import button.

What I meant by “all the rows again” is that you would have to refetch all the rows from Baserow using the API.

Just sending the newly created IDs can definitely be an option for preventing a huge payload. Maybe we could even make that optional when creating the webhook.
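To make the idea concrete, a receiver could react to such a refresh event by paging through the documented list-rows endpoint. The event name, base URL, and token below are assumptions for illustration:

```python
# Sketch of reacting to a hypothetical "rows.refresh" event by refetching
# everything through Baserow's documented list-rows endpoint.
# BASE_URL, TOKEN, and the event name are assumptions.
import requests

BASE_URL = "https://baserow.example.com"
TOKEN = "your-database-token"

def fetch_all_rows(table_id: int) -> list[dict]:
    rows = []
    url = f"{BASE_URL}/api/database/rows/table/{table_id}/?user_field_names=true"
    while url:
        resp = requests.get(url, headers={"Authorization": f"Token {TOKEN}"})
        resp.raise_for_status()
        data = resp.json()
        rows.extend(data["results"])
        url = data["next"]  # paginated; None on the last page
    return rows

def handle_event(payload: dict) -> None:
    if payload.get("event_type") == "rows.refresh":  # hypothetical event
        local_state = fetch_all_rows(payload["table_id"])
        print(f"resynced {len(local_state)} rows")
```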

Anyway, I agree with you that we should look into this further. I’ve created an issue for this problem on the backlog here: Trigger create webhook when (csv, json, xml, etc) file is imported (#2002) · Issues · Baserow / baserow · GitLab. When we start working on this, we can think through the correct technical solution.

Thank you for your answer!
I don’t really know how it works internally, but we could also consider using a batch process (with an env var for the batch size), something like the sketch below.
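Purely as an illustration (the env var name and webhook URL are made up, and Baserow does not implement this today):

```python
# Illustration of the batching idea: deliver an import's rows to the
# webhook in fixed-size chunks instead of one giant payload.
# WEBHOOK_BATCH_SIZE and the webhook URL are hypothetical.
import os

import requests

BATCH_SIZE = int(os.environ.get("WEBHOOK_BATCH_SIZE", "100"))

def deliver_in_batches(webhook_url: str, rows: list[dict]) -> None:
    for start in range(0, len(rows), BATCH_SIZE):
        chunk = rows[start:start + BATCH_SIZE]
        requests.post(
            webhook_url,
            json={"event_type": "rows.created", "items": chunk},
            timeout=10,
        )
```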

That’s why I asked how you process a huge CSV file when you insert the data into Baserow (leaving webhooks aside).

I saw that you read the CSV file in the web frontend, but I haven’t yet seen whether the backend directly processes the uploaded CSV file or the string that the web app sends to it.

Importing large files into a table is done via an async job (Celery), because it can take minutes to process if the data set is large enough. There we can have long-running jobs, communicate progress, etc.
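Not Baserow’s actual code, but the general shape of such a job looks like this (save_row is a stand-in for the real insert logic, and the broker URL is a placeholder):

```python
# General shape of a long-running Celery import job that reports progress.
# Illustration only; save_row and the broker URL are placeholders.
import csv

from celery import Celery

app = Celery("imports", broker="redis://localhost:6379/0")

def save_row(row: dict) -> None:
    ...  # stand-in for the real insert logic

@app.task(bind=True)
def import_csv(self, path: str) -> dict:
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    for i, row in enumerate(rows, start=1):
        save_row(row)
        if i % 500 == 0:
            # update_state lets API clients poll the job's progress.
            self.update_state(state="PROGRESS",
                              meta={"done": i, "total": len(rows)})
    return {"done": len(rows)}
```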

Some web servers complain if the request body payload is too large, or if the request takes too long to process. I’m worried that if we send the data of all the created rows to a webhook receiver, we’ll probably run into one of those errors.