Remove duplicates

Hi,

Can someone point me in the direction of an easy(ish) way to remove duplicates from my tables? I'm not a coder, but I can hack something together with the API.

Kind Regards

Hi Jonathan,

Depending on the number of tables and records, I think there are two options:

  1. Sorting your data on multiple criteria and removing the duplicates manually. This is an option if you only have to do it once for a limited number of records.

  2. Creating an automation (with Make, n8n, Zapier or Pipedream, for example) that checks whether a record exists before inserting it into a table.

The scenario for the automation will look a bit like this:

  1. Duplicate the table that you want to remove the duplicates from, then delete all the data in the copy, so you end up with only the table structure.

  2. Use the Baserow building block in your automation to get all the records from the original table (the one with the data and the duplicates).

  3. Use the iterator block to loop over all the results. For each record, do an API call in a new block to check if that record exists in your new table. If not, insert the record in the new table; otherwise go to the next record.
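If you would rather script this against the API than use an automation tool, the three steps above could be sketched roughly like this in Python. Treat it as a starting point, not a finished tool. Assumptions on my side: you are on hosted Baserow (`api.baserow.io`), you have a database token, and the function names (`row_key`, `unique_rows`, `copy_unique`) are made up for this example. It also differs from the scenario above in one way: instead of asking the new table "does this record exist?" for every record, it remembers the keys it has already seen in memory, which saves one API call per record.

```python
import json
import urllib.request

# Assumptions: hosted Baserow and a database token with read/create rights.
API_ROOT = "https://api.baserow.io/api/database/rows/table"
TOKEN = "YOUR_DATABASE_TOKEN"  # placeholder, replace with your own


def row_key(row, key_fields):
    """Build a comparable key from the fields that define a duplicate."""
    parts = []
    for field in key_fields:
        value = row.get(field)
        # Ignore case and surrounding spaces; treat empty cells as "".
        parts.append("" if value is None else str(value).strip().lower())
    return tuple(parts)


def unique_rows(rows, key_fields):
    """Keep the first row for each key, in the original order."""
    seen = set()
    kept = []
    for row in rows:
        key = row_key(row, key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def api_call(url, payload=None):
    """GET when payload is None, otherwise POST the payload as JSON."""
    data = json.dumps(payload).encode() if payload is not None else None
    request = urllib.request.Request(url, data=data, headers={
        "Authorization": f"Token {TOKEN}",
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_rows(table_id):
    """Step 2: read every row of the original table, page by page."""
    url = f"{API_ROOT}/{table_id}/?user_field_names=true&size=200"
    rows = []
    while url:
        page = api_call(url)
        rows.extend(page["results"])
        url = page["next"]  # None on the last page
    return rows


def copy_unique(source_table_id, target_table_id, key_fields):
    """Step 3: insert only the first occurrence of each key into the empty copy."""
    for row in unique_rows(fetch_all_rows(source_table_id), key_fields):
        # "id" and "order" are set by Baserow, so leave them out.
        payload = {k: v for k, v in row.items() if k not in ("id", "order")}
        api_call(f"{API_ROOT}/{target_table_id}/?user_field_names=true", payload)
```

You would call it with your own table IDs and the fields that define a duplicate, for example `copy_unique(123, 456, ["Name", "Email"])` (IDs and field names are hypothetical). Copying values back as-is should be fine for plain text and number fields; fields such as single selects, link-to-table or files come back from the API as objects and will most likely need converting before they can be inserted.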

Once you have clean data in your new tables, you can adjust the automation to prevent duplicate records from being inserted again.