Whether it’s an import error or human error, duplicate data makes your databases less useful. Here are four options for identifying duplicates in Baserow:
Option 1: Manual sorting and removal
Option 2: Identify duplicates using Zapier
Option 3: Create new rows using Make
Option 4: Remove existing duplicates with n8n
Let’s dive in!
Thank you for the tip.
That is good, but I still believe (as I requested in the past) that detecting duplicates right in Baserow, without needing another piece of software or service, would be much appreciated.
I think this could be possible to implement, especially if there is a mechanism already to see unique and non-unique field content.
For example, my use case is pretty simple:
I have a table with 10k+ rows where the primary field is a plain text field holding people’s full names. I need to find duplicates (the same full name appearing two or more times in the table), review them visually, and remove them manually. Sometimes the same full name doesn’t mean it is the same person (there can be two or more different people with the same name, and those I need to keep in the table).
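To make the use case above concrete, here is a minimal Python sketch of the review step: it groups rows by a normalized full name and reports only the names that occur two or more times, without deleting anything, so a person can still decide which rows are real duplicates. It assumes the rows have already been exported from the table as a list of dictionaries; the `id` and `Full name` keys and the function name `find_duplicate_names` are illustrative assumptions, not part of Baserow.

```python
from collections import defaultdict


def find_duplicate_names(rows, field="Full name"):
    """Group rows by normalized name and keep only names with 2+ rows.

    `rows` is assumed to be a list of dicts exported from the table;
    the field name is a hypothetical example.
    """
    groups = defaultdict(list)
    for row in rows:
        # Normalize: trim, collapse inner whitespace, ignore case.
        key = " ".join(str(row.get(field) or "").split()).casefold()
        if key:  # skip rows with an empty name
            groups[key].append(row)
    # Nothing is removed here; the result is only a list for manual review.
    return {name: members for name, members in groups.items() if len(members) > 1}


# Made-up sample data for illustration.
rows = [
    {"id": 1, "Full name": "Jane Doe"},
    {"id": 2, "Full name": "John Smith"},
    {"id": 3, "Full name": "jane  doe "},
    {"id": 4, "Full name": "Ann Lee"},
]

duplicates = find_duplicate_names(rows)
for name, members in duplicates.items():
    print(name, [r["id"] for r in members])
```

Because namesakes have to stay in the table, the sketch deliberately stops at listing candidate groups; which rows to remove remains a manual decision.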
I can’t say anything about the first three suggestions, but the n8n instructions will not work.
There is no need to use the ‘split in batches’ node, but even if you did use it, this is not the way to do it. Without actually creating a loop back to the node, it will only process the first batch and then stop executing. By the way, the ‘split in batches’ node also looks different (and works more intuitively) in the latest n8n versions.
The ‘item lists’ node does not exist any longer in the latest versions of n8n. It has been replaced by separate nodes. In this case, you would use the ‘remove duplicates’ node instead.
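The two points above can be illustrated with a conceptual sketch in plain Python. This is not n8n code and does not reflect how n8n implements its nodes; `batches`, `remove_duplicates`, and the sample rows are hypothetical names used only for illustration. It shows that splitting into batches only helps if something loops over every batch, and that removing duplicates on a field needs just a single pass over all items.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def remove_duplicates(rows, field):
    """Keep the first row for each distinct value of `field`."""
    seen = set()
    unique = []
    for row in rows:
        value = row.get(field)
        if value not in seen:
            seen.add(value)
            unique.append(row)
    return unique


# Made-up sample data for illustration.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "a"},
    {"id": 4, "name": "c"},
    {"id": 5, "name": "b"},
]

# Without a loop, only the first batch is ever handled.
first_only = next(batches(rows, 2))

# With a loop, every batch is visited.
processed = [row for batch in batches(rows, 2) for row in batch]

# No batching is needed to deduplicate: one pass over all rows is enough.
unique_rows = remove_duplicates(rows, "name")
```

Here `first_only` holds two rows while `processed` holds all five, which mirrors the behaviour described above when the loop back to the batching step is missing.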
Thank you for your feedback, @naamval. Our content lead has made the necessary changes to the blog post.
Hey @marcus, sure thing, that’s on our list! While we don’t have these features yet, we thought it might be useful to share a tutorial with some workarounds.
OK, that would be really great, the sooner the better, because doing it manually is so exhausting and very error-prone. This is especially true when I need to tag selected rows (people) with a specific status: there is a problem later (e.g. when filtering) if one of them is tagged differently than the other one…