YouTube Shorts Demonstration Video
How to get duplicate values from any datasource, array, or JSON array in Power Automate.
Select Action Note:
Add a Select action if you want to find more than just exact whole-record duplicates. Use this action to select the specific item columns to check for duplicates.
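As a rough illustration in Python (not Power Automate), the Select step just projects each source row down to the columns you want to compare. Col1 and Col2 come from the template; the ID and Modified fields are made-up examples:

# Minimal Python sketch of what the Select action does: project each source row
# down to only the columns used for the duplicate check (Col1/Col2 as in the template).
rows = [
    {"ID": 1, "Col1": "A", "Col2": "x", "Modified": "2024-01-01"},
    {"ID": 2, "Col1": "B", "Col2": "y", "Modified": "2024-01-02"},
    {"ID": 3, "Col1": "A", "Col2": "x", "Modified": "2024-01-03"},
]

# Equivalent of the Select mapping {"Col1": item()?['Col1'], "Col2": item()?['Col2']}
selected = [{"Col1": r["Col1"], "Col2": r["Col2"]} for r in rows]
print(selected)  # rows 1 and 3 now look identical, so they can be flagged as duplicates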
Filter Array Action Note:
The expression only works on strings and returns the index of the 2nd-argument item if it appears the number of times specified in the 3rd argument; otherwise it returns -1.
Expression in the Filter Array: nthIndexOf(string(body('Select')), string(item()), 2) is greater than -1
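Here is a rough sketch of that logic in Python (only an illustration, not Power Automate; nth_index_of is a stand-in for the nthIndexOf expression):

import json

selected = [
    {"Col1": "A", "Col2": "x"},
    {"Col1": "B", "Col2": "y"},
    {"Col1": "A", "Col2": "x"},
]

def nth_index_of(haystack, needle, n):
    # Stand-in for nthIndexOf: index of the nth occurrence of needle, or -1 if it
    # does not occur that many times.
    index = -1
    for _ in range(n):
        index = haystack.find(needle, index + 1)
        if index == -1:
            return -1
    return index

# Filter array condition: keep item() if the 2nd occurrence of string(item())
# inside string(body('Select')) exists, i.e. the returned index is greater than -1.
whole = json.dumps(selected)
duplicates = [item for item in selected if nth_index_of(whole, json.dumps(item), 2) > -1]
print(duplicates)  # both copies of {"Col1": "A", "Col2": "x"} pass the filter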
This template flow is already set up with an example union() expression and an Apply to each loop for those who want to delete the duplicates from their dataset.
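To show why the union() trick works, here is a small Python sketch (again only an illustration, not Power Automate itself):

# union(body('Filter_array'), body('Filter_array')) merges the array with itself and,
# like the union() expression, drops repeated values. The Apply to each loop then only
# sees one entry per duplicate group, so only one of the duplicate rows gets deleted.
filter_array_output = [
    {"Col1": "A", "Col2": "x"},
    {"Col1": "A", "Col2": "x"},  # the duplicate shows up twice in the Filter array output
]

def union(a, b):
    # Rough equivalent of the union() expression: keep only the first occurrence of each value
    result = []
    for value in a + b:
        if value not in result:
            result.append(value)
    return result

loop_items = union(filter_array_output, filter_array_output)
for item in loop_items:                                 # Apply to each
    print("Delete one row keyed on:", item["Col1"])     # e.g. the Delete a row action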
Copy this scope code and press Ctrl + V to paste it into the "My clipboard" tab of a new action menu:
{"id":"5e530dcd-5589-4e51-b541-d5df4b5e7765","brandColor":"#8C3900","connectionReferences":{"shared_excelonlinebusiness":{"connection":{"id":"/providers/Microsoft.PowerApps/apis/shared_excelonlinebusiness/connections/shared-excelonlinebu-29fbd1aa-441a-4ca4-badb-fd92d9ce165f"}}},"connectorDisplayName":"Control","icon":"data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMzIiIGhlaWdodD0iMzIiIHZlcnNpb249IjEuMSIgdmlld0JveD0iMCAwIDMyIDMyIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPg0KIDxwYXRoIGQ9Im0wIDBoMzJ2MzJoLTMyeiIgZmlsbD0iIzhDMzkwMCIvPg0KIDxwYXRoIGQ9Im04IDEwaDE2djEyaC0xNnptMTUgMTF2LTEwaC0xNHYxMHptLTItOHY2aC0xMHYtNnptLTEgNXYtNGgtOHY0eiIgZmlsbD0iI2ZmZiIvPg0KPC9zdmc+DQo=","isTrigger":false,"operationName":"Find_and_delete_duplicates","operationDefinition":{"type":"Scope","actions":{"List_rows_present_in_a_table":{"type":"OpenApiConnection","inputs":{"host":{"connectionName":"shared_excelonlinebusiness","operationId":"GetItems","apiId":"/providers/Microsoft.PowerApps/apis/shared_excelonlinebusiness"},"parameters":{"source":"me","drive":"b!CVcYUvLh8EGkvkCsqwCTp7Pqs2dO4IxIl2FLeQnvIk-zJXFeRmKbRIbbO8P0Lkhi","file":"01PCEUDVBD2GUIKLQ7CRFL32GZXK5EIHSQ","table":"{0DF9A07A-9DC2-436B-85CC-A958741ED454}"},"authentication":{"type":"Raw","value":"@json(decodeBase64(triggerOutputs().headers['X-MS-APIM-Tokens']))['$ConnectionKey']"}},"runAfter":{},"description":"You can change this out with any type of get data action for any type of datasource, just update the From field in the following Select action with the outputs of the new datasource.","metadata":{"01PCEUDVBD2GUIKLQ7CRFL32GZXK5EIHSQ":"/Book.xlsx","operationMetadataId":"f5de7156-8cde-47e0-949f-5bd37bea2f64","tableId":"{0DF9A07A-9DC2-436B-85CC-A958741ED454}"}},"Select":{"type":"Select","inputs":{"from":"@outputs('List_rows_present_in_a_table')?['body/value']","select":{"Col1":"@item()?['Col1']","Col2":"@item()?['Col2']"}},"runAfter":{"List_rows_present_in_a_table":["Succeeded"]},"description":"Add a Select action if you want to get more than just exact duplicates. 
Use this action to select specific item columns to check for duplicates.","metadata":{"operationMetadataId":"218007cf-3241-4ebf-906c-563b82e36850"}},"Filter_array":{"type":"Query","inputs":{"from":"@body('Select')","where":"@greater(nthIndexOf(string(body('Select')), string(item()), 2), -1)"},"runAfter":{"Select":["Succeeded"]},"description":"nthIndexOf(string(body('Select')), string(item()), 2) ||| Only uses strings & returns the index of the 2nd argument item if it shows up the number of occurrences specified in the 3rd argument, otherwise it returns -1","metadata":{"operationMetadataId":"04080a21-d0d1-4018-9d1b-cb6411877800"}},"Apply_to_each":{"type":"Foreach","foreach":"@union(body('Filter_array'), body('Filter_array'))","actions":{"Delete_a_row":{"type":"OpenApiConnection","inputs":{"host":{"connectionName":"shared_excelonlinebusiness","operationId":"DeleteItem","apiId":"/providers/Microsoft.PowerApps/apis/shared_excelonlinebusiness"},"parameters":{"source":"me","drive":"b!CVcYUvLh8EGkvkCsqwCTp7Pqs2dO4IxIl2FLeQnvIk-zJXFeRmKbRIbbO8P0Lkhi","file":"01PCEUDVBD2GUIKLQ7CRFL32GZXK5EIHSQ","table":"{0DF9A07A-9DC2-436B-85CC-A958741ED454}","idColumn":"Col1","id":"@items('Apply_to_each')?['Col1']"},"authentication":{"type":"Raw","value":"@json(decodeBase64(triggerOutputs().headers['X-MS-APIM-Tokens']))['$ConnectionKey']"}},"runAfter":{},"metadata":{"01PCEUDVBD2GUIKLQ7CRFL32GZXK5EIHSQ":"/Book.xlsx","operationMetadataId":"92987169-4d74-4aba-8c99-20877b3599de","tableId":"{0DF9A07A-9DC2-436B-85CC-A958741ED454}"}}},"runAfter":{"Filter_array":["Succeeded"]},"description":"The union expression helps remove the extra duplicate value so it only deletes 1 of the duplicate rows. If you need to reference something in the loop array values, use items('Apply_to_each')?['InsertColumnNameHere']}","metadata":{"operationMetadataId":"325d06d0-d08b-4c8e-9344-d06b5a4c6acc"}}},"runAfter":{},"metadata":{"operationMetadataId":"0c3ee8e3-6aec-4537-988e-f8849d28bca6"}}}
These resources may also help if you need to delete A LOT of duplicates...
SharePoint Batch Delete: https://www.tachytelic.net/2021/06/power-automate-delete-sharepoint-items/
Excel Batch Delete: https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Excel-Batch-Delete/td-p/1634375
SQL CRUD
And if you have other criteria for deciding which duplicate to keep over another, say another column holds a status and you only want to keep InProgress duplicates over Completed duplicates, then you could put a Get row or Get items action before the delete action that pulls records based on the ID or primary key plus that extra criterion of Completed. Then you could use the values that action returns in the delete action so it only deletes the duplicates matching that specific criterion.
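A hypothetical Python sketch of that idea (the Status column and the InProgress/Completed values are just examples, not part of the template):

# Hypothetical example: the duplicate key "A" exists twice, once InProgress and once
# Completed. A Get items / Get row step before the delete would pull only the copies
# matching the key AND Status = "Completed", so only those get deleted.
records = [
    {"ID": 1, "Key": "A", "Status": "InProgress"},
    {"ID": 2, "Key": "A", "Status": "Completed"},
    {"ID": 3, "Key": "B", "Status": "Completed"},
]

duplicate_keys = {"A"}  # keys already identified as duplicates by the Filter array step

to_delete = [r for r in records if r["Key"] in duplicate_keys and r["Status"] == "Completed"]
print(to_delete)  # only ID 2 is deleted; the InProgress copy of "A" is kept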
Hey, thank you for this. I'm a bit confused as to how to use this with a SharePoint list. As you must know, SP lists have the ID column, which is a unique number; this ID is the one that would be passed to a Delete item step at the end of this flow.
So it becomes a bit of a catch-22: if you don't specify columns, it won't find any duplicates because the ID column and other SP automatic columns are never equal; if you specify columns, it finds duplicates like a charm without the ID, but you can't leave out the ID because it will be needed to tell the list which elements to delete... any ideas?
@ChelsyMena
Yes, for that you can leave the ID column out of the Select, then add another action inside the Apply to each to get the ID of every record matching the identified duplicate column(s). Then you can sort those outputs in certain ways to decide which of the duplicates to delete. For example:
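As a rough illustration of that approach in Python (hypothetical Title/Col2 column names, not the exact flow):

# The Select and Filter array steps run on the non-ID columns, then for each identified
# duplicate value a Get items style lookup pulls every matching record with its ID, the
# matches are sorted, and everything except the copy you want to keep is deleted.
sharepoint_items = [
    {"ID": 11, "Title": "A", "Col2": "x"},
    {"ID": 12, "Title": "B", "Col2": "y"},
    {"ID": 13, "Title": "A", "Col2": "x"},
]

# Duplicate values found on the selected (non-ID) columns, after the union() dedupe
duplicate_values = [{"Title": "A", "Col2": "x"}]

for dup in duplicate_values:                                   # Apply to each duplicate value
    matches = [i for i in sharepoint_items                     # Get items filtered on those columns
               if i["Title"] == dup["Title"] and i["Col2"] == dup["Col2"]]
    matches.sort(key=lambda i: i["ID"])                        # sort to choose which copy to keep
    for extra in matches[1:]:                                  # keep the first, delete the rest
        print("Delete item with ID:", extra["ID"])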
This starts to look more like the standard method for deleting duplicates, where you get items & check the length of the response to determine whether there are multiple records & whether one needs to be deleted. But instead of looping through every individual record, this approach filters down to just the records that do have duplicates, drastically reducing the number of actions & time required to delete duplicates on datasets of any size.