cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
TechieJones
Regular Visitor

Remove duplicate records in collection

Hello,

 

I am having difficulty removing duplicate records from a collection.

 

Consider these collections:

Collection1 named "colDisplayNames"
==Column named "displayName"

Collection2 named "colUsers_WW_with_Office"
==Column named "displayName"
==Column named "Office"
==Column named "Country"
==Column named "Region"


My goal is to update the collection so it only contains a single instance of each record.
So reduce all records from colUsers_WW_with_Office that are exact duplicates of each other down to a single instance of the record.
If every value in every column of the record is the same as some other record, remove the duplicate records, leaving a single instance of the record.

 

I tried to use the Distinct function, but quickly learned that it only returns a single-column collection.

 

So in order to get around this, my brain went like this...

 

I created the colDisplayNames collection, to which I am able to use Distinct to get it down to a unique list of display names.
ClearCollect(colDisplayNames, Distinct(Sort(colUsers_WW_with_Office, displayName), displayName));

I figured I could then loop through the colDisplayNames collection of unique display names and pull a single instance of each record in the colUsers_WW_with_Office collection that matches the display name, which would give me what I want.
But I am not able to get the code to work.
ClearCollect(colUsers_Gallery, Filter(colUsers_WW_with_Office, displayName in colDisplayNames.displayName)) (This code executes without error, but it does not return unique values)
ClearCollect(colUsers_Gallery, First(Filter(colUsers_WW_with_Office, displayName in colDisplayNames.displayName))) (This code executes without error, but it returns the same record)

 

Thoughts on how I may be able to achieve my goal?

Appreciate the help.

 

1 ACCEPTED SOLUTION

Accepted Solutions
v-bofeng-msft
Community Support
Community Support

Hi @TechieJones :

Do you want to delete the exact same record in the table colUsers_WW_with_Office?

As you said, directly using Distinct can only get a table with only one column. So your needs need to use special methods to achieve.

I assume these four fields are all text type(displayName/Office/Country/Region).

I've made a test for your reference:

My idea is to merge all the fields together, and then deduplicate the merged fields, and then split the deduplicated table to get a table with no duplicate records. Use this table to update the original collection.

 

ClearCollect(
    colUsers_WW_with_Office,
    ForAll(
        Distinct(
            Split(
                Concat(
                    colUsers_WW_with_Office,
                    displayName & "##" & Office & "##" & Country & "##" & Region,
                    "$$$"
                ),
                "$$$"
            ),
            Result
        ),
        {
            displayName: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    1
                )
            ).Result,
            Office: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    2
                )
            ).Result,
            Country: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    3
                )
            ).Result,
            Region: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    4
                )
            ).Result
        }
    )
)

 

700.gif

Best Regards,

Bof

View solution in original post

7 REPLIES 7

there is a cool power automate trick I saw @ThatAPIGuy  do where you union a list with itself and it only returns the unique values. You could do that and return the Distinct  list to the app https://www.google.com/amp/s/thatapiguy.tech/2020/04/04/get-distinct-records-using-the-union-express...

v-bofeng-msft
Community Support
Community Support

Hi @TechieJones :

Do you want to delete the exact same record in the table colUsers_WW_with_Office?

As you said, directly using Distinct can only get a table with only one column. So your needs need to use special methods to achieve.

I assume these four fields are all text type(displayName/Office/Country/Region).

I've made a test for your reference:

My idea is to merge all the fields together, and then deduplicate the merged fields, and then split the deduplicated table to get a table with no duplicate records. Use this table to update the original collection.

 

ClearCollect(
    colUsers_WW_with_Office,
    ForAll(
        Distinct(
            Split(
                Concat(
                    colUsers_WW_with_Office,
                    displayName & "##" & Office & "##" & Country & "##" & Region,
                    "$$$"
                ),
                "$$$"
            ),
            Result
        ),
        {
            displayName: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    1
                )
            ).Result,
            Office: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    2
                )
            ).Result,
            Country: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    3
                )
            ).Result,
            Region: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    4
                )
            ).Result
        }
    )
)

 

700.gif

Best Regards,

Bof

View solution in original post

Thank you @v-bofeng-msft.  I will test and report back.

 

Thank you @jlindstrom... cool use case.

Man, @v-bofeng-msft, your solution really is clever.

It works perfectly.

I am going to save that as my first method of de-duplicating a multi-field collection.  😉

I am working with only a thousand records or so, but the processing of your code is fast, nonetheless.
Thank you soo much for the help.

As for @jlindstrom I am going to save your idea as a tool, as well, because it has use cases beyond Power Apps.

Thank you.

Hi 

What if I just have one column out of five I want to remove duplicates from? 

Br. Tore


james_hathaway
Helper II
Helper II

I found a pretty good way of removing duplicates from a collection that uses the GroupBy and UnGroup functions:

 

col_Original = Original Collection that contains Duplicates

col_Distinct = Collection with the same columns as the original, but without Duplicate values in the "ColumnToDistinct" Column.

 

ClearCollect(col_Distinct,
   UnGroup(
      DropColumns(
         AddColumns(
            GroupBy(col_Original,"ColumnToDistinctOn","Rows"),
            "FirstRow",
            Table(First(Rows))
         ),
         "Rows"
      ),
      "FirstRow"
   )
)


Adding the Column "FirstRow" as a TABLE and Not a Record is key to making the Ungroup work properly.

 

Dropping the Column "Rows" is not strictly necessary, and might actually be useful if you need it, but I included it to make sure the end Collection has the same columns as the original.

 

If you don't have a "ColumnToDistinctOn", just use AddColumns to create a calculated "tuple" column to distinct (and DropColumns to remove again if you like)

 

Hope this helps!

skydivertak
Regular Visitor

Helpful resources

Announcements
UG GA Amplification 768x460.png

Launching new user group features

Learn how to create your own user groups today!

Community Connections 768x460.jpg

Community & How To Videos

Check out the new Power Platform Community Connections gallery!

M365 768x460.jpg

Microsoft 365 Collaboration Conference | December 7–9, 2021

Join us, in-person, December 7–9 in Las Vegas, for the largest gathering of the Microsoft community in the world.

Top Solution Authors
Top Kudoed Authors
Users online (2,717)