cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
TechieJones
Regular Visitor

Remove duplicate records in collection

Hello,

 

I am having difficulty removing duplicate records from a collection.

 

Consider these collections:

Collection1 named "colDisplayNames"
==Column named "displayName"

Collection2 named "colUsers_WW_with_Office"
==Column named "displayName"
==Column named "Office"
==Column named "Country"
==Column named "Region"


My goal is to update the collection so it only contains a single instance of each record.
So reduce all records from colUsers_WW_with_Office that are exact duplicates of each other down to a single instance of the record.
If every value in every column of the record is the same as some other record, remove the duplicate records, leaving a single instance of the record.

 

I tried to use the Distinct function, but quickly learned that it only returns a single-column collection.

 

So in order to get around this, my brain went like this...

 

I created the colDisplayNames collection, to which I am able to use Distinct to get it down to a unique list of display names.
ClearCollect(colDisplayNames, Distinct(Sort(colUsers_WW_with_Office, displayName), displayName));

I figured I could then loop through the colDisplayNames collection of unique display names and pull a single instance of each record in the colUsers_WW_with_Office collection that matches the display name, which would give me what I want.
But I am not able to get the code to work.
ClearCollect(colUsers_Gallery, Filter(colUsers_WW_with_Office, displayName in colDisplayNames.displayName)) (This code executes without error, but it does not return unique values)
ClearCollect(colUsers_Gallery, First(Filter(colUsers_WW_with_Office, displayName in colDisplayNames.displayName))) (This code executes without error, but it returns the same record)

 

Thoughts on how I may be able to achieve my goal?

Appreciate the help.

 

1 ACCEPTED SOLUTION

Accepted Solutions
v-bofeng-msft
Community Support
Community Support

Hi @TechieJones :

Do you want to delete the exact same record in the table colUsers_WW_with_Office?

As you said, directly using Distinct can only get a table with only one column. So your needs need to use special methods to achieve.

I assume these four fields are all text type(displayName/Office/Country/Region).

I've made a test for your reference:

My idea is to merge all the fields together, and then deduplicate the merged fields, and then split the deduplicated table to get a table with no duplicate records. Use this table to update the original collection.

 

ClearCollect(
    colUsers_WW_with_Office,
    ForAll(
        Distinct(
            Split(
                Concat(
                    colUsers_WW_with_Office,
                    displayName & "##" & Office & "##" & Country & "##" & Region,
                    "$$$"
                ),
                "$$$"
            ),
            Result
        ),
        {
            displayName: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    1
                )
            ).Result,
            Office: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    2
                )
            ).Result,
            Country: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    3
                )
            ).Result,
            Region: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    4
                )
            ).Result
        }
    )
)

 

700.gif

Best Regards,

Bof

View solution in original post

5 REPLIES 5
jlindstrom
Super User
Super User

there is a cool power automate trick I saw @ThatAPIGuy  do where you union a list with itself and it only returns the unique values. You could do that and return the Distinct  list to the app https://www.google.com/amp/s/thatapiguy.tech/2020/04/04/get-distinct-records-using-the-union-express...

v-bofeng-msft
Community Support
Community Support

Hi @TechieJones :

Do you want to delete the exact same record in the table colUsers_WW_with_Office?

As you said, directly using Distinct can only get a table with only one column. So your needs need to use special methods to achieve.

I assume these four fields are all text type(displayName/Office/Country/Region).

I've made a test for your reference:

My idea is to merge all the fields together, and then deduplicate the merged fields, and then split the deduplicated table to get a table with no duplicate records. Use this table to update the original collection.

 

ClearCollect(
    colUsers_WW_with_Office,
    ForAll(
        Distinct(
            Split(
                Concat(
                    colUsers_WW_with_Office,
                    displayName & "##" & Office & "##" & Country & "##" & Region,
                    "$$$"
                ),
                "$$$"
            ),
            Result
        ),
        {
            displayName: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    1
                )
            ).Result,
            Office: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    2
                )
            ).Result,
            Country: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    3
                )
            ).Result,
            Region: Last(
                FirstN(
                    Split(
                        Result,
                        "##"
                    ),
                    4
                )
            ).Result
        }
    )
)

 

700.gif

Best Regards,

Bof

View solution in original post

Thank you @v-bofeng-msft.  I will test and report back.

 

Thank you @jlindstrom... cool use case.

Man, @v-bofeng-msft, your solution really is clever.

It works perfectly.

I am going to save that as my first method of de-duplicating a multi-field collection.  😉

I am working with only a thousand records or so, but the processing of your code is fast, nonetheless.
Thank you soo much for the help.

As for @jlindstrom I am going to save your idea as a tool, as well, because it has use cases beyond Power Apps.

Thank you.

Hi 

What if I just have one column out of five I want to remove duplicates from? 

Br. Tore


Helpful resources

Announcements
PA User Group

Welcome to the User Group Public Preview

Check out new user group experience and if you are a leader please create your group

secondImage

Demo Extravaganza is Back!

We are excited to announce that Demo Extravaganza for 2021 has started!

MBAS on Demand

Microsoft Business Applications Summit sessions

On-demand access to all the great content presented by the product teams and community members! #MSBizAppsSummit #CommunityRocks

Power Apps June 2021

June Power Apps Community Call

Did you miss the call? Check out the recording here!

Top Solution Authors
Top Kudoed Authors
Users online (31,021)