I have created a Power Query with subsequent dataflow which created the records in CDS fine except for one thing - it created duplicate records.
I have selected the "Do Not Delete records not in this query" however it just seems to create duplicate records in my CDS environment which isnt ideal when your Power Query has 9000+ records and your existing CDS entity you are using has 30k records.
What I need to occur is:
Is this something I need to amend within my Power Query to only show those with a created on date of today? Or do i need to bring my CDS table into Power Query (if possible) to ensure it knows what records already exist before running its nightly update?
Happy to provide more details to assist further.
Unfortunately this is a current limitation of Dataflows with the only workaround selecting the setting the "Delete rows that no longer exist". You can vote for this feature below though and hopefully Microsoft will be looking into this for a further release:
You can avoid duplicates by setting up a Key on the entity you are importing into and then setting this as the Alternate Key field on the Field Mapping in the dataflow. You will then also need to have a unique identifier for each record in your source data and map that to the Key field that you set up on the entity. This then tells the dataflow not to create a duplicate because a record with that key already exists in CDS.
This will resolve your duplicate issue, however, as @EricRegnier says dataflow is limited and cannot meet your requirement exactly for
The dataflow will "modify" all records in CDS that match your source file if they already exist in CDS, regardless of whether the data has changed or not. However, this may be OK for you?
Thanks a lot for your feedback on this.
I assumed this was the case however I wanted to clarify before I looked at alternate solutions.
The key solution was something I was looking at I just need to ensure our Power App isn't reading a flag on our existing contact record. The alternate solution would be to create these records in a data import entity per see and use flow to create the new records however with what I am reading below the created on record will always be the same? What about if I used the "delete records if not in that query" in that seperate entity? Would this result in my created on date being the same date as my data came in via my query? Eg stages dates for each record that came in via the source?
Unless Dataflows supports it now, but from what I've seen even with alternate keys, Dataflows tries to insert records, the only difference is you'll get a duplicate key exception for that row and thus won't be updated. @HSheild are you saying we can now update? Because that would be awesome!
@mattyn07, having some kind of staging entity will help with a flow that does the update so will keep your original created on in the "real" entity.
Hi @EricRegnier , yep, you can update existing data without getting a duplicate warning. Dataflows seem to have changed a lot over the past. Just not a lot of documentation about it. Most of my knowledge is from experimenting with it!
Here is an example, where dataflow has updated existing Contact records in CDS which were created a while ago
@mattyn07 , using a staging entity is interesting. And you could take that approach.
To answer your other questions.
If a record already exists in CDS and is matched by the Alternative Key then that record is updated. In the past I think dataflow used to delete the records and then re-create them.
If you choose the option to "delete records" then this will delete all records in CDS that are not matched with your data source - so this should only be used if your data source is the master for this data and if users are not creating records for this entity directly in CDS.
Hope this helps
I 100% agree re lack of documentation.
It has been tricky to find any real answers apart from the discussions we are having now. So your example below I noticed the "Created On" date remained the same date as I assume the dataflow populated that record in CDS? Or was it when the record was created in your source data (Power Query connection).
My source of truth will remain my external SQL DB which is good for my situation so I think the solution proposed will work as long as the modified date only updates the fields I have mapped (which i am 99.99999% sure it will).
Il see how I go over the weekend and potentially do a blog post as I think the community needs to see more info regarding what a dataflow can and cannot do.
Cheers @HSheild, is there something in particular you configured? I just tried form my side and getting duplicate key exceptions. Here's how my config looks like:
Error message of the row:
Reason: Precondition Failed, Error code: 0x80060892, Message: A record that has the attribute values Column1.char_id already exists. The entity key Char requires that this set of attributes contains unique values. Select unique values and try again.
@mattyn07, good idea for the blog post. You can also amend the Dataflow documentation directly now that everything is hosted in GitHub and if your updates are valid, MS will approve your PR. @HSheild you should do the same with all the Dataflow tricks you have!
"So your example below I noticed the "Created On" date remained the same date as I assume the dataflow populated that record in CDS? Or was it when the record was created in your source data (Power Query connection)." Created On is the date that the record was first created in CDS by dataflow. There is a field in CDS called overriddencreatedon which you can normally use to set the Created On date to be from your source data but that field doesn't seem to be available with dataflow yet.
Great idea on writing a blog - I was also thinking about doing the same thing. Let me know how you get on with yours.
@EricRegnier, I'm not sure what your issue could be. Your mapping looks OK. I'll write a blog with steps for how I have it working and maybe that will help you. Any chance that Char ID already exists twice in your CDS? If you have created data before you created the key?
Nah, if Char ID wasn't unique, the alt key wouldn't get activated. I think it's just not supported yet, I also tried with a brand new dataflow, with data before, without, but no luck...
Featuring samples like Return to the Workplace and Emergency Response Applications
We're excited to announce our first cross-community 'Can You Solve These?' challenge!
Reopen responsibly, monitor intelligently, and protect continuously with solutions for a safer work environment.
We are excited for the next Super User season.
Features releasing from October 2020 through March 2021