cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Frequent Visitor

Finding updated entities

Hi all,

 

I am building up a ETL-like system on top of CDS: import data using dataflow, edit if needed on UI and export to other systems using Power Automate flows.

 

Almost everything works fine, but i have a problem identifying which entities have been updated, to filter them out during export.

CDS maintains a modifiedon field, but it seems to always get updated in import dataflows; I have to mention, that to be able to support entity deletes from import flows, i always import everything (the import dataflow always processes the full input data set, and based on the match key it knows which action to take on each entity).

 

So, what's the correct way to handle this?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
Solution Sage
Solution Sage

Re: Finding updated entities

Thanks for clarifying and understand your scenario. Like mentioned, unfortunately with Dataflows is limited to delete all and re-inserts. Dataflow is a fairly new component of the Power Platform and I'm confident Microsoft will implement new functionalities in the future that will make these scenarios simpler.

It seems you already have a staging ground, so have a look at the options originally posted. If you can get am "Updated Date", "Created Date", and other record metadata and map those to custom fields, then you can discard the out-of-the-box "Created On" , "Modified On", "Modified By", etc and use those custom fields instead for your CDS logic and UI. If all options are not feasible, then perhaps you'll have to park the Dataflows route and look at alternatives such as:

  1. Virtual Entities. This can be a good option! https://docs.microsoft.com/en-us/powerapps/developer/common-data-service/virtual-entities/get-starte...
  2. Logic Apps or Power Automate
  3. Custom integration or SSIS

View solution in original post

8 REPLIES 8
Highlighted
Solution Sage
Solution Sage

Re: Finding updated entities

Hi @CatalinAdler,

As you mentioned unfortunately up to now, Dataflows always inserts even if you have an alternate key defined (or before if option is selected). The best workaround would be to define a custom Modified Date field that you can map and set from the source data if the source data has some kind of Modified Date property? Otherwise another option although a bit more complex is to have a "staging" entity and perform you ETL something like this:

  1. Run the Dataflow on the staging entity instead of the "real" one
  2. Have a plugin or Power Automate flow that triggers on create of the staging entity
  3. In plugin or flow, check if the data changed and if so update the "real" entity:
    1. For new records, check if the key doesn't exist in the real entity
    2. For delete, check if key doesn't exist in staging entity

Hope this helps!

Highlighted
Impactful Individual
Impactful Individual

Re: Finding updated entities

When a record is modified by a user the name of the user is specified in the modifiedby field.

 

To identify records that have been manually updated by a different user just look for records where the modifiedby field is not the name of the user that runs that import routine and ensure that the import routine is ran by a separate account (you could use an non-interactive user account see https://docs.microsoft.com/en-us/power-platform/admin/create-users-assign-online-security-roles#crea... for the import which would ensure that was the case).

---
If this post has answered your question please consider it for "Accept as Solution" or if it has been helpful give it a "Thumbs Up".
Highlighted
Solution Sage
Solution Sage

Re: Finding updated entities

From my understanding @CatalinAdler needs to determine the records that changed for Dataflow run and not within CDS. Dataflows instance will always run under the same account so wouldn't be possible to distinguish with the Modified By

 

@CatalinAdler BTW I submitted a request to have update capabilities on Dataflows. That would solve your issue. Feel free to vote 🙂 https://powerusers.microsoft.com/t5/Power-Apps-Ideas/Upsert-update-capabilities-in-Power-Apps-Datafl... 

Highlighted
Impactful Individual
Impactful Individual

Re: Finding updated entities

@EricRegnier Unless I completely misread things I'm sure the question was how to identify the records that have been manually updated for export from CDS - mind you the use of the phrase "on top of" does offer 2 different possibilities.

 

To me is a very common use case - a quickly written import routine updates all records (unless you add a sanity check staging step but that has a cost) so you need a means of identifying just those records that have been manually updated. 

 

---
If this post has answered your question please consider it for "Accept as Solution" or if it has been helpful give it a "Thumbs Up".
Highlighted
Frequent Visitor

Re: Finding updated entities

Thanks for the replies @ben-thompson @EricRegnier 

Let me give you an example.

 

Imagine you import invoices and invoice details and export them to a web service (powerautomate flows); the data source is an ERP system and the dataflow imports from .csv files.

Imagine you run the import data flow and import invoice with 5 invoice details.

 

After you import into CDS, someone deletes 2 invoice details in the ERP system; the entity is exported to CSV files again: 1 invoice with 3 invoice details.

To import it again, you have to just delete 3 invoice detail entities from CDS.

 

Now, how do you do the delete (the existing delete capabilities will delete all entities which are not in the current dataflow execution)?

How do you mark down that the invoice was changed, so that you know you have to export it again.

 

We are currently doing hops with IsExported flags and left anti joins on a staging ground to achieve this...but it's beyond complex and brittle.

 

All because:

1. you cannot delete specific entities when doing imports.

2. you cannot know when an entity was updated in CDS through import dataflows.

 

 

Highlighted
Impactful Individual
Impactful Individual

Re: Finding updated entities

@CatalinAdler I think the issue here is that you need and don't have a Change Data Capture routine as I personally would be expecting the csv file not to contain the unchanged invoice details records, just the changed invoice and a delete flag against the  invoice details due to be deleted.

 

one way of resolving this might be to bring your CDS data into your staging tables (Microsoft labs have a data export solution that will generate a sql database) and use that as the basis of working out what needs to be updated / deleted. But you really need to check if a CDC solution is available for your ERP as that would make things far easier. 

---
If this post has answered your question please consider it for "Accept as Solution" or if it has been helpful give it a "Thumbs Up".
Highlighted
Frequent Visitor

Re: Finding updated entities

In reality there isn't 1 ERP system.

 

There are many, and different; CDS's dataflow's flexibility really works really well here.

 

I have 0 control over them: some are api, some are file based, etc.

 

All can expose current state of things: how each entity looks like at moment of dataflow import.

Highlighted
Solution Sage
Solution Sage

Re: Finding updated entities

Thanks for clarifying and understand your scenario. Like mentioned, unfortunately with Dataflows is limited to delete all and re-inserts. Dataflow is a fairly new component of the Power Platform and I'm confident Microsoft will implement new functionalities in the future that will make these scenarios simpler.

It seems you already have a staging ground, so have a look at the options originally posted. If you can get am "Updated Date", "Created Date", and other record metadata and map those to custom fields, then you can discard the out-of-the-box "Created On" , "Modified On", "Modified By", etc and use those custom fields instead for your CDS logic and UI. If all options are not feasible, then perhaps you'll have to park the Dataflows route and look at alternatives such as:

  1. Virtual Entities. This can be a good option! https://docs.microsoft.com/en-us/powerapps/developer/common-data-service/virtual-entities/get-starte...
  2. Logic Apps or Power Automate
  3. Custom integration or SSIS

View solution in original post

Helpful resources

Announcements
secondImage

Demo-Extravaganza 2020

Check out these cool Power Apps & vote on your favorite!

secondImage

Community Highlights

Check out whats happening in Power Apps

secondImage

Community User Group Member Badges

FIll out a quick form to claim your community user group member badge today!

secondImage

Power Platform 2020 release wave 2 plan

Features releasing from October 2020 through March 2021

Users online (10,997)