cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
CatalinAdler
Frequent Visitor

Finding updated entities

Hi all,

 

I am building up a ETL-like system on top of CDS: import data using dataflow, edit if needed on UI and export to other systems using Power Automate flows.

 

Almost everything works fine, but i have a problem identifying which entities have been updated, to filter them out during export.

CDS maintains a modifiedon field, but it seems to always get updated in import dataflows; I have to mention, that to be able to support entity deletes from import flows, i always import everything (the import dataflow always processes the full input data set, and based on the match key it knows which action to take on each entity).

 

So, what's the correct way to handle this?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions

Thanks for clarifying and understand your scenario. Like mentioned, unfortunately with Dataflows is limited to delete all and re-inserts. Dataflow is a fairly new component of the Power Platform and I'm confident Microsoft will implement new functionalities in the future that will make these scenarios simpler.

It seems you already have a staging ground, so have a look at the options originally posted. If you can get am "Updated Date", "Created Date", and other record metadata and map those to custom fields, then you can discard the out-of-the-box "Created On" , "Modified On", "Modified By", etc and use those custom fields instead for your CDS logic and UI. If all options are not feasible, then perhaps you'll have to park the Dataflows route and look at alternatives such as:

  1. Virtual Entities. This can be a good option! https://docs.microsoft.com/en-us/powerapps/developer/common-data-service/virtual-entities/get-starte...
  2. Logic Apps or Power Automate
  3. Custom integration or SSIS

View solution in original post

8 REPLIES 8
EricRegnier
Super User II
Super User II

Hi @CatalinAdler,

As you mentioned unfortunately up to now, Dataflows always inserts even if you have an alternate key defined (or before if option is selected). The best workaround would be to define a custom Modified Date field that you can map and set from the source data if the source data has some kind of Modified Date property? Otherwise another option although a bit more complex is to have a "staging" entity and perform you ETL something like this:

  1. Run the Dataflow on the staging entity instead of the "real" one
  2. Have a plugin or Power Automate flow that triggers on create of the staging entity
  3. In plugin or flow, check if the data changed and if so update the "real" entity:
    1. For new records, check if the key doesn't exist in the real entity
    2. For delete, check if key doesn't exist in staging entity

Hope this helps!

ben-thompson
Solution Sage
Solution Sage

When a record is modified by a user the name of the user is specified in the modifiedby field.

 

To identify records that have been manually updated by a different user just look for records where the modifiedby field is not the name of the user that runs that import routine and ensure that the import routine is ran by a separate account (you could use an non-interactive user account see https://docs.microsoft.com/en-us/power-platform/admin/create-users-assign-online-security-roles#crea... for the import which would ensure that was the case).

---
If this post has answered your question please consider it for "Accept as Solution" or if it has been helpful give it a "Thumbs Up".

From my understanding @CatalinAdler needs to determine the records that changed for Dataflow run and not within CDS. Dataflows instance will always run under the same account so wouldn't be possible to distinguish with the Modified By

 

@CatalinAdler BTW I submitted a request to have update capabilities on Dataflows. That would solve your issue. Feel free to vote 🙂 https://powerusers.microsoft.com/t5/Power-Apps-Ideas/Upsert-update-capabilities-in-Power-Apps-Datafl... 

@EricRegnier Unless I completely misread things I'm sure the question was how to identify the records that have been manually updated for export from CDS - mind you the use of the phrase "on top of" does offer 2 different possibilities.

 

To me is a very common use case - a quickly written import routine updates all records (unless you add a sanity check staging step but that has a cost) so you need a means of identifying just those records that have been manually updated. 

 

---
If this post has answered your question please consider it for "Accept as Solution" or if it has been helpful give it a "Thumbs Up".
CatalinAdler
Frequent Visitor

Thanks for the replies @ben-thompson @EricRegnier 

Let me give you an example.

 

Imagine you import invoices and invoice details and export them to a web service (powerautomate flows); the data source is an ERP system and the dataflow imports from .csv files.

Imagine you run the import data flow and import invoice with 5 invoice details.

 

After you import into CDS, someone deletes 2 invoice details in the ERP system; the entity is exported to CSV files again: 1 invoice with 3 invoice details.

To import it again, you have to just delete 3 invoice detail entities from CDS.

 

Now, how do you do the delete (the existing delete capabilities will delete all entities which are not in the current dataflow execution)?

How do you mark down that the invoice was changed, so that you know you have to export it again.

 

We are currently doing hops with IsExported flags and left anti joins on a staging ground to achieve this...but it's beyond complex and brittle.

 

All because:

1. you cannot delete specific entities when doing imports.

2. you cannot know when an entity was updated in CDS through import dataflows.

 

 

@CatalinAdler I think the issue here is that you need and don't have a Change Data Capture routine as I personally would be expecting the csv file not to contain the unchanged invoice details records, just the changed invoice and a delete flag against the  invoice details due to be deleted.

 

one way of resolving this might be to bring your CDS data into your staging tables (Microsoft labs have a data export solution that will generate a sql database) and use that as the basis of working out what needs to be updated / deleted. But you really need to check if a CDC solution is available for your ERP as that would make things far easier. 

---
If this post has answered your question please consider it for "Accept as Solution" or if it has been helpful give it a "Thumbs Up".

In reality there isn't 1 ERP system.

 

There are many, and different; CDS's dataflow's flexibility really works really well here.

 

I have 0 control over them: some are api, some are file based, etc.

 

All can expose current state of things: how each entity looks like at moment of dataflow import.

Thanks for clarifying and understand your scenario. Like mentioned, unfortunately with Dataflows is limited to delete all and re-inserts. Dataflow is a fairly new component of the Power Platform and I'm confident Microsoft will implement new functionalities in the future that will make these scenarios simpler.

It seems you already have a staging ground, so have a look at the options originally posted. If you can get am "Updated Date", "Created Date", and other record metadata and map those to custom fields, then you can discard the out-of-the-box "Created On" , "Modified On", "Modified By", etc and use those custom fields instead for your CDS logic and UI. If all options are not feasible, then perhaps you'll have to park the Dataflows route and look at alternatives such as:

  1. Virtual Entities. This can be a good option! https://docs.microsoft.com/en-us/powerapps/developer/common-data-service/virtual-entities/get-starte...
  2. Logic Apps or Power Automate
  3. Custom integration or SSIS

View solution in original post

Helpful resources

Announcements
PA User Group

Welcome to the User Group Public Preview

Check out new user group experience and if you are a leader please create your group

Power Apps Community Call

Monthly Power Apps Community Call

Did you miss the call?? Check out the Power Apps Community Call here!

secondImage

Experience what’s next for Power Apps

See the latest Power Apps innovations, updates, and demos from the Microsoft Business Applications Launch Event.

Power Platform ISV STudio

Power Platform ISV Studio

ISV Studio is the go-to Power Platform destination for ISV’s to monitor & manage applications post-AppSource publish.

Users online (15,420)