Pulling in large-ish SQL tables

Hello,

Is this how everyone is pulling in large-ish SQL data sets (~35k records)? Or do you use another method?

  1. Set the data row limit in Advanced settings to `2000`
  2. Collect the table in 2,000-record slices and merge them:
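// Run all 18 slice queries in parallel; each Filter returns at most
// 2,000 records, matching the data row limit set in step 1.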
Concurrent(
    ClearCollect(col1, Filter('[dbo].[bigSqltable]', recordID >= 1 && recordID <= 2000)),
    ClearCollect(col2, Filter('[dbo].[bigSqltable]', recordID >= 2001 && recordID <= 4000)),
    ClearCollect(col3, Filter('[dbo].[bigSqltable]', recordID >= 4001 && recordID <= 6000)),
    ClearCollect(col4, Filter('[dbo].[bigSqltable]', recordID >= 6001 && recordID <= 8000)),
    ClearCollect(col5, Filter('[dbo].[bigSqltable]', recordID >= 8001 && recordID <= 10000)),
    ClearCollect(col6, Filter('[dbo].[bigSqltable]', recordID >= 10001 && recordID <= 12000)),
    ClearCollect(col7, Filter('[dbo].[bigSqltable]', recordID >= 12001 && recordID <= 14000)),
    ClearCollect(col8, Filter('[dbo].[bigSqltable]', recordID >= 14001 && recordID <= 16000)),
    ClearCollect(col9, Filter('[dbo].[bigSqltable]', recordID >= 16001 && recordID <= 18000)),
    ClearCollect(col10, Filter('[dbo].[bigSqltable]', recordID >= 18001 && recordID <= 20000)),
    ClearCollect(col11, Filter('[dbo].[bigSqltable]', recordID >= 20001 && recordID <= 22000)),
    ClearCollect(col12, Filter('[dbo].[bigSqltable]', recordID >= 22001 && recordID <= 24000)),
    ClearCollect(col13, Filter('[dbo].[bigSqltable]', recordID >= 24001 && recordID <= 26000)),
    ClearCollect(col14, Filter('[dbo].[bigSqltable]', recordID >= 26001 && recordID <= 28000)),
    ClearCollect(col15, Filter('[dbo].[bigSqltable]', recordID >= 28001 && recordID <= 30000)),
    ClearCollect(col16, Filter('[dbo].[bigSqltable]', recordID >= 30001 && recordID <= 32000)),
    ClearCollect(col17, Filter('[dbo].[bigSqltable]', recordID >= 32001 && recordID <= 34000)),
    ClearCollect(col18, Filter('[dbo].[bigSqltable]', recordID >= 34001 && recordID <= 35000))
);
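// Merge the 18 slices into a single collection.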
ClearCollect(colCombined, 
    col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13, col14, col15, col16, col17, col18
)

@ericonline Nope...now you've crossed into delegation. The challenge was how to get at all the items in the list and how to avoid delegation issues.

>= and <= are not delegable in SharePoint.

Nice try 😉 

We hashed out every possible way to skin this one and get the total results of the list...including that method, which would still return only a subset, up to the limits of delegation.

By the way - total load time on that list with those formulas, including the pulling of duplicates and everything else in the kitchen sink, was about 20 seconds.


Hello,

Have you tried retrieving the data using Flow? If you don't mind using Flow, you should try it.
Check this short video:
https://www.youtube.com/watch?time_continue=12&v=FLR41v0F1OE
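For reference, the app-side call for a flow is small. A minimal sketch, assuming a hypothetical flow named GetBigTableRows that returns the rows to the app through an HTTP Response action with a defined JSON schema:

// Hypothetical flow name and output property - adjust to match
// your flow's actual Response schema.
ClearCollect(
    colFromFlow,
    GetBigTableRows.Run().rows
)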

Hi @RandyHayes 

Thanks for posting your SharePoint/StartsWith formula. It'll definitely be of help to lots of people.

Just as a slight comment - the <, <=, >, >= operators are delegable in SharePoint, but only for numeric fields. There was a mistake in the documentation; the post here gives a bit more detail.

https://powerusers.microsoft.com/t5/General-Discussion/Filter-delegable-predicates-for-SharePoint/td...
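To illustrate, assuming a hypothetical SharePoint list MyList with a numeric ID column and a text Title column:

// Delegable - comparison on a numeric column
Filter(MyList, ID >= 2001 && ID <= 4000)

// Not delegable - comparison on a text column
Filter(MyList, Title >= "M")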

 

@timl Interesting...you take the word of these documents sometimes and stay within the lines.

I will put this to the test one day. I believe the other aspect in this case was...what were the IDs? The list has been around for a long time in SharePoint, with lots of adds, removes and such. So we couldn't say for sure that casting a filter on a range (as @ericonline suggested) would yield the proper results if the IDs were larger than we expected.

 

All good food for thought and fun experiments!


Ran across this blog post (again) yesterday, and it looks like a more elegant way to do the whole "collect-2k-records-at-a-time" thing from a SQL DB.

I can't quite decipher it at first glance, nor can I afford the time to re-engineer the solution I already have working, but I wanted to share.
Cheers!

@ericonline For what it's worth, here is one method I use. It is similar in concept to the blog post you referenced, only they are generalizing the process a bit more.  I don't remember who I picked this up from but it was back around when the SaveData/LoadData functions were first released, so it is an older method for sure. This code is adapted from our maintenance app, which is offline capable (-ish...if they can open the app, it works...) and needs to give access to roughly a year of maintenance data. 

 

// Collect MaintenanceRecord table if missing (4000 records max)
If(
    IsEmpty(MaintenanceRecordCollection),
    Collect(
        MaintenanceRecordCollection,
        Sort(
            '[dbo].[MaintenanceRecord]',
            ID,
            Descending
        )
    );
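    // Track the smallest ID retrieved so far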
    UpdateContext(
        {
            MinID: Min(
                MaintenanceRecordCollection,
                ID
            )
        }
    );
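    // If the first chunk hit the 500-row delegation limit, there may be more rows below MinID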
    If(
        CountRows(MaintenanceRecordCollection) = 500,
        Collect(
            MaintenanceRecordCollection,
            Filter(
                Sort(
                    '[dbo].[MaintenanceRecord]',
                    ID,
                    Descending
                ),
                ID < MinID
            )
        )
    );
    UpdateContext(
        {
            MinID: Min(
                MaintenanceRecordCollection,
                ID
            )
        }
    );
    If(
        CountRows(MaintenanceRecordCollection) = 1000,
        Collect(
            MaintenanceRecordCollection,
            Filter(
                Sort(
                    '[dbo].[MaintenanceRecord]',
                    ID,
                    Descending
                ),
                ID < MinID
            )
        )
    );
.
.
.
    UpdateContext(
        {
            MinID: Min(
                MaintenanceRecordCollection,
                ID
            )
        }
    );
    If(
        CountRows(MaintenanceRecordCollection) = 3500,
        Collect(
            MaintenanceRecordCollection,
            Filter(
                Sort(
                    '[dbo].[MaintenanceRecord]',
                    ID,
                    Descending
                ),
                ID < MinID
            )
        )
    );
    SaveData(
        MaintenanceRecordCollection,
        "LocalMaintenanceRecord"
    )
)

The UpdateContext/If block would be repeated for as many chunks as you want to pull in. I set the limit for this particular data set at 4000, and I am using the standard 500-item limit, which is the number in the CountRows comparison. In subsequent blocks, that number goes up by multiples of whatever the delegation limit is set at, so in my case 500, then 1000, then 1500, and so on. The last one should be one multiple less than your desired limit, so in my case 3500.

 

The gist of the code is this:

  1. If the collection is empty, pull in the first chunk, sorted by ID in descending order. (Checking for an empty collection is optional.)
  2. Store the collection's minimum ID number in a local variable
  3. If the row count in the collection equals the maximum number of delegable items, filter to the IDs smaller than the stored minimum ID and collect the next chunk, again sorted by ID in descending order.
  4. Repeat steps 2 & 3 as many times as needed (see the block template sketched after this list).
  5. Save the data (optional).
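For reference, each repeated block follows this template; the only thing that changes between blocks is the CountRows comparison, which steps up by one multiple of the delegation limit each time (1500 here, as an example):

UpdateContext(
    {
        MinID: Min(
            MaintenanceRecordCollection,
            ID
        )
    }
);
If(
    CountRows(MaintenanceRecordCollection) = 1500,
    Collect(
        MaintenanceRecordCollection,
        Filter(
            Sort(
                '[dbo].[MaintenanceRecord]',
                ID,
                Descending
            ),
            ID < MinID
        )
    )
);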

 

The disadvantage of this method versus the Concurrent() method in your original post is that it collects the data stepwise, so the time to gather the data will be longer (assuming the maximum amount of data is loaded in). I doubt the clean-up functions in your method add much time, so even with those, I would bet your method is faster. This method doesn't require the clean-up and is easy to extend as needed with a simple copy/paste and one variable change. It also doesn't depend on the IDs as much, so it is a bit more flexible in cases where there are gaps in the IDs; it will get the maximum possible number of items regardless.

Thanks @wyotim (and everyone!), good convo.
Pecking away at this some more. Created a SQL View which carved the data set down to ~11.7k records, sweet! Now here is where it gets interesting:

  • There is an ID column in the dataset (primary key)
  • The View pulls in records from across the entire 34k set. Many IDs in the view are beyond 11700.
  • Do I have to iterate over the entire 34k records still? 
  • How can I pull in just the records in the SQL view?

Basically, we've been doing great at pulling sequential records. This little piece throws a monkey wrench in... I think 🙂

Tricky!

Hm. Nice. 
It appears that just substituting 'vwSql' for 'bigSqlTable' (in the original post) cut the load time:

  • 34240 records @ 20 secs //I know I said 6 secs yesterday @timl. Maybe it was a cache thing? 🙂
  • 11175 records @ 7 secs

I didn't have to fiddle with the IDs.
Cool!

Hi @ericonline 

I'm glad you're making progress on this, and that you didn't need to fiddle with IDs!

If the absence of a sequential ID were an issue, one workaround would be to modify your view to include a sequential row number, which you can do with the ROW_NUMBER function. The view definition would look something like this:

CREATE VIEW [vwSQL]
AS
SELECT RecordID, Column1, Column2,
       ROW_NUMBER() OVER (ORDER BY RecordID) AS RowNumber
FROM [bigSQLTable]
WHERE Column1 = 'WhatEverCriteriaToReturn11kRows'
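With a RowNumber column like that in place, the Concurrent pattern from the original post could filter on RowNumber instead of recordID, since row numbers are guaranteed to be sequential and gap-free. A sketch, reusing the names above:

Concurrent(
    ClearCollect(col1, Filter('[dbo].[vwSQL]', RowNumber >= 1 && RowNumber <= 2000)),
    ClearCollect(col2, Filter('[dbo].[vwSQL]', RowNumber >= 2001 && RowNumber <= 4000))
    // ...and so on, in 2,000-row slices, up to the view's row count
)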

 

I'm not really sure why pulling 35k records took 6 seconds yesterday but 20 seconds today.

 

One of the things I sometimes do when performance tuning is to export the contents of vwSQL to a text format through the export function in Management Studio. The file size gives me an indication of how much data there is (say, 45MB for example). I have a rough idea of what my internet speed is, so I can use that to calculate the optimum transfer time for the data; at 50Mbps (roughly 6MB/s), 45MB should take around 7 to 8 seconds, for example. If the load times in my app deviate greatly from this, it can indicate that there's some problem. If it's loading too quickly, I can double-check that it's actually loading all the data that I expect.

Also if you did suspect a caching issue, SQL Profiler could help you check the data that the app is actually retrieving.


@RandyHayes @wyotim @ericonline 
I am attempting a different approach and wanted to see if any of you (formula wizards) had any insights that might help... I do not want to have to monitor my SQL table and add another ClearCollect as new records get added. For example, using @ericonline's example, I would need to append another ClearCollect to the OnStart once my SQL table exceeds 35000 records. Not only is this an issue, but the view I am referencing does not have a recordID (or similar) in chronological order.

ClearCollect(col17, Filter('[dbo].[bigSqltable]', recordID >= 32001 && recordID <= 34000)),
ClearCollect(col18, Filter('[dbo].[bigSqltable]', recordID >= 34001 && recordID <= 35000))

My OnStart is as follows: 

 

ClearCollect(
    colSubs,
    '[dbo].[vwDISTINCT_SUBS]'
);
Clear(colSubsMaster);
ForAll(
    colSubs,
    Collect(
        colSubsMaster,
        {
            val: CountRows(colSubsMaster) + 1,
            colName: Concatenate("col", Text(CountRows(colSubsMaster) + 1)),
            // Output of the next formula is:
            // ClearCollect(col1, Filter('[dbo].[VW_MasterTable]', SUBS = "SubsName"
            collectCommand: Concatenate(
                "ClearCollect(",
                Concatenate("col", Text(CountRows(colSubsMaster) + 1)),
                ", Filter('[dbo].[VW_TableViewName]', SUBS = """,
                SUBS,
                Char(34)
            ),
            name: Last(
                FirstN(
                    colSubs,
                    CountRows(colSubsMaster) + 1
                )
            ).SUBS
        }
    )
)

This will be dynamic, so if a person adds a new SUBS to the master data, it will be pulled into these collections. Now, here is where I want the magic to happen: I would like to run all of the ClearCollect formulas in the colSubsMaster --> collectCommand column. I have tried using a Gallery with a toggle and a button as the trigger (shown in the image below). I have also tried ForAll in different variations. No luck. I understand that the collectCommand in the collection is 'seen' as data and not an action. Is there some way to translate it into an action?

*Update: it would be really great if this could run under Concurrent().

 

[Image: gallery with toggle and button used to trigger collectCommand]

Clear as mud 🙂 !

Thanks!.... aka ~ @KickingApps 
