Super User

Re: Pulling in large-ish SQL tables

@ericonline Nope...now you've crossed into delegation. The challenge was how to get at all the items in the list and how to avoid delegation issues.

>= and <= are not delegable in SharePoint.

Nice try 😉 

We hashed out every possible way to skin this one and get the total results of the list...including that method, which would still only return a subset up to the limits of delegation.

By the way - total load time on that list with those formulas, including the pulling of duplicates and everything else in the kitchen sink was about 20 seconds.

_____________________________________________________________________________________
Digging it? - Click on the Thumbs Up. Solved your problem? - Click on Accept as Solution. Others seeking the same answers will be happy you did.
Super User

Re: Pulling in large-ish SQL tables

Hello,

Have you tried retrieving the data using Flow? If you don't mind using it, it's worth a try.
Check this short video:
https://www.youtube.com/watch?time_continue=12&v=FLR41v0F1OE

Super User

Re: Pulling in large-ish SQL tables

Hi @RandyHayes 

Thanks for posting your SharePoint/StartsWith formula. It'll definitely be of help to lots of people.

Just as a slight comment - the <, <=, >, >= operators are delegable in SharePoint, but only for numeric fields. There was a mistake in the documentation; the post here gives a bit more detail.

https://powerusers.microsoft.com/t5/General-Discussion/Filter-delegable-predicates-for-SharePoint/td...
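To illustrate (just a sketch - 'My SharePoint List' and the 2000 threshold are made-up names/values, not from this thread), a numeric comparison on the ID column like this should delegate to SharePoint rather than being capped at the local row limit:

// Numeric comparison on ID delegates to SharePoint, per the corrected documentation
ClearCollect(
    colRecentItems,
    Filter(
        'My SharePoint List',
        ID >= 2000
    )
)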

 

Super User

Re: Pulling in large-ish SQL tables

@timl Interesting...ya take the word of these documents sometimes and stay within the lines.

I will put this to the test one day. I believe the other aspect in this case was...what were the IDs? The list has been around for a long time in SharePoint, with lots of adds, removes and such. So, we couldn't say for sure that casting a filter on a range (as @ericonline suggested) would yield the proper results if IDs were larger than we expected.

 

All good food for thought and fun experiments!

_____________________________________________________________________________________
Digging it? - Click on the Thumbs Up. Solved your problem? - Click on Accept as Solution. Others seeking the same answers will be happy you did.
Super User

Re: Pulling in large-ish SQL tables

Ran across this blog post (again) yesterday and it looks like a more elegant way to do the whole "collect-2k-records-at-a-time" thing from a SQL DB.

I can't quite decipher it at first glance, nor can I afford (time) to reengineer the solution I already have working, but wanted to share. 
Cheers!

Super User

Re: Pulling in large-ish SQL tables

@ericonline For what it's worth, here is one method I use. It is similar in concept to the blog post you referenced, only they generalize the process a bit more. I don't remember who I picked this up from, but it was back around when the SaveData/LoadData functions were first released, so it is an older method for sure. This code is adapted from our maintenance app, which is offline capable (-ish...if they can open the app, it works...) and needs to give access to roughly a year of maintenance data.

 

// Collect MaintenanceRecord table if missing (4000 records max)
If(
    IsEmpty(MaintenanceRecordCollection),
    Collect(
        MaintenanceRecordCollection,
        Sort(
            '[dbo].[MaintenanceRecord]',
            ID,
            Descending
        )
    );
    UpdateContext(
        {
            MinID: Min(
                MaintenanceRecordCollection,
                ID
            )
        }
    );
    If(
        CountRows(MaintenanceRecordCollection) = 500,
        Collect(
            MaintenanceRecordCollection,
            Filter(
                Sort(
                    '[dbo].[MaintenanceRecord]',
                    ID,
                    Descending
                ),
                ID < MinID
            )
        )
    );
    UpdateContext(
        {
            MinID: Min(
                MaintenanceRecordCollection,
                ID
            )
        }
    );
    If(
        CountRows(MaintenanceRecordCollection) = 1000,
        Collect(
            MaintenanceRecordCollection,
            Filter(
                Sort(
                    '[dbo].[MaintenanceRecord]',
                    ID,
                    Descending
                ),
                ID < MinID
            )
        )
    );
.
.
.
    UpdateContext(
        {
            MinID: Min(
                MaintenanceRecordCollection,
                ID
            )
        }
    );
    If(
        CountRows(MaintenanceRecordCollection) = 3500,
        Collect(
            MaintenanceRecordCollection,
            Filter(
                Sort(
                    '[dbo].[MaintenanceRecord]',
                    ID,
                    Descending
                ),
                ID < MinID
            )
        )
    );
    SaveData(
        MaintenanceRecordCollection,
        "LocalMaintenanceRecord"
    )
)

The UpdateContext/If block would be repeated for as many chunks as you want to pull in. I set the limit for this particular data set at 4000 and I am using the standard 500-item delegation limit, which is what the number in the CountRows check denotes. In subsequent blocks, that number goes up by multiples of whatever the delegation limit is set at, so in my case 500 to 1000 to 1500 and so on. The last one should be one multiple less than your desired limit, so in my case 3500.

 

The gist of the code is this:

  1. If the collection is empty, pull in the first chunk, sorted by ID in descending order. (Checking for an empty collection is optional.)
  2. Store the collection's minimum ID number in a local variable
  3. If the row count in the collection equals the maximum number of delegable items, filter out the IDs greater than or equal to the stored minimum ID and collect the next chunk, again sorted by ID in descending order.
  4. Repeat steps 2 & 3 as many times as needed.
  5. Save the data (optional).

 

The disadvantage of this method versus the Concurrent() method in your original post is that it is stepwise in how it collects the data, so the time to gather the data will be longer (assuming the maximum amount of data is loaded in). I doubt the clean-up functions in your method add much time, so even with those, I would bet your method is faster. This method doesn't require the clean-up and is easy to extend as needed with a simple copy/paste and one variable change. It also doesn't depend on the IDs as much, so it is a bit more flexible in cases where there are gaps in the IDs. It will get the max possible number of items regardless.
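For reference, the fixed-range Concurrent() pattern I'm comparing against looks roughly like this (a sketch only - the collection names and ID ranges here are placeholders, not the actual code from the original post):

// Fixed ID ranges collected in parallel, then merged ("clean-up") into one collection
Concurrent(
    ClearCollect(col1, Filter('[dbo].[bigSqltable]', recordID >= 1 && recordID <= 2000)),
    ClearCollect(col2, Filter('[dbo].[bigSqltable]', recordID >= 2001 && recordID <= 4000)),
    ClearCollect(col3, Filter('[dbo].[bigSqltable]', recordID >= 4001 && recordID <= 6000))
);
ClearCollect(colCombined, col1, col2, col3)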

Super User

Re: Pulling in large-ish SQL tables

Thanks @wyotim (and everyone!), good convo. 
Pecking away at this some more. Created a SQL View which carved the data set down to ~11.7k records, sweet! Now here is where it gets interesting:

  • There is an ID column in the dataset (primary key)
  • The View is pulling in records from around the entire 34k set. Many IDs in the view are beyond 11700.
  • Do I have to iterate over the entire 34k records still? 
  • How can I pull in just the records in the SQL view?

Basically, we've been doing great at pulling sequential records. This little piece throws a monkey wrench in... I think 🙂

Tricky!

Super User

Re: Pulling in large-ish SQL tables

Hm. Nice. 
It appears that just substituting 'vwSql' for 'bigSqlTable' (in the original post) reduced the load time:

  • 34240 records @ 20 secs //I know I said 6 secs yesterday @timl. Maybe it was a cache thing? 🙂
  • 11175 records @ 7 secs

I didn't have to fiddle with the IDs.
Cool!

Super User

Re: Pulling in large-ish SQL tables

Hi @ericonline 

I'm glad you're making progress on this, and that you didn't need to fiddle with IDs!

If the absence of a sequential ID were an issue, one workaround would be to modify your view to include a sequential row number, which you can do with the ROW_NUMBER function. The view definition would look something like this:

 

 

CREATE VIEW [vwSQL] AS
SELECT RecordID, Column1, Column2,
       ROW_NUMBER() OVER(ORDER BY RecordID) AS RowNumber
FROM [bigSQLTable]
WHERE Column1 = 'WhatEverCriteriaToReturn11kRows'
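On the app side, the benefit of a contiguous RowNumber column is that the chunked filters can range over it instead of the gappy RecordID - something like this (a sketch only; the collection names and ranges are placeholders):

// RowNumber runs 1..N with no gaps, so fixed ranges cover the whole view
ClearCollect(col1, Filter('[dbo].[vwSQL]', RowNumber >= 1 && RowNumber <= 2000));
ClearCollect(col2, Filter('[dbo].[vwSQL]', RowNumber >= 2001 && RowNumber <= 4000))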

 

I'm not really sure why pulling 35k yesterday took 6 secs, but 20 secs today.

 

One of the things I sometimes do when I'm performance tuning is to export the contents of vwSQL to a text format through the export function in Management Studio. I'll look at the file size to give me an indication of how much data there is (say 45MB for example). I have a rough idea of what my internet speed is, so I can use that to calculate the optimum transfer time for the data. If the load times in my app then deviate greatly from this, it can indicate that there's some problem. If it's loading too quickly, I can double check that it's actually loading all the data that I expect.
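As a rough worked example (my own numbers, not figures from this thread): a 45 MB export over a 50 Mbps connection is about (45 × 8) / 50 ≈ 7 seconds of pure transfer time, so if the app then takes a minute to pull the same data, the bottleneck is probably something other than bandwidth.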

Also if you did suspect a caching issue, SQL Profiler could help you check the data that the app is actually retrieving.

tianaranjo
Level 8

Re: Pulling in large-ish SQL tables

@RandyHayes @wyotim @ericonline 
I am attempting a different approach and wanted to see if any of you (formula wizards) had any insights that might help... I do not want to have to monitor my SQL table and add another ClearCollect as new records get added. So, for example, using @ericonline's example: I would need to append the OnStart with another ClearCollect once my SQL table exceeds 35000 records. Not only is this an issue, the view I am referencing does not have a recordID (or similar) in chronological order.

ClearCollect(col17, Filter('[dbo].[bigSqltable]', recordID >= 32001 && recordID <= 34000)),
ClearCollect(col18, Filter('[dbo].[bigSqltable]', recordID >= 34001 && recordID <= 35000))

My OnStart is as follows: 

 

ClearCollect(
    colSubs,
    '[dbo].[vwDISTINCT_SUBS]'
);
Clear(colSubsMaster);
ForAll(
    colSubs,
    Collect(
        colSubsMaster,
        {
            val: CountRows(colSubsMaster) + 1,
            colName: Concatenate("col", Text(CountRows(colSubsMaster) + 1)),
            //Output of next formula is: ClearCollect(col1, Filter('[dbo].[VW_MasterTable]', SUBS = "SubsName"
            collectCommand: Concatenate(
                "ClearCollect(",
                Concatenate("col", Text(CountRows(colSubsMaster) + 1)),
                ", Filter('[dbo].[VW_TableViewName]', SUBS = """,
                SUBS,
                Char(34)
            ),
            name: Last( FirstN( colSubs, CountRows(colSubsMaster) + 1 ) ).SUBS
        }
    )
)

This will be dynamic, so if a person adds a new SUBS to the master data, it will be pulled into these collections. Now, here is where I want the magic to happen. I would like to run all of the ClearCollect formulas in the colSubsMaster --> collectCommand column. I have tried using a Gallery with a toggle and a button to trigger it. I have also tried ForAll in different variations. No luck. I understand that the collectCommand in the collection is 'seen' as data and not an action. Is there some way to translate it into an action?

*Updated - it would be really great if this could run under a Concurrent.  

 


Clear as mud 🙂 !

Thanks!.... aka ~ @KickingApps 
