Re: HTML Table (from Http Get) save to SharePoint ...

yhlow · ‎09-29-2020

Hi,

Need some help, I'm trying to save the html table to SharePoint and the html table are from the webpage using get.

Here the table format

S/No.	Registration Number	Name	Date of Approval	Status
1	A123456	Bomiil	09/10/2007	Effective till 03/12/2018
2	1RT436K	Bporeited	01/2/2011	Active
3	2BHYER016Z	Petored	25/5/2010	Effective till 24/05/2019
4	1QE$6769Z	BunkPet	10/1/2005	Active

and managed to get the html code as below.

<table>
<tbody>
<tr>
<th style="width: 46.9318px;" valign="top">S/No.</th>
<th style="width: 164.205px;" valign="top">Registration Number</th>
<th style="width: 71.4773px;" valign="top">Name </th>
<th style="width: 131.477px;" valign="top">Date of Approval</th>
<th style="width: 172.386px;" valign="top">Status</th>
</tr>
<tr>
<td style="cursor: default; width: 46.9318px;" valign="top">
1
</td>
<td style="cursor: default; width: 164.205px;" valign="top">
A123456
</td>
<td style="cursor: default; width: 71.4773px;" valign="top">
Bomiil
</td>
<td style="cursor: default; width: 131.477px;" valign="top">
09/10/2007
</td>
<td style="cursor: default; width: 172.386px;" valign="top">Effective till 03/12/2018</td>
</tr>
<tr>
<td style="cursor: default; width: 46.9318px;" valign="top">
2
</td>
<td style="cursor: default; width: 164.205px;" valign="top">1RT436K</td>
<td style="cursor: default; width: 71.4773px;" valign="top">
Bporeited
</td>
<td style="cursor: default; width: 131.477px;" valign="top">
01/2/2011
</td>
<td style="cursor: default; width: 172.386px;" valign="top">
Active
</td>
</tr>
<tr>
<td style="cursor: default; width: 46.9318px;" valign="top">
3
</td>
<td style="cursor: default; width: 164.205px;" valign="top">
2BHYER016Z
</td>
<td style="cursor: default; width: 71.4773px;" valign="top">
Petored
</td>
<td style="cursor: default; width: 131.477px;" valign="top">
25/5/2010
</td>
<td style="cursor: default; width: 172.386px;" valign="top">
Effective till 24/05/2019
</td>
</tr>
<tr>
<td style="cursor: default; width: 46.9318px;" valign="top">
4
</td>
<td style="cursor: default; width: 164.205px;" valign="top">
1QE$6769Z
</td>
<td style="cursor: default; width: 71.4773px;" valign="top">
BunkPet
</td>
<td style="cursor: default; width: 131.477px;" valign="top">
10/1/2005
</td>
<td style="cursor: default; width: 172.386px;" valign="top">
Active
</td>
</tr>
</tbody>
</table>

efialttes · ‎09-30-2020

Hi!

Extremely hard challenge since Power Automate is not the best tool for scraping.

So either you follow a split() based approach to extract all the info from your table, starting with headers...

split(replace('Your HTML table here','/th','th'),'<th')

...next with the rest of the rows...

split(replace('Your HTML table here','/td','td'),'<td')

...please note you will need to add lots of additional manipulations not detailed in the two WDL expressions suggested above...

...or you follow an XML based strategy to do it.

Second approach is more robust, but also really hard to implement. @Marty9 provided this incredibly helpful step-by-step description on this second approach:

Basically what you need to do is the following:

get the entire enclosing string between <table> and </table> from the web-source and turn it into an XML string
get the header information with XPath using the following in an expression “…string(‘//th/descendant::text()’)…” this will give you an array with the table’s column headers, the “descendant::” will go recursively into all XML-sub-nodes and extract the text value — nut just the current node’s text value — usually you only have the column-name at one place, but you never know…
do the same with each line of data points in the table by

3a. getting an array of lines first using an expression like “…string(‘//tbody/*’)…” — assuming that the <tbody> marker contains all <td>s

3b. Iterating through that array from (3a) = iterate through the lines and

3b1. extract an array of values with an XPath using the following in an expression “…string(‘//td/descendant::text()’)…” whick will give you an array just like in (2)

3b2. you may have to use a base64ToString() function in there somewhere — I believe it was for the data you get from (3a) — the line-data there was retrieved as binary information, on which of course you can’t use an XPath — that requires a plain, old-fashioned XML string

3b3. you keep adding to your array of data-points, where each line is a JSON using the header-names from (2) and the data-points from (3ai) — since my information was row centric I just created a JSON Object for each row with my header-name and reach row’s data-points, but one could also create an array, why not…

4. Finally, I parsed through the array of data-points so I could use it anyway I wanted as at this point the original HTML table content has been transformed into a JSON array of objects (or arrays if you like)

No matter what of the two approaches you finally shoose... Good luck with the challenge!

Did I make your day? If so, mark my response as 'Solution' !!!

También escribo sobre Power Automate en este Blog y en Twitter