NandorR
Frequent Visitor

Failed to extract data (web page error while extracting data).

Hi!

 

I am trying to build a web-scraping flow that copies data from a table on a web page into an Excel file:

  1. I used the Recorder to identify which data to copy and chose to extract the HTML table.
  2. However, when the flow completes, there is no data in the Excel file.
  3. When I look at the flow variable itself, there are no values at all (see the picture below).
    image.png

    Has anybody encountered such an issue before, and how do I resolve it?

Ankesh_49
Super User

Could you please share the PAD script you are using?

Hi Ankesh_49,

 

Thank you for the response! I am pretty new to PAD, so may I ask which script you are referring to? Is it the screenshot below?

image.png

Works fine.

Ankesh_49_0-1655889411013.png

Could you please share the web URL you are using for data extraction?

Hi Ankesh,

 

I am unable to share the URL, as it is a company intranet link. Is it possible that PAD cannot pull anything because the website is on the intranet?

Not at all.

 

Could you please try adding a delay after step 11 in your script?

Also, could you please confirm whether you get the option below while recording?

Ankesh_49_0-1655891783597.png

 

Cheers,

Ankesh

--------------------------------

If this post helps answer your question, please click on “Accept as Solution” to help other members find it more quickly. If you thought this post was helpful, please give it a Thumbs Up.

 

VJR
Super User

@NandorR 

 

Instead of the recorder, use the "Extract data from web page" action.

Also, check which HTML tag you are on when you select "Extract entire HTML table".

For example, in this case I am doing it on the <th> tag, and I can then see the columns in the preview section on the right.

Likewise, play around and see which one works for you.

Sometimes it is the <table> tag instead.

VJR_0-1655892032608.png

Hi,

 

My answers to your comments are in parentheses:

  • Could you please try adding a delay after step 11 in your script. (Performed, but I still got the same results.)

image.png

  • Also, could you please confirm if you are getting below option while doing a record (I did get that same option when I used the Recorder function. However, when I clicked it and ran the flow, the variables were still empty.)

image.png

Ankesh_49
Super User

@NandorR Could you please check whether you are using the correct browser instance, %browser2% or %browser1%?

 

Cheers,

Ankesh


NandorR
Frequent Visitor

Hi VJR,

 

Thank you for your response. My answers to your comments are in parentheses:

  • Instead of the recorder use the "Extract data from web page" action. (For some reason this does not seem to work for me. When I select "Extract data from web page", it does not recognize any of the elements in the web page. Could that be due to lag, as I am using it on MS Edge?)
  • Also see on which Html tag are you selecting the "Extract entire html table"? (In this case I selected <td>, and it still returned a blank table when I ran the flow. When I originally selected <table>, I got the same issue. I will try clicking on other types of tags to see if that works. Note also that I originally selected specific elements in the web page, and the flow variables still showed blanks.)

image.png

Ankesh_49
Super User

@NandorR Is this web table inside a web frame?

NandorR
Frequent Visitor

Hi VJR,

 

  • I managed to try "Extract data from web page", but the result was the same. In the "Extractor Preview" I could see the values I wanted (censored for confidentiality reasons).

image.png

  • But when I ran the flow, the flow variable I got was blank. The number of columns was correct (8 columns), but the number of rows was not (there are definitely more than 3 data rows).

image.png

  • Additionally, when I look at the web page, the only tags I see are <td> (the data I want is in this tag), <a>, and <b>. <table> does not appear unless I select "Extract entire HTML table". Could this be the issue?
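
One way to cross-check the row and column counts described above, independently of PAD, is to save the page source and count the rows with a small script. The following is a minimal sketch using only Python's standard `html.parser`. The class name `TableRows`, the function `extract_rows`, and the sample HTML are made up for illustration, and the sketch assumes well-formed, non-nested tables with explicit closing tags.

```python
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags such as <a> or <b> is still part of the cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

def extract_rows(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows

# Made-up sample page; replace with the saved page source.
sample = """
<table>
  <tr><th>Id</th><th>Name</th></tr>
  <tr><td>1</td><td><a href="#">Alpha</a></td></tr>
  <tr><td>2</td><td><b>Beta</b></td></tr>
</table>
"""
rows = extract_rows(sample)
print(len(rows), rows)
```

If the saved source contains all the expected rows but the flow variable does not, the data is present in the page and the problem lies in how the extraction is configured or timed.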
NandorR
Frequent Visitor

Hi Ankesh,

 

With my limited knowledge of HTML, I don't think the table is in a web frame. When I view the page source and press Ctrl+F to search for "frame", I cannot find any mention of it. Additionally, the only tags I see by default when I use the Data Extractor are <td> (the data I want is in this tag), <a>, and <b>. <table> does not appear unless I select "Extract entire HTML table" after clicking an element.
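
The manual Ctrl+F check can also be done with a short script over the saved page source. This is a rough sketch using Python's standard `html.parser`; `TagCounter`, `count_tags`, and the sample HTML are illustrative names and data only. Note that it inspects static source, so a frame that the page injects with JavaScript after loading would not show up here (the browser's developer tools show the live DOM).

```python
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Counts how often selected tags occur in an HTML document."""

    def __init__(self, tags):
        super().__init__()
        self.counts = {tag: 0 for tag in tags}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in self.counts:
            self.counts[tag] += 1

def count_tags(html, tags=("frame", "iframe", "table")):
    parser = TagCounter(tags)
    parser.feed(html)
    parser.close()
    return parser.counts

# Made-up sample page; replace with the saved page source.
sample = """
<html><body>
  <iframe src="report.html"></iframe>
  <table><tr><td>A</td></tr></table>
</body></html>
"""
print(count_tags(sample))
```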

Ankesh_49
Super User

@NandorR Could you please check whether table data gets extracted on other websites?

Hi Ankesh,

 

It works on other websites. I was able to pull data into a table.

image.png

However, under "Advanced Settings" I noticed a difference in the CSS selector:

  • Website where data could not be extracted: html > body > table
  • Website where data could be extracted: html > body > form > main > div:eq(1) > div > div > div:eq(1) > div > div:eq(0) > div > div > div > div:eq(0) > table

 

Would this be what is causing no data to be extracted?
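
For context, the `>` in these selectors means "direct child of", so the length of the selector mostly mirrors how deeply the table is nested in the page. A short selector such as html > body > table is not wrong in itself. The sketch below, using Python's standard `html.parser`, prints that kind of path for every table in a page. The names `TablePathFinder` and `table_paths` and both sample pages are invented for illustration, and the sketch does not reproduce the `:eq(n)` position indices shown in PAD.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

class TablePathFinder(HTMLParser):
    """Records the chain of ancestor tags for every <table> in a document."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.table_paths = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.stack.append(tag)
        if tag == "table":
            self.table_paths.append(" > ".join(self.stack))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags in between.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

def table_paths(html):
    finder = TablePathFinder()
    finder.feed(html)
    finder.close()
    return finder.table_paths

# Two made-up pages: one with a table directly under <body>, one nested deeper.
flat = "<html><body><table><tr><td>1</td></tr></table></body></html>"
nested = ("<html><body><form><div><br>"
          "<table><tr><td>1</td></tr></table></div></form></body></html>")
print(table_paths(flat))
print(table_paths(nested))
```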

Ankesh_49
Super User

@NandorR  Could you please try this:

1. Add the HTML table as a UI element.

2. Open the selector builder and see how PAD identifies it.

Ankesh_49_0-1656104487972.png

3. Try using that attribute when creating a custom selector.

Ankesh_49_1-1656104645412.png

 

Hope it helps!!

 

Cheers,

Ankesh


NandorR
Frequent Visitor

Hi Ankesh,

 

I did as you suggested.

  • I identified the UI elements of some sample values in the web page (circled in green).

image.png

  • I then entered them into the respective "CSS Selector" and "Attribute" fields (e.g. I changed the initial CSS selector from "html > body > form > table:eq(1) > tbody > tr:eq(0) > td > input:eq(1)" to "html > body > form > table:eq(1) > tbody > tr:eq(0) > td > input[Id="styleSmall"]" and the attribute from "Own Text" to "id").

image.png

  • I ran the flow, which completed without any errors, but when I opened the variables, they were still blank.

image.png

  • When I look back at the CSS selector and attribute, I can see that they were saved successfully. However, I am still not getting any results. Do you have any other suggestions as to what may be wrong?

image.png

Ankesh_49
Super User

@NandorR The only thing I can think of now is for you to share a similar website so that people here can look into it.

Thank you

momlo
Helper V

Hi @NandorR 

Apologies if this was tested already, but did you test your "Extract data from web page" action in isolation, independent of the send keys actions that happen earlier in your flow?

I'm asking because perhaps your web page navigation does not reach the point where the table is displayed, so the extraction fails.

What I would do is deactivate all actions except the extract data action (or create a fresh flow with just this action), navigate to the page manually, and test the action. If this works, the problem is in the earlier actions of your flow.
