lliu_western
Resolver II

Scrape a specific text from HTML

Hello community,

 

I have a flow that gets the html of a website.

lliu_western_0-1606974367313.png

 

Part of the HTML output that I got is as follows:

 

lliu_western_1-1606974437827.png

 

The text "sessionDataKey" is the unique string within the HTML that stays consistent and sits closest to the key that I need to scrape.

 

So I have a substring formula as follows:

 

substring
(
substring(outputs('html'), indexOf(outputs('html'), 'sessionDataKey')),
9,
indexOf(substring(outputs('html'), indexOf(outputs('html'), 'sessionDataKey')), '"')
)

 

I counted from the word "sessionDataKey" to the first letter of the key that I need to scrape: there are nine characters between them, and I want the substring to stop when it encounters the quotation mark.

 

But the flow fails. Can anyone see any obvious mistakes in my substring code?

 

Thanks in advance 
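
For readers following along, here is a minimal Python sketch of what the expression above evaluates to. It assumes the HTML looks like the hypothetical `sample_html` below (the real page is only visible in the screenshot). The helpers `index_of` and `substring` are stand-ins written for this example; they mimic `indexOf` and `substring(text, startIndex, length)` from the expression language and are not Power Automate APIs.

```python
from typing import Optional


def index_of(text: str, search: str) -> int:
    """Mimic indexOf: zero-based position of the match, or -1 if absent."""
    return text.find(search)


def substring(text: str, start: int, length: Optional[int] = None) -> str:
    """Mimic substring(text, startIndex, length): the third argument is a length."""
    if start < 0 or start >= len(text):
        raise ValueError(
            "'start index' must be non-negative and less than the length of the string"
        )
    if length is None:
        return text[start:]
    return text[start:start + length]


# Hypothetical stand-in for outputs('html'); the attribute layout is an assumption.
sample_html = '<input type="hidden" name="sessionDataKey" value="abc123-def456"/>'

# Inner substring: everything from the word sessionDataKey onwards.
tail = substring(sample_html, index_of(sample_html, 'sessionDataKey'))

# Outer substring exactly as posted: start at 9, length = position of the first quote.
original_result = substring(tail, 9, index_of(tail, '"'))
print(original_result)
```

With this sample the result is `taKey" value="` rather than the key. Two things stand out: the start offset of 9 is counted from the beginning of the word "sessionDataKey" instead of from its end, and the third argument is treated as a length, not as a stop position. Whether either of these is what made the original flow fail cannot be told from the thread alone.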

1 ACCEPTED SOLUTION

Accepted Solutions
Paulie78
Super User

It is because the expression I gave you was searching for a double quote as per original post. This string has single quotes so try this instead:

 

split(substring(outputs('htmlContent'), indexof(outputs('htmlContent'), 'name=''sessionDataKey''')),'''')[3]

If you get stuck, remove the [3] from the end of the split expression and see what array it outputs. The [3] just selects the element at index 3 of the array, which is the fourth element because indexing starts at zero.

 

 


12 REPLIES
Paulie78
Super User

Try:

split(substring(outputs('html'), indexof(outputs('html'), 'name="sessionDataKey"')),'"')[3]
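
To see why index 3 picks out the value, here is a small Python sketch of the same split idea. The `sample_html` string is a hypothetical stand-in for `outputs('html')`, and the attribute order (`name` followed directly by `value`) is an assumption; a different layout would shift the index.

```python
# Hypothetical stand-in for outputs('html'); the real markup may differ.
sample_html = '<input type="hidden" name="sessionDataKey" value="abc123-def456"/>'

marker = 'name="sessionDataKey"'
position = sample_html.find(marker)
if position == -1:
    # Mirrors the situation where indexOf returns -1 in the flow.
    raise ValueError("marker not found in the HTML")

# Equivalent of substring(html, indexOf(html, marker)): keep everything from the marker on.
tail = sample_html[position:]

# Equivalent of split(tail, '"'): break the remainder on every double quote.
parts = tail.split('"')
print(parts)

# Index 3 is the fourth piece, which is the attribute value in this layout.
session_key = parts[3]
print(session_key)
```

For this sample, `parts` is `['name=', 'sessionDataKey', ' value=', 'abc123-def456', '/>']`, so `parts[3]` is the key.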

@Paulie78 

lliu_western_0-1607016853966.png

It still doesn't like it. I can't figure out where it went wrong either.

Paulie78
Super User

Sounds like it is looking for a step called "html" and you don't have one.

@Paulie78, I did have the step "html" before that. I renamed it again and it worked. I was able to run the flow, but it failed with the error below:

 

Unable to process template language expressions in action 'Get_DataKey_Token' inputs at line '1' and column '7271': 'The template language function 'substring' parameter is out of range: 'start index' must be non-negative integer and should be less than the length of the string. Please see https://aka.ms/logicexpressions#substring for usage details.'.

Paulie78
Super User

That means it did not find 

name="sessionDataKey"

in the string. Try adding a compose before which just has:

indexof(outputs('html'), 'name="sessionDataKey"')

and see what it comes out with. Sounds like it is producing -1
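
As an illustration of this check, the Python sketch below reproduces the situation with a hypothetical HTML sample that uses single quotes around its attributes. The `index_of` and `substring` helpers are stand-ins written for this example that mimic the expression functions; the error text is paraphrased from the message quoted above.

```python
def index_of(text: str, search: str) -> int:
    """Mimic indexOf: zero-based position of the match, or -1 if absent."""
    return text.find(search)


def substring(text: str, start: int) -> str:
    """Mimic substring(text, startIndex) including its start-index check."""
    if start < 0 or start >= len(text):
        raise ValueError(
            "'start index' must be non-negative integer and should be "
            "less than the length of the string"
        )
    return text[start:]


# Hypothetical page source that uses single quotes, not double quotes.
sample_html = "<input type='hidden' name='sessionDataKey' value='abc123-def456'/>"

# Searching for the double-quoted form finds nothing.
double_quoted = index_of(sample_html, 'name="sessionDataKey"')
print(double_quoted)

# Searching for the single-quoted form succeeds.
single_quoted = index_of(sample_html, "name='sessionDataKey'")
print(single_quoted >= 0)

# Feeding -1 into substring triggers the out-of-range error.
try:
    substring(sample_html, double_quoted)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

The first search returns -1 because the marker text does not occur character for character in the sample, and passing that -1 on as the start index is what raises the out-of-range error.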

@Paulie78 I got the substring working just fine now. The split returns more than I need, but I should be able to figure it out from here. Thanks!

@Paulie78 

@yashag2255 

 

this whole image is the return value I got:

lliu_western_0-1607018037728.png

by using this formula:

lliu_western_1-1607018068674.png

But I only need the highlighted text. How can I get the substring to stop at the first single quote it encounters?

Paulie78
Super User

It is because the expression I gave you was searching for a double quote as per original post. This string has single quotes so try this instead:

 

split(substring(outputs('htmlContent'), indexof(outputs('htmlContent'), 'name=''sessionDataKey''')),'''')[3]

If you get stuck, remove the [3] from the end of the split expression and see what array it outputs. The [3] just selects the element at index 3 of the array, which is the fourth element because indexing starts at zero.

 

 

@Paulie78 It is still saying invalid expression even if I take out the [3].

@Paulie78 

lliu_western_0-1607021599250.png

This is what I currently have, and it gets me exactly what I need. But it is not dynamic enough: I don't want to hard-code "32", I want to use an expression that stops when it encounters the quotation mark.

Paulie78
Super User

When I tested it, the expression I provided worked perfectly. But it is really hard to test it properly without actual sample data. 

@Paulie78 I still couldn't get it to work using split. But I was able to get it to work by using substring again and using an indexOf formula to calculate where I want the substring to stop. Thank you for your help, you have been very helpful!
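
The final formula is not shown in the thread, so the following is only a guess at its shape: a Python sketch of computing the length from the position of the closing quote instead of hard-coding it. The `sample_html` string, the `marker` text and the attribute order are all assumptions made for illustration.

```python
# Hypothetical page source with single-quoted attributes.
sample_html = "<input type='hidden' name='sessionDataKey' value='abc123-def456'/>"

# Assumed marker: everything up to and including the quote that opens the value.
marker = "name='sessionDataKey' value='"

marker_position = sample_html.find(marker)
if marker_position == -1:
    raise ValueError("marker not found in the HTML")

# Start right after the marker, i.e. at the first character of the key.
rest = sample_html[marker_position + len(marker):]

# The position of the next single quote in the remainder is the key's length.
key_length = rest.find("'")
if key_length == -1:
    raise ValueError("closing quote not found after the marker")

# Equivalent of substring(rest, 0, key_length): no hard-coded length needed.
session_key = rest[:key_length]
print(session_key)
```

In expression terms this would correspond roughly to one `substring` that skips past the marker, followed by a second `substring` whose length argument is an `indexOf` for the closing quote on that remainder.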
