jfk86d
Helper I

CSS Selector help for Extract data from web page advanced settings

I am trying to select the price below ($64.99) in the div element and the CSS selector retrieves the contents of the span element as well ($64.99reg $71.99). How do I modify my CSS selector term (and Attribute??) to only get the price ($64.99)?

<div class="prices f23442L">$64.99<span class="Pricereg b98723s">reg $71.99</span></div>

This is the current CSS selector and Attribute.

CSS selector: html > body > div:eq(1) > main > div > div:eq(2) > form > div:eq(1) > div:eq(0)
Attribute: Own Text

Thank you for sharing your expertise and help.

John

 

 

1 ACCEPTED SOLUTION


Hey @jfk86d
So I have bad news, OK news, and some interesting news.

The bad news is that I recall a similar case where a div had text inline alongside a span containing more text, and I couldn't find a way to get only the text I wanted using CSS selectors alone (much the same as what we're trying here).

The OK news is that regex should work (of course, you'll need to validate it against prices on other sample pages; you know how online storefronts can be with different CSS layouts...). Here's a picture of the settings I used to make it possible (with the resulting outputs):

UPLOADMETOFLOWCOMMUNITY.png

 

For reference:

 

 

    CSS selector: html > body > div:eq(1) > main > div > div:eq(2) > form > div:eq(1) > div:eq(0)
    Attribute: Own Text
    Regex: ^\$\d.\.\d{2}

 


Just in case, I've included a second regex variation that also worked for this case. Again, test it against the price formats you expect to encounter (I'm doing this with another very large online store, so I'm very familiar with how inconsistent product listing pages can be):

 

    Regex: ^\$\d*\.?\d*
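
A quick way to sanity-check both patterns before putting them into PAD is to run them against a few sample strings. The sketch below uses JavaScript's regex engine, which is an assumption: PAD's engine may behave differently in places. Only the first sample string comes from the question; the other two are made up. The third pattern, `^\$\d+\.\d{2}`, is a stricter variant added for comparison and is not from the thread.

```javascript
// The two patterns from the post, plus a stricter variant (an assumption, not from the thread).
const patterns = {
  regexA: /^\$\d.\.\d{2}/, // "\d." = one digit followed by ANY single character
  regexB: /^\$\d*\.?\d*/,  // everything after "$" is optional
  strict: /^\$\d+\.\d{2}/, // one or more digits, a dot, exactly two digits
};

// Hypothetical helper: returns the matched text, or null when nothing matches.
function firstMatch(pattern, text) {
  const m = text.match(pattern);
  return m ? m[0] : null;
}

// First sample is from the question; the other two are invented test cases.
const samples = ['$64.99reg $71.99', '$6.99reg $8.49', '$104.99reg $119.99'];
for (const s of samples) {
  for (const [name, re] of Object.entries(patterns)) {
    console.log(name, JSON.stringify(s), '->', firstMatch(re, s));
  }
}
```

In this sketch the first pattern only matches the sample with a two-digit dollar amount, because `\d.` consumes exactly two characters, while the other two patterns match all three samples. The second pattern would also accept a bare "$", so some validation of the result is still worthwhile.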

 


While it would be desirable to parse out only the numeric value without the currency symbol (saving yourself a step), none of my attempts at specifying a regex capture played well with PAD (I think it uses a slightly different syntax than I'm used to).
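
For comparison, this is roughly what the capture-group idea looks like outside PAD. It is a minimal JavaScript sketch; the helper name `priceToNumber` is hypothetical, and whether PAD's regex column accepts the same syntax remains unconfirmed.

```javascript
// Hypothetical helper: pull the numeric part out of text such as "$64.99reg $71.99".
// The parentheses form a capture group, so the "$" is matched but not returned.
function priceToNumber(text) {
  const m = /^\$(\d+(?:\.\d{1,2})?)/.exec(text.trim());
  return m ? Number(m[1]) : null; // null when the text does not start with a price
}

console.log(priceToNumber('$64.99reg $71.99'));
console.log(priceToNumber('reg $71.99'));
```

Returning `null` for text that does not start with a price makes it easier to spot pages where the layout differs.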

Now here's the interesting news. You can actually do this very easily with the "Run JavaScript function on web page" action. I did some fudging around with the dev console in Firefox and Chrome and managed to piece together the following function:

 

function GetPrice() {
    let price = document.querySelector('div.PriceRowContainer-sc-1wlo6zv-1:nth-child(1)').childNodes[0].textContent;
    return price;
}

 


This successfully returned the value in question, and if you wanted, you could strip out the dollar sign before returning the value:

 

 

document.querySelector('div.PriceRowContainer-sc-1wlo6zv-1:nth-child(1)').childNodes[0].textContent.replace('$', '');

 


This is a little outside the general scope of what PAD is designed for, but I find the JavaScript function a lot more reliable in some cases. My only concern is the slew of seemingly random characters after "div.PriceRowContainer-sc", but I tested this on a couple of different pages and it didn't give me any issues.
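
One way to reduce the dependence on those random-looking characters is to match only the start of the class attribute and to look for the first text node instead of assuming it sits at index 0. The sketch below is untested against the actual page. It assumes the class attribute begins with "PriceRowContainer" and that only the suffix varies, and it drops the `:nth-child(1)` part of the original selector. The helper name `ownText` is hypothetical.

```javascript
// Return the first non-empty text that belongs directly to `element`,
// skipping child elements such as the <span> holding the regular price.
function ownText(element) {
  if (!element) return null;   // the selector matched nothing
  for (const node of element.childNodes) {
    if (node.nodeType === 3) { // 3 === Node.TEXT_NODE
      const text = node.textContent.trim();
      if (text) return text;
    }
  }
  return null;
}

function GetPrice() {
  // Assumption: the class attribute starts with "PriceRowContainer" and only
  // the generated suffix changes from page to page.
  const div = document.querySelector('div[class^="PriceRowContainer"]');
  const text = ownText(div);
  return text ? text.replace('$', '') : ''; // empty string when nothing was found
}
```

`GetPrice` returns an empty string rather than throwing when the element is missing, so a later step in the flow can check for that case.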

If anyone more gifted than I am knows how to reduce this back to just the CSS selector, I'd love to hear their approach, as problems like this have been a nightmare for my data validation.

Below I have included the text version of a sample flow. You should be able to select it, copy it, and paste it into a flow editor, which will import all of the actions and their settings so you can run and verify it yourself. (Actually, this doesn't work... which is a bit of a bummer really; see my edit below.)
------------------------------- FLOW STARTS BELOW THIS LINE-------------------------------

WebAutomation.LaunchFirefox Url: $'''https://www.petco.com/shop/en/petcostore/product/purina-pro-plan-focus-sensitive-skin-and-stomach-salmon-and-rice-formula-adult-dry-dog-food''' WindowState: WebAutomation.BrowserWindowState.Normal ClearCache: False ClearCookies: False Timeout: 60 BrowserInstance=> Browser
WebAutomation.DataExtraction.ExtractSingleValue BrowserInstance: Browser ExtractionParameters: {[$'''html > body > div:eq(1) > main > div > div:eq(2) > form > div:eq(1) > div:eq(0)''', $'''Own Text''', $'''^\\$\\d.\\.\\d{2}'''] } ExtractedData=> arrayPriceUsingExtractorRegexA
SET stringA TO arrayPriceUsingExtractorRegexA[0][0]
WebAutomation.DataExtraction.ExtractSingleValue BrowserInstance: Browser ExtractionParameters: {[$'''html > body > div:eq(1) > main > div > div:eq(2) > form > div:eq(1) > div:eq(0)''', $'''Own Text''', $'''^\\$\\d*\\.?\\d*'''] } ExtractedData=> arrayPriceUsingExtractorRegexB
SET stringB TO arrayPriceUsingExtractorRegexB[0][0]
WebAutomation.ExecuteJavascript BrowserInstance: Browser Javascript: $'''function GetPrice() {
let price = document.querySelector(\'div.PriceRowContainer-sc-1wlo6zv-1:nth-child(1)\').childNodes[0].textContent;
return price;
}''' Result=> stringPriceUsingJavaScript
WebAutomation.ExecuteJavascript BrowserInstance: Browser Javascript: $'''function GetPrice() {
let price = document.querySelector(\'div.PriceRowContainer-sc-1wlo6zv-1:nth-child(1)\').childNodes[0].textContent.replace(\'\\$\',\'\');
return price;
}''' Result=> stringPriceUsingJavaScriptWithoutDollarSign
Text.ToNumber Text: stringPriceUsingJavaScriptWithoutDollarSign Number=> TextAsNumber
WebAutomation.CloseWebBrowser BrowserInstance: Browser
-------------------------------- FLOW ENDS ABOVE THIS LINE--------------------------------
See below for the example flow in the editor with the results of the last run performed. 


UPLOADMETOFLOWCOMMUNITY2222.png

 

 

 

As a note, I'm going to fiddle around with this to make it look a bit prettier; I just need to get it submitted first so the draft is saved. Please let me know if you have any questions about any of this. Over the last several months I've grown very familiar with the kinds of automations required when interfacing with online storefronts.

Edit: I recognize that the text gets re-encoded when it's posted as HTML, so it's not possible to copy and paste directly from here (though I've had no issue copying and pasting to and from text editors when modifying my flows at the code level), and I know the coding syntax can be a bit of a headache to follow. If you need more details on the flow setup I used for my solution, feel free to PM me and I'll go over everything I did.

6 REPLIES
BLawless
Frequent Visitor

Without seeing the element in the context of its page, the first idea that comes to mind is to apply some regex parsing to the captured text and supply it as the argument in the third column for this element (I'd need to test it, but I think something similar to "^(\$\d.\.\d{2})" would be a good starting point).

You could also use the split text action after extracting, splitting on a regex for any letter (or on a space, if there's one between "$64.99" and "reg"), and then reference the first index of the resulting variable. There is likely still a way to do this with the CSS selector alone, but I'd need to see the site to test it.
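
To illustrate the split idea outside PAD, here is a small JavaScript sketch. The helper name `firstPriceBySplit` is hypothetical, and the second sample string (with a space before "reg") is made up to cover that variation. PAD's own split text action may take its delimiter in a different form.

```javascript
// Hypothetical helper mirroring the "split text" idea: cut at the first
// letter or whitespace character and keep what comes before it.
function firstPriceBySplit(text) {
  return text.trim().split(/[A-Za-z\s]/)[0];
}

console.log(firstPriceBySplit('$64.99reg $71.99'));  // text from the question
console.log(firstPriceBySplit('$64.99 reg $71.99')); // made-up variant with a space
```

Both calls keep only the part before "reg", so the same split covers pages with and without the space.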

If you would like to share the URL of the webpage, or of a sample page that contains this element, it would help me identify the issue with the selector (you're welcome to PM me with that information if you prefer).

jfk86d
Helper I

Hi @BLawless ,

Thanks for the reply. I like the split suggestion as Plan B. Here is the URL.

https://www.petco.com/shop/en/petcostore/product/purina-pro-plan-focus-sensitive-skin-and-stomach-sa...

Thanks,

John

jfk86d
Helper I

Hi @BLawless ,

 

Thank you very much for your expert advice. I used the RegEx solution and it is working well. Thank you very much!

 

 

Hey @jfk86d,
Happy to help! I'm sure someone else will come along with a better solution, but in the meantime if you wanna mark my explanation post as a solution so others can easily reference it, feel free to do so! 
Whoops, you were already on top of that. Thank you!

kwikretails
New Member

Hi @BLawless ,

Thanks for the reply. I like the split suggestion as Plan B. Here is the URL.

https://www.kwikpets.com/collections/dog-vitamins-supplements

Thanks,

John
