PAuserFromFranc
Helper III

New webflow from specific website not able to extract data

Hello Guys, 

 

Can someone build a flow that extracts the contact information listed for every company on this website?

I can't find the right selector to grab it.

https://www.energaia.fr/visiter/liste-des-exposants/

Thanks for your help.

Regards,

Fred

18 REPLIES
PAuserFromFranc
Helper III

@VJR @Henri @Henrik_M @Ankesh_49 

Hi team, I'd appreciate your help finding the right selector and building the flow I'm asking for.

Could you help, please?

Thanks, Fred

Ankesh_49
Super User

@PAuserFromFranc Could you please share the flow you have developed? Which selector are you using?

PAuserFromFranc
Helper III

@Ankesh_49 Nothing at the moment, but I want to grab information from

https://www.energaia.fr/visiter/liste-des-exposants/

For each company name found in the table, the flow should open the little down arrow and get the name, phone, email, address and field of activity. It has to do this on every page, so pagination is needed too.

Usually I know how to do this, but I can't here. Normally the selector would be:

body > main > section > div > div > form > div:eq(3) > table > tr.odd:nth-child(1) > td:nth-child(1) > span:nth-child(2)

or something like this, with an attribute such as tr[Class="odd"]

 

Pavel_NaNoi
Impactful Individual

From my limited research on this, the most I can tell you is that you'll need to run JavaScript for this to work on the webpage. Why? Because the company information is not actually on the webpage itself but in an embedded document, an iframe. So to access it you need JavaScript that can somehow switch the CSS selector from the main page to the iframe of the embedded document. I honestly have no idea how to do that, and I hope someone more well versed in this will come along to help out.

Pavel_NaNoi
Impactful Individual

Figured it out. Do the following:

- Launch the webpage

- Use Extract data from web page

- In the advanced options, enter the CSS selector "iframe:eq(0)" with the attribute "src"

This gets you the URL of the iframe. Now simply launch a new Chrome instance with that URL, and from there you can extract everything as normal.
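
Outside PAD, the same trick (read the first iframe's src, then open that URL on its own) can be sketched in Python. This is only an illustration, not part of the flow: the `first_iframe_src` helper, the sample markup and the example.com URLs are made up, and the real page's HTML may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FirstIframeSrc(HTMLParser):
    """Remembers the src attribute of the first <iframe> it sees."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        # Rough equivalent of the selector "iframe:eq(0)" with attribute "src".
        if tag == "iframe" and self.src is None:
            self.src = dict(attrs).get("src")


def first_iframe_src(html, base_url):
    """Return the absolute URL of the first iframe, or None if there is none."""
    parser = FirstIframeSrc()
    parser.feed(html)
    if parser.src is None:
        return None
    # Resolve relative links against the page that hosts the iframe.
    return urljoin(base_url, parser.src)


# Hypothetical page source, for illustration only.
SAMPLE_HTML = '<main><iframe src="https://frames.example.com/form/list?lang=fr"></iframe></main>'
BASE_URL = "https://www.example.com/visiter/liste/"

print(first_iframe_src(SAMPLE_HTML, BASE_URL))
```

The URL it prints is the one you would open in the new browser instance.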

 

Enjoy.

PAuserFromFranc
Helper III

@Ankesh_49 @Pavel_NaNoi 

Thank you, I made some progress, but I'm still stuck on the JavaScript to execute to open the little arrow and get the data.


 

You shouldn't need JavaScript for that. Does Extract data from web page not work?

Alright, I can see now why you were struggling with the extraction part. I've got some good news and some bad news.

Good news: I made an automation that does what you want. It opens the arrow, extracts the text and moves to the next one.

Bad news: it takes 1 minute 30 seconds per page (in debug mode with a 1 ms delay).

I don't see a way of improving that time, other than maybe using an API call or something similar (I have no idea how to do that, so don't even ask).

However, if you want this automation, private message me (just click my profile, you should see the button on the right) and I'll send it over to you. It will need some editing on your end, though.

In summary it does the following:

- Creates a new Datatable 

- Gets the number of pages to go through

- Creates a loop based on the number of pages

- Extracts arrows count on page

- Goes through each arrow, extracting specific text (can be edited to extract whatever you need)

- Puts that information into the Datatable (will need editing if the above is changed)

- Once done, moves to next page and repeat till finished.
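
The steps above can be sketched as a pair of nested loops. This is a rough Python outline of the structure only, not the actual PAD flow: `PAGES` is invented stand-in data, and where the real flow clicks arrows and the next-page button, the sketch simply reads from a list.

```python
# Stand-in for the website: each inner list is one page of exhibitor entries.
# These records are invented for illustration only.
PAGES = [
    [
        {"name": "Company A", "details": "Contact : Alice"},
        {"name": "Company B", "details": "Contact : Bob"},
    ],
    [
        {"name": "Company C", "details": "Contact : Carol"},
    ],
]


def scrape_all(pages):
    """Walk every page and every arrow, collecting one row per company."""
    table = []                                  # the new Datatable
    page_count = len(pages)                     # number of pages to go through
    for page_index in range(page_count):        # loop based on the page count
        entries = pages[page_index]
        arrow_count = len(entries)              # arrows found on this page
        for arrow_index in range(arrow_count):  # go through each arrow
            entry = entries[arrow_index]        # "open" it and read the text
            table.append([entry["name"], entry["details"]])
        # The real flow would click the next-page button here.
    return table


rows = scrape_all(PAGES)
print(len(rows), "rows collected")
```

Most of the run time in the real flow comes from the inner loop, since each arrow has to be clicked and read separately.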

Henrik_M
Super User

Step one should be to enter the iframe directly:  https://exposants.energaia.fr/form/liste_exposant&lang=fr&session=EN22&langue_id=1 


 

I've thought the program through in my head, and it should be possible. I'll see if I have time to build it over the weekend, and then I can share it.

PAuserFromFranc
Helper III

I did enter the iframe, but I'm stuck on the rest. Thanks again @Henrik_M

Henrik_M
Super User

This should work for you. Paste the whole text into an empty PAD flow.

 

Look it through and see if it makes sense to you.

 

You must have this page open when you run it: https://exposants.energaia.fr/form/liste_exposant&lang=fr&session=EN22&langue_id=1

 

Wait, Henrik, how'd you put a zip file attachment in your message? I can't seem to do it, it just tells me it's not supported.

I might have more privileges because of the Super User status. I only got the "not supported" message when I tried uploading the .txt file 🤔


 

Ah, fair enough.

PAuserFromFranc
Helper III

Thank you so much @Henrik_M 

I'm now trying to understand the flow you made, but it's too difficult for me. For instance, I don't get this:

table[Id="exposant"] > tbody > tr > td > span[Class*="fa-chevron"]:eq(%LoopIndex_Chevron%)

LoopIndex_Chevron is a variable, and you use it in the selector to keep moving forward, right?

And what does the little * after Class mean?

I think I get the rest, but it's very complex for me to think through algorithms this way...

Thanks for everything,

Fred

Correct. Since we know that there are 25 entries on each page, we count from index 0 to 24.

 

*= is the way to write the contains operator between an attribute (the class) and the value (fa-chevron)

 

So in this case, we are able to advance down through the list and open each description box, regardless of the chevron type.
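
A small Python illustration of both points, assuming hypothetical class values for the icons (the real page may use different ones): `*=` behaves like a substring test, and the loop index is simply dropped into `:eq(...)` for positions 0 to 24.

```python
def attr_contains(attr_value, needle):
    """Mimics the CSS [attr*="needle"] test: a plain substring match."""
    return needle in attr_value


# Hypothetical class attributes of three <span> icons on the page.
span_classes = ["fa fa-chevron-down", "fa fa-chevron-up", "fa fa-envelope"]

# Both chevron variants match; the unrelated icon does not.
chevrons = [c for c in span_classes if attr_contains(c, "fa-chevron")]


def chevron_selector(loop_index):
    """Build the selector for the chevron at a given zero-based position."""
    return (
        'table[Id="exposant"] > tbody > tr > td > '
        f'span[Class*="fa-chevron"]:eq({loop_index})'
    )


# 25 entries per page means indexes 0 through 24.
selectors = [chevron_selector(i) for i in range(25)]

print(chevrons)
print(selectors[0])
print(selectors[-1])
```

In PAD the substitution is done by writing %LoopIndex_Chevron% inside the selector text, as in the selector quoted above.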


 

PAuserFromFranc
Helper III

Hi @Henrik_M, where can I learn regex like the one you used, (?<=Contact : ).+?(?=string) and so on? I don't get it, and since I'm not into code or regex I can't understand it well enough to reuse it in similar flows, which will have other texts to parse.

Thanks

https://regexone.com/ takes you through many of the basics, but (a lot of) practice is what makes... proficient, at some point 😅

 

But actually you shouldn't even need the regular expressions that much moving forward, since the Crop text action can do the whole "get text between two other texts" thing that I did with parse text.

 

By the way, the Replace text action that I add is there only because I find new lines annoying when it comes to parsing, so I tend to reduce them to regular spaces.
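
For comparison, here is a minimal Python sketch of the same "text between two other texts" idea, including the newline-to-space replacement. The sample text, the "Tel : " end marker and the `text_between` helper are assumptions for illustration; PAD's own Parse text and Crop text actions are configured in the designer rather than written like this.

```python
import re


def text_between(text, start_flag, end_flag):
    """Return the text found between two markers, or None if either is missing."""
    # Reduce line breaks to single spaces first (the Replace text step).
    flat = re.sub(r"\s*[\r\n]+\s*", " ", text)
    # (?<=start) looks behind for the start marker without capturing it,
    # .+? takes as little text as possible,
    # (?=end) looks ahead for the end marker without capturing it.
    pattern = "(?<=" + re.escape(start_flag) + ").+?(?=" + re.escape(end_flag) + ")"
    match = re.search(pattern, flat)
    return match.group(0).strip() if match else None


# Invented sample of what one opened description box might contain.
SAMPLE = "Contact : Jane Doe\nTel : 01 23 45 67 89\nEmail : jane@example.com"

print(text_between(SAMPLE, "Contact : ", "Tel : "))
print(text_between(SAMPLE, "Tel : ", "Email : "))
```

The lookbehind and lookahead are what keep the markers themselves out of the result, which is the same thing Crop text gives you when you supply a start flag and an end flag.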


 
