cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
PAuserFromFranc
Helper III
Helper III

New webflow from specific website not able to extract data

Hello Guys, 

 

Is someone able to build up a flow that can extract all the contacts informations listed in all the companies of this website?

i canno't find the right selector to grab it

https://www.energaia.fr/visiter/liste-des-exposants/

thanks for helping, 

 

regards, 

 

Fred

2 ACCEPTED SOLUTIONS

Accepted Solutions
Henrik_M
Super User
Super User

This should work for you. Paste the whole text into an empty PAD flow.

 

Look it through and see if it makes sense to you.

 

You must have this page open when you run it: https://exposants.energaia.fr/form/liste_exposant&lang=fr&session=EN22&langue_id=1

 

View solution in original post

https://regexone.com/ takes you through many of the basics, but (a lot of) practice is what makes... proficient, at some point 😅

 

But actually you shouldn't even need the regular expressions that much moving forward, since the Crop text action can do the whole "get text between two other texts" thing that I did with parse text.

 

By the way, the Replace text that I add, is just because I find new lines to be annoying when it comes to parsing, so I tend to reduce them to regular spaces.

Henrik_M_0-1664919298934.png

 

View solution in original post

18 REPLIES 18
PAuserFromFranc
Helper III
Helper III

@VJR @Henri @Henrik_M @Ankesh_49 

Hi Team, requesting your help to manage selector and build this flow i'm asking for...

May you help please?

thx fred

Ankesh_49
Super User
Super User

@PAuserFromFranc   Could you please share the flow you have developed? which selector are you using?

PAuserFromFranc
Helper III
Helper III

@Ankesh_49 at the moment nothing but i want to grab information from 

https://www.energaia.fr/visiter/liste-des-exposants/

for each company name found in the table, then open the little down arrow and get name, phone, email, adress and field of activity for thoses companies and also for each page (we need also pagination)

Usually i know how to do it but i can't here normally selector would be : 

body > main > section > div > div > form > div:eq(3) > table > tr.odd:nth-child(1) > td:nth-child(1) > span:nth-child(2)

something like this with attribute like tr[Class="odd"]

 

Pavel_NaNoi
Impactful Individual
Impactful Individual

From my limited research on this, most I can tell you is that you'll need to run a javascript in order for this to work on the webpage. Why? well because the company information is not actually on the webpage, but rather an imbued document, an iframe. Thus to access it you need to run javascript that can somehow switch the CSS selector from the main page to this iframe of the imbued document. I have honestly no idea how to do that and I hope someone more well versed in this will come along to help out with this.

Pavel_NaNoi
Impactful Individual
Impactful Individual

Figured it out, do the following:

-Launch the webpage

-Use Extract data from web page

-In the advanced options write the following css selector - > "iframe:eq(0)" with the following attribute "src"

this will get you the html link of the iframe, now simply launch a new chrome instance with that link and from there you can extract everything as normal.

 

Enjoy.

PAuserFromFranc
Helper III
Helper III

@Ankesh_49 @Pavel_NaNoi 

thank you i could step a bit but i'm still stuck with a javascript to execute to open the little arrow and get the datas

PAuserFromFranc_0-1664476023464.png

 

Shouldn't need javascript for that, does extract data from webpage not work?

Alright, I can see now why you were struggling on that extraction part, got some good news and some bad news,

Good news, I made an automation that does what you want, opens the arrow, extracts text and moves to the next.

Bad news? its 1 minute 30 seconds per page (in 1ms delay debug mode)

 

I don't see a way of improving that time other than maybe just using an API call method or something (I have no idea how to do that don't even ask)

 

However if you want this automation, private message me (just click my profile, should see the button on the right), I'll send it over to you, it will need some editing from your end though.

In summary it does the following:

- Creates a new Datatable 

- Gets the number of pages to go through

- Creates a loop based on the number of pages

- Extracts arrows count on page

- Goes through each arrow, extracting specific text (can be edited to extract w/e)

- Puts that information into the Datatable (will need editing if the above is changed)

- Once done, moves to next page and repeat till finished.

Henrik_M
Super User
Super User

Step one should be to enter the iframe directly:  https://exposants.energaia.fr/form/liste_exposant&lang=fr&session=EN22&langue_id=1 

Henrik_M_0-1664552145732.png

 

I thought about the program in my head, and it should be possible. I'll see if I have time to make it during the weekend, then I can share.

PAuserFromFranc
Helper III
Helper III

Enter the iframe, i did but the rest...i'm stuck, thanks again @Henrik_M 

Henrik_M
Super User
Super User

This should work for you. Paste the whole text into an empty PAD flow.

 

Look it through and see if it makes sense to you.

 

You must have this page open when you run it: https://exposants.energaia.fr/form/liste_exposant&lang=fr&session=EN22&langue_id=1

 

Wait, Henrik, how'd you put a zip file attachment in your message? I can't seem to do it, just tells me its no supported.

I might have more privileges because of the Super User status. I only got the "not supported" message when I tried uploading the .txt file 🤔

Henrik_M_0-1664648781703.png

 

Ah, fair enough.

PAuserFromFranc
Helper III
Helper III

Thank you so much @Henrik_M 

I'm now trying to understand the flow you made but too difficult. I don't get this for instance :

table[Id="exposant"] > tbody > tr > td > span[Class*="fa-chevron"]:eq(%LoopIndex_Chevron%)

loopIndex_Chevron is variable and you use it as attribute right to keep forward?

And what does mean the little * after Class?

The rest i get it i think but very complex for me to think the algorithmes this way...

thank for all

Fred

Correct. Since we know that there are 25 entries on each page, we count from index 0 to 24.

 

*= is the way to write the contains operator between an attribute (the class) and the value (fa-chevron)

 

So in this case, we are able to advance down through the list and open each description box, regardless of the chevron type.

Henrik_M_0-1664730000646.png

 

PAuserFromFranc
Helper III
Helper III

Hi @Henrik_M where can i learn Regex like you did (?<=Contact : ).+?(?=string) and so on? i don't get it and i'm not into code or regex so i can't understand it well in order to use it for similar flows which attend to be some others texts to parse

thanks

https://regexone.com/ takes you through many of the basics, but (a lot of) practice is what makes... proficient, at some point 😅

 

But actually you shouldn't even need the regular expressions that much moving forward, since the Crop text action can do the whole "get text between two other texts" thing that I did with parse text.

 

By the way, the Replace text that I add, is just because I find new lines to be annoying when it comes to parsing, so I tend to reduce them to regular spaces.

Henrik_M_0-1664919298934.png

 

Helpful resources

Announcements

Power Platform Connections Ep 14 | J. Panchal | Thursday, 18 May 2023

Episode Fourteen of Power Platform Connections sees David Warner and Hugo Bernier talk to Microsoft PM Jocelyn Panchal, alongside the latest news, videos, product reviews, and community blogs.     Use the hashtag #PowerPlatformConnects on social media for a chance to have your work featured on the show.      Show schedule in this episode:  00:00 Cold Open 00:32 Show Intro 01:10 Jocelyn Panchal Interview 24:10 Blogs & Articles 29:50 Outro & Bloopers  Check out the blogs and articles featured in this week’s episode:   https://www.nathalieleenders.com/Blog/index.php/;focus=STRATP_com_cm4all_wdn_Flatpress_42136159&path=?x=entry:entry230511-101930#C_STRATP_com_cm4all_wdn_Flatpress_42136159__-anchor  @NathLeenders https://www.keithatherton.com/posts/2023-05-12-msbuild2023-cloud-skills-challenge/  @MrKeithAtherton https://elliskarim.com/2023/05/13/how-to-find-files-in-onedrive-that-match-a-naming-pattern/  @MrCaptainKarim https://www.linkedin.com/pulse/my-fond-memories-scottish-summit-2022-pranav-khurana/ @pranavkhuranauk https://www.linkedin.com/feed/update/urn:li:activity:7061777660745560064/?updateEntityUrn=urn%3Ali%3Afs_feedUpdate%3A%28V2%2Curn%3Ali%3Aactivity%3A7061777660745560064%29  @thevictordantas  Action requested: Feel free to provide feedback on how we can make our community more inclusive and diverse.  This episode premiered live on our YouTube at 12pm PST on Thursday 18th May 2023.  Video series available at Power Platform Community YouTube channel.  Upcoming events:  Power Apps Developers Summit – May 19-20th - London European Power Platform conference – Jun. 20-22nd - Dublin Microsoft Power Platform Conference – Oct. 3-5th - Las Vegas  Join our Communities:  Power Apps Community Power Automate Community Power Virtual Agents Community Power Pages Community  If you’d like to hear from a specific community member in an upcoming recording and/or have specific questions for the Power Platform Connections team, please let us know. We will do our best to address all your requests or questions.   

May 2023 Community Newsletter and Upcoming Events

Welcome to our May 2023 Community Newsletter, where we'll be highlighting the latest news, releases, upcoming events, and the great work of our members inside the Biz Apps communities. If you're new to this LinkedIn group, be sure to subscribe here in the News & Announcements to stay up to date with the latest news from our ever-growing membership network who "changed the way they thought about code".         LATEST NEWS "Mondays at Microsoft" LIVE on LinkedIn - 8am PST - Monday 15th May  - Grab your Monday morning coffee and come join Principal Program Managers Heather Cook and Karuana Gatimu for the premiere episode of "Mondays at Microsoft"! This show will kick off the launch of the new Microsoft Community LinkedIn channel and cover a whole host of hot topics from across the #PowerPlatform, #ModernWork, #Dynamics365, #AI, and everything in-between. Just click the image below to register and come join the team LIVE on Monday 15th May 2023 at 8am PST. Hope to see you there!     Executive Keynote | Microsoft Customer Success Day CVP for Business Applications & Platform, Charles Lamanna, shares the latest #BusinessApplications product enhancements and updates to help customers achieve their business outcomes.         S01E13 Power Platform Connections - 12pm PST - Thursday 11th May Episode Thirteen of Power Platform Connections sees Hugo Bernier take a deep dive into the mind of co-host David Warner II, alongside the reviewing the great work of Dennis Goedegebuure, Keith Atherton, Michael Megel, Cat Schneider, and more. Click below to subscribe and get notified, with David and Hugo LIVE in the YouTube chat from 12pm PST. And use the hashtag #PowerPlatformConnects on social media for a chance to have your work featured on the show.         UPCOMING EVENTS   European Power Platform Conference - early bird ticket sale ends! The European Power Platform Conference early bird ticket sale ends on Friday 12th May 2023! #EPPC23 brings together the Microsoft Power Platform Communities for three days of unrivaled days in-person learning, connections and inspiration, featuring three inspirational keynotes, six expert full-day tutorials, and over eighty-five specialist sessions, with guest speakers including April Dunnam, Dona Sarkar, Ilya Fainberg, Janet Robb, Daniel Laskewitz, Rui Santos, Jens Christian Schrøder, Marco Rocca, and many more. Deep dive into the latest product advancements as you hear from some of the brightest minds in the #PowerApps space. Click here to book your ticket today and save!      DynamicMinds Conference - Slovenia - 22-24th May 2023 It's not long now until the DynamicsMinds Conference, which takes place in Slovenia on 22nd - 24th May, 2023 - where brilliant minds meet, mingle & share! This great Power Platform and Dynamics 365 Conference features a whole host of amazing speakers, including the likes of Georg Glantschnig, Dona Sarkar, Tommy Skaue, Monique Hayward, Aleksandar Totovic, Rachel Profitt, Aurélien CLERE, Ana Inés Urrutia de Souza, Luca Pellegrini, Bostjan Golob, Shannon Mullins, Elena Baeva, Ivan Ficko, Guro Faller, Vivian Voss, Andrew Bibby, Tricia Sinclair, Roger Gilchrist, Sara Lagerquist, Steve Mordue, and many more. Click here: DynamicsMinds Conference for more info on what is sure an amazing community conference covering all aspects of Power Platform and beyond.    Days of Knowledge Conference in Denmark - 1-2nd June 2023 Check out 'Days of Knowledge', a Directions 4 Partners conference on 1st-2nd June in Odense, Denmark, which focuses on educating employees, sharing knowledge and upgrading Business Central professionals. This fantastic two-day conference offers a combination of training sessions and workshops - all with Business Central and related products as the main topic. There's a great list of industry experts sharing their knowledge, including Iona V., Bert Verbeek, Liza Juhlin, Douglas Romão, Carolina Edvinsson, Kim Dalsgaard Christensen, Inga Sartauskaite, Peik Bech-Andersen, Shannon Mullins, James Crowter, Mona Borksted Nielsen, Renato Fajdiga, Vivian Voss, Sven Noomen, Paulien Buskens, Andri Már Helgason, Kayleen Hannigan, Freddy Kristiansen, Signe Agerbo, Luc van Vugt, and many more. If you want to meet industry experts, gain an advantage in the SMB-market, and acquire new knowledge about Microsoft Dynamics Business Central, click here Days of Knowledge Conference in Denmark to buy your ticket today!   COMMUNITY HIGHLIGHTS Check out our top Super and Community Users reaching new levels! These hardworking members are posting, answering questions, kudos, and providing top solutions in their communities.   Power Apps:  Super Users: @WarrenBelz, @LaurensM  @BCBuizer  Community Users:  @Amik@ @mmollet, @Cr1t    Power Automate:  Super Users: @Expiscornovus , @grantjenkins, @abm  Community Users: @Nived_Nambiar, @ManishSolanki    Power Virtual Agents:  Super Users: @Pstork1, @Expiscornovus  Community Users: @JoseA, @fernandosilva, @angerfire1213    Power Pages: Super Users: @ragavanrajan  Community Users: @Fubar, @Madhankumar_L,@gospa  LATEST COMMUNITY BLOG ARTICLES  Power Apps Community Blog  Power Automate Community Blog  Power Virtual Agents Community Blog  Power Pages Community Blog  Check out 'Using the Community' for more helpful tips and information:  Power Apps , Power Automate, Power Virtual Agents, Power Pages 

Announcing | Super Users - 2023 Season 1

Super Users – 2023 Season 1    We are excited to kick off the Power Users Super User Program for 2023 - Season 1.  The Power Platform Super Users have done an amazing job in keeping the Power Platform communities helpful, accurate and responsive. We would like to send these amazing folks a big THANK YOU for their efforts.      Super User Season 1 | Contributions July 1, 2022 – December 31, 2022  Super User Season 2 | Contributions January 1, 2023 – June 30, 2023    Curious what a Super User is? Super Users are especially active community members who are eager to help others with their community questions. There are 2 Super User seasons in a year, and we monitor the community for new potential Super Users at the end of each season. Super Users are recognized in the community with both a rank name and icon next to their username, and a seasonal badge on their profile.  Power Apps  Power Automate  Power Virtual Agents  Power Pages  Pstork1*  Pstork1*  Pstork1*  OliverRodrigues  BCBuizer  Expiscornovus*  Expiscornovus*  ragavanrajan  AhmedSalih  grantjenkins  renatoromao    Mira_Ghaly*  Mira_Ghaly*      Sundeep_Malik*  Sundeep_Malik*      SudeepGhatakNZ*  SudeepGhatakNZ*      StretchFredrik*  StretchFredrik*      365-Assist*  365-Assist*      cha_cha  ekarim2020      timl  Hardesh15      iAm_ManCat  annajhaveri      SebS  Rhiassuring      LaurensM  abm      TheRobRush  Ankesh_49      WiZey  lbendlin      Nogueira1306  Kaif_Siddique      victorcp  RobElliott      dpoggemann  srduval      SBax  CFernandes      Roverandom  schwibach      Akser  CraigStewart      PowerRanger  MichaelAnnis      subsguts  David_MA      EricRegnier  edgonzales      zmansuri  GeorgiosG      ChrisPiasecki  ryule      AmDev  fchopo      phipps0218  tom_riha      theapurva  takolota     Akash17  momlo     BCLS776  Shuvam-rpa     rampprakash  ScottShearer     Rusk  ChristianAbata     cchannon  Koen5     a33ik  Heartholme     AaronKnox  okeks      Matren  David_MA     Alex_10        Jeff_Thorpe        poweractivate        Ramole        DianaBirkelbach        DavidZoon        AJ_Z        PriyankaGeethik        BrianS        StalinPonnusamy        HamidBee        CNT        Anonymous_Hippo        Anchov        KeithAtherton        alaabitar        Tolu_Victor        KRider        sperry1625        IPC_ahaas      zuurg    rubin_boer   cwebb365   Dorrinda   G1124   Gabibalaban   Manan-Malhotra   jcfDaniel   WarrenBelz   Waegemma   drrickryp   GuidoPreite   metsshan    If an * is at the end of a user's name this means they are a Multi Super User, in more than one community. Please note this is not the final list, as we are pending a few acceptances.  Once they are received the list will be updated. 

Check out the new Power Platform Communities Front Door Experience!

We are excited to share the ‘Power Platform Communities Front Door’ experience with you!   Front Door brings together content from all the Power Platform communities into a single place for our community members, customers and low-code, no-code enthusiasts to learn, share and engage with peers, advocates, community program managers and our product team members. There are a host of features and new capabilities now available on Power Platform Communities Front Door to make content more discoverable for all power product community users which includes ForumsUser GroupsEventsCommunity highlightsCommunity by numbersLinks to all communities Users can see top discussions from across all the Power Platform communities and easily navigate to the latest or trending posts for further interaction. Additionally, they can filter to individual products as well.   Users can filter and browse the user group events from all power platform products with feature parity to existing community user group experience and added filtering capabilities.     Users can now explore user groups on the Power Platform Front Door landing page with capability to view all products in Power Platform.      Explore Power Platform Communities Front Door today. Visit Power Platform Community Front door to easily navigate to the different product communities, view a roll up of user groups, events and forums.

Microsoft Power Platform Conference | Registration Open | Oct. 3-5 2023

We are so excited to see you for the Microsoft Power Platform Conference in Las Vegas October 3-5 2023! But first, let's take a look back at some fun moments and the best community in tech from MPPC 2022 in Orlando, Florida.   Featuring guest speakers such as Charles Lamanna, Heather Cook, Julie Strauss, Nirav Shah, Ryan Cunningham, Sangya Singh, Stephen Siciliano, Hugo Bernier and many more.   Register today: https://www.powerplatformconf.com/   

Users online (1,994)