prisccaviana
Regular Visitor

Grab url in the middle of the body

Hi folks! I would really appreciate some guidance on this scenario. My users receive emails containing URLs all day long, and I need a smart way to extract the URI no matter where it appears in the email body. For instance, take an email like this:

from: johndoe@gmail.com
to: prisccaviana@company.com
Body:
line 1
line 2
line 3
..please visit this url:

I need to be able to extract just the URI portion. I've already checked a previous post here; however, I'm only getting matches on messages where the URL is mentioned on the first line of the email body. Any ideas?

6 REPLIES
efialttes
Super User

Hi!

Does the URL always start with http://? https:// ?

If so, I would try the 'HTML to text' action to convert your body into plain text, then add a Compose action and assign it the following expression:

concat('http',split(body('HTML_to_text'),'http')[1])

Then you just need to find the proper delimiter to remove the extra characters after the URL, so you can use the split() function again.
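To check what this expression would return on a sample body before building the flow, the same logic can be modelled in a few lines of Python. This is only a sketch of the expression's behavior, not Power Automate syntax. The function name `first_http_url` and the sample text are made up for illustration, and whitespace is assumed to be the delimiter that ends the URL.

```python
def first_http_url(body: str) -> str:
    """Model of concat('http', split(body, 'http')[1]), then trim at whitespace."""
    parts = body.split("http")
    if len(parts) < 2:
        # No 'http' in the body: the [1] index in the flow expression
        # would have nothing to return here.
        return ""
    candidate = "http" + parts[1]
    # Assumption: the URL ends at the first whitespace character.
    return candidate.split()[0]


# Made-up sample body, with the URL on the third line.
sample = "line 1\nline 2\nplease visit this url: https://example.com/login now"
print(first_http_url(sample))
```

Note that a body without 'http' has no second element after the split, so the flow probably needs a condition in front of the expression for that case.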

Hope this saves your day



Each time you click the 'Thumbs up' icon on any of our inspiring answers...
...an ewok escapes from the stormtroopers.

Be grateful, Thumbs up! Save the Galaxy for free!


I write about Power Automate at:
https://medium.com/anyone-can-automate/

Proud to be a Flownaut!



Thanks for your suggestion. That would work sometimes, but I imagine an email body could also come like this:

 

blah blah blah.. click here: www. xyz.jp (imagining a phishing scenario)

 

In cases like these, how could I grab the URI portion?

Thanks a lot.

Looks bad...

When implementing email scraping with Power Automate (not the best tool for it, I am afraid), you need a pattern to extract URLs or any other data. If email bodies do not follow the same structure (i.e. the data to be extracted comes before pattern#A and after pattern#B), extraction gets difficult. If the URL does not start with http, what else can we try? Split the body by spaces and filter it to keep the elements with more than one dot inside?

I would get some example emails and try the following expression:

split(body('HTML_to_text'),' ')

Once it is transformed into an array, double-check whether the URLs are isolated from the rest of the words as single elements.

If so, you can add a 'Filter array' action, assign the expression above as its input, and set the following condition rule:

On the left side, assign the following expression:

length(split(item(),'.'))

Select 'greater than' as the operator.

On the right side of the rule, assign the following value:

2

Inspect the results to verify whether you managed to extract the URLs.
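To get a feel for which elements this rule keeps, here is a small Python model of the split plus the 'Filter array' condition. It is a sketch only: the function name `tokens_with_dots` and the sample body are made up, and it assumes that split() in the flow keeps empty strings the same way Python's `str.split` does.

```python
def tokens_with_dots(body: str) -> list:
    """Model of split(body, ' ') followed by the Filter array rule
    length(split(item(), '.')) greater than 2."""
    items = body.split(" ")  # split on single spaces only, like the expression
    return [item for item in items if len(item.split(".")) > 2]


# Made-up sample body; the URL has no 'http' prefix.
sample = "blah blah blah.. click here: www.xyz.jp thanks"
print(tokens_with_dots(sample))
```

On this sample the model keeps both 'blah..' and 'www.xyz.jp', because any word with two or more dots passes the rule. A host such as 'xyz.jp', with a single dot, would be dropped, so the results still need a manual look.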

 

Hope this helps

 






Hi, so I tried using this workflow:

print1-duvida.PNG

and it generates this outcome:

print2-duvida.PNG

So I was wondering if there is a way to split on this \n inside the array, to catch just the domain portion.

Ideas? Suggestions?

Thanks a lot.

Hi again!

\n is the character for a new line, one of the many exceptions you will find with this approach. As I mentioned before, Power Automate is not the best tool for scraping, especially when there are no patterns to rely on.

Anyway, you can try this:

- Create a Compose action named 'Compose NewLine'. Click on its Inputs and press 'Enter' on your keyboard.

- Redesign your split()-based expression to replace the new lines with spaces. You can see the new expression in the following screenshot:

split(replace(outputs('Html_to_text')?['body'],outputs('Compose_NewLine'),' '),' ')
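The effect of replacing new lines before splitting can be seen in a short Python model of the expression. This is a sketch, not Power Automate syntax: the function name `split_words` and the sample body are made up, and the new line produced by 'Compose NewLine' is assumed to be a plain "\n".

```python
def split_words(body: str) -> list:
    """Model of split(replace(body, newline, ' '), ' ')."""
    return body.replace("\n", " ").split(" ")


# Made-up sample body where the URL sits on its own line.
sample = "line 1\nplease visit\nwww.xyz.jp\nthanks"

# Without the replace, the URL stays glued to its neighbours by '\n'.
print(sample.split(" "))

# With the replace, every word becomes its own array element.
print(split_words(sample))
```

If the text used "\r\n" line endings instead, a stray "\r" would remain attached to the words in this model; that case is not handled here.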

Hope this helps

 

Flow_emailN.png

 






@prisccaviana 

I just saw that this topic was marked as solved, so I guess you followed the instructions and overcame your challenge. Thanks for your kindness, and Happy Flowing!





