
Extract URL from Email body

Hi All,

 

I need some help extracting the URL from the email body below. I have a flow that converts the received email body from HTML to text, and I tried to use split to extract the URL, but it's not working. Can anyone advise on the best way to achieve this?

aanyoti1_0-1600310070861.png

 

 



Hey @aanyoti1 

You can do this with expressions, but it is a little convoluted. Review this data flow:

1.jpg

 

The URL has been extracted, the actions are:

1) Using substring to remove all content before the last instance of 'https' (the expression uses lastIndexOf)

2) Converting the remaining string content to plain text (this removes any new line chars (\n))

3) Using substring to remove the trailing content by locating the first whitespace

 

Here is the config:

2.jpg

 

And the expressions:

substring(variables('Text'),lastIndexOf(variables('Text'),'https'))

substring(outputs('Html_to_text')?['body'],0,indexOf(outputs('Html_to_text')?['body'],' '))

 

You could also consider using the Encodian 'Search Text - Regex' action which would be a lot more robust:

 

3.jpg

 

Configuration:

4.jpg

 

Regex: (?:(?:https?|ftp):\/\/|\b(?:[a-z\d]+\.))(?:(?:[^\s()<>]+|\((?:[^\s()<>]+|(?:\([^\s()<>]+\)))?\))+(?:\((?:[^\s()<>]+|(?:\(?:[^\s()<>]+\)))?\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))?

HTH
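
For anyone who wants to try the regex approach outside the flow first, here is a minimal Python sketch. It uses a deliberately simplified stand-in pattern (https?:// followed by non-whitespace), which is an assumption for illustration and not the regex quoted above, and it says nothing about how the Encodian action behaves internally. The function name find_urls is hypothetical.

```python
import re

# Simplified stand-in pattern (an assumption, not the regex from the post):
# a scheme followed by any run of non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")


def find_urls(text: str) -> list[str]:
    """Return every substring of text that looks like an http(s) URL."""
    return URL_PATTERN.findall(text)


# Sample text modelled on the email posted later in this thread.
sample = (
    "Your Approval has been requested for \n"
    "Products in the Basket: https://mydomain.12345.com/a3G5I000000bxkP\n"
    "\n"
    "Please click link above to approve or reject this record. "
)

print(find_urls(sample))
```

Because \S stops at any whitespace, including line breaks, the match ends at the end of the URL even when no space follows it.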

Thanks a lot @Jay-Encodian, I am trying this out and will update.

Hi @Jay-Encodian,

 

Looks like we are nearly there. The last Compose action still seems to include the word 'Please'. I'm not sure why; maybe because it begins on a new line? What expression should be included in the substring to have this removed?

 

aanyoti1_0-1600441370455.png

 

 

Hi @aanyoti1 

I don't think you have copied the expressions correctly.

The 'HTML to Text' action is used to remove the line breaks. The last expression then takes a substring of the 'HTML to Text' output, starting at the first character and ending at the first blank space.

You need to recheck what you have entered... can you also please post screenshots of the outputs from all the actions, as per my previous screenshot, and include the expressions you have entered?

Thanks

Hi @Jay-Encodian

 

I checked again and this is what I entered:

 

Compose - Trim Start Action:

substring(variables('Text'),lastIndexOf(variables('Text'),'https'))

Compose Action:

substring(outputs('Html_to_text')?['body'],0,indexOf(outputs('Html_to_text')?['body'],' '))

See below flow:

 

flow url.jpg

 

flow url output 1.png

flow url output 2.png

flow url output 3.png

 

 

 

@aanyoti1 Can you post a copy of the data you are trying to process... I don't think there is a space between the URL and 'Please'.

Hi @Jay-Encodian,

 

See below:

 

Hello,

Your Approval has been requested for 
Products in the Basket: https://mydomain.12345.com/a3G5I000000bxkP

Please click link above to approve or reject this record. 

Thank you!

Hi @aanyoti1 ... hmmm, I used the same data with the expressions I have already provided to you:

1.jpg

 2.jpg

 

substring(variables('Text'),lastIndexOf(variables('Text'),'https'))

substring(outputs('Html_to_text')?['body'],0,indexOf(outputs('Html_to_text')?['body'],' '))

 

Can you please click on the 'Raw outputs' from the Html to Text action... I think there is some extra data in the payload.

Hi @Jay-Encodian,

 

See below raw output of the HTML to Text:

{
    "statusCode": 200,
    "headers": {
        "Pragma": "no-cache",
        "Transfer-Encoding": "chunked",
        "Vary": "Accept-Encoding",
        "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
        "Timing-Allow-Origin": "*",
        "x-ms-apihub-cached-response": "false",
        "Cache-Control": "no-store, no-cache",
        "Date": "Fri, 18 Sep 2020 14:51:08 GMT",
        "Set-Cookie": "ARRAffinity=7007353e6908e52d8a882a0d248752a54a5a3b25dfdde97fabc8ecca38b8d51c;Path=/;HttpOnly;Domain=conversionservice-ne.azconn-ne.p.azurewebsites.net",
        "Content-Type": "text/html; charset=utf-8",
        "Expires": "-1",
        "Content-Length": "127"
    },
    "body": "https://12345.abcde.com/a3G5I000000bxkP\n\nPlease click link above to approve or reject this record. \n\nThank you!"
}
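
To see why 'Please' ends up in the result, it can help to replay the last substring step on this exact body. The Python sketch below mirrors substring(text, 0, indexOf(text, ' ')) with plain string slicing; the helper name up_to_first_space is hypothetical and the code is only an illustration of the logic, not of how the flow engine runs.

```python
# The "body" value from the raw output above, with \n as real line breaks.
body = (
    "https://12345.abcde.com/a3G5I000000bxkP\n\n"
    "Please click link above to approve or reject this record. \n\n"
    "Thank you!"
)


def up_to_first_space(text: str) -> str:
    """Mirror substring(text, 0, indexOf(text, ' ')): cut at the first space."""
    return text[: text.index(" ")]


result = up_to_first_space(body)

# repr() makes the embedded line breaks visible.
print(repr(result))
```

The URL is followed directly by two line breaks rather than a space, so the first space in the string is the one after 'Please', and everything up to that point is kept.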

 

Hi @aanyoti1 

The '\n' characters aren't being removed, and there is no space between the URL and the line break... I've adjusted the flow as follows:

a.jpg

 

Full configuration:

b.jpg

 

Expressions:

substring(variables('Text'),lastIndexOf(variables('Text'),'https'))

replace(outputs('Html_to_text')?['body'],'\n',' ')

substring(outputs('Compose_-_Replace_Chars'),0,indexOf(outputs('Compose_-_Replace_Chars'),' '))

 

You can obviously consolidate the expressions, but I've kept them separate for ease of reading... personally I wouldn't consolidate, as it just makes the flow harder to read and support in future.

HTH

Jay
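
As a cross-check of the three expressions above, here is a rough Python sketch of the same logic: trim to the last 'https', replace line breaks with spaces, then cut at the first space. The function name extract_url is hypothetical, the 'HTML to Text' action is skipped because the sample is already plain text, and the handling of a missing 'https' or a missing space is my own assumption rather than something the flow does.

```python
def extract_url(text: str) -> str:
    """Pull out the last https URL using the same three steps as the flow."""
    # Step 1: keep everything from the last occurrence of 'https'
    # (mirrors substring(text, lastIndexOf(text, 'https'))).
    start = text.rfind("https")
    if start == -1:
        raise ValueError("no 'https' found in text")
    trimmed = text[start:]

    # Step 2: turn line breaks into spaces so the URL is followed by a space
    # (mirrors the replace step).
    flattened = trimmed.replace("\n", " ")

    # Step 3: cut at the first space (mirrors substring(..., 0, indexOf(..., ' '))).
    # Assumption: if there is no space at all, the whole string is the URL.
    end = flattened.find(" ")
    return flattened if end == -1 else flattened[:end]


# The email text posted earlier in the thread.
email_body = (
    "Hello,\n\n"
    "Your Approval has been requested for \n"
    "Products in the Basket: https://mydomain.12345.com/a3G5I000000bxkP\n\n"
    "Please click link above to approve or reject this record. \n\n"
    "Thank you!"
)

print(extract_url(email_body))
```

Keeping the steps as separate statements follows the same reasoning as keeping the Compose actions separate: each intermediate value can be inspected on its own when something goes wrong.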
