New Member

Xpath with Outlook email body

Hi all,

 

I want to parse the body of my email so that I can select only the latest message rather than all the replies. I am trying to use xpath on the email body HTML, but it fails because the meta tags are not closed.

 

 

 

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta content="text/html; charset=utf-8">
</head>
<body>
<div dir="ltr">test&nbsp;</div>
<br>
<...>

 

 

There has to be a way to ignore the 2 meta lines and only parse the body but I could not find it. 

 

Here is what I tried:

 

[Screenshot: Capture.PNG]

 

where xpath is:

 

 

xpath(xml(outputs('Compose')),'//*[local-name()="body"]/*[local-name()="div"]')

 

 

Here is the error message, which points to the meta tag as the reason the XML is invalid:

Unable to process template language expressions in action 'Compose_2' inputs at line '1' and column '18960': 'The template language function 'xml' parameter is not valid. The provided value cannot be converted to XML: 'The 'meta' start tag on line 4 position 2 does not match the end tag of 'head'. Line 5, position 3.'. Please see https://aka.ms/logicexpressions#xml for usage details.'.

 

1 ACCEPTED SOLUTION

Community Support

Re: Xpath with Outlook email body

Hi @j38813,

 

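Here are the expressions I use to clean up the HTML and convert it to XML, for your reference. Please note that the Compose actions are named Compose2 and Compose3, not Compose_2 and Compose_3. The first expression goes in Compose2, and the second one, in Compose3, reads its output: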

replace(replace(split(split(outputs('Compose'),'</head>')[1],'</html>')[0],'&nbsp;','_'),'<br>','')
xpath(xml(outputs('Compose2')),'//*[local-name()="body"]/*[local-name()="div"]')
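
For anyone who wants to check the string handling outside Power Automate, here is a minimal Python sketch of the same idea. It is only an illustration and not part of the flow: the sample `html` string is modeled on the body in the question (with the truncated part left out), the helper name `clean_body` is made up, and the sketch replaces the full `&nbsp;` entity, semicolon included.

```python
import xml.etree.ElementTree as ET

# Sample body modeled on the HTML in the question (truncated part omitted).
html = (
    '<html>\n<head>\n'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
    '<meta content="text/html; charset=utf-8">\n'
    '</head>\n<body>\n'
    '<div dir="ltr">test&nbsp;</div>\n<br>\n'
    '</body>\n</html>'
)


def clean_body(raw: str) -> str:
    """Keep what lies between </head> and </html>, then patch two known problems."""
    after_head = raw.split('</head>')[1]        # drops <html> and the whole <head>
    body_only = after_head.split('</html>')[0]  # drops the closing </html>
    return body_only.replace('&nbsp;', '_').replace('<br>', '')


root = ET.fromstring(clean_body(html))  # now well-formed, root element is <body>
divs = root.findall('./div')            # same idea as //body/div in the xpath() call
print([d.text for d in divs])           # ['test_']
```

The two `split` calls never look inside the head, so any unclosed tags there simply disappear. Whatever is left in the body still has to be well-formed on its own.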

[Screenshot: Annotation 2020-04-13 130301.png]

 

Best Regards,
Community Support Team _ Lin Tu

7 REPLIES
Dual Super User III

Re: Xpath with Outlook email body

Hi!

You mean this meta tag is the cause of the error, right?

&nbsp;

Did you try removing it with a WDL expression?

 

xpath(xml(replace(outputs('Compose'),'&nbsp;','')),'//*[local-name()="body"]/*[local-name()="div"]')

 

You could also replace it with any other character that suits your requirements, for example:

 

xpath(xml(replace(outputs('Compose'),'&nbsp;','_')),'//*[local-name()="body"]/*[local-name()="div"]')
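
A quick Python sketch (an illustration only, not something that runs in the flow) shows why `&nbsp;` needs handling at all: XML predefines only five named entities (`&amp;`, `&lt;`, `&gt;`, `&apos;`, `&quot;`), so a strict parser rejects `&nbsp;`. The helper name `parses` is made up, and swapping in the numeric reference `&#160;` is an assumed alternative that keeps the non-breaking space instead of dropping it.

```python
import xml.etree.ElementTree as ET


def parses(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


raw = '<div dir="ltr">test&nbsp;</div>'

print(parses(raw))                         # False: &nbsp; is not defined in XML
print(parses(raw.replace('&nbsp;', '')))   # True: entity removed
print(parses(raw.replace('&nbsp;', '_')))  # True: entity replaced by "_"

# Assumed alternative: a numeric character reference is valid XML
# and preserves the non-breaking space (U+00A0) in the parsed text.
kept = ET.fromstring(raw.replace('&nbsp;', '&#160;'))
print(repr(kept.text))                     # 'test\xa0'
```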

 

Hope this helps

 

 



Each time you click the 'Thumbs up' icon on any of our inspiring answers...
...an Ewok escapes from the stormtroopers.

Be grateful, Thumbs up! Save the Galaxy for free!

I write about Power Automate at:
https://medium.com/anyone-can-automate/

Proud to be a Flownaut!



New Member

Re: Xpath with Outlook email body

I was talking about the two <meta> tags in the <head> section. Because they have no </meta> closing tag and do not end with />, the HTML is not valid XML.
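
To make that concrete, here is a small Python sketch (an illustration only, run outside Power Automate) that feeds both forms of the tag to a strict XML parser. The two sample strings and the helper name `is_well_formed` are made up for the example.

```python
import xml.etree.ElementTree as ET

# HTML-style head: the <meta> tag is never closed.
html_style = '<head><meta content="text/html; charset=utf-8"></head>'
# XHTML-style head: the same tag, self-closed with "/>".
xhtml_style = '<head><meta content="text/html; charset=utf-8" /></head>'


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(html_style))   # False: </head> arrives while <meta> is still open
print(is_well_formed(xhtml_style))  # True
```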

 

That being said, your suggestion is helpful. I can probably replace the two meta lines in the head with blank lines. Let's see if I can make it work...

New Member

Re: Xpath with Outlook email body

It does not look that easy after all, since different email senders could have different headers. I would need to find a way to remove the head section or to extract the first div, which is the one I am interested in.

 

At the end of the day, it is just about transforming some HTML into valid XML, but I am not sure how to do that.

Dual Super User III

Re: Xpath with Outlook email body

@j38813 

If your problem is just with the head section, you can try removing the whole head section with WDL functions:

 

 

concat('<body>',split(triggerBody()?['Body'],'body>')[1],'body>')
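
The trick in that expression is easier to see when replayed step by step. Below is a minimal Python sketch of the same split-and-concat logic, as an illustration only. The function name `extract_body` and both sample strings are made up, and the second sample shows an assumption the approach depends on: the opening tag must be exactly `<body>`, with no attributes.

```python
def extract_body(raw: str) -> str:
    """Mirror concat('<body>', split(raw, 'body>')[1], 'body>')."""
    # 'body>' matches the tail of both '<body>' and '</body>', so piece [1]
    # is the body content followed by a dangling '</'.
    inner = raw.split('body>')[1]
    return '<body>' + inner + 'body>'


plain = '<html><head><meta charset="utf-8"></head><body><div>test</div></body></html>'
print(extract_body(plain))   # <body><div>test</div></body>

# If the opening tag carries attributes, 'body>' only matches the closing tag
# and piece [1] is whatever follows </body>, so the result is not usable.
styled = '<html><body class="x"><div>test</div></body></html>'
print(extract_body(styled))  # <body></html>body>
```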

 

 

I know, it sucks. But I haven't found any decent approach to transform HTML into XML so far... I mean with Power Automate 😞

If this new suggestion doesn't help, let's hope somebody else can point us to a reasonable workaround.

 

 

 






New Member

Re: Xpath with Outlook email body

Thank you all for your suggestions. Both would solve the specific problem I reported (the two meta tags in my HTML document). However, depending on the sender's email client, other tags can also make the document invalid (e.g. <div> tags that have no closing tag in Gmail emails), so I am still tweaking the flow to deal with these scenarios.

 

  

Dual Super User III

Re: Xpath with Outlook email body


@j38813 wrote:

Thank you all for your suggestions. Both would solve the specific problem I reported (the two meta tags in my HTML document). However, depending on the sender's email client, other tags can also make the document invalid (e.g. <div> tags that have no closing tag in Gmail emails), so I am still tweaking the flow to deal with these scenarios.

 

  


Hi!

When implementing this approach in a couple of projects, I also ran into exceptions that made the document invalid. It's a bit frustrating, but I haven't found a better approach yet... I mean, other than adding another replace() each time I find a new exception.
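
One way to picture the "one replace() per exception" approach is as an ordered list of substitutions that grows over time. Here is a minimal Python sketch of that idea, as an illustration only. The names `FIXES` and `apply_fixes` are made up, and the two pairs are simply the ones that came up in this thread.

```python
# Ordered (old, new) pairs; each pair stands in for one replace() in the flow.
# These two come from the thread; new ones get appended as new exceptions turn up.
FIXES = [
    ('&nbsp;', '_'),
    ('<br>', ''),
]


def apply_fixes(raw: str, fixes=FIXES) -> str:
    """Apply each substitution in order, like nested replace() calls."""
    for old, new in fixes:
        raw = raw.replace(old, new)
    return raw


print(apply_fixes('<body><div>test&nbsp;</div><br></body>'))
# <body><div>test_</div></body>
```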

Anyway, we got some kind of happy ending, right? Not the happiest, but...

Hope this helps

 





