Anonymous
Not applicable

Parse data from text file

I need to parse data elements out of a text file whose contents are a jumble.

I should always have a start reference and an end reference around the data (words, headings, etc.). I thought an index-based approach might be best?

 

Any ideas? Below is a sample, with the required text in bold and the static text in italic.

 

TxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxt

 

A1. abcd123456     (Data length can be variable, but always has a space at end).

 

TxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxt

 

A3. A name of a a person  (Data has spaces etc in and variable length)

 

TxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxtTxt

 

1 ACCEPTED SOLUTION

Anonymous
Not applicable

This was the eventual solution.

 

TitleStart = add(indexof(string(body('Convert_Word_DOCX_Document_to_Text_(txt)')?['TextResult']),'Project title:'),14)

TitleEnd = add(indexof(substring(string(body('Convert_Word_DOCX_Document_to_Text_(txt)')?['TextResult']),add(variables('TitleStart'),10),1500),'Requested Quotation:'),10)


TitleText = replace(trim(replace(substring(string(body('Convert_Word_DOCX_Document_to_Text_(txt)')?['TextResult']),variables('TitleStart'),variables('TitleEnd')),'\r\n','')),' ','')

 

 

I converted the doc/docx to a text document and then simply parsed it as shown to get the information. All three of the expressions above are placed into variables. This method lets me change the start and end text to suit many different input document layouts.
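For anyone who wants to check the start/end marker logic outside a flow, here is a minimal sketch of the same idea in Python. It is a stand-in, not Power Automate syntax: the function name `extract_between` and the sample text are made up for illustration, and it replaces line breaks with a space rather than removing them as the expressions above do. Like indexOf, Python's `find` returns -1 when the marker is missing, so the sketch checks for that before slicing.

```python
def extract_between(text, start_marker, end_marker):
    """Return the cleaned text between two markers, or '' if either is missing."""
    start_pos = text.find(start_marker)
    if start_pos == -1:
        return ""
    # Skip past the start marker itself (the role of the +14 offset above).
    value_start = start_pos + len(start_marker)
    # Look for the end marker only after the start marker.
    end_pos = text.find(end_marker, value_start)
    if end_pos == -1:
        return ""
    raw = text[value_start:end_pos]
    # Flatten line breaks to spaces and trim the ends.
    return raw.replace("\r\n", " ").replace("\n", " ").strip()


# Hypothetical sample text, invented for this sketch.
sample = (
    "Intro text\r\n"
    "Project title: Bridge repair\r\n"
    "phase 2\r\n"
    "Requested Quotation: 1200\r\n"
)
title = extract_between(sample, "Project title:", "Requested Quotation:")
missing = extract_between(sample, "No such heading:", "Requested Quotation:")
```

Changing the two marker strings is all that is needed to adapt it to a different document layout, which is the same property the expressions above rely on.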


4 REPLIES
efialttes
Super User III

Hi!

So can you split its content using a newline delimiter? And will A1, A3 always be at the beginning of a row, with their contents inside that same row?

If so, and assuming your text file content is stored in a variable called 'myInputString', you can try the following:

split(variables('myInputString'),'
')

Now you can use Filter Array to remove all lines that do not start with A1, A3, etc.
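The split-then-filter idea can be sketched in Python to see what it keeps. This is an illustration only, not a flow definition: `lines_with_labels` is a hypothetical helper, and the sample string is a shortened version of the text in the question.

```python
def lines_with_labels(text, labels):
    """Split text into lines and keep only those starting with one of the labels."""
    lines = [line.strip() for line in text.splitlines()]
    # str.startswith accepts a tuple, so one call checks every label.
    return [line for line in lines if line.startswith(labels)]


# Shortened version of the sample from the question.
sample = "TxtTxtTxt\nA1. abcd123456 \nTxtTxt\nA3. A name of a a person\nTxtTxt\n"
kept = lines_with_labels(sample, ("A1.", "A3."))
```

This only works when each label starts its own line, which is the assumption stated at the top of this reply.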

Hope this helps

 




ChristianAbata
Super User II

Hi @Anonymous, what I would do is define a maximum number of characters for my lines. For example:

 

A1. abcd123456 has 10 and 

A3. A name of a a person has 20

 

So my maximum could be 25. Then I would use indexOf to locate my lines and substring with my maximum number of characters to get each line. Of course, this only works if each line is separated from the rest of the text, I mean on a line of its own.
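A rough Python sketch of the fixed-window idea follows, assuming a maximum of 25 characters and a label that sits on its own line. The helper name `value_after_label` is hypothetical. One difference to keep in mind: Python slicing silently stops at the end of the string, whereas, as far as I know, substring() in a flow fails when the requested length runs past the end of the text, so the length may need to be capped there.

```python
MAX_LEN = 25  # assumed upper bound on the value length, as suggested above


def value_after_label(text, label, max_len=MAX_LEN):
    """Find the label, read up to max_len characters after it, keep the first line."""
    pos = text.find(label)
    if pos == -1:
        return ""
    start = pos + len(label)
    # Fixed-size window after the label; slicing tolerates a short tail.
    window = text[start:start + max_len]
    # The window may spill into the next line, so cut at the first line break.
    first_line = window.split("\n", 1)[0]
    return first_line.strip()


sample = "TxtTxt\nA1. abcd123456 \nTxtTxt\nA3. A name of a a person\nTxtTxt"
a1 = value_after_label(sample, "A1.")
a3 = value_after_label(sample, "A3.")
```

Cutting the window at the first line break is what keeps the trailing filler text out of the result when the value is shorter than the maximum.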



v-litu-msft
Community Support

Hi @Anonymous,

 

If each data item is on its own line, you could use the split() function to separate the lines, then check whether each line contains "A1." or "A3."; if it does, pull the value out by using split() again.

To separate the lines, you could initialize a string variable that contains nothing but a line break (when you create it, just press Enter once in the value field) and use it as the delimiter.

[Screenshot: Annotation 2020-04-22 161500.png]

To pull out the string after "A1.", use the split() function on "A1." and then take the second member of the resulting array:

[Screenshot: Annotation 2020-04-22 161534.png]
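Since the screenshots are not reproduced here, the split-on-label step can be sketched in Python as a stand-in for the flow actions. The helper `value_by_split` is hypothetical, and it assumes the value ends at the next line break.

```python
def value_by_split(text, label):
    """Split on the label, take the second member, and keep its first line."""
    parts = text.split(label)
    if len(parts) < 2:
        # The label does not occur, so there is no second member to read.
        return ""
    return parts[1].split("\n", 1)[0].strip()


sample = "TxtTxt\nA1. abcd123456 \nTxtTxt\nA3. A name of a a person\nTxtTxt"
a1_value = value_by_split(sample, "A1.")
a3_value = value_by_split(sample, "A3.")
```

Checking that the split produced at least two members matters in a flow as well, because reading the second member of a one-element array fails.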

 

Best Regards,
Community Support Team _ Lin Tu
