Extract data from html table from email body in Power Automate
Use Case
I have found that there are many posts related to extract data from html table in the email body. This is the common scenario when organization tries to automate the process to increase personal productivity of users. So, I thought of sharing a solution to accomplish this task.
Scenario
Let’s take an example of getting information from partners. The users send a template to business partners in the form of html table to respond. Partners replied on the mail with the required information. Using Power Automate cloud flow, we will extract the data or required information from the business and stores those values in the excel sheet saved in SharePoint document library.
Input Source (Email)
We will extract the values for Price/PC, Currency, MOQ & Lead-time (weeks) from the email received from business partners.
Target Excel File
Master excel file sheet has been saved in SharePoint document library. The excel sheet contains a table which has unique column ‘UniqueID’. Based on the unique column, values extracted from email body will be updated in the excel file.
Solution
There are many solutions to extract the data from email like convert html to text and parsing the string, AI builder etc. But here, I will make use of fx expressions to achieve this.
1. Start by creating a new Automated cloud flow.
2. Enter the flow name & select the trigger “When an email arrives (V3)”. Press Create button to proceed.
3. Expand trigger action by right click it. When an email arrives provides various filter options to choose from and accordingly the flow will trigger. You can filter based on sender, recipient(s), subject, with or without attachment(s) etc. Here, we will filter the incoming email on subject “RQF Automated” so that the flow will be triggered on a specific email.
4. Add Compose action and write an expression to extract html table from the email body
concat('<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'</table>')),'</table>')
5. Run the flow once to get to know which tags to target for extracting the data. Depends on the column’s formatting, html tags may vary so it recommended to know beforehand. Analyze the output of compose action and note the tags that contains header of data values.
<table width=\"100%\" style=\"box-sizing:border-box; border-collapse:collapse; border-spacing:0px; font-family:SegoeUI,Lato,"Helvetica Neue",Helvetica,Arial,sans-serif; font-size:15px; font-weight:300; background-color:rgb(255,255,255)\"><tbody style=\"box-sizing:border-box\"><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Item Code</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Supplier</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">QTY</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">UniqueID</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Price/PC</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Currency</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">MOQ</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Lead-time (weeks )</strong></p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ABCDEF</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ABCDEFXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">10</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">EFGH</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">EFGHXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">20</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">4</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">HIGL</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">HIGLXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">3</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">20</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">6</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">MNOP</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">MNOPXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">4</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">40</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">8</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">QRST</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">QRSTZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">50</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">10</p></td></tr><tr style=\"box-sizing:border-box\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">UVWZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">UVWZZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">6</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">60</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">15</p></td></tr></tbody></table>
Here, we will target the rows with data columns and find the route or path (table > tbody > tr > td > p). Using this path in XPATH function, we will retrieve the data values.
5. Add another compose action, to get the values from html table using expression. In the expression, we will convert the string to xml & then apply XPATH to get values. We also convert into rows and each rows contains corresponding all column values using chunk function.
chunk(xpath(xml(outputs('Compose')),'//table//tbody//tr//td//p//text()'),8)
Pls note that as we have total 8 columns in the html table so we will use 8 in chunk function. The chunk function creates element for each row (including row) and each row contains an array of 8 elements (columns)
6. Next, add “Apply to each” action to iterate each row. As output array of compose 2 also contains header row, so we will use expression to start iterating from second element & skips the header row:
skip(outputs('Compose_2'),1)
Add “Update a row” action inside apply to each block, to update the row in excel based on the unique column.
As index starts from zero (0), so to get the value of first column (Item Code) you need to use item()?[0]. You could get the value of each row in the iteration using the indexer.
Output
When an email arrives with subject “RFQ Automated”, the flow will trigger. It extracts the values from table present in the email body and update the data in the excel file
Conclusion
So, we can now say that using fx expression we can automate the process by extracting the data from the table in the email body. This is the power of expressions in power platform. Using same concept, we could extract the data from any source as we could easily transforms JSON object to XML using expression and further apply filter using XAPTH as per the business requirement.
gopi
Hi Manish, this is good one. thanks.
How to deal with multiple tables in HTML. Lets say we have 3 tables in the HTML.
How do we detect second table under your formula. In the HTML, we have table ID when at start but closing table will not have any ID..
concat(‘<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'</table ')),'</table ')
please let me know
Manish Solanki
Hi Gopi,
I am glad that you find this useful 🙂
To deal with multiple tables, you can use ‘Select’ action that creates an array of tables present in the email body:
Expression to be used for ‘From’ parameter:
Click ‘Switch Map to text mode’ button on the right side of ‘Map’ parameter and enter the below expression in the Map textbox:
The output body of ‘Select’ action will be a string array of tables present in the email body.
You can get html code of each table using an expression.
For first table use expression:
Second table:
Third and last table:
Please try this & let me know if you face any issue.
Thank you!
dhiakeshia
Hi Manish! Thank you for this tutorial. I would like to ask for this part
“You can get html code of each table using an expression.” what kind of action is being used? is it Composed?
and also, the example you gave was for 3 tables, what about the expression when there are 2 tables only? Thank you in advance.
Manish Solanki
Hi,
You may use ‘Compose’ action or enter an expression directly in xpath function.
For two tables, you can use expression for first & last tables and ignore the second table expression.
Please try & let me know if you face any issue.
Thanks!
Jo
HiHi, can help with the function to get the latest table? on compose ” concat(‘<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'’)),”)”. I wouldnt get it right:(
Manish Solanki
Hi,
Are you getting any error in the expression for getting last table? Please share the details.
Thank you!
Adrian
Manish, this is exactly what I was looking for! However.. I’m a bit stuck 😐 I’ve followed your instructions and also included the extra steps to split out multiple tables (as per comments and help from/for dhiakeshia above).
I’m at the point where I have the table I need extracted into a ‘Compose’ (and if I save the output from this step as an html file it loads as I’d expect) but when I add in the next step to convert to XML/Chunk I get an error when running:
” Unable to process template language expressions in action ‘Compose_2’ inputs at line ‘0’ and column ‘0’: ‘The template language function ‘xml’ parameter is not valid. The provided value cannot be converted to XML: ‘Reference to undeclared entity ‘nbsp’. Line 1, position 400.’. Please see https://aka.ms/logicexpressions#xml for usage details.’. ”
If you have any time to help it would be much appreciated! Thanks 🙂
Manish Solanki
Pls use replace function to clear encode blank space i.e.  . If you are facing issue then pls share the output of compose action before extracting last html table. Thank you!
Jordan Smale
Is this the same process used when trying to extract a table from a email body and then convert the table into a csv file? I keep getting stuck where the create a csv is saying the data “from” is a string and needs to be an array. My actions are 1. “When a new email arrives (V3)” 2. “Html to text” 3. Compose to isolate table ** split(outputs(‘Html_to_text’)?[‘body’], ‘Total’)[0] ** 4. “Compose action” to split rows ** split(outputs(‘IsolateTable’), ‘\n’) ** 5. “Compose Action” rows to columns from previous compose action. 6. “Create CSV table” from previous “compose action” 7. Create file CSV/ Save to sharepoint
Manish Solanki
Please try the below formula or expression in step 4. New line feed ‘\n’ is represented as decodeUriComponent(‘%0A’) in expression:
split(outputs(‘IsolateTable’), decodeUriComponent(‘%0A’))
Marcus
Hi Manish
Thanks for providing this. I was trying it but hit an error at the point of Compose 1 which is convert to XML/Chunk.
My Compose Output is:
and I used the expression: chunk(xpath(xml(outputs(‘Compose’)),’//table//tbody//tr//td//p//text()’),6) in Compose 1, which resulted in the following error:
Unable to process template language expressions in action ‘Compose_1’ inputs at line ‘0’ and column ‘0’: ‘The template language function ‘xml’ parameter is not valid. The provided value cannot be converted to XML: ‘The ‘img’ start tag on line 1 position 147 does not match the end tag of ‘a’. Line 1, position 289.’. Please see https://aka.ms/logicexpressions#xml for usage details.’.
Manish Solanki
Hi,
It looks like html code contains an image tag that usually does not closed property or does not contain close tag. So, when we convert it to xml using expression, it will throw error. I am afraid this flow will not work here. I will check if I am able find some workaround for this specific scenario.
Thank you!