Custom AI prompt for extracting & transforming data
Use Case
In automation, many a times we come across copying data from email body to other data source like excel, SharePoint, Dataverse table etc. To extract data from HTML table & then transform it into JSON requires writing lots of complex expression and multiple action.
To resolve this issue, I have leveraged custom AI prompt that extract data from HTML table and convert it to JSON array. The JSON array can be further be used to store data in excel sheet table, csv file, Dataverse table etc.
Scenario
We will read data from the HTML table in email body and convert it to JSON format. Let’s take an example of an email with order details. We will also format currency & datetime columns.
Solution
We will divide solution into two parts: 1) Creating a custom AI prompt & 2) Desing a flow to track incoming emails
1) Creating a custom AI prompt
Navigate to “AI hub” from left navigation and click “AI Prompts” tile:
Click “Create text GPT with prompt” tile on next screen:
Go to “Prompt Settings” section and enter the prompt name as “Convert HTML Table to JSON array”. To pass the inputs from power automate flow, we need to add input parameters:
We will add 3 parameters to pass the value dynamically in the flow.
First parameter is ‘HTML Table’ that will dynamically take the email text from the trigger.
Name: HTML Table
Sample data:
<table>
<tr>
<th>Company</th>
<th>Contact</th>
<th>Country</th>
</tr>
<tr>
<td>Alfreds Futterkiste</td>
<td>Maria Anders</td>
<td>Germany</td>
</tr>
<tr>
<td>Centro comercial Moctezuma</td>
<td>Francisco Chang</td>
<td>Mexico</td>
</tr>
</table>
Second Parmeter is ‘ExcludeColumns’ that holds the comma separated columns name(s) that needs to be excluded from the output JSON array:
Name: ExcludeColumns
Sample data:Contact
Last parameter is ‘OutputStructure’ that defines the structure of JSON array return by GPT model:
Name: OutputStructure
Sample data: [{"Company":"String","Contact":"String","Country":"String"}]
Select ‘JSON’ option in the “Output” setting. This is a preview feature:
Finally, select “GPT 4” in the Settings section. Leave the default value (0) for temperature:
Now, we will enter the prompt to fetch data from HTML table and convert into JSON format:
Copy the below text in the Prompt textbox:
Create a valid JSON array of objects to list columns from given HTML table:
Exclude property in each JSON object.
Provide valid JSON response that strictly comply with below structure:
Using ‘Insert” button, insert the input variable at the appropriate location to complete the prompt:
Finally, the entire prompt will look like this:
Click ‘Test prompt’ button to see the response:
You can test the accuracy of the prompt by varying input parameters & don’t forget to click “Save custom prompt” before using in flow or an app.
We have custom AI prompt is ready so let’s put in a real use case to convert HTML table into JSON array:
2) Design a flow to track incoming emails
1. Create an automate cloud flow and choose trigger as “When a new email arrives (V3)”. Set up the parameters to apply filter on subject, sender, recipients etc. as per the need:
2. Add “Compose” action and name this action as ‘Output Structure’. Set the value of expected JSON output structure that we will pass as input parameter in Chat GPT action. While defining structure, we have also provided instruction to set correct data type along with currency formatting & converting date to ISO format:
[
{
"Order Id": "String",
"Item Id": "String",
"Item Name": "String",
"Item Description": "String",
"Quantity": "Number",
"Total Price": "String, add $ prefix if value exits",
"Shipment Date": "Datetime, convert date from M/D/yyyy to ISO format"
}
]
Add another “Compose” action & name it as ‘Exclude Columns’. Set the value as ‘Item Description’ to exclude that column in the JSON output. To return all columns in the output, set this parameter to null (as an expression):
Item Description
3. Finally, add “Create text with GPT using a prompt” action to get the response from AI prompt set up earlier. Choose the AI prompt from the drop down. Pass the email body form the output of trigger action in the ‘HTML Table’ parameter. Dynamically pass the value of both compose action in the respective parameters:
Output
Flow execution output:
[
{
"@odata.type": "#Microsoft.Dynamics.CRM.expando",
"Order_00c5e057d97da2d32bc7250e380c6217c2": "10001",
"Item_002270c9eb23790af8722ba88510e993970": "1",
"Item_00206d4918fe3e6838d2d1b1df08325e67f": "Item 1",
"Quantity": 2,
"Total_003cb3f0517ef0da8bf733893dd7f35b12": "$100",
"Shipment273d810c3170a23f04f0758777b3e573": "2024-06-02T00:00:00Z"
},
{
"@odata.type": "#Microsoft.Dynamics.CRM.expando",
"Order_00c5e057d97da2d32bc7250e380c6217c2": "10001",
"Item_002270c9eb23790af8722ba88510e993970": "2",
"Item_00206d4918fe3e6838d2d1b1df08325e67f": "Item 2",
"Quantity": 1,
"Total_003cb3f0517ef0da8bf733893dd7f35b12": "$150",
"Shipment273d810c3170a23f04f0758777b3e573": "2024-06-02T00:00:00Z"
},
{
"@odata.type": "#Microsoft.Dynamics.CRM.expando",
"Order_00c5e057d97da2d32bc7250e380c6217c2": "10002",
"Item_002270c9eb23790af8722ba88510e993970": "3",
"Item_00206d4918fe3e6838d2d1b1df08325e67f": "Item 3",
"Quantity": 10,
"Total_003cb3f0517ef0da8bf733893dd7f35b12": "$50",
"Shipment273d810c3170a23f04f0758777b3e573": "2024-06-10T00:00:00Z"
}
]
In the output JSON array, we can verify:
- Quantity column is set as number type
- $ symbol is added for ‘Total Price’ column to look like currency
- Datetime column is transformed in ISO format
- ‘Item Description’ column is excluded from the output JSON array
BUT, I wonder why columns name are being modified when there is space or special characters like /, underscore etc. in the name. At the same time the columns are correctly returned when there is no space in its name.
Conclusion
Following are the take aways:
- Custom AI prompt can be used for data extraction and data transformation
- These Custom AI prompt can be reused across organization for the repeatable tasks
- Using instruction, we can set the data type (string, number, datetime) for a column
- We can transform column values by adding extra instructions like converting string to currency type, convert dates to ISO format etc.