Topic: Mail merge / serial letters: How to convert HTML codes to formatted text  (Read 51589 times)

We are using Microsoft Office and its mail merge function to create the book of abstracts from the ConfTool Pro XLS export file. We used the WYSIWYG online editor as input form for the abstract. In the XLS file, the abstracts are exported as plain text with tags.
How can we use the code / the tags in the abstracts to create formatted documents in the resulting Word file?
Can Word be used to parse the HTML code?

Unfortunately the technology Microsoft Office uses to create a mail merge can only transfer unformatted text/data from the data source to Word, therefore such processing cannot take place during the mail merge process itself.

However, you can use advanced search and replace functions or corresponding VBA macros to do the conversion after the merge process.
You can find an example file below with a VBA macro that replaces HTML codes with bold, italics, super- and subscript text and removes all paragraph elements. However, it cannot handle bullet lists; these have to be replaced manually. Please note: There is usually a warning message when you open the file because of the VBA macro. We can assure you that the macro is save and poses no threat.

Hints:
  • First, create a serial letter based on the data of the XLS file in Word. Then create single pages by using the mail and merge function:
    Mailings => Mail & Merge => Edit Individual Documents
  • To run the VBA macro, navigate to the tab "View" and then click on Macros => View Macros. In the pop-up window that now appears, select the macro "HTMLreplace" and click on "Run".
  • Alternative for advanced users: Open the VBA macro editor by pressing CTRL-F11 (Option F11 on OSX) or by selecting the tab "Developer Tools" (this tab might have to be enabled first) and then clicking on the icon for Visual Basic.
    To run the macro, click on the green "Play" arrow or press F5. Then select the macro "HTMLreplace".
  • The macro uses regular expression search like \<i\>(*)\</i\> to apply italic formatting to the found text. The replacement text is \1 for the matched text in the brackets (*).
    You can find more information here:
      Find and replace text by using regular expressions (Advanced)
  • A more detailed example of how to use regular expressions to search and replace in order to change the formatting of authors can be found here:
      Mail merge / serial letters in ConfTool: Problems with the formatting of numbers, dates, times and text
  • More information on the use of VBA macros to replace HTML elements can be found here:
      HTML-Tags durch Zeichenformattierung ersetzen (in German)
      VBA: Automatically format specific text in Microsoft Word