Converting an XML file to a Mail Merge data source format

XML is a new buzz word in information technology. It's become the standard format for exchanging data in the Internet, much as delimited text files were the standard five or ten years ago. What make both XML and delimited files useful as standards is that both consist of plain text, that can essentially be read by any program. The main difference between the two is that XML allows a much more flexible data structure. A delimited text file is limited to the traditional, "flat table" concept, as shown in Figure 1. Each record is contained in a paragraph, and the fields are separated by a character, such as a comma in this example.

Figure 1
Example of a delimited text file

XML can hold any type of data, in any kind of combination. This very flexibility is what makes it unsuited as a mail merge data source. Word's mail merge expects data to be presented as a flat table. And even with the XML support added to Word in version 2003, this requirement hasn't changed. XML in Word is aimed primarily at developers, with the purpose of document generation, or extracting document content.

So, how can this limitation be overcome? By using Excel as a "go-between"; data manipulation falls under Excel's purview. Simply use its File/Open command and select the XML file.

Figure 2 is the same Employees table as in Figure 1, in XML format. (Both were exported from the Access Northwind.mdb sample database's Employees table.) As you can see, the fields are marked with . There is an hierarchical structure, but it needn't be as strictly enforced and as clear-cut as in a delimited text file. (If you're interested in learning more about XML, you can find more information in the Office section of the msdn site at microsoft.com, as well as in various books on XML in Office 2003.)

Figure 2
Example of a simple XML file

XML support was introduced in Excel 2002 and extended in Excel 2003. Since in Excel it is aimed at data import and manipulation, Excel can help get the data into a format Word's mail merge can work with. Figure 3 shows how Excel 2003 interprets the XML file: exactly as it's required for a mail merge in any version of Word. Just save it as an Excel spreadsheet, and you're ready to roll.

Figure 3
How Excel 2003 opens the simple XML file

If you have Excel 2002 it can also open XML files, but won't interpret them as intuitively as the later version; you may have to do some manual clean-up work, first. Excel 2002 will open any well-formed XML file and try to interpret it in a flat-file format. The result for the Employees example is shown in Figure 4. As you can see, Excel tries to include any and all information about the data element "paths" and sorts the data fields (elements) alphabetically. But there's nothing to stop you from deleting the first row, then using Edit/Replace to replace "/Employees/" with nothing. You can also delete any columns you aren't interested in using in the mail merge, and change their order.

Figure 4
How Excel 2002 opens the XML file

Excel 2003 can go a step further and, if it can't discern a flat table format from the file content, allow you to choose fields and add them to an Excel spreadsheet. When you open an XML file in Excel 2003, the dialog box "Open XML" is displayed (Figure 5).

Figure 5
Choose how Excel opens the XML file

Select the option "Use the XML Source task pane" and Excel will provide you with a list of elements, as shown at the right in Figure 6. Simply drag the elements to the spreadsheet columns where you want to display the data (list). Use Ctrl+Click to select more than one element at a time. Click the "Refresh XML data" button in the List toolbar to fill the data into the list. From this point on, you can work with the data as in any Excel spreadsheet.

Figure 6
Choose how Excel opens the XML file

Open/Download the sample text file, XML file and Excel 2003 file.