Every so often the tech news community lights up about a gaffe related to document metadata. Some years ago Apple was running a fairly successful switch campaign where people gave testimonials about why they switched to a Mac. Microsoft responded with its own anti-switch campaign. The name of the person in the Microsoft testimonial was not given in the ad but was included in the document’s metadata. An AP reporter was able to track her down and discovered that, much to Microsoft’s embarrassment, she worked for a PR firm employed by Microsoft. To make matters worse, the picture in the testimonial was a fake, taken from a stock photo collection. Microsoft quickly pulled the ad from its site and pretty much abandoned the anti-switch campaign.
More recently, the United Nations prepared a report on the murder of Rafik Hariri, the former Lebanese Prime Minister. Some of the more damaging allegations were removed just prior to the report’s release, but they remained in the document as metadata. These politically sensitive deleted portions were quickly discovered and publicized, to the UN’s embarrassment.
For most practical purposes, “metadata” refers to hidden information kept by Microsoft Word as part of a saved *.doc file. The most common type of metadata is information about the people who created or edited the document. Just pull up a Word document and go to File | Properties. You should be able to quickly find the name and company of the author. This is the type of metadata that caught Microsoft.
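To see how little effort it takes to pull this information out without even opening Word, here is a minimal Python sketch. It rests on an assumption the discussion above does not make: that the file is in the newer *.docx format (a ZIP package of XML parts) rather than the binary *.doc format, which needs a dedicated parser. The function name `read_author_metadata` is made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the property parts of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def _read_xml(package, member):
    """Parse one XML part of the package, or return None if it is absent."""
    try:
        return ET.fromstring(package.read(member))
    except KeyError:
        return None


def read_author_metadata(path):
    """Return the author-related properties stored in a .docx file.

    Hypothetical helper for illustration; `path` may be a file name or a
    file-like object. Assumes .docx, not the legacy binary .doc format.
    """
    with zipfile.ZipFile(path) as package:
        core = _read_xml(package, "docProps/core.xml")
        app = _read_xml(package, "docProps/app.xml")

    found = {}
    if core is not None:
        found["creator"] = core.findtext("dc:creator", default="", namespaces=NS)
        found["last_modified_by"] = core.findtext(
            "cp:lastModifiedBy", default="", namespaces=NS
        )
    if app is not None:
        found["company"] = app.findtext("ep:Company", default="", namespaces=NS)
    return found
```

Calling `read_author_metadata("testimonial.docx")` (a hypothetical file name) returns a small dictionary with whatever creator, last editor, and company values the file still carries.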
The UN situation was a bit different. They had enabled Word’s ability to track revisions, because the document was being edited by multiple people. The author forgot to accept the changes, thus making the original draft and the full revision history available to those “in the know.”
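Being “in the know” does not require much. As a rough sketch, and again assuming the newer *.docx format rather than the binary *.doc files discussed here, a few lines of Python can list text that was deleted with revision tracking turned on but is still stored in the file. The function name `find_tracked_deletions` is hypothetical, and the sketch only looks at the main document body, not headers, footers, or comments.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML vocabulary in a .docx package.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_tracked_deletions(path):
    """Return (author, deleted_text) pairs still stored in a .docx file.

    Hypothetical helper for illustration; `path` may be a file name or a
    file-like object. Only word/document.xml is inspected.
    """
    with zipfile.ZipFile(path) as package:
        body = ET.fromstring(package.read("word/document.xml"))

    deletions = []
    # Tracked deletions are wrapped in <w:del>; the removed text itself
    # lives in <w:delText> elements inside that wrapper.
    for deleted in body.iter(f"{{{W}}}del"):
        author = deleted.get(f"{{{W}}}author", "")
        text = "".join(t.text or "" for t in deleted.iter(f"{{{W}}}delText"))
        if text:  # skip markers that carry no text, e.g. deleted paragraph marks
            deletions.append((author, text))
    return deletions
```

An empty list from this check means only that no tracked deletions were found in the document body. It says nothing about the other kinds of metadata a file may carry.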
Anyone in a business or professional environment needs to be aware of document metadata—the potential for damage is just too high. The following are some ways to properly deal with metadata:
- Use the Office add-in provided by Microsoft, or (recommended) purchase a commercial “scrubber”. There is also a free utility, Doc Scrubber™, that works pretty well.
- Save the file in the RTF format and then convert it to PDF for distribution. (You should be doing this anyway—distributing non-draft versions of *.doc files can bite you.) Be aware that Adobe Acrobat also retains some metadata, so just converting to PDF may not be enough.
- Do not rely on turning off the “track changes” feature or selecting “accept changes”. Neither step, alone or combined, is sufficient to remove your metadata.