- What is Document Detective?
Document Detective is an interactive desktop application that works with Microsoft Office to expose and remove hidden
data that could compromise sensitive information. Document Detective also scans Adobe Portable Document Format (PDF) and HTML files
for hidden content, but it cannot remove the hidden content from these file types.
[Table of Contents]
- Why do I need it?
To make their applications user friendly, developers collect information in anticipation of the user's
future needs and store the information in the application's data files. This happens automatically without
the user's knowledge or consent. ManTech International Corporation has discovered
proprietary and classified information in unprotected files retrieved from email and the Internet.
Federal law requires certain types of information to be protected, and there have been numerous
incidents in the press where a company's reputation or financial status has been affected by
information that was accidentally exposed by hidden data in electronic documents.
See our webpage on Government Policy
See our webpage on published incidents
- What does it cost?
Document Detective is a retail product that is sold for $300.00 per license. Volume pricing and Enterprise pricing
are available for quantities over 50 licenses. There is a GSA contract price for authorized US Government procurement. Please
see our pricing page for more details.
- What versions of Microsoft Office are supported?
The current release of Document Detective (v3.1) works with Microsoft Office XP, 2003, and 2007. The current release will scan documents saved in the new Office 2007 format using Office 2003 if the appropriate conversion filters are installed.
Document Detective should NOT be installed with earlier versions of Office.
- What are the Ad Hoc Review and Fast Save warnings?
Document Detective checks for two very dangerous features called the Ad Hoc Review and the Fast Save. Although these
Microsoft features have no known benefit, we did not want to make changes to the system without the user's
permission. The following dialog boxes will likely appear when opening Microsoft Office after installing
Document Detective.

The Ad Hoc Review feature automatically adds tracking properties to an electronic document and enables Tracked Changes without warning the
user when a Microsoft Word, PowerPoint, or Excel document is attached to an Outlook email. This feature was
turned on by default in Office XP and some versions of Office 2003.
Fast Save was a feature to reduce the time required to save a document when we were working with slow media,
like floppy diskettes. Deleted information in a Fast Saved file can be recovered. Word and PowerPoint both have
a Fast Save feature. The Fast Save feature is still turned on by default in most versions of PowerPoint.
- What user training is available for Document Detective?
There are several sources of training available. The Document Detective installation includes a computer
based training package that is listed in the Start menu tree.
We also offer classroom training and on-site training. Please contact
Ron Hackett for information on these training opportunities. Several online training courses are in development.
Additional resources are listed below.
- How do I get Technical Support?
Web based technical support is available 24/7 at the Document Detective Technical Support website.
If you cannot find the answer to your problem in our knowledge base, please use the Technical Support Contact
form on the website. If the website is not available, contact
Tech Support via email.
- Doesn’t Microsoft Office 2007 fix the hidden data problem?
The Document Inspector feature that was added to Microsoft Office 2007 is a reworked version of the free
Remove Hidden Data (RHD) plug-in that was available from Microsoft for Office 2003. Like most commercial metadata solutions, the Document Inspector
is a partial solution that fails to fix all of the hidden data issues in Microsoft Office. The new Open XML file formats
are neither open nor XML. The format is basically the same as the output of the Save As Webpage feature found in previous
versions of Office, except that the result is archived using PKZip to conserve file space. Both the webpage format
and the Open XML formats contain proprietary binary data, and both support most of the hidden data features of the
Office 97-2003 format. In fact, the new Open XML format may be less secure, since anyone can now open and tamper
with the file contents without using any Microsoft Office applications.
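The point that anyone can open an Office 2007 file without Microsoft Office can be illustrated with a short sketch. It relies only on the fact that an Open XML file is a ZIP archive. The function names and the sample part contents are illustrative assumptions, and nothing here is part of Document Detective.

```python
import io
import zipfile


def list_package_parts(source):
    """Return the names of the parts stored inside an Open XML package.

    `source` is a path or a binary file-like object for a .docx, .xlsx,
    or .pptx file.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(package.namelist())


def build_sample_package():
    """Build a tiny stand-in archive in memory (hypothetical part contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("docProps/core.xml", "<coreProperties/>")
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_package_parts(build_sample_package()):
        print(name)
```

Passing the path of a real .docx, .xlsx, or .pptx file to `list_package_parts` lists its parts in the same way, with no Office application involved.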
- Are there any Unix/Linux versions of the Document Detective?
Unfortunately, the quick answer is no. Document Detective leverages the Microsoft Office libraries and the Windows API to process Microsoft Office documents. There are open source implementations that attempt to mimic these libraries, but the developers will admit that they are not complete. We did examine all of the known open source implementations, and they all came up short of finding ALL of the information in an electronic document.
Microsoft did release their file format specifications a few months ago, but we've had them for years under a special license agreement. Our lead programmer described the Microsoft documentation as, “incomplete, inaccurate, and downright misleading.” We don’t think Microsoft could reproduce the libraries from their specification.
Leveraging the Microsoft libraries allows us to harness the same power Microsoft uses to create the documents and the hidden data issues, but it also binds us to the Microsoft platform. We are looking into virtualization as a potential solution that could free us from hardware and host operating system constraints in the future.
- Does Document Detective have a Certificate of Networthiness (CoN)?
The Missile Defense Agency (MDA) approved a Certificate of Networthiness (CoN) for Document Detective v3.0 in April 2008, and updated the CoN for v3.1 in March 2009. The CoN is posted on the MDA portal. If you do not have access to the MDA portal, then we can provide contact information for this CoN upon request. Send your request to Ronald.Hackett@ManTech.com.
- Does Document Detective replace Buster, Flush, and the CompuSec Toolbox?
Document Detective will scan any file for keywords, so it does replace Buster. Document Detective will warn you when the file type is not recognized, which means the scan is not reliable. Buster is also unreliable in these circumstances, but it may not warn you.
The requirement for Secure Copy and Flush is a bit more complicated, because it is governed by outdated institutional policies. Technically speaking, Secure Copy and Flush are not required as long as you are using a Windows NT based operating system with NTFS formatted media. Unfortunately, we have not been able to get the Government to recognize this or to establish updated policies. You will have to go by your organization's requirements.
Please see our knowledge base article on this topic for more information.
- What happened to the DSS Trusted Downloading Products List?
The Trusted Downloading Products List got lost in one of the many DSS website upgrades, but it has never been rescinded. Unfortunately, the two individuals responsible for that list have retired. We plan to reengage DSS soon to see if we can get the list reconstituted.
- Can Document Detective be hosted on a thin client?
In general, if Microsoft Office will run in your environment, then Document Detective will run in your environment. Document Detective has already been sequenced (virtualized) and demonstrated on the DoDIIS Trusted Workstation (DTW) for thin client users at the Northeast RSC (NERSC) in a successful proof-of-concept for a small user base. This demonstration leveraged SoftGrid/App-V, but could easily have been done with competing technologies like Altiris or Symantec SVS. We are working on a version of Document Detective that is optimized for thin client operations, and we are considering license agreements compatible with this environment.
- Does Document Detective have a concurrent license agreement?
No, we do not offer a concurrent license agreement. Just like Microsoft Office, you will need one license for each workstation that will be using the software.
- Is Document Detective Federal Desktop Core Configuration (FDCC) compliant?
Yes, Document Detective has been tested using the Oct 31, 2008 specifications from the National Institute of Standards and Technology (NIST), and is FDCC compliant. Document Detective does not modify any system settings, it follows standard installation practices, and it does not require any elevated user privileges to operate.
- Will Document Detective work on a 64-bit operating system?
Document Detective is not compiled for 64-bit machines, but a 64-bit machine should be capable of running 32-bit processes. Microsoft Office 2007 is a 32-bit process according to the Microsoft Website, but it will run on a 64-bit operating system. In general, if Microsoft Office will run on the platform, then Document Detective will run. The only exception we know about is Windows 2000. We use an API that wasn't implemented correctly on Windows 2000. There appears to be a work-around, but Microsoft doesn't support Windows 2000 anymore, so we made the decision not to support it either.
- Will Document Detective work on an Apple Macintosh computer?
No. The Office Automation interface is significantly different on an Apple Macintosh. There has not been sufficient demand for a Macintosh version of Document Detective, so we have not allocated the resources to port the product to this environment.
- What is the relationship between ManTech International, SRS Technologies, and NeXolve?
SRS Technologies developed Document Detective, which was first released to the public in April 2005. ManTech International acquired SRS Technologies in May 2007 and created a wholly owned subsidiary called ManTech SRS Technologies. On 1 January 2009, NeXolve Corporation split from ManTech SRS Technologies to resolve a conflict of interest issue with another line of business. NeXolve is also a wholly owned subsidiary of ManTech International. Document Detective is still marketed as a ManTech International product, but SRS Technologies and NeXolve may still appear in some of our paperwork.
- Are license keys and active registration required?
Beginning with v3.1, Document Detective does require an activation code. The activation code is not intended for tracking or auditing purposes, and it does not require Internet access. The activation code just makes distribution easier. The activation code tells the software what license model to implement, so we can easily switch from an evaluation copy to a subscription or perpetual license just by supplying a new activation code. The activation code includes a case-sensitive Organization or User Name and a 17-character code. If you are installing multiple copies of Document Detective, you can save this information to a text file called License.key in the same folder with the setup program to automatically activate the software during the installation. Contact Technical Support for instructions.
- Can Document Detective be pushed using systems management software?
Yes, Document Detective is a simple installation that can be pushed using technology like Microsoft's Systems Management Server (SMS). There are two command line switches available to support network installations.
These switches only apply to the Spoon Installer packaged version of Document Detective. The general distribution
is created using InstallShield, which does not support a silent operation. Contact Technical
Support to obtain the Spoon Installer package.
--silent suppresses all dialogs and installs without user intervention
--installto allows you to change the installation folder
Example: c:\setup.exe --silent --installtoC:\A Folder\Another Folder\
- What changes are made by the installation?
Many network administrators want to know what changes Document Detective will make to their systems, and
what information is left behind after uninstalling the software. The following document describes the Document
Detective 3.1 install and uninstall processes, and lists the residual information remaining after an uninstall.
Document Detective Installation Description
- Why doesn’t the Slide Name match the slide index?
Microsoft automatically assigns many objects a name for easy reference by programs. These names generally include information
about the object type and a number that indicates the order that the objects were created. In the case of PowerPoint Slides, the
object type (i.e. Slide) and a number are used to name Slides.
The following example was taken from the Demo Presentation document that is included with the standard Document Detective
installation. The second Slide in this presentation (Slide Index = 2) is named Slide9, which indicates it was the ninth Slide
created in this presentation and could have been created by inserting a new Slide after the first Slide in an eight Slide
presentation.
Slide 2
    Slide        Name = Slide9
    Background   Alternative Text = <empty>
    Background   Name = Rectangle 1
    Slide        Transition Sound Effect = <empty>
    Slide        ID = 264
    Slide        Index = 2
    Slide        Number = 2
The Slide Name is a text field that can be altered to include sensitive information. Some document management systems will use
this feature to tag Slides, and the information they insert could be sensitive. Microsoft Office actually provides the user with
an interface to change many object names, so it is important to check the Slide Name property.
Slide Number is the page number displayed on the Slide, and can be different from the Slide Index if an offset is specified in the
Slide Show Setup. The Slide ID is a unique number that Microsoft uses to differentiate the Slide from all other Slides. The
Slide Name, Index, and Number can all change, but the Slide ID does not change within the presentation.
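The relationship between these four properties can be sketched with a toy model. This is not the PowerPoint object model or Document Detective code. The class names, the starting ID of 256, and the `first_slide_number` offset are assumptions chosen so that the sketch reproduces the Demo Presentation values shown above.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    name: str      # editable text field, assigned when the slide is created
    slide_id: int  # fixed for the life of the slide


@dataclass
class Presentation:
    first_id: int = 256          # assumed starting ID for this sketch
    first_slide_number: int = 1  # assumed offset setting for displayed numbers
    slides: list = field(default_factory=list)
    created: int = 0             # how many slides have ever been created

    def insert_slide(self, index):
        """Create a slide and place it at the 1-based position `index`."""
        self.created += 1
        slide = Slide(f"Slide{self.created}", self.first_id + self.created - 1)
        self.slides.insert(index - 1, slide)
        return slide

    def describe(self, index):
        """Report the four properties for the slide at position `index`."""
        slide = self.slides[index - 1]
        return {
            "Name": slide.name,
            "ID": slide.slide_id,
            "Index": index,
            "Number": index + self.first_slide_number - 1,
        }


# Build an eight-slide deck, then insert a ninth slide after the first one.
deck = Presentation()
for position in range(1, 9):
    deck.insert_slide(position)
deck.insert_slide(2)
print(deck.describe(2))
```

In this model the Name and ID are fixed at creation time, while the Index and Number are recomputed from the slide's current position, which is why the second slide can carry the name Slide9.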
- Why do we need a Flatten Hash code?
Passing the document flattening information from the Content Controls to the Content Browser isn’t easy, especially
if the document is processed on multiple workstations. We resolved this problem by inserting the flatten document information
into the Custom Document Properties container that Microsoft provides. The information includes the switch settings when the
document was flattened and the date and time the document was flattened.
Custom Document Properties are just simple text fields that can be altered by the user, and we needed a way to ensure these
properties were not altered. We also needed a way to expire the flatten document information if the document changes. We
accomplished both with the Flatten Hash code. The Flatten Hash code is unique to Document Detective. No other program can
create this hash, so the Content Browser can check to make sure the flatten information was not altered.
The Content Browser generates a standard MD5 hash for the document, and the Transfer Package and the Transfer Archive are
both signed to protect the document, but this information is not applied until after the document has been processed by the
Content Browser or the Checklist Viewer. The Flatten Hash could be removed at this point, but we left it in place as a
potential redundancy check.
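The idea of a hash that both protects the flatten properties and expires when the document changes can be sketched with a keyed hash. The actual Flatten Hash algorithm is not published, so everything below is an assumption: the use of HMAC-SHA256, the secret key, the field layout, and the function names are all hypothetical. The sketch also treats the document content as separate from the properties that store the hash.

```python
import hashlib
import hmac

# Assumption: a secret known only to the application that writes the hash.
SECRET_KEY = b"hypothetical-application-secret"


def flatten_hash(document_bytes, settings, flattened_at, key=SECRET_KEY):
    """Keyed hash over the flatten properties and the document content."""
    content_digest = hashlib.sha256(document_bytes).hexdigest()
    message = "|".join([settings, flattened_at, content_digest]).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def flatten_info_is_valid(document_bytes, settings, flattened_at, stored_hash,
                          key=SECRET_KEY):
    """True only if neither the properties nor the document have changed."""
    expected = flatten_hash(document_bytes, settings, flattened_at, key)
    return hmac.compare_digest(expected, stored_hash)


def document_md5(document_bytes):
    """Standard MD5 digest of the whole document, as a hex string."""
    return hashlib.md5(document_bytes).hexdigest()


if __name__ == "__main__":
    document = b"flattened document content"
    stored = flatten_hash(document, "switches=all", "2009-03-01T12:00:00")
    print(flatten_info_is_valid(document, "switches=all",
                                "2009-03-01T12:00:00", stored))
```

Because the key is needed to produce a matching value, a user who edits the stored settings or timestamp cannot recompute the hash, and any change to the document content invalidates it as well.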
- Why are Cells in Excel referenced with a single index?
Our Excel parser was the last of the Microsoft Office parsers to be developed. By the time we got to Excel, the convention of
addressing objects by a name and a single index had already been well established, and our navigation tree did not support a
double index. The programmers decided to continue with the single index convention to number the Cells, and then the location
information (row and column) was included in the Cell properties.
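The single index convention can be sketched as follows. The order in which Document Detective numbers Cells is not stated, so the row-major ordering, the function names, and the property names here are assumptions for illustration only.

```python
def column_letters(column):
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while column > 0:
        column, remainder = divmod(column - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def index_cells(cells):
    """Number populated cells with a single index.

    `cells` maps (row, column) to a value. The row and column are kept as
    properties of each entry. Row-major order is an assumption.
    """
    indexed = []
    for index, (row, column) in enumerate(sorted(cells), start=1):
        indexed.append({
            "Index": index,
            "Row": row,
            "Column": column,
            "Address": f"{column_letters(column)}{row}",
            "Value": cells[(row, column)],
        })
    return indexed


if __name__ == "__main__":
    sample = {(2, 1): "total", (1, 3): "Q3", (1, 1): "Region"}
    for cell in index_cells(sample):
        print(cell)
```

Each Cell is addressed in the tree by one number, and the row and column can still be read from its properties.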
- Why can’t Shapes be extracted a second time with Content Browser?
The Content Browser has a hidden copy of the document open for reference. When you select a Shape in the navigation tree,
the Content Browser also selects that Shape in the hidden document, just as you would select it yourself. When you extract
the Shape, the Content Browser uses that selection to make a copy of the Shape before it is scanned. The process of
extracting the object cancels the selection in the hidden document, so the Shape is no longer selected after the Shape has been
extracted. Clicking the Extract button again will cause an error, because there is no selection.
The workaround for this problem is to select another Shape in the navigation tree and then select the desired Shape again. This
resets the selection in the hidden document, which allows the extraction process to work again for that Shape.
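The behavior described above can be modeled in a few lines. This is a toy model, not the Content Browser implementation. The class names are hypothetical, and the rule that the tree only mirrors a selection when it changes is an assumption inferred from the workaround.

```python
class HiddenDocument:
    """Stand-in for the hidden copy of the document."""

    def __init__(self):
        self.selection = None

    def extract(self):
        """Copy the selected Shape. Extraction cancels the selection."""
        if self.selection is None:
            raise RuntimeError("nothing is selected in the hidden document")
        shape, self.selection = self.selection, None
        return f"copy of {shape}"


class NavigationTree:
    """Stand-in for the navigation tree that drives the hidden document."""

    def __init__(self, document):
        self.document = document
        self.current = None

    def select(self, shape):
        # Assumption: the selection is mirrored only when the tree node changes.
        if shape != self.current:
            self.current = shape
            self.document.selection = shape


if __name__ == "__main__":
    document = HiddenDocument()
    tree = NavigationTree(document)
    tree.select("Rectangle 1")
    print(document.extract())   # first extraction works
    tree.select("Oval 2")       # workaround: select another Shape...
    tree.select("Rectangle 1")  # ...then the desired Shape again
    print(document.extract())   # extraction works again
```

Calling `extract` twice in a row without the two `select` calls in between raises the error, which mirrors the failure seen when the Extract button is clicked a second time.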