There has been ongoing debate in information governance and e-discovery circles on the significance of documents that do not contain searchable text, with evidence that half or more of the documents in some collections cannot be analyzed or managed because the tools used for those purposes require textual representations. How important is this limitation in […]

Calculating MTV Ratio and True Recall

Many tools designed to search or classify documents as part of the enterprise content management and electronic discovery functions in organizations depend on having accurate textual representations of the documents being analyzed or indexed. They have text-tunnel vision – they cannot “see” non-textual objects. If the only documents of […]

The Emperor has No Clothes – and PC Can’t See Image-Only Documents

There are several parallels between predictive coding (AKA technology-assisted review) and Hans Christian Andersen’s tale, “The Emperor’s New Clothes.” In the story, two weavers tell the emperor they will make him a suit of clothes that will be invisible to those people […]

Without the right tools, even basic information governance tasks can be difficult. The most glaring example is document classification, which is the bedrock upon which virtually all information governance initiatives rest. If you can’t accurately classify an ever-increasing volume of documents and correspondence, you can’t apply the correct retention schedules, you can’t specify which […]

Technology-Assisted Review (“TAR”) for e-discovery processing has received a fair amount of favorable publicity over the last several years, with extensive claims of statistically sound measures such as precision, recall, fallout, and F-measure. What may not be explicitly stated is that the “Technology” in TAR is limited to some type of textual analysis, i.e., TAR […]

Most of us have heard about the parable of the six blind men and the elephant – it may actually be the first recorded instance of faceted classification. Six blind men touched different parts of an elephant and each described a completely different thing based on their own perspective or “view” of the elephant: The one […]

There are profound differences in the capabilities of a glyph-based document processing engine compared to legacy optical character recognition (“OCR”) systems. From a process efficiency viewpoint, OCR treats each potential character as a fresh recognition task, meaning that even if precisely the same pattern of pixels had already been recognized, that same pattern will be put […]

Summary: Evidence that plaintiffs and their attorneys withheld information from Garlock Sealing Technologies, LLC, and filed inconsistent claims in other asbestos cases persuaded a bankruptcy court to cut the mesothelioma plaintiffs’ requested estimate of Garlock’s aggregate liability by $1 billion. Without the use of BeyondRecognition’s BeyondRedaction technology, Garlock would not have been able to obtain as […]

Document-type taxonomy systems should be consistent in document classifications, be complete in accounting for both records and non-record documents, and remain timely as new document types are encountered and old document types evolve over time. This posting discusses how visual classification technology meets those three criteria. Consistency To be useful at all, taxonomy systems should […]
