The report focused on tools that organizations can use to manage what Gartner calls “unstructured data,” and covered products such as HP – ControlPoint, IBM – StoredIQ, Nuix – Luminate, Symantec, and ZyLab. Of all the products and companies analyzed, only BR uses visual classification to unify the treatment of scanned and native electronic documents. Many of the other products appear to rely solely on existing metadata and to analyze only electronic files.
The Guide included these comments on BR:
“BeyondRecognition supports information governance and legacy information cleanup of scanned or native electronic documents with visual classification, document attribute extraction, and deduplication. Their products can collect and act upon metadata from virtually any system, while supporting content tagging and migration. The system can provide data for any visual dashboard the client uses or would like to use, but also supports its own visual reporting. BeyondRecognition’s visual clusters are self-forming, and so when new document types enter the workflow, operators are alerted, enabling the policies and procedures around document classification to be continuously updated.”
Without the proper tool sets, scanned documents and native electronic files may indeed appear “unstructured,” but they are actually fairly structured within clusters of visually similar documents. By basing clustering on visual appearance, BR is able to normalize the treatment of scanned documents and native electronic files. See the Document U blog posting, “Documents ARE Structured – Just Heterogeneously.”
BeyondRecognition founder and CEO John Martin noted, “One of the big goals in remediation efforts is to identify and remove duplicates, and BeyondRecognition is the only technology that can identify visual duplicates, i.e., documents that are visually identical, even if some of them have no associated text, or some of them are scanned images. This is all made possible by BR’s unique global glyph catalog.”