Document-type taxonomy systems should be consistent in document classifications, be complete in accounting for both records and non-record documents, and remain timely as new document types are encountered and old document types evolve over time. This posting discusses how visual classification technology meets those three criteria.
To be useful at all, taxonomy systems should provide replicable results – if the same documents are classified multiple times, the classifications should be the same. Classifications based on visual appearance are far and away more consistent than those made under any other approach. Not only is visual classification more accurate than manual designations, it also works on documents that have little or no text – the comparison is done much like facial recognition and does not use a textual analysis. The result: even image-only documents or documents with poor quality text can be classified accurately and consistently, whether they be native files, scanned images, or faxed images.
When BR has visually classified documents that had previously been classified by other approaches, our clients invariably find that their earlier approach had been very inconsistent.
One of the best things about BR is that the document clustering or classification works without any significant upfront work by the client. It is automatic, and review and classification can begin within just days of the start of the project.
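The replicability test described above can be sketched in a few lines of Python. This is only an illustration of the idea, not BR's method: the function name `agreement_rate`, the document IDs, and the labels are all hypothetical, and the sketch assumes each classification pass is available as a mapping from document ID to document type.

```python
def agreement_rate(first_pass, second_pass):
    """Fraction of shared documents that received the same label in both passes."""
    shared = first_pass.keys() & second_pass.keys()
    if not shared:
        raise ValueError("the two passes have no documents in common")
    same = sum(1 for doc_id in shared if first_pass[doc_id] == second_pass[doc_id])
    return same / len(shared)


# Hypothetical document IDs and labels, for illustration only.
run_a = {"doc1": "invoice", "doc2": "contract", "doc3": "invoice", "doc4": "fax cover"}
run_b = {"doc1": "invoice", "doc2": "contract", "doc3": "purchase order", "doc4": "fax cover"}

print(agreement_rate(run_a, run_b))  # 3 of the 4 labels match
```

A fully replicable approach would score 1.0 on this measure. Comparing an earlier classification against a new one in the same way is one simple way to quantify how inconsistent the earlier approach was.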
While one of the primary purposes of taxonomies may be to identify the documents that need to be retained as records, a taxonomy should also include document types that can be considered non-records and safely discarded. Without explicit designations of non-records, these documents are essentially placed in a bucket containing no known document types. The organization doesn’t know whether these are actually non-records or are records that have not yet been identified.
BR’s visual classification technology groups or classifies virtually all documents. Those classifications or clusters can then be reviewed for retention purposes and explicitly designated as records or non-records.
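As a rough sketch of what explicit designation looks like in practice, the Python below lists the clusters that nobody has yet marked as a record or a non-record. The names (`Designation`, `undesignated_clusters`) and the cluster labels are assumptions made for illustration; they do not describe BR's data model.

```python
from enum import Enum


class Designation(Enum):
    RECORD = "record"
    NON_RECORD = "non-record"


def undesignated_clusters(cluster_ids, designations):
    """Return, sorted, the clusters that have no explicit designation yet."""
    return sorted(c for c in cluster_ids if c not in designations)


# Hypothetical clusters and review decisions, for illustration only.
clusters = ["invoices", "lunch menus", "signed contracts", "blank fax covers"]
designations = {
    "invoices": Designation.RECORD,
    "signed contracts": Designation.RECORD,
    "lunch menus": Designation.NON_RECORD,
}

print(undesignated_clusters(clusters, designations))
```

The taxonomy is complete in the sense used here once this list is empty: every cluster has been reviewed and explicitly designated, so nothing is left in an unknown bucket.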
BR is configured to notify operators when new document type clusters form. This could be the result of completely new document types being encountered or it could be the result of changes having been made to existing document types. Either way, the client’s taxonomy can remain timely as new document types are placed in the taxonomy as soon as the new groupings start forming.
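The notification idea can be illustrated with a small Python sketch that compares the clusters currently forming against those already in the taxonomy. Everything here is hypothetical: the function name, the cluster IDs, and especially the `min_size` threshold, which is an assumption added to avoid flagging one-off documents and is not something the description above specifies.

```python
def clusters_to_report(taxonomy_ids, cluster_sizes, min_size=5):
    """Return clusters absent from the taxonomy that hold at least min_size documents."""
    return sorted(
        cluster_id
        for cluster_id, size in cluster_sizes.items()
        if cluster_id not in taxonomy_ids and size >= min_size
    )


# Hypothetical cluster IDs and document counts, for illustration only.
taxonomy = {"invoice-v1", "w9"}
sizes = {"invoice-v1": 1200, "w9": 300, "invoice-v2": 42, "misc-7": 2}

print(clusters_to_report(taxonomy, sizes))
```

In this example a revised invoice layout shows up as its own cluster, which covers both cases mentioned above: a completely new document type and a changed version of an existing one.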
Other taxonomy resources:
“Topics and Document Types in Taxonomies,” by the Accidental Taxonomist, http://accidental-taxonomist.blogspot.com/2013/05/topics-and-document-types-in-taxonomies.html
“A guide to developing taxonomies for effective data management,” by Michael Pincher in ComputerWeekly.com, http://www.computerweekly.com/feature/A-guide-to-developing-taxonomies-for-effective-data-management
“Agreed Guidelines Governing Taxonomy Development,” posted on Claremont University Consortium website, http://www.cuc.claremont.edu/odyssey/forms/Agreed_Guidelines_for_Taxonomy_Development.pdf