The central theme of David Weinberger’s book Everything is Miscellaneous* is that no single method of classification serves all purposes. That concept is worth considering when designing classification schemes for enterprise content management (“ECM”).
One classification scheme he uses as an example is the well-known periodic table, which arranges the basic elements in a way that conveys meaning:
By 2012rc (http://creativecommons.org/licenses/by/3.0), via Wikimedia Commons
In the periodic table, the elements in the same column (the groups or families) share common properties or trends, as do the elements in the same row (the periods), so the arrangement itself conveys a wealth of information. More at https://www.barcodesinc.com/articles/all-about-the-periodic-table.htm.
However, Weinberger notes that there are several other ways of arranging and presenting elements, including:
- A circular one emphasizing orbital structures within the atoms:
Ed Perley’s Circular Model of Elements Emphasizing Electronic Orbital Structures
- A periodic spiral showing relationships of hydrogen to the noble gases:
Periodic Spiral, Illustrating Hydrogen’s Ambiguous Relationship to the Noble Gases
- Another version emphasizing how elements appear as ions to earth scientists:
Railsback Table Organizing Elements Encountered by Earth Scientists as Ions
- A “galaxy” based depiction:
Phillip Stewart’s Chemical Galaxy, a Periodic System of Elements Based on Cyclical Nature of Those Elements
There are two important factors here:
- All the different arrangements or classifications of the elements were performed AFTER their basic attributes or facets had been determined, e.g.,
- Atomic mass
- Atomic number
- Chemical symbol
- Electron configuration
By contrast, file classification in ECM almost invariably starts with classifying content and then determining which attributes should be extracted or tracked. An alternative approach based on grouping files by like attributes would be to perform attribution before classification or to use a system that permits regrouping or reclassification based on an iterative analysis of the facets contained in different groupings or clusters. For example, if two classifications have completely overlapping attributes or facets, perhaps they should be consolidated into one classification.
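The consolidation check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classification names, the facet names, and the `consolidation_candidates` helper are all hypothetical, and "completely overlapping" is taken to mean that two classifications have identical facet sets.

```python
from itertools import combinations

# Hypothetical data: each classification mapped to the set of facets
# extracted from the files assigned to it.
facets_by_class = {
    "Invoices": {"vendor", "invoice_number", "amount", "date"},
    "Bills": {"vendor", "invoice_number", "amount", "date"},
    "Contracts": {"party", "effective_date", "term"},
}


def consolidation_candidates(facets_by_class):
    """Return pairs of classifications whose facet sets are identical."""
    return [
        (a, b)
        for a, b in combinations(sorted(facets_by_class), 2)
        if facets_by_class[a] == facets_by_class[b]
    ]


candidates = consolidation_candidates(facets_by_class)
print(candidates)
```

A pair returned here is only a candidate for review. Whether two classifications should actually be merged remains a judgment call for the people who manage the scheme.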
To the extent that a classification system attempts to show where an individual item is within a larger corpus of objects, it makes sense to view the corpus before finalizing the classifications. Visual classification groups or clusters visually similar files, and the clusters are then placed in a classification hierarchy. Usually multiple clusters will be assigned the same classification. For example, the “Invoices” classification may have several visually similar clusters that contain different invoice formats. The cluster ID remains constant, a type of “true North” that permits users to reclassify or rearrange where individual clusters are placed in the classification scheme without impacting where other clusters are placed.
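One way to picture the stable cluster ID is as the key in a mapping from clusters to classifications. The sketch below is not the implementation of any particular product. The cluster IDs, classification names, and both helper functions are assumptions made for illustration.

```python
# Hypothetical cluster IDs; each cluster groups visually similar files.
cluster_to_class = {
    "C001": "Invoices",        # one invoice format
    "C002": "Invoices",        # a different invoice format
    "C003": "Correspondence",
}


def reclassify(mapping, cluster_id, new_class):
    """Return a copy of the mapping with one cluster moved.

    The cluster ID itself never changes, and no other cluster is touched.
    """
    if cluster_id not in mapping:
        raise KeyError(f"unknown cluster: {cluster_id}")
    updated = dict(mapping)
    updated[cluster_id] = new_class
    return updated


def clusters_in(mapping, classification):
    """List the cluster IDs currently assigned to a classification."""
    return sorted(cid for cid, cls in mapping.items() if cls == classification)


updated = reclassify(cluster_to_class, "C003", "Invoices")
print(clusters_in(updated, "Invoices"))
```

Because files refer to their cluster and clusters refer to their classification, moving a cluster is a single update to the mapping rather than a change to every file.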
- File attributes or facets provide alternative ways of navigating content that can supplement the original classification.
Once file facets have been defined and extracted, it becomes possible to provide supplemental ways to arrange or navigate the files in the collection, much as the information in the periodic table can be used to present or arrange elements to fit the specific needs of users, e.g., one arrangement for earth scientists and another for people who are more concerned with atomic orbital structures.
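As a rough sketch of how extracted facets can supplement the original classification, the Python below builds a simple index from facet values to files. The file names, classifications, facet names, and the `build_facet_index` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical files, each with an assigned classification and extracted facets.
files = [
    {"name": "inv_001.pdf", "classification": "Invoices",
     "facets": {"vendor": "Acme", "year": "2016"}},
    {"name": "po_017.pdf", "classification": "Purchase Orders",
     "facets": {"vendor": "Acme", "year": "2015"}},
    {"name": "inv_002.pdf", "classification": "Invoices",
     "facets": {"vendor": "Globex", "year": "2016"}},
]


def build_facet_index(files):
    """Map each (facet, value) pair to the names of the files that carry it."""
    index = defaultdict(list)
    for f in files:
        for facet, value in f["facets"].items():
            index[(facet, value)].append(f["name"])
    return index


index = build_facet_index(files)
# Navigating by vendor cuts across the original classifications.
acme_files = index[("vendor", "Acme")]
print(acme_files)
```

Here the "vendor" facet links an invoice and a purchase order that sit in different classifications, which is the kind of alternative path through the content that facets make possible.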
Faceted navigation was explored in an earlier blog posting, User Augmented ECM Classifications, which contained the following diagram to suggest how facets or attributes can be used to navigate from one classification to others that are linked or related by common facets.
Note: all links in this blog posting were last visited August 19, 2016.
- Documents ARE Structured – Just Heterogeneously: http://beyondrecognition.net/documents-are-structured-just-heterogeneously/
- Boosting PII Detection and Protection in “Unstructured” Content: http://beyondrecognition.net/boosting-pii-detection-and-protection-in-unstructured-content/
- Need More than Text Search for Unstructured Content: http://beyondrecognition.net/need-more-than-text-search-for-unstructured-content/
*Everything is Miscellaneous, The Power of the New Digital Disorder, by David Weinberger, is available on Amazon at: https://www.amazon.com/Everything-Miscellaneous-Power-Digital-Disorder-ebook/dp/B000R7PUW4/#navbar