The three most important criteria by which to judge file or document classification and coding systems are consistency, consistency, and consistency. The reason is pretty obvious: without consistency, a file classification scheme cannot deliver any of the promised downstream benefits, such as enhanced retrievability, selection of appropriate retention schedules, and setting appropriate security access permissions […]

Whether an organization is trying to consolidate individual information silos, incorporate content acquired by merger or acquisition, or permit federated search, the challenge is determining where true north lies in terms of classifying content. That is, the challenge is to find a way to first group or classify documents consistently regardless of any one person’s view or assessment […]

The legal and reputational risks associated with mismanaging PII and other sensitive data are well known, and one of the most challenging areas is managing PII in “unstructured” content – the files and documents found on file shares, local drives, and removable media. You know that PII (or PCI or PHI or IP) is in there, […]

BeyondRecognition CEO and founder John Martin will be the featured speaker at the ARMA NYC meeting on May 19th, 2015. The topic is “Detecting and Protecting PII in Unstructured Content.” John will be discussing how visual classification offers new options for both detecting and protecting PII in unstructured content. Parties interested in attending the meeting […]

Because of the significant reputational and financial consequences of failing to protect content containing personally identifiable information (“PII”), corporations and governmental agencies have made it a major goal to identify and protect such content. Privacy expectations arise from a number of laws in different jurisdictions and are sometimes referred to by various acronyms such as […]

Jeb Bush’s actions in publishing social security numbers and other personal information in emails accumulated during his tenure as governor of Florida created a bit of an international stir recently; see, e.g., articles from the BBC (LINK) and the LA Times (LINK). Today there is a heightened sensitivity to protecting personally identifiable information (PII) and to taking […]

Faceted classification represents the collective judgment of knowledge workers or subject matter experts from multiple areas in an organization on how to classify documents and grant access to them. It is a logical outgrowth of visual classification and builds on the organization’s existing access authorization infrastructure. Faceted classification is an extremely efficient way to remove […]

Update: Since our posting that Amazon Web Services had taken down the EDRM/FERC/Enron data set, EDRM and Nuix announced that Nuix had cleansed the Enron data set of more than 10,000 items containing private, health and financial information (http://www.edrm.net/archives/17490), and Index Engines announced that it had found “more dirt” in the EDRM/Nuix data set (http://www.poweroverinformation.com/index-engines-finds-more-dirt-on-nuixs-cleansed-enron-data-set/). […]

Since writing our initial posting on the EDRM/FERC/Enron PII disclosures, we have learned more information about the PII disclosures that may be of interest to those who use or discuss this collection. This posting reviews FERC’s actions in the light of what was technically possible at the time and comments on Enron’s appeal of the disclosure decision […]

Background. The Electronic Discovery Reference Model (“EDRM”) is an e-discovery industry standards-setting group, and the EDRM Enron Email Data Set v2 (“EDRM Data”) is a collection of documents originally gathered by the Federal Energy Regulatory Commission (“FERC”) as part of its investigation of Enron’s energy trading practices and then made public by FERC. EDRM […]
