The first two email remediation steps are identifying duplicate emails and attachments, and then intradeduping embedded emails so that what remains is the set of unique emails not contained within later emails. Statistics vary with individual collections, but the first two steps can remove 90% or more of a collection from further consideration. The third step, unique to BeyondRecognition, uses payload analysis to classify emails based on the classifications of their attachments, or payloads.
Documents that have ongoing business or regulatory value are deemed “records,” and are retained. Emails that have records as attachments can be viewed as providing context to those records and hence inherit the retention policies associated with them. This automated payload analysis can make classification decisions for a large percentage of those emails that have attachments.
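To make the inheritance idea concrete, here is a minimal Python sketch of payload analysis. It is an illustration only, not BR's implementation: the `Attachment` and `Email` types, the `classify_by_payload` function, and the `retention_years` field are all hypothetical names. It also makes two assumptions the post does not state: when an email carries several records, it takes the longest retention period, and an email whose attachments are all non-records is treated as a non-record.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Attachment:
    name: str
    # "record", "non-record", or None if the attachment is not yet classified
    classification: Optional[str] = None
    # Hypothetical retention period attached to a record classification
    retention_years: Optional[int] = None


@dataclass
class Email:
    subject: str
    attachments: List[Attachment] = field(default_factory=list)


def classify_by_payload(email: Email) -> Tuple[Optional[str], Optional[int]]:
    """Return (classification, retention_years) inherited from the payload.

    Returns (None, None) when payload analysis cannot decide, e.g. the
    email has no attachments or some attachment is still unclassified.
    """
    if not email.attachments:
        return (None, None)

    records = [a for a in email.attachments if a.classification == "record"]
    if records:
        # Assumption: the email keeps the longest retention among its records.
        years = max(a.retention_years or 0 for a in records)
        return ("record", years)

    # Assumption: an email is a non-record only if every attachment is one.
    if all(a.classification == "non-record" for a in email.attachments):
        return ("non-record", None)

    return (None, None)


contract_mail = Email(
    "Signed contract",
    [Attachment("contract.pdf", "record", 7), Attachment("logo.png", "non-record")],
)
print(classify_by_payload(contract_mail))
```

Emails for which the function returns `(None, None)` would fall through to whatever other classification process the client uses.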
As discussed elsewhere in this blog and on this website, BeyondRecognition clusters visually similar documents and provides an efficient mechanism by which BR’s clients can indicate which document clusters contain records to be retained and which contain non-records that can be disposed of. This visual classification does not require that the documents being classified have accurate textual representations.
Three Steps of Email Remediation
Leveraging Visual Classifications
Visual classifications are persistent, meaning the classifications assigned by BR’s clients are also applied to documents that are added to the cluster later on. Furthermore, the intelligence accumulated during the client’s classification process can be used elsewhere. This means that if the client has already classified its file shares or ECM systems, those classification decisions can be used during the payload analysis phase of email remediation.
On the other hand, if the client hasn’t used visual classification before, the process can start by clustering and classifying the email attachments. The intelligence gained by classifying the attachments can be rolled forward when visual classification is applied to other document sets.
The benefit of this interoperability of the visual classification intelligence is that the client soon reaches a point of convergence, i.e., a point at which virtually all of the incoming documents fall into pre-existing clusters that have already been classified. Going forward, the workload is substantially diminished because only the new clusters have to be classified.
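The convergence effect can be pictured with a short Python sketch. It assumes each incoming document has already been assigned to a cluster by some upstream clustering step, which is not shown here. The function name `classify_incoming` and the dictionary shapes are hypothetical, chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def classify_incoming(
    doc_clusters: Dict[str, str],
    cluster_labels: Dict[str, str],
) -> Tuple[Dict[str, str], Set[str], float]:
    """Apply existing cluster classifications to incoming documents.

    doc_clusters maps document id -> cluster id (assumed to come from an
    upstream clustering step). cluster_labels maps cluster id -> the
    classification the client already assigned to that cluster.

    Returns the auto-classified documents, the set of new clusters that
    still need a human decision, and the fraction of documents covered.
    """
    auto: Dict[str, str] = {}
    new_clusters: Set[str] = set()
    for doc_id, cluster_id in doc_clusters.items():
        label = cluster_labels.get(cluster_id)
        if label is None:
            new_clusters.add(cluster_id)  # only these need review
        else:
            auto[doc_id] = label  # persistent classification is inherited
    coverage = len(auto) / len(doc_clusters) if doc_clusters else 0.0
    return auto, new_clusters, coverage


labels = {"c1": "record", "c2": "non-record"}
incoming = {"d1": "c1", "d2": "c2", "d3": "c3", "d4": "c1"}
auto, new_clusters, coverage = classify_incoming(incoming, labels)
print(auto, new_clusters, coverage)
```

As more clusters are classified, `coverage` approaches 1.0 and `new_clusters` shrinks, which is the point of convergence described above.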
The first three steps of email remediation can be set up to be completely self-executing once the visual classification is performed. Future postings will discuss other ways to achieve defensible disposition of emails using BR’s Find technology.
Email Remediation Step 1: Faceted Deduplication
Email Remediation Step 2: Intradeduping Emails