Selection bias occurs when data are selected for analysis in such a way that not all objects being evaluated are equally likely to be selected. This results in samples that are not representative of the entire population. An extreme example would be predicting the presidential race by sampling only New York City or Los Angeles, or predicting all […]

Sometimes profound consequences become apparent from thinking through the implications of direct observations and from sampling to determine the extent of the observed conditions. This is a story about the consequences of observing four Authorizations for Expenditures (“AFEs”) and a Daily Drilling Report (“DDR”) while in a meeting with an energy client talking about file share […]

The Grossman-Cormack article, “Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery,” has kicked off some useful discussions. Here are our comments on two blog posts about the article, one by Ralph Losey, the other by John Tredennick and Mark Noel: Losey: The Text Streetlight Ralph Losey made an interesting point in his July 6, […]

Proposed industry metrics and sampling approaches measure significance of text-restricted information governance. Memphis, TN – June 19, 2014. In a series of blog posts, information governance technology provider BeyondRecognition has noted major limitations in the prevalent text-based approach used by virtually all major content management and information governance systems and has proposed new industry metrics […]

There has been ongoing debate in information governance and e-discovery circles on the significance of documents that do not contain searchable text, with evidence that half or more of the documents in some collections cannot be analyzed or managed because the tools used for those purposes require textual representations. How important is this limitation in […]
