In content management, negation involves the ability to focus on items that are relevant for a particular purpose by removing irrelevant items from consideration. The basic idea of negation is familiar to people who have used the Boolean logical operators “NOT” or “XOR” for full text search – those operators remove irrelevant documents from search […]
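
As a rough illustration of how these operators narrow a result set, here is a minimal sketch over a tiny in-memory collection. The document IDs, their text, and the helper names `search_not` and `search_xor` are hypothetical and are not taken from any particular search product.

```python
# Hypothetical in-memory collection; IDs and text are invented for illustration.
documents = {
    "doc1": "quarterly sales report for the retail division",
    "doc2": "sales contract with retail supplier",
    "doc3": "employee handbook and retail store policies",
}


def tokens(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def search_not(docs, include, exclude):
    """Return IDs of documents that contain `include` but NOT `exclude`."""
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if include in tokens(text) and exclude not in tokens(text)
    )


def search_xor(docs, term_a, term_b):
    """Return IDs of documents that contain exactly one of the two terms."""
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if (term_a in tokens(text)) != (term_b in tokens(text))
    )


print(search_not(documents, "retail", "sales"))  # ['doc3']
print(search_xor(documents, "sales", "report"))  # ['doc2']
```

In both calls, the documents that match the unwanted condition drop out of the results, which is the sense of negation described above.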

Many organizations manage their content using enterprise content management (“ECM”) systems like SharePoint, Office 365, FileNet, Open Text, or Documentum. These ECM systems are essentially databases that let organizations associate various classifications, tags, and descriptions (collectively, “attributes”) with the content being managed and use those attributes to find specific documents. Parties seeking documents […]
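
The idea of finding documents through their attributes can be sketched as follows. This is a simplified model, not the API of SharePoint or any other ECM system; the `ManagedDocument` class, the attribute names, and the sample records are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedDocument:
    """A managed file plus the attributes attached to it (hypothetical model)."""
    name: str
    attributes: dict = field(default_factory=dict)


def find_by_attributes(docs, **criteria):
    """Return names of documents whose attributes match every criterion."""
    return [
        doc.name
        for doc in docs
        if all(doc.attributes.get(key) == value for key, value in criteria.items())
    ]


# Invented sample records standing in for an ECM repository.
repository = [
    ManagedDocument("msa_2019.pdf", {"doc_type": "contract", "department": "legal"}),
    ManagedDocument("q3_forecast.xlsx", {"doc_type": "report", "department": "finance"}),
    ManagedDocument("nda_vendor.pdf", {"doc_type": "contract", "department": "procurement"}),
]

print(find_by_attributes(repository, doc_type="contract"))
print(find_by_attributes(repository, doc_type="contract", department="legal"))
```

Each additional criterion narrows the result, which is why consistent attribute values matter as much as the documents themselves.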

Enterprise Content Management systems enable organizations to work effectively with their unstructured content. ECM typically takes a more holistic view of an organization’s documents than e-discovery and enhances the ability to retrieve and analyze documents beyond what e-discovery is typically able to achieve. ECM classifies unstructured content, provides controlled access to it, and assigns granular […]

In Everything is Miscellaneous, David Weinberger argues that no single classification system will best serve everyone who uses the classified content, and he describes several tools that popular websites use to let individual users create and share what they consider to be significant information. Many of those tools could be applied to improve the […]

The usual approach to classifying files or documents in an enterprise collection of unstructured content is top-down: determine what the classifications should be and then write rules or scripts on how to place individual files in the predetermined classifications. This presupposes a comprehensive knowledge of what’s in a collection and what attributes can be used […]
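
A top-down scheme of this kind might look like the sketch below: the classifications are fixed in advance and each has a rule that decides membership. The labels, the regular expressions, and the `classify` function are hypothetical examples, not rules from any real deployment.

```python
import re

# Predetermined classifications, each paired with a hand-written rule.
# Both the labels and the patterns are invented for illustration.
RULES = [
    ("invoice", re.compile(r"\binvoice\b", re.IGNORECASE)),
    ("contract", re.compile(r"\b(agreement|contract)\b", re.IGNORECASE)),
]


def classify(text):
    """Return the first predetermined classification whose rule matches."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    # Content the rule writers did not anticipate falls through unclassified.
    return "unclassified"


print(classify("Invoice 1042 for consulting services"))  # invoice
print(classify("Master Services Agreement"))             # contract
print(classify("Notes from the planning meeting"))       # unclassified
```

The last call shows the limitation of the approach: any file the rule writers did not anticipate ends up unclassified.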

Document images often have quality issues that make it difficult to extract text or data elements from them. For example, forms can have lines running through much of the text, watermarks can interfere with text recognition, and text orientation may be skewed. Once specific issues have been identified, advanced image enhancement techniques can greatly improve the quality and quantity […]

…Hitting the Sweet Spot: The value-complexity curve provides a visualization of the value added to organizations by enterprise search. Initially, value grows as the volume of content being managed grows. However, as volume continues to expand, enterprise search eventually becomes harder to use, and users experience increasingly cluttered and incomplete search results. […]

PDF standards enable users to embed non-visible metadata within PDFs as attribute name-value pairs. This feature can be used to embed referential metadata that is normally stored and used outside the files to help find or otherwise work with them. Here are some reasons why embedding metadata values can be a […]
