Migrating unstructured content can involve many moving parts. It can require substantial investments of time and money, both to extract and load the files themselves and to quality-check, aggregate, and supplement existing file attributes. Here are some points to consider when migrating content:
1 – Identify Relevant Stakeholders
What business units or […]

Simple text search architecture, where every non-noise word of every document is indexed, doesn’t work well at enterprise scale. This approach consumes considerable IT resources and, from an end-user perspective, returns large numbers of irrelevant results. It may work on small personal collections where it’s not too burdensome to wade […]

“Unstructured” content is a term used to describe the seemingly infinite types of documents that can be found in file shares and personal computing devices. In my last posting I considered some of the differences between structured and unstructured content and discussed how enterprise content management systems represent an attempt to provide the advantages of […]

In the IT and Information Governance (“InfoGov”) worlds, organizational data is usually thought of as being structured or unstructured. This posting looks at the differences between the two, and in the next posting I’ll suggest a best-practices approach to thinking about and managing unstructured content.
Overview of Structured vs. Unstructured
Information professionals often prefer dealing […]

The Data-Information-Knowledge-Wisdom (“DIKW”) model is a useful tool for examining how well an organization is doing in deriving value from its unstructured content. In his book, Too Big to Know,* David Weinberger credits Russell Ackoff, a leading organizational theorist, with making a pyramid-shaped depiction of the DIKW model in a 1988 address to the International Society for […]

In Everything is Miscellaneous, David Weinberger notes that no single classification system will necessarily best serve everyone who uses the classified content, and he describes several tools that popular websites use to let individual users create and share what they consider to be significant information. Many of those tools could be applied to improve the […]

Sometimes a large percentage of the files found in unstructured content locations like file shares and ECM systems were actually created by database-driven business systems. These documents are essentially filled-in templates populated with specified database elements. Whether stored as PDF or TIF, these computer-generated files are completely redundant with information stored in the database and could […]

Imagine the internet with great search functionality but no hyperlinks. You could locate any individual page or at least have it included in extensive search results, but then you’d have to conduct other searches to find related pages, even on the same website. Not very useful, right? The point is that text search functionality alone is […]
