Drawing a blueprint for a scalable taxonomy: drawing on the basic concepts of biological classification that most people studied in high school, this article describes how to develop a scalable taxonomy that can migrate to any content repository--from shared drives to enterprise content management systems.

Author: Stakhov, Eugene

Not very long ago, the word "taxonomy" didn't really have a place in the field of information technology. But, as the ability to govern information has grown more sophisticated, so has the language used to describe the newfound complexity of the various interrelationships and countless moving parts that comprise the typical enterprise data landscape.

To the records and information management professional, words like "system" and "program" don't seem quite adequate now to describe the richness or the organic and evolving nature of this discipline. Today, the words "platforms" and "ecosystems" are used.

[ILLUSTRATION OMITTED]

This shift in vocabulary matters because it underscores the challenge of effective information governance in this day and age, and it provides a glimpse of the growing monster so many organizations are grappling with.

Reviewing Taxonomy Fundamentals

Many will remember first hearing the word "taxonomy" in high school science class, where they learned that the hierarchical categories "Kingdom," "Phylum," "Class," "Order," "Family," "Genus," and "Species" are conceptual buckets within which plants and animals can be classified, as in the classification for the tiger shown above.

This hierarchical classification teaches that tigers are carnivores; that every carnivore is a vertebrate; and, therefore, that all tigers must be vertebrates--but not all vertebrates must be tigers.

Inheritance and Specialization

This relationship of a parent class (superclass) to its child (subclass) illustrates the important concepts of inheritance and specialization. In the tiger example, take carnivora (Order) as the subclass. Carnivora inherit all the characteristics of the animalia (Kingdom), the vertebrata (Phylum), and the mammalia (Class).

They then specialize by defining characteristics of their own, shared by all carnivores but not by every mammal. This pattern of inheritance and specialization repeats all the way down to the tigris (Species)--the lowest category of the biological taxonomy tree.
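The inheritance-and-specialization pattern described here can be sketched in a few lines of Python. This is a minimal illustration, not a zoological reference: the `Taxon` class, its method names, and every trait string are assumptions made up for the example, and the Family and Genus levels (felidae, panthera) are filled in from standard tiger classification because the text does not name them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Taxon:
    """One bucket in the tree (hypothetical helper, not from the article)."""
    name: str
    rank: str
    parent: Optional["Taxon"] = None
    own_traits: frozenset = frozenset()  # traits this level adds (specialization)

    def all_traits(self) -> set:
        """Own traits plus everything inherited from the levels above."""
        inherited = self.parent.all_traits() if self.parent is not None else set()
        return inherited | set(self.own_traits)

    def is_a(self, other: "Taxon") -> bool:
        """True if `other` is this taxon or one of its ancestors."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# Trait strings below are illustrative placeholders only.
animalia = Taxon("animalia", "Kingdom", None, frozenset({"multicellular"}))
vertebrata = Taxon("vertebrata", "Phylum", animalia, frozenset({"backbone"}))
mammalia = Taxon("mammalia", "Class", vertebrata, frozenset({"hair"}))
carnivora = Taxon("carnivora", "Order", mammalia, frozenset({"meat-shearing teeth"}))
felidae = Taxon("felidae", "Family", carnivora, frozenset({"feline skull"}))
panthera = Taxon("panthera", "Genus", felidae, frozenset({"roars"}))
tigris = Taxon("tigris", "Species", panthera, frozenset({"striped coat"}))

print(tigris.is_a(vertebrata))   # True: all tigers are vertebrates
print(vertebrata.is_a(tigris))   # False: not all vertebrates are tigers
print(sorted(tigris.all_traits()))
```

Each level stores only what it adds; everything else is resolved by walking up to the parent, which is why a change made high in the tree reaches every bucket beneath it.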

The only difference between a biological taxonomy and its content counterpart is that rather than inheriting limbs and backbones, the latter inherits document characteristics, including metadata and security, and, in some cases, retention requirements.
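For the content counterpart, a rough sketch might look like the following. It assumes one plausible rule set that the article does not spell out: metadata fields accumulate down the tree, while security and retention are taken from the nearest level that defines them. The `ContentClass` name, the field names, the security groups, and the retention periods are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentClass:
    """A bucket in a content taxonomy (hypothetical model for illustration)."""
    name: str
    parent: Optional["ContentClass"] = None
    metadata_fields: tuple = ()            # fields this level adds
    security_group: Optional[str] = None   # None means "inherit from parent"
    retention_years: Optional[int] = None  # None means "inherit from parent"

    def effective_metadata(self) -> list:
        """Inherited fields first, then the fields added at this level."""
        inherited = self.parent.effective_metadata() if self.parent is not None else []
        return inherited + [f for f in self.metadata_fields if f not in inherited]

    def _nearest(self, attr: str):
        """Walk up the tree until some level defines `attr`."""
        node = self
        while node is not None:
            value = getattr(node, attr)
            if value is not None:
                return value
            node = node.parent
        return None

    def effective_security(self) -> Optional[str]:
        return self._nearest("security_group")

    def effective_retention(self) -> Optional[int]:
        return self._nearest("retention_years")


# All names and values below are made up for the example.
records = ContentClass("Corporate Records", None, ("Title", "Owner"), "all-staff", 3)
finance = ContentClass("Finance", records, ("Cost Center",), "finance-team")
invoices = ContentClass("Invoices", finance, ("Vendor",), retention_years=7)

print(invoices.effective_metadata())   # ['Title', 'Owner', 'Cost Center', 'Vendor']
print(invoices.effective_security())   # finance-team (inherited from Finance)
print(invoices.effective_retention())  # 7 (specialized at this level)
print(finance.effective_retention())   # 3 (inherited from Corporate Records)
```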

In fact, records management professionals have been practicing taxonomy development for as long as the discipline has been around. There may be nuances in terminology (e.g., "file plan" and "retention schedule"), but the core concept is the same: the higher up the bucket, the broader the classification; the lower the bucket, the more specialized the classification.

The common denominator among all these classification practices is the specialization and inheritance of characteristics.

Explaining Technical Concepts

Objects and Classes

It is helpful to think of the relationship between classes and objects as analogous to cookie cutters and cookies. Classes are templates that are used to build the objects (documents and folders) that are managed by an enterprise content management (ECM) system. Take the following pattern:

* Documents are patterned by Classes

* Classes are described by Properties

* Classes can pass on their property definitions to one or more children, known as Subclasses
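The three-part pattern above maps fairly directly onto code. The sketch below is one way to model it, assuming a simple rule that every property defined on a class or its ancestors must be supplied when a document is created. `DocumentClass`, `Document`, and the "Memo" example are hypothetical names and do not correspond to any particular ECM product's API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DocumentClass:
    """The cookie cutter: a template that documents are patterned by."""
    name: str
    properties: dict = field(default_factory=dict)  # property name -> expected type
    parent: Optional["DocumentClass"] = None

    def subclass(self, name: str, properties: dict) -> "DocumentClass":
        """Create a child class that inherits this class's property definitions."""
        return DocumentClass(name, dict(properties), parent=self)

    def all_properties(self) -> dict:
        """Inherited property definitions merged with the ones added here."""
        inherited = self.parent.all_properties() if self.parent is not None else {}
        return {**inherited, **self.properties}

    def new_document(self, **values) -> "Document":
        """Stamp out a document (the cookie) after checking it fits the template."""
        schema = self.all_properties()
        unknown = set(values) - set(schema)
        missing = set(schema) - set(values)
        if unknown or missing:
            raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
        for key, value in values.items():
            if not isinstance(value, schema[key]):
                raise TypeError(f"{key} must be {schema[key].__name__}")
        return Document(self, values)


@dataclass
class Document:
    """The cookie: an object built from a class."""
    doc_class: DocumentClass
    values: dict


# Hypothetical classes and property names.
base = DocumentClass("Document", {"title": str, "author": str})
memo = base.subclass("Memo", {"recipient": str})

doc = memo.new_document(title="Q3 plan", author="A. Writer", recipient="All staff")
print(doc.doc_class.name)             # Memo
print(sorted(memo.all_properties()))  # ['author', 'recipient', 'title']
```

Keeping the property definitions on the class rather than on each document is what lets a single change to the template apply to every document patterned by it.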

This type of design paradigm borrows from a style of computer programming known as object-oriented programming.

Inheritance and Polymorphism

Inheritance and polymorphism are among the core capability requirements of object-oriented design. The latter refers to the ability of a property to have more than one intrinsic meaning. To illustrate this, consider two document classes, one called "Invoice" and the other "Contract."

The "Invoice" document class may have these properties defined:

* Invoice Date

* Invoice Number

The "Contract" document class may have these properties defined:

* Contract Date

* Contract Number

Rather than define...
