Organizing Content – Ontology 101
An ontology can be many things. In this example, it’s a model of a professional’s work environment as it exists. This environmental ontology has a specialized language, factual and theoretical knowledge, and business-specific concepts that provide mutual understanding within the community. It also has structure: it establishes a framework for how things are classified, associated and related, and provides context for these relationships. An ontology populated with data is known as a knowledge base, and is composed of the following:
- A set of concepts – Words (or phrases) that represent a thing such as an animal, a medical procedure or business process.
- A controlled vocabulary – A special set of agreed-upon words with definitions common to the community.
- A taxonomy – A list of those specialized vocabulary words in a generalized parent-child hierarchy.
- A thesaurus – Shows how those words (concepts) are associated and equivalent. It also provides hierarchical information about narrower or broader terms.
- A schema – A specification for organizing and defining data, its properties and relationships.
- Theories – A set of business, professional or research axioms: things that are true.
The word ontology is often used to refer to all of the above items, or to selected subsets of them.
Ontology in more detail
Ontologies can be as varied and as complex as people. They can perform specialized or generalized functions. In the health care industry, for example, there might be:
- A generic or high-level ontology that describes a goal, a policy, or a general medical or business concept.
- A community ontology that describes a vocabulary related to a particular medical specialty.
- A task ontology that describes a medical procedure, a problem-solving process or a business activity such as record keeping or patient billing.
An ontology can be simple or complex, and can provide varying levels of precision depending on its makeup. An ontology with only a controlled vocabulary and a taxonomy is of limited value without the relationships defined in the thesaurus. Adding a schema and theories makes the ontology more valuable and more useful.
A word can represent a person, place or thing. In the example below, the concept cat has relationships with other concepts: cats can be defined by subspecies, cats can be classified in multiple ways, cats have value, and the concept of “cat” has a constraint, which in this case is that it is not a hip human being.
A controlled vocabulary is a set of preferred, agreed-upon terms with unambiguous and non-redundant definitions that help bring about order. These terms can be used to categorize and provide a consistent labeling system for the database schema, which assists in more precise retrieval of information. This vocabulary provides a valuable interpretive layer between the search term entered by the user and the underlying database. An example of a simple controlled vocabulary:
Vocabulary with Definitions
- Bird – A member of the class Aves, which includes warm-blooded, egg-laying, feathered vertebrates with forelimbs modified to form wings.
- Blue point – A type of Siamese cat with lighter coloring of the ears, face, tail and feet.
- Cat – A carnivorous mammal, Felis catus, domesticated since early times as a catcher of mice and rats and as a pet, existing in a distinctive set of breeds.
- Dog – A domesticated mammal, Canis familiaris, probably derived from wild species.
- Ferret – A weasel-like mammal, Mustela nigripes, of central North America.
- Hamster – Any of several Eurasian rodents of the family Cricetidae.
- Pet – An animal kept for amusement or companionship.
- Siamese cat – A short-haired cat of a breed developed in the Orient, having blue eyes and a pale fawn coat with darker ears, face, tail and feet.
- Seal point – A type of Siamese cat with dark brown coloring of the ears, face, tail and feet.
- Turtle – Any of the reptiles of the order Chelonia.
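The "interpretive layer" role described above can be sketched in a few lines of Python. This is only an illustration: the `VOCABULARY` mapping, the shortened definitions and the `normalize` and `define` helpers are hypothetical names chosen for the example, not part of any particular product.

```python
from typing import Optional

# Hypothetical controlled vocabulary: canonical term -> agreed definition.
# Definitions are shortened from the list above.
VOCABULARY = {
    "cat": "A carnivorous mammal, Felis catus.",
    "siamese cat": "A short-haired cat of a breed having blue eyes.",
    "blue point": "A type of Siamese cat with lighter coloring.",
    "pet": "An animal kept for amusement or companionship.",
}

def normalize(term: str) -> str:
    """Lowercase the term and collapse whitespace so lookups are consistent."""
    return " ".join(term.lower().split())

def define(term: str) -> Optional[str]:
    """Return the agreed definition, or None if the term is not controlled."""
    return VOCABULARY.get(normalize(term))

print(define("  Siamese   CAT "))
print(define("Goldfish"))
```

A search front end could call `define` (or a similar lookup) on the user's query to decide whether the query matches a controlled term before it touches the database.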
A taxonomy is a collection of controlled vocabulary terms organized from general to specific in a hierarchical structure with one or more parent-child relationships. There may be different types of parent-child relationships, and a term may have more than one parent. For example, Massachusetts can be a child of both the United States and New England. Taxonomies come in many flavors; here are two examples:
An example of a simple Subject Taxonomy
An example of a simple Content Taxonomy
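A taxonomy in which a term can have two parents can be represented as a mapping from each term to its list of parents. The sketch below is a minimal illustration; the `PARENTS` table and the `ancestors` helper are hypothetical, and the placement of Cat under Pet is an assumption made for the example.

```python
# Hypothetical taxonomy: each term maps to its parent terms.
# Massachusetts has two parents, as described in the text.
PARENTS = {
    "Massachusetts": ["United States", "New England"],
    "New England": ["United States"],
    "United States": [],
    "Blue point": ["Siamese cat"],
    "Seal point": ["Siamese cat"],
    "Siamese cat": ["Cat"],
    "Cat": ["Pet"],  # assumption: cats are filed under pets in this taxonomy
    "Pet": [],
}

def ancestors(term, parents=PARENTS):
    """Return the set of all broader terms reachable from `term`."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen

print(sorted(ancestors("Massachusetts")))
print(sorted(ancestors("Blue point")))
```

Storing parents rather than children keeps the multi-parent case simple: a document tagged "Massachusetts" can be found under either "United States" or "New England" by checking membership in `ancestors`.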
A thesaurus is a collection of controlled vocabulary terms for a community that holds information beyond the parent-child hierarchy defined in a taxonomy. These richer relationships (equivalence, associative and hierarchical) are defined for each concept. The thesaurus records how each word relates to other terms: for example, whether the term is preferred or non-preferred, and whether it is broader than, narrower than or related to other words. There may also be information about the term as it exists in other languages. Here are the essentials that produce a good thesaurus:
A preferred term is the one word that a majority of people in a community know and use. A preferred term in a thesaurus is selected from among all synonyms to be the one used for consistent labeling, indexing and retrieval purposes. For any hierarchy there is one and only one Top Term. For example, automobile, car and motorcar can describe a 1963 Ford Falcon. Only one of these terms can be used as the primary label when describing a concept in a taxonomy with a thesaurus.
A related term indicates that two or more words are linked in some way, even if the link has no significance beyond the fact that the relationship exists. These relationships can be alternative spellings (color vs. colour), plural spellings, alternative endings, abbreviations and acronyms.
An equivalent term is a synonym. Automobile, car and motorcar are conceptually equivalent. In this example automobile is the preferred term, and car and motorcar are equivalent variants that represent the same concept.
Hierarchical terms indicate a relationship that may not actually be reflected in the taxonomy. They are either broader (general) or narrower (specialized) instances of a term. They are shared relationships on the borders of the hierarchy. In the sample thesaurus below, the South Shore is a narrower term for Massachusetts, but may not be represented as a parent-child node in the taxonomy.
Scope notes are comments that provide a textual explanation for a term. A scope note is attached to a term in a controlled vocabulary and provides guidance on how to use the term. Scope notes perform a variety of tasks: they can detail grammatical usage, disambiguate definitions, include or exclude concepts for the term and clarify relationships.
- Preferred term – Massachusetts
- Related – Taxachusetts
- Narrower term – Boston, Springfield, Charlestown
- Broader term – New England, Northeast
- Variant term – Bay State, Old Colony
- Narrower term – South Shore, North Shore, Berkshires, Mohawk Trail
- Equivalent term – MA, Mass.
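The sample entry above can be sketched as a small Python record, with an index that resolves what a user types to the preferred term. The `ThesaurusEntry` class, its field names and the `preferred_term` helper are hypothetical. Resolving both equivalent and variant terms to the preferred label, while leaving related terms unresolved, is an assumption made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class ThesaurusEntry:
    """One concept with its equivalence, associative and hierarchical links."""
    preferred: str
    equivalent: List[str] = field(default_factory=list)
    variant: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)
    broader: List[str] = field(default_factory=list)
    narrower: List[str] = field(default_factory=list)
    scope_note: str = ""

MASSACHUSETTS = ThesaurusEntry(
    preferred="Massachusetts",
    equivalent=["MA", "Mass."],
    variant=["Bay State", "Old Colony"],
    related=["Taxachusetts"],
    broader=["New England", "Northeast"],
    narrower=["Boston", "Springfield", "Charlestown", "South Shore",
              "North Shore", "Berkshires", "Mohawk Trail"],
)

def build_index(entries: List[ThesaurusEntry]) -> Dict[str, str]:
    """Map each preferred, equivalent or variant label to its preferred term."""
    index = {}
    for entry in entries:
        for label in [entry.preferred, *entry.equivalent, *entry.variant]:
            index[label.lower()] = entry.preferred
    return index

INDEX = build_index([MASSACHUSETTS])

def preferred_term(label: str) -> Optional[str]:
    """Resolve a user-supplied label to the preferred term, if one is known."""
    return INDEX.get(label.strip().lower())

print(preferred_term("bay state"))
print(preferred_term("Mass."))
```

With an index like this, content labeled "MA", "Mass." or "Bay State" is retrieved under the single label "Massachusetts", which is the consistency a preferred term is meant to provide.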
In an ontology, a schema is the set of building blocks and rules that provide the organizational structure of the database that can actually be populated with information. It provides a standard way of defining specific vocabularies and theoretical or factual knowledge by defining a common set of elements and the relationships between those elements. It can also define how something works or how to do something. The activity of modeling a business process or task leads to a schema. An example of a business form (purchase order process) that has been modeled:
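As a rough illustration of what a modeled purchase order might look like in code, the sketch below defines a few elements and the relationships between them. Every class and field name here (`Customer`, `LineItem`, `PurchaseOrder`, `sku` and so on) is hypothetical and chosen for the example; it is not a reconstruction of any specific form.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Customer:
    name: str
    email: str

@dataclass
class LineItem:
    sku: str
    quantity: int
    unit_price: float

@dataclass
class PurchaseOrder:
    order_id: str
    customer: Customer  # relationship: each order belongs to one customer
    items: List[LineItem] = field(default_factory=list)  # one order, many items

    def total(self) -> float:
        """Sum of quantity times unit price over all line items."""
        return sum(item.quantity * item.unit_price for item in self.items)

order = PurchaseOrder(
    order_id="PO-1",
    customer=Customer(name="A. Buyer", email="buyer@example.com"),
    items=[LineItem("WIDGET", 2, 5.0), LineItem("GADGET", 1, 3.5)],
)
print(order.total())
```

The schema names the elements and how they connect; it says nothing yet about which orders are acceptable. That is the job of the theories described next.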
Theories are things that are always true. They are the rules that govern how a community interacts: how a medical procedure is executed, or how a business process actually functions. A theory can also determine whether a document belongs to a category in a taxonomy.
Examples of business axioms in a sales and shipping process:
- Purchase orders are always prepaid for overseas shipments.
- If a payment is made by credit card, and the invoice address is not equal to the delivery address, then check the credit card number against the fraud list.
- Credit terms are not granted without a positive credit check of 700.
- Shipments are not made to PO Boxes.
- All shipments are conveyed by UPS.
- When an order is shipped, a tracking number is sent to the customer by email only.
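Axioms like these can be expressed as checks that run against each order. The sketch below covers four of the rules above as one possible encoding. The `Order` fields, the `FRAUD_LIST` contents and the `violations` helper are all hypothetical, and the PO Box test is a deliberately naive substring match.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical fraud list; a real system would look this up elsewhere.
FRAUD_LIST = {"4111111111111111"}

@dataclass
class Order:
    overseas: bool
    prepaid: bool
    paid_by_card: bool
    card_number: str
    invoice_address: str
    delivery_address: str
    carrier: str

def violations(order: Order, fraud_list=FRAUD_LIST) -> List[str]:
    """Return the axioms this order breaks (an empty list means none)."""
    problems = []
    if order.overseas and not order.prepaid:
        problems.append("overseas shipments must be prepaid")
    # The fraud-list check applies only to card payments with mismatched addresses.
    if (order.paid_by_card
            and order.invoice_address != order.delivery_address
            and order.card_number in fraud_list):
        problems.append("card number is on the fraud list")
    if "po box" in order.delivery_address.lower():  # naive match, for illustration
        problems.append("shipments are not made to PO Boxes")
    if order.carrier != "UPS":
        problems.append("all shipments are conveyed by UPS")
    return problems

good = Order(False, False, True, "4000000000000002", "1 Main St", "1 Main St", "UPS")
bad = Order(True, False, True, "4111111111111111", "1 Main St", "PO Box 12", "FedEx")
print(violations(good))
print(violations(bad))
```

Keeping each axiom as a separate check makes it easy to report exactly which rule an order violates, rather than a single pass or fail.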
If you would like to find out more about optimizing content for the Enterprise, send me an email at Mark@MSprague.com or call me now at 781-862-3126.
About Lexington eBusiness Consulting
Mark Sprague’s 25 years of product development experience, which includes expertise in Search Engines, Information Products, SEO platforms and Social Networking applications, provides the in-depth knowledge to help you refine products and services and improve your website’s performance by:
- Developing a superior data-driven SEO strategy for your website.
- Understanding your customers’ search behavior and normalizing it to your content strategy.
- Understanding how search engine technology practically impacts SEO and content strategies.
- Understanding how search technology impacts content in a social networking environment.
- Developing a superior user experience based on sound information architecture, usability and coding standards.
Lexington eBusiness Consulting
Mark Sprague, CEO
580 Lowell Street
Lexington, MA 02420