Content Optimization & Search Technology Strategy
SEO Issues in the Enterprise
Optimizing content and search for the Enterprise is more complex than web-based SEO. Many of the content silos found in the Enterprise use a link-based navigational architecture, so many Internet SEO techniques apply, but not all. Page rank and link building play no role, but good information architecture and a sound metadata strategy are important in both cases.
Search optimization for the enterprise requires dealing with many moving parts that depend upon each other to maximize the user’s experience. The major groups (with one example from each group) are:
- Information architecture optimization – e.g., SharePoint
- Data discovery optimization – e.g., expertise detection
- Search module / tool optimization – e.g., federation
- Query manipulation optimization – e.g., tuning and expansion
- Optimizing for usage – e.g., UI design and faceted navigation
- Social search behavior optimization – e.g., user search model
- Individual optimization – e.g., personalization
Not all information sits in nicely managed CMS / DMS applications. SharePoint portals and wikis are often created organically, without regard for how the information is to be organized or how the data set will scale over time. Before you know it, you have a Frankenstein portal that holds thousands of useful documents that are hard to find, and navigation is a real mess.
The lack of oversight responsibility and the absence of an information architecture strategy affect other areas as well. You see the same problems in:
- Blogs, Wikis, Twitter feeds
- Company Intranet
- Employee desktops
The lack of a search compliance policy also hurts the user’s experience. Users will find a useful document and simply email it around, or copy it into their departmental section of the SharePoint portal, instead of providing a link to the original document. Moving data without regard to duplication and context degrades search results over time. A search compliance policy will help keep this in check.
Is there a second optimization opportunity here? You bet. Develop an information architecture policy for organic data silos up front, and include an information architect on the planning team.
Document discovery is interesting, but much more interesting is the process of parsing a document set for new metadata, relationships and context. Named entity extraction, for example, provides the basis for identifying Subject Matter Experts.
Duplicate documents in the Enterprise are a major problem for search engines. But there is a silver lining: they provide an opportunity to understand the life of a document. Duplicates exist within a context. They reside on email servers that can be mined for who received, opened and forwarded the document. They exist on desktops where employee names, titles and expertise are known. They exist in SharePoint portals where the submitter’s name is known (as well as the project name / practice group). Other optimization options include:
- Metadata extraction to support optimization
- Metadata management policy and enforcement
- Expertise detection
- History maps
- Multilingual normalization
- Content discovery and post-discovery content processing (mapper, entity extraction, language detection, format conversion)
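The point that duplicates carry their own context can be made concrete. What follows is a minimal Python sketch, not a production method. It assumes each copy of a document arrives as a record with hypothetical `text`, `location` and `person` fields, and it only catches copies that are identical after case and whitespace normalization. Near-duplicate detection would need a different technique.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash of case- and whitespace-normalized text; matches exact copies only."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(copies):
    """Group copies by fingerprint, keeping where each copy lives and who holds it."""
    groups = defaultdict(list)
    for copy in copies:
        groups[fingerprint(copy["text"])].append(
            {"location": copy["location"], "person": copy["person"]}
        )
    return groups


# Hypothetical records; the field names are assumptions made for illustration.
copies = [
    {"text": "Q3 pricing proposal", "location": "email", "person": "alice"},
    {"text": "Q3  Pricing Proposal", "location": "sharepoint", "person": "bob"},
    {"text": "Travel policy", "location": "desktop", "person": "carol"},
]

history = group_duplicates(copies)
for digest, contexts in history.items():
    print(digest[:8], [(c["location"], c["person"]) for c in contexts])
```

Each group is a small history map: one document, plus every place and person it has touched. That is the raw material for expertise detection.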
Search Modules that Aid in Optimization
Two of the most important options for optimizing search results are federation and clustering. Federated search allows you to issue a query to multiple data silos, collect the results, normalize relevancy and remove duplicate documents. Real-time clustering allows you to organize a set of search results from an unstructured data set (e.g., SharePoint) that has no governing taxonomy. These real-time clusters can be displayed topically, by source or by another custom grouping.
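The federation steps described above (collect, normalize, de-duplicate) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's implementation. It assumes each silo has already returned a list of results with hypothetical `doc_id` and `score` fields, that a shared `doc_id` identifies a duplicate, and that min-max scaling is an acceptable way to put scores from different engines on one scale.

```python
def normalize(results):
    """Min-max scale one silo's raw scores into the 0..1 range."""
    scores = [r["score"] for r in results]
    low, high = min(scores), max(scores)
    span = (high - low) or 1.0  # avoid dividing by zero when all scores match
    return [{**r, "score": (r["score"] - low) / span} for r in results]


def federate(silo_results):
    """Merge per-silo result lists, keeping the best-scoring copy of each document."""
    best = {}
    for silo, results in silo_results.items():
        if not results:
            continue
        for r in normalize(results):
            key = r["doc_id"]
            if key not in best or r["score"] > best[key]["score"]:
                best[key] = {**r, "silo": silo}
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)


# Hypothetical results: the two silos score on very different scales.
silo_results = {
    "sharepoint": [{"doc_id": "a", "score": 12.0}, {"doc_id": "b", "score": 4.0}],
    "wiki": [
        {"doc_id": "b", "score": 3.0},
        {"doc_id": "c", "score": 1.0},
        {"doc_id": "d", "score": 2.0},
    ],
}

merged = federate(silo_results)
for r in merged:
    print(r["doc_id"], r["silo"], round(r["score"], 2))
```

Document "b" appears in both silos but only once in the merged list, attributed to the silo where it ranked best.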
Query Manipulation Optimization
You can use the relevancy algorithms out of the box, but it’s better to have access to the API so you can manipulate the various options. You have the standard keyword match, natural language and fuzzy logic options to start with, but tuning relevancy to take into account, for example, the searcher’s job is one of many ways to optimize queries. Another way to improve results is to use the various Query Expansion options, which help the searcher find related documents through stemming, synonym expansion and the “more like this” option.
A second way to optimize is to acknowledge the inherent shortcomings of search technologies and develop a set of best practices to minimize their impact. For example, a single-word search is inherently ambiguous, and the probability of finding superior results increases when you can suggest two- and three-word combinations based on that single term.
Another way to optimize search is to consider keeping common words in the index, such as: in, with, at, the, to, be, or, not, for, about. These “stop words” are often excluded from the index to improve performance, save overhead and produce a smaller footprint. The problem with removing them is that they provide clues to user intent. For example, the phrase “books for sales people” has a very different meaning than “books about sales people.” “Hotels in Boston” is semantically different from “hotels near Boston.” Consider this stop word phrase: “to be or not to be that is the question”. If stop words are removed aggressively (with a list that also drops “that” and “is”), the entire sentence except the word “question” would be excluded from the index, and a search for that phrase would have nothing but “question” left to match.
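To show what stemming and synonym expansion do to a query, here is a small Python sketch. It is deliberately naive: the suffix-stripping `stem` function is a stand-in for a real stemmer, and the `SYNONYMS` table is a hypothetical example, not a real thesaurus. A real engine would expose its own expansion options through its API.

```python
# Hypothetical synonym table keyed by stemmed term.
SYNONYMS = {"car": ["automobile"], "buy": ["purchase"]}


def stem(word):
    """Very naive suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand(query):
    """Return one group per query term: the term, its stem, then its synonyms."""
    groups = []
    for term in query.lower().split():
        root = stem(term)
        variants = [term]
        for candidate in [root] + SYNONYMS.get(root, []):
            if candidate not in variants:
                variants.append(candidate)
        groups.append(variants)
    return groups


expanded = expand("buying cars")
print(expanded)
# [['buying', 'buy', 'purchase'], ['cars', 'car', 'automobile']]
```

Each inner list would typically be treated as an OR group, so a document mentioning "purchase" and "automobile" can still match the query "buying cars".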
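One way to produce such suggestions is to mine past queries. The Python sketch below assumes a query log is available as a plain list of strings (the article does not say where suggestions come from, so the log and the `suggest` function are assumptions). It proposes the most frequent two- and three-word queries that contain the searcher's single term.

```python
from collections import Counter


def suggest(term, query_log, limit=3):
    """Suggest logged two- and three-word queries that contain the term."""
    term = term.lower()
    counts = Counter(
        q.lower()
        for q in query_log
        if len(q.split()) in (2, 3) and term in q.lower().split()
    )
    return [q for q, _ in counts.most_common(limit)]


# Hypothetical query log.
query_log = [
    "jaguar dealership",
    "jaguar habitat",
    "jaguar dealership",
    "jaguar",
    "jaguar parts catalog",
    "leopard habitat",
]

suggestions = suggest("jaguar", query_log)
print(suggestions)
```

A searcher who types only "jaguar" is then offered the car and the animal as separate, less ambiguous queries.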
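The effect is easy to demonstrate. The Python sketch below uses the stop word list from the paragraph above and, as an assumption, adds "that" and "is" to model an aggressive list. The `index_terms` function is a hypothetical stand-in for the tokenizing step of an indexer.

```python
# The article's list, plus "that" and "is" (assumed) to model an aggressive stop list.
STOP_WORDS = {
    "in", "with", "at", "the", "to", "be", "or", "not", "for", "about",
    "that", "is",
}


def index_terms(text, stop_words=frozenset()):
    """Lowercase, split on whitespace and drop any stop words."""
    return [word for word in text.lower().split() if word not in stop_words]


print(index_terms("to be or not to be that is the question", STOP_WORDS))
# ['question']

# With stop words removed, two different intents become the same index entry.
books_for = index_terms("books for sales people", STOP_WORDS)
books_about = index_terms("books about sales people", STOP_WORDS)
print(books_for == books_about)
# True
```

Calling `index_terms` without a stop list keeps every word, which is the trade-off in question: a larger index in exchange for preserved intent.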
Optimizing for Usage
One of the most aggravating problems searchers encounter is a poorly designed user interface. Google taught people that search is a single text box with ten search results. Old-school librarians understand the power of fielded search, but they are rarely part of the UI design team. Today, the UI design team has dozens of options for deploying powerful search applications, and the options for searching and navigating results are numerous. Information can be presented in a multi-formatted array that allows users to view a single set of results from many perspectives, including tabs by topic, source and type. Navigation options can be presented by generating real-time links based upon a taxonomy and on facets derived from metadata.
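Facets derived from metadata amount to counting field values across the current result set. The Python sketch below assumes each result is a dictionary with hypothetical `source` and `type` metadata fields. The counts it produces are what a UI would render as clickable navigation links or tabs.

```python
from collections import Counter, defaultdict


def build_facets(results, fields):
    """Count metadata values per field, most frequent first."""
    facets = defaultdict(Counter)
    for doc in results:
        for field in fields:
            value = doc.get(field)
            if value is not None:  # skip documents missing this metadata
                facets[field][value] += 1
    return {field: counter.most_common() for field, counter in facets.items()}


# Hypothetical result set; the last document has no "type" metadata.
results = [
    {"title": "Pricing guide", "source": "SharePoint", "type": "pdf"},
    {"title": "Release notes", "source": "Wiki", "type": "html"},
    {"title": "Sales deck", "source": "SharePoint", "type": "pptx"},
    {"title": "Untitled scan", "source": "Desktop"},
]

facets = build_facets(results, ["source", "type"])
for field, values in facets.items():
    print(field, values)
```

Documents with missing metadata simply drop out of that facet, which is one reason a metadata policy matters to navigation quality.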
Social Search Behavior Optimization
One of the most overlooked optimization options is understanding human search behavior and capturing the insight generated from the interaction of search behavior and content discovery. Some social search strategies include:
- Determining user intent (search behavior models)
- Search collaboration
- Search results tagging
- Social, visual and technical mashups
- Saving results and attachments to virtual folders
- Rating results for quality
- Annotating search results
- Bookmarking search results
Personal context for search results and content can go a long way toward weeding out non-relevant content, even when the search algorithms have identified a document as a candidate for inclusion. For example, salespeople need very different results than professional researchers and engineers. Context can be generated by considering the job of the individual and the types of content that individual consumes. Personalization options include:
- Role-based searching (contextual search)
- Parsing personal information profiles for needs and expertise
- Alerting – provides clues to needs and expertise
- Page rank for people – quality based upon content that is consumed or created
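Role-based searching can be as simple as re-scoring results after the engine has ranked them. The Python sketch below is one possible approach, not a description of any particular product. The `ROLE_BOOSTS` table, the role names and the `type` field on each result are all hypothetical.

```python
# Hypothetical boost table: multiplier per content type, per job role.
ROLE_BOOSTS = {
    "sales": {"case study": 2.0, "price list": 1.5},
    "engineer": {"specification": 2.0},
}


def personalize(results, role):
    """Re-rank results by multiplying each score by the role's boost for its type."""
    boosts = ROLE_BOOSTS.get(role, {})
    rescored = [
        {**r, "score": r["score"] * boosts.get(r["type"], 1.0)} for r in results
    ]
    return sorted(rescored, key=lambda r: r["score"], reverse=True)


# Hypothetical engine output, already ranked by relevance.
results = [
    {"title": "API specification", "type": "specification", "score": 0.9},
    {"title": "Customer case study", "type": "case study", "score": 0.6},
    {"title": "Regional price list", "type": "price list", "score": 0.5},
]

for role in ("sales", "engineer"):
    print(role, [r["title"] for r in personalize(results, role)])
```

The same three documents come back in a different order for each role, and a role with no entry in the table sees the engine's original ranking.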
Find Out More
“As CTO of mShopper, a customer facing mobile price comparison and shopping service, we encountered some very complex issues. I engaged Mark Sprague and he provided invaluable content expertise and architectural advice for our business logic and our search engine tools.” Alec P. Karys, former CTO, mShopper
If you would like to find out more about the rest of the Enterprise SEO issues discussed in this article, send me an email at Mark@MSprague.com or call me now at 781-862-3126.
About Lexington eBusiness Consulting
Providing comprehensive Enterprise SEO services to the Boston community…
Mark Sprague’s 25 years of product development experience, which includes expertise in search engines, information products, SEO platforms and social networking applications, provides in-depth expertise to help you refine products and services and improve your website’s performance by:
- Developing a superior data-driven SEO strategy for your website.
- Understanding your customers’ search behavior and normalizing it to your content strategy.
- Understanding how search engine technology practically impacts SEO and content strategies.
- Understanding how search technology impacts content in a social networking environment.
- Developing a superior user experience based on sound information architecture, usability and coding standards.
Lexington eBusiness Consulting
Mark Sprague, CEO
580 Lowell Street
Lexington, MA 02420