A shift happened a little over a decade ago: organizations went from having no good full-text search options to having a few good ones. Almost overnight, people felt they could abandon the difficult process of entering metadata and instead rely on full-text search to find whatever they wanted.
It worked when it was tested, but in practice the results generally contained too many files. Sorting through them was as difficult as navigating the old structure or entering the metadata that made documents findable. In short, it worked at small scale but not at large scale, so the pendulum swung back toward a strategy of traditional findability and browsability approaches augmented with full-text search.
Today, we have tools at our disposal that can help us optimize the ways that we store and retrieve information to make it easier in both directions.
In the distant past (a few decades ago), the way you located documents inside a content management system was to search field by field. If you didn't have the locator number, you had no chance of finding the information you wanted. As a result, organizations invested in ensuring that the metadata was entered, and entered correctly. Quality control, double entry, and verification were the name of the game, and it was a big game.
The problem was that getting the right metadata in was expensive for anything except operational records. Operational records could be output from one system and stored in the content management system without human intervention or error. The indexed fields were consistently provided, because the interface populated them as it dropped files into the system.
Enter Full Text
A boon for finding documents that weren't emitted by a system, the introduction of full text promised that no one would have to enter the invoice number: the system would use optical character recognition (OCR) to find it and instantly display the document. OCR's limitations (accuracy hovering around 90 percent) and the sheer commonality of numbers quickly turned a search for a single document into hundreds of results, as users struggled to articulate that it was the invoice number they wanted, not the purchase order number, requisition number, or any of the thousand other serial numbers that occur in an organization.
Some organizations had already implemented prefixes to simplify identification and disambiguation, but many had not; those organizations started doing data entry again, and users started searching specific fields. The good news was that full-text search engines would accept a search against either the full text or the metadata and would prioritize metadata matches in the results. So, while there would still be hundreds of results, the invoice you were looking for was on top.
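The idea of ranking metadata matches above full-text matches can be sketched as field weighting. This is a minimal, hypothetical illustration: the field names, weights, and scoring scheme are assumptions for the example, not any particular product's behavior.

```python
# Illustrative sketch: rank metadata-field matches above full-text matches.
# Field names and weights are hypothetical, chosen only for this example.
FIELD_WEIGHTS = {"invoice_number": 10.0, "title": 5.0, "body_text": 1.0}

def score(document: dict, query: str) -> float:
    """Sum a weight for every field whose value contains the query text."""
    return sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if query.lower() in str(document.get(field, "")).lower()
    )

def search(documents: list[dict], query: str) -> list[dict]:
    """Return matching documents, with metadata hits ranked first."""
    hits = [d for d in documents if score(d, query) > 0]
    return sorted(hits, key=lambda d: score(d, query), reverse=True)
```

With this weighting, a document whose `invoice_number` field matches the query outranks one that merely mentions the same number somewhere in its body text, so the invoice you were looking for lands on top even when the result list is long.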
Soon after, search engines gained the capacity to take those metadata fields and use them to refine a search. You could search for ZIP code 01234, and the results would appear alongside refiners that could be used to filter them. If the results came from five customers, those five customers would appear in the refiners pane and, importantly, the other 30,000 customers' names would not. This filtered out extraneous noise and let users narrow the results using a short list that was easy to process.
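The key property of refiners is that they are computed from the current result set, not from the whole index. A minimal sketch, assuming a simple list-of-dicts result set and an illustrative `customer` field:

```python
from collections import Counter

def build_refiners(results: list[dict], facet_field: str) -> list[tuple[str, int]]:
    """Count facet values that actually occur in the results, most common
    first. Values absent from the results never appear in the pane."""
    counts = Counter(r[facet_field] for r in results if facet_field in r)
    return counts.most_common()

def refine(results: list[dict], facet_field: str, value: str) -> list[dict]:
    """Filter the current results down to a single facet value."""
    return [r for r in results if r.get(facet_field) == value]
```

For a ZIP-code search that matched documents from five customers, `build_refiners` yields exactly those five names with their counts, and each click calls `refine` to narrow the list further.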
While clicks are generally a bad thing, the added value of a focused list was worth the clicks and the wait for the revised search to complete. In short, it was quicker and easier than users trying to sort out the right result for themselves. After a few refinements, the results list was small, or the item you really wanted was on top.
While the clicks spent on refiners are valuable to the person looking for a document, the clicks needed to get from a result to what they really want are not. In some cases, the search result would simply lead to a container that held what the person was looking for. This happens particularly often when the returned page carries a listing or summary of the information, but it also occurs when the folder or container has metadata that the items themselves lack.

Soon, users wonder why they must make all these extra clicks at the end of a search. It feels wasteful and frustrating.
The extra-clicks problem is particularly frustrating when people don't know exactly which document they need and are forced to spend a few clicks per result evaluating whether it is what they wanted. This pogo-sticking problem is one reason modern search results include previews of the document right in the results window (usually as a popup), so that users can hover over a result and, ideally, discover quickly whether it is what they want.

Extra clicks between the results and the desired documents break this functionality and make it harder for users to get what they need.
A slick search is aware of the context in which you issued the query and of the social network around you, uses both to identify what you're most likely looking for, and suggests corrections for common mistakes. More importantly, it leverages the metadata on containers and pushes the appropriate metadata down to the documents, so that search results are the actual documents users are looking for rather than the containers that hold them.
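Pushing container metadata down to documents can happen at indexing time. The sketch below is an assumption-laden illustration (the field names and the merge rule are hypothetical): each document inherits its folder's fields, but a field the document already defines for itself is never overwritten.

```python
def inherit_metadata(container: dict, documents: list[dict]) -> list[dict]:
    """Copy container-level fields onto each document before indexing,
    without overwriting fields the document already defines itself."""
    inherited = {k: v for k, v in container.items() if k != "documents"}
    # Document fields come second in the merge, so they take precedence.
    return [{**inherited, **doc} for doc in documents]
```

Indexed this way, a search on the folder's metadata (say, a project name tagged only on the folder) returns the documents themselves, eliminating the extra click through the container.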
Getting to a slick, streamlined search is about making it easy for people to get the right result. That means providing easy ways to reduce the results to a manageable number and evaluate individual results quickly.