Search is big business, and getting bigger every day. Just a few years ago, searching meant typing something into a textbox. Now search encompasses text, voice, music, photos, videos, products, and so much more. Shortly before the turn of the millennium there were only 3.5 million Google searches per day. Today (according to the top result for the search term 2020 google searches per day) that figure could be as high as 5 billion and rising, more than 1,000 times as many. That’s not to mention all the billions of Tinder profiles, Amazon products, and Spotify playlists searched by millions of people every day from their phones, computers, and virtual assistants.
Frameworks like Elastic and Apache Solr are examples of symbolic search systems: they let developers write the rules and create pipelines for searching products, people, messages, or whatever the company needs.
Let’s take Shopify for example. They use Elastic to index and search through millions of products across hundreds of categories. This couldn’t be done out-of-the-box or with a general-purpose search engine like Google. They have to take Elastic and write specific rules and pipelines to index, filter, sort, and rank products by a variety of criteria, and then convert this data into symbols that the system can understand. Hence the name, symbolic search.
You and I know that if you search for “red nike sneakers” you want, well, red Nike sneakers. Those are just words to a typical search system though. Sure, if you type them in you’ll hopefully get what you asked for, but what if those sneakers are tagged as trainers? Or even tagged as scarlet for that matter? In cases like this, a developer needs to write rules to further customize the search engine: (1) Red is a color, (2) Scarlet is a synonym of red, (3) Nike is a brand, (4) Sneakers are a type of footwear, (5) Another name for sneakers is trainers.
In summary, with symbolic search you have to explain every little thing to the system before it can deliver a good search experience.
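To make the five rules above concrete, here is a minimal sketch of what hand-written symbolic rules might look like. It is plain Python rather than Elastic or Solr configuration, and the names SYNONYMS, ATTRIBUTES, normalize, parse, and search are all hypothetical, chosen only for illustration.

```python
# Hypothetical hand-written rules. A real deployment would configure
# these inside the search engine rather than in application code.
SYNONYMS = {"scarlet": "red", "trainers": "sneakers"}
ATTRIBUTES = {"red": "color", "nike": "brand", "sneakers": "footwear"}


def normalize(text):
    """Lowercase, split on whitespace, and map synonyms to canonical terms."""
    return [SYNONYMS.get(token, token) for token in text.lower().split()]


def parse(text):
    """Label each recognised token with its attribute; ignore everything else."""
    return {ATTRIBUTES[t]: t for t in normalize(text) if t in ATTRIBUTES}


def search(query, products):
    """Return products whose parsed attributes include every one in the query."""
    wanted = parse(query)
    return [p for p in products if wanted.items() <= parse(p).items()]


products = ["Scarlet Nike trainers", "Blue Nike sneakers", "Red Adidas sneakers"]
matches = search("red nike sneakers", products)
print(matches)  # ['Scarlet Nike trainers']
```

Note what happens with a query like "blue sneakers": nobody has told the system that blue is a color, so the word is silently ignored and all three products come back. Every gap like this needs another rule.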
An easier way would be a search system trained on existing data. If you train a system on enough different scenarios beforehand (i.e. a pre-trained model), it develops a generalized ability to find results that match inputs, whether they’re Tumblr GIFs, sentences from Wikipedia, or Pokémon images. You can plug the model directly into your system and start indexing and searching right away.
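As a rough sketch of the idea, a neural search system represents both the query and the indexed items as vectors and returns the items whose vectors lie closest to the query. The example below assumes cosine similarity as the closeness measure, and the three-dimensional vectors are invented by hand for illustration; in a real system a pre-trained model would produce them. The names INDEX, cosine, and nearest are hypothetical.

```python
import math

# Toy vectors, made up for illustration only. In practice a pre-trained
# model would turn each product description into a vector like these.
INDEX = {
    "scarlet nike trainers": [0.9, 0.8, 0.1],
    "blue denim jacket": [0.1, 0.2, 0.9],
    "crimson running shoes": [0.8, 0.6, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, index, top_k=2):
    """Rank indexed items by similarity to the query and keep the top_k."""
    ranked = sorted(
        index,
        key=lambda name: cosine(query_vector, index[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Assumed vector for the query "red nike sneakers" (also invented).
query_vector = [0.9, 0.7, 0.1]
results = nearest(query_vector, INDEX)
print(results)  # ['scarlet nike trainers', 'crimson running shoes']
```

No synonym or attribute rules appear anywhere in this sketch: if the model places "scarlet" near "red" and "trainers" near "sneakers", the match falls out of the vector comparison.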
Compared to symbolic search, neural search: (1) Removes the fragile pipeline, making the system more resilient and scalable, (2) Finds a better way to represent the underlying semantics of products and search queries, (3) Learns as it goes along, so it improves over time.