TL;DR: Elasticsearch and Solr have baked-in machine learning features for making search results more relevant and for detecting anomalies.
The week before last I made a mistake …
… I was attending a not-so-hush-hush meetup of 25 Raleigh product managers, where no secret handshakes were exchanged, nor were there covert conversations about overthrowing our scrum master overlords.
Instead, we engaged in a brainstorming-like activity to fuel a conversation on product management tools: first listing our cherished contrivances, then grouping said apps by category, and finally picking off outliers for discussion.
As one would expect with any cohort of product people, the analytics category, its borders drawn in erasable marker, was replete with Post-it® notes naming favorites such as Tableau, Google Analytics, and Pendo.io.
Barely clinging to the boundary line for the analytics category were TensorFlow, R, and Python.
About two sentences into an exchange about how to implement such programming tools, the topic took a sudden turn, a pivot inspired by the lack of product-manager-centric machine learning tools that would rescue us from the sea of data in which many of us are drowning.
It was at this point I made my misstatement …
“… we’re just now seeing such tools emerge in the area of search. There is a newly and rapidly growing convergence in the area of search-based products and machine learning.”
I’m sure it sounded impressive to my peers, but the fact is this convergence between search technologies and machine learning — while indeed rapidly growing — is nothing new. Here are just two examples of many:
- How Bloomberg Integrated Learning-to-Rank into Apache Solr
- Machine Learning for Smarter Search With Elasticsearch
While I’m sure there also exist examples outside of the Apache Lucene ecosystem, my experience is with Solr and Elasticsearch, so let’s limit our discussion to how these two search technologies already offer baked-in machine learning goodness.
Let’s start with Elasticsearch. Despite its name, it is also a robust analytics tool whose Graph feature explores relationships in indexed data as vertices and connections, an idea borrowed from graph theory. No surprise then that Elastic’s Prelert features support time series anomaly detection via unsupervised learning, for example:
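To give a feel for what “unsupervised” means here, below is a minimal Python sketch. It is not Prelert’s actual modeling, just a simple rolling z-score check that flags a point when it strays too far from the recent past, with no labeled examples required. The function name `find_anomalies`, the `window` and `threshold` parameters, and the usage numbers are all made up for illustration.

```python
from statistics import mean, stdev


def find_anomalies(series, window=5, threshold=3.0):
    """Return the indexes of points that sit more than `threshold`
    standard deviations from the mean of the preceding `window` points.

    Illustrative only: a rolling z-score, not Prelert's algorithm.
    """
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as unusual.
            if series[i] != mu:
                anomalies.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical daily usage counts for a feature, with one obvious spike.
daily_usage = [100, 102, 98, 101, 99, 103, 100, 250, 101, 99]
print(find_anomalies(daily_usage))
```

The point for a product person is that nobody had to tell the code what “abnormal” looks like; it learns a baseline from the data itself, which is the same basic promise the anomaly detection features make at a much larger scale.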
Solr, similarly leveraging the power of the underlying Lucene library, implements a supervised learning approach to influence search ranking, as reflected at about the 20-minute mark of the following video:
And before I get calls from proponents and/or salespeople of either, I should note that both products continue to expand in both supervised and unsupervised learning: Elasticsearch recently with a means of training search rank results, and Solr with graph-traversal capabilities.
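For the curious, the supervised flavor of training search rank results can be sketched in a few lines of Python. This is a toy pointwise approach (a small logistic regression trained on relevant/not-relevant labels), not how Solr or Elasticsearch actually implement learning to rank. The two features, the training rows, and the document ids are hypothetical.

```python
import math


def train_pointwise_ranker(examples, epochs=500, learning_rate=0.1):
    """Fit logistic-regression weights on (features, label) pairs,
    where label is 1 for a relevant result and 0 otherwise."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            z = bias + sum(w * x for w, x in zip(weights, features))
            predicted = 1.0 / (1.0 + math.exp(-z))
            error = predicted - label
            # Stochastic gradient step on the log loss.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def rerank(candidates, weights, bias):
    """Order (doc_id, features) pairs by learned score, best first."""
    def score(item):
        _, features = item
        return bias + sum(w * x for w, x in zip(weights, features))
    return [doc_id for doc_id, _ in sorted(candidates, key=score, reverse=True)]


# Hypothetical features per result: [text_match_score, click_through_rate].
# Labels are human judgments: 1 = relevant, 0 = not relevant.
training = [
    ([0.9, 0.1], 0),
    ([0.8, 0.2], 0),
    ([0.5, 0.9], 1),
    ([0.4, 0.8], 1),
    ([0.7, 0.7], 1),
    ([0.6, 0.1], 0),
]
weights, bias = train_pointwise_ranker(training)

candidates = [
    ("doc_a", [0.9, 0.1]),
    ("doc_b", [0.5, 0.9]),
    ("doc_c", [0.7, 0.5]),
]
print(rerank(candidates, weights, bias))
```

The difference from the unsupervised case is the labels: someone (or some click log) has to say which results were good, and the model learns how much weight each signal deserves when re-ordering what the search engine returns.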
What about MY Needs?!
So what does this mean to me, the product peep? Glad you asked!
As organizations continue to challenge product people to make more analytics-driven feature decisions, so too will the need grow for friendlier machine learning abstractions that do not require us to become full-blown data scientists.
Fortunately, a convergence between machine learning and search systems is well under way to make such tools available to us. Granted, some currently require a bit more geekery than others, but I suspect that’ll change.
Until then, the challenge for us as product managers is, at minimum, to understand whether we already have a search infrastructure in place that supports machine learning, and perhaps to know just enough data science to understand how the machine learning goodies baked into such search products could help deliver relevant features of value.
Personally, along with understanding feature adoption and/or usage anomalies, I’d love a product owner tool that would build effective user stories based on past JIRA or VSTS entries. I wonder if the Automated Insights folks in nearby Durham, NC have an idea of how to make that happen?