Weighted Random Selection with Elasticsearch

I recently came across an interesting Stack Overflow question and posted an answer without giving it the thought it deserved. The poster was storing documents in Elasticsearch and using function_score with its random_score option to randomly select a document from a set. This seemed fairly straightforward. The poster also wanted to boost certain documents, so my first response suggested using a query boost. However, what the poster really wanted was weighted random sampling, and upon further exploration I discovered that this was not simply a matter of constructing the right query, as I had initially thought.
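To make the distinction concrete, here is a minimal sketch in Python. The first part builds the kind of request body the poster was likely using (function_score with random_score over a match_all query). The second part shows what weighted random sampling means, done client-side with the standard library. The `weight` field, the sample documents, and the `weighted_pick` helper are assumptions for illustration only; this is not the solution from the Stack Overflow thread.

```python
import json
import random

# Request body that scores every matching document randomly and returns one.
# Which index it is sent to, and with which client, is deliberately left out.
random_pick_query = {
    "size": 1,
    "query": {
        "function_score": {
            "query": {"match_all": {}},
            "random_score": {},
        }
    },
}


def weighted_pick(docs, rng):
    """Pick one document with probability proportional to its 'weight'.

    'weight' is a hypothetical field name used only in this sketch.
    """
    weights = [doc["weight"] for doc in docs]
    return rng.choices(docs, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(json.dumps(random_pick_query, indent=2))

    # Hypothetical documents: "b" should be chosen about three times as often as "a".
    docs = [{"id": "a", "weight": 1}, {"id": "b", "weight": 3}]
    rng = random.Random(42)
    picks = [weighted_pick(docs, rng)["id"] for _ in range(10_000)]
    print("share of b:", picks.count("b") / len(picks))
```

With uniform random scoring every document is equally likely to come first. Weighted sampling instead makes each document's chance proportional to its weight, which is the behaviour the poster was after.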

Indexing HTML content in Elasticsearch

Notes on Elasticsearch’s html_strip filter

Elasticsearch allows you to create custom analyzers, giving you control over how your content is indexed. What if that content is HTML? Perhaps you need to index content submitted via rich text editors in WordPress forms, or you might be improving the search experience for your intranet. Elasticsearch’s html_strip character filter is handy in these types of scenarios.
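As a rough sketch of what such an analyzer can look like, the Python snippet below builds index settings in which html_strip is listed under `char_filter`, so markup is removed before the tokenizer runs. The analyzer name `html_content`, the choice of the standard tokenizer, and the lowercase token filter are assumptions made for this example, not settings taken from the post.

```python
import json

# Index settings defining a custom analyzer that strips HTML before tokenizing.
# "html_content" is a hypothetical analyzer name chosen for this sketch.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "html_content": {
                    "type": "custom",
                    "char_filter": ["html_strip"],  # runs first, on the raw text
                    "tokenizer": "standard",
                    "filter": ["lowercase"],        # token filters run last
                }
            }
        }
    }
}

if __name__ == "__main__":
    # Print the body that would be sent when creating the index.
    print(json.dumps(index_settings, indent=2))
```

A field mapped with this analyzer would then be indexed from its text content rather than from its tags.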