Elasticsearch is a great search engine, but working with raw JSON and curl does not feel very Pythonic. Fortunately there are two libraries you can use instead, and in today's article I'll focus on them :) Check it out!
S0-E21/E30 :)
Elasticsearch python wrappers
How to start with Python wrappers for the Elasticsearch engine?
That's pretty easy. First, we need to set up our Elasticsearch engine, either by running the jar file with Java or by letting Docker handle this heavy topic, which I covered in my previous post.
Once that's done, we can start tinkering with it using two Python wrappers.
The first one is the official client, and it's called low-level because it uses a simple API that just makes requests with JSON data (converted from Python dicts). This one is called Elasticsearch-py.
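To make the "converted from Python dicts" part concrete, here is a minimal sketch of that conversion using only the standard library. The client does this serialization for you; `to_request_body` is a hypothetical name I made up for illustration and is not part of elasticsearch-py.

```python
import json


def to_request_body(document):
    """Serialize a Python dict into the JSON text that goes over HTTP.

    Hypothetical helper for illustration; not part of elasticsearch-py.
    """
    # sort_keys only makes the output deterministic for this example
    return json.dumps(document, sort_keys=True)


blog_post = {"author": "Anselmos", "content": "A new Blog-Post Content!"}
body = to_request_body(blog_post)
print(body)
```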
The second one is a bit more complex, but also very user-friendly and rather easy to learn. It takes a different approach: it builds on the official library and lets you chain method calls, which makes for a very data-flow-friendly API. This one is called Elasticsearch-dsl-py.
Let's check them out!
Elasticsearch-py
Dependencies
The only one is the library itself - you can install it with:
pip install elasticsearch
Example
This is an example of data that you could put in a blog-post:
import datetime
from elasticsearch import Elasticsearch
client = Elasticsearch('localhost')
blog_post = {
"author": "Anselmos",
"date": datetime.datetime.now().strftime("%Y-%m-%d"),
"content": "A new Blog-Post Content!",
}
res = client.index(index="test-index", doc_type='blogpost', id=1, body=blog_post)
print(res)
What do you think the output of this script will be?
And now fetching the document back by its id:
from elasticsearch import Elasticsearch
client = Elasticsearch('localhost')
res = client.get(index="test-index", doc_type='blogpost', id=1)
print(res)
What do you think, will it output my data?
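The response comes back as a plain dict. Here is a rough sketch of pulling the stored document out of it. The `sample_response` below is a hand-written stand-in (its values are my assumptions, not real output), but `found` and `_source` are the keys Elasticsearch uses in a get response. `extract_source` is a hypothetical helper.

```python
def extract_source(response):
    """Return the stored document from a get-style response, or None."""
    if not response.get("found", False):
        return None
    return response.get("_source")


# Hand-written stand-in for a real response; values are illustrative only.
sample_response = {
    "_index": "test-index",
    "_id": "1",
    "found": True,
    "_source": {"author": "Anselmos", "content": "A new Blog-Post Content!"},
}

print(extract_source(sample_response))
```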
Elasticsearch-dsl-py
Dependencies
The only one is the library itself - you can install it with:
pip install elasticsearch-dsl
Search Query!
This library has more advanced features that come in handy, especially for advanced searches with filtering.
To make better use of this library, let's first create some data to search through:
from elasticsearch import Elasticsearch
import random
import datetime
from elasticsearch import helpers
client = Elasticsearch('localhost')
tag = ['blog', 'anselmos', 'elk', 'elastic', 'elasticsearch', 'elasticstack', 'elasticsearch-py', 'elasticsearch-dsl']
author = ['Anselmos', 'Bartosz Witkowski']
docs = [
{
"_index": "blogpost-{}".format(datetime.datetime.now().strftime("%Y-%m-%d")),
"_type": "blogpost",
"_id": x,
"_source": {
"author": author[random.randrange(0, len(author))],
"date": datetime.datetime.now().strftime("%Y-%m-%d"),
"content": "A new Blog-Post Content!",
"tag": tag[random.randrange(0, len(tag))]
}
}
for x in range(10)
]
helpers.bulk(client, docs)
That's elasticsearch-py code that creates 10 blog posts with the same content, but with randomly picked tags and authors.
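Since `helpers.bulk` accepts any iterable of actions, the same documents can also be produced lazily by a generator. This is only a sketch under that assumption: `make_actions`, `TAGS` and `AUTHORS` are hypothetical names, and the shortened tag list is just for illustration.

```python
import datetime
import random

TAGS = ["blog", "elk", "elasticsearch"]
AUTHORS = ["Anselmos", "Bartosz Witkowski"]


def make_actions(count, rng=random):
    """Yield one bulk action per blog post (hypothetical helper)."""
    today = datetime.date.today().strftime("%Y-%m-%d")
    for doc_id in range(count):
        yield {
            "_index": "blogpost-{}".format(today),
            "_type": "blogpost",
            "_id": doc_id,
            "_source": {
                "author": rng.choice(AUTHORS),
                "date": today,
                "content": "A new Blog-Post Content!",
                "tag": rng.choice(TAGS),
            },
        }


# Materialized here only so the result can be inspected without a cluster.
actions = list(make_actions(3))
print(len(actions))
```

With a running cluster, something like `helpers.bulk(client, make_actions(10))` should consume the generator directly.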
Let's say we want to find the posts tagged blog and the posts whose author is Anselmos.
With the low-level API this would be:
from elasticsearch import Elasticsearch
client = Elasticsearch('localhost')
queries = []
queries.append(
{"query": {"bool": {"should": [{"match": {"tag": {"query": "blog"}}}]}}}
)
queries.append(
{"query": {"bool": {"should": [{"match": {"author": {"query": "Anselmos"}}}]}}}
)
# msearch expects header/body pairs; an empty header falls back to the index passed below
body = []
for each in queries:
    body.append({})
    body.append(each)
# Each query runs as a separate search, so the response holds one result set per query
res = client.msearch(index="blogpost-2018-02-23", doc_type='blogpost', body=body)
print(res)
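The multi-search endpoint expects newline-delimited JSON in which every search is a header line followed by a query line. Here is a minimal sketch of building that payload by hand with the standard library, in case you want to see what goes over the wire. `build_msearch_body` is a hypothetical helper, and the simplified match queries are my own shorthand.

```python
import json


def build_msearch_body(queries, index=None):
    """Return an NDJSON string: one header line, then one query line, per search."""
    lines = []
    for query in queries:
        # An empty header means "use the index from the request URL"
        header = {"index": index} if index else {}
        lines.append(json.dumps(header))
        lines.append(json.dumps(query))
    # The payload has to end with a newline
    return "\n".join(lines) + "\n"


queries = [
    {"query": {"match": {"tag": "blog"}}},
    {"query": {"match": {"author": "Anselmos"}}},
]
body = build_msearch_body(queries, index="blogpost-2018-02-23")
print(body)
```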
And how about our high-level library?
Check it out:
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search
client = Elasticsearch('localhost')
s = Search(using=client).query('match', tag='blog').query('match', author='Anselmos')
response = s.execute()
print(response)
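Chaining two `.query()` calls combines them so that both conditions have to match. As far as I understand the library, the search above corresponds roughly to the raw body sketched below; calling `to_dict()` on the `Search` object is a way to check what is really generated. `combined_query` is a hypothetical helper written just for this comparison.

```python
def combined_query(tag, author):
    """Build a bool/must query that requires both matches (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"tag": tag}},
                    {"match": {"author": author}},
                ]
            }
        }
    }


body = combined_query("blog", "Anselmos")
print(body)
```

A dict like this could be handed to the low-level client as the `body` of a search call, which shows how much typing the high-level library saves.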
Thanks!
That's it :) Comment, share or don't :)
If you have any suggestions about what I should blog about in the next articles, please give me a hint :)
See you tomorrow! Cheers!