Know the basics of Index and Document APIs in Elasticsearch.

Shagun
Aug 14, 2020

When we store data in Elasticsearch and want to query it, the query does not run against the whole cluster; it runs against an index. An index (plural: indices) is the first thing we need to create before we can store any data. It is a collection of JSON documents, together with some settings and mappings.

  • Indexing is the operation of putting a document into an index.
  • A document is a single unit of data that is stored in an index and can then be queried.

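REST APIs are used to interact with Elasticsearch, and the general pattern is:

http://[host:port]/[index]/[type]/[action or id]

In Elasticsearch 7.x, custom mapping types are deprecated, so the [type] segment is normally the fixed endpoint name _doc, as in the document examples later in this post.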
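As a rough illustration of the pattern, the following Python sketch assembles such URLs from their parts. The helper name build_url is hypothetical, and the two example paths are the ones used later in this post; nothing is sent over the network.

```python
from urllib.parse import quote


def build_url(host: str, port: int, *segments: str) -> str:
    """Join host, port and path segments into an Elasticsearch-style URL.

    Hypothetical helper for illustration only; it builds a string and
    does not contact any server.
    """
    # Percent-encode each segment so that an unusual id cannot break the path.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"http://{host}:{port}/{path}"


list_url = build_url("localhost", 9200, "_cat", "indices")
doc_url = build_url("localhost", 9200, "test_index", "_doc", "1")
print(list_url)  # http://localhost:9200/_cat/indices
print(doc_url)   # http://localhost:9200/test_index/_doc/1
```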

Naming Restrictions

Elasticsearch indices have the following naming restrictions:

  • All letters must be lowercase.
  • Index names cannot begin with "_", "-", or "+".
  • Index names cannot contain spaces, commas, or any of the characters : " * / \ | ? # > <
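A small client-side check can catch a bad name before a request is sent. The sketch below is only an approximation of the rules: it treats "+" as forbidden at the start of a name only, which is how the Elasticsearch create-index documentation states it, and it ignores other limits such as maximum length. The function name index_name_problems is hypothetical.

```python
from typing import List

# Characters that must not appear anywhere in an index name.
FORBIDDEN_CHARS = set(' ,:"*/\\|?#><')
# Characters that an index name must not start with.
FORBIDDEN_PREFIXES = ("_", "-", "+")


def index_name_problems(name: str) -> List[str]:
    """Return a list of rule violations; an empty list means the name passed."""
    problems = []
    if name != name.lower():
        problems.append("must be lowercase")
    if name.startswith(FORBIDDEN_PREFIXES):
        problems.append("must not start with '_', '-' or '+'")
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append(f"contains forbidden characters: {bad}")
    return problems


for candidate in ["test_index", "Test_Index", "_bad name"]:
    print(candidate, "->", index_name_problems(candidate) or "ok")
```

Elasticsearch itself remains the final authority: a name that passes this check can still be rejected by the server.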

An index has three main components:

  • Alias
  • Mapping
  • Settings

Index API

List Index API

To list all the indices, send a GET request to this URL:

http://localhost:9200/_cat/indices

Here, localhost is the host, and 9200 is the default Elasticsearch port.

Create Index

To create an index, send a PUT request to this URL:

http://localhost:9200/test_index

  • Here, test_index is the index name.

If you send the request from the Kibana console, it looks like this:

PUT /test_index

and you get this response:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "test_index"
}

Now, if you GET the index with

GET test_index

you get this response:

{
  "test_index": {
    "aliases": {},
    "mappings": {},
    "settings": {
      "index": {
        "creation_date": "1597424320395",
        "number_of_shards": "1",
        "number_of_replicas": "1",
        "uuid": "6UamFjY_RkOT5bhmg-ZsRQ",
        "version": {
          "created": "7080099"
        },
        "provided_name": "test_index"
      }
    }
  }
}
  • Here, "provided_name": "test_index" shows the index name.
  • The aliases and mappings are empty because we did not provide any when creating the index.
  • The index uses the default settings because we did not provide any settings explicitly.

In the settings, there are two important fields:

  • number_of_shards: the number of primary shards the index is split into.
  • number_of_replicas: the number of replica copies kept for each primary shard.
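These two settings can also be supplied when the index is created, instead of relying on the defaults. The Python sketch below only builds and prints such a request body; the values 3 and 2 are arbitrary examples, and the helper name create_index_body is hypothetical.

```python
import json


def create_index_body(shards: int, replicas: int) -> str:
    """Build the JSON body for a create-index request with explicit settings."""
    if shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("the number of replicas cannot be negative")
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return json.dumps(body, indent=2)


# Printed in the same form as a Kibana console request.
print("PUT /test_index")
print(create_index_body(shards=3, replicas=2))
```

Keep in mind that number_of_replicas can be changed on an existing index, while number_of_shards is fixed when the index is created.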

You can find the full list of index APIs in the Elasticsearch documentation.

Indexing Documents

An index holds JSON documents, and each document can contain one or more fields.

How to insert a document into the index?

http://localhost:9200/test_index/_doc/1

Here, test_index is the name of the index, _doc is the document endpoint, and 1 is the id that we have chosen for the document.

If we do not provide an id, Elasticsearch generates one automatically.
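The difference between the two cases shows up in the request line. The sketch below is a hypothetical helper (index_request is not part of any client library) that picks the method and path: with an explicit id it uses PUT on /index/_doc/id (POST is accepted there as well, as in the example that follows), and without an id it uses POST on /index/_doc so that Elasticsearch can assign one.

```python
import json
from typing import Optional, Tuple


def index_request(index: str, document: dict,
                  doc_id: Optional[str] = None) -> Tuple[str, str, str]:
    """Return (method, path, body) for indexing one document.

    Illustration only: the request is described, not sent.
    """
    body = json.dumps(document)
    if doc_id is None:
        # No id given: Elasticsearch generates one.
        return ("POST", f"/{index}/_doc", body)
    # Explicit id: the document is stored under exactly this id.
    return ("PUT", f"/{index}/_doc/{doc_id}", body)


detective = {"firstname": "Sherlock", "lastname": "Holmes"}
print(index_request("test_index", detective, "1"))
print(index_request("test_index", detective))
```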

Index document

In Kibana, the request looks like this:

POST test_index/_doc/1
{
  "firstname": "Sherlock",
  "lastname": "Holmes"
}

This is the response we get:

{
  "_index": "test_index",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}

In the response, we can see that the version is 1 and the result is "created".

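Now, we can retrieve the document either by searching the index,

GET test_index/_search

or by fetching it directly by id:

GET test_index/_doc/1

The second request returns this response (the _search request returns the same document wrapped in a list of hits):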

{
  "_index": "test_index",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "firstname": "Sherlock",
    "lastname": "Holmes"
  }
}
  • In the response, we can see that _id is 1, the id we provided, and that the version of the document is 1.
  • Elasticsearch keeps a version number for each document and increments it every time the document changes.
  • The data that we indexed is inside the _source field.
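On the client side, this response is ordinary JSON, so the metadata and the original data can be read out with any JSON parser. A minimal Python sketch, using the response above as a hard-coded string in place of a real HTTP call:

```python
import json

# The GET response from above, pasted in as text for illustration.
response_text = """
{
  "_index": "test_index",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "firstname": "Sherlock",
    "lastname": "Holmes"
  }
}
"""

response = json.loads(response_text)

# "found" tells us whether a document with this id exists.
if response["found"]:
    document = response["_source"]  # the data we indexed
    print("id:", response["_id"], "version:", response["_version"])
    print("name:", document["firstname"], document["lastname"])
else:
    print("no such document")
```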

Update Document

In Elasticsearch, we can update a document in two ways:

  • Doc (a partial document that is merged into the existing one)
  • Script

Doc update

POST test_index/_update/1
{
  "doc": {
    "address": "221B"
  }
}

If we now GET this document, we get this response:

{
  "_index": "test_index",
  "_type": "_doc",
  "_id": "1",
  "_version": 2,
  "_seq_no": 1,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "firstname": "Sherlock",
    "lastname": "Holmes",
    "address": "221B"
  }
}

Here, we can see that the address field has been added to the document and that the version has increased to 2.
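The effect of a doc update can be pictured as merging the partial document into the stored _source. The Python sketch below is a local model of that behaviour, not Elasticsearch code: it assumes that nested objects are merged recursively and that any other value (including a list) simply replaces the old one. The function name merge_partial is hypothetical.

```python
import copy


def merge_partial(source: dict, partial: dict) -> dict:
    """Return a new dict with `partial` merged into `source`.

    Local model of a doc update; the inputs are left unchanged.
    """
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars, lists and new fields overwrite or extend the source.
            merged[key] = copy.deepcopy(value)
    return merged


before = {"firstname": "Sherlock", "lastname": "Holmes"}
after = merge_partial(before, {"address": "221B"})
print(after)
```

Existing fields that the partial document does not mention are left untouched, which is why firstname and lastname survive the update.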

Script update

To remove the lastname field, we can use a script:

POST test_index/_update/1
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.remove('lastname')"
  }
}
  • Here, lang is the scripting language and source contains the script itself.
  • In the script, ctx._source refers to the _source field of the document.
  • remove is a function that takes the name of the field to remove as its parameter.
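When such a request is generated from code, building the body as a data structure and serializing it avoids quoting mistakes. The Python sketch below prints the same request and then mimics the script's effect on a plain dict; the dict operation is only a local stand-in for what the Painless script does on the server.

```python
import json

# The script-update body from above, written as a Python dict.
update_body = {
    "script": {
        "lang": "painless",
        "source": "ctx._source.remove('lastname')",
    }
}

print("POST test_index/_update/1")
print(json.dumps(update_body, indent=2))

# Local stand-in for the script: drop the field from a copy of _source.
source = {"firstname": "Sherlock", "lastname": "Holmes", "address": "221B"}
source.pop("lastname", None)  # no error if the field is already absent
print(source)
```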

After this, when we GET the document again, we get:

{
  "_index": "test_index",
  "_type": "_doc",
  "_id": "1",
  "_version": 3,
  "_seq_no": 2,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "firstname": "Sherlock",
    "address": "221B"
  }
}

We can see that the lastname field has been removed and that the version of the document is now 3.

Delete document

To delete the document, send this request:

DELETE test_index/_doc/1

This is the response we get:

{
  "_index": "test_index",
  "_type": "_doc",
  "_id": "1",
  "_version": 4,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 3,
  "_primary_term": 1
}

Here, we can see the document is deleted successfully and the version is updated to 4.
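The version and sequence numbers seen in the responses so far (1 to 4 and 0 to 3) follow from one rule: every write that changes the document bumps both counters. The toy model below reproduces that sequence in Python. It is a simplification under stated assumptions: a single document in a single-shard index (so the per-shard _seq_no happens to track the document), and every update counted as a real change. The class ToyDocument is hypothetical and is not how Elasticsearch is implemented.

```python
class ToyDocument:
    """Tracks _version and _seq_no for one document id in a toy index."""

    def __init__(self):
        self.version = 0
        self.seq_no = -1
        self.source = None  # None means "no live document"

    def _bump(self, result: str) -> dict:
        # Every write operation increments both counters.
        self.version += 1
        self.seq_no += 1
        return {"_version": self.version, "_seq_no": self.seq_no, "result": result}

    def index(self, source: dict) -> dict:
        result = "created" if self.source is None else "updated"
        self.source = dict(source)
        return self._bump(result)

    def update(self, partial: dict) -> dict:
        if self.source is None:
            raise KeyError("document not found")
        self.source.update(partial)
        return self._bump("updated")

    def remove_field(self, field: str) -> dict:
        if self.source is None:
            raise KeyError("document not found")
        self.source.pop(field, None)
        return self._bump("updated")

    def delete(self) -> dict:
        # The delete is itself a write, so it gets a version of its own.
        self.source = None
        return self._bump("deleted")


doc = ToyDocument()
history = [
    doc.index({"firstname": "Sherlock", "lastname": "Holmes"}),
    doc.update({"address": "221B"}),
    doc.remove_field("lastname"),
    doc.delete(),
]
for entry in history:
    print(entry)
```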

Elasticsearch does not remove the document from disk immediately. It only marks the document as deleted, and the space is reclaimed later, when segments are merged in the background.

Delete Index

DELETE test_index

Response:

{
  "acknowledged": true
}

The index is now deleted, and all the documents in it are permanently deleted along with it.

Deleting an index cannot be undone.

Summary

These are the main concepts and APIs commonly used to work with indices and documents. There are other terms and concepts that we could not cover here, so read the Elasticsearch documentation for a more in-depth look at the Index and Document APIs.

Previous document: Know the basics of Elasticsearch.

My name is Shagun and I work as a Cloud Engineer. This blog is for easy understanding and implementation of Index and Document APIs. If you like it please leave feedback. Thanks for your time. Happy Reading!
