The ELK stack refers to three components: Elasticsearch, Logstash, and Kibana. The Elastic Stack is the broader term, which also includes other components such as Beats and X-Pack.
Elasticsearch
Elasticsearch is the heart of the Elastic Stack. It is a distributed, open-source, RESTful, highly scalable search and analytics engine built on the Apache Lucene library, and it works with all types of data, including textual, numerical, geospatial, structured, and unstructured.
Commonly used for:
- Product searching
- Autocomplete
- Log mining (Aggregations)
- Analytics
- Investigation
- Visualization
- Statistics
Cluster
- It is a collection of one or more servers.
- It allows searching and indexing across all nodes in the cluster.
- Each node is one running Elasticsearch instance, and each shard on a node is a Lucene index.
- Every cluster is identified by a unique name. (This is important in multi-cluster setups.)
Node
- A single server in a cluster is called a node.
- A node has a unique name in the cluster.
Index
- An index is a collection of documents.
- In Elasticsearch, "index" is both an entity and an operation. (Inserting a document into an index is called indexing.)
Document
A document is a single unit of information. It is in the form of JSON.
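As an illustration, a document can be built as a plain dictionary and serialized to JSON. This is only a sketch: the index name `products`, the field names, and the document ID are hypothetical, and the helper `build_index_request` merely prepares a request for the document index endpoint (`PUT /<index>/_doc/<id>`) without contacting a cluster.

```python
import json


def build_index_request(index, doc_id, document):
    """Return (method, path, body) for indexing one document.

    Only the request is prepared; nothing is sent to a cluster.
    """
    path = f"/{index}/_doc/{doc_id}"
    body = json.dumps(document, sort_keys=True)
    return "PUT", path, body


# Hypothetical product document; the field names are illustrative only.
product = {"name": "Trail running shoe", "price": 89.5, "in_stock": True}
method, path, body = build_index_request("products", "1", product)
print(method, path)
print(body)
```

Sending this request is what the Index section calls indexing: the JSON body becomes one document in the `products` index.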
Shards
- A shard is a partition (piece) of an index.
- Shards split the index horizontally.
- You define the number of primary shards when the index is created.
- The shard that handles writes is called the primary shard.
- In Elasticsearch, replication is done with replica shards.
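To make horizontal partitioning concrete, the sketch below routes a document to a primary shard by hashing a routing value (by default the document ID) and taking it modulo the number of primary shards. It is a simplified model, not Elasticsearch's implementation: Elasticsearch uses a Murmur3-based hash and a slightly more involved formula, `zlib.crc32` is only a stand-in, and `pick_shard` is a hypothetical helper. The settings body uses the `number_of_shards` and `number_of_replicas` index settings.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Map a routing value to a primary shard number.

    Simplified model: crc32 stands in for the hash Elasticsearch uses.
    """
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    digest = zlib.crc32(routing_value.encode("utf-8"))
    return digest % number_of_primary_shards


# Settings body that fixes the shard count when the index is created.
create_index_body = {
    "settings": {"number_of_shards": 3, "number_of_replicas": 1}
}

shards = create_index_body["settings"]["number_of_shards"]
for doc_id in ["1", "2", "3", "4"]:
    print(doc_id, "-> shard", pick_shard(doc_id, shards))
```

Because the result depends on the shard count, changing the number of primary shards would send existing IDs to different shards, which is one reason the count is set at index creation.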
Replica (Replica Shard)
- A replica contains the same data as its primary shard.
- A replica is never allocated to the same node as its primary shard.
- Allows for fault tolerance.
- Scales search throughput.
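A toy allocator can illustrate the rule that a replica never shares a node with its primary. This is a sketch under stated assumptions: the round-robin placement, the node names, and the `allocate` function are hypothetical, and the real allocator weighs many more factors.

```python
def allocate(number_of_shards, number_of_replicas, nodes):
    """Place primaries and replicas so copies of a shard never share a node.

    Returns (assigned, unassigned). Round-robin placement is a simplification.
    """
    assigned = {node: [] for node in nodes}
    unassigned = []
    for shard in range(number_of_shards):
        copies = ["primary"] + ["replica"] * number_of_replicas
        used = set()  # nodes that already hold a copy of this shard
        for offset, kind in enumerate(copies):
            # Start from a different node per copy to spread the load.
            candidates = [nodes[(shard + offset + i) % len(nodes)]
                          for i in range(len(nodes))]
            free = [n for n in candidates if n not in used]
            if free:
                assigned[free[0]].append((shard, kind))
                used.add(free[0])
            else:
                unassigned.append((shard, kind))
    return assigned, unassigned


# Two nodes: every copy finds a home.
print(allocate(2, 1, ["node-1", "node-2"]))
# One node: the replica has nowhere to go and stays unassigned.
print(allocate(1, 1, ["node-1"]))
```

The single-node case shows why a one-node cluster with replicas configured cannot place them: the only available node already holds the primary.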
Cluster Status
A cluster is always in one of three states, depending on how its primary and replica shards are allocated.
- Green, when all primary and replica shards are allocated.
- Yellow, when all primary shards are allocated but one or more replica shards are unallocated.
- Red, when one or more primary shards are unallocated.
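The three rules above can be sketched as a small function. The input shape (a list of dictionaries with `primary` and `assigned` flags) is an assumption made for illustration; a real cluster reports its status through the cluster health API (`GET /_cluster/health`).

```python
def cluster_status(shards):
    """Derive the cluster health colour from a list of shard copies.

    Each entry is a dict with 'primary' (bool) and 'assigned' (bool).
    """
    # Any unassigned primary means some data is unavailable.
    if any(s["primary"] and not s["assigned"] for s in shards):
        return "red"
    # All primaries are fine, but at least one replica is missing.
    if any(not s["primary"] and not s["assigned"] for s in shards):
        return "yellow"
    return "green"


example = [
    {"primary": True, "assigned": True},
    {"primary": False, "assigned": False},
]
print(cluster_status(example))  # yellow
```

The check for red comes first because an unallocated primary outranks any missing replica.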
Node Types
Master Eligible Node (Default: True)
- The elected master is responsible for cluster-wide management: operations such as creating and deleting indices, tracking the nodes in the cluster, and allocating shards.
Data Node (Default: True)
- Data nodes hold the shards. Indexing, deleting, searching, and other data operations are performed on data nodes.
Ingest Node (Default: True)
- Preprocessing of documents before indexing is done by the ingest node. (This is similar to what Logstash does.)
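Conceptually, ingest preprocessing passes each document through a chain of processors before it is indexed. The sketch below mimics that idea in plain Python; the helper names `lowercase`, `rename`, and `run_pipeline` and the sample fields are hypothetical, and this is not the Elasticsearch ingest pipeline API itself.

```python
def lowercase(field):
    """Return a processor that lowercases one field if present."""
    def process(doc):
        if field in doc:
            doc = {**doc, field: doc[field].lower()}
        return doc
    return process


def rename(old, new):
    """Return a processor that renames a field if present."""
    def process(doc):
        if old not in doc:
            return doc
        renamed = {k: v for k, v in doc.items() if k != old}
        renamed[new] = doc[old]
        return renamed
    return process


def run_pipeline(doc, processors):
    """Apply each processor in order and return the transformed copy."""
    for processor in processors:
        doc = processor(doc)
    return doc


raw = {"level": "ERROR", "msg": "disk full"}
cleaned = run_pipeline(raw, [lowercase("level"), rename("msg", "message")])
print(cleaned)
```

Each processor returns a new dictionary, so the raw document is left untouched and processors can be reordered or reused freely.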
Coordinating Only Node (Default: false)
- Coordinating-only nodes act as smart load balancers that route requests to the appropriate nodes.
- They also handle the search reduce phase, merging the results returned by each shard.
- They distribute bulk indexing requests.
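The search reduce step can be pictured as merging the sorted hit lists returned by each shard into one global top-N list. This is a minimal sketch of that idea, assuming each shard returns `(score, doc_id)` tuples sorted by descending score; `reduce_hits` and the sample data are hypothetical.

```python
import heapq


def reduce_hits(per_shard_hits, size):
    """Merge per-shard result lists into a global top `size`.

    Each shard list holds (score, doc_id) tuples already sorted by
    descending score, as a shard would return them.
    """
    # heapq.merge expects ascending order, so sort on the negated score.
    merged = heapq.merge(*per_shard_hits, key=lambda hit: -hit[0])
    return list(merged)[:size]


shard_a = [(9.1, "a1"), (4.0, "a2")]
shard_b = [(7.5, "b1"), (6.2, "b2")]
print(reduce_hits([shard_a, shard_b], 3))
# [(9.1, 'a1'), (7.5, 'b1'), (6.2, 'b2')]
```

Doing this merge on a dedicated coordinating-only node keeps the CPU and memory cost of the reduce step off the data nodes.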
Machine Learning Node
- Machine learning is an X-Pack feature that is not free.
- This node runs machine learning jobs and handles machine learning API requests.