Fault-tolerant Elasticsearch

Introduction

By default your Elasticsearch cluster is already fairly robust. A typical design uses one primary shard and one replica shard per index. You could run the cluster across two datacenters with low network latency between them, operating in both centers at once. You could also spread it across two racks of nodes.

But what happens if you lose one datacenter or one rack? Your cluster will likely go RED if you don't plan for it upfront.

Shard allocation awareness

There are multiple ways to design around a disaster, but one feature you definitely need to know about is Shard Allocation Awareness.

You can read more about it in Elastic's documentation on shard allocation awareness.

Basically, this feature lets your Elasticsearch cluster know about your physical topology, so it is smart enough to place a primary shard and its replica shards in two different zones. A zone can be a datacenter or a rack, as mentioned before.

You tell Elasticsearch about the topology by adding node attributes to your config file. In this example we will add a node attribute called datacenter to our elasticsearch.yml file, with two possible values: dc1 and dc2.

node.attr.datacenter: dc1
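
Alternatively, the same attribute can be set on the command line when starting the node, instead of editing elasticsearch.yml:

bin/elasticsearch -Enode.attr.datacenter=dc1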

Once you have added this attribute to all your nodes, you need to perform a rolling restart of the cluster for the attribute value to be picked up.
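
To verify that the attribute has been picked up on every node after the restart, you can query the cat nodeattrs API (the ?v parameter adds column headers):

GET _cat/nodeattrs?v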

Afterwards you need to enable the feature:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "datacenter"
  }
}

Shortly thereafter you will notice some shard activity in the cluster as the master rearranges your shards according to your topology. When the dust settles, you can rest assured that your indices are present in both datacenters.
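
If you want to see the result for yourself, the cat shards API shows which node each primary (p) and replica (r) shard landed on. The index name my-index below is just a placeholder:

GET _cat/shards/my-index?v&h=index,shard,prirep,state,node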

Forced shard allocation awareness

This all sounds good, but there is a problem. Suppose you now lose a datacenter (dc1). The cluster will do its best to recover: it will promote all the replica shards in dc2 to primaries and then start creating new replicas, also in dc2. This means you need double the disk space in each datacenter.

If you don't have the luxury of double disk space everywhere, then you should know about forced shard allocation awareness.

This is how you enable it. Notice that you now also specify the possible values of the datacenter attribute:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "datacenter",
    "cluster.routing.allocation.awareness.force.datacenter.values": "dc1,dc2"
  }
}

When you do this, Elasticsearch knows that you intend to keep your indices available on nodes tagged with these values. So when you lose all nodes in dc1, Elasticsearch will not try to recover everything into dc2. Instead, you will see the cluster go yellow with 50% of your shards unassigned, but the cluster will remain available and operate as before. When dc1 becomes available again, Elasticsearch will recover as normal.
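
You can confirm this state with the cluster health API. With dc1 down you would expect a response along these lines (the numbers are illustrative, and most fields are omitted here):

GET _cluster/health

{
  "status": "yellow",
  "active_shards_percent_as_number": 50.0,
  ...
}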

Additional benefits

This feature will do more for you than just help out in case of disaster. It can also help when you need to do a rolling cluster restart, a rolling cluster upgrade, or simple base OS patching.

Normally a rolling upgrade is done node by node, which is cumbersome and takes time. With forced shard allocation awareness, you can take e.g. 50% of your warm nodes out of service, patch them or change their config, and bring them back online, making cluster maintenance much faster.
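
As a sketch of that maintenance flow: before stopping the dc1 nodes, you can tell the cluster to stop allocating replacement replicas while they are offline, then re-enable allocation once the nodes are back:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}

... patch and restart the dc1 nodes ...

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}

Setting the value to null resets it to the default ("all"), after which normal shard allocation resumes.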

Summary

This setup is not for everyone. If you are really paranoid and have enough resources, you could also run your clusters in multiple locations and use cross-cluster replication (CCR) as your recovery plan. Examine your options and choose what fits you best.
