Backup Elasticsearch Indices By NFS

Suppose you run a critical Elasticsearch cluster. You need to know how to back it up. Correctly and quickly.

It’s not too difficult, but there are a few things you’d better know. And they are common to most database backups.

Check them out before they bite you.



Highlights

  • You need a shared folder for all ES nodes. One typical setup is NFS.
1. On one ES node, set up an NFS server with a big volume.
2. From all other ES nodes, mount it as an NFS client.
3. Create an ES repository.
4. Create an ES snapshot for selected indices, or for all of them.
  • Unmount NFS immediately once you don’t need it. The NFS service is troublesome. It might introduce unreasonably high CPU load on all nodes. Let me repeat: unreasonably high!
  • Always check the return code before running the next step of the procedure. Without this principle, any automation would be dangerous.
  • Full backup vs. incremental backup. rsync vs. scp.
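The return-code principle above can be sketched as a tiny helper. This is a minimal sketch; check_ok is a hypothetical name, and the demo steps stand in for the real mount/curl/rsync calls:

```shell
# Abort the whole procedure on the first failed step,
# instead of blindly running the next command.
set -e

check_ok() {
    # Hypothetical helper: run a command, stop loudly if it fails.
    if ! "$@"; then
        echo "FAILED: $*" >&2
        exit 1
    fi
}

# Example steps; in the real procedure these would be mount/curl/rsync calls.
check_ok mkdir -p /tmp/es-backup-demo
check_ok test -d /tmp/es-backup-demo
echo "all steps passed"
```

With curl, add -f so that HTTP errors also produce a non-zero return code.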

Previously I needed to migrate a big system from one data center to another. One major challenge was how to migrate a 3TB ES cluster (10TB of data) with minimum downtime.

Here is what I have done:

1. Perform the first round of backup and restore. No downtime for this.
2. Run a second round of backup. As an incremental backup, it's fast.
3. Use rsync to copy TBs of data across the WAN. Not scp.
4. Rsync from N nodes to M nodes is faster than 1-to-1.
5. Run the second round of restore in cluster2. It's relatively fast.
  • Record timing data for future reference: how large the original data and the backup set are, and how long each critical step (backup/copy/restore) takes.
# wrap curl with time command
time curl ...

# wrap rsync with time command
time rsync ...

Preparation

  1. Make sure path.repo is configured in /etc/elasticsearch/elasticsearch.yml for all nodes. Otherwise you won’t be able to create ES filesystem repository. [1]
  2. Prepare a volume. It should be big enough to hold your snapshots. Nowadays all modern cloud providers offer block storage services. Much better than before.
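For step 1, appending path.repo can be scripted like this. A minimal sketch: append_path_repo is a hypothetical helper, and the yml path may differ on your install.

```shell
# Append path.repo to elasticsearch.yml if it is not configured yet.
append_path_repo() {
    es_yml="$1"
    repo_path="$2"
    if ! grep -q '^path.repo' "$es_yml"; then
        echo "path.repo: [\"$repo_path\"]" >> "$es_yml"
    fi
}

# On a real node (needs root, and an ES restart afterwards):
# append_path_repo /etc/elasticsearch/elasticsearch.yml /usr/share/elasticsearch/repo
# service elasticsearch restart
```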

Procedure

1.1 Prepare env

# Customize the values for your env.
# NFS mount folder, which is also where we put the snapshot
export es_fs_mnt="/usr/share/elasticsearch/repo"
# The ip of ES node, which will host the NFS server
export nfs_server="138.168.244.34"
  • Verify elasticsearch.yml is configured correctly on all nodes.

If not, update elasticsearch.yml and restart the ES instance.

grep "path.repo.*${es_fs_mnt}" /etc/elasticsearch/elasticsearch.yml

1.2 Create A Volume, And Mount To One ES VM

# Customize this: the volume folder might be different for different cloud provider.
export volume_folder="/dev/disk/by-id/*volume*"

echo "$volume_folder"
sudo mkfs.ext4 -F "$volume_folder"

sudo mkdir -p "$es_fs_mnt"; 
sudo mount -o discard,defaults "$volume_folder" "$es_fs_mnt"; 

fstab_command="$volume_folder $es_fs_mnt ext4 defaults,nofail,discard 0 0"
echo "$fstab_command" | sudo tee -a /etc/fstab

cat /etc/fstab

sudo chown elasticsearch:elasticsearch "$es_fs_mnt"

# Create a dummy file
sudo touch "$es_fs_mnt/helloworld.txt"

ls -lth "$es_fs_mnt"

1.3 Setup NFS Server In That VM

# Install NFS server
apt-get install -y nfs-kernel-server

# Create NFS share folder
cat > /etc/exports <<EOF
$es_fs_mnt *(rw,sync,crossmnt,no_subtree_check,no_root_squash)
EOF

cat /etc/exports

# start nfs service
service nfs-kernel-server start
service nfs-kernel-server status

2.1 Setup NFS Client In All ES VMs

apt-get install -y nfs-common

mkdir -p "$es_fs_mnt"
chown elasticsearch:elasticsearch "$es_fs_mnt"
ls -lt $es_fs_mnt

3.1 Mount NFS Client

mount -t nfs "$nfs_server:$es_fs_mnt" "$es_fs_mnt"

# Here we shall see helloworld.txt
ls -lt "$es_fs_mnt"

3.2 Create ES Filesystem Repository In One Node

# Customize this
export es_ip="138.168.244.34"
# Customize this
export es_port="9200"
# Customize this
export repo_name="my_backup"

# Note: ES 6+ requires an explicit Content-Type header
curl -X PUT "http://$es_ip:$es_port/_snapshot/$repo_name" -H 'Content-Type: application/json' -d "{
    \"type\": \"fs\",
    \"settings\": {
        \"location\": \"$es_fs_mnt\",
        \"compress\": true,
        \"chunk_size\": \"10m\"
    }
}"

# List repo
curl -XGET "http://$es_ip:$es_port/_snapshot/_all"
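The repository creation call returns JSON like {"acknowledged":true}. Following the always-check-return-code rule from the highlights, here is a minimal sketch to verify it; check_acknowledged is a hypothetical helper:

```shell
# Verify that an ES write API call was acknowledged.
check_acknowledged() {
    # Hypothetical helper: succeed only if the response JSON
    # contains "acknowledged":true (whitespace tolerated).
    printf '%s' "$1" | grep -Eq '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Usage against a live cluster:
# resp=$(curl -s -XPUT "http://$es_ip:$es_port/_snapshot/$repo_name" ...)
# check_acknowledged "$resp" || { echo "repo creation failed: $resp" >&2; exit 1; }
```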

3.3 Create ES Snapshot In Previous Node

We can back up and restore snapshots for selected indices.[2]

# Customize this
export snapshot_name="snapshot_20170726"
# Customize this to backup selective indices
export es_index_list="my-index-123,my-index-234"

# create snapshot
# Here we use time to record how long the backup takes.
# Run it in a blocking way, with wait_for_completion=true
time curl -XPUT "http://$es_ip:$es_port/_snapshot/$repo_name/${snapshot_name}?wait_for_completion=true" -H 'Content-Type: application/json' -d "{
    \"indices\": \"$es_index_list\",
    \"ignore_unavailable\": true,
    \"include_global_state\": false
}"

ls -lth $es_fs_mnt

# keep watching status
watch "du -h -d 1 $es_fs_mnt"

# List snapshot
curl -XGET "http://$es_ip:$es_port/_snapshot/$repo_name/_all"
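The snapshot listing reports a state field, which ends up as SUCCESS when the snapshot finished cleanly. A small sketch to assert that; is_snapshot_success is a hypothetical helper:

```shell
# Check whether a snapshot listing reports state SUCCESS.
is_snapshot_success() {
    # Hypothetical helper: greps the listing JSON for "state":"SUCCESS".
    printf '%s' "$1" | grep -Eq '"state"[[:space:]]*:[[:space:]]*"SUCCESS"'
}

# Usage against a live cluster:
# listing=$(curl -s -XGET "http://$es_ip:$es_port/_snapshot/$repo_name/_all")
# is_snapshot_success "$listing" && echo "snapshot completed"
```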

4.1 Unmount All NFS Clients

es_fs_mnt="/usr/share/elasticsearch/repo"
umount "$es_fs_mnt"
ls -lth "$es_fs_mnt"

Command Cheat Sheet

List Repository And Snapshot

curl -XGET "http://$es_ip:$es_port/_snapshot/_all"
curl -XGET "http://$es_ip:$es_port/_snapshot/$repo_name/_all"

Check Snapshot Status

A backup could take hours. Check the snapshot status.[3]

curl -XGET "http://$es_ip:$es_port/_snapshot/$repo_name/$snapshot_name/_status"
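When backups run for hours, a retry loop around the status call is handy. A minimal sketch: wait_until is a hypothetical helper, and the commented curl line is illustrative.

```shell
# Retry a command until it succeeds, or give up after N attempts.
wait_until() {
    attempts="$1"; shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage: poll until the status endpoint reports the snapshot as SUCCESS.
# wait_until 3600 sh -c \
#   "curl -s http://$es_ip:$es_port/_snapshot/$repo_name/$snapshot_name/_status | grep -q SUCCESS"
```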

Restore Snapshot

You need to close the indices in the target ES cluster before restoring.

curl -XPOST "http://$es_ip:$es_port/$index_name/_close"
curl $es_ip:$es_port/_cat/indices?v
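Closing every index from the comma-separated list can be sketched like this; close_indices is a hypothetical helper, and the real curl call is left commented out:

```shell
# Close every index in a comma-separated list before restore.
close_indices() {
    for idx in $(printf '%s' "$1" | tr ',' ' '); do
        echo "closing $idx"
        # Against a live cluster:
        # curl -XPOST "http://$es_ip:$es_port/$idx/_close"
    done
}
```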

Restore from snapshot

# Customize this
export snapshot_name="snapshot_20170726"
# Customize this to restore selected indices
export es_index_list="my-index-123,my-index-234"

time curl -XPOST "http://$es_ip:$es_port/_snapshot/$repo_name/$snapshot_name/_restore?wait_for_completion=true" -H 'Content-Type: application/json' -d "{
    \"indices\": \"$es_index_list\",
    \"ignore_unavailable\": true,
    \"include_global_state\": false
}"

# list indices and shards
curl $es_ip:$es_port/_cat/indices?v
curl $es_ip:$es_port/_cat/shards?v | grep " p "

Restore Snapshot With Replica Changed

# Customize this
export snapshot_name="snapshot_20170726"
# Customize this to restore selected indices
export es_index_list="my-index-123,my-index-234"

time curl -XPOST "http://$es_ip:$es_port/_snapshot/$repo_name/${snapshot_name}/_restore?wait_for_completion=true" -H 'Content-Type: application/json' -d "{
    \"index_settings\": {
        \"index.number_of_replicas\": 2
    },
    \"indices\": \"$es_index_list\",
    \"ignore_unavailable\": true,
    \"include_global_state\": false
}"

curl $es_ip:$es_port/_cat/shards?v

Delete ES Snapshot And Repository

curl -XDELETE "$es_ip:$es_port/_snapshot/$repo_name/$snapshot_name"
curl -XDELETE "$es_ip:$es_port/_snapshot/$repo_name"

