ELK Stack configuration using Ansible

Marcin Back-end

14 May 2021

We often hear from a client that an application returned an HTTP 500 error. The tester quickly creates a scenario to reproduce it, and then another problem arises: behind a nice frontend there are a dozen or so applications installed on different servers. The error information is undoubtedly in a log, but which one? You can spend days trying to find the right file on the right server, or you can use a toolkit that aggregates the logs in one place.

ELK Stack

ELK is an acronym for Elasticsearch, Logstash, and Kibana. Each of these components has a specific role.

  1. Elasticsearch (ES) is responsible for indexing data, in our case logs. It also acts as a kind of database in which the data is collected.
  2. Logstash collects data, filters it, combines it and sends it to different places. You can send data directly to ES, but Logstash adds the ability to filter some of it or split it into different streams. What's more, based on the content of the messages, you can add metadata that will eventually land in ES.
  3. Kibana serves as a front end to the tools mentioned above. It allows you to define various types of dashboards, create charts, generate reports, etc.

Development environment

We will prepare a clean development environment using Vagrant. Installation instructions for this tool can be found here.

In a nutshell, Vagrant lets you set up a virtual machine that you can quickly and easily replace with a new one. The source code that appears later in this post can be found in this repository: https://github.com/greywarden09/elk-stack

Next, we will use Ansible to configure the environment. Installation instructions can be found here. You will also need to install the ansible.posix collection. I encourage you to familiarize yourself with the basics of this tool; it will help you understand the rest of the article. I have also prepared a version for a local environment that uses docker-compose: the repository has a docker subdirectory with a docker-compose.yml file.

Preparing the virtual machine

First, we will set up a virtual machine which will serve as our installation environment. To do this, you will need to prepare a Vagrantfile. Below is a sample configuration using libvirt as a provider:

Vagrant.configure("2") do |config|
  config.vm.box = 'generic/ubuntu2004'
  config.vm.hostname = 'elk-stack'
  config.vm.provider :libvirt do |libvirt|
    libvirt.memory = 8192
    libvirt.random_hostname = true
    libvirt.cpus = 2
  end
  config.vm.define :network do |network|
    network.vm.network :private_network, :ip => '172.168.0.2'
  end
end

The configuration above defines a machine running Ubuntu 20.04 with 2 CPUs, 8192 MB of RAM and a random hostname. A private network is also configured, in which the machine receives the address 172.168.0.2. Of course, nothing prevents you from using another provider, such as VirtualBox or VMware. The Vagrant documentation describes the file format in detail.

From the directory containing the Vagrantfile, you can now start the virtual machine with the command:

vagrant up

Next, you will need an SSH configuration, so generate one using the ssh-config command:

vagrant ssh-config --host elk-stack > elk-stack

This command will generate the configuration needed in the next steps – a file named elk-stack will appear, with content similar to this:

Host elk-stack
  HostName 192.168.121.93
  User vagrant
  Port 22
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /home/mlas/poligon/elk-stack/vagrant/.vagrant/machines/network/libvirt/private_key
  IdentitiesOnly yes
  LogLevel FATAL

The final step is to prepare the Ansible inventory. Create a vagrant.yml file with the following content:

all:
  hosts:
    elk:
      ansible_host: elk-stack
      ansible_user: vagrant
      ansible_ssh_common_args: -F vagrant/elk-stack  # [1]
  vars:
    ansible_python_interpreter: /usr/bin/python3

Note the indentation! The ansible_ssh_common_args[1] variable passes an extra option to SSH that points to the SSH configuration file.

We have now set up a clean environment on which we will install the ELK stack. You can verify if the machine is running – for example, by logging into it via SSH.

ssh elk-stack -F vagrant/elk-stack
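
The same check can be made through Ansible itself, which also confirms that the inventory and the SSH configuration work together. The playbook below is a minimal sketch. The file name ping.yml is an assumption and is not part of the repository.

```yaml
# ping.yml (hypothetical file name, not taken from the repository)
- name: verify that Ansible can reach the VM
  hosts: all
  gather_facts: false
  tasks:
    # ping does not send ICMP; it logs in over SSH and runs a tiny
    # Python module, so it also validates ansible_python_interpreter.
    - name: check the SSH connection and the Python interpreter
      ansible.builtin.ping:
```

It can be run from the repository root with ansible-playbook -i vagrant/vagrant.yml ping.yml, so that the relative path in ansible_ssh_common_args resolves.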

Use the vagrant halt command to stop the VM, and the vagrant destroy command to delete the VM.

Installing ELK Stack on a Virtual Machine

For the rest of this article, I will be using the file and directory structure from the repository linked in the Development Environment section.

We will run the components of the ELK Stack in Docker, so we first run the bootstrap.yml playbook, which applies the role that installs Docker:

ansible-playbook -i vagrant/vagrant.yml bootstrap.yml

Docker is now installed, so you can proceed with the rest of the components. This is done by the site.yml playbook, which installs Elasticsearch, Logstash, and Kibana. Let's walk through its configuration.
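
The actual site.yml lives in the repository. As a rough sketch of how such a playbook could tie things together, assuming one role per component: the role names, versions and paths below are assumptions chosen for illustration, and only the variable names (elasticsearch.version, logstash.pipeline_dir and so on) come from the tasks discussed in this article.

```yaml
# site.yml (illustrative sketch, not the repository's actual file)
- name: deploy the ELK Stack
  hosts: all
  become: true
  vars:
    # Example values only; pick versions and paths that match your setup.
    elasticsearch:
      version: "7.12.1"
      data_dir: /opt/elk/elasticsearch/data
      conf_dir: /opt/elk/elasticsearch/config
    logstash:
      version: "7.12.1"
      conf_dir: /opt/elk/logstash/config
      pipeline_dir: /opt/elk/logstash/pipeline
    kibana:
      version: "7.12.1"
      conf_dir: /opt/elk/kibana/config
  roles:
    # Hypothetical role names; Elasticsearch goes first because the
    # other two components connect to it.
    - elasticsearch
    - logstash
    - kibana
```

Keeping the three components on the same version avoids compatibility surprises between them.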

Let’s start with Elastic.


- name: create Docker volumes for Elasticsearch
  include_role:
    name: common
    tasks_from: prepare-volumes  # [1]
  with_items:
    - { name: elasticsearch-data, path: "{{ elasticsearch.data_dir }}" }
    - { name: elasticsearch-conf, path: "{{ elasticsearch.conf_dir }}" }

- name: increase max_map_count in sysctl
  block:
    - ansible.posix.sysctl:
        name: vm.max_map_count
        value: 262144  # [2]
        state: present
        reload: True

- name: pull Elasticsearch image  # [3]
  docker_image:
    name: docker.elastic.co/elasticsearch/elasticsearch
    tag: "{{ elasticsearch.version }}"
    source: pull

- name: start Elasticsearch  # [4]
  docker_container:
    name: elasticsearch
    image: docker.elastic.co/elasticsearch/elasticsearch:{{ elasticsearch.version }}
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
      - elasticsearch-conf:/usr/share/elasticsearch/config
    restart_policy: unless-stopped
    env:
      node.name: elasticsearch
      discovery.type: 'single-node'  # [5]
      bootstrap.memory_lock: "true"
      ES_JAVA_OPTS: "-Xms512m -Xmx512m"
    ports:  # [6]
      - 9200:9200
      - 9300:9300
    ulimits:
      - 'memlock:-1:-1'  # [7]
    networks:  # [8]
      - name: elk
    purge_networks: True

First we prepare two volumes for the container, because we need to preserve the configuration and data in case the container is deleted. The tasks included with the tasks_from[1] directive first create a directory (or directories, if the structure is nested) and then a Docker volume with the specified name. In this case, two volumes are created: elasticsearch-data and elasticsearch-conf. The next step tends to trip up anyone who sets up Elasticsearch for the first time: raising the vm.max_map_count kernel parameter to at least 262144[2]. The reason is explained in the Elasticsearch documentation. From there it's downhill: pulling the image[3] and starting the container[4]. We have a single instance, so we disable the search for other nodes in the cluster by setting discovery.type to single-node[5]. We publish ports[6] 9200 and 9300 and set both memlock limits (soft and hard) to -1[7], which means unlimited and lets Elasticsearch lock its memory as requested by bootstrap.memory_lock. Finally, we attach the container to the elk network and detach it from the default networks[8].
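
The prepare-volumes tasks themselves are not listed in this article. A minimal sketch of what they could look like is shown below, assuming the volume is a local Docker volume bind-mounted onto the host directory. The file path and the bind-mount driver options are assumptions; the file in the repository may be implemented differently.

```yaml
# roles/common/tasks/prepare-volumes.yml (sketch; the real file may differ)
# "item" is the loop variable supplied by with_items on include_role.
- name: create the host directory that backs the volume
  ansible.builtin.file:
    path: "{{ item.path }}"
    state: directory  # also creates missing parent directories
    mode: "0755"

- name: create a Docker volume bound to that directory
  community.docker.docker_volume:
    name: "{{ item.name }}"
    driver: local
    driver_options:
      # Assumption: expose the host directory as a named volume.
      type: none
      o: bind
      device: "{{ item.path }}"
```

Depending on the image, the directory may also need an owner or mode that lets the container's user write to it.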

It’s time for Logstash. You will definitely notice that the role looks very similar:



- name: create Docker volumes for Logstash  # [1]
  include_role:
    name: common
    tasks_from: prepare-volumes
  with_items:
    - { name: logstash-conf, path: "{{ logstash.conf_dir }}" }
    - { name: logstash-pipeline, path: "{{ logstash.pipeline_dir }}" }

- name: pull Logstash image
  docker_image:
    name: docker.elastic.co/logstash/logstash
    tag: "{{ logstash.version }}"
    source: pull

- name: deploy Logstash pipeline configuration  # [2]
  copy:
    src: logstash.conf
    dest: "{{ logstash.pipeline_dir }}"

- name: start Logstash
  docker_container:
    name: logstash
    image: docker.elastic.co/logstash/logstash:{{ logstash.version }}
    volumes:
      - logstash-pipeline:/usr/share/logstash/pipeline
      - logstash-conf:/usr/share/logstash/config
    restart_policy: unless-stopped
    env:
      monitoring.elasticsearch.hosts: "http://elasticsearch:9200"  # [3]
      LS_JAVA_OPTS: "-Xms512m -Xmx512m"
    ports:
      - 9600:9600
      - 5044:5044
    networks:
      - name: elk  # [4]
    purge_networks: True

As with Elasticsearch, we first prepare the volumes, this time for the configuration and for the pipeline definition (inputs, outputs and filtering; in general, the actions performed on data streams). We then publish ports and set the location of Elasticsearch with the environment variable monitoring.elasticsearch.hosts[3]. Note that I did not use localhost but the container name as the host name. This only works when the containers are on the same Docker network, here elk[4].
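
The containers can only join the elk network if it already exists. Where the network is created is not shown in this article; a minimal sketch of such a task, assuming the community.docker collection is installed, could look like this:

```yaml
# Illustrative task; its place in the repository's roles is an assumption.
- name: create the elk network shared by all three containers
  community.docker.docker_network:
    name: elk
    state: present
```

On a user-defined network like this one, Docker resolves container names through its built-in DNS, which is why http://elasticsearch:9200 works as an address.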

One task does something different, though: deploy Logstash pipeline configuration[2]. It copies the Logstash pipeline configuration file into the directory that backs the pipeline volume. The file looks like this:


input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}"
  }
}

So we define a beats input that listens on port 5044, and whatever comes in is passed to the output, here Elasticsearch listening on port 9200.

 

At the very end there is Kibana.


- name: create Docker volume for Kibana
  include_role:
    name: common
    tasks_from: prepare-volumes
  with_items:
    - { name: kibana-conf, path: "{{ kibana.conf_dir }}" }

- name: pull Kibana image
  docker_image:
    name: docker.elastic.co/kibana/kibana
    tag: "{{ kibana.version }}"
    source: pull

- name: start Kibana
  docker_container:
    name: kibana
    image: docker.elastic.co/kibana/kibana:{{ kibana.version }}
    volumes:
      - kibana-conf:/usr/share/kibana/config
    restart_policy: unless-stopped
    env:
      elasticsearch.url: "http://elasticsearch:9200"
      elasticsearch.hosts: "http://elasticsearch:9200"
      server.name: kibana
    ports:
      - 5601:5601
    networks:
      - name: elk
    purge_networks: True

Here the situation is really simple: we prepare the volume for the configuration, pull the image, and then start the container. If everything went well, you should be able to access the Kibana home page after deploying the entire ELK Stack. We run the deployment with the following command:

ansible-playbook -i vagrant/vagrant.yml site.yml

Kibana is published on port 5601, and the address to open in the browser is the one from the SSH configuration file. In my case it's 192.168.121.87, but it may change every time you recreate the VM (destroy it and bring it up again).
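
Before opening the browser, you can let Ansible confirm that the services respond. The playbook below is a sketch under a few assumptions: the file name verify.yml is hypothetical, the checks run on the VM itself against the published ports 9200 and 5601, and the retry counts are arbitrary.

```yaml
# verify.yml (hypothetical file name, not part of the repository)
- name: check that the stack responds
  hosts: all
  gather_facts: false
  tasks:
    # The containers may need some time to start, so each check is retried.
    - name: wait for Elasticsearch to report cluster health
      ansible.builtin.uri:
        url: http://localhost:9200/_cluster/health
        return_content: true
      register: es_health
      until: es_health.status == 200
      retries: 30
      delay: 10

    - name: wait for the Kibana status endpoint
      ansible.builtin.uri:
        url: http://localhost:5601/api/status
      register: kibana_status
      until: kibana_status.status == 200
      retries: 30
      delay: 10

    - name: show the Elasticsearch cluster status
      ansible.builtin.debug:
        msg: "Elasticsearch status: {{ es_health.json.status }}"
```

It can be run the same way as the deployment: ansible-playbook -i vagrant/vagrant.yml verify.yml.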

Summary

In the next part I will describe the configuration of Filebeat and Metricbeat. Filebeat collects data from logs and sends it to Logstash or Elasticsearch. Metricbeat, on the other hand, collects metrics such as CPU usage, RAM usage, the number of read/write operations and more.
