We often get a report from a client that an application returned an HTTP 500 error. The tester quickly creates a scenario to reproduce it, and another problem arises: behind a nice frontend there are a dozen or so applications installed on different servers. The error details are undoubtedly in a log – but in which one? You can spend days hunting for the right file on the right server, or you can use a toolkit that aggregates logs.
ELK Stack
ELK is an acronym for the set of Elasticsearch, Logstash, and Kibana. Each of these components has a specific role.
- Elasticsearch (ES) is responsible for indexing data – in our case, logs. ES also acts here as a kind of database where the data is collected.
- Logstash collects data, filters it, combines it, and routes it to different destinations. Of course, you can send data directly to ES, but Logstash adds the ability to filter the data or split it into separate streams. What's more, based on the content of incoming messages, you can attach metadata that will eventually land in ES.
- Kibana serves as a frontend for the tools mentioned above. It allows you to define various kinds of dashboards, create charts, generate reports, and so on.
Development environment
We will prepare a clean development environment using Vagrant. Installation instructions for this tool can be found here.
In a nutshell, Vagrant allows you to set up a virtual machine that you can easily and quickly replace with a new one. The source code that appears later in this post can be found in this repository – https://github.com/greywarden09/elk-stack
Next, we will use Ansible to configure the environment. Installation instructions can be found here. You will also need the ansible.posix collection. I encourage you to familiarize yourself with the basics of this tool, as it will help you better understand the rest of the article. I have also prepared a version for a local environment using docker-compose – there is a docker subdirectory in the repository with a docker-compose.yml file.
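The collection is distributed via Ansible Galaxy and can be installed with a single command:
# install the ansible.posix collection required by the playbooks
ansible-galaxy collection install ansible.posix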
Preparing the virtual machine
First, we will set up a virtual machine which will serve as our installation environment. To do this, you will need to prepare a Vagrantfile. Below is a sample configuration using libvirt as a provider:
Vagrant.configure("2") do |config|
  config.vm.box = 'generic/ubuntu2004'
  config.vm.hostname = 'elk-stack'
  config.vm.provider :libvirt do |libvirt|
    libvirt.memory = 8192
    libvirt.random_hostname = true
    libvirt.cpus = 2
  end
  config.vm.define :network do |network|
    network.vm.network :private_network, :ip => '172.168.0.2'
  end
end
The configuration above defines a machine running Ubuntu 20.04 with 2 CPUs, 8192 MB of RAM, and a random hostname. A private network is also configured, in which the machine will receive the address 172.168.0.2. Of course, nothing prevents you from using another provider, such as VirtualBox or VMware. The Vagrant documentation can help here, as the file format is described there in detail.
From the directory containing the Vagrantfile, you can now start the virtual machine with the command:
vagrant up
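You can check the state of the machine at any time with:
vagrant status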
Next, you will need an SSH configuration, so generate one using the ssh-config command:
vagrant ssh-config --host elk-stack > elk-stack
This command will generate the configuration needed in the next steps – a file named elk-stack will appear, with content similar to this:
Host elk-stack
  HostName 192.168.121.93
  User vagrant
  Port 22
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /home/mlas/poligon/elk-stack/vagrant/.vagrant/machines/network/libvirt/private_key
  IdentitiesOnly yes
  LogLevel FATAL
The final step is to prepare the Ansible inventory. We will create a vagrant.yml file with the following content:
all:
  hosts:
    elk:
      ansible_host: elk-stack
      ansible_user: vagrant
      ansible_ssh_common_args: -F vagrant/elk-stack[1]
  vars:
    ansible_python_interpreter: /usr/bin/python3
Note the indentation! The ansible_ssh_common_args[1] parameter passes an additional argument to SSH, pointing to the location of the SSH configuration file.
We have now set up a clean environment on which we will install the ELK Stack. You can verify that the machine is running – for example, by logging into it via SSH.
ssh elk-stack -F vagrant/elk-stack
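You can also let Ansible verify connectivity, using its ping module (an SSH-level check, not ICMP):
# test the inventory and SSH configuration end to end
ansible -i vagrant/vagrant.yml elk -m ping
A SUCCESS response means the inventory, the SSH configuration, and the Python interpreter are all wired up correctly.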
Use the vagrant halt command to stop the VM, and the vagrant destroy command to delete the VM.
Installing ELK Stack on a Virtual Machine
For the rest of this article, I will be using the file and directory structure from the repository linked in the Development Environment section.
We will run the various components of the ELK Stack in Docker, so we start with a role that installs Docker itself:
ansible-playbook -i vagrant/vagrant.yml bootstrap.yml
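Before moving on, you can confirm that the installation succeeded by invoking Docker over SSH, for example:
# print the Docker version installed on the VM
ssh elk-stack -F vagrant/elk-stack docker --version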
With Docker in place, you can proceed with installing the rest of the components. This is done with the site.yml playbook, which installs Elasticsearch, Logstash, and Kibana. We will now get down to the configuration.
Let’s start with Elastic.
- name: create Docker volumes for Elasticsearch
  include_role:
    name: common
    tasks_from: prepare-volumes[1]
  with_items:
    - { name: elasticsearch-data, path: "{{ elasticsearch.data_dir }}" }
    - { name: elasticsearch-conf, path: "{{ elasticsearch.conf_dir }}" }

- name: increase max_map_count in sysctl
  block:
    - ansible.posix.sysctl:
        name: vm.max_map_count
        value: 262144[2]
        state: present
        reload: True

- name: pull Elasticsearch image[3]
  docker_image:
    name: docker.elastic.co/elasticsearch/elasticsearch
    tag: "{{ elasticsearch.version }}"
    source: pull

- name: start Elasticsearch[4]
  docker_container:
    name: elasticsearch
    image: docker.elastic.co/elasticsearch/elasticsearch:{{ elasticsearch.version }}
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
      - elasticsearch-conf:/usr/share/elasticsearch/config
    restart_policy: unless-stopped
    env:
      node.name: elasticsearch
      discovery.type: 'single-node'[5]
      bootstrap.memory_lock: "true"
      ES_JAVA_OPTS: "-Xms512m -Xmx512m"
    ports[6]:
      - 9200:9200
      - 9300:9300
    ulimits:
      - 'memlock:-1:-1'[7]
    networks[8]:
      - name: elk
    purge_networks: True
First, we prepare two volumes for the container – we want the configuration and data to survive if the container is deleted. The tasks included via the tasks_from[1] directive first create a directory (or directories, if the structure is nested) and then a Docker volume with the specified name. In this case, two volumes will be created – elasticsearch-data and elasticsearch-conf. The next step tends to trip up anyone standing up Elastic for the first time – the vm.max_map_count kernel parameter must be raised to at least 262144[2]. You can check the reason for this on the documentation page. From there it's downhill – downloading the image[3] and launching it[4]. We have a single instance, so we disable the search for other nodes within the cluster by setting the discovery.type flag to single-node[5]. We publish ports[6] 9200 and 9300, and set the memlock limit (both soft and hard) to -1[7], i.e. unlimited. For now, we have to take these settings on faith. Finally, we plug the container into the elk network and unplug it from the default networks[8].
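Once the containers are up (we will run the playbook at the end of the article), a quick way to confirm that Elasticsearch is alive is to query its HTTP API on the published port – the root endpoint returns basic node and cluster information as JSON. The IP below is my VM's address, so adjust it to yours:
# the root endpoint reports node name, cluster name and version
curl http://192.168.121.87:9200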
It’s time for Logstash. You will definitely notice that the role looks very similar:
- name: create Docker volumes for Logstash[1]
  include_role:
    name: common
    tasks_from: prepare-volumes
  with_items:
    - { name: logstash-conf, path: "{{ logstash.conf_dir }}" }
    - { name: logstash-pipeline, path: "{{ logstash.pipeline_dir }}" }

- name: pull Logstash image
  docker_image:
    name: docker.elastic.co/logstash/logstash
    tag: "{{ logstash.version }}"
    source: pull

- name: deploy Logstash pipeline configuration[2]
  copy:
    src: logstash.conf
    dest: "{{ logstash.pipeline_dir }}"

- name: start Logstash
  docker_container:
    name: logstash
    image: docker.elastic.co/logstash/logstash:{{ logstash.version }}
    volumes:
      - logstash-pipeline:/usr/share/logstash/pipeline
      - logstash-conf:/usr/share/logstash/config
    restart_policy: unless-stopped
    env:
      monitoring.elasticsearch.hosts: "http://elasticsearch:9200"[3]
      LS_JAVA_OPTS: "-Xms512m -Xmx512m"
    ports:
      - 9600:9600
      - 5044:5044
    networks:
      - name: elk[4]
    purge_networks: True
Analogously to Elastic, we first prepare the volumes – this time for the configuration and the pipeline definition (inputs and outputs, filtering; in general, the configuration of actions on data streams) – publish the ports, and point Logstash at Elasticsearch using the environment variable monitoring.elasticsearch.hosts[3]. Note that I did not use localhost but the domain name of the container. This only works if both containers are on the same network – here it is elk[4].
However, there is one task that does something different – deploy Logstash pipeline configuration[2]. It copies the Logstash pipeline configuration file into the directory backing the volume from the list above. The file looks like this:
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}"
  }
}
So we define a beats input that listens on port 5044, and whatever comes in is passed to the output – here Elasticsearch, listening on port 9200. The index name is built from metadata attached by the Beat that shipped the event.
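Logstash also exposes a monitoring API on port 9600, which we have published as well. Its root endpoint returns basic node information and makes for a quick sanity check (again, adjust the IP to your VM):
# query the Logstash node info API
curl "http://192.168.121.87:9600/?pretty"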
At the very end there is Kibana.
- name: create Docker volume for Kibana
  include_role:
    name: common
    tasks_from: prepare-volumes
  with_items:
    - { name: kibana-conf, path: "{{ kibana.conf_dir }}" }

- name: pull Kibana image
  docker_image:
    name: docker.elastic.co/kibana/kibana
    tag: "{{ kibana.version }}"
    source: pull

- name: start Kibana
  docker_container:
    name: kibana
    image: docker.elastic.co/kibana/kibana:{{ kibana.version }}
    volumes:
      - kibana-conf:/usr/share/kibana/config
    restart_policy: unless-stopped
    env:
      elasticsearch.url: "http://elasticsearch:9200"
      elasticsearch.hosts: "http://elasticsearch:9200"
      server.name: kibana
    ports:
      - 5601:5601
    networks:
      - name: elk
    purge_networks: True
Here the situation is really simple – we prepare the volume for configuration, download the image, and then start the container. If everything went well, you should be able to access the Kibana homepage after deploying the entire ELK Stack. We run the deployment with the following command:
ansible-playbook -i vagrant/vagrant.yml site.yml
The address to open in the browser is the same one found in the SSH configuration file, with Kibana's port 5601 – in my case 192.168.121.87:5601, but it may be different every time you recreate the VM (destroy it and bring it up again).
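If you prefer the terminal to the browser, Kibana's status endpoint can serve as a quick health check:
# returns Kibana's overall status as JSON
curl http://192.168.121.87:5601/api/status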
Summary
In the next part I will describe the configuration of Filebeat and Metricbeat. The first tool collects data from log files and ships it to Logstash or Elastic. Metricbeat, in turn, collects metrics – for example CPU usage, RAM usage, the number of read/write operations, and more.