rsyslog, journal or both?


rsyslog is an open source project that has provided log management in many Linux distros for years. It is easy to set up and has been widely used to collect logs from many servers on a centralized one. journal, on the other hand, is a systemd component with a role similar to rsyslog’s but a different approach: it doesn’t use traditional syslog files.

systemd has been a controversial project since its beginnings, and journal is probably one of its most controversial components, mainly because it was initially intended to replace syslog, a stable and well-known component of any Linux system. In this post we are not discussing which of these projects is better, but simply showing which of them is used in some Linux distributions (Debian, Ubuntu, CentOS and Fedora), because many users (like me before writing this post) are a little confused about the changes taking place in log management. In any case, the situation described here is likely to change in the next releases, so this post will probably soon become obsolete.

rsyslog

rsyslog uses the standard BSD syslog protocol, described in RFC 3164, and also supports RFCs 5424 (syslog protocol), 5425 (TLS mapping) and 5426 (syslog over UDP). It has a modular design, with plugins for many different inputs (gssapi, auditd, klog, etc.) and outputs (snmp, pgsql, mail, etc.). For the subject of this post we will focus on two features: the ability to read from the syslog Unix socket (usually /dev/log) and the ability to send or receive syslog messages over the network (UDP or TCP), which has been the standard configuration on many Linux deployments for years. Another advantage of the traditional syslog implementation is that it can collect messages not only from computers, but also from a wide variety of network devices (switches, access points, routers, etc.).
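To make the BSD syslog format more concrete, here is a minimal Python sketch of how an RFC 3164 style message is framed. The number in angle brackets is the priority value, computed as the facility multiplied by 8 plus the severity. The function name format_rfc3164 and the sample host and tag are illustrative assumptions, not part of rsyslog.

```python
from datetime import datetime

# English month abbreviations, hard-coded so the output does not depend on the locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_rfc3164(facility: int, severity: int, hostname: str,
                   tag: str, msg: str, when: datetime) -> str:
    """Build a BSD syslog (RFC 3164 style) message line (hypothetical helper)."""
    if not 0 <= facility <= 23 or not 0 <= severity <= 7:
        raise ValueError("facility must be 0-23 and severity 0-7")
    # Priority value: facility * 8 + severity.
    pri = facility * 8 + severity
    # The day of month is padded with a space, not a zero ("Dec  5", "Dec 17").
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {msg}"


# Facility 1 (user) and severity 5 (notice) give priority 13.
print(format_rfc3164(1, 5, "mut", "myapp", "hello",
                     datetime(2017, 12, 17, 17, 11, 49)))
# <13>Dec 17 17:11:49 mut myapp: hello
```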

A typical rsyslog configuration (/etc/rsyslog.conf) for a computer that stores syslog messages locally in /var/log/ and also sends them to a centralized server via UDP looks something like this (192.168.102.1 is the centralized server’s IP; the single @ selects UDP, while @@ would select TCP):

$ModLoad imuxsock # provides support for local system logging
...
*.*		-/var/log/syslog # Any syslog data (all facilities and priorities) stored locally
*.*		@192.168.102.1:514 # Any syslog data (all facilities and priorities) sent to a remote server
...
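What the @192.168.102.1:514 action does on the wire can be sketched in a few lines of Python: each message travels as a single UDP datagram, and the server sends no acknowledgement. This is only an illustration (send_syslog_udp is a hypothetical helper), not a description of how rsyslog is implemented.

```python
import socket


def send_syslog_udp(message: str, host: str, port: int = 514) -> int:
    """Send one syslog message as a single UDP datagram; return the bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # No connection and no reply: the sender never learns whether
        # the datagram reached the log server.
        return sock.sendto(message.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # 192.168.102.1 is the centralized server used in the configuration above.
    send_syslog_udp("<13>Dec 17 17:11:49 mut myapp: hello", "192.168.102.1")
```

Because UDP gives no delivery guarantee, messages can be lost silently when the network or the server is overloaded, which is one reason TCP forwarding is also offered.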

And the corresponding server configuration could be:

$ModLoad imuxsock # provides support for local system logging
$ModLoad imudp # provides support for network logging
$UDPServerRun 514 # UDP port
...
*.*		-/var/log/syslog # Any syslog data (all facilities and priorities) stored locally

In this typical situation, log data is stored on a centralized server as plain text (log files) in the well-known syslog format. Syslog files are human-readable and can be processed by many tools, including traditional processing with regular expressions.
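As an example of that kind of regular-expression processing, the following Python sketch splits a line in the traditional file format into its parts. The pattern is an assumption that covers the common "timestamp host tag[pid]: message" layout only; real log files contain variations it does not handle.

```python
import re

# timestamp, host, program tag, optional [pid], then the free-form message.
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog_line(line: str):
    """Return a dict with the fields of a syslog line, or None if it does not match."""
    match = SYSLOG_LINE.match(line)
    return match.groupdict() if match else None


entry = parse_syslog_line(
    "Dec 17 17:11:49 mut sshd[1005]: Server listening on 0.0.0.0 port 22."
)
print(entry["tag"], entry["pid"])
# sshd 1005
```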

journal

journal is a core systemd component, so it is automatically installed on any operating system that uses systemd: if your favorite Linux distro uses systemd as its init system, journal is installed on your computer. More specifically, the Linux distros analyzed in this post ship journal in the systemd package.

journal provides structured and indexed logging, while keeping a certain degree of compatibility with classic syslog implementations. The first big difference from other log management tools is that journal stores log data in a binary format rather than in plain text files, so it cannot be read directly by humans or processed with the traditional, well-known toolset. journal logs are usually read with an application called journalctl.

journalctl provides a new way to interact with log data. Because the data is structured and indexed, it is easy to select specific entries with a few parameters. Let’s see a couple of nice examples, the first filtering by systemd unit and the second by time range:

journalctl -u ssh
-- Logs begin at Sun 2017-12-17 17:11:43 CET, end at Sun 2017-12-17 17:41:49 CET. --
dic 17 17:11:49 mut systemd[1]: Starting OpenBSD Secure Shell server...
dic 17 17:11:49 mut sshd[1005]: Server listening on 0.0.0.0 port 22.
dic 17 17:11:49 mut sshd[1005]: Server listening on :: port 22.
dic 17 17:11:49 mut systemd[1]: Started OpenBSD Secure Shell server.

journalctl --since 17:52 --until 17:53
-- Logs begin at Fri 2017-12-15 06:13:15 CET, end at Sun 2017-12-17 17:52:55 CET. --
dic 17 17:52:06 corot dnsmasq-dhcp[30317]: DHCPDISCOVER(tap7753081a-5d) fa:16:3e:a7:92:06
dic 17 17:52:06 corot dnsmasq-dhcp[30317]: DHCPOFFER(tap7753081a-5d) 192.168.130.2 fa:16:3e:a7:92:06
dic 17 17:52:06 corot dnsmasq-dhcp[30317]: DHCPREQUEST(tap7753081a-5d) 192.168.130.2 fa:16:3e:a7:92:06
dic 17 17:52:06 corot dnsmasq-dhcp[30317]: DHCPACK(tap7753081a-5d) 192.168.130.2 fa:16:3e:a7:92:06 host-192-168-130-2
dic 17 17:52:07 corot ntpd[727]: Soliciting pool server 213.183.48.250
dic 17 17:52:15 corot ntpd[727]: Soliciting pool server 213.251.52.234
dic 17 17:52:16 corot ntpd[727]: Soliciting pool server 150.214.94.16
dic 17 17:52:21 corot dnsmasq-dhcp[30229]: DHCPREQUEST(tap8a52ef26-ad) 10.0.0.13 fa:16:3e:b9:5a:71
dic 17 17:52:21 corot dnsmasq-dhcp[30229]: DHCPACK(tap8a52ef26-ad) 10.0.0.13 fa:16:3e:b9:5a:71 host-10-0-0-13
dic 17 17:52:26 corot ntpd[727]: Soliciting pool server 185.132.136.116
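The structured nature of the data is easier to see with journalctl -o json, which prints one JSON object per entry. The Python sketch below filters such lines by unit. The two sample records are made up for illustration (modeled on the output above), and matching only on the _SYSTEMD_UNIT field is a simplification of what journalctl -u does.

```python
import json


def entries_for_unit(json_lines, unit):
    """Yield (hostname, message) for entries whose _SYSTEMD_UNIT equals unit."""
    for raw in json_lines:
        raw = raw.strip()
        if not raw:
            continue
        entry = json.loads(raw)
        if entry.get("_SYSTEMD_UNIT") == unit:
            yield entry.get("_HOSTNAME"), entry.get("MESSAGE")


# Hypothetical sample records; real entries carry many more fields.
sample = [
    '{"_HOSTNAME": "mut", "_SYSTEMD_UNIT": "ssh.service", '
    '"PRIORITY": "6", "MESSAGE": "Server listening on 0.0.0.0 port 22."}',
    '{"_HOSTNAME": "mut", "_SYSTEMD_UNIT": "ntp.service", '
    '"PRIORITY": "6", "MESSAGE": "Soliciting pool server 213.183.48.250"}',
]

for host, message in entries_for_unit(sample, "ssh.service"):
    print(host, message)
# mut Server listening on 0.0.0.0 port 22.
```

In practice the lines would come from a pipe, for example by passing sys.stdin to entries_for_unit while running journalctl -o json in front of the script.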

Persistent storage of log data is optional in journal (this is really surprising to many users). When persistent storage is not enabled, journal uses the directory /run/log/journal and the logs are lost on reboot; when it is enabled, /var/log/journal is used and the logs remain after rebooting the computer. In any case, journal doesn’t write traditional human-readable log files. This can be a major problem for applications that read from log files and for custom-made applications that monitor events (both commonly used on servers).
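A quick way to see which of the two situations applies on a given machine is to check which directory exists. The following Python sketch does only that; journal_storage is a hypothetical helper, and it assumes the directory layout described above without reading the journal configuration, so treat its answer as a hint rather than a definitive check.

```python
from pathlib import Path


def journal_storage(root: str = "/") -> str:
    """Guess the journal storage mode from the directories present under root."""
    base = Path(root)
    # /var/log/journal is the location used when storage is persistent.
    if (base / "var/log/journal").is_dir():
        return "persistent"
    # /run/log/journal lives in memory and is lost on reboot.
    if (base / "run/log/journal").is_dir():
        return "volatile"
    return "unknown"


if __name__ == "__main__":
    print(journal_storage())
```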

Another point to consider is that journal did not initially provide a way to send log records over the network and store them on a centralized server. In recent releases this can be done with the systemd-journal-remote, systemd-journal-upload and systemd-journal-gatewayd components.

Can our systems work properly without traditional syslog files?

The correct answer to this kind of question is always the same: it depends. On a desktop computer, the user can read the log files directly when needed, but in a data center it is not practical to read the syslog files on the centralized server by hand, so many different tools are used to process and filter the useful data: traditional tools such as logcheck, loganalyzer, DenyHosts and fail2ban, more modern tools such as logstash, fluentd and elasticsearch, and also shell scripts and other tailor-made tools developed by system administrators. The modern tools can use journal as a data source, but not all the traditional tools can.

The user of a desktop computer can read logs from a syslog file or with journalctl, and doesn’t care too much which tool is used. journal is installed by default and journalctl includes some improvements that make log data easier to manage, so it can be the default option in this case. A system administrator managing many servers in a data center, on the other hand, may find the journal API or journalctl useful, and some data centers are fully managed with applications such as logstash or fluentd. In general, though, abruptly changing or adapting the traditional tools used for years is not an option, so some implementation of traditional syslog is still needed (for instance, an issue asking for the DenyHosts project to read from journal has been open since 2014).

In the following sections we’ll see which option each Linux distro has chosen to store and manage log data.

Fedora

systemd is a project supported by Red Hat (Lennart Poettering, the main systemd developer, works there) and Fedora is commonly used to introduce and test new technologies in the Red Hat environment, so the choice is clear: Fedora has used only journal to manage log data since Fedora 20. Persistent storage is enabled by default, but very common files such as /var/log/messages no longer exist.

Fedora is mainly used on desktops and for testing, and its use on servers is less common, so the lack of traditional log files should not be a big deal: typical Fedora users do not usually run complicated custom log management mechanisms or server software that needs to read from those files.

A similar situation occurs with Arch, a Linux distribution not analyzed here that also relies only on systemd-journal to manage log data. Like Fedora, Arch Linux is not commonly used on servers, and the typical user is someone who wants to test new software and doesn’t care too much about stability or about maintaining systems over long periods of time. I don’t know whether persistent storage is enabled by default in Arch Linux’s journal.

CentOS

CentOS can be considered the «community» version of RHEL. CentOS 7 uses systemd as its init system, so journal is included as a basic component. Unlike Fedora, CentOS is extensively used on servers, so its developers decided to keep rsyslog in the standard installation to avoid breaking any previous configuration. In addition, no basic package depends on rsyslog, so it can be uninstalled when no application uses traditional syslog files.

There is a major change in the rsyslog configuration: CentOS’s rsyslog.conf file now loads the imjournal module. This means that log data is read from journal instead of from the syslog Unix socket, which provides structured data to rsyslog. imjournal is not supported by the rsyslog project; it is supported directly by Red Hat.

Debian

Debian also uses systemd as its init system, so journal is automatically installed on every Debian box. Like CentOS, Debian is widely used on servers, so a traditional syslog management tool is frequently needed. Debian gives the rsyslog package the «important» priority, so it is installed automatically in many cases (you can check it with the command: aptitude search ~pimportant -F%p|grep rsyslog). Unlike CentOS, Debian’s rsyslog configuration file doesn’t load the imjournal module; it still uses the imuxsock and imklog modules to handle syslog and kernel logs. Like CentOS, no basic package depends on rsyslog, so it can be uninstalled when not needed.

Ubuntu

On this topic, the Ubuntu approach is very close to Debian’s (Ubuntu is still a distribution derived from Debian, and this is most evident in important packages). Ubuntu uses systemd as its init system, but also treats rsyslog as an important package, so it is installed in most cases. Ubuntu’s rsyslog configuration differs from Debian’s, but not in any key point: it also uses imuxsock and imklog to handle syslog and kernel messages.

So finally, rsyslog, journal or both?

All the analyzed distributions include systemd and therefore journal as a basic component (systemd-journal cannot be uninstalled), but there is an important difference between the GNU/Linux distributions typically used on desktop computers or for testing and those frequently used on servers. The latter usually include rsyslog as a standard package for backward compatibility and to avoid breaking any previous configuration. This situation will probably remain in the next releases (CentOS, Debian stable or Ubuntu LTS), but who knows?

Configuring a centralized log system with journal has not been discussed in detail, because the topic is broad enough to require a post of its own.
