Since the early days of Netgen, hardly any site has gone out into the wild without some form of site-wide search. The search was, and still is, backed by the highly reliable, scalable and fault-tolerant Apache Solr. It powers the search and navigation features of many of the world’s largest internet sites, and that’s why it is still the right choice for any eZ Platform based website. It is worth noting that Elasticsearch is going to be supported soon. While using Apache Solr through the eZ Platform Search API is straightforward and well documented, where we fall short is in explaining how to set it up.
eZ Platform offers very good documentation on how to configure and start Apache Solr. This blog post continues where the official documentation stops. Apache Solr can be run manually as a standalone application with the help of the built-in Jetty HTTP server and servlet container, which is fine, but the recommended way is to run Solr with Jetty under a process manager. Since Solr is a JVM-based application, there is a real chance that the container will crash at some point and take Solr down with it. Without a process manager, that situation requires manual intervention to start the process again, whereas a process manager restarts Apache Solr automatically. That’s why I decided to document the process of configuring Apache Solr with the ubiquitous process manager called systemd. All commands and code examples covered in this blog post were run on the Debian GNU/Linux 9 (Stretch) distribution, but they should work on any Debian or Red Hat based distro.
What is systemd?
systemd is a Linux initialization system and service manager that includes features like on-demand starting of daemons, mount and automount point maintenance, snapshot support, and process tracking using Linux control groups. systemd provides a logging daemon and other tools and utilities to help with common system administration tasks. Within the last few years, systemd has gained a lot of traction among Linux distributions and has become the de facto standard for system and service management. Its authors took inspiration from macOS’s launchd and Ubuntu’s Upstart. Many distributions prefer systemd to other available alternatives like OpenRC, sysvinit and runit.
systemd tasks are organized as units. Units are objects that systemd knows how to manage. They are basically a standardized representation of system resources that can be managed by the suite of daemons and manipulated by the provided utilities. Units are in some ways similar to services or jobs in other init systems. However, a unit has a much broader definition, as it can describe abstract services, network resources, devices, filesystem mounts and isolated resource pools. systemd categorizes units according to the type of resource they describe. The easiest way to determine the type of a unit is by its suffix, which is appended to the end of the resource name. I won’t go into much detail here but will rather focus on .service units.
A good alternative to systemd for managing processes is Supervisor.
Where are my units?
Files that define systemd units can be found in several different locations, and the location determines a file’s priority. The system’s copy of unit files is generally kept in the /lib/systemd/system directory. When software installs unit files on the system, this is where they are placed by default. You should not edit files in this directory. Instead, if you need to change a unit, override it with a file in one of the other unit file locations, which take precedence over this one.
If you want to create new unit files or modify the way a current unit works, the best location to do so is the /etc/systemd/system directory. Unit files found in this directory location take precedence over any of the other locations on the filesystem. If you need to modify the system’s copy of a unit file, putting a replacement in this directory is the safest and most flexible way to do that.
There is also a location for run-time unit definitions at /run/systemd/system. Unit files found in this directory have a priority between those in /etc/systemd/system and /lib/systemd/system: they override the system’s copy but are themselves overridden by files in /etc/systemd/system. The systemd process itself uses this location for unit files created dynamically at runtime. This directory can be used to change the system’s unit behaviour until the next reboot. All changes made in this directory will be lost when the server is rebooted.
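The precedence described above can be sketched as a small shell function. It is only an illustration that covers the three directories mentioned in this post; systemd itself searches more locations. The function name find_unit and its optional root prefix argument are hypothetical and exist only so the lookup can be tried in a scratch directory.

```shell
# Print the unit file that wins for a given unit name, checking the three
# locations from the text in priority order (highest first).
# $1 = unit name, $2 = optional root prefix (hypothetical, for experiments)
find_unit() {
  unit="$1"
  root="${2:-}"
  for dir in /etc/systemd/system /run/systemd/system /lib/systemd/system; do
    if [ -e "$root$dir/$unit" ]; then
      printf '%s\n' "$root$dir/$unit"
      return 0
    fi
  done
  echo "no unit file found for $unit" >&2
  return 1
}
```

Calling find_unit solr.service prints the path that takes precedence. On a live system, systemctl cat solr.service shows the file that systemd actually loaded, so prefer that when in doubt.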
Let’s make our first systemd unit
Now that we know where systemd units are located, let’s create the unit that is going to manage the Apache Solr process. Please keep in mind that executing the commands described in the rest of this blog post requires elevated privileges, such as sudo or root access. Create a new unit file named solr.service inside the /etc/systemd/system directory and put this content into it:
[Unit]
Description=Apache SOLR
After=syslog.target network.target remote-fs.target nss-lookup.target

[Service]
Type=simple
WorkingDirectory=/opt/solr
PIDFile=/opt/solr/server/ez/solr-8983.pid
ExecStart=/opt/solr/bin/solr -s ez -noprompt -p 8983
User=www-data
ExecReload=/opt/solr/bin/solr restart -p 8983
ExecStop=/opt/solr/bin/solr stop -p 8983
PrivateTmp=true
Restart=on-failure

[Install]
WantedBy=multi-user.target
Let’s just briefly elaborate on the important aspects of our unit file:
- Description - This directive can be used to describe the name and basic functionality of the unit
- WorkingDirectory - Sets the working directory for executed processes. If not set, it defaults to the root directory when systemd is running as a system instance, and to the respective user’s home directory when systemd is running as a user instance.
- After - The current unit is started only after the units listed in this directive. Note that After only affects ordering; it does not cause the listed units to be started.
- PIDFile - This directive is used to set the path of the file that should contain the process ID number of the main child that should be monitored
- ExecStart - This is executed on service startup
- User - This directive sets the UNIX user that the processes are executed as (the group is set with the separate Group directive)
- ExecStop - Commands to be executed on service stop
- ExecReload - Commands to execute when service is reloaded
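- PrivateTmp - Gives the service its own private /tmp and /var/tmp directories that are not shared with other processes, which keeps its temporary files out of reach of other services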
- Restart - Configures whether the service shall be restarted when the service process exits, is killed, or a timeout is reached. There is a very nice and structured table with the explanation of every option in the official man pages.
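- WantedBy - When the unit is enabled, this directive specifies the target that pulls it in. multi-user.target roughly corresponds to the multi-user run level of older init systems.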
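Before handing a unit file to systemd, a rough pre-flight check can catch copy and paste accidents such as a missing section or directives that ended up on one line. The check_unit_file helper below is a hypothetical sketch, not part of systemd; it only looks for the three section headers and an ExecStart line.

```shell
# Rough pre-flight check for a service unit file (hypothetical helper).
# Returns 0 when the file contains the [Unit], [Service] and [Install]
# section headers and an ExecStart directive, each at the start of a line.
check_unit_file() {
  file="$1"
  for pattern in '^\[Unit\]' '^\[Service\]' '^\[Install\]' '^ExecStart='; do
    if ! grep -q "$pattern" "$file"; then
      echo "missing: $pattern" >&2
      return 1
    fi
  done
}
```

For example, check_unit_file /etc/systemd/system/solr.service && echo "looks sane". For a real validation, systemd-analyze verify /etc/systemd/system/solr.service lets systemd itself parse the file and report problems.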
Plugging our unit into systemd
Our unit file is now prepared, so we can move on. The next step is to tell systemd about our new unit file, which is done with the systemctl daemon-reload command. Execute it, and at first it will look as if nothing happened. Check that systemd is aware of our unit file with systemctl status solr.service. This should tell you that the service is loaded but inactive, which is perfectly fine. We will soon enable our service unit to run at startup, after the units listed in the After directive of our solr.service file. Before starting the service, check the filesystem permissions for the www-data user. Ensure that www-data is the owner of the solr directory, as the Solr process is going to do quite a lot of reading from and writing to this directory:
chown -Rh www-data: /opt/solr
And finally, start the service with the systemctl start solr.service command. Executing systemctl --type=service should output the list of active services, and our solr.service should be on that list. If everything works fine, it is safe to make the service run on system boot: it needs to be enabled inside systemd with the systemctl enable solr.service command.
There are also some other interesting commands available for systemd that might be useful:
- systemctl stop solr.service - Stops solr.service
- systemctl restart solr.service - Restarts solr.service
- systemctl disable solr.service - Disables solr.service from running automatically on boot
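The steps from this section can be collected into one small script. This is a minimal sketch that assumes the unit file is already in place; the run and deploy_unit helpers and the DRY_RUN variable are hypothetical names introduced here so the sequence can be previewed without touching the system.

```shell
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Reload unit definitions, start the service, and enable it at boot
# only if it is actually running afterwards.
deploy_unit() {
  unit="${1:-solr.service}"
  run systemctl daemon-reload &&
  run systemctl start "$unit" &&
  run systemctl is-active --quiet "$unit" &&
  run systemctl enable "$unit"
}
```

Running DRY_RUN=1 deploy_unit solr.service prints the four systemctl commands in order, while deploy_unit solr.service executes them (with root privileges) and stops at the first one that fails.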
When things go wrong
If for some reason the process described in the previous step is not working for you, or if it was working but has stopped for some unknown reason, you should inspect the logs. systemd ships with an excellent tool for that: journalctl, a command for viewing the logs collected by systemd. Just enter journalctl on the command line and you will see everything systemd has collected, from every unit, which is too much to be of any help. The good news is that logs can be filtered by unit. Executing journalctl -u solr will give you only the entries relevant to the solr.service unit.
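Filtering by unit can be combined with a time window to narrow the output further. The unit_logs function below is a small sketch; the JOURNALCTL variable is a hypothetical override that defaults to the real journalctl binary and exists only so the function can be tried on a machine without a journal.

```shell
# Print log entries for one unit only, starting from a given point in time.
# $1 = unit name, $2 = start of the time window (anything --since accepts)
# JOURNALCTL is a hypothetical override; it defaults to journalctl.
unit_logs() {
  unit="${1:-solr.service}"
  since="${2:-1 hour ago}"
  "${JOURNALCTL:-journalctl}" --no-pager -u "$unit" --since "$since"
}
```

For example, unit_logs solr.service today prints today’s entries for Solr. When watching a restart live, journalctl -u solr.service -f follows new entries as they arrive, and adding -p err limits the output to errors and worse.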
Please be aware that every time you change the unit file, you need to tell systemd to take those changes into account before starting the service again. That is done with the systemctl daemon-reload command. Also, keep in mind that systemd rate-limits service starts through the StartLimitIntervalSec and StartLimitBurst directives, which prevent a service from being started again if it was started too many times within the configured interval.
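If the defaults do not suit your setup, the start limit can be set explicitly in a drop-in file instead of editing the unit itself. The sketch below assumes a systemd version where these directives live in the [Unit] section (older releases used StartLimitInterval in [Service]); the helper name, the file name start-limit.conf and the values 60 and 5 are illustrative assumptions, not recommendations.

```shell
# Write a drop-in that sets an explicit start rate limit for a unit.
# $1 = drop-in directory
#      (on a real system: /etc/systemd/system/solr.service.d)
write_start_limit_dropin() {
  dir="$1"
  mkdir -p "$dir" || return 1
  cat > "$dir/start-limit.conf" <<'EOF'
[Unit]
# Allow at most 5 starts within any 60 second window
StartLimitIntervalSec=60
StartLimitBurst=5
EOF
}
```

After calling write_start_limit_dropin /etc/systemd/system/solr.service.d, run systemctl daemon-reload so the drop-in is picked up. If the service has already hit the limit, systemctl reset-failed solr.service should clear its failed state so that it can be started again.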
Still using eZ Find
While eZ Find is quite old by now and has been superseded by the eZ Platform Solr Search Engine, it is a proven and stable search library, so many projects still depend on it. I’ve prepared a systemd unit file for those who want to control that Solr process as well:
[Unit]
Description=Apache SOLR for My Project
After=syslog.target network.target remote-fs.target nss-lookup.target

[Service]
Type=simple
WorkingDirectory=/var/www/my_project/ezpublish_legacy/extension/ezfind/java
ExecStart=/usr/bin/java -server -Xms64M -Xmx512M -Dsolr.solr.home=/var/lib/solr/solr410/cores -Djetty.port=8984 -jar start.jar
Restart=on-failure
User=www-data
PrivateTmp=true
LimitNOFILE=10000

[Install]
WantedBy=multi-user.target
Just follow the steps described previously to enable and start this unit.
The idea of this blog post was not to get into all the nitty-gritty details of systemd. There are many things related to service units that I didn’t mention, like advanced service unit configuration, targets, etc. The intention of this blog post is to guide you through running Apache Solr with systemd in a simple way. We went through the arguments for why running Solr under some sort of process manager is important and what systemd is, and we created a simple unit file that anyone can extend and tweak to their needs. We also showed some basic usage of the systemctl and journalctl commands.