We use the Daemons Ruby Gem for a variety of applications. It has served us well, but we found ourselves wrapping the “stop” command with a shell script that makes sure the process actually dies. This behavior is necessary for our deploy scripts which restart daemons. Thanks to the magic of Ruby, we were able to eliminate these extra scripts with a simple Daemons extension.
Before extension: stop command returns immediately, pid file is deleted and we have no clue if the process is dead.
After extension: stop command blocks until the process is dead, giving feedback along the way.
To make sure the stop command doesn’t hang indefinitely, we send the TERM signal, then send the KILL signal (kill -9) if the process hasn’t died after a configurable amount of time. To use this extension, specify :force_kill_wait in seconds as part of the Daemons options hash:
Daemons.run_proc('dantalion', :force_kill_wait => 30) { ...
The implementation starts by making sure the pid files match what the UNIX ‘ps’ command reports. Let’s crack open the Daemons::ApplicationGroup class and redefine the find_applications method:
# pidfiles (e.g. find application if pidfile is gone)
# We recreate the pid files if they're not there.
def find_applications(dir)
  # Find pid_files, like original implementation
  pid_files = PidFile.find_files(dir, app_name)
  @monitor = Monitor.find(dir, app_name + '_monitor')
  pid_files.reject! {|f| f =~ /_monitor.pid$/}
  # Find the missing pids based on the UNIX pids
  pidfile_pids = pid_files.map {|pf| PidFile.existing(pf).pid}
  missing_pids = unix_pids - pidfile_pids
  # Create pidfiles that are gone
  if missing_pids.size > 0
    puts "[daemons_ext]: #{missing_pids.size} missing pidfiles: " +
         "#{missing_pids.inspect}... creating pid file(s)."
    missing_pids.each do |pid|
      pidfile = PidFile.new(dir, app_name, multiple)
      pidfile.pid = pid # Doesn't seem to matter if it's a string or Fixnum
    end
  end
  # Now get all the pid files again
  pid_files = PidFile.find_files(dir, app_name)
  return pid_files.map {|f|
    app = Application.new(self, {}, PidFile.existing(f))
    setup_app(app)
    app
  }
end
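The listing above depends on a unix_pids helper whose implementation lives in the attached file and is not shown here. As a rough illustration only, here is a minimal sketch of how such a helper could work, assuming it parses `ps` output and returns integer pids. The names pids_from_ps and unix_pids_sketch, the `ps` flags, and the substring match on the app name are all assumptions for illustration, not the extension's actual code.

```ruby
# Hypothetical stand-in for unix_pids; the attached implementation may differ.
# Parses "PID COMMAND" lines and returns the Integer pids whose command line
# mentions app_name. Lines that do not start with a number are skipped.
def pids_from_ps(ps_output, app_name)
  ps_output.each_line.map { |line|
    m = line.strip.match(/\A(\d+)\s+(.+)\z/)
    m[1].to_i if m && m[2].include?(app_name)
  }.compact
end

# Assumed live wrapper: POSIX-style ps flags, minus the calling process
# (the stop command itself usually mentions app_name as well).
def unix_pids_sketch(app_name)
  pids_from_ps(`ps -e -o pid= -o args=`, app_name) - [Process.pid]
end

# Canned ps output so the example does not depend on what is running.
sample = <<~PS
  101 ruby dantalion
  202 /usr/sbin/sshd -D
  303 ruby dantalion
PS

running  = pids_from_ps(sample, 'dantalion') # => [101, 303]
recorded = [101]                             # pids read from existing pid files
missing  = running - recorded                # => [303]
puts "missing pidfiles for: #{missing.inspect}"
```

Returning integers matters in this sketch because the array difference only removes elements that compare equal, so a string "101" would never cancel out an integer 101 read from a pid file.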
Now we can be sure that we’ll send a signal to our process even if the pid file was initially missing. You can reference the attached code to see how unix_pids is implemented. Next, we redefine Daemons::ApplicationGroup.stop_all:
# block until the process is dead. It first sends a TERM signal, then
# a KILL signal (-9) if the process hasn't died after the wait time.
def stop_all(force = false)
  @monitor.stop if @monitor
  wait = options[:force_kill_wait].to_i
  if wait > 0
    puts "[daemons_ext]: Killing #{app_name} with force after #{wait} secs."
    # Send term first, don't delete PID files.
    @applications.each {|a| a.send_sig('TERM')}
    begin
      started_at = Time.now
      # Timeout is part of the standard library (require 'timeout').
      Timeout::timeout(wait) do
        num_pids = unix_pids.size
        while num_pids > 0
          time_left = wait - (Time.now - started_at)
          puts "[daemons_ext]: Waiting #{time_left.round} secs on " +
               "#{num_pids} #{app_name}(s)..."
          sleep 1
          num_pids = unix_pids.size
        end
      end
    rescue Timeout::Error
      @applications.each {|a| a.send_sig('KILL')}
    ensure
      # Delete Pidfiles
      @applications.each {|a| a.zap!}
    end
    puts "[daemons_ext]: All #{app_name}(s) dead."
  else
    @applications.each {|a|
      if force
        begin; a.stop; rescue ::Exception; end
      else
        a.stop
      end
    }
  end
end
Now we can be sure that our process is dead when the stop command returns… at least as sure as kill -9 will kill a process (I have seen kill -9 fail to kill a process, but that was 8 years ago on SunOS). The extension also works with :multiple => true. Note that because of the system calls, this extension will not work on all operating systems… which is also a good reason not to patch Daemons itself.
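For readers who want to try the TERM-then-KILL escalation outside of Daemons, here is a minimal standalone sketch of the same idea for a single pid. It is not the extension's code: process_alive? and stop_with_escalation are hypothetical names, liveness is checked with signal 0 rather than by parsing `ps`, and a monotonic deadline loop stands in for Timeout::timeout.

```ruby
# Hypothetical helper: true if a process with this pid currently exists.
# Signal 0 delivers nothing; the call only checks whether a signal could be sent.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # it exists, but we are not allowed to signal it
end

# Hypothetical helper: send TERM, poll until wait_secs have passed, then send
# KILL if the process is still there. Returns :term, :kill or :gone.
# Caveat: an unreaped child of the caller (a zombie) still counts as alive.
def stop_with_escalation(pid, wait_secs, poll_interval = 0.1)
  Process.kill('TERM', pid)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + wait_secs
  while process_alive?(pid)
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      Process.kill('KILL', pid)
      sleep poll_interval while process_alive?(pid)
      return :kill
    end
    sleep poll_interval
  end
  :term
rescue Errno::ESRCH
  :gone # the process vanished before a signal could be delivered
end
```

A call such as stop_with_escalation(pid, 30) would mirror :force_kill_wait => 30 for one process. The return value tells a deploy script whether the daemon exited gracefully or had to be killed, which is roughly the feedback the extension prints along the way.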
Download daemons_extension.rb
Download New Daemons Extension (tested with 1.0.10 and 1.0.8)
Download CHANGELOG

18 Comments
Just what I was looking for! The extensions work great. Thanks!
hey guys,
I am having this *exact* same issue… monit sends stop, then starts, blows away the pid file, and the daemons gem does not clean up after itself.
This looks perfect!
btw, have you submitted this to the maintainers of the gem?
Adam
Hi Chris,
Thanks very much for this!
I was still having a problem with some daemons not dying — hopefully this will help anyone with the same issue.
I poked around and noticed that by the time the call is made to send the KILL signal to any remaining processes, all of the pid files were already deleted, regardless of the daemon’s status.
I fixed this by adding the following line to the beginning of the exception handler that handles Timeout::Error in stop_all:
find_applications(pidfile_dir())
Chris, can I ask which version of daemons you are using?
Hi Chris/Adam – We’re currently running on version 1.0.10, but we’ve also updated the extension (works fine on 1.0.8 too). I’ll upload the new code and a changelog.
We considered submitting a patch, but decided not to because this extension is OS dependent. Then again, these are just options that could ship with a warning. Maybe it’s worth a shot.
@Chris : Thanks for the reply. Looking forward to seeing the updated code.
looks good, maybe it could help me out ?
I need to run 2 cmds in the background so they do not “block” (Linux 2.6) while a 3rd cmd does its own thing, when cmd3 is complete i want a full tear down of the other runners.
system('cmdx')
cmd1 netstat 1 >> /netstat.out
cmd2 dstat --output /dstat.out
cmd3 bonnie++ /mnt/files
The cmds run fine but I can never control “netstat” or “dstat” they always become orphaned
even when “daemonized”
sudo gem sources -a http://gems.github.com
sudo gem install seamusabshere-daemons
just a quick-and-dirty github gem with Chris’s fix.
(note: the gem version is set to 1.0.11 or higher in order to supersede 1.0.10.0, the last version available on rubyforge at time of posting)
Seamus – Thanks for posting the link. Thomas Uehlinger and I were discussing getting this fix into the next version, but that was back in July 2008. I’ve sent him an email to see what his thoughts are.
We have the same issue with daemons not dying properly. “stop” does not block, and deletes the pidfile before the daemon does. Starting a new daemon before the first dies can result in the first daemon deleting the second daemon’s pidfile as it exits.
According to the ruby Process.kill docs, Process.wait should be used to wait for the process to die: http://www.ruby-doc.org/core/classes/Process.html#M003183
So I believe the daemon module’s bug could be fixed quite easily with only one line: Process.wait
Sending KILL after a fixed amount of time does technically solve the issue, although it’s sort of overkill (excuse the pun).
Ok, more than one line. Process.wait doesn’t work because the pid is not a child of the process that is stopping everything. Instead you need:
def fancy_wait pid
  begin
    while (1) do
      Process.kill(0, pid)
      sleep 0.1
    end
  rescue Errno::ESRCH, Errno::ECHILD
  end
end
Hi Greg,
We’re always looking for ways to simplify code, so thanks for the ideas. I have a couple of questions:
1. What signal does '0' correspond to (in 'Process.kill(0, pid)')? On both my Mac and CentOS, TERM is '15' and KILL is '9'. Maybe I’m just missing something…
2. Does this work with :multiple => true? Perhaps it would help to see where this fits into the Daemons code.
Our use cases might be very different, but it is important for us to receive feedback via STDOUT along the way. In your solution, we could just add a ‘puts’ in the while loop, but we probably wouldn’t want that message to appear 10 times a second. That brings up another subtle difference. In your solution, a signal is sent after every sleep period. In our solution we send a TERM signal, then wait a configurable amount of time for the process to exit gracefully before sending the KILL signal.
Signal 0 is, as best I can tell, a meta-signal that causes an exception if the process does not exist. I saw it in some Ruby code, which I copied, then discovered that this signal is not only not POSIX, it’s not even uniformly supported on Linux! I think Greg meant "Process.kill('TERM', pid)", as signal 0 would not stop anything.
Note that Greg’s code is merely a wait function, not a kill. It continues to loop as long as the process exists. kill -0 does not signal the process, but instead returns true if the process is alive, false otherwise. It’s a pretty standard kill value.
Thanks for posting this, I was having the same problem.
I ended up switching to the daemons-spawn gem instead using instructions here and that also fixed it:
http://rwldesign.com/journals/1-solutions/posts/24-working-with-delayed-job
One drawback is the daemons-gem script doesn’t (yet) support multiple workers with a -n option (it only launches one). Probably wouldn’t be too hard to modify it. Anyway, just wanted to post another solution.
delayed_job should really update their daemonizing instructions on github! As evidenced by this thread, what they have posted is not stable in production.
Thanks!
Brian
http://feedmailpro.com
As Tony said, that is just a wait function. In fact, the Daemons library already uses kill(0, pid) to implement Daemons::Pid.running?(pid), and as noted there kill(0, pid) does not send a signal; it’s just a syscall to see if a signal may be sent.
So you can see where it fits into the Daemons code, and so there’s a gem, I’ve made a github repo out of the Daemons SVN repository:
http://github.com/ghazel/daemons/
Here is the relevant commit:
http://github.com/ghazel/daemons/commit/3e91f91c5a95409bdbd54039e4163ea509d66619
Thanks Chris and Seamus, this fix & gem was exactly what I needed
great extension , thanks
2 Trackbacks
[...] This problem has thankfully been solved by the use of RapLeafs’ daemon_extension code which is basically a bundle of hacks to kill -9 a daemon that refuses to die after a certain timeout period. This isn’t perfect by any stretch of the imagination, but from a pragmatists point of view: It’ll do! [...]
[...] Making sure Ruby Daemons die (tags: ruby sysadmin) [...]