I'm currently working on an OES2 project involving Domain Services for Windows, and we're having problems with the ndsd process (eDirectory) crashing. The crashes are random, and although we are trying to capture a coredump for analysis, we have not been able to pinpoint the offending process or obtain a coredump with enough detail to find the exact problem. We cannot reproduce the crash at will, and when it does happen, we have very little information to work with. In the meantime, the project has to move forward, and we need eDirectory to be functional and stable.
I've been working closely with Darren McGary from Novell Technical Support on this issue. He has been very helpful, and provided me with a script that monitors the ndsd process. If the ndsd process dies, the script restarts it automatically*. The script can also be configured to send an email or text message any time a failure occurs. So far, this has been extremely helpful. I thought I would share it since I have not seen a similar script out there. (*Technically, the script checks at defined intervals whether the process is dead or alive. It does not know immediately that the process has failed, but will find out the next time the script runs.)
The requirements for this to work are as follows:
1) Save the script file on the SLES/OES Linux server. The script name that Darren provided was "ndsdmon.sh", but you could call it whatever you want.
2) Make the script executable by using the command "chmod +x ndsdmon.sh"
3) Add an entry to your crontab so the script runs every 5 minutes (or at whatever interval you prefer).
CRONTAB Entry:
1) Edit the cron table with the command:
crontab -e
2) At the bottom of the file, press "i" for insert mode (you're in the "vi" editor)
3) Enter the following into the file (change the path to the location where you placed the script file):
*/5 * * * * /tmp/ndsdmon.sh
4) Save and exit. You're in the vi editor, so use the ESC key, then :wq sequence to write/quit.
5) At the command console, confirm that you see the above command when you list the contents of your crontab file:
crontab -l
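For scripted setups, the same entry can be added without opening vi. The following is only a minimal sketch, assuming the script lives at /tmp/ndsdmon.sh as in the entry above. The function name add_cron_entry is hypothetical (it is not part of Darren's script). It reads an existing crontab on standard input and prints it with the entry appended, skipping the append if that exact line is already there, so it is safe to run more than once.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script):
# reads an existing crontab on stdin and prints it with the given
# entry appended, unless that exact line is already present.
add_cron_entry() {
    local entry="$1"
    local current
    current="$(cat)"    # existing crontab text (may be empty)
    if printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        # Entry already present: print the crontab unchanged.
        printf '%s\n' "$current"
    elif [ -z "$current" ]; then
        # Empty crontab: the entry becomes the only line.
        printf '%s\n' "$entry"
    else
        printf '%s\n%s\n' "$current" "$entry"
    fi
}

# Example (commented out so nothing is installed by accident):
# crontab -l 2>/dev/null | add_cron_entry "*/5 * * * * /tmp/ndsdmon.sh" | crontab -
```

The function itself only prints text. Nothing is installed until its output is piped into "crontab -", so you can review the result first.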
ndsdmon.sh Script File:
Copy the entire contents below into your ndsdmon.sh file. No editing should be required unless you wish to change the path of the log file, or you wish to add an email address for notifications.
#!/bin/bash
LOGFILE="/tmp/ndsdmon.log"
echo "================= Started $(date) =====================" >> $LOGFILE
/usr/sbin/rcndsd status &>/dev/null
returnCode=$?
echo "Return Code: $returnCode" >> $LOGFILE
if [ $returnCode == "0" ]; then
    echo -e "NDSD service running" | tee -a $LOGFILE
else
    # printf "eDirectory is not running on server: $(cat /etc/opt/novell/eDirectory/conf/nds.conf | grep 'n4u.nds.server-name' | cut -d= -f2)" | mail -s "eDirectory is DOWN" @txt.att.net
    # printf "eDirectory is not running on server: $(cat /etc/opt/novell/eDirectory/conf/nds.conf | grep 'n4u.nds.server-name' | cut -d= -f2)" | mail -s "eDirectory is DOWN"
    echo "NDSD service NOT running, attempting restart now" | tee -a $LOGFILE
    /usr/sbin/rcndsd restart
    echo "NDSD restart attempt complete" | tee -a $LOGFILE
fi
echo "================= Finished $(date) =====================" >> $LOGFILE
Summary
Use this script to monitor the ndsd process (eDirectory) on Novell OES2 and restart it if it fails for whatever reason. Furthermore, you can plug in an email address to receive notifications. I have found this script extremely helpful, and believe it could be very helpful even if eDirectory is not crashing regularly.
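If you want to try the check-then-restart logic before pointing it at a live server, one option is to wrap it in a function that takes the status and restart commands as arguments. This is only a sketch of the same idea, not Darren's script: the function name check_and_restart and its argument order are hypothetical, and the messages are shortened. With stand-in commands such as "true" and "false", it can be dry-run on any machine without eDirectory installed.

```shell
#!/bin/bash
# Hypothetical wrapper around the same check-then-restart logic.
#   $1 = log file, $2 = status command, $3 = restart command
# Returns 0 if the service was running, 1 if a restart was attempted.
check_and_restart() {
    local logfile="$1" status_cmd="$2" restart_cmd="$3"
    # The commands are left unquoted on purpose so that a string such as
    # "/usr/sbin/rcndsd status" splits into a command and its argument.
    if $status_cmd &>/dev/null; then
        echo "service running" | tee -a "$logfile"
        return 0
    fi
    echo "service NOT running, attempting restart now" | tee -a "$logfile"
    $restart_cmd &>/dev/null
    echo "restart attempt complete" | tee -a "$logfile"
    return 1
}

# Dry run with stand-in commands (no eDirectory needed):
# check_and_restart /tmp/ndsdmon-test.log false true
# Real use, assuming the rcndsd path from the script above:
# check_and_restart /tmp/ndsdmon.log "/usr/sbin/rcndsd status" "/usr/sbin/rcndsd restart"
```

Passing the commands in keeps the logic identical for the dry run and the real run. Only the arguments change.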