
The NWNX Community Forum

 
watchdog.sh

 
Calvinthesneak



Joined: 15 Nov 2010
Posts: 14

Posted: Mon Nov 15, 2010 1:12    Post subject: watchdog.sh

Hi all... I've been having issues with nwserver hanging (probably due to module issues). Anyhow, I've been playing with a watchdog process.

Anyone have any constructive feedback on this or suggestions on how I should improve it?



Code:

#!/bin/bash
# NWServer process monitor

# Go to the right directory
NWN_DIR="/mynwndirectory"
cd "$NWN_DIR" || exit 1

# Command to start the server
START="/mynwnfolderpath/nwn/nwservctl.sh start"

# pgrep command
PGREP="/usr/bin/pgrep"

# Daemon name
DAEMON="nwserver"

# Command to kill nwserver so it can be started again
KILL="killall -9 $DAEMON"

# Print the CPU usage of the given PID as a whole number ([ -gt ] cannot
# compare decimals). top's first sample is not measured over the -d interval,
# so take two samples and keep the last one.
# Assumes %CPU is column 9, as in top's default layout.
cpu_usage() {
    top -b -n2 -d3 -p "$1" | awk -v pid="$1" '$1 == pid { cpu = $9 } END { printf "%d\n", cpu }'
}

PID=$($PGREP -o -x "$DAEMON")
if [ -z "$PID" ]    # nothing is running
then
    $START          # start the server
else
    # CPU usage checks
    PEG1=$(cpu_usage "$PID")
    if [ "$PEG1" -gt 90 ]
    then
        echo "CPU usage: $PEG1"
        echo "Ok trouble detected, wait 30 secs and check again."
        sleep 30
        PEG1=$(cpu_usage "$PID")
        if [ "$PEG1" -gt 90 ]
        then
            echo "CPU usage: $PEG1"
            echo "Two failed checks, certainly not good."
            sleep 30
            PEG1=$(cpu_usage "$PID")
            if [ "$PEG1" -gt 90 ]
            then
                echo "Ok three failed checks on CPU usage, time to restart."
                $KILL
                sleep 5
                $START
            fi
        fi
    fi
fi


Minor edit to fix a syntax mistake and add a bit more debugging output.
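
The three nested checks in the script above could also be written as a loop that counts consecutive readings over the limit. This is only a rough sketch: `read_cpu` and `needs_restart` are made-up names, and `read_cpu` is stubbed with a fixed value here where a real version would parse top or ps output.

```shell
#!/bin/bash
# Hypothetical helper: prints the CPU usage of nwserver, e.g. "97.3".
# Stubbed with a fixed value for illustration only.
read_cpu() { echo "97.3"; }

# needs_restart LIMIT CHECKS DELAY
# Returns 0 only if CHECKS consecutive readings are all above LIMIT,
# sleeping DELAY seconds between readings. Returns 1 as soon as one
# reading is at or below LIMIT.
needs_restart() {
    local limit=$1 checks=$2 delay=$3 i cpu
    for ((i = 1; i <= checks; i++)); do
        cpu=$(read_cpu)
        cpu=${cpu%%.*}    # drop decimals: [ -gt ] only compares integers
        cpu=${cpu// /}    # drop any padding spaces
        [ "${cpu:-0}" -gt "$limit" ] || return 1
        [ "$i" -lt "$checks" ] && sleep "$delay"
    done
    return 0
}
```

With that in place, the restart branch would shrink to something like `if needs_restart 90 3 30; then $KILL; sleep 5; $START; fi`.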


Last edited by Calvinthesneak on Tue Nov 16, 2010 11:06; edited 1 time in total
Ravine



Joined: 26 Jul 2006
Posts: 105

Posted: Mon Nov 15, 2010 10:44

Hmm. The same happens to me sometimes. No idea what's causing this, but I doubt it's module-related. Do you run your server in 'screen'? What version?

See this:
http://www.nwnx.org/phpBB2/viewtopic.php?t=1317

Maybe it's losing STDIN? From screen, I can write to the console, but I receive no response. Too bad I can't easily reproduce this error; it happens about once a week.

If this is the problem, running with '-quiet' should solve it, but I need the console to communicate with the players...
Calvinthesneak



Joined: 15 Nov 2010
Posts: 14

Posted: Mon Nov 15, 2010 16:25

No sir, I run the server as a daemon, via another shell script.

The script is this one here:
http://nwn.bioware.com/forums/viewcodepost.html?post=6367678

It's run in standard mode. Sometimes the nwserver process just hangs. It can be an hour after reboot, or a day. There's nothing consistent about it. Memory usage is something we're trying to monitor to see if there is a leak somewhere.

EDIT: I should note this correction to the script, unless you want to recompile your kernel.
Line 28 should be changed to the following:

CATCH_OUTPUT="empty -r -t 30 -b 8192 -i out.fifo" #8192 bytes is size of FIFO on linux
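
For the memory monitoring mentioned above, one rough way to watch for a leak is to log the resident memory of nwserver at intervals and see whether it keeps growing. A minimal sketch, assuming Linux's /proc filesystem; the function names and the log path in the usage note are made up for illustration.

```shell
#!/bin/bash
# rss_kib PID
# Print the resident set size (VmRSS, in kB) of the given PID, read from
# /proc. Fails if the process does not exist or reports no VmRSS.
rss_kib() {
    awk '/^VmRSS:/ { print $2; found = 1 } END { exit !found }' \
        "/proc/$1/status" 2>/dev/null
}

# log_rss PID FILE
# Append one "timestamp rss" line to FILE.
log_rss() {
    local rss
    rss=$(rss_kib "$1") || return 1
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$rss" >> "$2"
}
```

It could be called from the watchdog as, for example, `log_rss "$(pgrep -o -x nwserver)" /tmp/nwserver-rss.log`.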
Calvinthesneak



Joined: 15 Nov 2010
Posts: 14

Posted: Sat Nov 20, 2010 2:29

No suggestions to make the script better then?
Ravine



Joined: 26 Jul 2006
Posts: 105

Posted: Sun Mar 11, 2012 18:16

Hi!

We are still running into this "server hang" bug from time to time. Has anyone found a solution?
At first I thought it was the stdin bug described here:
http://www.nwnx.org/phpBB2/viewtopic.php?t=1317

But it looks like it's not. I'm running the server with -quiet, and it still happens sometimes.

axs found something too:
http://www.nwnx.org/phpBB2/viewtopic.php?t=1314

And there's an early topic about this:
http://www.nwnx.org/phpBB2/viewtopic.php?t=236

This was mentioned in the old BW forums too (see Omnibus). Some admins claimed that it is caused by a conflict between the blindness effect and the truesee/ultravision effects, but that was many years ago.


This bug pisses me off, because it mostly happens when 4-5+ players are on the server (which is a record nowadays), and I'm not even nearby to restart it.

doh.
leo_x



Joined: 25 Aug 2010
Posts: 75

Posted: Sun Mar 11, 2012 23:12

If the hang is on reset, have you tried
Code:

/* Shut down the current process. If nForce is specified, the process will be
 * force-killed in that number of seconds, in case it hangs during shutdown.
 */
void ShutdownServer (int nForce=0);


in Acaos' nwnx_system plugin instead of the reset plugin? I had the hang on reset issue and this has overcome it.
_________________
the awakening (PW Action)
Ravine



Joined: 26 Jul 2006
Posts: 105

Posted: Wed Apr 18, 2012 17:14

Doh. I deleted this post; the server just stopped working, so it's not the SCO/RCO :(
All times are GMT + 2 Hours