Friday, July 20, 2007

Linux NSD Tip

Linux NSD files generally contain an "Input arguments" line that includes the -crashpid and -crashtid values. For example:

Input arguments : -batch -crashpid 4459 -crashtid 134 -wrapper
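If you'd rather pull those two numbers out with a script, here is a minimal Python sketch. The function name parse_crash_ids is made up for illustration, and the pattern only assumes the argument layout shown in the example line above; other NSD versions may format the line differently.

```python
import re

# Matches the two crash arguments as they appear in the example line above.
ARGS_RE = re.compile(r"-crashpid\s+(\d+)\s+-crashtid\s+(\d+)")

def parse_crash_ids(line):
    """Return (pid, tid) as ints, or None if the line has no crash arguments."""
    match = ARGS_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

line = "Input arguments : -batch -crashpid 4459 -crashtid 134 -wrapper"
print(parse_crash_ids(line))  # (4459, 134)
```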

You can find out which process crashed by searching for the crashpid among the processes listed in the Notes Process Summary section:

notes R .. 3347 3337 -bash -c /Domino/lotus/bin/server
notes R ... 3568 3347 /Domino/lotus/notes/latest/linux/server
...
notes R .... 4459 3568 /Domino/lotus/notes/latest/linux/http
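The same lookup can be sketched in Python. This is only a rough illustration: find_process is a hypothetical helper, and it assumes that the first two adjacent integer columns on a line are the pid and the parent pid, with the command line following them. That holds for the abbreviated lines shown here, but a real process summary has more columns and may need a stricter pattern.

```python
import re

# Assumption: the first pair of adjacent integer columns is "pid ppid",
# and everything after that pair is the command line.
PROC_RE = re.compile(r"(?<!\S)(\d+)\s+(\d+)\s+(\S.*)$")

def find_process(summary_lines, pid):
    """Return the command line of the process with this pid, or None."""
    for line in summary_lines:
        match = PROC_RE.search(line)
        if match and int(match.group(1)) == pid:
            return match.group(3).strip()
    return None

summary = [
    "notes R .. 3347 3337 -bash -c /Domino/lotus/bin/server",
    "notes R ... 3568 3347 /Domino/lotus/notes/latest/linux/server",
    "notes R .... 4459 3568 /Domino/lotus/notes/latest/linux/http",
]
print(find_process(summary, 4459))  # /Domino/lotus/notes/latest/linux/http
```

Matching on the pid column rather than on any occurrence of the number matters because the same number can also show up as another process's parent pid (3568 above, for example).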


So we've already determined that the http task was the crashing process. But here's another handy search:

crashpid:crashtid

Enter the crashpid, then a colon, then the crashtid. The tid is printed in a field up to 4 characters wide, padded on the left with spaces instead of zeroes, so a 3-digit tid has one space between the colon and the tid, and a 2-digit tid has two. This search can help you find a database that may have caused the crash.

In this instance we would search the NSD for: 4459: 134
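Since the spacing is easy to get wrong by hand, here is a small Python sketch that builds the search string using the padding rule described above (tid right-aligned in a 4-character field). The function name crash_search_string is just for illustration.

```python
def crash_search_string(pid, tid):
    """Build the "pid:tid" search string, padding the tid to 4 characters."""
    # ">4" right-aligns the tid in a 4-character field, filling with spaces.
    return f"{pid}:{tid:>4}"

print(repr(crash_search_string(4459, 134)))  # '4459: 134'
print(repr(crash_search_string(4459, 90)))   # '4459:  90'
```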

When you find that string, you know that you're looking at the actual crash thread of the crashing process. For instance, you may see the following section:

/Domino/notesdata/Path/DB.nsf
Version = 43.0
SizeLimit = 0, WarningThreshold = 0
ReplicaID = 0x85256e42:0x034ece60
bContQueue = NSFPool [ 0006dc45]
Offline = No
DeleteInProgress = No
FDGHandle = 0xf024057a, RefCnt = 9, Dirty = Y
SemContQueue ( RWSEM:#0:0x029d) rdcnt=-1, refcnt=0 Writer=[ : 0], n=0, wcnt=-1, Users=-1, Owner=[ : 0]
By: [ http:4459: 90] DBH= 311, User=Imperial
...
By: [ http:4459: 134] DBH= 371, User=CN=User Name/O=Org

So, with this tip, you can see that the crash thread (http, pid 4459, tid 134) had Path/DB.nsf open at the time of the crash, which makes something in that application the likely cause of the server crash.
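For a large NSD, the search can be scripted. The following Python sketch is an illustration only: databases_for_thread is a hypothetical helper, and it assumes that each database entry starts with a line holding a path ending in .nsf and that the users of that database follow on "By:" lines, as in the excerpt above. It tolerates any amount of spacing before the tid.

```python
import re

def databases_for_thread(nsd_lines, pid, tid):
    """Return the .nsf paths whose "By:" lines mention the given pid:tid."""
    # Allow any number of spaces between the colon and the tid.
    by_re = re.compile(rf":{pid}:\s*{tid}\]")
    current_db = None
    hits = []
    for line in nsd_lines:
        text = line.strip()
        if text.lower().endswith(".nsf"):
            # Assumption: a line ending in .nsf starts a new database entry.
            current_db = text
        elif current_db and text.startswith("By:") and by_re.search(text):
            if current_db not in hits:
                hits.append(current_db)
    return hits

excerpt = [
    "/Domino/notesdata/Path/DB.nsf",
    "Version = 43.0",
    "By: [ http:4459: 90] DBH= 311, User=Imperial",
    "By: [ http:4459: 134] DBH= 371, User=CN=User Name/O=Org",
]
print(databases_for_thread(excerpt, 4459, 134))
```

A thread can have several databases open, so expect a list of candidates rather than a single answer.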
