Well, yesterday my production mail server started having some semaphore issues. I was getting tons of the following errors on my console:
WAITING FOR FRWSEM 0x0244 database semaphore
No clients were responding (HTTP, Notes, etc.), so I had to end the server after 56 days of uptime since loading FP1. After clearing shared memory and semaphores, I restarted it. I also enabled semaphore debugging dynamically by issuing the following commands at the console:
set config DEBUG_THREADID=1
set config DEBUG_SHOW_TIMEOUT=1
set config DEBUG_CAPTURE_TIMEOUT=10
Soon after that I started to panic because the server only stayed up for about 5 minutes. So I got on the phone with Lotus Support and sent them the debug.txt file from my semaphore debugging. The server's been up now for about 24 hours. Lotus Support said that there are some hotfixes out for 6.5.4 FP1. Initially it appears that my debug output matches one of them, but we want to wait for another hang to make sure the errors in my debug output match the ones that hotfix addresses.
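While waiting for the next hang, it can help to know which semaphores show up most often in the captured output. Here's a rough Python sketch that tallies them. It assumes the timeout lines look like the console message quoted above (`WAITING FOR FRWSEM 0x0244 database semaphore`); the regular expression and the function name `tally_semaphore_waits` are my own for illustration, not anything Lotus ships, and the real debug output may need a different pattern.

```python
import re
from collections import Counter

# Assumed message shape, based only on the console line quoted in this post:
#   WAITING FOR FRWSEM 0x0244 database semaphore
# Groups: semaphore name, hex id, description.
WAIT_RE = re.compile(r"WAITING FOR (\w+) (0x[0-9A-Fa-f]+) (.*?) semaphore")


def tally_semaphore_waits(lines):
    """Count how often each (name, id, description) semaphore wait appears."""
    counts = Counter()
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts
```

To run it against a saved log, pass the open file straight in, e.g. `tally_semaphore_waits(open(path))` where `path` points at wherever you saved the console or debug output, then look at `.most_common()` on the result.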
In the meantime, I'm about 60 days behind on PTFs, so I'll probably be IPLing tonight. I'm going to send in my request for downtime now.