Our system is V9.5 FP 6a on RedHat Linux.

We had a rather interesting situation today. We got a report that the DB2 server was "unusable", which usually means that the system is slow. When I looked into it, I noticed that we had experienced 58 lock timeouts within a 30-minute period, which is highly unusual on this server. We always capture the lock timeout information, so I looked at that and noticed that the lock owner was the same session on almost all of the timeouts. I talked to that user, and she told me that her computer had "hung" at her end, so she rebooted. This apparently left her session active on the server, still holding an exclusive lock. That lock caused all of the timeouts; the session was basically an orphan.

I am not sure if this is relevant, but we have a Linux parameter (tcp_keepalive_time) that we set to 1800 (30 minutes) when the server was put into production. What I surmise is that the orphan session stayed active for 30 minutes and then timed out (either through this Linux parameter or some other mechanism), and the lock cleared when the session was rolled back. I know that something in the network layer is controlling this.

My questions: Is this Linux parameter the mechanism that caused the session to be closed, or is there some DB2 setting? If it is a DB2 setting, what is it? Also, what would be a good value for the setting (whether it is Linux or DB2)? I obviously do not want it set so low that network traffic increases because of the probes to detect a dead session.
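For reference, here is a rough sketch of how I understand the Linux keepalive timing, in case it helps frame the question. It assumes the server socket has SO_KEEPALIVE enabled (I have not confirmed that DB2 does this), and the function names are my own. As I understand it, the kernel waits tcp_keepalive_time seconds of idle time, then sends up to tcp_keepalive_probes probes spaced tcp_keepalive_intvl seconds apart before dropping a connection to a peer that never answers.

```python
from pathlib import Path

# Standard location of the kernel keepalive settings on Linux.
PROC_DIR = Path("/proc/sys/net/ipv4")


def read_keepalive_settings(proc_dir=PROC_DIR):
    """Return the three keepalive settings as ints, or None if unreadable."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    try:
        return {n: int((Path(proc_dir) / n).read_text().strip()) for n in names}
    except (OSError, ValueError):
        return None


def dead_peer_detection_seconds(time_s, intvl_s, probes):
    """Idle time before the first probe plus the time spent on unanswered probes."""
    return time_s + intvl_s * probes


if __name__ == "__main__":
    settings = read_keepalive_settings()
    if settings is None:
        print("keepalive settings not readable on this host")
    else:
        total = dead_peer_detection_seconds(
            settings["tcp_keepalive_time"],
            settings["tcp_keepalive_intvl"],
            settings["tcp_keepalive_probes"],
        )
        print(f"worst-case detection of a silent peer: {total} s")
```

With our tcp_keepalive_time of 1800 and the usual Linux defaults for the other two (75-second interval, 9 probes), this gives 1800 + 75 * 9 = 2475 seconds, or about 41 minutes, for a peer that stays completely silent. If the client machine is back up after the reboot and answers the first probe with a reset, I would expect the connection to close right around the 30-minute mark instead, which is closer to what we saw.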

Thanks,

Andy