We were having issues with mysql threads where they would be in sleep mode and wouldn’t die off for long time. At the same time we started having issues with our servers where the load will spike and eventually server will come to halt unless we killed all the apache processes and restarted apache (which seems to be the hung application). We traced it back eventually and noticed that the time when server hung was when it burned through all the ram and was using up all the swap also. So we started to work backwards and tried to resolve one thing at a time. We started with MySQL. We put in wait_timeout = 30 in to my.cnf and restarted mysql. Than I closely watched the server for few hours and noticed that we didn’t have any more of those sleep connections. GREAT! A work around until we get to bottom of whats causing this. That was on Friday. Sat we started noticing different problem. Problem worsened and we started to look into what might’ve caused it and found out that we had a script which was pulling row at a time, processing it, and deleting the row. Except, it was never getting to delete the row due to timeout would kick in and close the connection. We found this out when we watched error logs and saw: Mysql has gone away message.
We took out the wait timeout and everything seems to started to work fine. Did anybody ever notice this behavior where you would loose connection to the mysql server due to timeout? The script which processes line by line and deletes line by line takes fraction of second to process that particular line. Does wait timeout starts counting from the starting of the connection? Does it mean that wait timeout is actually a max connection time limit? Suggestions/comments?
Edit 5/31/09: Friend of mine was getting this error:
database error: Lost connection to MySQL server during query
he got around it by adjusting wait_timeout setting.