Monday 25 January 2010

Email Alert when the number of established connections in Apache reaches a limit on Windows

This method of alerting is actually relevant for any kind of connection checking.

You can see from the script below how easily it can be customised.

I have Apache 2.2.14 with mod_ssl installed on Windows Server 2003.

On this server I have also installed cygwin.

We host a number of HTTPS web applications on this server.

Recently I noticed that suddenly Apache would become totally unresponsive.

After a lot of investigation I found that if I did "netstat -nt" on the server when Apache is unresponsive, then I would see a very large number of connections in ESTABLISHED state on port 443 all from the same IP address.

When this happens Apache becomes unresponsive to all new requests for 5 minutes. Then suddenly all these connections are released and Apache begins to serve requests again.

I am still investigating why certain valid clients seem to suddenly make hundreds of HTTPS connections to our servers and hold the connections until the Apache timeout of 300 seconds kicks in.
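
To see which client is holding the connections, the netstat output can be grouped by remote IP address. The following is only a rough sketch: count_by_ip is a hypothetical helper name, and it assumes the Windows netstat column layout (Proto, Local Address, Foreign Address, State) and IPv4 addresses.

```shell
#!/bin/bash
# Hypothetical helper: count ESTABLISHED connections to local port 443,
# grouped by remote IP. Reads netstat-style lines on stdin and assumes the
# Windows layout: Proto, Local Address, Foreign Address, State.
count_by_ip() {
    awk '$4 == "ESTABLISHED" && $2 ~ /:443$/ {
             split($3, remote, ":")   # remote[1] is the client IP (IPv4 only)
             hits[remote[1]]++
         }
         END { for (ip in hits) print hits[ip], ip }' | sort -rn
}
```

From the cygwin shell it could be used as "netstat -nt | count_by_ip", which lists the busiest client first.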

In conf/extra/httpd-default.conf the default value for Timeout is set to 300 seconds.

You should not normally change this value, but if you are having the same problem as me you do not really want the Apache server to be unresponsive for 5 minutes (300 seconds). I have set this value to 30 seconds, so now the server at least recovers after being unresponsive for 30 seconds.
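
A quick way to confirm which Timeout value is active in a config file is sketched below. show_timeout is a hypothetical helper name, and the sketch assumes that conf/extra/httpd-default.conf is actually included from httpd.conf (in a stock install that Include line may be commented out, in which case the setting in this file has no effect).

```shell
#!/bin/bash
# Hypothetical helper: print the value of every active (uncommented)
# Timeout directive in the given Apache config file.
show_timeout() {
    awk 'tolower($1) == "timeout" {
             sub(/\r$/, "", $2)   # drop a trailing CR from Windows line endings
             print $2
         }' "$1"
}
```

For example, run from the Apache install directory: show_timeout conf/extra/httpd-default.conf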

At the same time I wanted to be alerted by email when this situation occurred.

So I did the following on the Apache server:

Download bmail from this link, extract it, and copy bmail.exe to d:\tools\bin\bmail.exe

In the same directory create a bash script and call it connectionCheck.sh

Assuming:
1. your email address is techsupport@abc.com
2. your server is called XYZ
3. you would like to be alerted as soon as the total number of concurrent established HTTPS connections (443) goes above 50
4. your email server's IP address is 192.168.1.10 and you can connect to this server and send emails from your Apache server.

Then the contents of this script look like:

#!/bin/bash

# Save the established connections on port 443 (HTTPS)
netstat -nt | grep -i established | grep -i :443 > netstat.log
# Count them
X=$(wc -l < netstat.log)
if [ "$X" -gt "50" ];
then
# Email the connection list as the message body
Y=$(cat netstat.log)
/cygdrive/d/tools/bin/bmail.exe -s 192.168.1.10 -t techsupport@abc.com -f XYZ@abc.com -a "XYZ more than 50 https connections" -b "$Y"
fi

I also made the script executable:
chmod +x connectionCheck.sh

and used vim to convert it to Unix line endings:
:set ff=unix

To test the above so far, you can change the value from 50 to 0 and then just execute the script from the cygwin shell when you know there is at least some activity on your HTTPS Apache server:

./connectionCheck.sh

This should send you an email which contains the netstat details at the time of execution.
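
If you would rather not edit the script each time you test, the limit can be passed as an argument instead. This is just a sketch under my own assumptions: over_limit is a hypothetical function name, it reads netstat output on stdin, and it only reports whether the limit was exceeded, leaving the bmail call to the caller.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit status 0) when stdin contains more than
# LIMIT established connections on port 443. LIMIT defaults to 50.
over_limit() {
    local limit="${1:-50}"
    local count
    # grep -c prints the number of matching lines (0 when there are none)
    count=$(grep -i established | grep -c ':443')
    [ "$count" -gt "$limit" ]
}
```

For a dry run from the cygwin shell: netstat -nt | over_limit 0 && echo "would send alert"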

Then create a Windows scheduled task which starts at 00:00, ends at 23:59, repeats every minute and runs:

c:\cygwin\bin\run.exe bash -wait --login -c "/cygdrive/d/tools/bin/connectionCheck.sh"

Now every minute, this script will run silently.

It will send you an email every time the number of established HTTPS connections to your Apache server goes above 50. You can change this number to whatever you want, but I think Apache has a limit of 150, and if you are even hitting 100 concurrent established connections you probably have a very busy site, unless your Apache server takes a very long time to respond to each client request.

Monday 18 January 2010

CYGWIN SSHD service not starting on Windows Server 2008 due to Netlogon service not starting

I installed Cygwin with OpenSSH on a Windows Server 2008 machine.

I ran the ssh-host-config as normal and configured sshd as a service.

When I tried to start the service it gave me an error 1068. Something about not being able to start sshd because it depends on the netlogon service.

I tried to start the netlogon service manually to see if that would help, but it started and then stopped immediately, saying that because the computer is part of a workgroup and not a domain, the netlogon service does not need to run!

I found the solution was to change the startup type of the netlogon service from manual to automatic. Then reboot the server and everything starts to work!
I do not know why this works, but it does for me.
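
The same change can probably be made from the command line with the Windows sc tool rather than through the Services console. This is a sketch, not something I have verified on every setup: it assumes the services are registered as sshd and Netlogon, that sc.exe is on the PATH, and that the shell runs with administrator rights. The function names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: show the sshd service configuration, including the
# services it depends on (assumes the service is registered as "sshd").
sshd_deps() {
    sc qc sshd
}

# Hypothetical helper: set the Netlogon startup type to automatic.
# sc expects a space between "start=" and the value.
netlogon_auto() {
    sc config Netlogon start= auto
}
```

Running sshd_deps first shows whether Netlogon is listed as a dependency, then netlogon_auto followed by a reboot matches the fix described above.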