Site Status.
UPDATE: We have not seen this problem in about a week now, so we may have managed to track down and remove the bugs that were causing it. At this time, the site seems very stable. We're shifting over to working on the database, and after that we'll start working on a new chatroom, built from the ground up.
The site is still having an issue with the limit on pooled database connections being exceeded.
I’m going to try to explain the problem and the steps we’re looking at.
www.jericho-kansas.com is a website which stores most content in a database. When you request a page on the website (usually by following a link, menu item, etc.) the software checks your privileges (are you logged in, registered, a moderator, etc.) and then decides what parts of the website you are allowed to see. This isn’t very unusual. The JKI website takes this a bit further in that every element (we can call these modules) on a page can be viewed or hidden depending on a person’s site privileges.
Most any element is going to be a module (you can think of a module as a container for holding a picture, a movie, some text, etc.). If you look at the front page there is a module (container) for each of the two movies. A module holds the Jericho quote of the day. A module holds “The latest news” etc.
When a page loads, the following happens: a call is made to the database to return all the modules on that page (their types, placements, etc.). Then each module calls the database to ask what it should show.
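For the technically curious, here is a rough sketch in Python of the page-load steps just described. This is not the site's actual code: the table names, the numeric role levels, and the render_page function are all made up for illustration. It only shows how one call for the module list plus one call per visible module adds up.

```python
import sqlite3

def build_demo_db():
    """Create a small in-memory database (hypothetical schema, for illustration only)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE modules (id INTEGER PRIMARY KEY, page TEXT, kind TEXT, min_role INTEGER)")
    db.execute("CREATE TABLE module_content (module_id INTEGER, body TEXT)")
    db.executemany("INSERT INTO modules VALUES (?, ?, ?, ?)", [
        (1, "front", "movie", 0),
        (2, "front", "movie", 0),
        (3, "front", "quote_of_the_day", 0),
        (4, "front", "latest_news", 0),
        (5, "front", "moderator_tools", 2),  # only visible to role 2 and above
    ])
    db.executemany(
        "INSERT INTO module_content VALUES (?, ?)",
        [(i, f"content for module {i}") for i in range(1, 6)],
    )
    return db

def render_page(db, page, user_role):
    """Return (rendered modules, number of database calls made)."""
    calls = 0
    # Call 1: ask the database which modules live on this page.
    rows = db.execute(
        "SELECT id, kind, min_role FROM modules WHERE page = ? ORDER BY id",
        (page,),
    ).fetchall()
    calls += 1
    rendered = []
    for module_id, kind, min_role in rows:
        if user_role < min_role:
            continue  # this visitor is not allowed to see the module
        # One more call per visible module: ask what it should show.
        body = db.execute(
            "SELECT body FROM module_content WHERE module_id = ?",
            (module_id,),
        ).fetchone()[0]
        calls += 1
        rendered.append((kind, body))
    return rendered, calls

db = build_demo_db()
guest_modules, guest_calls = render_page(db, "front", user_role=0)
mod_modules, mod_calls = render_page(db, "front", user_role=2)
```

In this toy setup a guest sees four modules and a moderator sees five, and each visible module costs one extra call on top of the first one. Multiply that by every visitor loading every page and you can see why the number of calls matters.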
So here’s the problem: the database can only accept so many calls at once before it cannot handle any more (think of overloaded phone lines). Once all the lines are tied up there’s just no way to get a call in, and the website hangs.
When the system is working properly it is like making a quick call, then hanging up to free up the phone line. When it isn’t working correctly it is like everyone calling and no one hanging up to free a line for the next call.
What is happening is simply this: one of the modules we use somewhere has a bug where it makes a call to the database and then, for some reason, never closes the connection. More and more of these connections build up until the database has no free connections left. The website then times out for a couple of minutes while the server works out that it needs to reset and free up the connections.
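Here is a small Python sketch of that kind of bug, using a toy pool of three "phone lines". Everything in it (ConnectionPool, leaky_module, well_behaved_module, serve_requests) is a hypothetical stand-in, not the software the site runs; it only illustrates the difference between a module that hangs up and one that doesn't.

```python
import queue
from contextlib import contextmanager

class ConnectionPool:
    """Toy stand-in for a database connection pool with a fixed number of lines."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self, timeout=0.01):
        """Take a free connection, or fail if none frees up within the timeout."""
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("pool exhausted: no free connections") from None

    def release(self, conn):
        """Hang up: put the connection back so someone else can use it."""
        self._free.put(conn)

    def free_count(self):
        return self._free.qsize()

    @contextmanager
    def connection(self):
        """Hand out a connection and always return it, even if the caller fails."""
        conn = self.acquire()
        try:
            yield conn
        finally:
            self.release(conn)

def leaky_module(pool):
    conn = pool.acquire()
    return f"content via {conn}"  # bug: the connection is never released

def well_behaved_module(pool):
    with pool.connection() as conn:
        return f"content via {conn}"

def serve_requests(pool, module, count):
    """Return how many requests were served before the pool ran dry."""
    served = 0
    for _ in range(count):
        try:
            module(pool)
        except TimeoutError:
            break
        served += 1
    return served

good_pool = ConnectionPool(3)
good_served = serve_requests(good_pool, well_behaved_module, 10)

leaky_pool = ConnectionPool(3)
leaky_served = serve_requests(leaky_pool, leaky_module, 10)
```

The well-behaved module serves all ten requests and leaves every line free afterwards. The leaky one serves three, one per line, and then every later request times out because nothing was ever handed back.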
So while we’ve worked out many of the issues the server was having and we’re seeing a lot of stability and speed gains from that work, we still need to track down and kill some bugs in the modules used in the website before we see the ‘pool’ errors go away.
In the meantime, should the site go down, it should only take a couple of minutes to come back up. On the plus side, once we nail down this issue we can re-enable some of the other features of the site.
Thank you for your patience.
P.S. Never feel too shy to post about any issues you have or problems you see with the site. This will only help us know where we need to be focusing our efforts.
P.P.S. We’re using a different temporary chat room module. This one should be a bit more compatible with a wider variety of browsers. If it acts up, you might try leaving that page (going to another page), then returning and seeing if the problem clears up. We’re working on a more stable version of the chatroom software.