Info about Servers
Handling Server Downtimes
Servers are among the most important network elements on today’s internet. Most applications today follow the client-server model, in which client software on users’ desktops requests services from a central server machine. Servers are machines too, and they require regular upgrades and maintenance. When that happens, the servers may have to be shut down. This leads to server downtime, which needs to be handled carefully.
Some server downtime is inevitable if the service is to remain reliable over the long term. Often the number of users grows so large that the server, in its current configuration, can no longer handle the load, and you need to upgrade it. Such upgrades require the server to be taken down and reconfigured. Maintenance work, such as applying a software patch or installing a new tool that requires a restart, can also cause downtime.
Downtime does not always last for hours on end. Some activities need only about an hour to bring the server back up and run basic tests. Others are major changes and can require many hours or even days of downtime. Handling these downtimes well is essential to keeping operations running smoothly.
Here are some of the basic checks you must perform before and after a server downtime:
• Analyze the maintenance activity and estimate the approximate downtime.
• Look for other maintenance work that is going on and try to coordinate yours with it. For example, if connectivity to the server is already going to go down, that is the best time to perform your maintenance too.
• Notify all the affected users in advance about the downtime. The only thing worse than a down server is a frustrated customer. You never want that, so advance notification is important. If there are SLAs, you may even need to get approval from the customers before proceeding with the maintenance or upgrade activity.
• Plan your activity to keep the downtime as short as possible. There are usually preparatory steps that must be completed before the server is taken down. Try to do all of these while the server is still up and running. This reduces the time for which you need to shut it down.
• Once you have completed the activity, turn on the server and run some basic checks to see if everything is working fine. These checks need to be quick but thorough. You cannot afford to miss anything, and you cannot take forever to confirm that the server came up properly, so plan the checks carefully in advance.
• Control the load on the server after restart. Once the server is up and running, you need to ensure that it does not get overloaded with requests. Typically, a server that can normally handle, say, 90% load can be knocked over by as little as 40% or 50% if that entire load is applied at once immediately after startup. You need to avoid that, for example by restoring traffic gradually.
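As a rough illustration of the last tip, the sketch below ramps admitted traffic up linearly after a restart instead of applying the full load at once. The function names, the five-minute ramp, and the 10% starting fraction are assumptions made up for this example, not settings taken from any particular server or load balancer.

```python
import random


def allowed_fraction(elapsed_s: float,
                     ramp_s: float = 300.0,
                     start_fraction: float = 0.1) -> float:
    """Fraction of normal traffic to admit `elapsed_s` seconds after restart.

    Grows linearly from `start_fraction` at restart to 1.0 once `ramp_s`
    seconds have passed. Both defaults are illustrative assumptions.
    """
    if ramp_s <= 0:
        raise ValueError("ramp_s must be positive")
    if not 0.0 <= start_fraction <= 1.0:
        raise ValueError("start_fraction must be between 0 and 1")
    if elapsed_s <= 0:
        return start_fraction
    if elapsed_s >= ramp_s:
        return 1.0
    progress = elapsed_s / ramp_s
    return start_fraction + (1.0 - start_fraction) * progress


def should_admit(elapsed_s: float, roll: float) -> bool:
    """Admit a request if `roll` (a number in [0, 1)) falls under the limit.

    In real use `roll` would come from random.random(); it is a parameter
    here so the decision is easy to test.
    """
    return roll < allowed_fraction(elapsed_s)


if __name__ == "__main__":
    # Show how the admitted share of traffic grows during the ramp.
    for elapsed in (0, 60, 150, 300):
        admitted = sum(should_admit(elapsed, random.random())
                       for _ in range(1000))
        print(f"{elapsed:>3}s after restart: admitted {admitted} of 1000 requests")
```

Rejected requests would normally be queued or sent a "try again later" response. A linear ramp is only one option; a stepped or exponential ramp may suit some services better.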
These are some simple tips that can help you carry out successful server maintenance.
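As one possible sketch of the quick-but-thorough post-restart checks described in the list above, the code below runs a set of named checks within a fixed time budget and reports which passed, failed, or were skipped. The check names, the 60-second budget, and the helper functions are hypothetical; real checks would probe your own services.

```python
import time
from typing import Callable, Dict


def run_checks(checks: Dict[str, Callable[[], bool]],
               budget_s: float,
               clock: Callable[[], float] = time.monotonic) -> Dict[str, str]:
    """Run each check in order and record 'ok', 'failed', or 'skipped'.

    A check is skipped if the time budget is already used up when its turn
    comes. A check that is already running is not interrupted, so each
    individual check should enforce its own short timeout.
    """
    deadline = clock() + budget_s
    results: Dict[str, str] = {}
    for name, check in checks.items():
        if clock() >= deadline:
            results[name] = "skipped"
            continue
        try:
            results[name] = "ok" if check() else "failed"
        except Exception as exc:  # a crashing check counts as a failure
            results[name] = f"failed: {exc}"
    return results


def server_is_healthy(results: Dict[str, str]) -> bool:
    """The server counts as healthy only if every check ran and passed."""
    return all(status == "ok" for status in results.values())


if __name__ == "__main__":
    # Placeholder checks; replace the lambdas with real probes.
    example_checks = {
        "process running": lambda: True,
        "disk space available": lambda: True,
        "test request answered": lambda: True,
    }
    report = run_checks(example_checks, budget_s=60.0)
    for name, status in report.items():
        print(f"{name}: {status}")
    print("healthy" if server_is_healthy(report) else "needs attention")
```

Treating a skipped check as unhealthy is a deliberate choice here: it keeps a run that exceeded its time budget from being mistaken for a clean one.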