Suppose you have these symptoms:
- When you first turn on your computer after it has been off for a
while, the system loads OK, and your programs run fine at the beginning.
- After you have been using the machine for a while (anywhere from a few
minutes to a few hours), a malfunction suddenly hangs an application you
are using. The application may give you an error message and try to
continue, or just close. Or it simply hangs and you have to kill it using
the Task Manager, invoked with Ctrl+Alt+Del (in Windows).
- Then you try to go on working, hoping that it was all just an erratic
error, and reopen the application you were using to go ahead with
your work. The application may run fine for a while, but minutes (or
seconds) later, the same or another error arises. Things are not
funny at all anymore.
- Perhaps you try to restart the application, or the whole machine,
but the situation does not improve. It surely gets worse: now errors
arise not only in the original problematic application but also in
other applications, and even in the OS.
- Perhaps you start suspecting a disk error and try to run ScanDisk.
If you are lucky, the scan ends with no errors, or perhaps with a
couple of ordinary minor errors fixed. Otherwise ScanDisk may hang
before ending (or even before starting). Even if you were lucky and
the scan ends OK, you will still find that the applications and even
the OS are unstable. Unstable enough to drive you mad.
- Minutes later, and after many tries, the whole system hangs at any
moment, or even refuses to start at all. Blue Windows error screens
appear, and the only option is to turn off the computer and forget
your work for a while.
- Let's say a couple of hours later you turn the computer on again,
and everything starts running normally, but after some minutes the
errors arise again.
What usually follows is an attempt to identify the cause of this annoying
and mysterious cascade of errors:
- Suspect the hard drive and run ScanDisk (as already mentioned): you
will find that the scan ends with no errors, or with ordinary minor
errors fixed. Or it will hang, and add a new problem to your list.
- Suspect that the application that raised the first error is the sick
one. This is very common, since most users tend to use one specific
application more than the others (say, the web browser, the mail
client, or the word processor), and restart it every time it goes
down. Because the same application systematically raises the same or
another error each time it is restarted, the user misleadingly
convinces himself that the problem is the application. You try to
reinstall the application. If you are lucky, you can finish
reinstalling it. Even so, the problems persist.
- Suspect that the OS installation (Windows, for instance) has been
corrupted in some way by some software you installed, or just by the
randomness of thermodynamics and the 2nd Law. You try a warm
reinstallation of the OS (a reinstallation in the original directory
of the current OS installation, so that all your programs, settings
and hardware configuration are recognized by the new installation).
But bad luck: the problems are still there.
- Finally you can try a complete cold OS reinstallation (a completely
new installation in a directory different from that of the current
installation), which will oblige you to reinstall all your hardware
drivers, software applications, settings, etc. Briefly: the nightmare
you have been trying to avoid from the beginning. And this really hard
work won't solve anything at all!
- It's time, oh yeah, to think about your hardware, beyond the hard
disk: your motherboard, your CPU or your RAM (memory).
And perhaps only now are you close to the true cause of the annoying
problem: the electronic circuits of your hardware. Fortunately, this does
not have to mean permanent damage to this core technology, which can cost
you as much as a new computer. If you act in time, the problem can be
easily solved by pumping a little more air onto it, cooling the silicon
components from the heat generated by the trillions of electrons running
really fast through its copper connections to make your software run.
Electrons running through conductors, and even more so through the
semiconductors that form the microscopic inside of each chip in your
hardware, generate large amounts of heat. The CPU in particular, where the
main processing and calculation operations are performed, generates most
of the heat. Without cooling, just turning the machine on makes the CPU
heat up at a rate of about one degree Celsius per second until it hangs or
burns out.
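To get a feel for that rate, here is a rough back-of-the-envelope sketch in
Python. It assumes the temperature climbs linearly at the one degree per
second mentioned above. The 25 degree starting point and the 90 degree limit
are illustrative assumptions, not figures from any CPU specification.

```python
def seconds_until_limit(start_c, limit_c, rate_c_per_s=1.0):
    """Seconds for a linearly heating CPU to go from start_c to limit_c.

    rate_c_per_s defaults to the one degree per second quoted in the text.
    """
    if rate_c_per_s <= 0:
        raise ValueError("rate must be positive")
    # Already at or above the limit means no time is left.
    return max(0.0, (limit_c - start_c) / rate_c_per_s)


# Hypothetical values: room temperature 25 C, assumed limit 90 C.
print(seconds_until_limit(25.0, 90.0))  # 65.0
```

Under these assumptions an uncooled CPU would reach the limit in about a
minute, which is why even a partly failing fan matters.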
This is why a humble but very important part belongs to any modern
hardware assembly: heat dissipation and ventilation devices, mainly on and
for the CPU.
And this is what was happening in your machine. But I have to accept
that it is hard even for the savvy professional to associate a cascade of
apparently pure software errors, which arise minutes after a machine is
turned on and everything was running well, with CPU overheating, and
consequently with a failure of the ventilation or heat dissipation system
in the hardware.
Let's briefly enumerate the behaviors that can lead us to diagnose
overheating and a heat dissipation malfunction, and so avoid going
through all those painful software-based treatments such as
reinstallations and all the other nasty stuff:
- Software errors don't systematically arise as soon as an
application is started, but begin arising a while later, even under
the same conditions in which everything ran OK minutes before.
- The errors seem to arise only when running a particular application
(usually the most often used one), but after a while, errors with
other applications or the OS itself appear.
- The errors are not always the same: different errors arise, with
different messages or behaviors. Perhaps a time pattern can emerge in
the succession, but it is hard to find a consistent cause for it.
- Scandisk shows no significant error, or can't finish running at all.
- If you restart the machine immediately after the problems appear,
the errors persist.
- If you turn off the machine, wait a while (really, long enough to
let the hardware cool down), and restart it later, everything
starts to work OK, until after a while the errors appear again.
These are the symptoms of one typical scenario. But other scenarios may
arise from overheating, and can add to the symptoms above, or replace
them with some of the following:
- The machine is just on and you are not using it. There may be
applications open or not, but when you return a system error is on
your screen, or the system is simply frozen.
- The terrible blue Windows error screen appears suddenly, whether you
were using the machine or not.
- You restart the machine and you can't even access the BIOS (the
basic system instructions loaded from hardware at the beginning of the
start-up). Instead, nothing happens, or some weird screens you have
never seen before (the monitor test screens that appear when no
functioning hardware or BIOS is detected) appear on your monitor.
The key factors that distinguish an overheating problem from the problems
of ordinary software and machine use are:
- The cascade nature of the errors, appearing not only in one
application, but in others, and in the OS itself.
- An immediate restart doesn't solve anything. The problems stay the
same or get worse.
- If the machine stays on, the problems get even weirder and more
complicated, until a complete hang, with little or no chance of
restarting the machine at all.
- If you wait a while before restarting the computer, it will seem to
run fine until, after some time, the madness starts again.
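The key factors above can be written down as a small checklist. The Python
sketch below is only an illustration: the field names and the rule that all
four factors must be present together are assumptions made here, not a
formal diagnostic procedure.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """What the user noticed; all field names are hypothetical."""
    errors_spread_to_other_apps_or_os: bool  # cascade nature of the errors
    immediate_restart_helps: bool            # does a quick restart fix it?
    gets_worse_while_left_on: bool           # problems grow until a hang
    runs_fine_after_cooling_off: bool        # OK again after waiting a while


def overheating_suspected(obs: Observations) -> bool:
    """True when all four key factors listed above are present."""
    return (
        obs.errors_spread_to_other_apps_or_os
        and not obs.immediate_restart_helps
        and obs.gets_worse_while_left_on
        and obs.runs_fine_after_cooling_off
    )


# The scenario described in this article: every factor is present.
article_case = Observations(True, False, True, True)
print(overheating_suspected(article_case))  # True
```

If an immediate restart does cure the problem, the checklist returns False
and a software cause becomes the more likely suspect.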
To check and confirm that an overheating problem is going on:
- Open the computer case and wait for the CPU and other components to
cool down (to room temperature).
- Start the machine and invoke the BIOS setup screen (generally by
pressing the Del key as soon as the machine starts). Look in the BIOS
menu for an item called "Advanced Features", select it and look for
two main indicators: "CPU Temperature" and "Fan Speed". Note: There
are a lot of different BIOSes out there. This is a feature usually
found in the modern ones. These two indicators could be hidden under
another BIOS menu item, so if you don't find the "Advanced Features"
item mentioned here, try exploring each item in the BIOS menu one by
one. It is worth it: they are great and very useful indicators.
- Look at the fan speed. If the CPU fan is running as it should, this
value will be between 5000 and 6000 RPM (normal fan speeds vary with
the hardware model). Slower speeds, or a null or "Disabled"
indicator, tell you that your fan is not pulling enough hot air (or
any at all) away from the CPU, and that is why your CPU will overheat
in a few minutes and screw everything up.
- Additionally, take a look at the CPU Temperature indicator to get a
real-life sense of what is going on. The CPU temperature should rise
some degrees from room temperature until it stabilizes around 38
degrees Celsius, or 100 degrees Fahrenheit (nice coincidence: a little
above normal human body temperature). Of course these numbers may
vary from CPU to CPU. If the temperature goes beyond this point, there
is overheating even if the fan speed seems to be normal, and an
overheating treatment should be carried out anyway.
- If your BIOS doesn't have these fantastic features, then, since you
have the case open, reach for the CPU, or the metal CPU heat
dissipation device immediately between the CPU and the fan, put the
tip of a thermometer there, and monitor how the temperature evolves,
as you would with a BIOS indicator. The measurements will be equally
useful for confirming the diagnosis.
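The two checks above (fan speed and CPU temperature) can be summarized in a
short Python sketch. The thresholds are simply the figures quoted in this
article (5000 RPM and 38 degrees Celsius), which vary with the hardware, so
treat them as adjustable assumptions. The function names and warning texts
are hypothetical, and the readings are typed in by hand from the BIOS screen
or the thermometer.

```python
NORMAL_FAN_RPM_MIN = 5000   # lower end of the range quoted in the text
STABLE_CPU_TEMP_C = 38.0    # stable temperature quoted in the text


def celsius_to_fahrenheit(temp_c):
    """Convert a temperature from Celsius to Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def check_readings(fan_rpm, cpu_temp_c):
    """Return a list of warnings for one pair of readings.

    fan_rpm may be None when the BIOS shows no value or "Disabled".
    """
    warnings = []
    if fan_rpm is None or fan_rpm <= 0:
        warnings.append("fan not running or not reported")
    elif fan_rpm < NORMAL_FAN_RPM_MIN:
        warnings.append("fan slower than expected")
    if cpu_temp_c > STABLE_CPU_TEMP_C:
        warnings.append("CPU temperature above expected stable value")
    return warnings


print(round(celsius_to_fahrenheit(38.0), 1))  # 100.4
print(check_readings(5500, 37.0))             # []
print(check_readings(3000, 45.0))             # two warnings
```

Note that a high temperature is reported even when the fan speed looks
normal, matching the advice above to treat that case as overheating too.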
Overheating causes, and ways to solve them:
- The main cause is a fan malfunction, since this tiny device is the
main tool for keeping the CPU cool. Most of the time you should focus
on it first. If you have an RPM reading from the BIOS and it shows
the fan is running slower than normal, you have found the problem. If
you don't have access to that indicator in the BIOS, just use your
eyes, ears and common sense to determine whether it is spinning fast
and cleanly enough. You should not be able to see the individual fan
blades, and the sound should be a soft buzz.
- Check for dirt and dust accumulated in the fan and between the fins
of the metal heat dissipater. Remove all the dirt and dust!
- If the fan isn't working properly, try to make it run better by
lubricating it. Remember, this is the only moving part related to
the circuits, and so it is the first candidate to fail. Try carefully
dropping lubricating oil (or just kitchen vegetable oil) as close to
the rotation axis as you can. Sometimes you can gain access to the
head of the axis under a sticker on the back of the fan.
- Obviously, all the procedures mentioned above will require you to
disassemble some or all of the parts involved (CPU, metal heat
dissipater, fan), or those in the way (power supply, other boards or
devices, etc.).
A practical case showing a ventilation system treatment can be seen
here: CPU fan and cooling system
troubleshooting