Erratic cascading software failures can hide CPU overheating
Suppose you have these symptoms:
  • When you turn on your computer after it has been off for a while, the system loads OK and your programs run fine at the beginning.
  • After you have been using the machine for a while (anywhere from a few minutes to a few hours), a malfunction suddenly hangs the application you are using. The application may show an error message and then try to continue or simply close, or it may just hang so that you have to kill it from the Task Manager, invoked with Ctrl+Alt+Del (in Windows).
  • You then try to go on working, hoping it was just a one-off error, and reopen the application to carry on with your work. It may run fine for a while, but minutes (or seconds) later the same error or a different one appears. Things stop being funny at all.
  • Perhaps you restart the application, or the whole machine, but the situation does not improve. It gets worse: now errors appear not only in the original problematic application but also in any other application, and even in the OS.
  • Perhaps you start to suspect a disk error and try to run ScanDisk. If you are lucky, the scan ends with no errors, or with a couple of ordinary minor errors fixed; otherwise ScanDisk may hang before it finishes (or even before it starts). Even if the scan ends OK, you will still find that the applications and the OS are unstable. Unstable enough to drive you mad.
  • Minutes and many attempts later, the whole system hangs at random or even refuses to start completely, blue Windows error screens appear, and your only option is to turn off the computer and forget about your work for a while.
  • Say a couple of hours later you turn on the computer again: everything starts out running normally, but after a few minutes the errors arise again.

What follows is an attempt to identify the cause of these annoying and mysterious cascading errors:

  • You suspect the hard drive and run ScanDisk (as already mentioned): the scan ends with no errors or with ordinary minor errors fixed. Or it hangs, adding a new problem to your list.
  • You suspect that the application that raised the first error is the sick one. This is very common, since most users tend to use one specific application more than the others (say the web browser, the mail client or the word processor) and restart it every time it goes down. Because the same application systematically raises the same or another error each time it is restarted, the user misleadingly convinces himself that the application is the problem. You try to reinstall it. If you are lucky, you manage to finish the reinstallation. Even so, the problems persist.
  • You suspect that the OS installation (Windows, for instance) has been corrupted in some way by some software you installed, or just by the randomness of thermodynamics and the 2nd Law. You try a warm reinstallation of the OS (a reinstallation into the directory of the current OS installation, so that all your programs, settings and hardware configuration are recognized by the new installation). But bad luck: the problems are still there.
  • Finally you try a full cold OS reinstallation (a completely new installation in a directory different from that of the current one, which obliges you to reinstall all your hardware drivers, software applications, settings, etc.). In short: the nightmare you have been trying to avoid from the beginning. And all this hard work won't solve anything at all!
  • It's time, oh yeah, to think about your hardware beyond the hard disk: your motherboard, your CPU or your RAM (memory).

And perhaps only now are you close to the true cause of the annoying problem: the electronic circuits of your hardware. Fortunately, it does not have to mean permanent damage to this core technology, which can cost you as much as a new computer. If you act in time, the problem can be solved easily by pumping a little more air over it, cooling the silicon components from the heat generated by the trillions of electrons running really fast through their copper connections to make your software run.

Electrons running through conductors, and even more so through the semiconductors that form the microscopic interior of every chip in your hardware, generate large amounts of heat. The CPU in particular, where the main processing and calculation take place, generates most of it. Just turning the machine on makes the CPU temperature rise at a rate of about one degree Celsius per second and, with nothing to carry that heat away, it keeps rising until the CPU hangs or burns out.
This is why a humble but very important part belongs to any modern hardware assembly: heat dissipation and ventilation devices, mainly on and for the CPU.
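
To get a feeling for what a heating rate of one degree Celsius per second means, here is a minimal arithmetic sketch in Python. The function name, the 25 degree starting point and the 90 degree limit are illustrative assumptions, not figures from any CPU datasheet; only the rate comes from the text above.

```python
def seconds_until_limit(start_c, limit_c, rate_c_per_s=1.0):
    """Seconds for an uncooled CPU to climb from start_c to limit_c.

    Assumes a constant heating rate, which is a simplification.
    """
    if rate_c_per_s <= 0:
        raise ValueError("rate_c_per_s must be positive")
    # Already at or above the limit: no time left at all.
    return max(0.0, (limit_c - start_c) / rate_c_per_s)


# Hypothetical numbers: room temperature of 25 C, assumed trouble point of 90 C.
print(seconds_until_limit(25.0, 90.0))  # 65.0
```

With these assumed numbers the CPU would be in trouble in about a minute, which is why a stopped fan shows up so quickly.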

And this is what was happening in your machine. But I have to admit that it is hard even for the savvy professional to link a cascade of pure software errors, arising minutes after a machine is turned on and while everything was running well, to CPU overheating, and consequently to a failure of the ventilation or heat dissipation system in the hardware.

Let's briefly enumerate the behaviors that can lead us to diagnose overheating and a heat dissipation malfunction, and so avoid going through all those painful software-based treatments such as reinstallations and all the other nasty stuff:

  1. Software errors don't appear systematically as soon as an application is started; they begin to appear a while later, even under the same conditions in which everything ran OK minutes before.
  2. The errors seem to arise only when running one particular application (usually the most frequently used one), but after a while errors appear in other applications or in the OS itself.
  3. The errors are not always the same: different errors arise, with different messages or behaviors. Perhaps a time pattern can be seen in the succession, but it is hard to find a consistent cause for it.
  4. ScanDisk shows no significant errors, or can't finish running at all.
  5. If you restart the machine immediately after the problems appear, the errors persist.
  6. If you turn off the machine, wait a while (really, long enough to let the hardware cool down) and restart it later, everything works OK at first, until after a while the errors appear again.

These are the symptoms of one scenario. But other scenarios may arise from overheating, and can add to the symptoms above or replace some of them with the following:

  1. The machine is on but you are away from it, with or without applications open, and when you return there is a system error on your screen, or the system is simply frozen.
  2. The terrible blue Windows error screen appears suddenly, whether you were using the machine or not.
  3. You restart the machine and can't even get into the BIOS (the basic system instructions loaded from hardware at the beginning of start-up). Instead, nothing happens, or some weird screens you have never seen before (the monitor test screens that appear when no functioning hardware or BIOS is detected) show up on your monitor.

The key factors that distinguish an overheating problem from the problems of ordinary software and machine use are:

  1. The cascading nature of the errors, which appear not in just one application but in others too, and in the OS itself.
  2. An immediate restart doesn't solve anything. The problems stay the same or get worse.
  3. If the machine stays on, the problems get even weirder and more complicated, ending in a complete hang, with little or no chance of restarting the machine at all.
  4. If you wait a while before restarting the computer, it will seem to run fine until, after some time, the madness starts again.
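
The four key factors above can be treated as a simple yes/no checklist. The following Python sketch is only one way to write that down: the class, the field names and the threshold of three matching signs are assumptions made for illustration, not an established diagnostic rule.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Yes/no answers to the four key factors (field names are illustrative)."""
    errors_spread_to_other_apps_and_os: bool  # factor 1
    immediate_restart_fixes_it: bool          # factor 2 (overheating: False)
    gets_worse_while_left_on: bool            # factor 3
    runs_fine_again_after_cooling_off: bool   # factor 4


def count_overheating_signs(obs: Observations) -> int:
    """Count how many of the four factors point towards overheating."""
    return sum([
        obs.errors_spread_to_other_apps_and_os,
        not obs.immediate_restart_fixes_it,
        obs.gets_worse_while_left_on,
        obs.runs_fine_again_after_cooling_off,
    ])


def worth_checking_cooling(obs: Observations, min_signs: int = 3) -> bool:
    """min_signs is an assumed cut-off, not a value from the article."""
    return count_overheating_signs(obs) >= min_signs


typical = Observations(True, False, True, True)
print(count_overheating_signs(typical), worth_checking_cooling(typical))  # 4 True
```

A plain software bug would usually look different: the errors stay within one application and do not depend on how long the machine has been running.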

To check and confirm that an overheating problem is going on:

  1. Open the computer case and wait for the CPU and other components to cool down (to room temperature).
  2. Start the machine and bring up the BIOS setup screen (generally by pressing the Del key as soon as the machine starts). Look in the BIOS menu for an item called "Advanced Features", select it and look for two main indicators: "CPU Temperature" and "Fan Speed". Note: there are a lot of different BIOSes out there. This is a feature usually found in modern ones. These two indicators may be hidden under another BIOS menu item, so if you don't find the "Advanced Features" item mentioned here, explore each item in the BIOS menu one by one. It is worth it: they are great and very useful indicators.
  3. Look at the fan speed. If the CPU fan is running as it should, this value should be between 5000 and 6000 RPM (normal fan speeds vary with the hardware model). Slower speeds, or a null or "Disabled" reading, tell you that your fan is not pulling enough hot air (or any at all) away from the CPU, and that is why your CPU overheats within minutes and screws everything up.
  4. Additionally, take a look at the CPU temperature indicator to get a real-life sense of what is going on. The CPU temperature should rise some degrees from room temperature until it stabilizes at around 38 degrees Celsius, or 100 degrees Fahrenheit (a nice coincidence: a little above normal human body temperature). Of course these numbers may vary from CPU to CPU. If the temperature goes beyond this point, there is overheating even if the fan speed seems normal, and an overheating treatment should be carried out anyway.
  5. If your BIOS lacks these fantastic features, then, since you have the case open, reach for the CPU, or the metal heat sink sitting between the CPU and the fan, place the tip of a thermometer there and monitor how the temperature evolves, just as you would with a BIOS indicator. The measurements will be equally useful for confirming the diagnosis.
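
Once you have written down the two readings, the check described in the steps above comes down to a couple of comparisons. The Python sketch below uses the article's reference values (a fan at 5000 RPM or more, a stable temperature around 38 degrees Celsius) as defaults. The function names and the 5 degree tolerance are assumptions for illustration, and the right thresholds depend on your own hardware.

```python
def celsius_to_fahrenheit(temp_c):
    """Standard conversion, handy if your BIOS or thermometer shows Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def assess_readings(fan_rpm, cpu_temp_c,
                    min_rpm=5000, stable_temp_c=38.0, tolerance_c=5.0):
    """Return a list of findings; an empty list means nothing looks wrong.

    fan_rpm may be None when the BIOS shows no value or "Disabled".
    tolerance_c is an assumed margin above the expected stable temperature.
    """
    findings = []
    if fan_rpm is None or fan_rpm <= 0:
        findings.append("fan not spinning or not reported")
    elif fan_rpm < min_rpm:
        findings.append("fan slower than expected")
    if cpu_temp_c > stable_temp_c + tolerance_c:
        findings.append("temperature above expected stable value")
    return findings


print(assess_readings(5500, 38.0))  # []
print(assess_readings(None, 60.0))
print(assess_readings(3000, 40.0))  # ['fan slower than expected']
```

The second call reports both a fan problem and a temperature problem, which is the combination the article describes. `celsius_to_fahrenheit(38.0)` gives roughly 100.4, matching the 100 degrees Fahrenheit quoted above.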

Causes of overheating, and ways to solve it:

  1. The main cause is a fan malfunction, since this tiny device is the main tool for keeping the CPU cool. Most of the time you should focus on it first. If you have an RPM reading from the BIOS and it shows the fan running slower than normal, you have found the problem. If you don't have that indicator in the BIOS, just use your eyes, ears and common sense to determine whether it is spinning fast and cleanly enough. You should not be able to make out the individual fan blades, and the sound should be a soft buzz.
  2. Check for dirt and dust accumulated in the fan and between the fins of the metal heat sink. Remove all the dirt and dust!
  3. If the fan isn't working properly, try to make it run better by lubricating it. Remember, this is the only moving part related to the circuits, and so it is the first candidate to fail. Carefully apply a drop of lubricating oil as close to the rotation axis as you can (kitchen vegetable oil will do in a pinch, though it tends to turn gummy over time). Sometimes you can reach the head of the axis under a sticker on the back of the fan.
  4. Obviously, all the procedures mentioned above will require you to disassemble some or all of the parts involved (CPU, metal heat sink, fan), or those in the way (power supply, other boards or devices, etc.).

A practical case of treating a ventilation system can be seen here: CPU fan and cooling system troubleshooting