At times a driver or two misbehaves and causes bluescreens. Because the server reboots automatically after dumping memory to the memory.dmp file, your users might never report that there was a problem. And depending on your monitoring tool, you might not get an alert there either. Operations Manager can easily alert you about things like that, but far from all customers use OpsMgr because of its complexity. Luckily, it's just a one-minute job to get an alert in OMS when a server has bluescreened! And since OMS can be run in Free mode, you may be able to monitor your servers for free (depending on the amount of data you collect). If not, it's really cheap, so it's no big deal if you need a standard subscription. Anyway, let's get to the technical stuff!
First of all, configure OMS to collect the System event log, including all Error-level events.
Then create an Alert like this,
The search query to use for the alert is:
EventID=1001 EventLevelName=error Type=Event
That will only alert on crashes. You can also create an alert for Event ID 6008, which is logged after an unexpected shutdown. The difference is that my alert only fires if there was a BSOD, while an unexpected-shutdown alert would also fire if someone pulled the power. You could even combine both into one alert with an OR statement. In my case, I just want to get alerted about BSODs, so that's the only thing I look for right now.
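To make the difference between the two alerts concrete, here is a minimal Python sketch of the filtering logic. It does not talk to OMS at all. The sample records, computer names and helper functions are made up for illustration, and only the field names (Type, EventID, EventLevelName) are taken from the search query above.

```python
# Hypothetical sample records; only the field names come from the search query.
SAMPLE_EVENTS = [
    {"Computer": "srv01", "Type": "Event", "EventID": 1001, "EventLevelName": "error"},
    {"Computer": "srv02", "Type": "Event", "EventID": 6008, "EventLevelName": "error"},
    {"Computer": "srv03", "Type": "Event", "EventID": 7036, "EventLevelName": "information"},
]

BSOD_EVENT_ID = 1001                 # bugcheck (bluescreen)
UNEXPECTED_SHUTDOWN_EVENT_ID = 6008  # previous shutdown was unexpected


def matches(event, event_ids):
    """Return True if the record is an error-level event with one of the given IDs."""
    return (
        event["Type"] == "Event"
        and event["EventLevelName"].lower() == "error"
        and event["EventID"] in event_ids
    )


def bsod_only(events):
    """Mimics the BSOD-only alert: Event ID 1001 at error level."""
    return [e for e in events if matches(e, {BSOD_EVENT_ID})]


def bsod_or_unexpected_shutdown(events):
    """Mimics the combined alert: Event ID 1001 OR 6008 at error level."""
    return [e for e in events if matches(e, {BSOD_EVENT_ID, UNEXPECTED_SHUTDOWN_EVENT_ID})]


if __name__ == "__main__":
    print([e["Computer"] for e in bsod_only(SAMPLE_EVENTS)])
    print([e["Computer"] for e in bsod_or_unexpected_shutdown(SAMPLE_EVENTS)])
```

With these sample records, the BSOD-only filter matches just srv01, while the combined filter also picks up srv02, the server that lost power without a bugcheck.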
Specify how often it should check. There is usually no need to check more than once or twice an hour. Finally, define whether it should send an email alert or use one of the other alert methods.
Easy as that! Next time you get a bluescreen on a server, you will get an alert by mail so you can debug the dump and find out what’s causing it.
It will look like this,