Topic: Continuous memory tests

Offline ITDUDE

  • Newbie
  • *
  • Posts: 9
Hi guys,

I have a PC I've been struggling to diagnose. I have run various tests with PC-Doctor in Windows and DOS but haven't been able to produce a consistent error. For a moment everything seemed to point to COM1, after the system restarted by itself once I plugged in the COM1 tester. However, since then I haven't been able to replicate the fault.

I have been running tests in DOS PC-Doc and found nothing until I ran the continuous memory tests and discovered a failed test at PASS 1 of 100. What does this mean? As I'm writing this, it has gone through two more passes and not failed.

As I'm kind of new to PC-Doc, I feel a little lost as to which tests I should be running when I cannot pinpoint a fault.

Any advice would be much appreciated.

Regards, ITDUDE.

 
A pessimist sees the difficulties in every opportunity... an optimist sees the opportunities in every difficulty.

Offline fwilson

  • Hero Member
  • *****
  • Posts: 779
ITDUDE,

Intermittent errors are the hardest to diagnose.  The behavior you describe is indicative of a transient memory fault.  Concentrate on the memory tests with multiple passes and see if it comes up again.  I have seen this behavior on memory modules with tin contacts, especially when paired with gold motherboard connectors.  Over time, oxidation builds up on the contacts and causes otherwise good memory to fail.

-Fred
“Integrity is doing the right thing, even if nobody is watching.”  ~ J.C. Watts

Offline ITDUDE

Interesting that you say over time, as this system is not very old (approx. 3 months).

I left the testing to run overnight and it has completed 23 passes with only the initial error showing and no new ones. With this test, am I right to assume that the test simply repeats 100 times?

Offline fwilson

ITDUDE,

You are correct; it will do 100 passes.  With the pass count set to zero, it would continue indefinitely.
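
The pass-count behaviour described here can be sketched in a few lines of Python. This only illustrates the semantics (a fixed count stops after that many passes, zero keeps going until interrupted) and is not PC-Doctor's actual code; `run_passes` and `flaky_memory_test` are made-up names for the example.

```python
import itertools

def run_passes(test, pass_count):
    """Yield (pass_number, passed) for each pass of `test`.

    pass_count == 0 means "no limit": the generator never ends by itself
    and the caller decides when to stop.
    """
    numbers = itertools.count(1) if pass_count == 0 else range(1, pass_count + 1)
    for n in numbers:
        yield n, test(n)

def flaky_memory_test(n):
    # Hypothetical stand-in for a memory test: fails on pass 1 only,
    # mimicking the single failure reported in this thread.
    return n != 1

# Pass count of 100: runs exactly 100 passes and records which ones failed.
failed = [n for n, ok in run_passes(flaky_memory_test, 100) if not ok]
print(failed)  # [1]

# Pass count of 0: unbounded, so cut it off after 250 passes for the demo.
first_250 = list(itertools.islice(run_passes(flaky_memory_test, 0), 250))
print(len(first_250))  # 250
```

A single failure on pass 1 followed by clean passes is what an intermittent fault looks like in this kind of loop, which is why more passes give it more chances to show up again.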

I say over time as it takes time for corrosion to form.  This of course can be slower or faster depending on environmental conditions.

-Fred

Offline ITDUDE

Hi Fred,

Hoping you can provide some more advice as this PC's issues are beating me!

After I completed the continuous memory test in DOS with 100 passes, only the initial pass showed the error. So I went ahead and ran every other memory-related test, both in DOS and Win. No errors were found. I ran further tests in Win, including tests with the serial and parallel testers installed. Still no errors found.

However, as I removed the serial tester, the PC rebooted. Now I'm sure I did this before and the system rebooted then too, but running the serial test in both DOS and Win turned up no errors. So I tested again to see whether the system would reboot after attaching and removing the serial tester, which proved... nothing! Must have been a coincidence initially.

To add to this, when the system rebooted, it froze at the POST screen, not detecting the HDDs. Soft resetting the PC causes a black screen with no POST. Cold restart (disconnecting the power for 5 sec) eventually gets the system booting again.

I've checked the error codes that show up in the Event Viewer and I get one that shows:
Error code 1000000a, parameter1 00000017, parameter2 0000001c, parameter3 00000000, parameter4 80502e9a.

This same error shows up a few times.
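
For reference, the four parameters in an entry like this can be read with a short script. This is a minimal Python sketch that assumes the low bits of the error code identify Windows bug check 0xA (IRQL_NOT_LESS_OR_EQUAL); under that assumption the parameters are, in order, the memory address referenced, the IRQL at the time, the access type (bit 0: 0 = read, 1 = write) and the address of the instruction that made the reference. The function name `decode_bugcheck_0a` is made up for the example.

```python
def decode_bugcheck_0a(params):
    """Interpret the four hex parameters of bug check 0xA.

    Assumes the entry really is IRQL_NOT_LESS_OR_EQUAL; the field meanings
    below only apply to that bug check.
    """
    p1, p2, p3, p4 = (int(p, 16) for p in params)
    return {
        "referenced_address": hex(p1),           # memory the code tried to touch
        "irql": p2,                              # interrupt request level at the time
        "access": "write" if p3 & 1 else "read", # bit 0 of parameter 3
        "instruction_address": hex(p4),          # code that made the reference
    }

info = decode_bugcheck_0a(["00000017", "0000001c", "00000000", "80502e9a"])
for name, value in info.items():
    print(f"{name}: {value}")
# referenced_address: 0x17
# irql: 28
# access: read
# instruction_address: 0x80502e9a
```

Read this way, the entry describes a read from a very low address (0x17) at IRQL 28. That pattern can come from a faulty driver as well as from bad hardware, so on its own it does not pin down the fault.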

Now, this system is a relatively new PC with minimal software installed. It has had basically zero use and has never been connected to the Internet. It was built to on-sell, but as soon as it started being used, the faults started.

Any ideas what to try or where to look? I'm being driven insane :D and need to vent!!!

PS. Sorry for the long post.

Offline fwilson

ITDUDE,

I would try running continuous testing in DOS (all tests) with the pass count set to zero.  You can do the same in Windows.  The point here is to see whether it may be a thermal issue.

I have seen systems with insufficient thermal compound on the Northbridge chip exhibit these types of issues.

The things I would look for, in the order I would look for them, are:
Flakey power supply (even new ones can be bad)
Bad RAM
Thermal issues

Try a new power supply.  You don’t even have to install it; just make sure you connect a ground wire between the chassis and the PSU.

Put different RAM in it.

Try getting it warm.  Place it in a box with a thermometer inside (a poor man’s environmental chamber) and let the ambient temperature get to around 110-120 degrees F.

See if failures increase or decrease.  What type of system is this, by the way (MB, CPU and RAM)?
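
For anyone working in Celsius, the 110-120 degree target above (taken here to be Fahrenheit, which is an assumption) converts as in the short Python sketch below. The function name is made up for the example.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9

for deg_f in (110, 120):
    print(f"{deg_f} F = {fahrenheit_to_celsius(deg_f):.1f} C")
# 110 F = 43.3 C
# 120 F = 48.9 C
```

So the box would need to sit at roughly 43 to 49 degrees C ambient.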

-Fred

Offline ITDUDE

Just thought I'd close this issue with a result.

In my previous post, I mentioned playing with the serial port and the system rebooting. I was able to replicate this. I found that every time I moved the unit by rocking it, or knocked the back of it, it would reboot.

I bit the bullet, ordered the exact same motherboard, installed it and presto. No more tears. I can do everything to it short of a piledriver from the top rope and no reboots.

Flakey MB, I guess.

Thanks for your help, Fred. Any idea why PC-Doc wasn't able to diagnose it?

Cheers,
ITD.

Offline fwilson

ITDUDE,

Glad you found the problem.  Intermittent problems are always difficult to diagnose.  PC-Doctor could not diagnose the problem because there was none, right up until the moment there was, and then the system rebooted.  PC-Doctor, being software, reboots right along with it.

From your description it sounds like a connector or motherboard was flexing, causing a short.  This was probably a defect from the factory but could have been caused or exacerbated by mountings on the case that allowed flex.  I have seen this on some inexpensive cases in the past.

-Fred