Hi All: I have a big Opteron server with 256 GB of Kingston ECC DDR3 DRAM. I was wondering if any hardware wizard out there can decode the kernel message below for me?
I *think* this means it caught and fixed an error, but I'm not sure. (The only reason I think that is that it is supposedly "error-correcting" memory...) :| It throws errors like this about once every few weeks when the machine is running near full load: no swapping, about 90% of the CPUs in use at 100% each, and about 40% memory utilization.
THANK YOU!!:)
EDIT: Trusty internet... http://superuser.com/questions/50226...s-from-syslogd - I guess my main question is, shouldn't it actually *say* that the error was corrected?
Code:
Message from syslogd@OS121-TY3 at Jun 10 06:40:08 ...
kernel:[1961523.572779] [Hardware Error]: CPU:36 MC4_STATUS[-|CE|MiscV|-|AddrV|CECC]: 0x9c08400008080a13
Message from syslogd@OS121-TY3 at Jun 10 06:40:08 ...
kernel:[1961523.572791] [Hardware Error]: MC4_ADDR: 0x0000003186466d90
Message from syslogd@OS121-TY3 at Jun 10 06:40:08 ...
kernel:[1961523.572795] [Hardware Error]: Northbridge Error (node 6): DRAM ECC error detected on the NB.
Message from syslogd@OS121-TY3 at Jun 10 06:40:08 ...
kernel:[1961523.572815] [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: RES (no timeout)
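As far as I can tell, the bracketed flags on the MC4_STATUS line are just individual bits of the hex status value, so they can be checked by hand. Here is a minimal Python sketch of that. It assumes the bit positions used by the Linux kernel's MCE code (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57, CECC=46, UECC=45), which I have not verified against AMD's documentation for this CPU family. `decode_mc_status` is just a name I made up, not an existing tool.

```python
# Assumed bit positions, taken from the Linux kernel's MCE definitions.
# Verify against the AMD BKDG for your CPU family before relying on them.
STATUS_BITS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was NOT corrected (kernel prints "UE", else "CE")
    "EN": 60,     # error reporting enabled
    "MISCV": 59,  # MISC register holds valid data
    "ADDRV": 58,  # ADDR register holds a valid address
    "PCC": 57,    # processor context corrupt
    "CECC": 46,   # correctable ECC error
    "UECC": 45,   # uncorrectable ECC error
}


def decode_mc_status(status: int) -> dict:
    """Return a {flag name: bool} map for a 64-bit MCi_STATUS value."""
    return {name: bool((status >> bit) & 1) for name, bit in STATUS_BITS.items()}


# Status value copied from the MC4_STATUS line in the log above.
flags = decode_mc_status(0x9C08400008080A13)
corrected = flags["CECC"] and not flags["UC"] and not flags["UECC"]

for name, is_set in flags.items():
    print(f"{name:6s} {'set' if is_set else '-'}")
print("corrected ECC error:", corrected)
```

For the value in my log, UC is clear and CECC is set, which matches the `CE` and `CECC` flags the kernel printed. If those bit positions are right, the message does say the error was corrected, just not in plain English.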