We have a bunch of HP ProLiant DL360p servers running SLES 11 SP3 (and OES 11 SP2). These machines have an internal RAID controller (B120i), which uses an HP driver named “hpvsa”.
We also use a few Lenovo ix4-300d storage units attached to these servers as iSCSI targets, so we can run a Novell Cluster at our branch offices.
We’ve seen that whenever, for any reason, the servers lose contact with the iSCSI host (the storage), the system throws a kernel panic related to the hpvsa driver.
Has anyone seen this before? This situation is causing a lot of trouble, especially in one of the branches (which, for some reason, has its switches reset nightly, so I have to reset both servers almost every morning).
Yes, I saw… but I don’t think that is the case, as this feature seems to be AMD-related, and our servers are Intel-based.
Also, we have 8 servers installed, and the problem only occurs on 4 of them.
[QUOTE=jqueiroz;24292]Yes, I saw… but I don’t think that is the case, as this feature seems to be AMD-related, and our servers are Intel-based.
Also, we have 8 servers installed, and the problem only occurs on 4 of them.[/QUOTE]
As I cannot tell the exact source of the difficulties, I’m mostly guessing here - but IOMMU, as a general term, also applies to Intel hardware (there it’s called VT-d, see http://en.wikipedia.org/wiki/IOMMU). So if you’re in a position to test, I’d recommend booting with “iommu=soft” to see if it makes any difference.
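To confirm after the reboot that the option really reached the kernel, the running command line can be checked. This is only a minimal Python sketch: the helper names `has_kernel_param` and `read_cmdline` are made up for illustration, and the only assumption is that `/proc/cmdline` is readable on the server.

```python
def has_kernel_param(cmdline, name, value=None):
    """Return True if the kernel command line contains name (or name=value)."""
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key != name:
            continue
        # With no value requested, the bare name is enough;
        # otherwise the token must be exactly name=value.
        if value is None or (sep and val == value):
            return True
    return False


def read_cmdline(path="/proc/cmdline"):
    # /proc/cmdline holds the parameters the running kernel was booted with.
    with open(path) as handle:
        return handle.read().strip()
```

On the server itself, `has_kernel_param(read_cmdline(), "iommu", "soft")` should then come back as `True` if the test boot picked up the parameter.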
A second approach would be to search for differences between the four affected and the four non-affected servers. But as many different aspects might influence the problem, that can easily become rather tedious.
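To make that comparison a bit less tedious, one option is to collect the same set of facts from every server (driver version, kernel, firmware and so on) and let a small script point out the ones that cleanly separate the two groups. A rough Python sketch, where the function name `separating_facts`, the host names and the fact values are all placeholders and not taken from the real machines:

```python
def separating_facts(affected, healthy):
    """Return facts whose values never overlap between the two groups.

    Both arguments map host name -> {fact name: value}.
    """
    keys = set()
    for facts in list(affected.values()) + list(healthy.values()):
        keys.update(facts)

    result = {}
    for key in sorted(keys):
        bad = {facts.get(key) for facts in affected.values()}
        good = {facts.get(key) for facts in healthy.values()}
        # A fact is a candidate only if no value shows up in both groups.
        if bad.isdisjoint(good):
            result[key] = (bad, good)
    return result


# Placeholder data, purely for illustration.
affected = {
    "srv1": {"hpvsa": "version-A", "kernel": "kernel-X"},
    "srv2": {"hpvsa": "version-A", "kernel": "kernel-X"},
}
healthy = {
    "srv3": {"hpvsa": "version-B", "kernel": "kernel-X"},
}
candidates = separating_facts(affected, healthy)
```

With this made-up data only `hpvsa` ends up in `candidates`, because the kernel value is shared by both groups. A fact that separates the groups is of course only a hint where to look next, not proof of the cause.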