And then, all of a sudden, the Dell R710 decided it needed a reboot.
I still have not figured out why it happened. When I checked the rack, the R710's LCD display informed me that it was rebooting. The noise of the fans spinning up at boot (which my wife calls the vacuum cleaner) had already alerted me to that. I also saw that the HP PoE switch had its fault light on (steady bright red, which means hardware failure according to the manual). The strange thing is that the HP switch is not connected to the Dell, and its fault LED turned off again by itself. Brownout? Something else?
Anyway, I logged onto the SmartOS IRC channel to see if someone could help me figure out what was going on. It turns out that the older platform image I am running (June 2015) does not reserve enough space for a kernel crash dump, should one occur. This is easy to fix. First, check the size of the current dump volume:
zfs list zones/dump
In my case, it was only 4 GB. Now, destroy the old volume and recreate it:
dumpadm -d none
zfs destroy zones/dump
zfs create -V 72G -o checksum=noparity zones/dump
dumpadm -d /dev/zvol/dsk/zones/dump
This first tells dumpadm to use no dump device (effectively disabling crash dumps), then destroys the old dump volume, recreates it with enough space (72G matches the amount of RAM in my Dell R710) and finally instructs dumpadm to use the new volume.
Now, the next time this issue occurs, I will at least have enough space for a crash dump.
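Instead of hard-coding 72G, the size could also be derived from the installed memory. The following is only a minimal sketch: it assumes that prtconf prints a line of the form "Memory size: 73728 Megabytes", and the function name mem_gib_from_prtconf is made up for this example. The commands that actually touch the dump device are left as comments.

```shell
#!/bin/sh
# Sketch: work out how many GiB the dump volume needs from prtconf output.
# Assumption: prtconf prints a line such as "Memory size: 73728 Megabytes".

# Hypothetical helper: reads prtconf output on stdin and prints the
# memory size in whole GiB, rounded up.
mem_gib_from_prtconf() {
    awk '/^Memory size:/ { printf "%d\n", int(($3 + 1023) / 1024); exit }'
}

# Possible usage in the global zone (not executed here):
#   size=$(prtconf | mem_gib_from_prtconf)
#   dumpadm -d none
#   zfs destroy zones/dump
#   zfs create -V "${size}G" -o checksum=noparity zones/dump
#   dumpadm -d /dev/zvol/dsk/zones/dump
```

On a machine reporting 73728 megabytes this would yield the same 72G used above. Rounding up is a deliberate choice here, so the volume is never smaller than the reported memory.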
[UPDATE 11 MAY] While I was away for a few days, the machine decided to reboot itself again. This time it was a bit harder to get it back up, because Redis (which I use in my Sensu setup) was acting up: it quickly filled the 2 TB I have reserved for the zones, after which pretty much nothing worked anymore. I have seen quite a few strange error messages by now, all due to SmartOS running out of disk space on the headzone, but that is to be expected, I guess.
It does leave me with quite a mess to sort out. The first thing I will do is wait for the next reboot to occur and share the dump file with the SmartOS people. After that, I guess I will have to redesign the layout of the headzone (particularly disk space). And start the upgrade to Triton, which is long overdue...