Future repository of HPC/cluster operations documentation

TODO:
HW monitoring and failover (smartd, lm_sensors)
Disk hotswap
networking notes, adapter bonding
MPI howto
bioinformatics multithreading and parallel considerations
bioinformatics apps
analyze whether we need Rocks or another cluster distro
webmin howto
SNMP and syslog reporting facilities
production portage best practices
security and IDS howto: partitioning, "nosuid" mount flag, tripwire/aide/snort, iptables, quotas, PaX/PIE/SSP/selinux/grsecurity, iproute2 (http://lartc.org/howto/), rkhunter (or chkrootkit)
IDS and forensic analysis node running off a cdrom
motd: include a link to the config wiki, cluster info, and current users/load/status of critical services
build a system that emails root only critical status info (tripwire/aide, GLSA, backup check, hardware check)
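
A rough sketch of the critical-only mail item above, in Python. Nothing here is decided yet: the (check_name, severity, detail) tuple format, the "critical" label, the function names, the default hostname, and the assumption of a local MTA on localhost port 25 are all placeholders for illustration. The idea is only that the report is built from whatever the individual checks return, and that no mail goes out when nothing is critical.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical severity label; these notes do not define a severity scheme.
CRITICAL = "critical"


def build_report(results, hostname="cluster-head"):
    """Return an EmailMessage listing critical results, or None if there are none.

    `results` is a list of (check_name, severity, detail) tuples, an
    assumed format for whatever the tripwire/aide, GLSA, backup and
    hardware checks end up emitting.
    """
    critical = [r for r in results if r[1] == CRITICAL]
    if not critical:
        return None  # nothing critical: send no mail at all

    msg = EmailMessage()
    msg["From"] = f"root@{hostname}"
    msg["To"] = "root@localhost"
    msg["Subject"] = f"[{hostname}] {len(critical)} critical check(s)"
    # One line per critical finding; non-critical results are dropped.
    msg.set_content("\n".join(f"{name}: {detail}" for name, _, detail in critical))
    return msg


def send_report(msg, smtp_host="localhost"):
    """Hand the message to an MTA (assumes one is listening on smtp_host:25)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

A cron job could collect the check results, call build_report(), and call send_report() only when the return value is not None.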

TOPICS:
Diskless/network boot - move and expand network boot howto; see http://forums.gentoo.org/viewtopic-t-434302.html
Filesystems overview and howtos: XFS, LVM, NFS; one of: OCFS2, GFS, Lustre; boot labels
Scheduling and queueing
SSH keys and remote shell
Replication and backup (see rsnapshot, rdiff-backup, backuppc, shadow copy using lvm snapshots)
Monitoring (see cacti, ganglia, MRTG, lm_sensors)
Failover and HA (see ucarp)
Cluster utilities: pvm, ...?
Wikis, web services, and citation management: MediaWiki, trac, http://connotea.org, plone, horde/imp
Misc: unionfs

Core CFLAGS: http://archives.gentoo.org/gentoo-amd64/msg_14402.xml

LINKS:
http://clustermonkey.net/
http://www-03.ibm.com/systems/clusters/library/
http://sources.redhat.com/cluster/
http://www.buyya.com/cluster/
http://onesis.org/
http://support.cis.ksu.edu/BeocatAdminDocs/HowBeocatWorks
http://wiki.neuralbs.com/index.php/Gentoo_Diskless_Client

http://www.tldp.org/HOWTO/LVM-HOWTO/
http://www.gentoo.org/doc/en/lvm2.xml
http://www.gentoo.org/proj/en/cluster/