TODO

We've still got plenty of work to do before we can move the new system(s) into production.

TODO on Bud on 2007-06-13

TLS/SASL additions to Postfix (DONE by CMB, except for SSL certificates)
	Have not yet implemented since rebuilding Bud.
	Documentation on wiki page is not clear; needs work.
		Try to be consistent with existing docs, and use postconf -e.
		We have previous history and 'script' files to work with too.
	Copy over from Dark any static IPs that are allowed to send email through us.
MailMan (2-3 hr)
	Install, configure, and test.
	Ensure archives can be viewed via the web.
		Should have a password, like existing site.
		We can leave the previous archives on Dark temporarily.
			Remove rewrite rules from .htaccess file when we move those to Bud too.
	Ensure we have a plan to maintain the lists.
	Copy list memberships from Dark/Michelob.
Backups (2-3 hr)
	Ensure everything we need to back up gets backed up.
	Create SSH restrictions to copy backups from one server to the other.
		Give 'backup' user write access to /var/backups/bud/ and budlight/.
		In ~backup/.ssh/authorized_keys:
			from="bud,bud.sluug.org,static-206.196.99.162.primary.net",command="cat > /var/backups/bud/bud-`date +'%Y%m%d'`.tar.gz",no-pty,no-port-forwarding,no-x11-forwarding ssh-dss ......
	Rotate backups, so we don't keep too many around.
		Would be nice to keep:
			1 backup per day for 1 week (or 2 weeks)
			1 backup per week for a year (or 1-6 months)
			1 backup per month forever
	Need to be able to back up MySQL data properly.
		I don't think copying the directories works correctly.
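The rotation policy above could be sketched roughly as follows. This assumes GNU date (as on Debian) and the bud-YYYYMMDD.tar.gz naming from the authorized_keys entry. It also makes two choices these notes do not state: Sunday's backup counts as the weekly one, and the backup from the 1st counts as the monthly one. keep_backup and list_expired are hypothetical names.

```shell
# keep_backup STAMP TODAY (both YYYYMMDD): succeed if that backup should be kept.
keep_backup() {
    stamp=$1
    today=$2
    # Age in whole days; -u avoids daylight-saving jumps. Requires GNU date.
    age=$(( ( $(date -u -d "$today" +%s) - $(date -u -d "$stamp" +%s) ) / 86400 ))
    dom=$(date -u -d "$stamp" +%d)   # day of month, 01-31
    dow=$(date -u -d "$stamp" +%u)   # day of week, 1 (Mon) to 7 (Sun)
    [ "$dom" = "01" ] && return 0                        # monthly: keep forever
    [ "$dow" = "7" ] && [ "$age" -le 365 ] && return 0   # weekly: keep for a year
    [ "$age" -le 7 ] && return 0                         # daily: keep for a week
    return 1
}

# list_expired DIR TODAY: print backups that fall outside the policy (deletes nothing).
list_expired() {
    for f in "$1"/*-????????.tar.gz; do
        [ -e "$f" ] || continue
        stamp=${f##*-}       # strip through the last "-": 20070613.tar.gz
        stamp=${stamp%%.*}   # strip the extension:       20070613
        keep_backup "$stamp" "$2" || echo "$f"
    done
}
```

Something like list_expired /var/backups/bud "$(date +%Y%m%d)" only prints candidates; the actual removal is left as a separate step so the output can be checked first.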
Move MX record to Bud (READY to flip switch [MX record in DNS] for Plan A - CMB)
	Plan A: have email bounce from Bud to Dark
		We could do this on a per-user basis.
	Plan B: do not move MX record, until users change their settings on Dark/Michelob.
		Problem is that we'd tell them how via email, which they might not get.
			Some of our users read email very infrequently.
System Monitoring and Admin (2-4 hr)
	Monitor Apache logs for missing pages.
	Scripts for user creation.
		Primarily for Budlight.
		Probably want to maintain UIDs.
			But ensure that regular user UIDs are >= 1000, per Debian requirements.
		Manually copy user directories from Michelob/Dark?
	Metrics on emails (spam).
	Metrics on web.
	What else do we need?
SSL Certificates (1-2 hr) (DONE - LVL created openssl-gencrt)
	Need to be consistent in parameters we pass in to create various certificates.
		Would be nice to not have to type the parameters in every time.
	Move all SSL cert creation to one place in the docs.
		http://wiki.sluug.org/build/security
		Currently some stuff in build/misc, build/apache, and build/postfix.

Things that can wait until later

Rebuilding Budlight
	Build based on Bud build documentation on Wiki.
	Requires an on-site visit to install the OS.
		Be prepared with the CORRECT network settings.
		Be prepared with the install docs from the wiki.
	Apache for users.sluug.org on Budlight.
		<Directory /home/*/public_html>
			AllowOverride All
			Options Indexes FollowSymLinks
		</Directory>
	Migrate users from Dark/Michelob.
Security
	Change shell from /bin/sh for accounts 1-99.
		Change to either /bin/false or /usr/sbin/nologin.
	Documentation on http://wiki.sluug.org/build/security page.
		Section for each of:
			GPG, SSL certs, rootkit checker, tripwire, tiger, john-the-ripper, etc.
		Move Security section (containing GPG info) from 'misc' page.
	From http://www.linuxsecurity.com/docs/harden-doc/html/securing-debian-howto/ch4.en.html:
		Mount /tmp nosuid,noexec,nodev
		Mount /home nodev
		Mount /usr ro
			Edit ''/etc/apt/apt.conf'' to add:
				DPkg
				{
					Pre-Invoke  { "mount /usr -o remount,rw" };
					Post-Invoke { "mount /usr -o remount,ro" };
				};
			Found a note saying that sometimes the ro remount does not work.
				That doesn't leave us any more vulnerable than if we did not implement this.
				Will get remounted ro the next time we install something.
			We should probably point /usr/src at /usr/local/src then.
		Run tripwire or daily checksums:
			DATE=`date +%Y%m%d`
			for dir in /bin/ /sbin/ /usr/bin/ /usr/sbin/ /lib/ /usr/lib/; do
				find $dir -type f -print0 | xargs -0 /usr/bin/md5sum
			done > /var/backups/checksums-$DATE.txt
	Security stuff to install/use.
		john (run it daily, send reports to sysadmin)
			really need to filter and only send email if problems
		chkrootkit
		tiger (internal audit, sends report every day to root)
		tripwire
		checksecurity
		logcheck logcheck-database
	Web content should NOT be owned by www-data, per Debian security manual!
		(Although /var/www/index.html is.)
		I don't know how to make data readable by www-data, but writable by a group.
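One common approach to the ownership question above, sketched under assumptions: content is owned by whoever maintains it and group-owned by a shared group (a hypothetical 'webadmin'), with setgid directories so new files inherit that group. www-data is neither the owner nor a group member, so it only gets the world-readable bits. share_webroot is a made-up helper name.

```shell
# share_webroot DIR GROUP: make DIR group-writable for GROUP and read-only for everyone else.
share_webroot() {
    dir=$1
    group=$2
    chgrp -R "$group" "$dir" || return 1
    # 2775: rwx for owner and group, r-x for others, setgid so new files inherit the group.
    find "$dir" -type d -exec chmod 2775 {} +
    # 664: rw for owner and group, read-only for others (which is how www-data sees it).
    # Note this also clears execute bits, so CGI scripts would need them restored.
    find "$dir" -type f -exec chmod 664 {} +
}
```

For example, share_webroot /var/www webadmin. Group members would also need a umask of 002 for files they create later to stay group-writable.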
System Tuning
	It would probably be a good idea to mount the /var partition with the noatime flag.
	Hard drive tuning.
		Set noatime option in /etc/fstab.
			mount /var -o remount
		hdparm -t /dev/ida/c0d0p11 (speed test, spare partition, just in case)
		hdparm -vi (get info)
		hdparm -m max_blocks_per_io
	Tune some TCP/IP settings.
		NOTE: With everything running, but no clients/load, we're using less than 300MB of RAM.
		Edit /etc/sysctl.conf:
			# Use TCP syncookies when needed (see http://www.ibm.com/developerworks/linux/library/l-tune-lamp-1/)
			net.ipv4.tcp_syncookies = 1
			# Increase TCP max buffer sizes
			net.core.rmem_max = 16777216
			net.core.wmem_max = 16777216
			# Increase Linux autotuning TCP buffer limits
			net.ipv4.tcp_rmem = 4096 87380 16777216 
			net.ipv4.tcp_wmem = 4096 65536 16777216
			# Increase number of ports available
			net.ipv4.ip_local_port_range = 1024 65000
		sysctl -p /etc/sysctl.conf
Tune DokuWiki
	Remove globe icon.
	Make the trace work like my web site.
	Probably wait for new version, which should be out in early June.
Tune SpamAssassin config.
	Install more/better SpamAssassin rules.
		Rules Du Jour.
	Get pyzor/razor/spamassassin rule updates from debian-volatile repository.
		These packages aren't actually updated there.
Add some RBLs to Postfix config.
Add HTTPS to Apache config.
	Simple, once we get SSL certificates scripted.
Consider some Apache tuning.
	Tuning per http://www.ibm.com/developerworks/linux/library/l-tune-lamp-2.html
		Set MaxRequestsPerChild 1000 in apache2.conf
			Once a process has served 1000 HTTP requests, kill it and start a new one.
			Keeps slow memory leaks from accumulating.
			Make change in /etc/apache2/conf.d file if possible.
		<Directory />
			# Don't look for .htaccess files above vhosts' root dirs.
			AllowOverride None
			# No indexes unless explicit in vhost; FollowSymLinks for performance.
			Options -Indexes +FollowSymLinks
		</Directory>
		Keep-alive settings (override what's in default apache2.conf)
			KeepAlive On
			MaxKeepAliveRequests 100
			# Very short (1 second); may solve AJAX "state 3" problem, else set KeepAlive Off.
			KeepAliveTimeout 1
	/server-status and /server-info
		Only from localhost. Probably doesn't matter which vhosts.
		<Location /server-status>
			SetHandler server-status
			Order deny,allow
			Deny from all
			Allow from 127.0.0.1
		</Location>
		<Location /server-info>
			SetHandler server-info
			Order deny,allow
			Deny from all
			Allow from 127.0.0.1
		</Location>

Commands

(Enter any command-line programs you want to have installed here.)

Jeff: mutt, lsof, strace, tcpdump (just off the top of my head - probably I will think of more later)

Operating System

Need to configure all regular users. (Actually, we need to determine if regular users will have accounts on these boxes at all. But the answer will probably be yes.) If we do allow normal users, we need to transfer over the same UIDs and (shadow) passwords as the existing boxes (Michelob and Dark) use.
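If we do carry accounts over, a starting point might be to generate the useradd commands from a copy of the old passwd file and review them before running anything. This is only a sketch: print_useradd is a hypothetical helper, the 1000 cut-off is the Debian convention for regular users, and accounts below it on Michelob/Dark would need to be handled by hand.

```shell
# print_useradd PASSWD_FILE [MIN_UID]: print a useradd command for each regular account.
print_useradd() {
    awk -F: -v min="${2:-1000}" '
        # 65534 is "nobody" on Debian; skip it and anything below the minimum.
        $3 >= min && $3 < 65534 {
            shell = ($7 == "" ? "/bin/sh" : $7)
            printf "useradd -u %s -m -s %s %s\n", $3, shell, $1
        }
    ' "$1"
}
```

The shadow password hashes could then be fed to chpasswd -e as name:hash pairs, assuming the old systems use a hash format the new ones understand.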

Need to fix server not shutting down properly. It hangs before finishing the shutdown process. (I believe this is fixed. I rebooted the systems on 3/16/2006 using init 6 and they came back up just fine. – Craig)

Need to fix the console not coming back up after it blanks out. May be APM power settings. (Our current kernel doesn't support APM.)

It would probably be a good idea to mount the /var partition with the noatime flag.
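A small filter along these lines could add the flag without hand-editing /etc/fstab. It is a sketch, not a tested procedure: add_noatime is a made-up name, and it assumes the usual six-column fstab layout with the mount options in the fourth column.

```shell
# add_noatime MOUNTPOINT: read fstab on stdin, print it with noatime added to that entry.
add_noatime() {
    awk -v mp="$1" '
        # Pass through comments, short lines, other mount points, and entries that already have it.
        /^[ \t]*#/ || NF < 4 || $2 != mp || index("," $4 ",", ",noatime,") { print; next }
        # Assigning to $4 makes awk rejoin the fields of this line with single spaces.
        { $4 = $4 ",noatime"; print }
    '
}
```

Typical use would be add_noatime /var < /etc/fstab > /tmp/fstab.new, then comparing the two files before copying the new one into place and remounting /var.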

Kernel

We'd like to upgrade to a 2.6 kernel. We'd also like to compile a custom kernel, to remove some stuff we don't need and to get better support for some of the features of our systems — such as APM.

Firewall

We may need to open some more ports for new services, if we add any new services.

Email

Outbound email seems to be working OK. Not anymore :( as of November 4, 2005. (CMB)

Inbound SMTP seems to be working OK, but the hand-off to Cyrus seems to be broken.

Jeff Muse and Craig Buchek think we should switch to Courier IMAP. Cyrus isn't working, and there's better documentation for setting up Courier in a configuration similar to ours.

We'll need to get mailing lists working on the new systems before we can migrate email.

Lots of testing (via the test.sluug.org domain) will be required before we're ready to send production email to the new system.

We need to figure out how users on the existing servers (Michelob and Dark) will be able to access their email without needing to make any major changes on those systems.

Web

Make sure SSL is working properly, including (self-signed) certificates. (Probably include instructions to users on how to accept our certificate authority.)

Need to verify that all virtual sites are working.

Need to determine what web apps we plan to run, especially if any of them will control the root home page of a virtual site. Possible web apps include:

Before moving www.sluug.org to the new system(s), we need to make sure that all the old content has been moved over, and that pages keep the names that external sites link to. To verify this, we need to monitor the Apache logs after the move, to see which nonexistent pages people are trying to access.
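The log check could be as simple as counting 404 responses per path. The sketch below assumes Apache's common or combined log format, where the request path is the 7th whitespace-separated field and the status code is the 9th; malformed request lines will not line up. report_missing is a hypothetical name.

```shell
# report_missing LOGFILE: count 404 responses per requested path, most frequent first.
report_missing() {
    awk '$9 == 404 { hits[$7]++ } END { for (p in hits) print hits[p], p }' "$1" |
        sort -k1,1nr -k2
}
```

For instance, report_missing /var/log/apache2/access.log | head -20 would show the twenty most-requested missing pages (that log path is the Debian default; ours may differ per vhost).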

Backups

We discussed backups a little. The consensus was that full backups would not be necessary, since we could build the system from scratch just about as quickly as performing a full restore. Instead, we plan to back up only the data and configuration info: /home (including web sites), /etc, parts of /var (email spool), and maybe /usr/local.

We'll also need to back up the MySQL and PostgreSQL databases.
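Dumping to SQL rather than copying the live data directories is the usual way to get a consistent copy. A rough sketch, assuming mysqldump can find its credentials (for example in ~/.my.cnf) and that the function runs as a user allowed to run pg_dumpall; dump_databases and the file names are made up for illustration.

```shell
# dump_databases DEST_DIR: write dated, gzipped SQL dumps of all MySQL and PostgreSQL databases.
dump_databases() {
    dest=$1
    stamp=$(date +%Y%m%d)
    # --single-transaction gives a consistent snapshot for transactional (InnoDB) tables only.
    mysqldump --all-databases --single-transaction > "$dest/mysql-$stamp.sql" || return 1
    # pg_dumpall is normally run as the postgres superuser.
    pg_dumpall > "$dest/postgresql-$stamp.sql" || return 1
    # Compress only after both dumps succeeded; -f overwrites a same-day dump.
    gzip -f "$dest/mysql-$stamp.sql" "$dest/postgresql-$stamp.sql"
}
```

The resulting files could then ride along with the rest of the backup, e.g. by dumping into a directory that is already on the backup list.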

For the majority of our backups, we'll probably just transfer the data to another computer across the Internet.
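With a forced command in the backup user's authorized_keys on the receiving side (as in the TODO list), the sending side only has to stream a tarball into ssh. A minimal sketch; send_backup is a hypothetical name, and the host and paths in the example below are assumptions.

```shell
# send_backup HOST PATH...: stream a gzipped tar of the given paths to backup@HOST.
# The remote forced command decides the file name, so no remote command is given here.
send_backup() {
    host=$1
    shift
    tar -czf - "$@" | ssh "backup@$host"
}
```

For example, send_backup budlight /home /etc /var/mail /usr/local. The exit status is that of ssh, so a tar failure partway through would need separate checking.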

Routine Maintenance

MySQL and PostgreSQL databases require periodic maintenance. For example, Matthew Porter recently explained how failure to VACUUM a PostgreSQL database will lead to very slow access times after about a month of moderate-to-heavy use.
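A nightly cron job could cover both databases. This sketch uses the stock client tools (vacuumdb from PostgreSQL, mysqlcheck from MySQL) and assumes it runs as a user with rights on every database; db_maintenance is a made-up name, and whether a nightly OPTIMIZE is worthwhile for our tables is untested.

```shell
# db_maintenance: routine housekeeping for every PostgreSQL and MySQL database.
db_maintenance() {
    # Reclaim dead rows and refresh planner statistics in all PostgreSQL databases.
    vacuumdb --all --analyze --quiet || return 1
    # Run OPTIMIZE TABLE on every table in all MySQL databases.
    mysqlcheck --optimize --all-databases --silent
}
```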

Redundancy

We have 2 identical systems, so that we can fail over to a backup system if the primary fails. Each system will be doing its own thing under normal conditions. If one system happens to fail, we'll manually switch its functionality over to the other system, mostly by pointing the DNS records for those services at the other system.

Misc

We need to require a password for sudo for those who can run anything through sudo. For those who can only run a few commands, we can allow them to use sudo without a password.

We also need to set up restricted users in sudo, who can only run a few commands.
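The two tiers might look something like the fragment below, written as a shell function that prints it so the result can be syntax-checked before it goes anywhere near /etc/sudoers. The group names (admin, webops) and the two allowed commands are placeholders, not decisions.

```shell
# print_sudoers: print a candidate sudoers fragment for the two tiers of access.
print_sudoers() {
    cat <<'EOF'
# Full access: password required (the sudo default).
%admin   ALL=(ALL) ALL
# Restricted access: only these commands, no password prompt.
%webops  ALL=(root) NOPASSWD: /usr/sbin/apache2ctl graceful, /etc/init.d/postfix reload
EOF
}
```

Saving the output to a scratch file and running visudo -c -f on it checks the syntax without touching the live configuration.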