====== TODO ======
We've still got plenty of work to do before we can move the new system(s) into production.
===== TODO on Bud on 2007-06-13 =====
TLS/SASL additions to Postfix (DONE by CMB, except for SSL certificates)
Have not yet implemented since rebuilding Bud.
Documentation on wiki page is not clear; needs work.
Try to be consistent with existing docs, and use postconf -e.
We have previous history and 'script' files to work with too.
Copy over from Dark any static IPs that are allowed to send email through us.
MailMan (2-3 hr)
Install, configure, and test.
Ensure archives can be viewed via the web.
Should have a password, like existing site.
We can leave the previous archives on Dark temporarily.
Remove rewrite rules from .htaccess file when we move those to Bud too.
Ensure we have a plan to maintain the lists.
Copy list memberships from Dark/Michelob.
Backups (2-3 hr)
Ensure everything we need to back up gets backed up.
Create SSH restrictions to copy backups from one server to the other.
Give 'backup' user write access to /var/backups/bud/ and budlight/.
In ~backup/.ssh/authorized_keys:
from="bud,bud.sluug.org,static-206.196.99.162.primary.net",command="cat > /var/backups/bud/bud-`date +'%Y%m%d'`.tar.gz",no-pty,no-port-forwarding,no-x11-forwarding ssh-dss ......
Rotate backups, so we don't keep too many around.
Would be nice to keep:
1 backup per day for 1 week (or 2 weeks)
1 backup per week for a year (or 1-6 months)
1 backup per month forever
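A rough sketch of how the retention rules above could be checked per backup. The function name ''keep_backup'' is made up here, it assumes GNU date (for ''-d''), and it picks the Sunday backup as the weekly one and the backup from the 1st as the monthly one; none of that has been decided yet.

```shell
#!/bin/sh
# keep_backup BACKUP_DATE TODAY   (both as YYYYMMDD)
# Exit status 0 means "keep", 1 means "safe to delete", 2 means bad date.
# Assumptions: GNU date, Sunday = weekly backup, 1st of month = monthly backup.
keep_backup() {
    backup_s=$(date -u -d "$1" +%s) || return 2
    today_s=$(date -u -d "$2" +%s) || return 2
    age_days=$(( ($today_s - $backup_s) / 86400 ))

    # Daily tier: everything from the last 7 days.
    [ "$age_days" -le 7 ] && return 0
    # Monthly tier: the backup taken on the 1st is kept forever.
    [ "$(date -u -d "$1" +%d)" = "01" ] && return 0
    # Weekly tier: Sunday backups are kept for up to a year.
    [ "$(date -u -d "$1" +%u)" = "7" ] && [ "$age_days" -le 365 ] && return 0
    return 1
}
```

A rotation script could loop over the bud-YYYYMMDD.tar.gz files, pull the date out of each name, and delete only the files for which ''keep_backup'' returns 1.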
Need to be able to back up MySQL data properly.
I don't think copying the directories works correctly.
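One possible approach is to dump the databases with mysqldump and back up the dump file instead of the live data directories. A minimal sketch; the function name and file naming are illustrative only, and it assumes the MySQL credentials are in ~/.my.cnf.

```shell
#!/bin/sh
# dump_mysql DEST_DIR
# Writes DEST_DIR/mysql-YYYYMMDD.sql.gz containing every database.
# Assumes credentials come from ~/.my.cnf, so no password appears here.
dump_mysql() {
    out="$1/mysql-$(date +%Y%m%d).sql"
    # --single-transaction gives a consistent snapshot for InnoDB tables;
    # MyISAM tables would need --lock-all-tables instead.
    mysqldump --all-databases --single-transaction > "$out" || return 1
    gzip -f "$out"
}
```

Running something like ''dump_mysql /var/backups/mysql'' before the nightly tar would put the dump where the regular backup picks it up (that directory is only an example).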
Move MX record to Bud (READY to flip switch [MX record in DNS] for Plan A - CMB)
Plan A: have email bounce from Bud to Dark
We could do this on a per-user basis.
Plan B: do not move MX record, until users change their settings on Dark/Michelob.
Problem is that we'd tell them how via email, which they might not get.
Some of our users read email very infrequently.
System Monitoring and Admin (2-4 hr)
Monitor Apache logs for missing pages.
Scripts for user creation.
Primarily for Budlight.
Probably want to maintain UIDs.
But ensure that regular user UIDs are >= 1000, per Debian policy.
Manually copy user directories from Michelob/Dark?
Metrics on emails (spam).
Metrics on web.
What else do we need?
SSL Certificates (1-2 hr) (DONE - LVL created openssl-gencrt)
Need to be consistent in parameters we pass in to create various certificates.
Would be nice to not have to type the parameters in every time.
Move all SSL cert creation to one place in the docs.
http://wiki.sluug.org/build/security
Currently some stuff in build/misc, build/apache, and build/postfix.
===== Things that can wait until later =====
Rebuilding Budlight
Build based on Bud build documentation on Wiki.
Requires an on-site visit to install the OS.
Be prepared with the CORRECT network settings.
Be prepared with the install docs from the wiki.
Apache for users.sluug.org on Budlight.
AllowOverride All
Options Indexes FollowSymLinks
Migrate users from Dark/Michelob.
Security
Change shell from /bin/sh for accounts 1-99.
Change to either /bin/false or /usr/sbin/nologin.
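To see which accounts this affects, something like the following could be used. It is only a sketch: ''list_sh_accounts'' is a made-up name, and it just reads the passwd file without changing anything.

```shell
#!/bin/sh
# list_sh_accounts [PASSWD_FILE]
# Prints the login names of accounts with UID 1-99 whose shell is /bin/sh.
# Read-only: it changes nothing.
list_sh_accounts() {
    awk -F: '$3 >= 1 && $3 <= 99 && $7 == "/bin/sh" { print $1 }' \
        "${1:-/etc/passwd}"
}
```

Each name it prints could then be changed by root with ''usermod -s /usr/sbin/nologin NAME'', after checking that nothing (cron jobs, su in init scripts) relies on that account having a shell.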
Documentation on http://wiki.sluug.org/build/security page.
Section for each of:
GPG, SSL certs, rootkit checker, tripwire, tiger, john-the-ripper, etc.
Move Security section (containing GPG info) from 'misc' page.
From http://www.linuxsecurity.com/docs/harden-doc/html/securing-debian-howto/ch4.en.html:
Mount /tmp nosuid,noexec,nodev
Mount /home nodev
Mount /usr ro
Edit ''/etc/apt/apt.conf'' to add:
DPkg
{
Pre-Invoke { "mount /usr -o remount,rw" };
Post-Invoke { "mount /usr -o remount,ro" };
};
Found a note saying that sometimes the ro remount does not work.
That doesn't leave us any more vulnerable than if we did not implement this.
Will get remounted ro the next time we install something.
We should probably point /usr/src at /usr/local/src then.
Run tripwire or daily checksums:
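DATE=$(date +%Y%m%d)
# Redirect once, after the loop, so each directory does not overwrite the last.
for dir in /bin/ /sbin/ /usr/bin/ /usr/sbin/ /lib/ /usr/lib/; do
find "$dir" -type f -print0 | xargs -0 /usr/bin/md5sum
done > /var/backups/checksums-$DATE.txt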
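The daily checksum files are only useful if something compares them. A possible sketch, assuming two md5sum output files from different days; ''changed_files'' is a made-up name. It does not report files that were deleted.

```shell
#!/bin/sh
# changed_files OLD_CHECKSUMS NEW_CHECKSUMS
# Prints files that are new or whose MD5 changed since OLD_CHECKSUMS.
# Both inputs are md5sum output: 32 hex digits, two characters of
# separator, then the path (so the path starts at column 35).
changed_files() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    # comm -13 keeps lines found only in the second (newer) file.
    comm -13 "$old_sorted" "$new_sorted" | cut -c35-
    rm -f "$old_sorted" "$new_sorted"
}
```

Expect some output after every package install or upgrade; output on a day with no installs is the interesting case.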
Security stuff to install/use.
john (run it daily, send reports to sysadmin)
really need to filter and only send email if problems
chkrootkit
tiger (internal audit, sends report every day to root)
tripwire
checksecurity
logcheck logcheck-database
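For the note above about only sending email when there are problems, a small wrapper might be enough. This is a sketch, not tested against john itself: ''mail_if_output'' is a made-up name, and it assumes the checking command prints nothing when all is well.

```shell
#!/bin/sh
# mail_if_output SUBJECT RECIPIENT COMMAND [ARGS...]
# Runs COMMAND and mails its output (stdout and stderr) only when
# there is some. A silent run sends nothing.
mail_if_output() {
    subject=$1
    recipient=$2
    shift 2
    report=$("$@" 2>&1)
    if [ -n "$report" ]; then
        printf '%s\n' "$report" | mail -s "$subject" "$recipient"
    fi
}
```

From cron this might look like ''mail_if_output "weak passwords on bud" sysadmin /usr/local/sbin/check-passwords'', where check-passwords is a hypothetical script that filters john's output down to actual findings.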
Web content should NOT be owned by www-data, per Debian security manual!
(Although /var/www/index.html is.)
I don't know how to make data readable by www-data, but writable by a group.
System Tuning
It would probably be a good idea to mount the /var partition with the noatime flag.
Hard drive tuning.
Set noatime option in /etc/fstab.
mount /var -o remount
hdparm -t /dev/ida/c0d0p11 (speed test, spare partition, just in case)
hdparm -vi (get info)
hdparm -m max_blocks_per_io
Tune some TCP/IP settings.
NOTE: With everything running, but no clients/load, we're using less than 300MB of RAM.
Edit /etc/sysctl.conf:
# Use TCP syncookies when needed (see http://www.ibm.com/developerworks/linux/library/l-tune-lamp-1/)
net.ipv4.tcp_syncookies = 1
# Increase TCP max buffer sizes
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Increase Linux autotuning TCP buffer limits
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Increase number of ports available
net.ipv4.ip_local_port_range = 1024 65000
sysctl -p /etc/sysctl.conf
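After reloading, it may be worth confirming that the kernel actually accepted the values. A minimal sketch; ''check_sysctl'' is a made-up name, and it assumes simple "key = value" lines in the file.

```shell
#!/bin/sh
# check_sysctl [CONF_FILE]
# Prints each "key = value" setting in CONF_FILE whose running value
# differs. Prints nothing when everything matches.
check_sysctl() {
    grep -v '^[[:space:]]*[#;]' "${1:-/etc/sysctl.conf}" | grep '=' |
    while IFS='=' read -r key want; do
        # Unquoted echo trims the ends and collapses tabs/spaces, so
        # "4096<TAB>87380" from the kernel compares equal to "4096 87380".
        key=$(echo $key)
        want=$(echo $want)
        have=$(echo $(sysctl -n "$key" 2>/dev/null))
        [ "$have" = "$want" ] || echo "$key: want '$want', have '$have'"
    done
}
```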
Tune DokuWiki
Remove globe icon.
Make the trace work like my web site.
Probably wait for new version, which should be out in early June.
Tune SpamAssassin config.
Install more/better SpamAssassin rules.
Rules Du Jour.
Get pyzor/razor/spamassassin rule updates from debian-volatile repository.
These packages aren't actually updated there.
Add some RBLs to Postfix config.
Add HTTPS to Apache config.
Simple, once we get SSL certificates scripted.
Consider some Apache tuning.
Tuning per http://www.ibm.com/developerworks/linux/library/l-tune-lamp-2.html
Set MaxRequestsPerChild 1000 in apache2.conf
Once a process has served 1000 HTTP requests, kill it and start a new one.
Limits the damage from memory leaks.
Make change in /etc/apache2/conf.d file if possible.
<Directory />
# Don't look for .htaccess files above vhosts' root dirs.
AllowOverride None
# No indexes unless explicit in vhost; FollowSymLinks for performance.
Options -Indexes +FollowSymLinks
</Directory>
Keep-alive settings (override what's in default apache2.conf)
KeepAlive On
MaxKeepAliveRequests 100
# Very short (1 second); may solve AJAX "state 3" problem, else KeepAlive Off.
KeepAliveTimeout 1
/server-status and /server-info
Only from localhost. Probably doesn't matter which vhosts.
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from 127.0.0.1
</Location>
<Location /server-info>
SetHandler server-info
Order deny,allow
Deny from all
Allow from 127.0.0.1
</Location>
===== Commands =====
(Enter any command-line programs you want to have installed here.)
Jeff: mutt, lsof, strace, tcpdump (just off the top of my head - probably I will think of more later)
===== Operating System =====
Need to configure all regular users. (Actually, we need to determine if regular users will have accounts on these boxes at all. But the answer will probably be yes.) If we do allow normal users, we need to transfer over the same UIDs and (shadow) passwords as the existing boxes (Michelob and Dark) use.
Need to fix server not shutting down properly. It hangs before finishing the shutdown process. (I believe this is fixed. I rebooted the systems on 3/16/2006 using ''init 6'' and they came back up just fine. -- Craig)
Need to fix the console not coming back up after it blanks out. May be APM power settings. (Our current kernel doesn't support APM.)
It would probably be a good idea to mount the /var partition with the noatime flag.
===== Kernel =====
We'd like to upgrade to a 2.6 kernel. We'd also like to compile a custom kernel, to remove some stuff we don't need and to get better support for some of the features of our systems --- such as APM.
===== Firewall =====
We may need to open some more ports for new services, if we add any new services.
===== Email =====
Outbound email seems to be working OK. Not anymore :( as of November 4, 2005. (CMB)
Inbound SMTP seems to be working OK, but the hand-off to Cyrus seems to be broken.
Jeff Muse and Craig Buchek think we should switch to Courier IMAP. Cyrus isn't working, and there's better documentation for setting up Courier in a configuration similar to ours.
We'll need to get mailing lists working on the new systems before we can migrate email.
Lots of testing (via the test.sluug.org domain) will be required before we're ready to send production email to the new system.
We need to figure out how users on the existing servers (Michelob and Dark) will be able to access their email without needing to make any major changes on those systems.
===== Web =====
Make sure SSL is working properly, including (self-signed) certificates.
(Probably include instructions to users on how to accept our certificate authority.)
Need to verify that all virtual sites are working.
Need to determine what web apps we plan to run, especially if any of them will control the root home page of a virtual site. Possible web apps include:
* Content Management System
* Wiki
* Calendar
* Webmail (most likely [[http://www.horde.org/imp/|Horde]])
* Forums
* FAQ
* System Admin (like Webmin)
Before moving www.sluug.org to the new system(s), we need to make sure that we've moved all the old content over, and that pages keep the same names that external sites link to. To verify this, we need to monitor the Apache logs after the move, to see which nonexistent pages people are trying to access.
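A simple way to pull the missing pages out of the logs could look like this. It is a sketch: ''missing_pages'' is a made-up name, and it assumes Apache's common or combined log format and Debian's default log location.

```shell
#!/bin/sh
# missing_pages [ACCESS_LOG]
# Prints "count path" for every request answered with 404, busiest first.
# Assumes Apache's common/combined log format, where the request path is
# field 7 and the status code is field 9.
missing_pages() {
    awk '$9 == 404 { print $7 }' "${1:-/var/log/apache2/access.log}" |
        sort | uniq -c | sort -rn
}
```

Paths near the top of the list are the best candidates for restoring the page or adding a redirect.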
===== Backups =====
We discussed backups a little. The consensus was that full backups would not be necessary. We could build the system from scratch just about as quickly as performing a full restore. Instead, we plan to back up only the data and configuration info. I.e. /home (including web sites), /etc, parts of /var (email spool), and maybe /usr/local.
We'll also need to back up the MySQL and PostgreSQL databases.
For the majority of our backups, we'll probably just transfer the data to another computer across the Internet.
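The sending side of such a transfer might be as small as the sketch below. It assumes the receiving 'backup' account has a forced command in its authorized_keys that stores whatever arrives on stdin; ''send_backup'' is a made-up name.

```shell
#!/bin/sh
# send_backup HOST PATH...
# Streams a gzipped tar of the given paths to the 'backup' account on HOST.
# No remote command is given: the forced command in authorized_keys on
# the receiving side decides where the stream is stored.
# Note: the exit status is that of ssh, so a tar failure is not detected.
send_backup() {
    host=$1
    shift
    tar -czf - "$@" | ssh "backup@$host"
}
```

For example, ''send_backup budlight /home /etc /usr/local'' run nightly from cron on Bud; the exact list of paths is still to be decided.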
===== Routine Maintenance =====
MySQL and PostgreSQL databases require periodic maintenance. For example, Matthew Porter recently explained how failure to VACUUM a PostgreSQL database will lead to very slow access times after about a month of moderate-heavy use.
===== Redundancy =====
We have 2 identical systems, so that we can fail over to a backup system if the primary fails. Each system will be doing its own thing under normal conditions. If one system happens to fail, we'll **manually** switch its functionality over to the other system, mostly by pointing the DNS records for those services at the other system.
===== Misc =====
We need to require a password for sudo for those who can run anything through sudo.
For those who can only run a few commands, we can allow them to use sudo without a password.
We also need to set up restricted users in sudo, who can only run a few commands.