Tuesday, February 26, 2008

By Request - My dspam Training Script

In a post I made about a year ago, I mentioned a script I wrote that trains dspam to recognize missed spam email and corrects it when it falsely identifies a good (or "ham") email as spam. Someone has requested that I post that script, so here it is. Please note that my qmail installation uses the maildir format!

--- start file: train-spam.sh ---
#!/bin/sh

# train-spam.sh
#
# Description: Checks each user's /home/<user>/Maildir/.Spam.Missed
# directories to see if the user placed any "missed" spam
# messages which got through the spam filter to their INBOX.
# If there are messages in this directory, then the script
# invokes dspam to retrain that user's tokens to try
# and improve the defenses for next time. Messages placed in
# .Spam.NotSpam are retrained as ham in the same way.
#

# learn_spam - Function which takes a directory and a user as
# arguments, and then feeds that directory to our anti-spam
# applications for further SPAM training.
#
# Arguments:
# $1 - Directory name containing SPAM emails. Required
# $2 - User name. If it is not provided, $USER will be used.
#
# Example:
# learn_spam /home/alank/Maildir/.Spam.Missed/cur alank
#
learn_spam() {

    # use $USER when no user name is given
    owner=${2:-$USER}

    # loop through all emails in given directory
    for email in "$1"/*; do

        # skip the unexpanded pattern left by an empty or missing directory
        [ -f "$email" ] || continue

        # process SPAM email using DSPAM
        /usr/local/bin/dspam --mode=teft --source=error --class=spam --feature=chained,noise --user "$owner" < "$email"
        echo -n "."

        # delete SPAM email
        rm "$email"

    done # end of email loop

} # end function learn_spam

# learn_ham - Function which takes a directory and a user as
# arguments, and then feeds that directory to our anti-spam
# applications for further HAM training.
#
# Arguments:
# $1 - Directory name containing HAM emails. Required
# $2 - User name. If it is not provided, $USER will be used.
#
# Example:
# learn_ham /home/alank/Maildir/.Spam.NotSpam/cur alank
#
learn_ham() {

    # use $USER when no user name is given
    owner=${2:-$USER}

    # loop through all emails in given directory
    for email in "$1"/*; do

        # skip the unexpanded pattern left by an empty or missing directory
        [ -f "$email" ] || continue

        # process HAM email using DSPAM
        /usr/local/bin/dspam --mode=teft --source=error --class=innocent --feature=chained,noise --user "$owner" < "$email"
        echo -n "."

        # delete HAM
        rm "$email"

    done # end of email loop

} # end function learn_ham

#
# Script starts here!
#

# loop through all user home directories
for home in /home/*; do

    # the user name is the last component of the home directory path
    name=${home##*/}

    # if there is a Spam/Missed maildir
    if [ -d "$home/Maildir/.Spam.Missed/cur" ]; then

        # then process any missed SPAM
        echo -n "missed spam for $name: "
        learn_spam "$home/Maildir/.Spam.Missed/cur" "$name"
        learn_spam "$home/Maildir/.Spam.Missed/new" "$name"
        echo ""

    fi # end if

    # if there is a Spam/NotSpam dir
    if [ -d "$home/Maildir/.Spam.NotSpam/cur" ]; then

        # then process any falsely identified spam, i.e. HAM
        echo -n "false positives for $name: "
        learn_ham "$home/Maildir/.Spam.NotSpam/cur" "$name"
        learn_ham "$home/Maildir/.Spam.NotSpam/new" "$name"
        echo ""

    fi # end if

done # end for loop

echo "Done!"
--- end file: train-spam.sh ---
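
The folder names .Spam.Missed and .Spam.NotSpam are repeated in several places in the script. As a rough sketch (not part of the script above), they could be kept in variables at the top so there is only one place to edit. The names MAIL_ROOT, MISSED_FOLDER, NOTSPAM_FOLDER and folder_paths are made up for this example.

```shell
#!/bin/sh

# Hypothetical settings: change these to match your own layout.
MAIL_ROOT="/home"
MISSED_FOLDER=".Spam.Missed"
NOTSPAM_FOLDER=".Spam.NotSpam"

# folder_paths - print the cur and new subdirectories of one
# maildir folder for one user, one path per line.
#
# Arguments:
# $1 - User name
# $2 - Folder name, e.g. $MISSED_FOLDER
folder_paths() {
    for sub in cur new; do
        echo "$MAIL_ROOT/$1/Maildir/$2/$sub"
    done
}

# Example: show where alank's missed spam would be picked up
folder_paths alank "$MISSED_FOLDER"
```

Each path printed by folder_paths could then be handed to learn_spam or learn_ham in place of the hard-coded directories.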

I place the above script in /root and create a cron job to run it every day in the early morning. You will need to edit some parts of the script if your missed spam and not spam directories are named differently. Good luck, and I hope it is helpful in your continuing battle against spam!
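
For the cron job, an entry along these lines in root's crontab (edited with crontab -e) would do it. The 4:30 AM time is only an assumption for illustration, so pick whatever early-morning slot suits your server, and make sure the script is executable.

```
# minute hour day-of-month month day-of-week command
30 4 * * * /root/train-spam.sh
```

Since the script prints its progress dots and "Done!" to standard output, cron will normally mail that output to root after each run.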

Yes, I Am Still Here

A bit dusty, eh?
Wow - it's a little dusty around here, isn't it? I haven't updated the ol' blog in months, and every time I did think about updating it, I said to myself, "but it's been so long!" That is a nice self-perpetuating situation, so now it's time I break free from the shackles of stupidity and update the damn thing.

I am well aware that I still need to post pictures from last summer's vacation, but those will have to wait for another day. Instead, I'll try to catch you up on what has happened over the last four months or so.

Little League

As I may have mentioned previously, I am on the board of directors for our local Little League. Finding volunteers is always a challenge, as many people are just too busy (or unable) to help out. I have managed my son's team for the last three years and have really enjoyed it, along with the kids and parents in the community that I have been lucky enough to meet and get to know. Since the league was in dire straits trying to find a new board, I volunteered as the information officer.

It has been a lot of work, but overall, I think it is worth it. The people on the board are all good people, doing their best to make sure that the kids of our community have the opportunity to play baseball in a safe and fun environment. I have spent late nights at monthly board meetings, weekends doing sign-ups and field prep work, and time during the day responding to board emails and the like.

I am not managing this year, and am instead merely coaching. It's sort of driving me nuts not being in control, but I'll get over it.

Christmas

I took my normal two weeks off after Christmas, and promptly got sicker than I had been all year. During the course of 2008, I took two sick days off from work. While on vacation, I was bedridden for nearly three days. That'll teach me...

Other than that, we had a very nice Christmas and New Year's. No Christmas cards were sent this year, due to the ebola plague that swept through the house. I like to think of it as a minor inconvenience to our friends, and a minor bonus for the trees.

Tech-talk

The day after Thanksgiving, I braved Fry's and purchased parts for a new computer at prices almost too good to be true. Actually, they were too good to be true because of those damn rebates. Here it is, four months later, and I am still waiting on my final rebate to arrive from Abit.

The system is an Intel Core Duo 2.6 GHz machine with 2 GB of RAM, a nice Nvidia video card, and a decent SATA hard drive. It plays Team Fortress 2 beautifully, and I haven't really bought any other recent games due to time limitations. It should easily last me for the next couple of years.

So, that is a very brief update of what I've been doing over these last few months. I'll try to update the blog a bit more regularly as time and obligations permit.