March 2004 Archives

All this spam


Man, spam is getting to be such a problem.

Today I got word that two of our clients are getting seriously frustrated with the number of unmarked messages getting past SpamAssassin — the filter we have installed on some of our NetGateways.

SpamAssassin isn’t a bad product. It’s just as good as it used to be. 95% accuracy sounds great when you’re only getting 20-30 spam messages per day. When only one or two messages slip through the cracks, that’s all right. But when hundreds or thousands of messages are being lobbed at your inbox every day, that same 5% adds up to a lot of messages slipping through: around 50 a day at 1,000 messages.
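To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. The 95% figure is the one quoted above; the daily volumes are only illustrative, and the function name is made up for this example.

```python
def missed_per_day(spam_per_day, accuracy):
    """Expected number of spam messages that slip past the filter each day."""
    return spam_per_day * (1 - accuracy)


# Illustrative daily volumes, not measurements from any real mailbox.
for volume in (25, 500, 1000):
    missed = missed_per_day(volume, 0.95)
    print(f"{volume:>5} spam/day at 95% accuracy -> about {missed:.0f} missed")
```

The filter hasn’t gotten any worse in this picture. The miss rate stays at 5% and only the volume changes.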

So, I’ve been doing some research on methods I can use to improve SpamAssassin or software that I can replace SpamAssassin with. I’ve found a few things, but mostly I’ve discovered this is going to take some work to find the right solution.

I did a search on Google for the term “better than spamassassin” just to see what would come up. There were a few interesting hits, but the one that probably caught my eye more than any other was this one: http://crm114.sourceforge.net/.

CRM114 is basically a text filter, but one that was designed from the ground up to be proficient at isolating patterns in spam.

Here is a very interesting presentation (in PDF format) describing the need for a very accurate (i.e. 99.9% or better) mail filter in order to cripple the business model of most commercial spammers.

I’ve played around with dspam a little as well, but it scares the bejeebers out of me because of how complex the installation/configuration is. Maybe I just need to spend more time with it because the author of dspam also claims very high accuracy and it would seem there’s less hassle for end-users than with SpamAssassin.

Another project I’m currently investigating is this Sender Policy Framework (SPF) stuff. This is bigger than a server-side mail filter. It’s a way for domains to publish which servers are allowed to send mail on their behalf, so that when e-mail comes in saying “I came from blahblahblah.com,” the receiving server can look up the SPF record for blahblahblah.com and determine whether the sending server is actually authorized to send mail for that domain.
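As a rough illustration of the idea, here is a minimal Python sketch of the check a receiving server might make. It is heavily simplified and rests on several assumptions: the SPF record is hard-coded instead of being fetched from DNS (where SPF records are published as TXT records), only `ip4:` mechanisms are understood, and the record contents and IP addresses are hypothetical values from the ranges reserved for documentation.

```python
import ipaddress


def spf_allows(spf_record, sender_ip):
    """Return True if sender_ip matches an ip4: mechanism in spf_record.

    Deliberately simplified: only ip4: mechanisms are handled. Real SPF
    also defines mechanisms such as a, mx and include, plus qualifiers.
    """
    ip = ipaddress.ip_address(sender_ip)
    for term in spf_record.split():
        if term.startswith("ip4:"):
            network = ipaddress.ip_network(term[len("ip4:"):], strict=False)
            if ip in network:
                return True
    return False


# Hypothetical record for blahblahblah.com; a real check would look it up in DNS.
record = "v=spf1 ip4:192.0.2.0/24 -all"

print(spf_allows(record, "192.0.2.25"))   # sender inside the published range
print(spf_allows(record, "203.0.113.9"))  # sender the domain never listed
```

A message claiming to be from blahblahblah.com but arriving from the second address would fail the check, which is the whole point of publishing the record.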

The biggest hurdle in implementing SPF is that it’s a modification to (a hack on top of, really) SMTP. You have to get all your users on board to make a full switchover.

Meanwhile, the next version of SpamAssassin (2.70) will support SPF.

The answer to my problems right now might be that I just need to work a little harder on helping SpamAssassin do its job better.

I just migrated all the Fozzolog mod_perl code to use Template Toolkit instead of CGI::FastTemplate.

To do this, I had to rewrite all the template files dealing with Fozzolog pages. There are seven of those. Because the Template Toolkit syntax supports more logic than CGI::FastTemplate, I moved as much of the presentation logic as I could out of the mod_perl content handler into the template files.

I believe the move to Template Toolkit makes the Fozzolog code more portable, easier to maintain, and easier to understand.

It’s fairly trivial now for me to fire up another journal site based on Fozzolog code. All I need to do is set up a <Location> section in the Apache configuration file with some Perl variables.

<Location /journal>
  PerlSetVar Fozzolog_DSN dbi:Pg:dbname=joealog
  PerlSetVar Fozzolog_Path /journal
  PerlSetVar Fozzolog_TemplateDir /www/joejoejoe.com/templates
  PerlSetVar Fozzolog_Site www.joejoejoe.com
  PerlSetVar Fozzolog_DateFormat "%a %d %b, %Y at %H:%M"
  SetHandler perl-script
  PerlHandler FGM::FozzologHandler
</Location>

The PerlSetVar directives provide all the key data needed. Any other customization can be done in the template files.
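To show what the Fozzolog_DateFormat value above produces, here is a small Python sketch. It only illustrates the output shape: the Fozzolog code itself does its formatting through the `manip.UnixDate` call in the template, and I’m assuming these particular directives (`%a`, `%d`, `%b`, `%Y`, `%H`, `%M`) mean the same thing there as they do in Python’s `strftime`. The timestamp is an arbitrary example, not a real entry.

```python
from datetime import datetime

# Same format string as the Fozzolog_DateFormat setting above.
DATE_FORMAT = "%a %d %b, %Y at %H:%M"

# An arbitrary example timestamp, not taken from a real journal entry.
posted = datetime(2004, 3, 15, 14, 30)

# With Python's default (C) locale this prints: Mon 15 Mar, 2004 at 14:30
print(posted.strftime(DATE_FORMAT))
```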

For the insanely bored, here’s a snippet of a template file:

<table width="100%" border="1" cellpadding="5" cellspacing="0">
  <tr>
   <td class="highlight">
    <table width="100%">
     <tr>
      <td align="left">
       <div class="journalhead">
        Date: [% manip.UnixDate(jentry.dt_posted, date_format) %]<br/>
        Topic: <strong>[% jentry.topic_name %]</strong><br/>
        Headline: <strong>[% jentry.headline %]</strong>
       </div>
      </td>
     </tr>
    </table>
    [% jentry.body %]
    <div class="commentsummary"> 
      [% IF jentry.comment_count == 0 %]
        No comments
      [% ELSIF jentry.comment_count == 1 %]
        <a href="[% fozzolog_path %]/SC[% jentry.jentry_id %]">
          1 comment</a>
      [% ELSE %]
        <a href="[% fozzolog_path %]/SC[% jentry.jentry_id %]">
          [% jentry.comment_count %] comments</a>
      [% END %]
      |
      <a href="[% fozzolog_path %]/PC/[% jentry.jentry_id %]">
        Add a comment</a>
    </div>
   </td>
  </tr>
 </table>
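The IF/ELSIF/ELSE block in that snippet is the kind of presentation logic that used to live in the content handler. For comparison, here is roughly the same branching written as a Python function. This is only a sketch to show what the template decides; the function name is made up, and the URL shapes (`/SC` followed directly by the entry id, `/PC/` followed by the entry id) are copied as they appear in the template.

```python
def comment_summary(fozzolog_path, jentry_id, comment_count):
    """Build the comment summary line the same way the template branches do."""
    if comment_count == 0:
        summary = "No comments"
    else:
        # Singular for exactly one comment, plural otherwise.
        if comment_count == 1:
            label = "1 comment"
        else:
            label = f"{comment_count} comments"
        summary = f'<a href="{fozzolog_path}/SC{jentry_id}">{label}</a>'
    add_link = f'<a href="{fozzolog_path}/PC/{jentry_id}">Add a comment</a>'
    return f"{summary} | {add_link}"


# Hypothetical path and entry id, purely for illustration.
print(comment_summary("/journal", 42, 0))
print(comment_summary("/journal", 42, 3))
```

Keeping this in the template means the wording and markup can change without touching the handler code.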
