SpamAssassin + FuzzyOCR (not really the Debian way)

What we need:

  • Debian Sarge
  • SpamAssassin (up to date, from CPAN)
  • luck :)

Getting started

Upgrade SpamAssassin (the non-Debian, CPAN way)

 perl -MCPAN -e shell
 cpan> install Mail::SpamAssassin

Keep your fingers crossed

Get some graphics manipulation packages from the Debian Sarge backports

 # apt-get install -t sarge-backports gocr libungif-bin \
   libimage-exif-perl libimage-exiftool-perl libstring-approx-perl imagemagick netpbm
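
To confirm that the helper programs actually ended up on the PATH, something like the sketch below may help. The tool names are assumptions about what the packages above provide (gocr from gocr, giffix and giftext from libungif-bin, convert and identify from imagemagick, giftopnm, jpegtopnm and pngtopnm from netpbm), and missing_tools is a made-up helper name, not part of FuzzyOcr. Adjust the list to whatever your FuzzyOcr version really calls.

```shell
# missing_tools NAME...: print each program that is not found on the PATH.
# (Hypothetical helper; not part of FuzzyOcr or SpamAssassin.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool list; edit it to match your setup.
missing_tools gocr giffix giftext convert identify giftopnm jpegtopnm pngtopnm
```

No output means everything in the list was found.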

The Fuzzy magic

 # cd /usr/src/
 # wget
 # tar xzvf fuzzyocr-2.3b.tar.gz
 # wget
 # cd FuzzyOcr-2.3b
 # patch < ../fuzzyocr-23b-hashdb-poison.patch
 # cp /usr/share/perl5/Mail/SpamAssassin/Plugin/
 # cp /etc/spamassassin/
 # cp FuzzyOcr.words.sample /etc/spamassassin/FuzzyOcr.words


 # echo "loadplugin FuzzyOcr /usr/share/perl5/Mail/SpamAssassin/Plugin/"\
   >> /etc/spamassassin/v310.pre
 # sed -i "s/^loadplugin FuzzyOcr FuzzyOcr"\
 # sed -i "s/^#focr_base_score\ 4/focr_base_score\ 2/" /etc/spamassassin/
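
Since sed -i rewrites the file without asking, it can be worth previewing the base score substitution on a sample line first. This is only a sketch: preview_base_score is a made-up name, and the sample line assumes the config ships with the commented-out default "#focr_base_score 4", as the pattern above expects.

```shell
# Run the same substitution as above, but on stdin instead of in place.
# (Hypothetical helper for previewing; it touches no files.)
preview_base_score() {
    sed 's/^#focr_base_score 4/focr_base_score 2/'
}

# Assumed sample line, mirroring the commented-out default.
printf '#focr_base_score 4\n' | preview_base_score
```

Lines that do not match the pattern pass through unchanged, so a file without that exact default would be left as it is.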

For SpamAssassin versions older than 3.1.4:

  # sed -i "s/^focr_pre314 0.0/focr_pre314 1.0/" /etc/spamassassin/
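
Whether this step applies can be checked with a small version comparison. A rough sketch: version_lt is a made-up helper, and the parsing assumes that the first line printed by spamassassin --version contains "version X.Y.Z".

```shell
# version_lt A B: succeed when dotted version A is strictly older than B.
# (Hypothetical helper; compares up to three numeric fields.)
version_lt() {
    if [ "$1" = "$2" ]; then
        return 1
    fi
    oldest=$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    [ "$oldest" = "$1" ]
}

# Only query SpamAssassin when it is actually installed.
if command -v spamassassin >/dev/null 2>&1; then
    # Assumption: the first line of the output contains "version X.Y.Z".
    sa_version=$(spamassassin --version | sed -n 's/.*version \([0-9.]*\).*/\1/p' | head -n 1)
    if version_lt "$sa_version" 3.1.4; then
        echo "SpamAssassin $sa_version: set focr_pre314 to 1.0"
    else
        echo "SpamAssassin $sa_version: leave focr_pre314 at 0.0"
    fi
fi
```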

Fingers still crossed

Verify the SpamAssassin configuration:

  # spamassassin --lint
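
A clean lint only says the configuration parses. To see whether the plugin is picked up at all, the debug output can be searched for its name. A sketch, assuming the plugin's name shows up somewhere in the output of spamassassin -D --lint when it loads; mentions_plugin is a made-up helper.

```shell
# mentions_plugin NAME: succeed when the text on stdin mentions NAME
# (case-insensitive). Hypothetical helper, not a SpamAssassin command.
mentions_plugin() {
    grep -qi -- "$1"
}

# Only run the checks when SpamAssassin is actually installed.
if command -v spamassassin >/dev/null 2>&1; then
    if spamassassin --lint; then
        echo "lint: OK"
    else
        echo "lint: problems found"
    fi
    if spamassassin -D --lint 2>&1 | mentions_plugin fuzzyocr; then
        echo "FuzzyOcr appears in the debug output"
    else
        echo "FuzzyOcr is not mentioned in the debug output"
    fi
fi
```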

See also: Fighting image spam on our Debian spamfilter with FuzzyOcr and ImageInfo plugins

