Q&A: Is text in images spam filter proof?

Q: Hello EmailKarma.net,

I’m going back and forth with a co-worker because I have advised that a client remove some of the text from their latest creative. My co-worker is adamant that spam filters can’t read text within an HTML image, and that the message should be fine to send.

If that’s true, do you think it’s about to change? I’ve also heard that emails are being ‘fingerprinted’ and treated accordingly by inbound filters at the ISPs. Do you know about this?

Thanks,
[Name Withheld]

A:

Hi,

Images are no longer a safe haven for text that might look questionable to spam filters. There is actually a very well-known image-reading SpamAssassin plug-in called FuzzyOCR.

FuzzyOCR uses Optical Character Recognition (OCR, the technology scanning software commonly uses to import paper documents as editable documents) to read the text in an image, which allows SpamAssassin to analyze the content of the image. This software became extremely popular over the last 2 years as spammers moved from text-based messaging to image-based spam to avoid keyword spam filtering.
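To make the idea concrete, here is a minimal Python sketch of what a filter might do once OCR has pulled text out of an image. It is not the plug-in's actual code: the OCR step is assumed to have already run, and the word list, the `fuzzy_hits` function and the 0.8 threshold are all made up for illustration. Approximate matching is used because OCR output is noisy and spammers deliberately misspell words.

```python
from difflib import SequenceMatcher

# Hypothetical word list for illustration; a real filter loads its own.
SPAM_WORDS = ["viagra", "mortgage", "lottery"]


def fuzzy_hits(ocr_text, words=SPAM_WORDS, threshold=0.8):
    """Return (token, matched_word) pairs for tokens that closely
    resemble a listed word. `ocr_text` is assumed to be the text an
    OCR engine extracted from an image."""
    hits = []
    for raw in ocr_text.lower().split():
        token = raw.strip(".,!?:;\"'()")
        if not token:
            continue
        for word in words:
            # ratio() is 1.0 for identical strings, lower for poorer matches.
            if SequenceMatcher(None, token, word).ratio() >= threshold:
                hits.append((token, word))
                break  # one match per token is enough
    return hits


print(fuzzy_hits("Cheap V1agra today!"))
```

A misspelling such as "V1agra" still matches, which is why putting obfuscated text into an image is no protection once the image can be read.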

Other systems, like the Barracuda Networks spam firewall, use a digital fingerprint, built from the messages received by their spam trap network, to classify spam regardless of the contents. Messages containing the same image all share the same fingerprint, which makes them easy to detect and filter. This is especially true of legitimate email traffic, where the sender does not modify the content or appearance of the image for each message (or group of messages); altering images that way is a common tactic used by spammers.
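As a rough illustration of fingerprinting, the Python sketch below hashes the raw bytes of an image and checks the result against fingerprints collected from a spam trap. This is an assumption-laden toy, not how Barracuda or any other vendor actually does it: SHA-256 is swapped in as the simplest possible fingerprint, and the names `fingerprint`, `known_spam` and `is_known_spam` are hypothetical.

```python
import hashlib

# Fingerprints of images seen in a (hypothetical) spam trap network.
known_spam = set()


def fingerprint(image_bytes):
    """Exact-match fingerprint: the SHA-256 digest of the image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def record_spam_trap_image(image_bytes):
    """Remember an image that arrived at a spam trap address."""
    known_spam.add(fingerprint(image_bytes))


def is_known_spam(image_bytes):
    """True if this exact image was previously seen in the spam trap."""
    return fingerprint(image_bytes) in known_spam


# Stand-in bytes; a real filter would use the decoded image attachment.
trapped = b"\x89PNG fake image payload"
record_spam_trap_image(trapped)

print(is_known_spam(trapped))            # same image again
print(is_known_spam(trapped + b"\x00"))  # image altered by one byte
```

An exact hash like this one is defeated by changing a single byte, which is the reason spammers alter their images from message to message. Commercial filters presumably use fingerprints that tolerate such small changes.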

The practice of fingerprinting email has become commonplace, and I strongly believe that all commercial spam filters are doing this to some degree.

Thanks for your question.

Do you have a question for EmailKarma.net? Send us an email or leave a comment.

Author: Matt V - @emailkarma

4 Comments

  1. I know this is true for embedded images, but is it also true for images that need to be downloaded from a server or another website like Photobucket?

  2. The FuzzyOCR scan occurs after the body portion of the message is delivered and is set to run on unclassified messages – this should scan images regardless of embedding or the hosting service.

    Unfortunately, services like Photobucket are commonly used by spammers and will likely have different rules associated with them.

  3. After going to some trouble to build, install, and test FuzzyOCR on a local SpamAssassin installation, it appears it does *not* scan linked images, only images that are embedded. If it can scan linked images, there are no obvious ways to configure it to do so.
