Q&A: Is text in images spam-filter proof?

Q: Hello EmailKarma.net,

I’m going back and forth with a co-worker because I have advised that a client remove some of the text from their latest creative. My co-worker is adamant that spam filters can’t read text within an HTML image, and that the message should be fine to send.

If that’s true, do you think it’s about to change? I’ve also heard that emails are being ‘fingerprinted’ and treated accordingly by inbound filters at the ISPs. Do you know about this?

Thanks,
[Name Withheld]

Hi,

Images are no longer a safe haven for text that spam filters might find questionable. There is actually a very well-known image-reading SpamAssassin plugin called FuzzyOCR.

FuzzyOCR uses Optical Character Recognition (OCR, the technology scanning software commonly uses to import paper documents as editable text) to read the text in an image, which allows SpamAssassin to analyze the content of the image. This software became extremely popular over the last 2 years as spammers moved from text-based messages to image-based spam to avoid keyword filtering.
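
To make the OCR approach a little more concrete, here is a minimal Python sketch of what can happen once the text has been pulled out of an image. It assumes the OCR step has already produced a plain string, and it uses the standard library's `difflib` to match words approximately, so that deliberately misspelled words still count. The word list, the `fuzzy_hits` function, and the 0.8 threshold are all hypothetical choices for illustration; this is not the plugin's actual code or scoring.

```python
from difflib import SequenceMatcher

# Hypothetical word list for illustration; a real filter maintains its own.
SPAM_WORDS = ["viagra", "mortgage", "lottery"]


def fuzzy_hits(ocr_text, words=SPAM_WORDS, threshold=0.8):
    """Return the listed words that approximately match a token in ocr_text.

    ocr_text is assumed to be the string an OCR engine extracted from an image.
    threshold is an arbitrary similarity cutoff between 0.0 and 1.0.
    """
    tokens = ocr_text.lower().split()
    hits = set()
    for word in words:
        for token in tokens:
            # ratio() is 1.0 for identical strings and falls as they diverge.
            if SequenceMatcher(None, word, token).ratio() >= threshold:
                hits.add(word)
                break
    return sorted(hits)


if __name__ == "__main__":
    print(fuzzy_hits("Cheap v1agra and low m0rtgage rates"))
```

The point of the approximate match is that swapping a letter for a digit inside the image does not hide the word from the filter.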

Other systems, like the Barracuda Networks spam firewall, use a digital fingerprint, built from the messages received by their spam trap network, to classify spam regardless of the contents. Messages carrying the same image will all have the same fingerprint and are therefore easy to detect and filter. Identical images are especially common in legitimate email traffic, because a legitimate sender does not modify the content or appearance of the image for each message (or group of messages); modifying the image that way is a common tactic used by spammers.
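
As a rough illustration of the fingerprinting idea, the Python sketch below hashes the raw bytes of an image and checks the result against fingerprints collected from spam trap messages. This is only an assumption about how such a lookup could work: the `FingerprintFilter` class and its method names are hypothetical, SHA-256 is my choice for the example, and the algorithms used by commercial filters are not described in this post.

```python
import hashlib


def fingerprint(image_bytes):
    """Return a hex digest identifying this exact sequence of image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


class FingerprintFilter:
    """Hypothetical filter that remembers images seen in spam trap mail."""

    def __init__(self):
        self.known_spam = set()

    def learn_from_spam_trap(self, image_bytes):
        # Record the fingerprint of an image that arrived at a spam trap.
        self.known_spam.add(fingerprint(image_bytes))

    def is_known_spam(self, image_bytes):
        # True only if an identical image was previously learned.
        return fingerprint(image_bytes) in self.known_spam


if __name__ == "__main__":
    spam_filter = FingerprintFilter()
    spam_filter.learn_from_spam_trap(b"fake image bytes")
    print(spam_filter.is_known_spam(b"fake image bytes"))
    print(spam_filter.is_known_spam(b"fake image bytes!"))
```

An exact hash like this changes completely if even one byte of the image changes, which is why varying an image per message can slip past this naive version. A production filter would likely need a fingerprint that tolerates small changes.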

The practice of fingerprinting email has become commonplace, and I strongly believe that all commercial spam filters do this to some degree.

Thanks for your question.

Do you have a question for EmailKarma.net? Send us an email or leave a comment.

4 Comments

Anonymous on March 3, 2009 at 02:19

I know this is true for embedded images. However, is it also true for images that need to be downloaded from a server or another website like Photobucket?

Matt Vernhout - @EmailKarma on March 3, 2009 at 10:08

The FuzzyOCR scan occurs after the body portion of the message is delivered and is set to run on unclassified messages – this should scan images regardless of embedding or the hosting service.

Unfortunately, services like Photobucket are commonly used by spammers and will likely have different rules associated with them.

Justin on March 17, 2009 at 17:25

After going to some trouble to build, install, and test FuzzyOCR on a local SpamAssassin installation, it appears it does *not* scan linked images, only images that are embedded. If it can scan linked images, there are no obvious ways to configure it to do so.

Rob Peter Verhagen on March 18, 2009 at 02:26

Thanks Justin for clearing that up.

About The Author

Matt V - @emailkarma
