Anonymous 18:51

SpamAssassin: increase spam score for emoticons in email subject line

We are receiving more and more spam and marketing mails with emoticons in the subject line, and I want to increase the spam score of such messages using a custom SpamAssassin rule.

It seems the emoticons are not embedded images but ordinary UTF-8 characters, since they show up even when loading of images from email messages is disabled in the email client. There seems to be no way to remove or block them in Outlook or Thunderbird.

My questions are:

  1. Can you confirm that these emoticons are UTF-8 characters, or suggest how I could test that?

  2. I would like to create a custom SpamAssassin rule to increase the spam score of messages that contain these emoticons in the subject line. How would I do this? If they are UTF-8 characters, is there a character range of emoticons I could look for?

Example image of messages: [image: example showing emails with emoticons]



Top Answer/Comment:

I just had a wave of spam with subject lines all starting with emoji, so I made a dedicated rule for those:

header LOCAL_SUBJ_START_EMOJI Subject =~ /^(\xf0\x9f|\xe2[\x98-\x9b])/
score LOCAL_SUBJ_START_EMOJI 1.0

Notes:

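  • This only filters emoji at the start of the subject line. If you remove the ^, the rule matches emoji anywhere in the subject, and your false positive rate might increase: valid UTF-8 never contains these lead bytes in the middle of another character, but the same byte sequences can occur in subjects that use a different multi-byte encoding.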

  • This filters the following Unicode ranges:

    • UTF-8 starting with F0 9F: U+1F000-U+1FFFF, containing various emoji and other pictograms,
    • UTF-8 starting with E2 98/99/9A/9B: U+2600-U+26FF, aka "Miscellaneous Symbols".

    Note that this does not cover all emoji, nor does it cover only emoji (e.g. U+1F000-U+1FFFF also includes arrows and chess/playing card symbols). Adapt as needed.

  • Legitimate mails from kids or marketing departments might also use emoji in the subject line, so don't set the score too high.
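
If you want to check what the pattern catches before deploying it, here is a rough Python sketch. It assumes the subject reaches the rule as UTF-8 bytes. The names EMOJI_AT_START, starts_with_emoji and covered_code_points are made up for this illustration and are not part of SpamAssassin.

```python
import re

# Byte-level pattern mirroring the SpamAssassin rule above (illustrative copy).
EMOJI_AT_START = re.compile(rb"^(\xf0\x9f|\xe2[\x98-\x9b])")


def starts_with_emoji(subject):
    """Return True if the UTF-8 bytes of the subject match the pattern."""
    return EMOJI_AT_START.match(subject.encode("utf-8")) is not None


def covered_code_points():
    """Return (first, last) code point ranges whose UTF-8 form matches the pattern."""
    ranges = []
    start = None
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            matched = False  # surrogates have no UTF-8 encoding
        else:
            matched = EMOJI_AT_START.match(chr(cp).encode("utf-8")) is not None
        if matched and start is None:
            start = cp
        elif not matched and start is not None:
            ranges.append((start, cp - 1))
            start = None
    if start is not None:
        ranges.append((start, 0x10FFFF))
    return ranges


if __name__ == "__main__":
    for first, last in covered_code_points():
        print(f"U+{first:04X}-U+{last:04X}")
    print(starts_with_emoji("\U0001F525 Hot deals today"))
    print(starts_with_emoji("Hot deals today \U0001F525"))
```

Running it lists the code point ranges the byte pattern covers, which makes it easier to see what you add or lose when you adapt the ranges.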


Can you confirm that these emoticons are UTF-8 characters, or suggest how I could test that?

Currently, the e-mail standards (SMTP and the Internet Message Format) do not support embedding images in e-mail headers (and I sure hope it stays that way), so, yes, Unicode characters are currently the only way to get little pictures into subject lines. To verify this, have a look at the raw headers of your e-mail. How to do that depends on the e-mail client you use.
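
In the raw headers, a non-ASCII subject usually does not appear as literal emoji but as a MIME encoded-word, e.g. something starting with =?UTF-8?B? or =?UTF-8?Q?. As a minimal sketch, assuming you have copied such a raw Subject value out of your mail client, Python's standard email.header module can decode it and list the non-ASCII characters it contains. The helper names decode_subject and describe_non_ascii and the sample subject are invented for this example.

```python
import unicodedata
from email.header import decode_header, make_header


def decode_subject(raw_value):
    """Decode a raw Subject header value (with MIME encoded-words) into text."""
    return str(make_header(decode_header(raw_value)))


def describe_non_ascii(text):
    """List (code point, Unicode name) for every non-ASCII character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
        if ord(ch) > 0x7F
    ]


if __name__ == "__main__":
    # Made-up sample: Base64 encoded-word for a subject starting with an emoji.
    raw = "=?UTF-8?B?8J+UpSBIb3QgZGVhbHM=?="
    subject = decode_subject(raw)
    print(subject)
    for code_point, name in describe_non_ascii(subject):
        print(code_point, name)
```

If the listed code points are real Unicode characters rather than references to attached images, the "pictures" in the subject are plain text characters.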
