Greylisting using Kamaelia

September 19, 2007 at 10:49 PM | categories: python, oldblog

I've written a greylisting server using Kamaelia, and it's turned my mail back into something usable. I've been running this server for 52 hours now and it's processed over 5000 mails. 94% of those have been rejected as spam, leaving a handful of spams coming through from mailing lists. It's a spectacular change for me.

How does it work? Well, at its core, when someone connects, a mail handler is created, which is managed by this main loop:
def main(self):
    brokenClient = False
    self.handleConnect()
    self.gettingdata = False
    self.client_connected = True
    self.breakConnection = False

    while (not self.gettingdata) and (not self.breakConnection):
        yield WaitComplete(self.getline(), tag="_getline1")
        try:
            command = self.line.split()
        except AttributeError:
            brokenClient = True
            break
        self.handleCommand(command)
    if not brokenClient:
        if (not self.breakConnection):
            EndOfMessage = False
            self.netPrint('354 Enter message, ending with "." on a line by itself')
            while not EndOfMessage:
                yield WaitComplete(self.getline(), tag="getline2")
                if self.lastline():
                    EndOfMessage = self.endOfMessage()
            self.netPrint("250 OK id-deferred")

    self.send(producerFinished(),"signal")
    if not brokenClient:
        yield WaitComplete(self.handleDisconnect(),tag="_handleDisconnect")
    self.logResult()
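
The second loop above relies on self.lastline() and self.endOfMessage(), which aren't shown here. As a rough sketch of what detecting the end of the DATA phase involves (the function name collect_message_body is made up for illustration and is not part of the server), the lone "." terminator and the dot-stuffing rule from RFC 2821 section 4.5.2 could be handled like this:

```python
def collect_message_body(lines):
    """Collect DATA-phase lines up to, but not including, the lone "." line.

    Returns (body_lines, terminated). terminated is False if the input
    ran out before the terminator was seen (e.g. the client vanished).
    """
    body = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == ".":
            # A "." on a line by itself ends the message
            return body, True
        if line.startswith("."):
            # Dot-stuffing: the sender prefixes an extra "." to any line
            # that begins with one, so strip that first character here.
            line = line[1:]
        body.append(line)
    return body, False
```

This is a plain function over an iterable of lines, so it sidesteps the generator-based yield WaitComplete(...) style the real handler uses.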

handleCommand then dispatches each SMTP command to its own handler method:
def handleCommand(self,command):
    if len(command) < 1:
        self.netPrint("500 Sorry we don't like broken mailers")
        self.breakConnection = True
        return
    if command[0] == "HELO": return self.handleHelo(command) # RFC 2821 4.5.1 required
    if command[0] == "EHLO": return self.handleEhlo(command) # RFC 2821 4.5.1 required
    if command[0] == "MAIL": return self.handleMail(command) # RFC 2821 4.5.1 required
    if command[0] == "RCPT": return self.handleRcpt(command) # RFC 2821 4.5.1 required
    if command[0] == "DATA": return self.handleData(command) # RFC 2821 4.5.1 required
    if command[0] == "QUIT": return self.handleQuit(command) # RFC 2821 4.5.1 required
    if command[0] == "RSET": return self.handleRset(command) # RFC 2821 4.5.1 required
    if command[0] == "NOOP": return self.handleNoop(command) # RFC 2821 4.5.1 required
    if command[0] == "VRFY": return self.handleVrfy(command) # RFC 2821 4.5.1 required
    if command[0] == "HELP": return self.handleHelp(command)
    self.netPrint("500 Sorry we don't like broken mailers")
    self.breakConnection = True

In practical terms, MailHandler is subclassed by a ConcreteMailHandler that effectively enforces the normal sequence of SMTP commands. The core hook, however, comes when we receive the DATA command:
def handleData(self, command):
    if not self.seenRcpt:
        self.error("503 valid RCPT command must precede DATA")
        return

    if self.shouldWeAcceptMail():
        self.acceptMail()
    else:
        self.deferMail()
Clearly the main hook here is "shouldWeAcceptMail" which defaults in ConcreteMailHandler to returning False.
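
To make the "default deny, override to allow" shape of that hook concrete, here is a stripped-down Python 3 sketch. The class names HandlerSketch and LocalOnlyPolicy are invented for the example, and the log list stands in for the real acceptMail/deferMail calls:

```python
class HandlerSketch:
    """Hypothetical cut-down ConcreteMailHandler: defers unless told otherwise."""

    def __init__(self, peer):
        self.peer = peer
        self.seenRcpt = False
        self.log = []                # records what the handler decided

    def shouldWeAcceptMail(self):
        return False                 # safe default: never accept

    def handleData(self, command):
        if not self.seenRcpt:
            self.log.append("503")
            return
        if self.shouldWeAcceptMail():
            self.log.append("accept")
        else:
            self.log.append("defer")


class LocalOnlyPolicy(HandlerSketch):
    """Example policy: only the local machine may send through us."""

    def shouldWeAcceptMail(self):
        return self.peer == "127.0.0.1"
```

Because the base class answers False, forgetting to override the hook fails closed: mail is deferred rather than relayed.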

In the actual class we instantiate to handle connections - GreyListingPolicy which subclasses ConcreteMailHandler - we customise shouldWeAcceptMail as follows:
def shouldWeAcceptMail(self):
    if self.sentFromAllowedIPAddress():
        return True           # Allowed hosts can always send to anywhere through us
    if self.sentFromAllowedNetwork():
        return True           # People on trusted networks can always do the same
    if self.sentToADomainWeForwardFor():
        try:
            for recipient in self.recipients:
                if self.whiteListed(recipient):
                    return True
                if not self.isGreylisted(recipient):
                    return False
        except Exception, e:
            print "Whoops", e
        return True # Anyone can always send to hosts we own

    # print "NOT ALLOWED TO SEND, no valid forwarding"
    return False

Finally the actual core code for handling greylisting looks like this:
def isGreylisted(self, recipient):
    max_grey = 3000000
    too_soon = 180
    min_defer_time = 3600
    max_defer_time = 25000

    IP = self.peer
    sender = self.sender
    def _isGreylisted(greylist, seen, IP,sender,recipient):
        # If greylisted, and not been there too long, allow through
        if greylist.get(triplet,None) is not None:
            greytime = float(greylist[triplet])
            if (time.time() - greytime) > max_grey:
                del greylist[triplet]
                try:
                    del seen[triplet]
                except KeyError:
                    # We don't care if it's already gone
                    pass
                print "REFUSED: grey too long"
            else:
                print "ACCEPTED: already grey (have reset greytime)" ,
                greylist[triplet] = str(time.time())
                return True
        # If not seen this triplet before, defer and note triplet
        if seen.get( triplet, None) is None:
            seen[triplet] = str(time.time())
            print "REFUSED: Not seen before" ,
            return False

        # If triplet retrying waaay too soon, reset their timer & defer
        last_tried = float(seen[triplet])
        if (time.time() - last_tried) < too_soon:
            seen[triplet] = str(time.time())
            print "REFUSED: Retrying waaay too soon so resetting you!" ,
            return False
   
        # If triplet retrying too soon generally speaking just defer
        if (time.time() - last_tried) < min_defer_time :
            print "REFUSED: Retrying too soon, deferring" ,
            return False
   
        # If triplet hasn't been seen in aaaages, defer
        if (time.time() - last_tried) > max_defer_time :
            seen[triplet] = str(time.time())
            print "REFUSED: Retrying too late, sorry - resetting you!" ,
            return False
   
        # Otherwise, allow through & greylist them
        print "ACCEPTED: Now added to greylist!" ,
        greylist[triplet] = str(time.time())
        return True

    greylist = anydbm.open("greylisted.dbm","c")
    seen = anydbm.open("attempters.dbm","c")
    triplet = repr((IP,sender,recipient))
    result = _isGreylisted(greylist, seen, IP,sender,recipient)
    seen.close()
    greylist.close()
    return result
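
For anyone who wants to experiment with the timing rules without a mail server or dbm files, here is a sketch of the same decision logic in Python 3. It is an approximation rather than the code in SVN: plain dicts replace the anydbm files, times are kept as floats rather than strings, the print calls are dropped, and the now argument is an addition of mine so the clock can be faked.

```python
import time

MAX_GREY = 3000000    # seconds a greylist entry stays valid without use
TOO_SOON = 180        # retrying faster than this resets the timer
MIN_DEFER = 3600      # must wait at least this long before being accepted
MAX_DEFER = 25000     # waiting longer than this counts as a fresh attempt


def is_greylisted(triplet, greylist, seen, now=None):
    """Return True to accept mail for this (IP, sender, recipient) triplet."""
    now = time.time() if now is None else now

    # Already greylisted: accept and refresh, unless the entry has expired
    if triplet in greylist:
        if now - greylist[triplet] > MAX_GREY:
            del greylist[triplet]
            seen.pop(triplet, None)
        else:
            greylist[triplet] = now
            return True

    # Never seen before: note the triplet and defer
    if triplet not in seen:
        seen[triplet] = now
        return False

    waited = now - seen[triplet]
    if waited < TOO_SOON:       # retrying way too soon: reset timer, defer
        seen[triplet] = now
        return False
    if waited < MIN_DEFER:      # retrying too soon: just defer
        return False
    if waited > MAX_DEFER:      # retrying too late: reset timer, defer
        seen[triplet] = now
        return False

    greylist[triplet] = now     # retried inside the window: accept from now on
    return True
```

With these constants, a sender is accepted only if it retries between one hour and roughly seven hours after an attempt that started (or restarted) its timer.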

All of which is pretty compact, and I suspect pretty OK for people to follow. The rest of the code in the file is really about dealing with errors and abuse of the SMTP code (the reaction to which is to disconnect, telling the sender to retry later).

At present I'm ironing out some remaining issues (some people simply don't disconnect and need booting), and the code also depends on versions of Axon & Kamaelia that are sitting on my Scratch branch. All that said, you can check out the code using this command line:
svn co https://kamaelia.svn.sourceforge.net/svnroot/kamaelia/trunk/Sketches/MPS/Grey Grey
You can get the Axon & Kamaelia versions you need from this command line:
svn co https://kamaelia.svn.sourceforge.net/svnroot/kamaelia/branches/private_MPS_Scratch Kamaelia
Install the contents of the Axon directory, then the Kamaelia directory by doing "python setup.py install" in each.

You can then configure the greylisting code, by changing the class GreylistServer, which for me looks like this:
class GreylistServer(MoreComplexServer):
    socketOptions=(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    port = 25
    class protocol(GreyListingPolicy):
        servername = "mail.cerenity.org"  # Server name we greet the world with
        serverid = "MPS-SMTP 1.0"         # Server type we declare ourselves to be
        smtp_ip = "192.168.2.9"  # SMTP server we forward to
        smtp_port = 8025         # SMTP server port we forward to
        allowed_senders = ["127.0.0.1"]
        allowed_sender_nets = ["192.168.2"] # Yes, only class C network style
        allowed_domains = [ "private.thwackety.com",
                            "thwackety.com",
                            "yeoldeclue.com",
                            ... other domains snipped ...
                            "kamaelia.org",
                            "owiki.org",
                            "cerenity.org"
        ]
        whitelisted_triples = [
             # IP, claimed sender (MAIL FROM:), recipient from "RCPT TO:"
             ( "213.38.186.202", "<post@mx1.redcats.co.uk>", "<...email censored...>"),
        ]
        whitelisted_nonstandard_triples = [
             # claimed hostname, IP prefix (can be full IP), recipient from "RCPT TO:"
             ("listmail.artsfb.org.uk", "62.73.155.19", "<...email censored...>"),
             ("domainwithborkedmailer.com", "204.15.20", "<...email censored...>"),
             ("adomainwithborkedmailer.com", "204.15.20", "<...email censored...>"),
             ("yetanotherdomainwithborkedmailer.com", "204.15.20", "<...email censored...>"),
             ("andanotherdomainwithborkedmailer.com", "204.15.20", "<...email censored...>"),
        ]
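
The code that consumes allowed_sender_nets and whitelisted_nonstandard_triples isn't shown in this post, so the following Python 3 sketch is only a guess at how such prefix-style entries might be matched. All three function names are hypothetical, and the addresses in the usage lines are documentation placeholders.

```python
def ip_matches_prefix(ip, prefix):
    """True if ip equals prefix, or sits under it on a dotted-octet boundary.

    Comparing against prefix + "." stops "192.168.2" from also matching
    "192.168.20.1", which a bare startswith() would allow.
    """
    return ip == prefix or ip.startswith(prefix.rstrip(".") + ".")


def sent_from_allowed_network(peer_ip, allowed_sender_nets):
    return any(ip_matches_prefix(peer_ip, net) for net in allowed_sender_nets)


def nonstandard_whitelisted(claimed_host, peer_ip, recipient, triples):
    """Check a (claimed hostname, IP prefix, recipient) whitelist."""
    for host, prefix, rcpt in triples:
        if (host == claimed_host and rcpt == recipient
                and ip_matches_prefix(peer_ip, prefix)):
            return True
    return False


# Usage with placeholder values
nets = ["192.168.2"]
triples = [("listmail.example.org", "203.0.113", "<me@example.net>")]
on_lan = sent_from_allowed_network("192.168.2.17", nets)
listed = nonstandard_whitelisted(
    "listmail.example.org", "203.0.113.9", "<me@example.net>", triples)
```

Matching on a prefix rather than a full IP is what lets one entry cover a mailer that sends from several machines in the same block.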

I've blanked out the email addresses, since there's no point in encouraging more spam... :-)

I'll be packaging this up properly at some point when I'm happy with the code. In the meantime if anyone grabs it and uses it from SVN, I'd be interested in hearing how you get on :-)

Ye Old Cue - Viewing multiple posts

September 13, 2007 at 09:08 AM | categories: python, oldblog

Posted here --- Michael Sparks's blog, Ye Old Cue, proposes a new solution to the old problem of displaying posts, using a preview window and showing the posts as something very much like post-its, if you'll pardon the redundancy. The blog is also open for anyone to leave a post, with no need to register.

Ruby Based Kamaelia Core (miniaxon.rb)

September 10, 2007 at 11:07 PM | categories: python, oldblog

I finally got around to learning sufficient ruby to write a Mini Axon in Ruby - which means there is a basic Kamaelia core in Ruby available now. This means the following code is valid Ruby code and the components work in exactly the same way as Python based Kamaelia components:
class Producer < Component
   @@name = "Producer"
   def initialize(message)
      super
      @message = message
   end
   def main
      loop do
         yield 1
         send @message, "outbox"
         showboxes if $debug
      end
   end
end

class Consumer < Component
   @@name = "Consumer"
   def main
      count = 0
      loop do
         yield 1
         count = count +1
         if dataReady("inbox")
            data = recv("inbox")
            print data, " ", count, "\n"
         end
      end
   end
end

p = Producer.new("Hello World")
c = Consumer.new()
postie = Postman.new(p, "outbox", c, "inbox")

myscheduler = Scheduler.new()
myscheduler.activateMicroprocess(p)
myscheduler.activateMicroprocess(c)
myscheduler.activateMicroprocess(postie)

run(myscheduler)
(yes, I bought 2 ruby books at pycon uk, so shoot me :-) )

This compares fairly well with the equivalent python mini-axon code:
class Producer(component):
    def __init__(self, message):
        super(Producer, self).__init__()
        self.message = message
    def main(self):
        while 1:
            yield 1
            self.send(self.message, "outbox")

class Consumer(component):
    def main(self):
        count = 0
        while 1:
            yield 1
            count += 1 # This is to show our data is changing :-)
            if self.dataReady("inbox"):
                data = self.recv("inbox")
                print data, count

p = Producer("Hello World")
c = Consumer()
postie = postman(p, "outbox", c, "inbox")

myscheduler = scheduler()
myscheduler.activateMicroprocess(p)
myscheduler.activateMicroprocess(c)
myscheduler.activateMicroprocess(postie)

for _ in myscheduler.main():
    pass
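
The listing above assumes component, postman and scheduler already exist. As a minimal sketch of what those three might look like (in Python 3, written for this page, and not the actual mini-axon file), the following is enough to make a producer/consumer pair run. Collector and demo are names I've added so the result can be inspected instead of printed, and the cycles limit is an assumption to keep the run finite.

```python
class microprocess:
    """Anything whose main() is a generator the scheduler can step."""
    def main(self):
        yield 1


class component(microprocess):
    def __init__(self):
        super().__init__()
        self.boxes = {"inbox": [], "outbox": []}

    def send(self, value, boxname):
        self.boxes[boxname].append(value)

    def recv(self, boxname):
        return self.boxes[boxname].pop(0)

    def dataReady(self, boxname):
        return len(self.boxes[boxname]) > 0


class postman(microprocess):
    """Moves data from one component's outbox to another's inbox."""
    def __init__(self, source, sourcebox, sink, sinkbox):
        super().__init__()
        self.source, self.sourcebox = source, sourcebox
        self.sink, self.sinkbox = sink, sinkbox

    def main(self):
        while True:
            yield 1
            if self.source.dataReady(self.sourcebox):
                self.sink.send(self.source.recv(self.sourcebox), self.sinkbox)


class scheduler(microprocess):
    def __init__(self):
        super().__init__()
        self.active, self.newqueue = [], []

    def activateMicroprocess(self, someprocess):
        self.newqueue.append(someprocess.main())

    def main(self, cycles=100):
        for _ in range(cycles):
            for current in self.active:
                yield 1
                try:
                    next(current)                  # give it one time slice
                    self.newqueue.append(current)  # still alive: requeue
                except StopIteration:
                    pass                           # finished: drop it
            self.active, self.newqueue = self.newqueue, []


class Producer(component):
    def __init__(self, message):
        super().__init__()
        self.message = message

    def main(self):
        while True:
            yield 1
            self.send(self.message, "outbox")


class Collector(component):
    """Like Consumer above, but keeps what it receives instead of printing."""
    def __init__(self):
        super().__init__()
        self.received = []

    def main(self):
        while True:
            yield 1
            if self.dataReady("inbox"):
                self.received.append(self.recv("inbox"))


def demo(cycles=10):
    p, c = Producer("Hello World"), Collector()
    postie = postman(p, "outbox", c, "inbox")
    myscheduler = scheduler()
    for proc in (p, c, postie):
        myscheduler.activateMicroprocess(proc)
    for _ in myscheduler.main(cycles):
        pass
    return c.received
```

Each main() is an ordinary generator, so "concurrency" here is just the scheduler calling next() on each one in turn.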

The code for this miniaxon is here in subversion:
The first class in that file (Coroutine) and first utility function there come from this entry:
What's next for this? I don't really know. As a proof of concept it's interesting: it's *as* capable as the equivalent Python-based mini-axon, and could in theory give Ruby developers the same boost that we get in Python. I've no intention of doing a massive rewrite of existing Python Kamaelia code in Ruby, but it does open up some interesting options.

Language agnosticism is something I've always wanted for Kamaelia and this, along with Michael Barker's  Java based experiments (and the fact that jython can run mini axon too), imply to me that the approach really can be language agnostic as was always intended :-)


Erlang vs Stackless

August 02, 2007 at 12:16 AM | categories: python, oldblog

Not sure how valid the following benchmark is, but it's an interesting datapoint.

More People on Facebook than live in Iraq

July 31, 2007 at 11:08 AM | categories: python, oldblog

Interesting factoid - there are over 30 million people on Facebook. That means if facebook was a nation, it would be the 39th (possibly 38th) largest country in the world according to the list of countries by population page of Wikipedia. This puts it in the spot above Iraq and below Uganda.

Greylisting for non-techies

July 31, 2007 at 12:03 AM | categories: python, oldblog

Greylisting is like magical glass for mystical flies. Real email is delivered by real flies, whereas spam email is delivered by fake mystical flies. The difference is that fake mystical flies don't bang their heads against the glass repeatedly when they hit a window to try and get through; they just bounce off and don't try again. Real flies, however, do bang their heads against the window repeatedly until they get through.

Now greylisting is like magical glass that can tell real flies from mystical flies because it can see that this sort of fly (since flies are all numbered, as we all know) is willing to bang its head against the glass repeatedly. As a result, the first time it sees a new sort of fly it tests the fly: does it bang its head repeatedly to get through, or does it give up? If it gives up, the glass will never let that fly deliver its message. However, if it proves its worth as a real fly and retries, then the magical glass will, from that day forward, always allow that fly (they're all numbered, remember) through to deliver the real email it carries.

Now obviously, the first time a new fly is seen, the magical glass has to check whether it is a real fly or a mystical fly. This takes a little while, so the email that fly is delivering can be delayed the first time, but it does reduce the spam you receive, so it's not all bad :-)



Design Thinking Links

July 20, 2007 at 10:38 AM | categories: python, oldblog

A bunch of links nabbed from Michael Tiemann's OSI blog
I like this because the second from last link focusses on design books, 3 of which I already own, though it would probably be good to read the other 7.

It also matches the way I tend to manage development in Kamaelia's SVN: it's set up to encourage a high diversity of ideas, large numbers of checkins, and rigorously stable and clean code in releases. Design is crucial in making something new. Taking that design and moving it to engineering is just as vital, however, and there are some very important steps to bear in mind in how that happens (not least that the skill sets can be extremely different, and many people need to learn at least one of them).


Python, Nokia Mobiles, Easy control from linux; Sync of profile/blog picture with facebook

July 19, 2007 at 01:41 AM | categories: python, oldblog

Quick and simple access to the BT console (based on notes here)
# sdptool add-channel=27 SP
# rfcomm listen /dev/rfcomm0 27
Waiting for connection on channel 27
Connection from 00:11:22:33:44:55 to /dev/rfcomm0
Press CTRL-C for hangup
Then connect using btconsole.py on the phone and, in a different console on Linux, run:
minicom -s -m (set device to /dev/rfcomm0)
Also, facebook image syncing:
curl 2>/dev/null -O -A "Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1) Gecko/20061023 SUSE/2.0-30 Firefox/2.0" http://www.facebook.com/p/Michael_Sparks/747770380
F=`grep profileimage 747770380|sed -e "s/^.*http:/http:/"|sed -e "s/\.jpg.*/\.jpg/"`
G=`grep profileimage 747770380|sed -e "s/^.*http:/http:/"|sed -e "s/\.jpg.*/\.jpg/"|sed "s#.*/##"`
curl 2>/dev/null -O $F
mv $G michael.jpg
cp michael.jpg /usr/local/httpd/sites/com.yeoldeclue/docs/michael.jpg
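
Since the grep/sed pipeline is a bit dense, here is a sketch of the same extraction step in Python 3. The function names are invented, the sample HTML is a made-up placeholder rather than real Facebook markup, and unlike the shell version it returns only the first matching line's URL and skips lines with no .jpg in them.

```python
def profile_image_url(html):
    """Mimic: grep profileimage | sed 's/^.*http:/http:/' | sed 's/\\.jpg.*/.jpg/'

    The first sed is greedy, so it keeps from the LAST "http:" on the line;
    the second cuts everything after the FIRST ".jpg" that follows.
    """
    for line in html.splitlines():
        if "profileimage" not in line:
            continue
        start = line.rfind("http:")
        if start == -1:
            continue
        tail = line[start:]
        end = tail.find(".jpg")
        if end == -1:
            continue
        return tail[:end + len(".jpg")]
    return None


def url_basename(url):
    """Mimic: sed 's#.*/##' (drop everything up to the last slash)."""
    return url.rsplit("/", 1)[-1]


# Usage with placeholder markup
sample = (
    '<div>nothing here</div>\n'
    '<img id="profileimage" src="http://example.invalid/pics/n1_2.jpg" alt="">\n'
)
F = profile_image_url(sample)
G = url_basename(F)
```

F and G play the same roles as the shell variables above: the full image URL and the filename curl would save it under.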


Configuring Exim to block email to all except specified addresses

May 20, 2007 at 05:41 PM | categories: python, oldblog

Ever needed to only allow emails from specific people through to specific addresses using exim? If you have then hopefully this post is of use to you. I'm writing it up here because it's proving useful to me right now.

Blocking email to all addresses except specific ones using exim is pretty easy. First of all create one file /etc/blocked_emails.list, and add to it a list of email addresses which are blocked:
foo@bar.com
bibble@bar.com
etc@bar.com
Next step is to create a list of addresses those emails can send to. Put these into a file called /etc/exceptions.list and list one local part per line - for example:
john
bob
rita

You then have two possible modes here. You can either defer accepting the email, so it takes a while to bounce, or deny delivery immediately. The former is in many cases actually preferable, because the sender will assume it's been delivered and only find out it's bounced, with a relatively innocuous error message, some days later. Given you only tend to block people because they're being OTT, this gives them a chance to cool off and for any nasty messages to be lost, unread, in the ether.

To have the mail system defer delivery of email from any of the blocked senders to any address other than those in the exceptions list, put the following in your Exim ACL rules for RCPT checking:

begin acl

acl_check_rcpt:

  accept local_parts   = /etc/exceptions.list
         senders = /etc/blocked_emails.list

  defer   message = Mailbox full, retry later
          senders = /etc/blocked_emails.list


The message is deliberately innocuous. However if the person (or persons) ramps up their antisocial behaviour and doesn't take the hint, you can change this to instantly deny access and send a message back immediately rather than 4-24 hours later by changing defer to deny:

begin acl

acl_check_rcpt:

  accept local_parts   = /etc/exceptions.list
         senders = /etc/blocked_emails.list

  deny   message = Your email has not been and will not be delivered - it has been blocked
          senders = /etc/blocked_emails.list
It's really sad when things come to this. There is an advantage to using list files like this, however: you only need to edit the contents of blocked_emails.list and exceptions.list in order to re-allow emails through, or to block access completely.
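
To check your understanding of how the two ACL statements interact, the decision can be modelled in a few lines of Python 3. This is a simplified sketch, not how Exim evaluates ACLs internally: check_rcpt and its arguments are names I've made up, matching is done on exact strings, and "continue" stands for "neither statement matched, so carry on to whatever rules follow in acl_check_rcpt".

```python
def check_rcpt(sender, recipient, blocked_senders, exception_local_parts,
               blocked_verb="defer"):
    """Model the accept/defer (or accept/deny) pair shown above.

    blocked_verb is "defer" for the soft variant and "deny" for the hard one.
    """
    local_part = recipient.split("@", 1)[0]
    if sender in blocked_senders:
        # First statement: a blocked sender may still reach the exceptions
        if local_part in exception_local_parts:
            return "accept"
        # Second statement: everything else from a blocked sender is refused
        return blocked_verb
    # Senders not on the blocked list are untouched by these two statements
    return "continue"


# Usage with the example list contents from earlier in the post
blocked = {"foo@bar.com", "bibble@bar.com", "etc@bar.com"}
exceptions = {"john", "bob", "rita"}
to_john = check_rcpt("foo@bar.com", "john@example.org", blocked, exceptions)
to_other = check_rcpt("foo@bar.com", "alice@example.org", blocked, exceptions)
```

The order matters: the accept statement has to come first, otherwise the defer/deny would catch blocked senders before the exceptions were ever consulted.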

In case anyone is wondering why I know these rules and why I'm writing it up - it's because I'm in the situation where I'm having to use this right now.


Come to PyCon UK - September 8th&9th !

April 20, 2007 at 11:26 PM | categories: python, oldblog

The PyCon UK Society has announced a UK Python Conference. This is an affordable community conference taking place on 8th/9th September. The conference is fantastic value, especially if you take advantage of the extra early bird booking offer. Both new and experienced Python programmers will benefit from the varied programme.

Why am I posting about this? I'd personally like to invite UK pythonistas to come, share their knowledge with others, learn new things and hang out. It's a community conference, which means it has the following characteristics:

  • You can help make it amazing, by participating & speaking, by helping, by attending!
  • It is cheap
  • It will be fun, and accessible. We (I'm helping organise this :) ) really want the conference to be accessible to all, from those who have no idea of what python is, let alone coded in it, through to those who are working on their umpteenth bytecode hack/compiler.
Seriously though aside from this, I'm really posting about this because despite being a language pragmatist - ie I'll use any language that gets the job done - I'm largely finding that I'll use python for almost everything these days. Community conferences are a real opportunity to dive in and help and learn and share. Like python itself, the conference is also platform agnostic, so you're welcome if you use Windows, Mac OS X, Solaris, FreeBSD, or even the same OS as me - Linux :) .
