.TH qmail-spamthrottle 5

.SH NAME
qmail-spamthrottle \- the qmail spam throttle mechanism

.SH INTRODUCTION
Spam throttling was conceived after would-be spammers found it easy
to circumvent (classic) tarpitting.  A tarpitting recipient limit must
be high enough not to disrupt acceptable mail usage, so spam clients
simply open multiple SMTP connections, each of which stays under the
threshold.  Other sources describe similar concepts under names such as
rate limiting and stuttering.

Spam throttling was originally intended for use at ISPs to control their
internal clients' (users') SMTP usage, although it can be applied equally in other
environments.  An ISP may wish to enable this mechanism for its customers
to prevent them from using the mail servers as a convenient location
from which to send spam.  However, in some or all other cases (other
originating IP addresses) this mechanism might be disabled to allow
for legitimate high-volume mail traffic such as mailing lists.

Spam throttling acts in a similar manner to tarpitting, except that
it is highly parameterized, more flexible, and (hopefully) more effective.
A wait is imposed (via
.BR sleep (3))
following the
.B DATA
command depending on these SMTP parameters: remote IP address;
previous SMTP connection timestamp; and previous wait time.

With the addition of teergrubing, spammers should keep their
connections open and deliver less mail.


.SH DETAILS
Two files,
.I wait
and
.IR time ,
store the previous wait time and SMTP connection timestamp,
respectively.  Both files are found in
.BR /var/qmail/spam/\fIdir\fB ,
where
.I dir
is based on parameters set in
.BR /var/qmail/control/spamt .
If
.I dir
is empty as a result, then it will be automatically set to
.IR a\fR/\fIb /0/0,
where
.I a
and
.I b
are the first two octets (in decimal) of the remote IP address,
.IR a\fR.\fIb\fR.\fIc\fR.\fId .

Similarly, if
.I dir
starts with a slash (\fB/\fR), then it will be automatically set to
the \fIn\fR-bit masked IP address (format
.IR \fR[/\fIn ]),
based on the remote IP address.

See
.BR qmail-spamt (5)
for details.

.B Note:
When
.I dir
is empty (or starts with a slash), as described above, every
dot (\fB.\fR) in the resulting address is replaced by a slash (\fB/\fR)
when constructing the directory in which the spam throttle state files
are stored.
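
As a rough illustration of the mapping described above, the following
Python sketch derives the state directory from a remote IP address and
an \fIn\fR-bit mask.  The function name \fBthrottle_dir\fR and its
arguments are assumptions made for this example only; they are not part
of the spam throttle code.
.EX
```python
import ipaddress

def throttle_dir(remote_ip, bits, root="/var/qmail/spam"):
    """Return the state directory for remote_ip masked to `bits` bits."""
    # strict=False accepts a host address and masks it for us
    net = ipaddress.ip_network("{}/{}".format(remote_ip, bits), strict=False)
    # every dot in the masked address becomes a slash
    return root + "/" + str(net.network_address).replace(".", "/")

print(throttle_dir("192.168.5.7", 16))
print(throttle_dir("192.168.5.7", 32))
```
.EE
With a 16-bit mask the last two octets are zeroed, which matches the
.IR a\fR/\fIb /0/0
form used when
.I dir
is empty.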


If you are using libtai for your time calculations,
then the format for the
.I time
file is a packed TAI64NA label.  If you have perl and the tai64nlocal
program, you can use the following perl expression to convert a packed
TAI64NA label to a TAI64N timestamp, which tai64nlocal can then
render as local time:

.EX
	print join("","@",unpack("H24",<>)), "\en";
.EE
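
For readers without perl, the same conversion can be sketched in Python.
A packed TAI64NA label is 16 bytes: 8 bytes of seconds, 4 of nanoseconds,
and 4 of attoseconds, all big-endian; a TAI64N timestamp is the first
12 bytes in hexadecimal, prefixed with an at sign.  The function names
and the sample bytes below are invented for illustration.
.EX
```python
import struct

def split_tai64na(packed):
    """Split a 16-byte packed TAI64NA label into its three fields."""
    # big-endian: 8-byte seconds, 4-byte nanoseconds, 4-byte attoseconds
    seconds, nano, atto = struct.unpack(">QII", packed[:16])
    return seconds, nano, atto

def tai64n_label(packed):
    """Return the TAI64N timestamp string for the first 12 bytes."""
    return "@" + packed[:12].hex()

# made-up sample label, not taken from a real time file
sample = bytes.fromhex("400000003b9aca00" "0000000a" "00000000")
print(tai64n_label(sample))
```
.EE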


Given an entry in
.BR /var/qmail/control/spamt ,
such as

.EX
   ipblock:dir:st:stmax:flush:rcpt:tg:tg_resp:
.EE

message throughput is controlled via the value of
.IR st .
The delays imposed (by calling 
.BR sleep (3))
depend on: the value of \fIst\fR; the number of recipients
for the current SMTP session (\fIR\fR); the number of reasonable
recipients per connection (\fIrcpt\fR); how much time has
passed (\fIT\fR) since the last SMTP request (as determined by
.BR /var/qmail/spam/\fIdir\fB/time );
and the last imposed delay (\fIW\fR) (as determined by
.BR /var/qmail/spam/\fIdir\fB/wait ).
The new delay is approximately
.EX

    (\fIR\fR - \fIR\fR / 2^(\fIR\fR/\fIrcpt\fR)) * ((\fIW\fR * \fIst\fR * \fIR\fR) / \fIT\fR)

.EE
when \fIrcpt\fR is greater than 0, and
.EX

     (\fIW\fR * \fIst\fR * \fIR\fR) / \fIT\fR

.EE
otherwise.  The unit of time is milliseconds.

If
.I stmax
is defined (and is non-zero), then it is used as a maximum
(in milliseconds) for the delay calculated above.
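
The calculation above, including the
.I stmax
limit, can be sketched in Python as follows.  This is only an
approximation of the formula as written here: the use of floating-point
arithmetic and the guard against a zero \fIT\fR are assumptions, and the
function name \fBnew_delay\fR is hypothetical.
.EX
```python
def new_delay(R, rcpt, W, st, T, stmax=0):
    """Approximate new delay in milliseconds.

    R: recipients in this session, rcpt: reasonable recipients,
    W: last imposed delay, st: throttle value, T: time since last request.
    """
    if T <= 0:
        # assumption: treat a zero interval as one millisecond
        T = 1
    delay = (W * st * R) / T
    if rcpt > 0:
        # scale by the recipient factor when rcpt is set
        delay = (R - R / 2 ** (R / rcpt)) * delay
    if stmax:
        # a non-zero stmax caps the result
        delay = min(delay, stmax)
    return delay

print(new_delay(R=2, rcpt=0, W=1000, st=1500, T=3000))
print(new_delay(R=2, rcpt=0, W=1000, st=1500, T=3000, stmax=500))
```
.EE
Note that a longer gap \fIT\fR since the previous request shrinks the
delay, while more recipients or a larger previous delay increase it.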

In short,
.I st
is roughly the minimum time (in milliseconds) between messages and/or
connections.  If you already know that you want a throughput of at most
\fIN\fR messages per second, then 1000/\fIN\fR is a good starting point for
.IR st .

.SH CONFIGURATION

For the following discussion, we assume that the matching
entry in
.B /var/qmail/control/spamt
is

.EX
   ipblock:dir:st:stmax:flush:rcpt:tg:tg_resp:
.EE

Despite efforts to impose a waiting period on would-be
spammers, it is still possible for the client to circumvent
the call to
.BR sleep (3).
That is, the client may not wait for the response to
the DATA command; it may simply continue writing its message, assuming
success, and then close the socket, again without waiting for a
response from the server.  The message will then be delivered at no
(time) cost to the sender.  Adherence to standards (such as respecting
the absence of PIPELINING) should not be assumed for clients
acting as agents for unsolicited bulk email.  As such, the
.I flush
variable can be set (non-zero) to indicate that all input will
be flushed after calling
.BR sleep (3)
and prior to sending a response to the DATA command.
RFC 2920 (STD 60) prohibits flushing of the input buffer if
PIPELINING is supported.  As such, EHLO responses will not
advertise PIPELINING while 
.I flush
is set.

Another method, teergrubing, involves issuing continuation lines
periodically to keep the client connected while they wait for the
go ahead from the DATA command.  By setting the variable
.I tg
to a non-zero value, you can specify the interval (in seconds) between
continuation lines sent in response to the DATA command.  If the argument to
.BR sleep (3)
would have been 11 (seconds) and
.I tg
is set to 2, then the response to the DATA command would involve
five 2-second sleeps followed by one 1-second sleep, each accompanied
by a continuation line.  A continuation line consists of a 3-digit code,
a dash, and an arbitrary string.  The default string is "please wait",
but it can be changed using the
.I tg_resp
variable.  For example,

.EX
     ...
     DATA
     354-please wait
     354-please wait
     354 go ahead
     ...
.EE
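
The way a single wait is split into shorter sleeps can be sketched in
Python as follows.  The function names are hypothetical, and whether the
continuation line is written before or after each sleep is not specified
here, so the sketch only lists the reply lines that would be produced.
.EX
```python
def teergrube_sleeps(wait, tg):
    """Split a wait of `wait` seconds into sleeps of at most `tg` seconds."""
    if tg <= 0:
        # teergrubing disabled: one uninterrupted sleep
        return [wait] if wait > 0 else []
    sleeps = [tg] * (wait // tg)
    if wait % tg:
        sleeps.append(wait % tg)
    return sleeps

def data_response(wait, tg, tg_resp="please wait"):
    """Return the reply lines for DATA, one continuation line per sleep."""
    if tg <= 0:
        return ["354 go ahead"]
    lines = ["354-" + tg_resp for _ in teergrube_sleeps(wait, tg)]
    return lines + ["354 go ahead"]

print(teergrube_sleeps(11, 2))
```
.EE
For a wait of 11 seconds and
.I tg
set to 2, this yields five sleeps of 2 seconds and one of 1 second.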


.SH ENVIRONMENT
The environment variable
.B TCPREMOTEIP
is required by spam throttle.  If you are not using
.BR tcpserver ,
then you will have to use
.B tcp-env
to ensure that
.B TCPREMOTEIP
is set.


.SH CAVEATS
The implicit translation of an empty directory to one based on the
remote IP address will most certainly result in an unwieldy spam
directory structure and should be reserved for small networks, such
as the internal network side of an office or ISP (including ISP users).
It is recommended that the
.I \fR/\fIn
format be used in the default
.B /var/qmail/control/spamt
entry (empty network block).  Then, for specific networks, a directory
per IP address is still possible: for example, the entries

.EX
   192.168.0.0/24:/32:::::::
   :/16:1500:120000:::::
.EE

define the default spam throttle directory (assuming the remote IP address is
.IR a\fR.\fIb\fR.\fIc\fR.\fId )
as
.IR a\fR/\fIb /0/0.
However, when the remote IP address is in the 192.168.0.0/24 network block,
the spam throttle directory will be
.IR a\fR/\fIb\fR/\fIc\fR/\fId ,
since the
.I dir
parameter is
.BR /32 .
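
To make the field layout concrete, the following Python sketch splits
entries of this form into named fields and picks one for a given remote
IP address.  The precedence rule used (a specific network block wins over
the default entry) is an assumption based on the description above, and
all function names are hypothetical; see
.BR qmail-spamt (5)
for the actual matching rules.
.EX
```python
import ipaddress

FIELDS = ("ipblock", "dir", "st", "stmax", "flush", "rcpt", "tg", "tg_resp")

def parse_entry(line):
    """Split one spamt line into a dict keyed by field name."""
    parts = line.split(":")
    # pad or trim to the eight documented fields
    parts = (parts + [""] * len(FIELDS))[:len(FIELDS)]
    return dict(zip(FIELDS, parts))

def matches(entry, remote_ip):
    """True if remote_ip lies in the entry's ipblock (empty = default)."""
    if not entry["ipblock"]:
        return True
    net = ipaddress.ip_network(entry["ipblock"], strict=False)
    return ipaddress.ip_address(remote_ip) in net

def select_entry(entries, remote_ip):
    """Assumed precedence: first specific match, then the default entry."""
    specific = [e for e in entries if e["ipblock"] and matches(e, remote_ip)]
    default = [e for e in entries if not e["ipblock"]]
    found = specific or default
    return found[0] if found else None

entries = [parse_entry(s) for s in (
    "192.168.0.0/24:/32:::::::",
    ":/16:1500:120000:::::",
)]
print(select_entry(entries, "192.168.0.9")["dir"])
print(select_entry(entries, "10.2.3.4")["dir"])
```
.EE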

.SH EXAMPLES
These examples assume that
.B /var/qmail/control/spamthrottle 
contains a non-zero value.

Here is a sample
.B /var/qmail/control/spamt
file for a home user:
.EX

    # network:dir:st:stmax:flush:rcpt:tg:tg_resp:
    # 
    # default entry (make it all share the public directory)
    :public:1500:120000:::::
    #
    # private (trusted) network does not enforce spamthrottle
    192.168.0.0/24::0::::::
    #
    # some external network which we would like to throttle collectively
    10.0.0.0/24:collected:::::::
    #
    # an external network (semi-trusted) which is throttled
    # based on individual IP address
    # - dir is /32, so state files are stored in
    #   directories based on the full remote IP
    #   address
    # - we also allow relaying from this semi-trusted
    #   network
    10.1.0.0/16:/32:::::::
    .

.EE

Here is a sample file for a high-volume mail server (or servers)
for some arbitrary ISP (with customer network 10.0.0.0/16 and
internal/employee network 10.1.0.0/24):
.EX

    # network:dir:st:stmax:flush:rcpt:tg:tg_resp:
    #
    # by default, turn throttling off
    ::0::::::
    #
    # customer network uses default behaviour
    # (IP-based throttle files)
    10.0.0.0/16:/32:::::::
    #
    # employee network doesn't adhere to throttling
    10.1.0.0/24::0::::::
    #
    # external trusted network which legitimately
    # provides high volume mail traffic
    10.1.1.0/24::0::::::
    #
    # a collection of addresses/networks which we
    # might have gathered from past abuse experience
    # - we allow the mail, but we're aggressive
    #   about throttling it
    10.1.2.1/32:abuse:5000::::::
    10.1.2.2/32:abuse:5000::::::
    10.1.2.3/32:abuse:5000::::::
    10.1.3.0/24:abuse:5000::::::
    .

.EE

.SH "SEE ALSO"
.BR tcp-env (1),
.BR tcp-environ (5),
.BR qmail-spamt (5),
.BR qmail-smtpd (8)

.SH AUTHOR
Dale Woolridge, James Law, and Moto Kawasaki.  Contact the authors
via email: <spamthrottle@qmail.ca>.