
stakextract

Section: User Commands (1)
Updated: 21-March-2004

NAME

stak - Statistical Traffic Analysis Kit

SYNOPSIS

stakextract [-i <interface>] [-p <prefix>] [-s <snarflen>] [-l] [-X <expression> [-0 <c>]] [-f <filtering expression>] [-a] [-P <pattern>] <regular expression>

DESCRIPTION

stakextract is part of the Statistical Traffic Analysis Kit (STAK), a set of utilities designed to help an administrator find out what is currently happening on the network.

stakextract extracts strings from packets. This is useful for monitoring plain-text protocols such as HTTP, SMTP or Jabber. The utility breaks the standard stak convention: no reports are generated, and the extracted strings are printed as soon as they are caught.

USAGE

stakextract accepts parameters in a standard, short getopt(3) form.

Several options concern the stak sniffer framework and are common to all stak utilities. They are described in the GENERIC OPTIONS section below.

The remaining options, described in the REGULAR EXPRESSION EXTRACTOR MODE section, are specific to stakextract and do not apply to other stak utilities.

stakextract uses regular expressions (regex(7)) to match strings in packets. Before doing the actual analysis, the user has to devise a regular expression that matches the packets of interest and captures the string to be monitored. For instance, to see which WWW sites are being visited at the moment, one could use an expression like:

: # stakextract -i eth5 'Host: \([^\r\n]+\)\r\n'
www.modo.cyber.pl
www.fiat.pl
www.microsoft.com
a.as-eu.falkag.net
onet.hit.gemius.pl
www.barpolsat1.interia.pl

As in the sed(1) utility, \( and \) delimit the actual string to be extracted. In addition to standard regular expression syntax, special sequences may be used for CR (\r), LF (\n) and TAB (\t), or to specify any character by its hexadecimal code (\xNN).
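The extraction described above can be imitated outside stakextract. The following is only a sketch in Python, not the tool's implementation: the payload is a made-up HTTP request, and Python's re module writes groups as ( ) rather than the sed-style \( \) that stakextract expects. The \r, \n and \xNN sequences happen to be understood by re as well.

```python
import re

# Hypothetical captured payload: the start of an HTTP request.
payload = b"GET / HTTP/1.1\r\nHost: www.example.org\r\nAccept: */*\r\n\r\n"

# Same idea as the stakextract expression above, in Python syntax.
pattern = re.compile(rb"Host: ([^\r\n]+)\r\n")

# \x48 is the hexadecimal code of 'H', so this pattern is equivalent.
hex_pattern = re.compile(rb"\x48ost: ([^\r\n]+)\r\n")


def extract(data, regex):
    """Return the content of the first group for every match in data."""
    return [m.group(1) for m in regex.finditer(data)]


extracted = extract(payload, pattern)
extracted_hex = extract(payload, hex_pattern)

for s in extracted:
    print(s.decode("ascii", errors="replace"))
```

Only the bytes between the group delimiters are printed; the "Host: " prefix and the trailing CR LF are matched but discarded.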

The output can be formatted by specifying an output-formatting string with the -P option. As in sed(1), the special sequences \1, \2, \3, etc. are replaced with the contents of the corresponding group in the expression:

: # stakextract -i eth5 'MAIL FROM: \([^\r\n]+\)\r' -P '\1 tries to send a message'
<009245@web.intertele.pl> tries to send a message
<QNXIRRKSWX@chez.com> tries to send a message
<zvsjvytyjrkzte@jpopmail.com> tries to send a message
<gexbvwkgmih@yahoo.no> tries to send a message
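The substitution performed by -P can be sketched in a few lines of Python. This is an illustration under assumptions, not stakextract's code: format_output is a hypothetical helper, the payload is invented, and the treatment of a reference to a non-existent group (expanding to nothing) is a guess.

```python
import re


def format_output(template, match):
    r"""Replace \1, \2, ... in template with the corresponding group of match."""
    def group_text(ref):
        index = int(ref.group(1))
        # Assumption: a reference to a group that does not exist expands to nothing.
        if index > match.re.groups:
            return ""
        return match.group(index) or ""
    return re.sub(r"\\([1-9])", group_text, template)


# Hypothetical SMTP payload; Python writes the group as ( ) instead of \( \).
payload = "MAIL FROM: <user@example.org>\r\n"
m = re.search(r"MAIL FROM: ([^\r\n]+)\r", payload)

line = format_output(r"\1 tries to send a message", m)
print(line)
```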

Additionally, you can specify the -a option to see the source and destination addresses of the packet the string was extracted from.

GENERIC OPTIONS

-0 c

Replace every NUL character (ASCII 0) with c before regular expression matching. Ignored unless the -X option is specified. The default is '@'.

-f f

BPF filter expression to use. With this option, stak ignores any packets that do not match the specified BPF filter expression. For a detailed description of BPF filter expression syntax, consult the tcpdump(1) manual page.

-h -?

Print help. stak prints a short summary of the available command-line options and exits, regardless of other options.

-i I

Bind to interface I. The default is 'eth0', which of course will cause a failure on systems other than Linux. Make sure you specify the datalink prefix (see -p) when you tell stak to bind to an interface of an uncommon type.

-l

Make stdout line-buffered. This option is useful when reports are redirected (e.g. using shell redirection) to a file.

-p N

Datalink layer header prefix length. Almost every datalink layer protocol prefixes a packet with its own header, which has to be stripped before the data essential for stak (the IP protocol header) can be read. stak can automatically determine how many bytes to skip only for the most common datalink layer protocols (Ethernet, FDDI, TokenRing, loopback, PPP); in other cases the prefix length must be specified using this option. It is EXTREMELY IMPORTANT to set the right value, otherwise stak might print completely irrelevant reports and output invalid IP addresses. The default is autosense or, if that fails, 14 bytes, which is the length of an Ethernet header.
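To see why a wrong prefix length yields invalid IP addresses, consider the following Python sketch. It is not taken from stak: ip_addresses is a hypothetical helper, the frame is hand-built, and the addresses come from documentation ranges. The only facts relied on are the 14-byte Ethernet header and the position of the source and destination fields at bytes 12 to 19 of an IPv4 header.

```python
import socket
import struct

ETHERNET_HEADER_LEN = 14  # the fallback prefix length named above


def ip_addresses(frame, prefix_len):
    """Skip prefix_len bytes of datalink header, then read IPv4 source and destination."""
    ip_header = frame[prefix_len:prefix_len + 20]
    if len(ip_header) < 20:
        raise ValueError("frame too short for an IPv4 header")
    src, dst = struct.unpack("!4s4s", ip_header[12:20])
    return socket.inet_ntoa(src), socket.inet_ntoa(dst)


# Hypothetical frame: zeroed MAC addresses, EtherType 0x0800, minimal IPv4 header.
ethernet = bytes(12) + b"\x08\x00"
ipv4 = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 20, 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.1"),
    socket.inet_aton("198.51.100.7"),
)
frame = ethernet + ipv4

right = ip_addresses(frame, ETHERNET_HEADER_LEN)
wrong = ip_addresses(frame, 10)  # wrong prefix: the fields are read 4 bytes too early

print("prefix 14:", right)
print("prefix 10:", wrong)
```

With the wrong prefix, the bytes read as the "source address" are really the TTL, protocol and checksum fields, which is the kind of nonsense output the warning above refers to.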

-s N

Capture at least N bytes. For performance reasons, stak does not acquire the whole packet from the network; it reads and processes only the first N bytes. The default is 64 bytes, which might not be enough if you are using complicated BPF expressions or filtering packets with a regular expression. In such cases, it is a good idea to set the capture length to the MTU of the interface. The value is automatically increased to at least 1500 (the default MTU for an Ethernet interface) if one of the -X, -E or -T options is used. This option does NOT affect the statistical data (byte counts, per-second byte rates) collected by stak: the accounted packet size is always the 'real' one.
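The effect of a capture length that is too short can be shown with a small Python sketch. The packet is invented (54 zero bytes standing in for Ethernet, IPv4 and TCP headers, followed by an HTTP request), and extract_with_snaplen is a hypothetical helper, not part of stak.

```python
import re

DEFAULT_SNAPLEN = 64   # default capture length stated above
ETHERNET_MTU = 1500    # default MTU of an Ethernet interface

# Hypothetical packet: 14 + 20 + 20 bytes of headers, then the payload.
headers = bytes(54)
http = b"GET /index.html HTTP/1.1\r\nHost: www.example.org\r\n\r\n"
packet = headers + http

pattern = re.compile(rb"Host: ([^\r\n]+)\r\n")


def extract_with_snaplen(data, snaplen):
    """Match only against the first snaplen bytes, as a truncated capture would."""
    captured = data[:snaplen]
    m = pattern.search(captured)
    return m.group(1) if m else None


short = extract_with_snaplen(packet, DEFAULT_SNAPLEN)
full = extract_with_snaplen(packet, ETHERNET_MTU)

print("snaplen 64:  ", short)
print("snaplen 1500:", full)
```

At 64 bytes only the first 10 bytes of the request survive, so the Host header is never seen; at 1500 bytes the whole request is available to the expression.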

-v

Print exact values. Normally, stak uses SI prefixes (k - kilo, M - mega, G - giga, T - tera) to make printed numeric values easier for a human to read. The -v option disables this feature, causing stak to print exact values.

-X r

Regular expression-based filtering. This option causes stak to ignore packets that DO NOT match the specified regular expression. Before any tests, NUL characters occurring in a packet are replaced with another character, as specified with the -0 option (the default is '@'). Consult the regex(7) manual page for a detailed description of POSIX regular expressions. In addition to standard regex syntax, you may use the \r (CR), \n (LF), \t (TAB), \\ (\) and \xNN (hex NN) special sequences.
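The filtering behaviour of -X together with -0 can be approximated as follows. This Python sketch is an assumption about the behaviour described above, not stak's code: passes_filter and the sample packets are hypothetical. One plausible reason for the NUL replacement is that C regex routines treat NUL as the end of the string, but the manual does not say so.

```python
import re


def passes_filter(payload, pattern, nul_replacement=b"@"):
    """Return True if payload matches pattern once NUL bytes have been replaced."""
    cleaned = payload.replace(b"\x00", nul_replacement)
    return re.search(pattern, cleaned) is not None


# Hypothetical packet payloads, two of them starting with NUL bytes.
packets = [
    b"\x00\x00USER alice\r\n",
    b"\x00\x00PASS secret\r\n",
    b"NOOP\r\n",
]

# Packets that do not match are dropped.
kept = [p for p in packets if passes_filter(p, rb"(USER|PASS) ")]

# After replacement, a NUL can be matched as the replacement character.
nul_as_at = passes_filter(b"a\x00b", rb"a@b")
nul_as_hash = passes_filter(b"a\x00b", rb"a#b", nul_replacement=b"#")

print(len(kept), nul_as_at, nul_as_hash)
```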

REGULAR EXPRESSION EXTRACTOR MODE

-P <string>

Output formatting string. Instead of just the extracted strings, this string is printed, with any special sequences it contains (\1, \2, \3, etc.) replaced by the contents of the corresponding groups in the extracting expression.

-a

Print the source and destination IP addresses of the packet the strings were extracted from.

AUTHOR

Mateusz Golicz <ziewk@jaszczur.org>

Feel free to send comments, suggestions, bug reports, etc. The author is not a native English speaker and is aware that his English is far from perfect. Reports of grammar or vocabulary mistakes in this manual are therefore also welcome.

The asynchronous DNS resolver part was taken from mtr - a very handy traceroute replacement by Matt Kimball.

LICENSE

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, Version 2, as published by the Free Software Foundation. A copy of this license is distributed with this software in the file "COPYING".

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Read the file "COPYING" for more details.
