Re: [PATCH] AF_PACKET and packet mmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Mickael,

> The patch looks useful. Could you tell me how you got the info? (It
> would help me try to verify it.)
- networking/packet_mmap.txt (in kernel doc)
- http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap (TX
only, I've made this patch)

> Also, what kernel version number did these options appear in?
Normally next 2.6

PS: Sorry for slow reply,  I was in vacation.

Best regards,
Johann


On Fri, Jul 31, 2009 at 5:57 AM, Michael Kerrisk
<mtk.manpages@xxxxxxxxxxxxxx> wrote:
>
> Hi Johann.
>
> On Thu, Jul 30, 2009 at 1:04 AM, Johann Baudy<johann.baudy@xxxxxxxxxxx> wrote:
> > From: Johann Baudy <johann.baudy@xxxxxxxxxxx>
> >
> > Documentation of PACKET_RX_RING and PACKET_TX_RING socket options.
> >
> > Signed-off-by: Johann Baudy <johann.baudy@xxxxxxxxxxx>
>
> (Please CC me on patches. Otherwise I can easily miss them.)
>
> The patch looks useful. Could you tell me how you got the info? (It
> would help me try to verify it.)
>
> Also, what kernel version number did these options appear in?
>
> Thanks,
>
> Michael
> > --
> >
> >  man7/packet.7 |  212 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 files changed, 212 insertions(+), 0 deletions(-)
> >
> > diff --git a/man7/packet.7 b/man7/packet.7
> > index 0b6c669..ec4973a 100644
> > --- a/man7/packet.7
> > +++ b/man7/packet.7
> > @@ -222,6 +222,218 @@ In addition the traditional ioctls
> >  .BR SIOCADDMULTI ,
> >  .B SIOCDELMULTI
> >  can be used for the same purpose.
> > +
> > +Packet sockets can also be used to have a direct access to network device
> > +through configurable circular buffers mapped in user space.
> > +They can be used to either send or receive packets.
> > +
> > +.B PACKET_TX_RING
> > +enables and allocates a circular buffer for transmission process.
> > +
> > +.B PACKET_RX_RING
> > +enables and allocates a circular buffer for capture process.
> > +
> > +They both expect a
> > +.B packet_mreq
> > +structure as argument:
> > +
> > +.in +4n
> > +.nf
> > +struct tpacket_req {
> > +    unsigned int    tp_block_size;  /* Minimal size of contiguous block */
> > +    unsigned int    tp_block_nr;    /* Number of blocks */
> > +    unsigned int    tp_frame_size;  /* Size of frame */
> > +    unsigned int    tp_frame_nr;    /* Total number of frames */
> > +};
> > +.fi
> > +.in
> > +
> > +This structure establishes a circular buffer of unswappable memory.
> > +Being mapped in the capture process allows reading the captured frames and
> > +related meta-information like timestamps without requiring a system call.
> > +Being mapped in the transmission process allows writing multiple packets that will be sent during
> > +.BR send (2).
> > +By using a shared buffer between the kernel and the user space also has
> > +the benefit of minimizing packet copies.
> > +
> > +Frames are grouped in blocks. Each block is a physically contiguous
> > +region of memory and holds
> > +.B tp_block_size
> > +/
> > +.B tp_frame_size
> > +frames.
> > +
> > +The total number of blocks is
> > +.B tp_block_nr.
> > +Note that
> > +.B tp_frame_nr
> > +is a redundant parameter because
> > +
> > +.in +4n
> > +frames_per_block = tp_block_size/tp_frame_size
> > +.in
> > +
> > +Indeed, packet_set_ring checks that the following condition is true
> > +
> > +.in +4n
> > +frames_per_block * tp_block_nr == tp_frame_nr
> > +.in
> > +
> > +A frame can be of any size with the only condition it can fit in a block. A block
> > +can only hold an integer number of frames, or in other words, a frame cannot
> > +be spawned across two blocks. Please refer to
> > +.I networking/packet_mmap.txt
> > +in kernel documentation for more details.
> > +
> > +Each frame contains a header followed by data.
> > +Header is either a
> > +.B struct tpacket_hdr
> > +or
> > +.B struct tpacket2_hdr
> > +according to socket option
> > +.B PACKET_VERSION
> > +(which can be set to
> > +.B TPACKET_V1
> > +or
> > +.B TPACKET_V2
> > +respectively through
> > +.BR setsockopt(2)
> > +).
> > +
> > +With
> > +.B TPACKET_V1:
> > +
> > +.in +4n
> > +.nf
> > +struct tpacket_hdr
> > +{
> > +    unsigned long      tp_status;
> > +    unsigned int       tp_len;
> > +    unsigned int       tp_snaplen;
> > +    unsigned short     tp_mac;
> > +    unsigned short     tp_net;
> > +    unsigned int       tp_sec;
> > +    unsigned int       tp_usec;
> > +};
> > +.fi
> > +.in
> > +
> > +With
> > +.B TPACKET_V2:
> > +
> > +.in +4n
> > +.nf
> > +struct tpacket2_hdr
> > +{
> > +    __u32 tp_status;
> > +    __u32 tp_len;
> > +    __u32 tp_snaplen;
> > +    __u16 tp_mac;
> > +    __u16 tp_net;
> > +    __u32 tp_sec;
> > +    __u32 tp_nsec;
> > +    __u16 tp_vlan_tci;
> > +};
> > +.fi
> > +.in
> > +
> > +.B tp_len
> > +is the size of data received from network.
> > +
> > +.B tp_snaplen
> > +is the size of data that follows the header.
> > +
> > +.B tp_mac
> > +is the mac address offset (
> > +.B PACKET_RX_RING
> > +only).
> > +
> > +.B tp_net
> > +is the network offset (
> > +.B PACKET_RX_RING
> > +only).
> > +
> > +.B tp_sec
> > +,
> > +.B tp_usec
> > +is the timestamp of received packet (
> > +.B PACKET_RX_RING
> > +only).
> > +
> > +.B tp_status
> > +is the status of current frame.
> > +
> > +For
> > +.B PACKET_TX_RING ,
> > +status can be
> > +.B TP_STATUS_AVAILABLE
> > +if the frame is available for new packet transmission;
> > +.B TP_STATUS_SEND_REQUEST
> > +if the frame is filled by user for transmission;
> > +.B TP_STATUS_SENDING
> > +if the frame is currently in transmission within the kernel;
> > +.B TP_STATUS_WRONG_FORMAT
> > +if the frame format is not properly formatted (This status will only be used if socket option
> > +.B PACKET_LOSS
> > +is set to 1).
> > +
> > +For
> > +.B PACKET_RX_RING ,
> > +a status equal to
> > +.B TP_STATUS_KERNEL
> > +indicates that the frame is available for kernel;
> > +.B TP_STATUS_USER
> > +indicates that kernel has received a packet (The frame is ready for user);
> > +.B TP_STATUS_COPY
> > +indicates that the frame (and associated meta information)
> > +has been truncated because it's larger than
> > +.B tp_frame_size
> > +;
> > +.B TP_STATUS_LOSING
> > +indicates there were packet drops from last time
> > +statistics where checked with
> > +.BR getsockopt(2)
> > +and the
> > +.B PACKET_STATISTICS
> > +option;
> > +.B TP_STATUS_CSUMNOTREADY
> > +is used for outgoing IP packets which it's checksum will be done in hardware.
> > +
> > +In order to use this shared memory, the user must call
> > +.BR mmap (2)
> > +function on packet socket. Then process depends on socket options:
> > +
> > +For
> > +.B PACKET_TX_RING ,
> > +the kernel initializes all frames to
> > +.B TP_STATUS_AVAILABLE.
> > +To send a packet, the user fills a data buffer of an available frame, sets tp_len to
> > +current data buffer size and sets its status field to
> > +.B TP_STATUS_SEND_REQUEST.
> > +This can be done on multiple frames. Once the user is ready to transmit, it
> > +calls
> > +.BR send (2) .
> > +Then all buffers with status equal to
> > +.B TP_STATUS_SEND_REQUEST
> > +are forwarded to the network device.
> > +The kernel updates each status of sent frames with
> > +.B TP_STATUS_SENDING
> > +until the end of transfer.
> > +At the end of each transfer, buffer status returns to
> > +.B TP_STATUS_AVAILABLE.
> > +
> > +For
> > +.B PACKET_RX_RING ,
> > +the kernel initializes all frames to
> > +.B TP_STATUS_KERNEL ,
> > +when the kernel
> > +receives a packet it puts in the buffer and updates the status with
> > +at least the
> > +.B TP_STATUS_USER
> > +flag. Then the user can read the packet,
> > +once the packet is read the user must zero the status field, so the kernel
> > +can use again that frame buffer.
> > +
> >  .SS Ioctls
> >  .B SIOCGSTAMP
> >  can be used to receive the timestamp of the last received packet.
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-man" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
>
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Watch my Linux system programming book progress to publication!
> http://blog.man7.org/
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Kernel Documentation]     [Netdev]     [Linux Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux