
Introduction to skb head/data/tail/end

Published: 2020-12-13 23:43:49 | Category: Linux | Source: compiled from the web
Original post: 2017-04-26 18:21:12, abcLinux

This first diagram illustrates the layout of the SKB data area and where in that area the various pointers in 'struct sk_buff' point.

The rest of this page walks through what the SKB data area looks like in a newly allocated SKB, and how to modify those pointers to add headers, add user data, and pop headers.

We will also discuss how paged, non-linear data areas are implemented and how to work with them.


        skb = alloc_skb(len, GFP_KERNEL);


This is what a new SKB looks like right after you allocate it using alloc_skb().

As you can see, the head, data, and tail pointers all point to the beginning of the data buffer, and the end pointer points to the end of it. Note that all of the data area is considered tail room.
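
The pointer layout just described can be modelled in user space. The following is only a sketch: `struct toy_skb` and the `toy_*` helpers are hypothetical names invented for illustration, not the kernel's `struct sk_buff`, and the real alloc_skb() does additional work that is ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the four data-area pointers of an SKB. */
struct toy_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the packet data */
    unsigned char *tail;  /* end of the packet data */
    unsigned char *end;   /* end of the allocated buffer */
    size_t len;           /* number of bytes of packet data */
};

/* Mimics the state right after allocation: head == data == tail. */
static int toy_alloc(struct toy_skb *skb, size_t size)
{
    assert(size > 0);
    skb->head = malloc(size);
    if (!skb->head)
        return -1;
    skb->data = skb->head;
    skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return 0;
}

/* Free space in front of the packet data. */
static size_t toy_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/* Free space behind the packet data. */
static size_t toy_tailroom(const struct toy_skb *skb)
{
    return (size_t)(skb->end - skb->tail);
}
```

In this model a fresh buffer has zero head room, a tail room equal to the requested size, and a length of zero.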

The length of this SKB is zero. It isn't very interesting, since it doesn't contain any packet data at all. Let's reserve some space for protocol headers using skb_reserve().


        skb_reserve(skb, header_len);


This is what a new SKB looks like right after the skb_reserve() call.

Typically, when building output packets, we reserve enough bytes for the maximum amount of header space we think we'll need. Most IPv4 protocols can do this by using the socket value sk->sk_prot->max_header.

When setting up receive packets that an ethernet device will DMA into, we typically call skb_reserve(skb, NET_IP_ALIGN). By default NET_IP_ALIGN is defined to '2'. This makes it so that, after the ethernet header, the protocol header will be aligned on at least a 4-byte boundary. Nearly all of the IPv4 and IPv6 protocol processing assumes that the headers are properly aligned.
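
The alignment arithmetic can be checked with a small sketch. It assumes a 14-byte ethernet header and a buffer that itself starts on a 4-byte boundary; the `TOY_*` constants and `toy_*` functions are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ETH_HLEN     14  /* length of an ethernet header in bytes */
#define TOY_NET_IP_ALIGN 2   /* the default value mentioned above */

/* Offset of the protocol header from the start of the buffer, given
 * how many bytes were reserved in front of the ethernet header. */
static size_t toy_ip_header_offset(size_t reserved)
{
    size_t off = reserved + TOY_ETH_HLEN;
    assert(off >= TOY_ETH_HLEN);  /* guard against size_t wraparound */
    return off;
}

/* Nonzero if the offset falls on a 4-byte boundary. */
static int toy_is_4byte_aligned(size_t offset)
{
    return (offset % 4) == 0;
}
```

With 2 bytes reserved, the protocol header starts at offset 2 + 14 = 16, which is a multiple of 4. With nothing reserved it would start at offset 14, which is not.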

Let's now add some user data to the packet.


        unsigned char *data = skb_put(skb, user_data_len);
        int err = 0;

        /* Copy from user space into the new tail area, checksumming as we go. */
        skb->csum = csum_and_copy_from_user(user_pointer, data,
                                            user_data_len, 0, &err);
        if (err)
                goto user_fault;


This is what a new SKB looks like right after the user data is added.

skb_put() advances 'skb->tail' by the specified number of bytes and increments 'skb->len' by the same amount. This routine must not be called on an SKB that has any paged data. You must also be sure that there is enough tail room in the SKB for the number of bytes you are trying to put. Both of these conditions are checked for by skb_put(), and an assertion failure will trigger if either rule is violated.
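
The pointer bookkeeping of skb_reserve() and skb_put() can be sketched in user space as follows. This is a simplified model, not the kernel implementation: `struct toy_skb` and the `toy_*` functions are hypothetical, and a plain assert() stands in for the kernel's own check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the data-area pointers of an SKB. */
struct toy_skb {
    unsigned char *head, *data, *tail, *end;
    size_t len;
};

/* Start with an empty buffer: head == data == tail. */
static void toy_init(struct toy_skb *skb, unsigned char *buf, size_t size)
{
    skb->head = skb->data = skb->tail = buf;
    skb->end = buf + size;
    skb->len = 0;
}

/* Like skb_reserve(): shift data and tail forward on an empty buffer. */
static void toy_reserve(struct toy_skb *skb, size_t n)
{
    assert(skb->len == 0 && n <= (size_t)(skb->end - skb->tail));
    skb->data += n;
    skb->tail += n;
}

/* Like skb_put(): grow the data area at the tail and return a pointer
 * to the start of the newly added region. */
static unsigned char *toy_put(struct toy_skb *skb, size_t n)
{
    unsigned char *old_tail = skb->tail;

    assert(n <= (size_t)(skb->end - skb->tail));  /* enough tail room? */
    skb->tail += n;
    skb->len += n;
    return old_tail;
}
```

After reserving 16 bytes and putting 10, the returned pointer equals the old tail, 'data' is unchanged, and 'len' is 10.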

The computed checksum is remembered in 'skb->csum'. Now it's time to build the protocol headers. We'll build a UDP header, then one for IPv4.


        struct inet_sock *inet = inet_sk(sk);
        struct flowi *fl = &inet->cork.fl;
        struct udphdr *uh;

        skb->h.raw = skb_push(skb, sizeof(struct udphdr));
        uh = skb->h.uh;
        uh->source = fl->fl_ip_sport;
        uh->dest = fl->fl_ip_dport;
        /* The UDP length field covers the header plus the payload. */
        uh->len = htons(sizeof(struct udphdr) + user_data_len);
        uh->check = 0;
        skb->csum = csum_partial((char *)uh,
                                 sizeof(struct udphdr), skb->csum);
        uh->check = csum_tcpudp_magic(fl->fl4_src, fl->fl4_dst,
                                      sizeof(struct udphdr) + user_data_len,
                                      IPPROTO_UDP, skb->csum);
        if (uh->check == 0)
                uh->check = -1;


This is what a new SKB looks like after we push the UDP header to the front of the SKB.

skb_push() decrements the 'skb->data' pointer by the specified number of bytes and increments 'skb->len' by the same amount. The caller must make sure there is enough head room for the push being performed. This condition is checked for by skb_push(), and an assertion failure will trigger if this rule is violated.
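
A user-space sketch of the same bookkeeping for skb_push() is shown below. As before, `struct toy_skb` and the `toy_*` functions are hypothetical, and assert() stands in for the kernel's head room check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the data-area pointers of an SKB. */
struct toy_skb {
    unsigned char *head, *data, *tail, *end;
    size_t len;
};

/* Set up a buffer that already has head room reserved and payload added. */
static void toy_init(struct toy_skb *skb, unsigned char *buf, size_t size,
                     size_t headroom, size_t payload)
{
    assert(headroom + payload <= size);
    skb->head = buf;
    skb->data = buf + headroom;
    skb->tail = skb->data + payload;
    skb->end = buf + size;
    skb->len = payload;
}

/* Like skb_push(): grow the data area at the front and return the
 * new start of the packet data. */
static unsigned char *toy_push(struct toy_skb *skb, size_t n)
{
    assert(n <= (size_t)(skb->data - skb->head));  /* enough head room? */
    skb->data -= n;
    skb->len += n;
    return skb->data;
}
```

Starting with 28 bytes of head room and 10 bytes of payload, pushing 8 bytes (a UDP header) and then 20 bytes (an IPv4 header without options) uses up the head room exactly and leaves 'len' at 38, while 'tail' never moves.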

Now it's time to tack on an IPv4 header.


        struct rtable *rt = inet->cork.rt;
        struct iphdr *iph;

        skb->nh.raw = skb_push(skb, sizeof(struct iphdr));
        iph = skb->nh.iph;
        iph->version = 4;
        iph->ihl = 5;
        iph->tos = inet->tos;
        iph->tot_len = htons(skb->len);
        iph->frag_off = 0;
        iph->id = htons(inet->id++);
        iph->ttl = ip_select_ttl(inet, &rt->u.dst);
        iph->protocol = sk->sk_protocol; /* IPPROTO_UDP in this case */
        iph->saddr = rt->rt_src;
        iph->daddr = rt->rt_dst;
        ip_send_check(iph);

        skb->priority = sk->sk_priority;
        skb->dst = dst_clone(&rt->u.dst);


This is what a new SKB looks like after we push the IPv4 header to the front of the SKB.

Just as above for UDP, skb_push() decrements 'skb->data' and increments 'skb->len'. We update the 'skb->nh.raw' pointer to the beginning of the new space, and build the IPv4 header.

This packet is basically ready to be pushed out to the device once we have the necessary information to build the ethernet header (from the generic neighbour layer and ARP).


Things start to get a little bit more complicated once paged data begins to be used. For the most part, the ability to use [page, offset, len] tuples for SKB data came about so that file system file contents could be directly sent over a socket. But, as it turns out, it is sometimes beneficial to use this for normal buffering of process sendmsg() data.

It must be understood that once paged data starts to be used on an SKB, this puts a specific restriction on all future SKB data area operations. In particular, it is no longer possible to do skb_put() operations.

There are actually two length variables associated with an SKB, len and data_len. The latter only comes into play when there is paged data in the SKB. skb->data_len tells how many bytes of paged data there are in the SKB. From this we can derive a few more things:

  • The existence of paged data in an SKB is indicated by skb->data_len being non-zero. This is codified in the helper routine skb_is_nonlinear(), so that is the function you should use to test this.
  • The amount of non-paged data at skb->data can be calculated as skb->len - skb->data_len. Again, there is a helper routine already defined for this called skb_headlen(), so please use that.

The main abstraction is that, when there is paged data, the packet begins at skb->data for skb_headlen(skb) bytes, then continues on into the paged data area for skb->data_len bytes. That is why it is illogical to try and do an skb_put() when there is paged data. You have to add data onto the end of the paged data area instead.
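
The relationship between the two length fields can be written down as a small sketch. The `toy_*` names below are hypothetical user-space stand-ins that mirror the two relations stated above; they are not the kernel helpers themselves.

```c
#include <assert.h>

/* Hypothetical stand-in holding only the two length fields. */
struct toy_skb {
    unsigned int len;       /* total bytes: linear plus paged */
    unsigned int data_len;  /* bytes held in paged data only */
};

/* Mirrors the test described for skb_is_nonlinear(). */
static int toy_is_nonlinear(const struct toy_skb *skb)
{
    return skb->data_len != 0;
}

/* Mirrors the calculation described for skb_headlen(). */
static unsigned int toy_headlen(const struct toy_skb *skb)
{
    assert(skb->data_len <= skb->len);
    return skb->len - skb->data_len;
}
```

For example, a 1500-byte packet with 1400 bytes of paged data has 100 bytes in the linear area, while a fully linear 60-byte packet has a head length equal to its total length.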

Each chunk of paged data in an SKB is described by the following structure:

struct skb_frag_struct {
        struct page *page;
        __u16 page_offset;
        __u16 size;
};

There is a pointer to the page (which you must hold a proper reference to), the offset within the page where this chunk of paged data starts, and how many bytes are there.

The paged frags are organized into an array in the shared SKB area, defined by this structure:

#define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)

struct skb_shared_info {
        atomic_t        dataref;
        unsigned int    nr_frags;
        unsigned short  tso_size;
        unsigned short  tso_segs;
        struct sk_buff  *frag_list;
        skb_frag_t      frags[MAX_SKB_FRAGS];
};

The nr_frags member states how many frags are active in the frags[] array. The tso_size and tso_segs members are used to convey information to the device driver for TCP segmentation offload. The frag_list is used to maintain a chain of SKBs organized for fragmentation purposes; it is _not_ used for maintaining paged data. And finally, frags[] holds the frag descriptors themselves.

A helper routine is available to help you fill in page descriptors.


void skb_fill_page_desc(struct sk_buff *skb, int i,
                        struct page *page,
                        int off, int size)


This fills the i'th page vector to point to page at offset off of size size. It also updates the nr_frags member to be one past i.

If you wish to simply extend an existing frag entry by some number of bytes, increment the size member by that amount.
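
The effect of filling and extending a frag descriptor can be sketched in user space. Everything named `toy_*` or `TOY_*` is hypothetical, a 4096-byte page size is assumed, and a plain `void *` stands in for `struct page *`.

```c
#include <assert.h>

#define TOY_PAGE_SIZE     4096                         /* assumed page size */
#define TOY_MAX_SKB_FRAGS (65536 / TOY_PAGE_SIZE + 2)  /* 18 under that assumption */

/* Hypothetical stand-in for one [page, offset, len] tuple. */
struct toy_frag {
    void *page;
    unsigned short page_offset;
    unsigned short size;
};

/* Hypothetical stand-in for the frag-related part of the shared area. */
struct toy_shinfo {
    unsigned int nr_frags;
    struct toy_frag frags[TOY_MAX_SKB_FRAGS];
};

/* Fill slot i and set nr_frags to one past i, as described above. */
static void toy_fill_page_desc(struct toy_shinfo *sh, int i,
                               void *page, int off, int size)
{
    assert(i >= 0 && i < TOY_MAX_SKB_FRAGS);
    sh->frags[i].page = page;
    sh->frags[i].page_offset = (unsigned short)off;
    sh->frags[i].size = (unsigned short)size;
    sh->nr_frags = (unsigned int)i + 1;
}

/* Extend an existing frag: only its size member grows. */
static void toy_extend_frag(struct toy_shinfo *sh, int i, int extra)
{
    assert(i >= 0 && (unsigned int)i < sh->nr_frags);
    sh->frags[i].size = (unsigned short)(sh->frags[i].size + extra);
}
```

Filling slot 0 with 1000 bytes at offset 128 and then extending it by 200 leaves one active frag of 1200 bytes at the same offset.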


With all of the complications imposed by non-linear SKBs, it may seem difficult to inspect areas of a packet in a straightforward way, or to copy data out from a packet into another buffer. This is not the case. There are two helper routines available which make this pretty easy.

First, we have:


void *skb_header_pointer(const struct sk_buff *skb, int offset, int len, void *buffer)


You give it the SKB, the offset (in bytes) to the piece of data you are interested in, the number of bytes you want, and a local buffer which is to be used _only_ if the data you are interested in resides in the non-linear data area.

You are returned a pointer to the data item, or NULL if you asked for an invalid offset and len parameter. This pointer could be one of two things. First, if what you asked for is directly in the skb->data linear data area, you are given a direct pointer into there. Otherwise, the requested bytes are copied into the buffer you passed in, and you are given that buffer pointer.
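
The two possible return values can be illustrated with a simplified user-space model. It assumes the paged data is a single contiguous chunk, which the real frags[] array does not guarantee, and every `toy_*` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packet: a linear area followed by one paged chunk. */
struct toy_skb {
    const unsigned char *data;   /* linear area */
    size_t headlen;              /* bytes in the linear area */
    const unsigned char *paged;  /* stand-in for the paged data */
    size_t data_len;             /* bytes in the paged data */
};

/* Copy len bytes starting at offset, crossing into paged data if needed.
 * Returns 0 on success, -1 if the range lies outside the packet. */
static int toy_copy_bits(const struct toy_skb *skb, size_t offset,
                         void *to, size_t len)
{
    unsigned char *out = to;
    size_t total = skb->headlen + skb->data_len;

    assert(skb != NULL && to != NULL);
    if (offset > total || len > total - offset)
        return -1;
    if (offset < skb->headlen) {
        size_t n = skb->headlen - offset;
        if (n > len)
            n = len;
        memcpy(out, skb->data + offset, n);
        out += n;
        offset += n;
        len -= n;
    }
    if (len > 0)
        memcpy(out, skb->paged + (offset - skb->headlen), len);
    return 0;
}

/* Direct pointer if the range is entirely linear, else a copy in buffer. */
static const void *toy_header_pointer(const struct toy_skb *skb, size_t offset,
                                      size_t len, void *buffer)
{
    if (offset <= skb->headlen && len <= skb->headlen - offset)
        return skb->data + offset;
    if (toy_copy_bits(skb, offset, buffer, len) < 0)
        return NULL;
    return buffer;
}
```

A caller should treat the result as read-only and should not assume it points at the local buffer, since it may point straight into the packet.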

Code inspecting packet headers on the output path, especially, should use this routine to read and interpret protocol headers. The netfilter layer uses this function heavily.

For larger pieces of data other than protocol headers, it may be more appropriate to use the following helper routine instead.


int skb_copy_bits(const struct sk_buff *skb, int offset,
                  void *to, int len);


This will copy the specified number of bytes, starting at the specified offset, of the given SKB into the 'to' buffer. This is used for copies of SKB data into kernel buffers, and therefore it is not to be used for copying SKB data into userspace. There is another helper routine for that:


int skb_copy_datagram_iovec(const struct sk_buff *from,
                            int offset, struct iovec *to,
                            int size);


Here, the user's data area is described by the given IOVEC. The other parameters are nearly identical to those passed in to skb_copy_bits() above.
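
To show what it means for a destination to be described by an IOVEC, here is a user-space sketch that scatters a flat source buffer across several destination buffers. It uses the POSIX `struct iovec` from `<sys/uio.h>`; the function `toy_copy_to_iovec()` is hypothetical and ignores the user-space access checks and iovec updating that a kernel routine has to deal with.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>  /* struct iovec (POSIX) */

/* Copy size bytes from src into the buffers described by
 * iov[0..iovcnt-1], filling each one in order. Returns 0 on success,
 * or -1 if the buffers are too small (earlier buffers may already
 * have been written in that case). */
static int toy_copy_to_iovec(const unsigned char *src,
                             const struct iovec *iov, int iovcnt,
                             size_t size)
{
    assert(src != NULL && iov != NULL);
    for (int i = 0; i < iovcnt && size > 0; i++) {
        size_t n = iov[i].iov_len < size ? iov[i].iov_len : size;

        if (n == 0)
            continue;  /* skip empty segments */
        memcpy(iov[i].iov_base, src, n);
        src += n;
        size -= n;
    }
    return size == 0 ? 0 : -1;
}
```

With a 3-byte and a 5-byte destination, a 6-byte source fills the first buffer completely and places the remaining 3 bytes at the start of the second.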


http://vger.kernel.org/~davem/skb_data.html

(Editor: 李大同)
