1. 29 Oct, 2016 1 commit
    • Daniel Borkmann's avatar
      bpf: fix samples to add fake KBUILD_MODNAME · 96a8eb1e
      Daniel Borkmann authored
      Some of the sample files are causing issues when they are loaded with tc
      and cls_bpf, meaning tc bails out while trying to parse the resulting ELF
      file as program/map/etc sections are not present, which can be easily
      spotted with readelf(1).
      
      Currently, BPF samples are including some of the kernel headers and mid
      term we should change them to refrain from this, really. When dynamic
      debugging is enabled, we bail out due to undeclared KBUILD_MODNAME, which
      is easily overlooked in the build as clang spills this along with other
      noisy warnings from various header includes, and llc still generates an
      ELF file with mentioned characteristics. For just playing around with BPF
      examples, this can be a bit of a hurdle to take.
      
      Just add a fake KBUILD_MODNAME as a band-aid to fix the issue, same is
      done in xdp*_kern samples already.
      
      Fixes: 65d472fb ("samples/bpf: add 'pointer to packet' tests")
      Fixes: 6afb1e28 ("samples/bpf: Add tunnel set/get tests.")
      Fixes: a3f74617 ("cgroup: bpf: Add an example to do cgroup checking in BPF")
      Reported-by: default avatarChandrasekar Kannan <ckannan@console.to>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      96a8eb1e
  2. 18 Sep, 2015 1 commit
    • Alexei Starovoitov's avatar
      bpf: add bpf_redirect() helper · 27b29f63
      Alexei Starovoitov authored
      Existing bpf_clone_redirect() helper clones skb before redirecting
      it to RX or TX of destination netdev.
      Introduce bpf_redirect() helper that does that without cloning.
      
      Benchmarked with two hosts using 10G ixgbe NICs.
      One host is doing line rate pktgen.
      Another host is configured as:
      $ tc qdisc add dev $dev ingress
      $ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
         action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop
      so it receives the packet on $dev and immediately xmits it on $dev + 1
      The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program
      that does bpf_clone_redirect() and performance is 2.0 Mpps
      
      $ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
         action bpf run object-file tcbpf1_kern.o section redirect_xmit drop
      which is using bpf_redirect() - 2.4 Mpps
      
      and using cls_bpf with integrated actions as:
      $ tc filter add dev $dev root pref 10 \
        bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1
      performance is 2.5 Mpps
      
      To summarize:
      u32+act_bpf using clone_redirect - 2.0 Mpps
      u32+act_bpf using redirect - 2.4 Mpps
      cls_bpf using redirect - 2.5 Mpps
      
      For comparison linux bridge in this setup is doing 2.1 Mpps
      and ixgbe rx + drop in ip_rcv - 7.8 Mpps
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      27b29f63
  3. 07 Jun, 2015 1 commit
    • Alexei Starovoitov's avatar
      bpf: make programs see skb->data == L2 for ingress and egress · 3431205e
      Alexei Starovoitov authored
      eBPF programs attached to ingress and egress qdiscs see inconsistent skb->data.
      For ingress L2 header is already pulled, whereas for egress it's present.
      This is known to program writers which are currently forced to use
      BPF_LL_OFF workaround.
      Since programs don't change skb internal pointers it is safe to do
      pull/push right around invocation of the program and earlier taps and
      later pt->func() will not be affected.
      Multiple taps via packet_rcv(), tpacket_rcv() are doing the same trick
      around run_filter/BPF_PROG_RUN even if skb_shared.
      
      This fix finally allows programs to use optimized LD_ABS/IND instructions
      without BPF_LL_OFF for higher performance.
      tc ingress + cls_bpf + samples/bpf/tcbpf1_kern.o
             w/o JIT   w/JIT
      before  20.5     23.6 Mpps
      after   21.8     26.6 Mpps
      
      Old programs with BPF_LL_OFF will still work as-is.
      
      We can now undo most of the earlier workaround commit:
      a166151c ("bpf: fix bpf helpers to use skb->mac_header relative offsets")
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3431205e
  4. 16 Apr, 2015 1 commit
    • Alexei Starovoitov's avatar
      bpf: fix bpf helpers to use skb->mac_header relative offsets · a166151c
      Alexei Starovoitov authored
      For the short-term solution, lets fix bpf helper functions to use
      skb->mac_header relative offsets instead of skb->data in order to
      get the same eBPF programs with cls_bpf and act_bpf work on ingress
      and egress qdisc path. We need to ensure that mac_header is set
      before calling into programs. This is effectively the first option
      from below referenced discussion.
      
      More long term solution for LD_ABS|LD_IND instructions will be more
      intrusive but also more beneficial than this, and implemented later
      as it's too risky at this point in time.
      
      I.e., we plan to look into the option of moving skb_pull() out of
      eth_type_trans() and into netif_receive_skb() as has been suggested
      as second option. Meanwhile, this solution ensures ingress can be
      used with eBPF, too, and that we won't run into ABI troubles later.
      For dealing with negative offsets inside eBPF helper functions,
      we've implemented bpf_skb_clone_unwritable() to test for unwriteable
      headers.
      
      Reference: http://thread.gmane.org/gmane.linux.network/359129/focus=359694
      Fixes: 608cd71a ("tc: bpf: generalize pedit action")
      Fixes: 91bc4822 ("tc: bpf: add checksum helpers")
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a166151c
  5. 06 Apr, 2015 1 commit
    • Alexei Starovoitov's avatar
      tc: bpf: add checksum helpers · 91bc4822
      Alexei Starovoitov authored
      Commit 608cd71a ("tc: bpf: generalize pedit action") has added the
      possibility to mangle packet data to BPF programs in the tc pipeline.
      This patch adds two helpers bpf_l3_csum_replace() and bpf_l4_csum_replace()
      for fixing up the protocol checksums after the packet mangling.
      
      It also adds 'flags' argument to bpf_skb_store_bytes() helper to avoid
      unnecessary checksum recomputations when BPF programs adjusting l3/l4
      checksums and documents all three helpers in uapi header.
      
      Moreover, a sample program is added to show how BPF programs can make use
      of the mangle and csum helpers.
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      91bc4822