    bpf: add meta pointer for direct access · de8f3a83
    Daniel Borkmann authored
    This work enables generic transfer of metadata from XDP into skb. The
    basic idea is that we can make use of the fact that the resulting skb
    must be linear and already comes with a larger headroom for supporting
    bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
    on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
    for adjusting a new pointer called xdp->data_meta. Thus, the packet has
    a flexible and programmable room for metadata, followed by the actual
    packet data. struct xdp_buff is therefore laid out such that we first point
    to data_hard_start, then data_meta directly prepended to data, followed
    by data_end marking the end of the packet. bpf_xdp_adjust_head() takes into
    account whether we have metadata already prepended and, if so, memmove()s
    this along with the given offset provided there's enough room.
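
    The pointer layout and the memmove() behaviour described above can be
    modelled in plain userspace C. This is only a sketch under stated
    assumptions, not the kernel implementation: struct xdp_buff_model,
    model_adjust_head() and MODEL_ETH_HLEN are hypothetical names, and the
    rule that at least an Ethernet header's worth of packet must remain is
    an assumption that the text above does not spell out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_ETH_HLEN 14 /* assumed minimum packet length kept by the model */

/* Hypothetical userspace stand-in for struct xdp_buff. Invariant:
 * data_hard_start <= data_meta <= data <= data_end. */
struct xdp_buff_model {
    unsigned char *data_hard_start; /* start of the headroom */
    unsigned char *data_meta;       /* start of the metadata, if any */
    unsigned char *data;            /* start of the packet */
    unsigned char *data_end;        /* end of the packet */
};

/* Move the packet start by 'offset' bytes (negative grows the packet into
 * the headroom) and carry any prepended metadata along with it. */
static int model_adjust_head(struct xdp_buff_model *xdp, int offset)
{
    ptrdiff_t metalen  = xdp->data - xdp->data_meta;
    ptrdiff_t new_data = (xdp->data - xdp->data_hard_start) + offset;
    ptrdiff_t max_data = (xdp->data_end - xdp->data_hard_start) - MODEL_ETH_HLEN;

    /* The new packet start must leave room for the metadata in front of it
     * and must not shrink the packet below MODEL_ETH_HLEN bytes. */
    if (new_data < metalen || new_data > max_data)
        return -EINVAL;

    /* Metadata stays directly in front of data, so it moves by 'offset' too. */
    if (metalen)
        memmove(xdp->data_meta + offset, xdp->data_meta, (size_t)metalen);
    xdp->data_meta += offset;
    xdp->data = xdp->data_hard_start + new_data;
    return 0;
}
```

    With 4 bytes of metadata in front of the packet, a call such as
    model_adjust_head(&xdp, -8) moves both data and data_meta back by 8
    bytes and leaves the metadata contents intact.
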

    xdp->data_meta is optional and programs are not required to use it. The
    rationale is that when we process the packet in XDP (e.g. as a DoS filter),
    we can push further metadata along with it for the XDP_PASS case, and
    give the guarantee that a clsact ingress BPF program on the same device
    can pick this up for further post-processing. Since we work with skb
    there, we can also set skb->mark, skb->priority or other skb metadata
    out of BPF, thus having this scratch space generic and programmable
    allows for more flexibility than defining a direct 1:1 transfer of
    potentially new XDP members into skb (it's also more efficient as we
    don't need to initialize/handle each of such new members). The facility
    also works together with GRO aggregation. The scratch space at the head
    of the packet can be a multiple of 4 bytes, up to 32 bytes in size. Drivers not
    yet supporting xdp->data_meta can simply be set up with xdp->data_meta
    as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
    such that the subsequent match against xdp->data for later access is
    guaranteed to fail.
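
    The size constraints and the data + 1 convention for drivers without
    support can be sketched as a small userspace model. The names
    (struct xdp_buff_model, model_adjust_meta(), MODEL_META_MAX) and the
    specific error codes are assumptions made for illustration; the text
    above only says that the helper detects the situation and bails out.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_META_MAX 32 /* metadata may be at most 32 bytes */

/* Hypothetical userspace stand-in for struct xdp_buff. */
struct xdp_buff_model {
    unsigned char *data_hard_start;
    unsigned char *data_meta;
    unsigned char *data;
    unsigned char *data_end;
};

/* A driver without metadata support sets data_meta to data + 1, which can
 * never be a valid metadata start since metadata sits in front of data. */
static int model_meta_unsupported(const struct xdp_buff_model *xdp)
{
    return xdp->data_meta > xdp->data;
}

/* Move data_meta by 'offset' bytes (negative grows the metadata area). */
static int model_adjust_meta(struct xdp_buff_model *xdp, int offset)
{
    ptrdiff_t new_meta, data_off, metalen;

    if (model_meta_unsupported(xdp))
        return -ENOTSUP; /* assumed error code */

    new_meta = (xdp->data_meta - xdp->data_hard_start) + offset;
    data_off = xdp->data - xdp->data_hard_start;

    /* Metadata must stay inside the headroom and in front of data. */
    if (new_meta < 0 || new_meta > data_off)
        return -EINVAL; /* assumed error code */

    /* Resulting size must be a multiple of 4 bytes and at most 32 bytes. */
    metalen = data_off - new_meta;
    if ((metalen % 4) != 0 || metalen > MODEL_META_MAX)
        return -EACCES; /* assumed error code */

    xdp->data_meta = xdp->data_hard_start + new_meta;
    return 0;
}
```

    In this model, reserving 8 bytes with model_adjust_meta(&xdp, -8)
    succeeds, whereas a request that would yield 10 or 36 bytes of
    metadata is rejected and leaves data_meta unchanged.
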

    The verifier treats xdp->data_meta/xdp->data the same way as we treat
    xdp->data/xdp->data_end pointer comparisons. The requirement for doing
    the compare against xdp->data is that it hasn't been modified from its
    original address we got from ctx access. It may have a range marking
    already from prior successful xdp->data/xdp->data_end pointer comparisons.
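
    The comparison a program is expected to make before touching the
    metadata can be illustrated with a userspace model of the idiom. This
    is not a loadable BPF program: struct xdp_ptrs_model, struct meta_model
    and model_read_mark() are hypothetical, and the single 32-bit mark field
    is an assumed metadata layout. In a real program the same comparison
    would be written against the ctx pointers and checked by the verifier.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the pointers a program loads from its ctx. */
struct xdp_ptrs_model {
    const unsigned char *data_meta;
    const unsigned char *data;
    const unsigned char *data_end;
};

/* Assumed metadata layout agreed between the producer and the consumer. */
struct meta_model {
    uint32_t mark;
};

/* Read the mark only if the whole struct fits between data_meta and data.
 * The check mirrors the data/data_end idiom: compare data_meta plus the
 * access size against the unmodified data pointer. */
static int model_read_mark(const struct xdp_ptrs_model *ctx, uint32_t *mark)
{
    struct meta_model m;

    if (ctx->data_meta + sizeof(m) > ctx->data)
        return -1; /* no metadata, too little metadata, or unsupported driver */

    memcpy(&m, ctx->data_meta, sizeof(m));
    *mark = m.mark;
    return 0;
}
```

    The check fails both when no metadata was reserved (data_meta equal to
    data) and when a driver without support has set data_meta to data + 1,
    which matches the guarantee described in the commit message.
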

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>