Configuring chains

From nftables wiki
Revision as of 12:50, 15 February 2021 by Fmyhr (talk | contribs) (→‎Base chain priority: Improved descriptions of NF_IP_PRI_CONNTRACK, NF_IP_PRI_CONNTRACK_CONFIRM)

As in iptables, with nftables you attach your rules to chains. Unlike in iptables, there are no predefined chains like INPUT, OUTPUT, etc. Instead, to filter packets at a particular processing step, you explicitly create a base chain with a name of your choosing and attach it to the appropriate Netfilter hook. This allows very flexible configurations without slowing Netfilter down with built-in chains that your ruleset does not need.

Adding base chains

The syntax to add a base chain is:

% nft add chain [<family>] <table-name> <chain-name> { type <type> hook <hook> priority <value> \; [policy <policy> \;] }

Base chains are those that are registered into the Netfilter hooks, i.e. these chains see packets flowing through your Linux TCP/IP stack.

The following example shows how to add a new base chain input to the foo table (which must have been previously created):

% nft 'add chain ip foo input { type filter hook input priority 0 ; }'

Important: nft re-uses special characters, such as curly braces and the semicolon. If you are running these commands from a shell such as bash, all the special characters need to be escaped. The simplest way to prevent the shell from attempting to parse the nft syntax is to quote everything within single quotes. Alternatively, you can run the command

% nft -i

and run nft in interactive mode.
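As a small illustration of the quoting options described above, the sketch below first creates the foo table and then shows the same chain definition written twice: once with the semicolon escaped by a backslash, once with the whole command in single quotes. The two add chain lines are alternatives, not consecutive steps.

% nft add table ip foo
% nft add chain ip foo input { type filter hook input priority 0 \; }
% nft 'add chain ip foo input { type filter hook input priority 0 ; }'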

The add chain command registers the input chain, which is attached to the input hook so that it sees packets addressed to local processes.

The priority is important since it determines the ordering of the chains, thus, if you have several chains in the input hook, you can decide which one sees packets before another. For example, input chains with priorities -12, -1, 0, 10 would be consulted exactly in that order. It's possible to give two base chains the same priority, but there is no guaranteed evaluation order of base chains with identical priority that are attached to the same hook location.
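A minimal sketch of this ordering, assuming the foo table and its input chain at priority 0 already exist. The chain names early and late are made up for illustration:

% nft 'add chain ip foo early { type filter hook input priority -10 ; }'
% nft 'add chain ip foo late { type filter hook input priority 10 ; }'
% nft list table ip foo

Packets addressed to local processes would then traverse early, input and late, in that order, regardless of the order in which the chains were created.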

If you want to use nftables to filter traffic for desktop Linux computers, i.e. a computer which does not forward traffic, you can also register the output chain:

% nft 'add chain ip foo output { type filter hook output priority 0 ; }'

Now you are ready to filter incoming (directed to local processes) and outgoing (generated by local processes) traffic.

Important note: If you omit the chain configuration enclosed in curly braces, you create a non-base chain that will not see any packets on its own (similar to iptables -N chain-name).

Since nftables 0.5, you can also specify the default policy for base chains as in iptables:

% nft 'add chain ip foo output { type filter hook output priority 0 ; policy accept; }'

As in iptables, the two possible default policies are accept and drop.
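For instance, a drop policy can be set when a chain is created. This is only a sketch, assuming the foo table exists and the input chain has not been created yet. With no accept rules in the chain, it would discard all IPv4 traffic addressed to local processes, so it should not be tried over a remote session:

% nft 'add chain ip foo input { type filter hook input priority 0 ; policy drop ; }'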

When adding a chain on ingress hook, it is mandatory to specify the device where the chain will be attached:

% nft 'add chain netdev foo dev0filter { type filter hook ingress device eth0 priority 0 ; }'
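The same chain can also be written declaratively, which makes the dependency on the netdev table explicit. A sketch, assuming an interface named eth0 and a file that is loaded with nft -f:

table netdev foo {
        chain dev0filter {
                # The device is part of the base chain definition for the ingress hook
                type filter hook ingress device eth0 priority 0;
        }
}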

Base chain types

The possible chain types are:

  • filter, which is used to filter packets. This is supported by the arp, bridge, ip, ip6, inet and netdev table families.
  • route, which is used to reroute packets if any relevant IP header field or the packet mark is modified. If you are familiar with iptables, this chain type provides equivalent semantics to the mangle table but only for the output hook (for other hooks use type filter instead). This is supported by the ip, ip6 and inet table families.
  • nat, which is used to perform Network Address Translation (NAT). Only the first packet of a given flow hits this chain; subsequent packets bypass it. Therefore, never use this chain for filtering. The nat chain type is supported by the ip, ip6 and inet table families.
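As a rough illustration of the non-filter types, the commands below add one nat chain and one route chain to the foo table used earlier. The chain names and the priorities (100 for source NAT, -150 for mangle-like rerouting) are choices made for this sketch, following the conventional values in the priority table on this page:

% nft 'add chain ip foo postrouting { type nat hook postrouting priority 100 ; }'
% nft 'add chain ip foo reroute { type route hook output priority -150 ; }'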

Base chain hooks

The possible hooks that you can use when you configure your base chain are:

  • ingress (only in netdev family since Linux kernel 4.2, and inet family since Linux kernel 5.10): sees packets immediately after they are passed up from the NIC driver, before even prerouting. This makes it an alternative to tc.
  • prerouting: sees all incoming packets, before any routing decision has been made. Packets may be addressed to the local or remote systems.
  • input: sees incoming packets that are addressed to and have now been routed to the local system and processes running there.
  • forward: sees incoming packets that are not addressed to the local system.
  • output: sees packets that originated from processes in the local machine.
  • postrouting: sees all packets after routing, just before they leave the local system.
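To make the hook names concrete, here is a sketch of a table that registers one filter chain on each of the five IP-layer hooks. The table name hooks_demo is arbitrary and the chains are simply named after their hooks. They contain no rules, so with the default accept policy they do not change any traffic:

table inet hooks_demo {
        chain prerouting  { type filter hook prerouting  priority 0; }
        chain input       { type filter hook input       priority 0; }
        chain forward     { type filter hook forward     priority 0; }
        chain output      { type filter hook output      priority 0; }
        chain postrouting { type filter hook postrouting priority 0; }
}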

Base chain priority

Within a given hook, Netfilter performs operations in order of increasing numerical priority. Each nftables base chain is assigned a priority that defines its ordering among other base chains and Netfilter internal operations at the same hook. For example, a chain on the prerouting hook with priority -300 will be placed before connection tracking operations.

The following table shows the Netfilter priority values; check the nft manpage for reference.

nftables Families          | Hooks              | nft Keyword | Value   | Netfilter Internal Priority | Description
---------------------------+--------------------+-------------+---------+-----------------------------+------------
                           | prerouting         |             | -450    | NF_IP_PRI_RAW_BEFORE_DEFRAG |
inet, ip, ip6              | prerouting         |             | -400    | NF_IP_PRI_CONNTRACK_DEFRAG  | Packet defragmentation / datagram reassembly
inet, ip, ip6              | all                | raw         | -300    | NF_IP_PRI_RAW               | Traditional priority of the raw table placed before connection tracking operation
                           |                    |             | -225    | NF_IP_PRI_SELINUX_FIRST     | SELinux operations
inet, ip, ip6              | prerouting, output |             | -200    | NF_IP_PRI_CONNTRACK         | Connection tracking processes run early in prerouting and output hooks to associate packets with tracked connections.
inet, ip, ip6              | all                | mangle      | -150    | NF_IP_PRI_MANGLE            | Mangle operation
inet, ip, ip6              | prerouting         | dstnat      | -100    | NF_IP_PRI_NAT_DST           | Destination NAT
inet, ip, ip6, arp, netdev | all                | filter      | 0       | NF_IP_PRI_FILTER            | Filtering operation, the filter table
inet, ip, ip6              | all                | security    | 50      | NF_IP_PRI_SECURITY          | Place of security table, where secmark can be set for example
inet, ip, ip6              | postrouting        | srcnat      | 100     | NF_IP_PRI_NAT_SRC           | Source NAT
                           | postrouting        |             | 225     | NF_IP_PRI_SELINUX_LAST      | SELinux at packet exit
inet, ip, ip6              | postrouting        |             | 300     | NF_IP_PRI_CONNTRACK_HELPER  | Connection tracking helpers, which identify expected and related packets.
inet, ip, ip6              | input, postrouting |             | INT_MAX | NF_IP_PRI_CONNTRACK_CONFIRM | Connection tracking adds new tracked connections at final step in input & postrouting hooks.
---------------------------+--------------------+-------------+---------+-----------------------------+------------
bridge                     | prerouting         | dstnat      | -300    | NF_BR_PRI_NAT_DST_BRIDGED   |
bridge                     | all                | filter      | -200    | NF_BR_PRI_FILTER_BRIDGED    |
bridge                     |                    |             | 0       | NF_BR_PRI_BRNF              |
bridge                     | output             | out         | 100     | NF_BR_PRI_NAT_DST_OTHER     |
bridge                     |                    |             | 200     | NF_BR_PRI_FILTER_OTHER      |
bridge                     | postrouting        | srcnat      | 300     | NF_BR_PRI_NAT_SRC           |

Starting with nftables 0.9.6 you may use keywords instead of numbers to configure the chain priority. (Note that the same keyword maps to different numerical priorities in the bridge family vs. the other families.) It's possible to specify keyword priorities even in family/hook combinations where they don't make logical sense. Recall that the relative numerical ordering of priorities within a given hook is all that matters as far as Netfilter is concerned. (Keep in mind that this relative ordering includes packet defragmentation, connection tracking and other Netfilter operations as well as your nftables base chains.) You can also specify priority as an integral offset from a keyword, e.g. mangle - 5 is equivalent to numerical priority -155.
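A sketch of keyword priorities in a ruleset file, assuming nftables 0.9.6 or later. The table and chain names are illustrative only:

table inet foo {
        chain input {
                # filter corresponds to numerical priority 0 in the inet family
                type filter hook input priority filter; policy accept;
        }

        chain premangle {
                # mangle is -150, so this chain is registered at -155
                type filter hook prerouting priority mangle - 5; policy accept;
        }
}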

NOTE: If a packet is accepted and there is another chain, bearing the same hook type and with a later priority, then the packet will subsequently traverse this other chain. Hence, an accept verdict - be it by way of a rule or the default chain policy - isn't necessarily final. However, the same is not true of packets that are subjected to a drop verdict. Instead, drops take immediate effect, with no further rules or chains being evaluated.

The following ruleset demonstrates this potentially surprising distinction in behaviour:

table inet filter {
        # This chain is evaluated first due to priority
        chain services {
                type filter hook input priority 0; policy accept;

                # If matched, this rule will prevent any further evaluation
                tcp dport http drop

                # If matched, and despite the accept verdict, the packet proceeds to enter the chain below
                tcp dport ssh accept

                # Likewise for any packets that get this far and hit the default policy
        }

        # This chain is evaluated last due to priority
        chain input {
                type filter hook input priority 1; policy drop;
                # All ingress packets end up being dropped here!
        }
}

If the priority of the 'input' chain above were to be changed to -1, the only difference would be that no packets have the opportunity to enter the 'services' chain. Either way, this ruleset will result in all ingress packets being dropped.

In summary, packets will traverse all of the chains within the scope of a given hook until they are either dropped or no more base chains exist. An accept verdict is only guaranteed to be final in the case that there is no later chain bearing the same type of hook as the chain that the packet originally entered.
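One way to avoid this surprise is to keep a single base chain per hook and move the service rules into a non-base chain that is reached by a jump. The following is a sketch of that rearrangement, not the only possible layout; it reuses the chain names from the ruleset above:

table inet filter {
        # Non-base chain: it only sees packets sent here by the jump below
        chain services {
                tcp dport http drop
                tcp dport ssh accept
        }

        chain input {
                type filter hook input priority 0; policy drop;

                jump services
                # Packets that match no rule in services return here and hit the drop policy
        }
}

In this layout the accept verdict for ssh is final for the input hook, because no later base chain is registered there.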

Netfilter's hook execution mechanism is described in more detail in Pablo's paper on connection tracking.

Base chain policy

This is the default verdict that will be applied to packets reaching the end of the chain (i.e., when there are no more rules to evaluate them against).

Currently there are 2 policies: accept (default) or drop.

  • The accept verdict means that the packet will keep traversing the network stack (default).
  • The drop verdict means that the packet is discarded if the packet reaches the end of the base chain.

NOTE: If no policy is explicitly selected, the default policy accept will be used.

Adding non-base chains

You can also create non-base chains, analogous to iptables user-defined chains:

% nft add chain ip <table_name> <chain_name>

The chain name is an arbitrary string, with arbitrary case.

Note that no hook keyword is included when adding a non-base chain. Because it is not attached to a Netfilter hook, a non-base chain by itself does not see any traffic. However, one or more base chains can include rules that jump or goto this chain; after that, the non-base chain processes packets in exactly the same way as the calling base chain. It can be very useful to arrange your ruleset into a tree of base and non-base chains by using the jump and/or goto actions. (Though we're getting a bit ahead of ourselves, nftables vmaps provide an even more powerful way to construct highly-efficient branched rulesets.)
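A short sketch of this arrangement, assuming the foo table and its input base chain from earlier. The chain name tcp_checks is made up for illustration:

% nft add chain ip foo tcp_checks
% nft add rule ip foo input ip protocol tcp jump tcp_checks
% nft add rule ip foo tcp_checks tcp dport 22 accept

With jump, evaluation returns to the calling chain if no rule in tcp_checks issues a final verdict. With goto, evaluation does not return to the calling chain.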

Deleting chains

You can delete a chain as follows:

% nft delete chain ip foo input

The only condition is that the chain you want to delete needs to be empty, otherwise the kernel will complain that the chain is still in use.

% nft delete chain ip foo input
<cmdline>:1:1-28: Error: Could not delete chain: Device or resource busy
delete chain ip foo input
^^^^^^^^^^^^^^^^^^^^^^^^^

You will have to flush the ruleset in that chain before you can remove the chain.
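Put together, the sequence is a flush followed by the delete, shown here for the input chain of the foo table:

% nft flush chain ip foo input
% nft delete chain ip foo input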

Flushing chains

To flush (delete all of the rules in) the chain input of the foo table:

% nft flush chain ip foo input

Example configuration: Filtering traffic for your standalone computer

You can create a table with two base chains to define rules that filter traffic coming to and leaving from your computer, assuming IPv4 connectivity:

% nft add table ip filter
% nft 'add chain ip filter input { type filter hook input priority 0 ; }'
% nft 'add chain ip filter output { type filter hook output priority 0 ; }'

Now you can start attaching rules to these two base chains. Note that you don't need the forward chain in this case, since this example assumes that you're configuring nftables to filter traffic for a standalone computer that doesn't act as a router.
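The same configuration can also be kept in a file and loaded in one step. This is a sketch; the file name standalone.nft is hypothetical:

table ip filter {
        chain input {
                type filter hook input priority 0;
        }

        chain output {
                type filter hook output priority 0;
        }
}

Loading the file and listing the result:

% nft -f standalone.nft
% nft list table ip filter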