Mangling packet headers

From nftables wiki
Latest revision as of 22:27, 3 May 2021

Since nft v0.6, nftables supports packet header mangling, including stateless NAT.

Note: if you mangle packet fields that are covered by the layer 4 checksum pseudoheader (for example, the IP source or destination address), you need Linux kernel 4.10 or later.

To mangle a packet header field, create a rule that matches the packet and then sets a new value for the desired field:

% nft add table raw
% nft add chain raw prerouting {type filter hook prerouting priority -300\;}
% nft add rule raw prerouting tcp dport 8080 tcp dport set 80

The commands above create a table named raw, a chain named prerouting (see Netfilter hooks), and a rule that rewrites the TCP destination port of matching packets from 8080 to 80.
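The same table, chain and rule can also be kept in a ruleset file and loaded in one step with nft -f. The sketch below only writes such a file; the file path and the RULESET_FILE variable are hypothetical choices for this example, and loading the file requires root.

```shell
#!/bin/sh
# Write the table, chain and rule from the commands above as one ruleset file.
# The path is a hypothetical default for this sketch; override RULESET_FILE to change it.
ruleset_file="${RULESET_FILE:-${TMPDIR:-/tmp}/mangle-dport.nft}"

cat > "$ruleset_file" <<'EOF'
table ip raw {
    chain prerouting {
        type filter hook prerouting priority -300;
        tcp dport 8080 tcp dport set 80
    }
}
EOF

echo "wrote $ruleset_file"
# To load it (as root):  nft -f "$ruleset_file"
```

Inside a ruleset file the semicolon after the priority does not need the backslash that the shell requires on the command line.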


Mangling TCP options

Since Linux kernel 4.14 and nftables 0.9, you can clamp the TCP MSS to the Path MTU. This is convenient when your router encapsulates traffic over PPPoE, as many DSL (and some FTTH) providers do:

nft add rule ip filter forward tcp flags syn tcp option maxseg size set rt mtu

where rt mtu calculates the MTU at runtime, based on what the routing cache has observed via Path MTU Discovery (PMTUD).

Note: The TCP maximum segment size is announced through TCP options in the original SYN and the reply SYN+ACK packets. It is not negotiated: the RFC allows a different maximum segment size in each direction of the flow. Therefore, make sure you mangle the TCP options of both the original SYN and the reply SYN+ACK packets.

Note for iptables users: 'tcp option maxseg size set rt mtu' is equivalent to '-j TCPMSS --clamp-mss-to-pmtu'.

You can also set a fixed value manually. For example, PPPoE takes 8 bytes to encapsulate packets, so with an MTU of 1500 bytes the MSS is 1500 - 20 (IPv4 header) - 20 (TCP header) - 8 (PPPoE header) = 1452 bytes:

nft add rule ip filter forward tcp flags syn tcp option maxseg size set 1452
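The arithmetic above can be sketched as a small shell helper for other link types. The function name mss_for and its arguments are hypothetical; it assumes a 20-byte TCP header without options and, by default, a 20-byte IPv4 header without options.

```shell
#!/bin/sh
# mss_for MTU [ENCAP_OVERHEAD] [IP_HEADER]
# Prints the largest TCP MSS that fits in one packet:
# MTU minus encapsulation overhead, minus the IP header, minus a 20-byte TCP header.
mss_for() {
    mtu=$1
    encap=${2:-0}      # encapsulation overhead in bytes, e.g. 8 for PPPoE
    ip_hdr=${3:-20}    # 20 for IPv4 without options, 40 for the fixed IPv6 header
    echo $(( mtu - encap - ip_hdr - 20 ))
}

mss_for 1500 8       # PPPoE over IPv4, prints 1452
mss_for 1500 8 40    # PPPoE over IPv6, prints 1432
```

The IPv6 value would only be relevant for a rule in an ip6 table; the rule shown above is in the ip family and therefore applies to IPv4 traffic.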

Other supported TCP options are: window, sack-permitted, sack, timestamp and eol.

Interactions with conntrack

Keep in mind the interaction with conntrack: flows with mangled traffic must be untracked. You can do this in a single rule:

% nft add rule ip6 raw prerouting ip6 daddr fd00::1 ip6 daddr set fd00::2 notrack
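Because untracked flows carry no connection state, a stateless NAT setup usually also has to rewrite the reply direction itself. The sketch below prints a pair of rules for that; the helper name stateless_nat6_rules is hypothetical, and it assumes that the ip6 raw table and its prerouting chain already exist and that replies from the new address pass through the same host.

```shell
#!/bin/sh
# stateless_nat6_rules ORIG_ADDR NEW_ADDR
# Prints two nft commands: one rewrites the destination of incoming packets,
# the other restores the source address on replies. Both skip conntrack.
stateless_nat6_rules() {
    orig=$1
    new=$2
    echo "nft add rule ip6 raw prerouting ip6 daddr $orig ip6 daddr set $new notrack"
    echo "nft add rule ip6 raw prerouting ip6 saddr $new ip6 saddr set $orig notrack"
}

# Print the rules for the addresses used in the example above.
stateless_nat6_rules fd00::1 fd00::2
```

The helper only prints the commands so they can be reviewed first; run the printed lines as root to apply them.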

For more information about the packet headers you can mangle, see the nft(8) manpage, Matching packet headers and Quick reference-nftables in 10 minutes.