Still using IPTOS_TOS() in kernel? Really???


 



I'm looking at net/ipv4/ip_sockglue.c:

	case IP_TOS:	/* This sets both TOS and Precedence */
		if (sk->sk_type == SOCK_STREAM) {
			val &= ~3;
			val |= inet->tos & 3;
		}
		if (inet->tos != val) {
			inet->tos = val;
			sk->sk_priority = rt_tos2priority(val);
			sk_dst_reset(sk);
		}
		break;


and include/net/route.h:

	extern const __u8 ip_tos2prio[16];

	static inline char rt_tos2priority(u8 tos)
	{
		return ip_tos2prio[IPTOS_TOS(tos)>>1];
	}

and finally net/ipv4/route.c:

	#define ECN_OR_COST(class)	TC_PRIO_##class

	const __u8 ip_tos2prio[16] = {
		TC_PRIO_BESTEFFORT,
		ECN_OR_COST(FILLER),
		TC_PRIO_BESTEFFORT,
		ECN_OR_COST(BESTEFFORT),
		TC_PRIO_BULK,
		ECN_OR_COST(BULK),
		TC_PRIO_BULK,
		ECN_OR_COST(BULK),
		TC_PRIO_INTERACTIVE,
		ECN_OR_COST(INTERACTIVE),
		TC_PRIO_INTERACTIVE,
		ECN_OR_COST(INTERACTIVE),
		TC_PRIO_INTERACTIVE_BULK,
		ECN_OR_COST(INTERACTIVE_BULK),
		TC_PRIO_INTERACTIVE_BULK,
		ECN_OR_COST(INTERACTIVE_BULK)
	};



and it's slowly dawning on me that we're using an interpretation of the IP_TOS values (and of the ip.ip_tos field) that has been deprecated since 1998!  Quoting RFC 2474:

3.  Differentiated Services Field Definition

   A replacement header field, called the DS field, is defined, which is
   intended to supersede the existing definitions of the IPv4 TOS octet
   [RFC791] and the IPv6 Traffic Class octet [IPv6].



Seems pretty clear, right?  DSCP is the new testament, here to replace the old testament... although if you look closely, precedence values such as IPTOS_PREC_ROUTINE look a lot like IPTOS_CLASS_CS0, etc., so some backward compatibility was maintained.  (See http://sourceware.org/bugzilla/show_bug.cgi?id=11027 and http://sourceware.org/bugzilla/show_bug.cgi?id=10789 if your glibc doesn't yet include the IPTOS_CLASS_CSn and IPTOS_DSCP_AFxx values.)

And indeed, only routers seem to pay attention to the bits in the 0x1c space, i.e. the bits between the upper 3 bits, which still mean precedence, and the lower 2 bits, which now carry Explicit Congestion Notification (ECN).
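
To make the mismatch concrete, here is a small userspace sketch of the lookup quoted above. It is not kernel code: the mask and the priority numbers are local copies that I believe mirror IPTOS_TOS_MASK in <netinet/ip.h> and the TC_PRIO_* constants in <linux/pkt_sched.h>, and the names tos_to_priority(), dscp_of(), ecn_of() and precedence_of() are made up for illustration.

```c
#include <stdint.h>

/* Local copies so the sketch stands alone (assumed to mirror
 * IPTOS_TOS_MASK and the TC_PRIO_* values in the kernel headers). */
#define TOS_MASK_RFC1349 0x1e

enum {
	PRIO_BESTEFFORT       = 0,
	PRIO_FILLER           = 1,
	PRIO_BULK             = 2,
	PRIO_INTERACTIVE_BULK = 4,
	PRIO_INTERACTIVE      = 6
};

/* Same layout as ip_tos2prio[] above, with ECN_OR_COST() expanded. */
static const uint8_t tos2prio[16] = {
	PRIO_BESTEFFORT,       PRIO_FILLER,
	PRIO_BESTEFFORT,       PRIO_BESTEFFORT,
	PRIO_BULK,             PRIO_BULK,
	PRIO_BULK,             PRIO_BULK,
	PRIO_INTERACTIVE,      PRIO_INTERACTIVE,
	PRIO_INTERACTIVE,      PRIO_INTERACTIVE,
	PRIO_INTERACTIVE_BULK, PRIO_INTERACTIVE_BULK,
	PRIO_INTERACTIVE_BULK, PRIO_INTERACTIVE_BULK
};

/* Userspace equivalent of rt_tos2priority(): only bits 4..1 are used,
 * so the top three (precedence / class selector) bits are ignored. */
static uint8_t tos_to_priority(uint8_t tos)
{
	return tos2prio[(tos & TOS_MASK_RFC1349) >> 1];
}

/* RFC 2474 view of the same octet: 6-bit DSCP, then 2 bits now used by ECN. */
static uint8_t dscp_of(uint8_t tos)       { return tos >> 2; }
static uint8_t ecn_of(uint8_t tos)        { return tos & 0x03; }
static uint8_t precedence_of(uint8_t tos) { return tos >> 5; }
```

With this table, CS6 (0xc0) lands on the same priority as CS0 (0x00), because the class selector bits never reach the index, while EF (0xb8) comes out as INTERACTIVE_BULK and AF41 (0x88) as BULK.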

Assuming that whatever the local host does on output won't delay the packet significantly, because we're connected to some fast medium (Fast Ethernet, etc.), then what we do locally shouldn't matter... unless of course we're using 802.1p tagging, in which case we can seriously mess up what happens next.

So how is it that no one has noticed this issue yet?  And given that Linux is used in a fair number of commercial embedded real-time boxes (like satellite and IPTV set-top boxes), how are they not impacted by this?

Assuming my crusade to get various common apps and services (wget, TB, FF, Sendmail, Cyrus, ProFTPd, etc.) to use DSCP/CS marking succeeds (very few apps currently use DSCP or precedence marking), kernels with the proper default behavior will need to start shipping, right?  I.e. out-of-the-box kernels should handle such apps without further configuration, such as needing to have the DSCP iptables module installed.  They should "just work".
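
For reference, this is roughly what such marking looks like from the application side on an IPv4 socket. It is only a sketch: socket_set_dscp() and socket_get_tos() are hypothetical helper names, and DSCP_EF is defined locally (46, i.e. 0xb8 in the octet) rather than taken from a libc header, since not every glibc ships the DSCP constants yet.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Expedited Forwarding code point, defined locally for the example. */
#define DSCP_EF 46

/* Place a 6-bit DSCP in the upper bits of the TOS octet via IP_TOS.
 * The two low (ECN) bits are left at zero.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
static int socket_set_dscp(int fd, unsigned dscp)
{
	int tos = (int)((dscp & 0x3f) << 2);

	return setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));
}

/* Read back the TOS octet currently set on the socket, or -1 on error. */
static int socket_get_tos(int fd)
{
	int tos = 0;
	socklen_t len = sizeof(tos);

	if (getsockopt(fd, IPPROTO_IP, IP_TOS, &tos, &len) < 0)
		return -1;
	return tos;
}
```

An app would call socket_set_dscp(fd, DSCP_EF) once after socket(); it is this value that then goes through the IP_TOS case in ip_sockglue.c quoted at the top.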

Thanks,

-Philip

