12-24-2014 09:50 AM
On an ERLite-3 with EdgeOS 1.6.0, I can see:
ubnt@ubnt:~$ /sbin/sysctl net.netfilter.nf_conntrack_tcp_timeout_established
net.netfilter.nf_conntrack_tcp_timeout_established = 7200
I think this value is too low. Linux sets its TCP keepalive time to 7200s (see also http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/), meaning that it will only _start_ sending keepalive packets after 7200s of idle time, which is exactly when the EdgeRouter drops the connection from its conntrack table. In practice, I can actually see connections timing out because of that (i.e. the client thinks it’s connected, but the server doesn’t know about the connection anymore).
Quoting RFC 5382, Section 5:
REQ-5: If a NAT cannot determine whether the endpoints of a TCP connection are active, it MAY abandon the session if it has been idle for some time. In such cases, the value of the "established connection idle-timeout" MUST NOT be less than 2 hours 4 minutes.
2 hours and 4 minutes translates to 7440 seconds, so I’d like to ask you to increase the default value from 7200 to 7440.
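The arithmetic and the comparison against the keepalive time can be sketched in shell. This is only an illustration: the helper name check_conntrack_timeout is hypothetical, the 7200 figure is the Linux keepalive default mentioned above, and treating "equal" as too low is an assumption based on the race described in this post.

```shell
#!/bin/sh
# RFC 5382 REQ-5 minimum idle timeout: 2 hours 4 minutes, in seconds.
rfc_min=$(( 2 * 3600 + 4 * 60 ))
echo "RFC 5382 minimum: ${rfc_min}s"

# Hypothetical helper: report whether a conntrack timeout leaves room for
# the first keepalive probe, which is sent after keepalive_time idle seconds.
# Equal values are treated as too low (assumption: the entry may expire
# just as the first probe goes out).
check_conntrack_timeout() {
    conntrack_timeout=$1
    keepalive_time=$2
    if [ "$conntrack_timeout" -le "$keepalive_time" ]; then
        echo "too low"
    else
        echo "ok"
    fi
}

check_conntrack_timeout 7200 7200
check_conntrack_timeout "$rfc_min" 7200
```

With the values from this post, the first call reports the current default as too low and the second reports the proposed 7440 as ok.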
05-05-2015 12:12 AM - edited 05-05-2015 12:13 AM
We're experiencing some odd behavior with Google Drive after updating to 1.6: the session times out and users end up with "Trying to connect...". Could this be related? Is there any way to change the TCP timeout value in 1.6? Any help appreciated!
05-05-2015 12:33 AM
It sounds unlikely that this setting and your timeout problems are related. If anything, increasing the value makes timeouts less likely, not more frequent.
You can SSH into your router and use “sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_established=7200” to restore the old value.
05-05-2015 12:37 AM - edited 05-05-2015 12:39 AM
I meant the other way around: could the low value cause the timeout problems? I ended up running the command from the CLI, but the GUI now reads "Fatal error: Unable to load configuration". I'm still able to access the router via SSH, but I can't reboot it now with users depending on it.
05-05-2015 12:45 AM
I don’t think the low value causes problems in this case either. I trust Google Drive will enable TCP keepalive timeouts with a much lower delay, so that it doesn’t rely on various routers having their settings correct. If you’re on Linux, netstat -anto will list the active timers for all TCP connections.
08-21-2015 12:56 AM
Sorry for bumping this old thread, but this post helped me solve a very annoying problem I had with a certain application.
I recently replaced my old firewall with an EdgeRouter PoE (which is a great little device). There were no problems setting up the firewall, and all services except one have been running without any trouble!
The application is a web-based client-server application. Users run the application for about 8 hours per day and the connection is established at all times.
I have been monitoring the connection from the client to the server with ping, traceroute and Wireshark, and I have not seen anything that could have caused the connection problems we encountered.
The problem occurred seemingly at random, and it was hard to find a pattern. It probably looked random because the clients started the application at different times.
I looked into the conntrack_tcp_timeout_established parameter and saw that the timeout was set to 7440 seconds. This made me investigate further because the value seemed so low. I read up on the kernel documentation and found that the default value for this parameter is 432000 seconds (5 days).
I changed this value with the following command and then saved the configuration.
set system conntrack timeout tcp established 432000
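For anyone following along, the whole sequence in the EdgeOS CLI would look roughly like the sketch below. It assumes a standard configure session over SSH; the value is simply the one from this post and may not suit routers that track many connections.

```
configure
set system conntrack timeout tcp established 432000
commit
save
exit
```

Afterwards, the sysctl query from the first post in this thread (/sbin/sysctl net.netfilter.nf_conntrack_tcp_timeout_established) should show the new value.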
I have not received any more complaints since changing this parameter, and I hope it will continue to work. We have fairly few connections, so the 5-day timeout is OK for us.