Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

@UBNT-afomins

 

Why is the RTP stream not offloaded the whole time? Why does the ASIC not process the entire RTP stream?

This issue also affects TCP streams when downloading huge files.

 

root@bgp01:/home/rack# echo 1 > /proc/cavium/stats
root@bgp01:/home/rack# cat /proc/cavium/stats

 Statistics Information
========================

RX packets:                   167625    bytes:             122748440
TX packets:                    58137    bytes:              47729316
Bypass packets:               109488    bytes:              75833042
Bad L4 checksum:                  28    bytes:                  1400

Protocol        RX packets      RX bytes                TX packets      TX bytes

ipv4            1                71               58138          47729387
ipv6            0                 0                   0                 0
pppoe           0                 0                   0                 0
vlan            167624         122748369              109459          75831571

root@bgp01:/home/rack# netstat -ni
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0         0      0      0 0             0      0      0      0 BMU
eth1       1500 0  4748130871      0 1358189 0      2843271808      0 208969      0 BMRU
eth2       1500 0   1106237      0    255 0       7616386      0      0      0 BMRU
eth3       1500 0  533078809      0 157565 0      348474892      0   2190      0 BMRU
eth4       1500 0  3487031329      0 1092680 0      5629304532      0 1793558      0 BMRU
eth5       1500 0  2611907892      0 591272 0      2622588312      0 209399      0 BMRU
eth6       1500 0  6641630248      0 1900917 0      6544738139      0 2557550      0 BMRU
eth7       1500 0   2562173      0    273 0       2645805      0      0      0 BMRU
eth1.100   1500 0  2812910578      0      0 0      1928318192      0      0      0 BMRU
eth1.401   1500 0  24907671      0    632 0      13682115      0      0      0 BMRU
eth2.10    1500 0    986250      0      0 0       6033749      0      0      0 BMRU
eth3.13    1500 0  393009567      0      0 0      212763832      0      0      0 BMRU
eth4.1     1500 0         0      0      0 0             5      0      0      0 BMRU
eth4.102   1500 0  1781577729      0  12664 0      3488925640      0      0      0 BMRU
eth4.573   1500 0  372159069      0      0 0       1512400      0      0      0 BMRU
eth5.563   1500 0  1443096583      0      0 0      1078755879      0      0      0 BMRU
eth5.571   1500 0  151279607      0      0 0      204198536      0      0      0 BMRU
eth5.572   1500 0         5      0      0 0        214693      0      0      0 BMRU
eth5.575   1500 0    633171      0      0 0      13217911      0      0      0 BMRU
eth5.577   1500 0    647726      0      0 0       1090360      0      0      0 BMRU
eth5.578   1500 0     11847      0      0 0           598      0      0      0 BMRU
eth5.710   1500 0  140073113      0      0 0      433402032      0      0      0 BMRU
eth5.711   1500 0    623173      0      0 0        857474      0      0      0 BMRU
eth6.550   1500 0  13500294      0      0 0      17765878      0      0      0 BMRU
eth6.551   1500 0  25675548      0      0 0      23584793      0      0      0 BMRU
eth6.561   1500 0     27599      0      0 0         31592      0      0      0 BMRU
eth6.576   1500 0  2421051027      0      0 0      2267556580      0      0      0 BMRU
eth6.2220  1500 0  1554866191      0      0 0      1355661644      0      0      0 BMRU
eth6.2221  1500 0  531363695      0      0 0      604626146      0      0      0 BMRU
eth6.2250  1500 0    664440      0      0 0        476503      0      0      0 BMRU
imq0      16000 0         0      0      0 0             0      0      0      0 ORU
lo        65536 0     17954      0      0 0         17954      0      0      0 LRU
tun0       1476 0   4513070      0      0 0      238217607      0      0      0 OPRU
tun1       1476 0    409223      0      0 0        387955      0      0      0 OPRU
root@bgp01:/home/rack# cat /proc/cavium/stats

 Statistics Information
========================

RX packets:                  3874630    bytes:            2688072457
TX packets:                  1719824    bytes:            1337500536
Bypass packets:              2154806    bytes:            1374649449
Bad L4 checksum:                 396    bytes:                 19072

Protocol        RX packets      RX bytes                TX packets      TX bytes

ipv4            18              6149             1719833        1337501866
ipv6            0                 0                   0                 0
pppoe           0                 0                   0                 0
vlan            3874613        2688067792             2154402        1374630531

root@bgp01:/home/rack# netstat -ni
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0         0      0      0 0             0      0      0      0 BMU
eth1       1500 0  4749457491      0 1358248 0      2844022715      0 208969      0 BMRU
eth2       1500 0   1106517      0    255 0       7617077      0      0      0 BMRU
eth3       1500 0  533261184      0 157570 0      348578193      0   2190      0 BMRU
eth4       1500 0  3487586182      0 1092740 0      5630366197      0 1793558      0 BMRU
eth5       1500 0  2612345961      0 591324 0      2623004605      0 209399      0 BMRU
eth6       1500 0  6643061422      0 1901046 0      6546335418      0 2557550      0 BMRU
eth7       1500 0   2562210      0    273 0       2645942      0      0      0 BMRU
eth1.100   1500 0  2813622285      0      0 0      1928794897      0      0      0 BMRU
eth1.401   1500 0  24952521      0    632 0      13700706      0      0      0 BMRU
eth2.10    1500 0    986527      0      0 0       6034390      0      0      0 BMRU
eth3.13    1500 0  393153003      0      0 0      212827315      0      0      0 BMRU
eth4.1     1500 0         0      0      0 0             5      0      0      0 BMRU
eth4.102   1500 0  1781837645      0  12666 0      3489544217      0      0      0 BMRU
eth4.573   1500 0  372223343      0      0 0       1512706      0      0      0 BMRU
eth5.563   1500 0  1443287913      0      0 0      1078891061      0      0      0 BMRU
eth5.571   1500 0  151287893      0      0 0      204209740      0      0      0 BMRU
eth5.572   1500 0         5      0      0 0        214735      0      0      0 BMRU
eth5.575   1500 0    633308      0      0 0      13220696      0      0      0 BMRU
eth5.577   1500 0    648117      0      0 0       1090805      0      0      0 BMRU
eth5.578   1500 0     11848      0      0 0           598      0      0      0 BMRU
eth5.710   1500 0  140146476      0      0 0      433503761      0      0      0 BMRU
eth5.711   1500 0    623488      0      0 0        857850      0      0      0 BMRU
eth6.550   1500 0  13505524      0      0 0      17770087      0      0      0 BMRU
eth6.551   1500 0  25675915      0      0 0      23585007      0      0      0 BMRU
eth6.561   1500 0     27604      0      0 0         31598      0      0      0 BMRU
eth6.576   1500 0  2421572155      0      0 0      2268045465      0      0      0 BMRU
eth6.2220  1500 0  1555098184      0      0 0      1355871479      0      0      0 BMRU
eth6.2221  1500 0  531531364      0      0 0      604914786      0      0      0 BMRU
eth6.2250  1500 0    664577      0      0 0        476600      0      0      0 BMRU
imq0      16000 0         0      0      0 0             0      0      0      0 ORU
lo        65536 0     17957      0      0 0         17957      0      0      0 LRU
tun0       1476 0   4513986      0      0 0      238254408      0      0      0 OPRU
tun1       1476 0    409309      0      0 0        387999      0      0      0 OPRU
root@bgp01:/home/rack# echo > /proc/cavium/stats

The Server is connected to eth5.710.

 

HTTP downloads are sometimes not offloaded. A download sometimes starts at 1 Gbit/s and drops to 400 Mbit/s. Sometimes the bandwidth goes back up to 1 Gbit/s during the download, but most of the time it stays nailed to around 400 Mbit/s (I guess full CPU usage). Sometimes it starts directly at 400 Mbit/s and the bandwidth neither increases nor decreases.

 

What I also see: on BGP2 I never hit that issue / I was not able to reproduce it.

The configuration of BGP1 and BGP2 is copy & paste, with some corrections to IP addresses and BGP peers.

 

stg.JPG

 

Here you can see all the downloads. The download server has two next hops with weight 1 (BGP1 and BGP2). You can see the spikes at 2:11 on BGP1 (first graph); that is when the HTTP traffic is offloaded. At 2:09 you can see that the full bandwidth was not available because of CPU forwarding instead of offloading (ksoftirqd/0 and ksoftirqd/1 use the full CPU power).

 

Routing is done from eth5.710 to eth6.2220.

VLAN-Offloading is enabled.

 

rack@at-sbg-itz-tz-k10-r10-bgp01:~$ show ubnt offload

IP offload module   : loaded
IPv4
  forwarding: enabled
  vlan      : enabled
  pppoe     : disabled
  gre       : enabled
IPv6
  forwarding: enabled
  vlan      : enabled
  pppoe     : disabled

IPSec offload module: loaded

Traffic Analysis    :
  export    : disabled
  dpi       : disabled
    version       : 1.354

Why does the ASIC / HW offload engine drop the flow item from the forwarding table while the TCP stream is active?

 

Please also move your message ID 198060 to this new topic.

Veteran Member
Posts: 6,020
Registered: ‎03-24-2016
Kudos: 1586
Solutions: 680

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

Guess mode:

The first thing that comes to mind: offload connection table overflow.

The number of simultaneously offloaded connections must have some limit; if there is no room left, it is up to the CPU to do the routing job.

Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

[ Edited ]

Okay, but the target is 2 million+ offloaded packets per second, isn't it?

How many flow items can be inserted into the flow table?

 

I wrote my own statistics reader, so I can see the offload engine and the interrupts on one screen.

 

100perc.JPG

 

It is really freaky....

 

OFFLOADING-TOTAL-STATS
                         BYTES      per sec.              PACKETS      per sec.
      RX:      677.125.034.110    40.565.496          956.290.779        61.714
      TX:      289.994.016.349    40.964.718          365.029.213        56.359
  Bypass:      392.241.288.427       389.758          591.261.566         5.354
  Bad-L4:           12.215.964            50              193.571             1
-------------------------------------------------------------------------------
    Hits:       Total:  42.83%    Current: 100.98%

Offloding-Stats by protocol
    IPv4:          659.905.550             0              735.326             0
    IPv6:                    0             0                    0             0
   PPPoE:                    0             0                    0             0
    VLAN:      676.465.130.036    40.566.926          955.555.454        61.714

You can see the hit rate as a percentage since offload-engine stats were enabled, plus the current value. The current value is the TX/RX ratio for the last second, so at 100% all packets were offloaded.
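For clarity, this is roughly how the "Current" value is computed in my reader, as a minimal Python sketch (the names are my own shorthand, not the actual /proc/cavium/stats fields):

# Minimal sketch of the "Current" hit percentage: the TX/RX byte ratio over
# the last sampling interval, computed from the per-second counter deltas.

def current_hit_ratio(rx_bytes_delta, tx_bytes_delta):
    """Percentage of offload-engine RX traffic that it also transmitted itself."""
    if rx_bytes_delta == 0:
        return 0.0
    return 100.0 * tx_bytes_delta / rx_bytes_delta

# Per-second deltas taken from the sample above (RX 40.565.496, TX 40.964.718)
print(f"{current_hit_ratio(40_565_496, 40_964_718):.2f}%")   # -> 100.98%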

 

But there are also intervals where < 10% of packets are offloaded:

OFFLOADING-TOTAL-STATS
                         BYTES      per sec.              PACKETS      per sec.
      RX:      673.169.545.794    73.581.729          952.086.015        84.589
      TX:      287.253.045.329     7.060.551          362.406.501        12.405
  Bypass:      390.990.053.163    66.694.672          589.679.514        72.183
  Bad-L4:           12.169.013             0              193.294             0
-------------------------------------------------------------------------------
    Hits:       Total:  42.67%    Current:   9.60%

Offloding-Stats by protocol
    IPv4:          659.901.279             0              735.308             0
    IPv6:                    0             0                    0             0
   PPPoE:                    0             0                    0             0
    VLAN:      672.509.644.515    73.581.553          951.350.707        84.588

In my attached stats.log you can see all values second by second.

 

On the SNMP grapher image you can see on BGP1 (upper graph) that during the first 90 seconds there were not many hits. After 90 seconds the hits reached 100% and I was able to get 80 MByte/s (640 Mbit/s).

 

The graph and the logs show the same results.

 

I do not use the Linux conntrack tables, because of the "too many entries - flushing ...." messages.

And this router does not need conntrack tables, because it is a dumb packet-forwarding router.

 

Bad Layer 4 checksum

Even at hit rates near 100% there are bad-L4-checksum packets, but that does not hurt the offload engine.

So we do not have to invest time in the bad-L4-checksum issue.

 

My Environment:

Server1 (eth0.711) --[ 1 GbE CAT5e ]--> (eth5.711) ER-Pro (eth6.2220) --[ 1 GbE Fiber ]--> Switch --[ 1 GbE CAT5e]--> (eth0) Server2

Server2# /usr/bin/curl -o /dev/null http://server1/1gb.iso

 

EDIT:

I did some tests on the second machine, BGP2. It has the same firmware and the same configuration, but seems to be more stable. The offload engine does not handle 100% of the traffic there either, but the CPU is able to forward more packets, so the issue is not as dramatic. Still, you feel the difference when you know the statistics: when the offload engine kicks in you get much better results, > 90 MByte/s; when the offload engine is not forwarding packets you see only 80 MByte/s.

 

101perc.JPG

Test on BGP2

 

I have attached a new stats file "bgp-02.txt" where you can see 100% CPU usage on both CPUs during these tests and < 30% offload hits. When offloading kicks in, CPU usage drops dramatically.

 

Question:

Is there a way to dump the connection table of the offload engine?

 

Conntrack:

I did a modprobe nf_conntrack_ipv4 to see the flows:

# conntrack -L |grep flows
conntrack v0.9.14 (conntrack-tools): 262151 flow entries have been shown.

 

After some minutes I see this:

nf_conntrack: table full, dropping packet
nf_conntrack: table full, dropping packet
nf_conntrack: table full, dropping packet
nf_conntrack: table full, dropping packet
nf_conntrack: table full, dropping packet
nf_conntrack: table full, dropping packet
nf_conntrack: table full, dropping packet
ctnetlink: unregistering from nfnetlink.
net_ratelimit: 9424 callbacks suppressed

This was very uncool, because my complete network stopped working, on a production system with over 1,500 customers connected.

 

After rmmod nf_conntrack* the system was fully routing again.

 

Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

[ Edited ]

If anyone is interested in Cavium IP offloading performance, you can download my script; it is attached to this message.

It shows the CPU usage + interrupts + Cavium offload engine performance.

 

CaviumPerformance.JPG

 

This script runs on two ER-Pro8 units.

 

This script sets /proc/cavium/stats to 1. Because you exit the program with CTRL-C, you have to manually disable the Cavium stats afterwards by issuing:

 

disable cavium stats:

echo 0 > /proc/cavium/stats

 

You have to make that programm executable:

chmod +x cavium_stats

./cavium_stats
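If you do not want to reset the flag by hand, a small wrapper like the following can do it (my own sketch; only the /proc/cavium/stats path comes from this thread, the rest is illustrative):

#!/usr/bin/env python3
# Sketch: enable /proc/cavium/stats, print the raw counters once per second,
# and switch stats off again when the user presses CTRL-C. Needs root.
import time

STATS = "/proc/cavium/stats"

def set_stats(value):
    with open(STATS, "w") as f:
        f.write(value + "\n")

def main():
    set_stats("1")                      # enable counter collection
    try:
        while True:
            with open(STATS) as f:
                print(f.read())
            time.sleep(1)
    except KeyboardInterrupt:
        pass
    finally:
        set_stats("0")                  # always disable stats on exit

if __name__ == "__main__":
    main()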

Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

[ Edited ]

Here is a video that shows the issue:

Enable CC / Subtitles!

 

Veteran Member
Posts: 6,020
Registered: ‎03-24-2016
Kudos: 1586
Solutions: 680

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

The video doesn't work.

So indeed it looks like the maximum number of connections is causing this. To combat this, try:

 

# Kill idle TCP sessions faster, normally, time-out takes multiple hours or even an entire day
set system conntrack timeout tcp established 1800
#  Increase connection limit.  Is the Cavium engine limit of # of connections known? 
set system conntrack table-size 262144

Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

[ Edited ]

> Video doesn't work.

Fixed

 

> set system conntrack table-size 262144

Will this only affect net.ipv4.netfilter.ip_conntrack_max, or is this something that will also affect the offload engine?

And why should the offload engine not handle more than 200,000 concurrent flows?

 

Is there more information on how the offload engine works?

 

Okay, what we know:

  1. A new flow is initialized and is processed by the CPU
  2. The CPU inserts the flow into the offload engine's forwarding hash table
  3. Subsequent packets that match the flow are forwarded without the CPU

But what are the conditions between 1 and 2?

Is there a condition like: if hash.length > 200.000 then hash.pop; hash.append(flow)?
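To make the question concrete, here is the behavior I am guessing at, as a Python sketch (purely my assumption about the engine, not its real code or data structures):

# Hypothetical sketch of the suspected behavior: a fixed-size flow table that
# evicts the oldest entry when a new flow arrives while the table is full.
# Nothing here is taken from the real offload-engine source.
from collections import OrderedDict

MAX_FLOWS = 200_000          # assumed limit, not a confirmed value

flow_table = OrderedDict()   # key: 5-tuple, value: next-hop / egress info

def install_flow(five_tuple, forwarding_info):
    if len(flow_table) >= MAX_FLOWS:
        flow_table.popitem(last=False)           # drop the oldest flow ("hash.pop")
    flow_table[five_tuple] = forwarding_info     # "hash.append(flow)"

def lookup(five_tuple):
    # Hit: the ASIC forwards the packet itself. Miss: the packet is bypassed to the CPU.
    return flow_table.get(five_tuple)

If it really worked like this, an active high-bandwidth flow would only be evicted after 200,000 newer flows were installed.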

 

And if there is a pop before the append, why is a new TCP connection not offloaded?

 

In this case you would need 200,000 new flows per second to pop the current flow from the hash table! That is not possible. Okay, it is possible if you have a DDoS with thousands of requests with different src_ip, src_port and dst_ip, dst_port combinations.... But that is not what I see when I analyse the NetFlow statistics on my mirror port.

 

Question:

cat /proc/sys/net/ipv4/ip_conntrack_max results in a "No such file or directory" error. nf_conntrack is loaded. This is very strange!? cat /proc/net/ip_conntrack results in the same error. Where are all these tables and attributes stored?

 

conntrack.JPG

Veteran Member
Posts: 6,020
Registered: ‎03-24-2016
Kudos: 1586
Solutions: 680

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

It seems like the settings you're after are at:

admin@ERX:~$ cat /proc/sys/net/netfilter/nf_conntrack_max
262144

I don't understand your "200.000 new flows per second to...."

It's not about new flows per second, it's about the number of active flows combined.

Active flows can last multiple hours.

Outlook alone, for a single user opening multiple calendars and mailboxes, can have 50+ simultaneous TCP sessions open.
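To put rough numbers on it (the sessions-per-customer figure is just an assumption; the 1,500 customers and the 262144 table size come from this thread):

# Back-of-the-envelope: concurrent flows vs. a 262,144-entry table.
TABLE_SIZE = 262_144
customers = 1_500                 # "over 1.500 customers" mentioned in this thread
sessions_per_customer = 100       # assumption: browsers, mail, apps, updates, ...

active_flows = customers * sessions_per_customer
print(active_flows, "active flows =", f"{100 * active_flows / TABLE_SIZE:.0f}% of the table")
# 150000 active flows = 57% of the table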

 

Veteran Member
Posts: 6,020
Registered: ‎03-24-2016
Kudos: 1586
Solutions: 680

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

While watching the video, I wondered: do you have a full BGP routing table, or only a default route?

If the offload engine itself has a copy of the routing table too, a full table takes up a lot of space.

Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

[ Edited ]

> Do you have full bgp routing table, or only default route?

I have 2x BGP full feeds on each of my two routers (BGP1, BGP2).

 

> If offload engine itself has copy of routing table too, full table takes up lots of space.

Yes, that is an interesting point that has to be clarified. Is the RIB installed in the offload engine's memory? I think the offload engine does not have access to the RIB (Routing Information Base). The first packet is routed via the CPU; the CPU checks all conditions for routing, NAT and so on, and after finding a path to the desired destination it installs a flow entry into the offload engine's flow-entry memory (FIB / Forwarding Information Base).
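Written out as a sketch, the model I have in mind looks like this (my reading of it, not UBNT's code; the RIB and flow table here are toy data structures):

# Sketch of the model described above: the CPU routes the first packet using
# the RIB, then programs the result into the offload engine as a flow entry,
# so the engine itself never needs the RIB.
import ipaddress

rib = {                                            # toy RIB: prefix -> next hop
    ipaddress.ip_network("0.0.0.0/0"): "transit-peer",
    ipaddress.ip_network("198.51.100.0/24"): "eth6.2220",
}
offload_flows = {}                                 # 5-tuple -> forwarding decision

def route_lookup(dst):
    dst = ipaddress.ip_address(dst)
    matches = [net for net in rib if dst in net]
    return rib[max(matches, key=lambda net: net.prefixlen)]   # longest prefix wins

def first_packet(five_tuple):
    """Slow path: the CPU resolves the route and installs a flow entry."""
    nexthop = route_lookup(five_tuple[1])          # [1] = destination address
    offload_flows[five_tuple] = nexthop            # entry in the engine's memory
    return nexthop

def fast_path(five_tuple):
    """Later packets: pure flow-table lookup, the RIB is never consulted."""
    return offload_flows.get(five_tuple)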

 

Standard settings on nf_conntrack:

Thank you, now I can see the nf_conntrack attributes.

root@at-sbg-itz-tz-k10-r10-bgp01:/home/rack# cat /proc/sys/net/netfilter/nf_conntrack_max
262144
root@at-sbg-itz-tz-k10-r10-bgp01:/home/rack# cat /proc/sys/net/netfilter/nf_conntrack_buckets
32768

 

> I don't understand your "200.000 new flows per second to...."

Let's assume the offload engine can only handle up to 200,000 concurrent flows. The offload engine is then out of memory when you reach 200,000 flow entries, so every new flow entry has to delete an old one, or new entries cannot be added at all.

 

Since we can see that throughput increases and decreases within the same flow, flows are evidently being deleted / removed.

 

So the question is: why would a flow with high bandwidth usage be deleted?

 

If the programming is like "if hash.length > 200.000 then hash.pop", then the oldest entry is deleted. There is no normal behavior that generates over 200,000 new flows per second, so my current high-bandwidth flow should never be pushed out. 200,000 TCP SYNs or 200,000 new UDP connections per second - wow, that is a huge number of new connections.... It means that a DDoS against one of our clients can screw your router. Creating 200,000 flows per second is not a big problem for most common servers. You can do it simply by writing a Perl script that sends 1-byte UDP packets to a destination behind your router, incrementing the UDP port with every packet.
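For lab reproduction only (run this exclusively against routers and servers you own), here is the same idea sketched in Python instead of Perl; the target address is a placeholder:

# Lab-only sketch: create many distinct 5-tuples by walking through UDP
# destination ports, roughly the Perl idea described above.
import socket

TARGET_IP = "192.0.2.10"     # placeholder: a test host behind the router under test

for _ in range(4):                                 # 4 x 50,000 = 200,000 distinct 5-tuples
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for dport in range(1024, 51024):               # 50,000 destination ports per socket
        sock.sendto(b"x", (TARGET_IP, dport))      # 1-byte payload, new flow per port
    sock.close()                                   # a fresh socket usually gets a fresh source port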

 

In my video you can see the hits go from 9% to 100%; most of the time it is < 50%, which is why the overall statistic is 43%. So more than half of the packets are processed by the CPU instead of the offload engine.

 

For me there are some open questions... Does the offload engine inspect TCP flags such as FIN, so that it deletes flow entries? What timeouts are used for TCP flows and for UDP flows? Are the kernel timeout settings system-wide, so that the offload engine also respects them, or does the offload engine have its own timeout values?

 

And if the offload engine really drops old flows first, it has to be more intelligent about it. Flows with high bandwidth usage should never get deleted, so that they are always processed by the offload engine and never by the CPU.

 

Veteran Member
Posts: 6,020
Registered: ‎03-24-2016
Kudos: 1586
Solutions: 680

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

You assume that on a full offload table the oldest connection is dropped. But is it?

Maybe the offload engine gets a new flow from the CPU and directly starts processing those packets; the new flow isn't in the hash table yet, but still in some CPU registers.

Then the offload engine tries to move this new flow into the hash table, finds out it can't, and gives the connection back to the CPU.

 

 

Also, are you multihomed in any way? A packet received on another interface (= load-balanced) could also make the flow drop from the offload hash table.

 

Did you already try altering the TCP established time-out setting?

It did help in:

https://community.ubnt.com/t5/EdgeMAX/Edgepoint-Conntrack-full-but-no-reason-for-conntrack-to-be/td-...

 

Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

[ Edited ]

> you assume on full offload table , oldest connection is dropped.  But is it?

Someone from Ubiquiti staff has to clarify that.

 

> Also, are you multihomed in any way?

Yes, I am multihomed.

 

       AS33891                         AS1299
          |  \-----------\        ------/|
          |            ---------/        |
          |  ---------/    \-----------  |
          | /                          \ |
  BGP1 (ER-Pro8) -------------- BGP2 (ER-Pro8)
          |  \-----------\        ------/|
          |            ---------/        |
          |  ---------/    \-----------  |
          | /                          \ |
  Switch#1                          Switch#2

 

 

> A packet received on another interface (=loadbalanced) could also make the flow drop from offload hash table

This is not LACP-related, nor is it relevant to the routing cache. All our upstream peers / IP transit peers send a given flow via the same route, so a single TCP or UDP stream is never load-balanced in a round-robin manner.

 

If you insert "nexthop" routes to your RIB, it is the same behavior.

 

> Did you already try altering tcp established time-out setting?

Of course I did, but it did not change anything.

 

I hope Ubiquiti will give us better tools for debugging the offload engine. We need to be able to see the FIB table currently installed in the engine, and we need feedback from the engine about add and drop events for flow entries, including the reason a flow entry was dropped (reason=timeout, l4checksuminvalid, fulltable, flushtable ....).

 

With the current amount of information we are not able to move forward and need help from UBNT and perhaps from Cavium. In general, Cavium and Ubiquiti should expand their DevOps cooperation so that we can troubleshoot such issues faster.

 

Question:

When was /proc/cavium/stats added? On v1.7.0 I am not able to get stats from the Cavium offload engine.

Is there another way to get stats on that firmware version?

Currently I am not able to compare v1.7.0 with v1.10.0.

Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

Yeah, it is getting better.... Overall performance is now so good that you are not able to do anything on the SSH console.

Ghost typing, over 700 ms delay.... Wow, what an incredible router!

 

This impacts all customers, who now have bad internet connectivity. For us that means potential loss of revenue.

 

unresponsable.JPG

 

I have ordered a Juniper MX480 because this issue is no longer justifiable for our customers, and Mikrotik is not an option (keep that crap out of my datacenters). For weeks we thought this service impact was related to the reordering issue, but as I can see now, this issue is bigger than that.

 

I will send a video after the MX480 is installed and running, showing how I throw the ER-Pro8 into the trash, pour petrol on it, burn that crap router and send it straight to hell!

 


Ubiquiti Employee
Posts: 1,072
Registered: ‎07-20-2015
Kudos: 1088
Solutions: 73

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

@RcRaCk2k
> I wrote my own statistic-reader, so i can see offloading-Engine and interrupts on one screen
That's a super-handy script, thanks

 

> But there are also < 10% of packets being offloaded:

> Is there more information on how offload-engine works?

> set system conntrack table-size 262144
> Will this only affect net.ipv4.netfilter.ip_conntrack_max or is this someting that will also affect offload-engine?
This will affect only Linux. The offload engine has its own flow table.

 

> Is there a condition if hash.length > 200.000 then hash.pop;; hash.append(flow);;
I cannot answer this question right now; I need to investigate the source code of the offload engine. But I would say that popping an active connection is wrong.

 

> If offload engine itself has copy of routing table too, full table takes up lots of space.
> Yes, that is a interesting point that have to be cleared. Is RIB installed in offload-engines-memory?
No, the offload engine does not need the RIB.

 

> Is Offload-Engine inspecting TCP-Flags as FIN, so it will delete Flow-Entries?
I cannot answer this question right now, but I would expect that a FIN should remove the flow from the hash table.

 

> What timeout for TCP-Flows and for UDP-Flows will be used?
> Are the kernel timeout settings system wide, so also Offload-Engine takes care about that times or have offload-engine own timeout-values?
And again I cannot answer this question; I need to investigate the source code.

 

> When was /proc/cavium/stats added? On v.1.7.0 i am not able to get stats from cavium offload-engine.
It was added in 1.9.1

 

> Is there a other way to get stats on this version of firmware?
No

 

I will simulate 200K+ flows in a lab environment and will try to reproduce this issue.
Also, I will create a new "cavium_ip_offload" with additional stats so we can debug this issue.

Member
Posts: 194
Registered: ‎12-11-2013
Kudos: 222
Solutions: 7

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

@UBNT-afomins This offload information seems very useful. Can we get a "show ubnt offload statistics" option added to the EdgeOS baseline?
Member
Posts: 198
Registered: ‎04-28-2015
Kudos: 106
Solutions: 1

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

@UBNT-afomins,

 

Can you say what the number-of-flows limit (for offload) is in the Cavium chipsets as implemented in the ER-Pro8 and in the ER-Infinity?

 

 

Established Member
Posts: 2,476
Registered: ‎08-06-2015
Kudos: 1016
Solutions: 150

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0


@rps wrote:
@UBNT-afomins This offload information seems very useful. Can we get a "show ubnt offload statistics" option added to the EdgeOS baseline?

Offload statistics collection is not enabled by default, so this wouldn't be very informative. It doesn't seem like this is necessarily a capability UBNT is ready to expose other than for specific diagnosis.

 

I had posted a question regarding the impact of enabling this but never got any response: Overhead of enabling Cavium offload stats?

 

user@er:~$ sudo cat /proc/cavium/stats
user@er:~$ echo 1 | sudo dd of=/proc/cavium/stats
0+1 records in
0+1 records out
user@er:~$ sudo cat /proc/cavium/stats

 Statistics Information
========================

RX packets:	               12681	bytes:	             9864911
TX packets:	               11575	bytes:	             9846463
Bypass packets:	                1106	bytes:	              169746
Bad L4 checksum:                   0	bytes:	                   0

Protocol	RX packets	RX bytes		TX packets	TX bytes

ipv4		1823            161266                9727           9479843
ipv6		0                 0                2008            383189
pppoe		0                 0                   0                 0
vlan		10858           9703645                 946            153177

user@er:~$ echo 0 | sudo dd of=/proc/cavium/stats
0+1 records in
0+1 records out
user@er:~$ sudo cat /proc/cavium/stats
user@er:~$ 
Member
Posts: 198
Registered: ‎04-28-2015
Kudos: 106
Solutions: 1

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

Regarding the original question of what might cause an in-progress offloaded flow to get ejected from offload...

 

I understand that the offload flow table and offload engine don't hold a copy of the RIB/FIB.

 

I do, however, wonder whether the FIB update routine sends a "route has changed" message of some form upon FIB updates. Something along the lines of "the path to a.b.c.d/xx has changed, handle appropriately" could be passed to the offload engine as it occurs; the engine might then scrub the currently offloaded flows for flows with that destination and eject those flows from offload.


Just a thought.

 

(Because if it didn't do something like that, it would have a perverse consequence for established flows: changing the routing table wouldn't cause the next packets within the flow to continue via the new path.)

Regular Member
Posts: 464
Registered: ‎07-21-2010
Kudos: 77
Solutions: 6

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

[ Edited ]

> I do, however, wonder whether the FIB update routine sends a "route has changed" message of some form upon FIB updates. Something along the lines of "the path to a.b.c.d/xx has changed, handle appropriately" could be passed to the offload engine as it occurs; the engine might then scrub the currently offloaded flows for flows with that destination and eject those flows from offload.


> Just a thought.

> (Because if it didn't do something like that, it would have a perverse consequence for established flows: changing the routing table wouldn't cause the next packets within the flow to continue via the new path.)

 

Okay, I did some research on exactly that, without having the source code of the offload engine or of the user-space applications.

 

I have 2x BGP full feed.... every change is committed to ribd... and that is where the issue begins!

 

I installed a default route on BGP2 and an IN filter that only allows as_path.length <= 2 from my transit providers, so I only get 50803 routes from Transit#1 and 83279 from Transit#2. BGP2's offload engine now has an overall performance of >= 72.00%!!!!!!!

 

overBGP2.JPG

 

So we can make this story short:

  • every time ribd comes into play, the offload engine will suck
  • ribd flushes all flow entries, so traffic jumps back to Linux
  • with a full BGP table (or n full tables), ribd is flushing all the time

How to fix that:

  1. ribd should keep a compare table, so it knows which routes were withdrawn and which were updated
  2. only remove the flows that match the routes that actually changed (see the sketch below)
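A sketch of what I mean (the data structures are invented for illustration, this is not ribd's API):

# Sketch of the proposed behavior: diff the old and new RIB, then eject only
# the flows whose destination falls inside a prefix that actually changed,
# instead of flushing the whole flow table on every ribd commit.
import ipaddress

def changed_prefixes(old_rib, new_rib):
    """old_rib/new_rib map prefix string -> next hop; return prefixes that differ."""
    return {p for p in set(old_rib) | set(new_rib) if old_rib.get(p) != new_rib.get(p)}

def flows_to_eject(flow_table, changed):
    """flow_table maps (src, dst, proto, sport, dport) -> forwarding info."""
    changed_nets = [ipaddress.ip_network(p) for p in changed]
    return [ft for ft in flow_table
            if any(ipaddress.ip_address(ft[1]) in net for net in changed_nets)]

old = {"203.0.113.0/24": "peer-A", "10.20.0.0/16": "peer-A"}
new = {"203.0.113.0/24": "peer-B", "10.20.0.0/16": "peer-A"}   # one route moved
flows = {("198.51.100.7", "203.0.113.5", 6, 42312, 443): "peer-A",
         ("198.51.100.7", "10.20.3.9", 6, 42313, 80): "peer-A"}
print(flows_to_eject(flows, changed_prefixes(old, new)))       # only the first flow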

I did speed tests between two internal servers that are statically routed, so those routes never change and are never updated. Flows from Server1 to Server2 via the BGP router should therefore never be deleted. But they are deleted!

 

So now it is up to the Ubiquiti programmers to fix this issue.

 

Now I also changed the IN filter on BGP1....

 

botj12.JPG

 

Overall performance = 81.36% on BGP1 and 79.27% on BGP2

@UBNT-afomins

 

Side effect: now we push 80,111,684 bytes per second instead of 50 MB! Previously my customers were not able to get the bandwidth they ordered, because the EdgeMax was not able to route that amount of traffic.

 

 

Member
Posts: 198
Registered: ‎04-28-2015
Kudos: 106
Solutions: 1

Re: Offloading-Flow randomly "jumps" from Offload-Engine to Linux / ER-Pro8 / v.1.10.0

The kind of fix you propose may be theoretically possible, but it's not quite as easy as you suggest.

 

The naive first take at the problem says:

 

If a FIB update occurs pertaining to destination prefix 10.20.0.0/16, locate all flows to a destination inside that mask and eject them from the offloaded flows.

 

This behavior would be correct.


Unfortunately, you must consider whether there are more specific (smaller, network-size-wise) prefixes inside 10.20.0.0/16 which are not changing. If so, you would not want to eject those flows.

 

But a really perverse example shows how far that logic could go:

 

Prefix A:  10.20.0.0/16

Prefix B:  10.20.1.0/24

Prefix C:  10.20.1.4/24

 

Now, for any change to A, B, or C, we have to determine the most specific matching prefix before and after the change, determine whether the FIB next hop would change, and eject only in that case.
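As a sketch of that check (illustrative only; the helper and data structures are mine, and the prefixes follow the A/B example above): recompute the most specific match before and after the update and eject only if the chosen next hop differs.

# Illustrative only: eject a flow only if the FIB update actually changes the
# next hop selected by longest-prefix match for that flow's destination.
import ipaddress

def best_nexthop(fib, dst):
    """fib maps ip_network -> next hop; return the next hop of the most specific match."""
    dst = ipaddress.ip_address(dst)
    matches = [net for net in fib if dst in net]
    return fib[max(matches, key=lambda net: net.prefixlen)] if matches else None

before = {ipaddress.ip_network("10.20.0.0/16"): "peer-A",      # prefix A
          ipaddress.ip_network("10.20.1.0/24"): "peer-B"}      # prefix B
after = dict(before)
after[ipaddress.ip_network("10.20.0.0/16")] = "peer-C"         # A changes its next hop

dst1 = "10.20.1.50"   # covered by the more specific /24, which did not change
print("eject?", best_nexthop(before, dst1) != best_nexthop(after, dst1))   # False
dst2 = "10.20.5.5"    # only covered by the /16 that changed
print("eject?", best_nexthop(before, dst2) != best_nexthop(after, dst2))   # True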

 

The math gets time consuming from a CPU perspective.

 

I'm not saying it can't be solved.  It's not an easy thing, though.

 

I suppose the one comment I would make is that even the naive approach of ejecting any flow with a destination matching an update would be better than ejecting all flows on every update. Does it really do that? I've not had time to confirm.
