Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

AC-500 - 100% CPU Usage - Large latency/packet loss

I have an AC-500 that has been installed for a month or so.  I got a call from the customer, and when I went to look at it the CPU had suddenly spiked to 75-100% and stayed there until I rebooted it.  The CPE also saw a drastic drop in RX capacity.

 

After the CPE reboot the CPU usage and RX capacity went back to normal, but now the CPE has 25% packet loss with triple-digit latency.  Rebooting the CPE and the AP has not corrected it.  All other CPEs on this AP are fine.

 

CPU load.PNG, RX CAP before.PNG, RX rate before.PNG

 

Latency.PNG, AP stats.PNG

 

I upgraded the CPE from 8.3.2 to 8.4 and it did not make any difference. 

Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

What is also odd is I have other CPEs with worse signals that shoot to the same AP on pretty much the same path that are fine with single digit latency and 0% loss.

Path.PNG: Customer with issues is circled in blue

Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

Here are the constellations from the CPE.  As you can see everything looks like it should be OK.  I am at a total loss as to what is going on.

 

Const.PNG

 

@UBNT-sriram?

Regular Member
Posts: 755
Registered: ‎01-27-2015
Kudos: 290
Solutions: 6

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

Is the cable to the CPE OK?

 

I had one like that a while back. Didn't have a clue until the customer said the POE injector was making a sizzling noise and unplugged it. 

 

Turns out a squirrel chewed into the cable down to the copper, which let water seep in. Had to replace the entire cable.

 

Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

Re: AC-500 - 100% CPU Usage - Large latency/packet loss


@Solideco wrote:

Is the cable to the CPE OK?

 

I had one like that a while back. Didn't have a clue until the customer said the POE injector was making a sizzling noise and unplugged it. 

 

Turns out a squirrel chewed into the cable down to the copper, which let water seep in. Had to replace the entire cable.

 


I'll give it a look, but as far as the radio is reporting, the ethernet link is good with no errors, his router is getting an IP address, and the service is passing data.  It seems to me to be either a mechanical or an interference issue, but nothing in the radio or AP shows interference, just the packet loss/latency.

Ubiquiti Employee
Posts: 882
Registered: ‎08-12-2009
Kudos: 509
Solutions: 31

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

@sbyrd

 

can you send me output of the following command from the CPE:

 

cat /proc/interrupts; sleep 1; cat /proc/interrupts;

 

I want to see the difference between the first command and second command after the sleep.

 

Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

Re: AC-500 - 100% CPU Usage - Large latency/packet loss


@UBNT-sriram wrote:

@sbyrd

 

can you send me output of the following command from the CPE:

 

cat /proc/interrupts; sleep 1; cat /proc/interrupts;

 

I want to see the difference between the first command and second command after the sleep.

 


@UBNT-sriram - Here you go.  I am about to go out to the location in a couple of hours and try a new radio.

 

XC.v8.4.0# cat /proc/interrupts
CPU0
2: 577425 dummy wifi1
4: 70983 MIPS eth0
6: 0 MIPS cascade
7: 13406918 MIPS timer
18: 0 ATH MISC cascade
19: 214 ATH MISC serial
20: 0 ATH MISC Watchdog Panic Handler
22: 0 ATH MISC hs-uart
64: 1025177 ATH GPIO ath-spectral-filter
75: 6908416 ATH PCI wifi0

ERR: 46567


XC.v8.4.0# sleep 1


XC.v8.4.0# cat /proc/interrupts
CPU0
2: 578018 dummy wifi1
4: 70994 MIPS eth0
6: 0 MIPS cascade
7: 13419118 MIPS timer
18: 0 ATH MISC cascade
19: 214 ATH MISC serial
20: 0 ATH MISC Watchdog Panic Handler
22: 0 ATH MISC hs-uart
64: 1025781 ATH GPIO ath-spectral-filter
75: 6915188 ATH PCI wifi0

ERR: 46584

Ubiquiti Employee
Posts: 882
Registered: ‎08-12-2009
Kudos: 509
Solutions: 31

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

@sbyrd; Sorry forgot to mention, it is important to run those commands in one batch so that it is approximately one second between the two executions.

 

Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

Re: AC-500 - 100% CPU Usage - Large latency/packet loss


@UBNT-sriram wrote:

@sbyrd; Sorry forgot to mention, it is important to run those commands in one batch so that it is approximately one second between the two executions.

 


@UBNT-sriram

XC.v8.4.0# cat /proc/interrupts; sleep 1; cat /proc/interrupts;
CPU0
2: 595883 dummy wifi1
4: 71782 MIPS eth0
6: 0 MIPS cascade
7: 13852647 MIPS timer
18: 0 ATH MISC cascade
19: 214 ATH MISC serial
20: 0 ATH MISC Watchdog Panic Handler
22: 0 ATH MISC hs-uart
64: 1046583 ATH GPIO ath-spectral-filter
75: 7138211 ATH PCI wifi0

ERR: 47513
CPU0
2: 595883 dummy wifi1
4: 71786 MIPS eth0
6: 0 MIPS cascade
7: 13853662 MIPS timer
18: 0 ATH MISC cascade
19: 214 ATH MISC serial
20: 0 ATH MISC Watchdog Panic Handler
22: 0 ATH MISC hs-uart
64: 1046659 ATH GPIO ath-spectral-filter
75: 7138800 ATH PCI wifi0

ERR: 47520
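
For anyone following along, the per-second difference between the two snapshots can be worked out with a short script rather than by eye. This is only a minimal sketch in Python, assuming the single-CPU layout shown in the output above; the function names `parse_interrupts` and `interrupt_deltas` are made up for illustration, and only a few of the counters from the batch run are pasted in as sample data.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text (single CPU column) into {label: count}."""
    counts = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip the "CPU0" header, blank lines, and anything without a numeric count.
        if not sep or not fields or not fields[0].isdigit():
            continue
        desc = " ".join(fields[1:])
        # Label is the IRQ id plus its description, e.g. "75 (ATH PCI wifi0)" or "ERR".
        label = f"{head.strip()} ({desc})" if desc else head.strip()
        counts[label] = int(fields[0])
    return counts


def interrupt_deltas(before, after):
    """Return how much each counter grew between the two snapshots."""
    first, second = parse_interrupts(before), parse_interrupts(after)
    return {label: second[label] - first[label] for label in second if label in first}


# Sample data: a subset of the two snapshots from the batch run above.
BEFORE = """CPU0
2: 595883 dummy wifi1
4: 71782 MIPS eth0
7: 13852647 MIPS timer
64: 1046583 ATH GPIO ath-spectral-filter
75: 7138211 ATH PCI wifi0

ERR: 47513
"""

AFTER = """CPU0
2: 595883 dummy wifi1
4: 71786 MIPS eth0
7: 13853662 MIPS timer
64: 1046659 ATH GPIO ath-spectral-filter
75: 7138800 ATH PCI wifi0

ERR: 47520
"""

if __name__ == "__main__":
    # Print the busiest interrupt sources first.
    for label, delta in sorted(interrupt_deltas(BEFORE, AFTER).items(),
                               key=lambda item: item[1], reverse=True):
        print(f"{label}: +{delta}")
```

With the numbers above, the timer and wifi0 counters account for nearly all of the growth during the one-second sleep.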

Ubiquiti Employee
Posts: 882
Registered: ‎08-12-2009
Kudos: 509
Solutions: 31

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

Thanx @sbyrd; So I see nothing alarming with the output. 

 

Now when I look at your screenshot, I see the RX signal at the CPE being much lower than its TX signal at the AP.

 

Was this also the case when the CPE was working fine ?

 

Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

Actually, if I look at the signal in Aircontrol, the signal was a few dB worse at the CPE side before the issue.  It has actually improved by a dB or two, but to answer your question, I believe the signal at the AP side has always been a little better than the CPE RX.

 

I am going out to the site and may try replacing the CPE.  I will let you know if replacing the CPE 'fixes' it.

Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

I went to the customer's home and could not find any reason for the issue.  I could not find any visible interference nor could I find any damage.

 

I replaced the surge arrestor and power supply, and I could still only get 5-8Mbps on a TCP speedtest to our office.  I then replaced the CPE with a new AC-500 and got 45Mbps to our website with good latency and zero packet loss.  I ran these tests for probably 15 minutes and then called it solved.

 

I went on to do other service calls, and when I checked later it was back to high loss and latency, and I could barely push 5Mbps on an iperf test to the CPE.  All the while the CPE was at a CINR of 29 on both the AP and CPE sides.  Capacity was a steady 120Mbps on the RX and was 50-143Mbps under load.  So to me it still looks like interference, possibly AP-side interference.  However, none of the other 24 CPEs on the AP, some with much worse signals, were having this issue; they had normal latency with less than 1% loss.

 

To me it is almost like the CPE is getting interference, but the radio is not modulating down to the level that interference of this type would require.

 

As a last-ditch effort to see if I could see the interference on the AP or CPE Airview, I changed the frequency of the AP, and everything is perfect again for this CPE.  Now, unless I am reading things wrong, the AP/CPE seem to be showing there is little to no interference on the channel I was on before (5815 MHz at 20 MHz).

 

Is there such a thing as interference that can cause high loss and latency to a single client, but is practically invisible to all the reporting of noise (Airview, CINR, SSID) that the AP/CPEs have?

 

Latency Before.PNG: Latency of CPE on 5815 MHz
latency After.PNG: Latency after AP frequency change

Latency Graph.PNG: Ping chart showing when the issue first started

 

CPE Noise.PNG: CPE noise after moving AP from 5815 MHz
CPE Airview.PNG: CPE Airview after moving AP from 5815 MHz

AP Noise.PNG: AP noise after moving away from 5815 MHz
AP Airview.PNG: AP Airview after moving AP away from 5815 MHz

 

 

Ubiquiti Employee
Posts: 882
Registered: ‎08-12-2009
Kudos: 509
Solutions: 31

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

The only thing that can cause a phenomenon like this is CPE side local interference.

If the interference were at the AP side, then all stations would be affected in some way.

 

If it only affects a specific station, then it typically means local interference. Now what would be interesting is tomorrow if you get a longer term view of the airView and see if the interference comes and goes.

The only thing I can think of is, it is off and on type of interference where things seem fine at times and not fine at other times for that specific CPE.
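
One way to check for that on/off pattern is to log pings to the CPE over a longer period and look for repeated bad stretches. The following is just a rough sketch in Python, not anything built into airOS or Aircontrol: the function names, the 10-sample window, the 5% loss limit, and the 100 ms latency limit are all assumptions picked for illustration (the 100 ms figure simply mirrors the "triple digit latency" reported earlier in the thread).

```python
from statistics import median


def bad_windows(samples, window=10, max_loss=0.05, max_median_ms=100.0):
    """Flag each full window of ping samples as bad (True) or fine (False).

    samples: one entry per ping, either latency in ms or None for a lost ping.
    A window is bad if its loss rate or its median latency exceeds the limits.
    """
    flags = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        replies = [s for s in chunk if s is not None]
        loss = 1 - len(replies) / window
        # With no replies at all, treat the latency as unbounded.
        med = median(replies) if replies else float("inf")
        flags.append(loss > max_loss or med > max_median_ms)
    return flags


def episodes(flags):
    """Group consecutive bad windows into (first_window, last_window) runs."""
    runs, start = [], None
    for i, bad in enumerate(flags):
        if bad and start is None:
            start = i
        elif not bad and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return runs


if __name__ == "__main__":
    # Made-up samples: good, bad, good, then a total outage.
    log = ([5] * 10
           + [150, None, 200, None, 180, 120, None, 160, 140, 130]
           + [6] * 10
           + [None] * 10)
    print(episodes(bad_windows(log)))
```

Several separate episodes with clean windows in between would fit the comes-and-goes theory, while a single unbroken run would point more toward a constant source.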

 

Senior Member
Posts: 3,277
Registered: ‎07-28-2009
Kudos: 975
Solutions: 44

Re: AC-500 - 100% CPU Usage - Large latency/packet loss

@UBNT-sriram wrote:

The only thing that can cause a phenomenon like this is CPE side local interference.

If the interference were at the AP side, then all stations would be affected in some way.

 

If it only affects a specific station, then it typically means local interference. Now what would be interesting is tomorrow if you get a longer term view of the airView and see if the interference comes and goes.

The only thing I can think of is, it is off and on type of interference where things seem fine at times and not fine at other times for that specific CPE.

 


In regards to latency/loss and CPE performance (speedtests) it looks like local interference, but for something that was interfering that strongly (enough to severely impact a CPE with a CINR of 29) I would have expected to be able to find the device or see it on some type of scan.

 

Also, is there a big difference in how AC CPEs handle local interference vs M CPEs?  Over the years I have had several occasions where an M CPE had local interference.  Usually it is the customer's router or wireless TV receiver.  In all those cases the telltale sign of local interference was the consistent drop in the CPE's RX modulation rates, especially when the CPE was under load.  The CPE TX rates stayed stable.

 

Basically the way I keep track of whose side the interference is on is by looking at the device's modulation rates.  If only the CPE TX rates drop then it is usually AP side.  If only the CPE RX rates drop it is usually CPE side.
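
Written out as code, that rule of thumb looks roughly like the sketch below. It is only an illustration in Python of the heuristic described above: the function name is hypothetical, and the "both dropped" and "neither dropped" answers are my own placeholders, since the rule itself only covers the two one-sided cases.

```python
def likely_interference_side(cpe_tx_dropped: bool, cpe_rx_dropped: bool) -> str:
    """Guess which end of the link is seeing interference from modulation behaviour."""
    if cpe_tx_dropped and not cpe_rx_dropped:
        # The AP is struggling to hear the CPE, so noise is probably at the AP.
        return "AP side"
    if cpe_rx_dropped and not cpe_tx_dropped:
        # The CPE is struggling to hear the AP, so noise is probably at the CPE.
        return "CPE side"
    if cpe_tx_dropped and cpe_rx_dropped:
        # Assumption: not covered by the rule of thumb.
        return "both sides or the path"
    # Assumption: stable rates give no hint either way.
    return "no evidence from modulation rates"


if __name__ == "__main__":
    # The usual M-series case: RX rates sag under load while TX stays stable.
    print(likely_interference_side(cpe_tx_dropped=False, cpe_rx_dropped=True))
```

The last branch is where this AC-500 lands: the rates stayed stable, so the rule gives no answer even though the link was clearly suffering.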

 

In the case of this AC-500 CPE the RX rates to the AP stayed stable (173Mbps or 8X).  So either the interference was not CPE side, or the AC CPE is not smart enough to modulate down during the interference, and that is why I saw the very poor performance, latency, and packet loss.  Maybe if the CPE had modulated down on the RX rates (does the AP control this?) then it could have settled on a rate that would not have so many retransmissions.

 

After talking to the customer he said his performance was junk since we replaced his NBM5-25 with the AC-500.  Keep in mind he was using the M5-25 on an AC AP for several weeks before the change.  I have now looked back at the customer's performance history on the AC-500 and he is correct.  From Sept 13-23rd Aircontrol shows his latency was horrible, just like it was yesterday.  It then was fine from the 23rd until Oct 3rd when the CPE's CPU topped out and needed a reboot.

 

So on the surface I agree that the latency (both ICMP and WLAN TX latency), throughput, and packet loss make it look like local interference, but the Airview, CINR, and CPE Modulation rates/Capacity make it look like nothing is wrong.

 

Here is the Airview from the CPE.  As you can see, after 5pm (when I changed the AP frequency) there are no visible signs of noise on or around 5815 MHz.

Airview after.PNG
