10-17-2018 12:29 PM
I have an open case with Ubiquiti, but maybe I will have better luck asking around here.
I have deployed 10 nano APs. Deployment is on a dedicated vlan across several stacks of cisco 2960x switches.
I have 3 SSIDs, across 3 vlans. APs have IP addresses set as static (DHCP to bring them online, then static assignments).
The longest "up time" I have been able to achieve is 4 days.
I have removed all QOS and other settings from the switchports in an effort to isolate the issue.
The APs will stay online for anywhere from 1 hour to 4 days. When they go offline the LED on the AP remains blue, the POE port is up. Cisco shows the ports as up/up and does not show any errors in counters or in the logs. The cable runs are certified cat6. This occurs no matter the switch or switch ports, and the current 10 AP test is deployed across 3 floors and 5 different 2960x stacks.
When the AP is offline, it cannot be reached by ping or any other IP protocol. The switchport MAC address table for the interface is blank. It is as if it is only taking power and is "frozen" with no OS responding.
The APs are all running 126.96.36.19973 with a cloudkey onsite (also static ip on same vlan as APs).
The cisco switch port configuration for each AP is identical and is very simple:
switchport trunk native vlan 50
switchport trunk allowed vlan 25,50,220,222
switchport mode trunk
Today I "caught" an AP just as it had gone offline. I was able to ping it and SSH to it even though unifi showed it as offline. I was able to do this for about 2 minutes before it stopped responding. When I tried the command info, it would not return anything; the command "help" returned the command options, but none of them worked. The only other clue I have seen is that the memory over the last few minutes goes to 100%. I have only seen this (the memory and being able to reach the AP) once, as I am not constantly watching unifi and I think I just got lucky seeing it drop off.
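Since catching a drop in the act was down to luck, a small watcher run from another machine on the AP vlan could timestamp every up/down transition instead. This is only a minimal sketch in Python, not anything provided by Ubiquiti: the function names are hypothetical, and `ping_once` assumes the Linux iputils `ping` flags `-c` (count) and `-W` (timeout in seconds).

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Return True if a single ICMP echo gets a reply (assumes Linux iputils ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(hosts, probe=ping_once, rounds=1, interval_s=10, clock=datetime.now):
    """Probe every host once per round and return the state changes seen.

    Returns a list of (timestamp, host, "UP" or "DOWN") tuples, one per change.
    The first probe of a host only sets its baseline and is not reported.
    """
    last = {}      # host -> reachability seen on the previous round
    events = []
    for i in range(rounds):
        for host in hosts:
            up = probe(host)
            if host in last and last[host] != up:
                event = (clock(), host, "UP" if up else "DOWN")
                events.append(event)
                print(f"{event[0].isoformat()} {host} {event[2]}")
            last[host] = up
        if i < rounds - 1:
            time.sleep(interval_s)
    return events
```

Calling `watch(list_of_ap_addresses, rounds=8640, interval_s=10)` would poll for roughly a day. The `probe` argument is injectable so a TCP check against the SSH port could be swapped in, which would show whether the AP still answers on the wire after the controller marks it offline.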
Any ideas on what to try before I just take them out?
10-17-2018 01:06 PM
You didn't mention what controller version you are running. You may want to look at v4.0.0 or v4.0.1. Understanding these are beta, maybe try them on one or two APs. There are a bunch of fixes for the nanoHD: https://community.ubnt.com/t5/UniFi-Beta-Blog/bg-p/Blog_UniFi_Beta
10-18-2018 05:23 AM - edited 10-18-2018 05:24 AM
I am running controller version 188.8.131.52 on the cloud key. Are you suggesting I back it to a 4.x version?
He's suggesting you upgrade the nanoHD to v4.x firmware.
May as well give it a try as we're all going to be on v4.x firmware soon.
10-18-2018 01:19 PM - edited 10-18-2018 01:21 PM
I have verified that band steering is disabled.
On my guest SSID I have disabled the "Block LAN to WLAN Multicast and Broadcast Data".
I have 2 AP running 184.108.40.20621
2 AP running 220.127.116.1152 (request from Mandy - tech support)
5 AP running 18.104.22.16873
1 AP running nanoHDemail@example.com (request from jeffreychang - UBNT support)
What could possibly go wrong?!?
All are sending syslogs and all have stayed up today.
10-22-2018 05:34 AM
The AP1 running 22.214.171.124.73 dropped offline at 6-7pm on Friday.
It is showing a blue ring on it. It shows that it is taking power. There is no mac address in the table for the switchport it is on (no network L2 connectivity/activity).
Attached is the last few minutes of syslog that it produced.
I have disabled the Uplink Connectivity Monitor and enabled debug logging at the site level.
Notice the 4-5min gaps in the syslog before the unit went off-network at 18:41-18:46 and 18:46-18:51.
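For anyone wanting to find such gaps without reading the log by eye, here is a rough Python sketch. It assumes each line starts with a traditional BSD-style syslog timestamp such as "Oct 19 18:41:02", which may not match what the APs actually send. `find_gaps` and its parameters are hypothetical names, and because that timestamp format carries no year, one has to be supplied.

```python
from datetime import datetime, timedelta


def find_gaps(lines, min_gap=timedelta(minutes=4), year=2018):
    """Return (previous_ts, next_ts, gap) for consecutive lines at least min_gap apart.

    Assumes a 15-character BSD-style syslog timestamp at the start of each
    line; lines that do not parse are skipped.
    """
    gaps = []
    prev = None
    for line in lines:
        try:
            ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line, ignore it
        if prev is not None and ts - prev >= min_gap:
            gaps.append((prev, ts, ts - prev))
        prev = ts
    return gaps
```

Running it over the syslog file of each AP (for example `find_gaps(open(path))`) and comparing where the silences fall would show whether every unit goes quiet in the same 4-5 minute steps before dropping off the network.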
10-22-2018 09:42 AM - edited 10-22-2018 09:42 AM
I have updated the case with the log files. I just saw the 126.96.36.19993 beta posted. I will put it on 2 access points. Some of the fixes listed include "improve ethernet stability" and "fix memory leak"... I am hopeful.
10-25-2018 06:59 AM
I have had several more nanoHD APs crash. Most just freeze, with the blue LED on. Occasionally they will just restart (after about 8-20 minutes of not passing traffic).
I have had freezing with 188.8.131.5273 and 184.108.40.20621 while 220.127.116.1193 mostly seems prone to the reboot.
I noticed after adding a few more APs that the wifi radios were sitting at auto (disabled) for long periods of time after adoption, before choosing a channel and giving access to the SSIDs. In an attempt to solve that, I have built an Ubuntu controller and migrated off of the cloud key.
I have asked ubiquiti to elevate the case to the next level.
10-29-2018 05:08 AM
Another nanoHD went offline this weekend. This time it recovered by itself. There seems to be a pattern, although it may still be too early to tell:
18.104.22.16893 will hang the access point for 10-20 minutes (blue light, taking power, no wired network traffic at all) and then it will reset.
22.214.171.12473 will hang the access point (blue light, taking power, no wired network traffic at all) and requires a powered restart (removing POE and resetting it).
The firstname.lastname@example.org seems to be solid. I have yet to see an AP hang while using it. (https://dl.ubnt-ut.com/albert/K-12/nanoHD-lan-fix-5.bin)
That being said, my longest AP uptimes (13 days and 12 days) are both on APs running 126.96.36.19973.