New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

I have a site with 6 UniFi APs in a large house, with a USG in place. There is a UniFi NVR running the UniFi NVR software and UniFi software version 4.6.6.

 

Under Devices, the USG shows version 4.2.9.4778536.

 

This site has 26 to 30 clients on the network, from workstations to multimedia devices, including Sonos, TiVo, and various other entertainment devices.

 

Basically, every 1 to 6 days all access points go isolated and the USG becomes very hot to the touch and unresponsive; it cannot even be told to reboot. It will not hand out IPs via DHCP, although a client that already has IP info, or has a static address set, can still pass data and get out to the Internet. Even after a power cycle it does not come up; it stays in a locked state. Nothing short of a factory reset will get the USG to respond.

 

At this point I forget the USG in UniFi, factory reset it with the button, set the inform URL and re-adopt it. Anything that was isolated then goes back to normal.

 

So do I have a bad USG? Well, I went out and purchased a second one. Exact same issue. I moved two USGs from 2 other sites to test on this site, and they all exhibit the same issue at this location. I also took the first two from the problem site and tested them on the same setup at a different site, with the same issue. In fact I have 4 sites in total with the same UniFi hardware but different clients on each, and just this one is problematic.

 

So something on this network is causing the USG to go into this non-responsive state and completely tank, after which it needs a total factory reset and re-adoption. Keep in mind that the top of the USG is also very hot when in this state, hotter than any USG running at any other site.

"There are three things extremely hard: steel, a diamond, and to know one’s self". – Benjamin Franklin
New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

I am now saving .supp files after every crash. The most recent crash happened only 3 days after re-adopting a factory-reset USG. This time only some UniFi APs went isolated and some did not. But as always DHCP was down, and new clients connecting to an AP got nothing from the DHCP server. After manually setting an IP you could get out and pass data to the Internet via the crashed USG, but it was otherwise unresponsive to SSH and to the UniFi server. Factory reset via reset button -> forget -> re-adopt USG -> back to normal.

New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

[ Edited ]

Two more crashes now. I may have to remove the USG from this site and discontinue using them if this continues with no response from UBNT support, even though I have submitted the issue.

 

Just FYI, 5 different USGs, when added to this site, exhibit the exact same behavior.

Senior Member
Posts: 2,937
Registered: ‎03-25-2014
Kudos: 920
Solutions: 40

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

Paging @MAKuser / @UBNT-Devin

Ubiquiti Certified - UEWA / UCWA
Senior Member
Posts: 3,813
Registered: ‎12-21-2013
Kudos: 1089
Solutions: 86

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

Thanks David! That's a very interesting issue.

@fearstyle can you please tell me a little bit about your site? So far I only know you have 6 UAPs.

How many networks/VLANs do you have? What's the throughput the USG has to handle? Which switches do you use?


It can't be that the UAPs become isolated when the USG dies. As long as their Ethernet is still up, they will keep their DHCP'd IP, even after the lease has expired, if no DHCP server responds.


I have a site that is way larger than your little setup, and the USG performs pretty well. It has to route lots of internal traffic from about 70 VLANs (e.g. security cams, lots of mFi sensors, house-control devices talking to each other), but the throughput to WAN is only 32 Mbit/s max, as that's all we get at that location for now. So your issue is a bit special.

When you receive a solution to your question/issue, don't forget to mark your thread as solved and to give kudos to the people who have helped you out!

Ubiquiti Employee
Posts: 9,181
Registered: ‎01-28-2013
Kudos: 15821
Solutions: 605
Contributions: 20

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

[ Edited ]

Do you have a console cable? Would be interesting to see if you can communicate with it at any level via console (when it's in the problem state). I'm not asking you to go buy one, just asking if you have one.

 

Otherwise, can you set up a syslog daemon and set the controller to report to it? The USG log data will be in there (as well as that of any other UniFi hardware). If so, I would like you to enable the debug level for syslog. Do note that the devices report directly to the syslog server; the logs aren't forwarded from the controller. So if you configure syslog on the same system as the controller, please make sure to use the controller's LAN IP, not the loopback address.
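
For anyone who wants to try the syslog suggestion above without installing a full daemon, here is a minimal sketch of a UDP syslog receiver in Python. It is only an illustration under stated assumptions: the function names, the port 5514 (chosen because binding the standard port 514 normally needs elevated privileges) and the log file name are all hypothetical and not taken from the thread. It binds to all interfaces rather than loopback, since the devices send their logs directly to the server.

```python
import re
import socket

# A syslog datagram normally starts with "<PRI>", where PRI = facility * 8 + severity.
PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(message):
    """Split a syslog message into (facility, severity, text).

    Returns (None, None, message) when no <PRI> prefix is present.
    """
    match = PRI_RE.match(message)
    if not match:
        return None, None, message
    pri = int(match.group(1))
    return pri // 8, pri % 8, message[match.end():]


def serve(bind_addr="0.0.0.0", port=5514, logfile="unifi-syslog.log"):
    """Append every received datagram to logfile, tagged with the sender IP.

    bind_addr, port and logfile are assumptions for this sketch. Runs until
    interrupted.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    with open(logfile, "a", encoding="utf-8") as out:
        while True:
            data, (src_ip, _src_port) = sock.recvfrom(8192)
            text = data.decode("utf-8", errors="replace")
            _facility, severity, body = decode_pri(text)
            out.write(f"{src_ip} sev={severity} {body.rstrip()}\n")
            out.flush()  # keep the file current in case the receiver is killed
```

Calling `serve()` starts the receiver. Severity 7 is the debug level, so debug-level messages should show up as `sev=7` once debug logging is enabled.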

 

Cheers,

Mike

New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

Single network, no VLANs, and no VoIP currently.

 

25 Mbit cable modem.

 

On WiFi there are no more than 25 users roaming between APs, and there are 8 Sonos devices with at least 4 more coming. (I single out Sonos only because it has proven problematic in the past with other clients, but this is the first time I have had Sonos and UniFi in the mix together.) In total there are no more than 50 wired devices with IPs at any one time.

 

Devices include three AP-LR, one AP-Outdoor, and two AP-Outdoor+.

 

And yes, this is by far a small site, the smallest of them all. Yet I keep getting this same issue. The "can't be" is indeed happening, and yes, it makes no sense, because logically this behaviour should not happen. The APs do indeed show isolated when the USG crashes. It is not always all of them; sometimes it is 2 or 3, and there is no pattern in which APs go isolated. I've rotated 5 USGs through this location that have no operating issues at other sites, and all exhibit the same issue here, which makes me wonder whether the cause is a device on this network or the UniFi controller itself.

 

Also, in the 'crash' state the USG shows offline under Devices. If you try to browse to it at https://192.168.1.1/ you get not found. SSH will accept a connection, and while it will take any command, it won't act on it. For example, if you issue a restart to reboot the device it will say it is now rebooting but never actually reboot. At this point I factory reset, and it is back right after adoption.

 

Also, I have found that in this 'crash' state not everything can pass data, even if it already has IP information from the DHCP server. Some of the wired clients lose the ability to push data out to the LAN while others keep it.

 

To be clear, I've moved in USGs from other sites, as well as one new out-of-the-box USG. They do not exhibit this behavior at the other sites, but they do here.

New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

Controller is running on an NVR appliance.

Will set that up now and get back to you with the log.

- S

Senior Member
Posts: 3,813
Registered: ‎12-21-2013
Kudos: 1089
Solutions: 86

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

Please connect to the USG and issue dmesg and post the output here. Also the output of top would be good.


Established Member
Posts: 860
Registered: ‎09-01-2014
Kudos: 383
Solutions: 51

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

Are any of the Sonos devices Wired or are they all Wireless?

 

I had problems on my network recently - all my Sonos units are wired and until I disabled the built-in WiFi on each one they'd cause havoc on the network.

If you found this post helpful feel free to sprinkle some Kudos!
Senior Member
Posts: 3,813
Registered: ‎12-21-2013
Kudos: 1089
Solutions: 86

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

That should not have happened. Please try it again and otherwise submit a ticket on their helpdesk to tell them about it. Normally the Sonos devices disable their wireless once they've detected an Ethernet link.
I'm a Sonos tester, and even with such "unstable" firmware and software, I've never had this problem. I also have lots of Sonos hardware in different sites and mostly use them wirelessly. I like using some of them as portable devices that I can take with me to other sites, where I always have my SSID broadcasted as well.


Established Member
Posts: 860
Registered: ‎09-01-2014
Kudos: 383
Solutions: 51

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

I just disabled the WiFi on them. Seems it's not that uncommon:

 

https://community.ubnt.com/t5/UniFi-Routing-Switching/Unifi-48-Port-750-PoE-Switch-What-is-a-blocked...

Previous Employee
Posts: 90
Registered: ‎01-23-2014
Kudos: 41
Solutions: 10

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

@fearstyle Can you send the supp file to me? I think that it did crash. Do you know of any special traffic on the network, or any special activities performed before the crash?

New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

[ Edited ]

Just before the 'crash' event, the UniFi controller logs start to show a lot of clients being dropped and reconnected. There are 3 Sonos boxes with a higher count in the disconnect/reconnect loop than other devices. Otherwise there are Windows VPN connections going out and standard user traffic. Nothing crazy. Sonos is one thing that comes to mind. There is also an Amazon Echo in the building, a pool controller, mFi, TiVo, a couple of arcade cabinets, and a bowlmore bowling lane controller, plus Nest and SmartThings. I think that's the major stuff. Right now there are 3 UniFi cams, with more coming soon. Oh, and there's a Tesla that jumps on the WiFi too, aside from the usual iPads, PCs, etc.

 

Now, I am aware that in some cases, when Ethernet is plugged in and WiFi attempts to connect, Sonos will cause an AP to go isolated because it didn't disable the NIC interface fast enough or at all. But the few Sonos boxes that are on WiFi do not have Ethernet connected at all; WiFi was preconfigured while they were tethered and then they were moved, which is why they are on WiFi. Where Cat cabling is available, WiFi is disabled on the Sonos device. Whoever wired the place ran CAT5 to all the RJ11 jacks in the house, so I've been slowly converting them to RJ45 for data runs. There are 4 NETGEAR switches: one master that facilitates all the primary runs, a small one in the game room for the arcade cabinets, one by the bowling alley, the pool house, etc.

 

I am onsite today and have set up a syslog server, and it is logging. I am going to wait for a 'crash' event to happen before posting the log. So far we are at 1d 8h uptime since the last event.

 

This site is a very large residence, and the client left this weekend for a vacation in Spain, so for the next two weeks site traffic will be considerably lower.

 

for MAXUSER here is that output attached 

 

Pio, I just emailed you a GDrive link to a folder that has the last 3 supp files. I did not save the last 2, as I was just testing the other 2 USGs from the other sites to see if I could make it happen again.

Senior Member
Posts: 3,813
Registered: ‎12-21-2013
Kudos: 1089
Solutions: 86

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

[ Edited ]

@FEARstyle wrote:

for MAXUSER here is that output attached 


Yeah, I'm really cursed with that name... :D So far I have seen lots of different variations of MAKuser, but MAXuser was not amongst them yet. :P

 

dmesg doesn't help if it's full of firewall logs. I was looking for special kernel messages from when the crash/lockup actually happens.
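
One way to get at the kernel messages buried under firewall logging is to filter a saved copy of the dmesg output. The sketch below is only an illustration: it assumes the firewall entries use the usual netfilter LOG layout (`IN=... OUT=... SRC=...`), which is an assumption about this USG's output rather than something shown in the thread, and the function name is hypothetical.

```python
import re

# Netfilter LOG entries normally carry "IN=<if> OUT=<if> ... SRC=<addr>".
# Assumption: the firewall lines flooding dmesg follow this layout.
FIREWALL_RE = re.compile(r"\bIN=\S* OUT=\S* .*\bSRC=")


def non_firewall_lines(dmesg_text):
    """Return the non-empty dmesg lines that do not look like firewall log entries."""
    return [
        line
        for line in dmesg_text.splitlines()
        if line.strip() and not FIREWALL_RE.search(line)
    ]
```

Saving the output with `dmesg > dmesg.txt` on the USG and passing the file contents to `non_firewall_lines` should leave only the lines worth reading by hand.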

 

I assume 192.168.1.30 is your controller?


New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

Indeed it is

New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

[ Edited ]

Guess what.... crash event

 

And syslog is currently being flooded with over 22,000 entries of the following; the only difference between entries is the line number. Note that it also crashed the syslog server. In less than a minute I have a 12 MB log with just this:

 

 

Aug  4 12:47:13 ubnt dnsmasq[3599]: bad address at /etc/hosts line 12908	8/4/2015 12:47:45 PM
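
With tens of thousands of near-identical entries, a quick tally is easier to read than the raw log. The sketch below counts the dnsmasq "bad address" messages per file and reports the lowest and highest line number mentioned. The regular expression is based only on the single entry quoted above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern taken from the one log entry quoted in the thread.
BAD_ADDR_RE = re.compile(r"dnsmasq\[\d+\]: bad address at (\S+) line (\d+)")


def summarize_bad_addresses(lines):
    """Count 'bad address' entries per file and track the line-number range."""
    per_file = Counter()
    first_line = last_line = None
    for line in lines:
        match = BAD_ADDR_RE.search(line)
        if not match:
            continue
        per_file[match.group(1)] += 1
        number = int(match.group(2))
        first_line = number if first_line is None else min(first_line, number)
        last_line = number if last_line is None else max(last_line, number)
    return {"per_file": dict(per_file), "first_line": first_line, "last_line": last_line}
```

Feeding it the syslog file, for example `summarize_bad_addresses(open("syslog.txt"))` with a hypothetical file name, shows how far into /etc/hosts the complaints reach. A line number like 12908 means the hosts file on the USG had at least that many lines at the time.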

But before that I see the USG showing a few curious things. One is that the set-inform URL changed back to the default:

 

Aug  4 12:46:51 (BZ2LR,24a43c7c126b,v3.2.12.2920) syslog: ace_reporter.reporter_fail(): Timeout (http://unifi:8080/inform)
Aug  4 12:46:51 (BZ2LR,24a43c7c126b,v3.2.12.2920) syslog: ace_reporter.reporter_fail(): inform failed #1 (last inform: 36 seconds ago), rc=4
Aug  4 12:46:53 (U2O,00272274a5c8,v3.2.12.2920) syslog: ace_reporter.reporter_fail(): Timeout (http://unifi:8080/inform)
Aug  4 12:46:53 (U2O,00272274a5c8,v3.2.12.2920) syslog: ace_reporter.reporter_fail(): inform failed #1 (last inform: 38 seconds ago), rc=4
Aug  4 12:46:54 (BZ2LR,24a43c7c135d,v3.2.12.2920) syslog: ace_reporter.reporter_fail(): Timeout (http://unifi:8080/inform)
Aug  4 12:46:54 (BZ2LR,24a43c7c135d,v3.2.12.2920) syslog: ace_reporter.reporter_fail(): inform failed #1 (last inform: 36 seconds ago), rc=4

 

But again, I can still SSH into the USG, and when I run info I get the correct IP info:

 

 

admin@ubnt:~$ info

Model:       UniFi-Gateway-3
Version:     4.2.9.4778536
MAC Address: 04:18:d6:f1:94:31
IP Address:  173.54.11.XX
Hostname:    ubnt
Uptime:      533443 seconds

Status:      Connected (http://192.168.1.30:8080/inform)

So I just tried set-inform via SSH; it responds with no errors, but there is no change in state.
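
To compare inform URLs across devices, the ace_reporter entries in the syslog can be grouped per device. This is a rough sketch based only on the six log lines quoted earlier in this post; the regular expressions and the function name are hypothetical and may not match other firmware versions.

```python
import re

# Matches e.g. "(BZ2LR,24a43c7c126b,v3.2.12.2920) syslog: ace_reporter.reporter_fail(): ..."
FAIL_RE = re.compile(
    r"\((?P<model>\w+),(?P<mac>[0-9a-f]{12}),(?P<fw>v[\d.]+)\) syslog: "
    r"ace_reporter\.reporter_fail\(\): (?P<detail>.*)"
)
URL_RE = re.compile(r"Timeout \((?P<url>\S+)\)")


def inform_failures(lines):
    """Group reporter_fail entries by device MAC, collecting the inform URLs that timed out."""
    devices = {}
    for line in lines:
        match = FAIL_RE.search(line)
        if not match:
            continue
        device = devices.setdefault(
            match["mac"], {"model": match["model"], "events": 0, "urls": set()}
        )
        device["events"] += 1
        url = URL_RE.search(match["detail"])
        if url:
            device["urls"].add(url["url"])
    return devices
```

Run against the excerpt above, this lists every AP as timing out against http://unifi:8080/inform, while the USG's own info output reports http://192.168.1.30:8080/inform.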

 

Just forgot usg -> factory reset -> now waiting for adoption

 

The only change during this crash event is that instead of isolated, all APs show disconnected. But I can see the SSID, I am connected via an Android device, and I can get out to the Internet; once I disconnect, though, no new IP info is handed out. If a static IP is set I can get to the net.

 

Going to get the log up on GDrive, stand by.

 

 

 -= EDIT =-

 

One more observation: in the controller UI, Alerts showed only one AP disconnected, while Devices showed all of them disconnected.

New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

[ Edited ]

Just FYI before adopting again syslog is back up to continue logging

 

Syslog is here

Senior Member
Posts: 3,813
Registered: ‎12-21-2013
Kudos: 1089
Solutions: 86

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)


@FEARstyle wrote:

Just FYI before adopting again syslog is back up to continue logging

 

Syslog is here

 

 


That's the wrong file.


New Member
Posts: 28
Registered: ‎05-17-2014
Kudos: 6

Re: HELP -=- USG Crash-lockup/isolates all AP's (TESTED on other USG & Replicated)

That was entirely my fault :)

https://goo.gl/IIpjcY <- syslog