03-18-2019 09:17 AM
I am bringing this to the community forum now because UBNT has clammed up on me and hasn't done anything to help with this problem. Perhaps others have the problem as well. Perhaps UBNT will get some skin in this game at some point.
I have a large distributed UniFi network all over the northern part of my state. There are around 130 UniFi devices and many, many AC and M devices used for backhauls. The sites are almost all identical: a Mikrotik CCR1009-7G-1C-1S+ router. A few older ones are the 8G version with 8 Ethernet ports.
For APs I used the original Outdoor AP 2GHz ones. About 4 years ago I upgraded to the 2GHz Outdoor+ units, and then last spring I began upgrading to the Outdoor AC Mesh Pro units. Although years ago I originally tried the mesh feature, all of them now have their own backhauls. I was pretty much the poster child for outdoor UniFi and have been doing outdoor WiFi since the 2006 Tropos & Earthlink days.
I do not use the Unifi captive portal, I use an external cloud hosted accounting and RADIUS system which works well for me. The captive portal is provided by the hotspot function of the Mikrotiks as is the DHCP.
After upgrading to the Mesh AC Pros, I started noticing a strange message in my logs occasionally. It got worse and worse the more sites I upgraded. The message was similar to this: dhcp1 offering lease 10.22.5.99 for 00:28:F8D:96:3B without success. I figured it was just something about someone's laptop, phone or smart TV in the facility. Eventually I noticed that it was only happening on the newly installed units. It wasn't possible to find out where these users were, and since they never got an IP I couldn't tell anything from their DHCP host name entries either. Users hadn't started complaining.
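For anyone wanting to tally these failures per client, here is a minimal Python sketch. It assumes the router log has been exported as plain text lines in the format quoted above. The function name `count_failed_offers` and the sample log lines and MAC addresses are made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the message quoted above:
#   "<server> offering lease <ip> for <mac> without success"
FAILED_OFFER = re.compile(
    r"(?P<server>\S+) offering lease (?P<ip>\S+) for (?P<mac>\S+) without success"
)


def count_failed_offers(lines):
    """Return a Counter of failed DHCP offers keyed by upper-cased client MAC."""
    counts = Counter()
    for line in lines:
        match = FAILED_OFFER.search(line)
        if match:
            counts[match.group("mac").upper()] += 1
    return counts


# Hypothetical sample data, not taken from a real log.
sample_log = [
    "dhcp1 offering lease 10.22.5.99 for AA:BB:CC:00:00:01 without success",
    "dhcp1 offering lease 10.22.5.99 for AA:BB:CC:00:00:01 without success",
    "some unrelated log line",
    "dhcp1 offering lease 10.22.5.101 for aa:bb:cc:00:00:03 without success",
]

if __name__ == "__main__":
    for mac, failures in count_failed_offers(sample_log).most_common():
        print(mac, failures)
```

Running the same count before and after a firmware change on one AP would give a rough measure of whether the failures track the firmware version.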
It got worse and worse the more I upgraded. I can't remember if I did a firmware upgrade on the units from the controller or not, but I guess I probably did. FINALLY I received a complaint. It was a longtime customer with a "smart thermostat". Sure enough, he was on a newly replaced AP. I did everything I could think of. I even tried to assign him a static address, but the device didn't offer that feature. As a lark, I began downgrading firmware on the AP. There aren't that many older firmwares in the archive, and even last year not all of them were there, but I kept going back one release at a time. When I got to the oldest remaining available firmware (for the Mesh Pro), I installed it and the issue went away. The device got an IP and the customer was ecstatic.
I downgraded all the others and all those DHCP errors disappeared. The working version is
Link to firmware file which isn't even in the "past firmware" section any longer. It was released on 12/7/2016. So I am having to run firmware that is over 2 years old and full of who knows how many bugs and vulnerabilities. Here is the link to the release announcement from @UBNT-MikeD (the last working firmware I could find): FIRMWARE ANNOUNCEMENT-3-7-29-5446
I opened a ticket with UBNT in the fall of 2018. I got a few cursory responses from people who didn't understand the problem and then the responses stopped. I called and emailed everyone I knew, but of course you all know the difficulties in communicating with their support. Finally I went around that and contacted someone in their corporate-level management and they passed the word along. I got an email from someone who had been "specially directed" to own the problem but then barely heard anything from them. I update the ticket with a request for an update about every month, but they never respond.
I was ready to just chuck UniFi and UBNT altogether but have decided to give them a chance to fix what they broke between 2016 and now. It is clearly an issue with passing DHCP to clients of the AP, and they should be able to fix it since I know roughly which version the problem began with. This is a shame because I am a 100% UBNT shop as far as radios go and have been using them since they first showed up on the scene.
03-18-2019 09:43 AM
While I can't speak to any issues with using official support, let's give it a crack here on the forums. There are some rather veteran members here who have encountered about everything you can with a UniFi network.
So to start off, what controller version are you running, and is it local to the network those devices are on, or is it a cloud installation or some sort of remote controller?
Also some of the configuration in use will be helpful, both for the UniFi controller and the Mikrotik devices.
03-18-2019 09:51 AM
@rwf Sorry for your mishaps here. Let's try to get you fixed up.
Could you post a screenshot of your system config screen? Setting>Maintenance>Show System Config.
03-18-2019 10:11 AM
Thanks Jeff, @UBNT-JaM
Here is a screen shot attached. I included the first 4 APs.
You may notice that I haven't downgraded the first AP yet. It is because I haven't observed any (complaining) customers with the issue there and because it is quite a far drive, I prefer to do it when I am local because UniFi APs don't always come back after that much of a downgrade.
I am running 5.8.24 on the controller because I really don't want to disturb the repeatable nature of this issue by changing controller version. At the moment I can replicate the issue by upgrading the AP of the thermostat guy and "cure" it by downgrading him, so I prefer not to change things drastically.
My controller is in my data center and all the APs are at remote sites. They receive option 43 from their respective Mikrotiks and that is how they talk back over the Internet.
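For readers unfamiliar with option 43 in this kind of setup: UniFi devices read the controller address from the vendor-specific option as sub-option 1, length 4, followed by the four IPv4 address bytes. Here is a small Python sketch of building that hex value. The helper name `unifi_option43_hex` and the address 192.0.2.10 (a documentation-only address) are illustrative and not taken from this network.

```python
import ipaddress


def unifi_option43_hex(controller_ip: str) -> str:
    """Build the option 43 value as a hex string: 01 (sub-option), 04 (length), IPv4 bytes."""
    packed = ipaddress.IPv4Address(controller_ip).packed  # 4 raw address bytes
    payload = bytes([0x01, len(packed)]) + packed
    return "0x" + payload.hex().upper()


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder controller address for illustration only.
    print(unifi_option43_hex("192.0.2.10"))
```

Comparing the generated value against what each router hands out is a quick sanity check that the APs themselves are being pointed at the right controller, separate from the client DHCP problem.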
In case I didn't say it, the problem is at all my sites and not just a particular one.
03-18-2019 10:15 AM
I think I included most of what you asked in the other reply. The Mikrotiks are just your standard MTs with hotspot. One IP pool and the MT issues option 43 as well. No weird VLANs or anything. All of them have worked well for years until I started using the MESH AP Pros. It works well as long as I use the old Mesh AP Pro firmware from 2016.
03-18-2019 10:43 AM
I don't use UBNT switches or a USG so I guess not. I think that is a feature they have.
I do use DHCP alerting on the MT, but it is passive and just tells me if it sees one. I have seen one or two in all these years. One of them was a real pill to track down.
03-18-2019 10:57 AM - edited 03-18-2019 10:59 AM
If I recall correctly, it is also a feature the WAPs can use, so it might be good to just do a quick verification if DHCP guarding is set at all. This sounds like it is likely not the source of your issue though.
EDIT: Also double check if you have any "Block LAN to WLAN Multicast and Broadcast Data" set on your SSIDs. It's a good setting when used correctly, but you need to include EVERYTHING that needs to be able to broadcast from LAN to WLAN, including DHCP servers.
03-18-2019 11:16 AM
I apologize for the negative support experience. Definitely could have handled this case better but we'll make sure to get you straightened away.
If it's only one or a select few clients, there is probably an issue on the device side, and that's usually just a matter of finding out how to configure things to get it functioning.
I think if you can get a packet capture and see what's happening to DHCP traffic on newer firmware, you can find the source of the issue. Given the scale of the networks using the same hardware as yours without DHCP issues, my guess is there's something unique in your environment/network that's contributing to the DHCP failures, which is most often more on the router/DHCP server side than the APs. If the packet capture shows the AP isn't passing the DHCP traffic, that can help us narrow things down.
I would recommend upgrading to newer firmware on a few devices and testing to see which devices have issues and which don't and make sure it is consistently occurring.
03-18-2019 11:50 AM
That feature is enabled. I think it is a default for it to be on. I don't have any traffic that needs to go between WAPs and I don't do any multicast stuff. I actually discourage users from trying to use Chromecast or wireless printers.
These are all subscribers on yachts and houseboats who buy anywhere from a daily to yearly hotspot type of connection and use a captive portal to log in. If they have a lot of devices or they want a "home type connection" so they can use printers, Chromecast, or not have to go through the portal, they have to buy a CPE and Aircube from me that is on a completely separate wireless network back to the site router.
I have never seen DHCP have to worry about multicast, and besides, 99% of the DHCP offerings work fine.
I could upgrade the thermostat guy and then see if the errors return and then see if they stop if I turn that off, but leaving that on would be costly in overhead due to the slow beacons as well as any peer to peer traffic that my network would have to carry for those few devices.
03-18-2019 12:02 PM
I hear what you are saying but this:
1. Yes, it is a few or select clients. However, I have only been able to identify one particular one. On the others, once I downgrade all the APs then I mostly don't ever see the client sign up for service. It gets an IP, sure, but they may be "drive-bys" or users who have already decided my service stinks and have written me off because they can never "connect". There are hundreds of devices at each site and many of them are just devices looking for an open connection automatically with no human interaction. I don't want to deny any person at all from getting an IP and having the chance to purchase my service. These are close-knit communities and word travels instantly if someone has a bad experience. I count every refused DHCP offering as a bad experience for some unknown person.
2. Again, I don't think it is a device problem. If it was, why would it go away when I downgrade to 5446? Also, it isn't my job to configure users' devices. Especially since I don't know who they are, since they can't get an IP and certainly can't subscribe if they wanted to.
We could get a packet capture but it would have to be done on the client side by the Mikrotik, because I am not physically near any of the sites. To visit all of them is over a day's drive. I'm not sure how to do it on the MT, although I'm sure I can find something online.
3. Remember, there are about 9 separate sites, separate routers, separate DHCP servers and all worked fine before. And they work fine when I go to 5446 firmware.
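If a capture does get taken on the router, one thing worth comparing between firmware versions is which DHCP messages show up and whether the client sets the broadcast flag. Below is a rough Python sketch that summarizes a raw DHCP payload (the UDP payload of a port 67/68 packet). It assumes the payload bytes have already been extracted from the capture by some other tool. `summarize_dhcp` and `build_sample` are hypothetical helper names, and the sample packet is synthetic.

```python
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options
MESSAGE_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def summarize_dhcp(payload: bytes) -> dict:
    """Extract transaction id, client MAC, broadcast flag and message type."""
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    (xid,) = struct.unpack_from("!I", payload, 4)     # transaction id
    (flags,) = struct.unpack_from("!H", payload, 10)  # top bit = broadcast
    hlen = payload[2]                                  # hardware address length
    mac = ":".join(f"{b:02X}" for b in payload[28:28 + hlen])

    msg_type = None
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:   # end option
            break
        if code == 0:     # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break         # truncated option
        length = payload[i + 1]
        if code == 53 and length == 1 and i + 2 < len(payload):
            value = payload[i + 2]
            msg_type = MESSAGE_TYPES.get(value, f"UNKNOWN({value})")
        i += 2 + length
    return {"xid": xid, "mac": mac,
            "broadcast": bool(flags & 0x8000), "type": msg_type}


def build_sample(msg_type: int, mac: bytes, broadcast: bool) -> bytes:
    """Build a synthetic DHCP payload for trying out summarize_dhcp."""
    flags = 0x8000 if broadcast else 0
    header = struct.pack("!BBBBIHH", 1, 1, 6, 0, 0x12345678, 0, flags)
    header += bytes(16)              # ciaddr, yiaddr, siaddr, giaddr
    header += mac + bytes(16 - len(mac))  # chaddr padded to 16 bytes
    header += bytes(64 + 128)        # sname and file fields
    return header + MAGIC_COOKIE + bytes([53, 1, msg_type, 255])


if __name__ == "__main__":
    sample = build_sample(1, bytes.fromhex("aabbcc000001"), True)
    print(summarize_dhcp(sample))
```

If the router's capture shows DISCOVERs arriving and OFFERs leaving, but the client never follows with a REQUEST, that would point at the OFFER being lost between the router and the client. Whether that is the AP firmware is still something the capture alone cannot prove.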
04-08-2019 05:29 PM
I had some strange DHCP problems as well. Some (in fact all ;-) Windows 7 and 8 machines did not get an IP address anymore after a UniFi update. Windows 10 machines were able to connect without a problem. I am using a Mikrotik RB1100AHx4 router.
After changing the router’s DHCP authoritative option from “after 2s delay” to “yes”, the DHCP timeouts are gone and the Windows 7/8 machines connect again. Hope this helps a bit.
04-10-2019 08:24 AM
Good luck with this, I have the exact same problem, here is my thread, https://community.ubnt.com/t5/UniFi-Wireless/DHCP-addresses-not-being-assigned-out-through-AC-PRO/m-...
Expecting zero response from support, even though I can duplicate the issue. Whatever you do, don't upgrade your software controller. I had an older version controller and was running 3.7.x firmware on APs for over a year or two. Upgraded the controller and firmware to the latest, now nothing gets a DHCP address. I can only downgrade to 4.0.09, which still has the issue. Have tried beta firmware also, nothing fixes the problem. So, short of downgrading controller and firmware, I am screwed. Luckily I had the issue crop up with only 8-9 APs deployed. We were going to roll out another 50 to replace all of our older Cisco APs over the next two months. That is not going to happen now, or probably ever if Ubiquiti can't be bothered to even look at the problem.
04-10-2019 10:47 AM
Staying on old controller and firmware versions is not recommended, because you miss security fixes. Finding a solution to make the configuration work is necessary. I'll head over to that thread and take a look.