03-12-2017 11:44 PM - edited 03-13-2017 02:17 AM
When my router starts up, the DHCP server fails to start with it. /var/log/messages points to an error in dnsmasq-dhcp-config.conf. (The router can see the host attached to eth2.)
Mar 12 23:03:01 erl2001 dnsmasq: error at line 27 of /etc/dnsmasq.d/dnsmasq-dhcp-config.conf
Mar 12 23:03:01 erl2001 dnsmasq: FAILED to start up
Mar 12 23:03:03 erl2001 dnsmasq: error at line 27 of /etc/dnsmasq.d/dnsmasq-dhcp-config.conf
Mar 12 23:03:03 erl2001 dnsmasq: FAILED to start up
Mar 12 23:03:51 erl2001 kernel: eth1: Link down
Mar 12 23:03:54 erl2001 kernel: eth2: 1000 Mbps Full duplex, port 2
If I issue the following in the CLI, the DHCP server starts up successfully and the host gets an IP (on either interface)
set service dhcp-server use-dnsmasq disable
commit
set service dhcp-server use-dnsmasq enable
commit
The contents of dnsmasq-dhcp-config with line 27 marked are
#
# autogenerated by /opt/vyatta/sbin/dnsmasq-dhcp-config.pl on Sun Mar 12 23:29:15 PDT 2017
#
dhcp-leasefile=/var/run/dnsmasq-dhcp.leases
domain=bpp.lan
###shared-network LAN1
#subnet 192.168.1.0/24
dhcp-range=set:LAN1,192.168.1.128,192.168.1.191,255.255.255.0,86400
domain=bpp.lan,192.168.1.0/24,local
dhcp-option=tag:LAN1,option:domain-name,bpp.lan
dhcp-option=tag:LAN1,option:router,192.168.1.1
dhcp-option=tag:LAN1,option:dns-server,192.168.1.1
#static reservations for subnet 192.168.1.0/24
#end of subnet 192.168.1.0/24
###end of shared-network LAN1
###shared-network LAN2
#subnet 192.168.2.0/26
dhcp-range=set:LAN2,192.168.2.32,192.168.2.47,255.255.255.192,86400
domain=bpp.lan,192.168.2.0/26,local #<--- this is line 27
dhcp-option=tag:LAN2,option:domain-name,bpp.lan
dhcp-option=tag:LAN2,option:router,192.168.2.1
dhcp-option=tag:LAN2,option:dns-server,192.168.2.1
#static reservations for subnet 192.168.2.0/26
#end of subnet 192.168.2.0/26
###end of shared-network LAN2
#global settings depending on previous config
dhcp-ttl=43200 #this is half the smallest lease time found
dhcp-fqdn #we have default domain, so we can use dhcp-fqdn
dhcp-authoritative #at least one shared-network was declared authoritative
The thing is, line 27 is the same before and after restarting the DHCP server with dnsmasq enabled, so I'm guessing this is a timing issue.
Without even investigating why this is occurring, what is the best method for me to address this? I assume adding a script in /config/scripts/post-config.d that uses /opt/vyatta/etc/functions/script-template to issue the above commands would work? I'm interested in whether there are better or more robust approaches.
11-29-2017 01:58 AM
I discovered that the complaint about the improperly formatted line in /etc/dnsmasq.d/dnsmasq-dhcp-config.conf is actually simply because it is not an expected subnet.
@canufrank, in your example, you are using a /26 subnet, and I was getting a "line 12" error that was otherwise the same. That particular VLAN was using 10.128.128.128/28, which caused dnsmasq to choke. I changed the VLAN interface and dhcp-server settings to something like 10.12.34.56/24 and 10.12.34.0/24 respectively, and like magic the use-dnsmasq option worked without issue.
I think we could probably call this a bug, since I'm sure that dnsmasq should absolutely be able to use subnet masks that aren't 255.255.255.0. Or am I missing something that would prevent labelling as such?
12-17-2017 03:05 PM
After reading the documentation for dnsmasq's --domain option, it seems that this is actually more of an EdgeOS bug. The option is working as expected, as the documentation notes that:
If the address range is given as ip-address/network-size, then a additional flag "local" may be supplied which has the effect of adding --local declarations for forward and reverse DNS queries. Eg. --domain=thekelleys.org.uk,192.168.0.0/24,local is identical to --domain=thekelleys.org.uk,192.168.0.0/24 --local=/thekelleys.org.uk/ --local=/0.168.192.in-addr.arpa/ The network size must be 8, 16 or 24 for this to be legal.
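To make the quoted equivalence concrete, here is a minimal shell sketch of the expansion the documentation describes. The function name expand_local is hypothetical and is not part of dnsmasq or EdgeOS; it only illustrates why the size is limited to 8, 16 or 24. Reverse zones under in-addr.arpa are named one label per octet, so presumably only those three sizes map onto a single reverse zone.

```shell
#!/bin/sh
# Hypothetical helper (not part of dnsmasq or EdgeOS): print the long form
# that the dnsmasq documentation says "domain,network/size,local" stands for.
expand_local() {
    domain=$1
    cidr=$2
    net=${cidr%/*}     # address part, e.g. 192.168.0.0
    size=${cidr#*/}    # network size, e.g. 24

    # Peel off the first three octets with parameter expansion.
    o1=${net%%.*}; rest=${net#*.}
    o2=${rest%%.*}; rest=${rest#*.}
    o3=${rest%%.*}

    # A reverse zone name is built from whole octets, reversed.
    case $size in
        8)  rev="$o1" ;;
        16) rev="$o2.$o1" ;;
        24) rev="$o3.$o2.$o1" ;;
        *)  echo "network size must be 8, 16 or 24, got /$size" >&2
            return 1 ;;
    esac

    printf '%s\n' "--domain=$domain,$cidr" \
                  "--local=/$domain/" \
                  "--local=/$rev.in-addr.arpa/"
}

# The example from the documentation quoted above.
expand_local thekelleys.org.uk 192.168.0.0/24
```

With a /26 such as 192.168.2.0/26 the sketch returns an error, because there is no single octet-aligned reverse zone to declare local.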
So it seems that if we are going to set domain=<domain>,<ipaddr/prefix>,local then there needs to be a check that the subnet mask is 255.0.0.0, 255.255.0.0, or 255.255.255.0 when generating the /etc/dnsmasq.d config file. If the mask does not match /8, /16, or /24, the generator should leave the local flag off.
It sounds like the local option being appended to the domain= string is what causes enforcement of those specific subnets. I wonder what the consequences of simply omitting local are for a domain= that is in fact local?
Of course, if you hand edit the generated dnsmasq config, it will eventually get overwritten and dnsmasq will fail to start again. But I imagine it wouldn't take much of a change to the config-generating script to make it check for, and react to, the subnet mask.
The simplest solution for the time being would probably be just to not set the domain in the dhcp-server configuration. This prevents a domain= line from being generated for that subnet.
delete service dhcp-server shared-network-name <network> subnet <ipaddress> domain-name
Where can I file a bug report for this? Is there an issue tracker somewhere?
01-16-2018 07:19 AM
Is there a place to file bugs for things like this? I cannot find a bug tracker.
To break this down, it is as simple as only being able to use /8, /16, or /24 if you have a domain set when using dnsmasq for DHCP as well. So the following would result in dnsmasq failing to start:
set service dhcp-server use-dnsmasq enable
set service dhcp-server shared-network-name TEST subnet 10.32.64.128/29 domain-name test.com
The failure needs all of these together:
- use-dnsmasq for dhcp
- subnet being /29
- domain-name specified
- the local flag being added
The use of domain-name causes the script to include the domain= configuration line. But the above example would result in this:
domain=test.com,10.32.64.128/29,local
Including the local flag at the end of that is what is causing this issue. The documentation for dnsmasq states that the local flag is only to be used with 255.0.0.0, 255.255.0.0, or 255.255.255.0 subnet masks. It seems this is a convenience option to provide forward and reverse DNS queries for the specified network.
Simply changing the above line to not include the local option allows the config to work right. This line below would work fine.
domain=test.com,10.32.64.128/29
It seems to me that since the DHCP subnet must be set using CIDR notation, checking whether that string ends in /8, /16, or /24 would be quite simple...
# $domainsettings here stands for the subnet string in CIDR notation
case $domainsettings in
    */8|*/16|*/24) : ;;  # do scripty script to add the line with ,local
    *) : ;;              # do scripty script to add the line without ,local
esac
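As a runnable version of that idea, here is a minimal sketch in plain sh. The function name domain_line is hypothetical, and the real generator (/opt/vyatta/sbin/dnsmasq-dhcp-config.pl) is a Perl script, so this only illustrates the check rather than showing the actual fix. The patterns include the slash so that a prefix such as /28 is not mistaken for /8.

```shell
#!/bin/sh
# Hypothetical helper (not part of EdgeOS): build the domain= line for one
# subnet, adding ",local" only when dnsmasq accepts it (/8, /16 or /24).
domain_line() {
    domain=$1
    subnet=$2    # CIDR notation, e.g. 192.168.2.0/26
    case $subnet in
        */8|*/16|*/24) printf 'domain=%s,%s,local\n' "$domain" "$subnet" ;;
        *)             printf 'domain=%s,%s\n' "$domain" "$subnet" ;;
    esac
}

# Subnets taken from the examples in this thread.
domain_line bpp.lan 192.168.1.0/24
domain_line bpp.lan 192.168.2.0/26
domain_line test.com 10.32.64.128/29
```

Only the first call keeps the local flag; the /26 and /29 subnets get a plain domain= line, which is the form reported above to work.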
Even being ghetto and adding this line below to the end of the script would work. Though I don't know if the script that does the work is actually bash... but you get the idea.
sed -i '/^\s*domain=.*\/\(8\|16\|24\),local$/!s/,local$//' /etc/dnsmasq.d/dnsmasq-dhcp-config.conf
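For anyone who wants to see what that expression does before touching the real file, here is a dry run of the same sed program without -i, fed two sample lines from earlier in this thread. It assumes GNU sed, since \s and \| are GNU extensions; the wrapper name strip_bad_local is made up for the example.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed expression above, reading stdin and
# writing stdout instead of editing the config file in place.
# Assumes GNU sed (\s and \| are GNU extensions to basic regular expressions).
strip_bad_local() {
    sed '/^\s*domain=.*\/\(8\|16\|24\),local$/!s/,local$//'
}

# Two sample lines from the generated config posted in this thread.
printf '%s\n' \
    'domain=bpp.lan,192.168.1.0/24,local' \
    'domain=bpp.lan,192.168.2.0/26,local' | strip_bad_local
```

The /24 line passes through unchanged and the /26 line loses its trailing ,local. As noted earlier, any edit to the generated file is overwritten whenever the config is regenerated, so this would have to be rerun each time.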
Hopefully someone of authority sees this. I quite literally had to type it three times because I accidentally hit the browser back button twice... which was due to my vim addiction and the Vrome extension!
01-16-2018 07:31 AM
I just want to point out that I think it quite odd that 8, 16, and 24 are the only subnet masks that can be given the local flag. It seems to me that it exists to use for local domain and hostname resolution, since you are likely not to have A records for those things. Being named local and automatically creating forward/reverse lookups for the network would seem pretty supportive of that assumption.
Oddly, it seems to line up with the boundaries of the Class A, B, and C ranges though. Is there significance to those subnet masks that I'm missing? The private IPv4 ranges don't even line up with that, being 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 for Class A, B, and C, respectively. Also, nothing prevents one from further breaking those down (e.g. 10.20.30.40/16 or 172.26.10.0/22).
Does anyone have any further insight into this design decision? I think I must be missing something here and just don't get it. This is obviously not critical... just curious.
01-16-2018 09:33 AM
This appears to be an oddity in the way dnsmasq's maintainer chose to allow the use of "local" in the "domain=" lines. Please see my comments below to get what I'm saying here. I scoured the "support" section of this site, but I couldn't find a way to submit a case or open a bug as an end user, so I'm thinking this is my only option.
Please let me know if I'm doing something wrong or if I should be barking up a different tree. This is an easy fix, yet when it bites it can really suck pretty badly.
01-16-2018 05:47 PM
@gfunkdave, thank you kind wizard for dropping such knowledge! Who are you, who are so wise in the ways of science?
Are the forums where we end users can report issues, or am I just totally missing a bug tracker or issue reporting page somewhere? At least I now know why I was talking to myself here...
01-16-2018 06:12 PM
01-16-2018 07:29 PM
@gfunkdave, as a new-ish member of these threads, I appreciate you providing guidance through the local customs.
This isn't really that big of an issue. It only bites if you "use-dnsmasq enable" with a domain-name and less common subnet mask set. Apparently that hasn't been many folks. But in any case, I can see this being the wrong place with a poor title for sure!
Actually... maybe this should also be addressed in upstream dnsmasq? The "local" option for only 8, 16, & 24 masks seems silly. But our scripts should do the right thing too I think.