New Member
Posts: 6
Registered: ‎10-26-2014
Kudos: 1

Edgerouter POE rebooting every few days...

Hello all,

 

I recently purchased an EdgeRouter POE for my network. It ran fine for the first week, but it now appears to reboot every 1-3 days, and I have been unable to find any pattern. I've sifted through every log in /var/log, and nothing seems to be logged as to why this is occurring. Is there some area in the GUI, or a utility that comes with the hardware/software, that I can set up to monitor the situation?

 

Btw, everything was working fine with the default firmware, but after a week of using the unit I noticed there was an update (v1.5). I plan on moving to the v1.6 beta in order to use the IPv6-PD additions. Until that update goes live on my hardware, I was hoping to find out what I can check to troubleshoot whether the actual unit needs to be swapped out or whether software is in fact to blame.

 

I've barely had it for 15 days.
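
One way to look for a pattern is to pull the boot times out of a saved syslog file and compare the gaps between them. The sketch below is only an illustration under assumptions: it presumes the kernel boot banner (the "Linux version ..." line) reaches the syslog file being scanned, that lines use the classic "Mon D HH:MM:SS host kernel:" layout, and that month names are in English. The function names, the hostname `router` in any sample data, and the `year` argument (syslog lines of this style carry no year) are all made up for the example.

```python
import re
from datetime import datetime

# Matches a syslog line whose message is the kernel boot banner, e.g.
# "Jun  1 10:00:05 router kernel: Linux version 3.10.20-UBNT ..."
# An optional "[    0.000000] " kernel timestamp before the banner is tolerated.
BOOT_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"(?:\[[^\]]*\] )?Linux version "
)


def find_boots(lines, year):
    """Return a datetime for every boot banner found in the given lines."""
    boots = []
    for line in lines:
        match = BOOT_RE.match(line)
        if match:
            # Collapse the padding syslog puts before single-digit days.
            stamp = " ".join(match.group("ts").split())
            boots.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return boots


def intervals_hours(boots):
    """Return the gap in hours between each pair of consecutive boots."""
    return [(b - a).total_seconds() / 3600 for a, b in zip(boots, boots[1:])]
```

Feeding it the lines of a log file and printing `intervals_hours(find_boots(lines, year))` gives a quick view of whether the reboots cluster around a particular interval or time of day.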

SuperUser
Posts: 21,755
Registered: ‎11-20-2011
Kudos: 7880
Solutions: 233

Re: Edgerouter POE rebooting every few days...

I have never seen or heard of this behavior.

 

Also had a pro running on 1.5 for several months, and moved to 1.6b3 about two weeks ago.



isp builder | linux sorcerer | datacenter automation conjurer | blog: blog.engineered.online
link to our slack channel on the blog
New Member
Posts: 6
Registered: ‎10-26-2014
Kudos: 1

Re: Edgerouter POE rebooting every few days...

I will admit that it is quite strange. Checking the standard Linux logs doesn't show a point at which it was rebooted by software or crashed. I'm wondering if it's overheating, as it does seem to run quite warm to the touch. Regardless, I've updated to v1.6.0rc3, as I was able to squeeze in some downtime and wanted to start playing around with the new features it provides, along with the IPv6-PD stuff. I'll post an update either if it reboots again or after it appears to have stabilized.

 

 

New Member
Posts: 6
Registered: ‎10-26-2014
Kudos: 1

Re: Edgerouter POE rebooting every few days...

Update #1:

 

This afternoon the UniFi AP was inaccessible. Upon closer inspection, the router had gone into an unusable state. No DHCP requests, couldn't ping interfaces, etc. A cold boot of the device did nothing to help. I ended up resetting the configuration and starting from scratch. Let's see how things go from here on out.

 

Btw, after the config reset and reconfigure, nothing was needed on the controller or AP end to get the network back up and running. So it definitely seems to be a router issue thus far.

SuperUser
Posts: 21,755
Registered: ‎11-20-2011
Kudos: 7880
Solutions: 233

Re: Edgerouter POE rebooting every few days...

Do you have a UPS on this unit? Does power seem stable?

 

That aside, I'd consider it RMA time if you don't think it's power related. This is the first case *I* have heard of with a problem on anything but an ERLite, though.



New Member
Posts: 6
Registered: ‎10-26-2014
Kudos: 1

Re: Edgerouter POE rebooting every few days...

I don't plug in any device without at least a surge protector, but this setup is currently not UPS backed. I do have a server and cable modem hooked up to the same strip, and have never had an issue with anything but this router. The server uptime is now at 72 days, though the modem has unfortunately needed a few power cycles for random things. It's new construction, and power has always appeared clean in other areas, as UPSes never showed any sign of problems.

 

Currently at 1.5 days of uptime on the router without any occurrences, so I'll just wait and see. If I see another hiccup, I do plan on RMA'ing.

 

 

Member
Posts: 113
Registered: ‎03-15-2014
Kudos: 65
Solutions: 4

Re: Edgerouter POE rebooting every few days...

There have been discussions of this issue, but no definite trigger or solution has been found. I've experienced the same situation myself; check out this thread.

 

https://community.ubnt.com/t5/EdgeMAX/Suddenly-Restart/td-p/924504

Member
Posts: 113
Registered: ‎03-15-2014
Kudos: 65
Solutions: 4

Re: Edgerouter POE rebooting every few days...

Hi @UBNT-ancheng @UBNT-stig 

 

Yesterday, for "no reason", the router rebooted itself again after working flawlessly for almost a month. I know there are different threads regarding this issue, but I want to update with my latest information. This is the second ER-PoE I have owned; the first one was replaced by RMA because of this issue. The problem persists: since I've had the router, neither unit has lasted 2 months without rebooting. I don't see any crash log available for the router, but I have included the "show log" output I currently have. Notice that in this case the log showed the following:

 

Jun 1 10:00:11 mc-rtr kernel: ubnt_platform: module license 'Proprietary' taints kernel.
Jun 1 10:00:11 mc-rtr kernel: Disabling lock debugging due to kernel taint

 

I have never seen that message in the other threads. I would really appreciate it if this reboot/panic situation could be taken care of, because it's been happening to me since version 1.4.1 (when I purchased the router). Even though everything else works without problems, this makes the unit less reliable, and it can be a showstopper for some people. I've been following some of those threads to see if there is a fix or workaround for the issue, but all of them have the same outcome: the problem can't be identified.

 

http://community.ubnt.com/t5/EdgeMAX/Edgerouter-Pro-frequent-Kernel-panic-amp-reset/m-p/1128965#M505...

 

http://community.ubnt.com/t5/EdgeMAX/Suddenly-Restart/m-p/1125127#M50102

 

http://community.ubnt.com/t5/EdgeMAX/NMI-watchdog/m-p/1125125#M50100

Previous Employee
Posts: 13,551
Registered: ‎06-10-2011
Kudos: 5429
Solutions: 1656
Contributions: 2

Re: Edgerouter POE rebooting every few days...

Actually, those two log messages about the module are normal and should not be related. As discussed before, those several previous reports may actually be different issues, and we have not been able to find a way to replicate them (even working with the reporting users, using a debug kernel, etc.). Certainly it would be good to address such issues, but we'll have to find a way to reliably reproduce an issue in a controlled environment in order to feasibly look into fixing it.

New Member
Posts: 22
Registered: ‎12-04-2014
Kudos: 4

Re: Edgerouter POE rebooting every few days...

[ Edited ]

I am experiencing the rebooting problem, and its frequency has increased over the last week or so. I suspect the router will go at most 1.5 days before a reboot. The logs (which are syslogged to another server) show when the reboots occurred but not why: just a reset of the timestamp showing the reboot of the router. I just purchased a console cable and had to dig up an old Windows 2000 machine with HyperTerminal, which is now monitoring the console. Question: do I have to do anything special to monitor the router via the console to detect the reason for the reboots? Any specific commands? HyperTerminal is connected and will capture the output to a text file.

 

Please advise.

 

Oh, I am on 1.6.

 

Thanks!
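
For anyone capturing the console from a Linux machine instead of HyperTerminal, a timestamped capture makes it easier to line the console output up with the remote syslog afterwards. The following is a minimal sketch, not a tested tool: it assumes the serial adapter shows up as a device file (the path `/dev/ttyUSB0` is a guess) and that the port has already been set to the console speed, for example with `stty -F /dev/ttyUSB0 115200 raw`. The function names are made up for the example.

```python
from datetime import datetime


def timestamp_lines(source, sink, now=datetime.now):
    """Copy each line from source to sink, prefixed with the time it was read."""
    for raw in source:
        line = raw.rstrip("\r\n")
        sink.write(f"{now().isoformat(timespec='seconds')} {line}\n")
        sink.flush()  # push each line out of Python's buffer right away


def capture(device, logfile):
    """Hypothetical helper: append console output from device to logfile.

    Assumes the serial port was configured beforehand (speed, raw mode);
    this function only reads text lines and stamps them. Runs until interrupted.
    """
    with open(device, "r", errors="replace") as src, open(logfile, "a") as dst:
        timestamp_lines(src, dst)
```

Calling `capture("/dev/ttyUSB0", "console.log")` and leaving it running until the next reboot would be the intended use; no commands need to be typed at the router for the capture itself.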

New Member
Posts: 4
Registered: ‎12-15-2014

Re: Edgerouter POE rebooting every few days...

I did also experience this behaviour a couple of weeks ago. By this I mean a reboot every couple of days. And no obvious pattern of course.

 

But this has stopped for the time being. I have been running without any reboot for a full week now. The only thing was that I couldn't log in to the web interface; it said wrong user/password. But SSH worked, so I had to reboot it via the CLI. After that, I am into a new full week of trouble-free operation.

 

Stay tuned...

Emerging Member
Posts: 53
Registered: ‎11-26-2013
Kudos: 5
Solutions: 1

Re: Edgerouter POE rebooting every few days...

I have the same problem on 2 ER-pro5. Both are running the latest version of the software. I recall I didn't have the problem with previous versions. On both ERs I run load-balancing.

 

BR

Hans

Previous Employee
Posts: 13,551
Registered: ‎06-10-2011
Kudos: 5429
Solutions: 1656
Contributions: 2

Re: Edgerouter POE rebooting every few days...

Yeah as mentioned earlier, I've added a potential fix that we are testing now (so far so good), and it will be in the next alpha/beta release (which should be soon). Thanks for providing the additional information.

New Member
Posts: 22
Registered: ‎12-04-2014
Kudos: 4

Re: Edgerouter POE rebooting every few days...

[ Edited ]

It rebooted again, but this time the console caught something about an NMI Watchdog. I tried to add an attachment, but it did not take. The console log is below.

 

BTW, at the previous reboot I disabled IP offloading, as recommended by a number of others in the EdgeMAX forums. I'm not sure it did anything; it may actually have increased the frequency of reboots. But that's speculation...

 

*** NMI Watchdog interrupt on Core 0x00 ***
$0 0x0000000000000000 at 0x0000000010108ce1
v0 0x0000000000000001 v1 0x0000000000010000
a0 0xffffffffc0010ae0 a1 0x0000000000000000
a2 0x0000000000006239 a3 0x0000000000000000
a4 0x0000000000000000 a5 0x800000041c1b8bd8
a6 0x0000000000000000 a7 0x0000000000000000
t0 0x0000000000000400 t1 0x000000000001c000
t2 0x000000000000000c t3 0x800000041d2e8000
s0 0x0000000000000000 s1 0xffffffffc00109f8
s2 0xffffffff805a0000 s3 0x0000000000000001
s4 0x0000000000000000 s5 0xffffffffc0010000
s6 0x0000000000000001 s7 0xffffffffc0010000
t8 0x0000000000000050 t9 0xffffffff801a2b80
k0 0x0000000000000000 k1 0x0000000000000001
gp 0x800000041cfd0000 sp 0x800000041cfd3c10
s8 0xffffffffc0017c30 ra 0xffffffffc00181c0
err_epc 0xffffffff804b5efc epc 0xffffffffc007a660
status 0x0000000010588ce4 cause 0x0000000040808c08
sum0 0x000000f000008000 en0 0x0100400500008000
*** Chip soft reset soon ***

*** NMI Watchdog interrupt on Core 0x01 ***
$0 0x0000000000000000 at 0x0000000000000001
v0 0x0000000000000001 v1 0x0000000000010000
a0 0xffffffffc0010ae0 a1 0x0000000000000009
a2 0x000000000000623a a3 0x800000000165f348
a4 0x8000000081143000 a5 0x0000000000000010
a6 0x0000000000000000 a7 0x0000000000000000
t0 0xffffffffffffffff t1 0x0000000000000000
t2 0x0000000000000000 t3 0x0000000073c51000
s0 0xffffffffc000da30 s1 0xffffffffc0010000
s2 0xffffffffc00151c0 s3 0x800000041d00e000
s4 0xfffffffffffffff4 s5 0x0000000000001000
s6 0x0000000073c41000 s7 0xffffffffc000d998
t8 0x0000000000080000 t9 0x0000000076e421a4
k0 0x0000000072b06930 k1 0x0f00000010715407
gp 0x800000041c5ec000 sp 0x800000041c5efd50
s8 0xffffffffc00151c0 ra 0xffffffffc00173bc
err_epc 0xffffffff804b5f04 epc 0xffffffff804b5ac4
status 0x0000000010588ce4 cause 0x0000000040808808
sum0 0x000000f000008000 en0 0x0000000100000000
*** Chip soft reset soon ***

Looking for valid bootloader image....
Jumping to start of image at address 0xbfc80000


U-Boot 1.1.1 (UBNT Build ID: 4567941-g15e9b5d) (Build time: Jun 4 2013 - 14:51:00)

BIST check passed.
UBNT_E100 r1:1, r2:24, f:8/135, serial #: 24A43C06CBAA
Core clock: 500 MHz, DDR clock: 266 MHz (532 Mhz data rate)
DRAM: 512 MB
Clearing DRAM....... done
Flash: 8 MB
Net: octeth0, octeth1, octeth2

USB: (port 0) scanning bus for devices... 1 USB Devices found
scanning bus for storage devices...
Device 0: Vendor: Prod.: USB DISK 2.0 Rev: PMAP
Type: Removable Hard Disk
Capacity: 3824.0 MB = 3.7 GB (7831552 x 512)
 0
reading vmlinux.64
............................

5567368 bytes read
argv[2]: coremask=0x3
argv[3]: root=/dev/sda2
argv[4]: rootdelay=15
argv[5]: rw
argv[6]: rootsqimg=squashfs.img
argv[7]: rootsqwdir=w
argv[8]: mtdparts=phys_mapped_flash:512k(boot0),512k(boot1),64k@1024k(eeprom)
ELF file is 64 bit
Allocating memory for ELF segment: addr: 0xffffffff80100000 (adjusted to: 0x100000), size 0x69dfd0
Allocated memory for ELF segment: addr: 0xffffffff80100000, size 0x69dfd0
Processing PHDR 0
Loading 54dd80 bytes at ffffffff80100000
Clearing 150250 bytes at ffffffff8064dd80
## Loading Linux kernel with entry point: 0xffffffff804aeb00 ...
Bootloader: Done loading app on coremask: 0x3
Linux version 3.10.20-UBNT (root@ubnt-builder2) (gcc version 4.7.0 (Cavium Inc. Version: SDK_3_1_0_p2 build 34) ) #1 SMP Thu Oct 16 16:29:39 PDT 2014
CVMSEG size: 2 cache lines (256 bytes)
Cavium Inc. SDK-3.1
bootconsole [early0] enabled
CPU revision is: 000d0601 (Cavium Octeon+)
Checking for the multiply/shift bug... no.
Checking for the daddiu bug... no.
Determined physical RAM map:
memory: 0000000007800000 @ 0000000000800000 (usable)
memory: 0000000007c00000 @ 0000000008200000 (usable)
memory: 000000000fc00000 @ 0000000410000000 (usable)
memory: 000000000050b000 @ 0000000000100000 (usable)
memory: 0000000000045000 @ 000000000060b000 (usable after init)
Wasting 14336 bytes for tracking 256 unused pages
software IO TLB [mem 0x01707000-0x01747000] (0MB) mapped at [8000000001707000-8000000001746fff]
Zone ranges:
DMA32 [mem 0x00100000-0xefffffff]
Normal [mem 0xf0000000-0x41fbfffff]
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x00100000-0x0064ffff]
node 0: [mem 0x00800000-0x07ffffff]
node 0: [mem 0x08200000-0x0fdfffff]
node 0: [mem 0x410000000-0x41fbfffff]
Primary instruction cache 32kB, virtually tagged, 4 way, 64 sets, linesize 128 bytes.
Primary data cache 16kB, 64-way, 2 sets, linesize 128 bytes.
Secondary unified cache 128kB, 8-way, 128 sets, linesize 128 bytes.
PERCPU: Embedded 10 pages/cpu @8000000001784000 s11648 r8192 d21120 u40960
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 126581
Kernel command line: bootoctlinux $loadaddr coremask=0x3 root=/dev/sda2 rootdelay=15 rw rootsqimg=squashfs.img rootsqwdir=w mtdparts=phys_mapped_flash:512k(boot0),512k(boot1),64k@1024k(eeprom) console=ttyS0,115200
PID hash table entries: 2048 (order: 2, 16384 bytes)
Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
Memory: 499340k/513344k available (3810k kernel code, 14004k reserved, 1352k data, 276k init, 0k highmem)
Hierarchical RCU implementation.
NR_IRQS:255
Calibrating delay loop (skipped) preset value.. 1000.00 BogoMIPS (lpj=5000000)
pid_max: default: 32768 minimum: 501
Security Framework initialized
Mount-cache hash table entries: 256
Checking for the daddi bug... no.
SMP: Booting CPU01 (CoreId 1)...
CPU revision is: 000d0601 (Cavium Octeon+)
Brought up 2 CPUs
NET: Registered protocol family 16
bio: create slab <bio-0> at 0
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
Switching to clocksource OCTEON_CVMCOUNT
NET: Registered protocol family 2
TCP established hash table entries: 4096 (order: 4, 65536 bytes)
TCP bind hash table entries: 4096 (order: 4, 65536 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
TCP: reno registered
UDP hash table entries: 256 (order: 1, 8192 bytes)
UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
NET: Registered protocol family 1
octeon_pci_console: Console not created.
/proc/octeon_perf: Octeon performance counter interface loaded
HugeTLB registered 2 MB page size, pre-allocated 0 pages
squashfs: version 4.0 (2009/01/31) Phillip Lougher
Registering unionfs 2.5.13 (for 3.10.34)
msgmni has been set to 975
io scheduler noop registered
io scheduler cfq registered (default)
Serial: 8250/16550 driver, 6 ports, IRQ sharing disabled
1180000000800.serial: ttyS0 at MMIO 0x1180000000800 (irq = 34) is a OCTEON
console [ttyS0] enabled, bootconsole disabled
console [ttyS0] enabled, bootconsole disabled
1180000000c00.serial: ttyS1 at MMIO 0x1180000000c00 (irq = 35) is a OCTEON
loop: module loaded
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
OcteonUSB 16f0010000000.usbc: Octeon Host Controller
OcteonUSB 16f0010000000.usbc: new USB bus registered, assigned bus number 1
OcteonUSB 16f0010000000.usbc: irq 56, io mem 0x00000000
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
OcteonUSB: Registered HCD for port 0 on irq 56
usbcore: registered new interface driver usb-storage
octeon_wdt: Initial granularity 5 Sec
TCP: cubic registered
NET: Registered protocol family 17
NET: Registered protocol family 15
L2 lock: TLB refill 256 bytes
L2 lock: General exception 128 bytes
L2 lock: low-level interrupt 128 bytes
L2 lock: interrupt 640 bytes
L2 lock: memcpy 1152 bytes
Bootbus flash: Setting flash for 8MB flash at 0x1f400000
phys_mapped_flash: Found 1 x16 devices at 0x0 in 8-bit bank. Manufacturer ID 0x0000c2 Chip ID 0x0000c9
Amd/Fujitsu Extended Query Table at 0x0040
Amd/Fujitsu Extended Query version 1.1.
phys_mapped_flash: Swapping erase regions for top-boot CFI table.
number of CFI chips: 1
3 cmdlinepart partitions found on MTD device phys_mapped_flash
Creating 3 MTD partitions on "phys_mapped_flash":
0x000000000000-0x000000080000 : "boot0"
0x000000080000-0x000000100000 : "boot1"
0x000000100000-0x000000110000 : "eeprom"
Waiting 15sec before mounting root device...
usb 1-1: new high-speed USB device number 2 using OcteonUSB
usb-storage 1-1:1.0: USB Mass Storage device detected
scsi0 : usb-storage 1-1:1.0
scsi 0:0:0:0: Direct-Access USB DISK 2.0 PMAP PQ: 0 ANSI: 6
sd 0:0:0:0: [sda] 7831552 512-byte logical blocks: (4.00 GB/3.73 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] No Caching mode page found
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] No Caching mode page found
sd 0:0:0:0: [sda] Assuming drive cache: write through
sda: sda1 sda2
sd 0:0:0:0: [sda] No Caching mode page found
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] Attached SCSI removable disk
kjournald starting. Commit interval 3 seconds
EXT3-fs (sda2): warning: maximal mount count reached, running e2fsck is recommended
EXT3-fs (sda2): using internal journal
EXT3-fs (sda2): recovery complete
EXT3-fs (sda2): mounted filesystem with journal data mode
VFS: Mounted root (unionfs filesystem) on device 0:11.
Freeing unused kernel memory: 276K (ffffffff8060b000 - ffffffff80650000)
Algorithmics/MIPS FPU Emulator v1.5
INIT: version 2.88 booting
INIT: Entering runlevel: 2
[ ok ] Starting routing daemon: rib.
[ ok ] Starting EdgeOS router: migrate rl-system configure.
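
When comparing several captures like the one above, it can help to extract the per-core registers from each NMI Watchdog dump rather than reading them by eye. Below is a rough sketch that assumes the dump layout shown above (a "*** NMI Watchdog interrupt on Core ... ***" header, name/value pairs, then "*** Chip soft reset soon ***"). The kernel image range is taken from the bootloader's "Allocated memory for ELF segment" line in this same log and would differ on other firmware builds; the function and constant names are invented for the example.

```python
import re

CORE_RE = re.compile(r"\*\*\* NMI Watchdog interrupt on Core (0x[0-9a-fA-F]+) \*\*\*")
# A register name followed by a 64-bit hex value, e.g. "epc 0xffffffffc007a660".
REG_RE = re.compile(r"(\$?\w+)\s+(0x[0-9a-fA-F]{16})")

# From this log: "Allocated memory for ELF segment: addr: 0xffffffff80100000,
# size 0x69dfd0". Assumption: valid only for this particular kernel build.
KERNEL_START = 0xFFFFFFFF80100000
KERNEL_SIZE = 0x69DFD0


def parse_nmi_dumps(text):
    """Return {core_number: {register_name: value}} for each dump in text."""
    dumps = {}
    core = None
    for line in text.splitlines():
        header = CORE_RE.search(line)
        if header:
            core = int(header.group(1), 16)
            dumps[core] = {}
        elif "Chip soft reset soon" in line:
            core = None  # end of this core's dump
        elif core is not None:
            for name, value in REG_RE.findall(line):
                dumps[core][name] = int(value, 16)
    return dumps


def in_kernel_image(address):
    """True if address falls inside the kernel ELF segment noted above."""
    return KERNEL_START <= address < KERNEL_START + KERNEL_SIZE
```

Applied to the dump above, this reports the core 0 `epc` as outside the kernel image range and the core 1 `epc` as inside it. What lives at the outside address (a loadable module, for instance) is not something the dump alone shows, so treat the result as a hint for comparing captures and not as a diagnosis.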

New Member
Posts: 1
Registered: ‎04-23-2015
Kudos: 1

Re: Edgerouter POE rebooting every few days...

[ Edited ]

New here, same issue: New Edgerouter POE upgraded to 1.6 rebooting every 1-2 days of uptime, nothing in logs afterwards.  Using internal switch, and powering two access points. Just set up a serial console watcher waiting for next crash. 

 

Would be delighted to help test a fix.  Is the alpha/beta referenced above available yet? Found it

 

Edit: 1.7alpha appears to have cured the reboots

New Member
Posts: 13
Registered: ‎02-25-2015
Kudos: 2

Re: Edgerouter POE rebooting every few days...

Same problem with my Edgerouter lite.

Previous Employee
Posts: 13,551
Registered: ‎06-10-2011
Kudos: 5429
Solutions: 1656
Contributions: 2

Re: Edgerouter POE rebooting every few days...

As mentioned, the fix in the alpha release also applies to the ER Lite, so you could give it a try if feasible.

Emerging Member
Posts: 53
Registered: ‎11-26-2013
Kudos: 5
Solutions: 1

Re: Edgerouter POE rebooting every few days...

I am running 1.7 alpha 3, and my problems seem to have disappeared.

 

BR

Hans

New Member
Posts: 13
Registered: ‎02-25-2015
Kudos: 2

Re: Edgerouter POE rebooting every few days...

It's good to hear! Thx!

New Member
Posts: 6
Registered: ‎10-26-2014
Kudos: 1

Re: Edgerouter POE rebooting every few days...

Hi all!

 

Sorry, it seems like I had fallen off the face of the planet and didn't get back to this thread. RL went a little crazy on me.

 

Basically I had ended up pushing this router to the "get to it later" project bench and just recently started looking back into it. In the meantime I had been using a different brand to take its place for the time being. I see that people stated the v1.7 alpha had solved their issues and since the v1.7 release is now out I upgraded to it and am currently testing it out to see if it solves my issues as well. I'll report back in a week or two after thorough testing of it in a test environment.

 

Thanks for all the feedback. 

 

PS. Been enjoying my LR-AP since before my original post and never had issues. I hope to get the router in place soon so I can have a strictly UBNT setup. Cheers!   =}
