Member
Posts: 211
Registered: ‎05-21-2013
Kudos: 264
Solutions: 8
Contributions: 3

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

Perhaps I should make it a "Reboot now? Y/N" dialog.

Also, I will look into the mkfs oddities. It's been a while since the last release, so it's probably time to review and improve it.

New Member
Posts: 19
Registered: ‎07-03-2014
Kudos: 1

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

That would be awesome.....and a huge THANK YOU!!!! You saved me from having to do two RMAs (I also need these routers for an install tomorrow).

New Member
Posts: 20
Registered: ‎08-06-2013
Kudos: 11

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

Hi,

The upgrade from 1.5.0beta2 to 1.5.0 bricked my ERL. Searching around, I found this thread. But wouldn't it be faster to just dd an image onto a drive, as shown here: http://rtfm.net/FreeBSD/ERL/ ? I just tried it with another stick and it worked nicely. The system images from UBNT cannot be flashed like this, though. At least I didn't find a tutorial, and the archive only contains the kernel and a squashfs image. So I just wanted to let you know that it is easier to plug the stick into your computer and do the following.

 

--- everything done on Mac OS X with the original USB stick ---

Open the ERL with a Phillips screwdriver

unplug the USB stick

plug it into your Mac

untar the ER image.

cp vmlinux.tmp /Volumes/Untitled/vmlinux.64

cp squashfs.tmp /Volumes/Untitled/squashfs.img

replug into your ERL and reboot.

--------

 

Now you are back. Interestingly, it still reports that I'm on 1.5.0beta2 even though I copied the final release to the stick.

Hope this helps others too.

Member
Posts: 211
Registered: ‎05-21-2013
Kudos: 264
Solutions: 8
Contributions: 3

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

>Open the ERL with a Phillips screwdriver

This was the point: technically opening it voids the hardware warranty. For some it's not an issue, for others it is.

 

Also, I'm not really sure why it worked for you (unless the instructions omit something). The kernel and the squashfs image are on different partitions with different filesystem types.

 

Also #2, this is not enough. You need to copy the version file too, or you will be unable to upgrade via "add system image". The script manages those small details for you.
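
For anyone repeating the manual route, here is a minimal Python sketch of the layout described above: kernel on the boot partition, squashfs image and version file on the root partition. The names vmlinux.tmp, squashfs.tmp and vmlinux.64 come from the earlier post, and squashfs.img and the w directory are the names given in the ERL boot arguments (rootsqimg, rootsqwdir). The version.tmp and version file names, the install_image helper and the mount points are assumptions for illustration only. This is not how the rescue script itself does it, and it ignores the .md5 files.

```python
import shutil
from pathlib import Path

# Source name in the unpacked image -> (partition, destination name).
# version.tmp and "version" are assumed names, not confirmed in this thread.
LAYOUT = {
    "vmlinux.tmp": ("boot", "vmlinux.64"),
    "squashfs.tmp": ("root", "squashfs.img"),
    "version.tmp": ("root", "version"),
}


def install_image(unpacked: Path, boot_mnt: Path, root_mnt: Path) -> list[Path]:
    """Copy each image file to the partition it belongs on (hypothetical helper)."""
    mounts = {"boot": boot_mnt, "root": root_mnt}

    # Refuse to start if the unpacked image is incomplete.
    missing = [name for name in LAYOUT if not (unpacked / name).is_file()]
    if missing:
        raise FileNotFoundError(f"image is missing: {', '.join(missing)}")

    written = []
    for name, (partition, dest_name) in LAYOUT.items():
        dest = mounts[partition] / dest_name
        shutil.copyfile(unpacked / name, dest)
        written.append(dest)

    # Writable data directory, named after the rootsqwdir=w boot argument.
    (root_mnt / "w").mkdir(exist_ok=True)
    return written
```

Called as install_image(Path("unpacked"), Path("/mnt/boot"), Path("/mnt/root")), with both partitions of the stick already mounted at those (made-up) paths.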

 

 

New Member
Posts: 20
Registered: ‎08-06-2013
Kudos: 11

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

Hi,

As the screws are not secured, I don't see how the warranty would be voided.

My investigation so far shows that the upgrade starts but then fails.

After the failure, the ERL is left in one of two states:

 

1) nothing on the FAT16 boot partition

2) a renamed kernel, vmlinux.64o, which doesn't match the file name set in the bootcmd
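
A rough Python sketch of how one might tell these two states apart with the stick mounted on a computer. It assumes vmlinux.64 is the name the bootloader looks for (that is the file U-Boot reads in the console logs in this thread). The boot_partition_state helper and the mount path are hypothetical.

```python
from pathlib import Path

KERNEL_NAME = "vmlinux.64"  # file name the bootloader reads from the boot partition


def boot_partition_state(boot_mnt: Path) -> str:
    """Classify what a failed upgrade left on the mounted boot partition."""
    # Skip dotfiles, which some desktop systems create on mounted volumes.
    names = sorted(p.name for p in boot_mnt.iterdir() if not p.name.startswith("."))
    if KERNEL_NAME in names:
        return "ok"
    if not names:
        return "empty"
    # A leftover such as vmlinux.64o starts with the expected name but is not it.
    renamed = [n for n in names if n.startswith(KERNEL_NAME)]
    if renamed:
        return f"renamed kernel: {renamed[0]}"
    return "no kernel"


if __name__ == "__main__":
    # /Volumes/Untitled is the mount point used in the earlier post.
    print(boot_partition_state(Path("/Volumes/Untitled")))
```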

 

 

I will try the squashfs and vmlinux from 1.5.0beta2 together with the version file. Let's see if this helps.

During the upgrade process you can see that the sda1 partition is mounted and all the upgrade work is happening. Then you get the error message and sda1 is unmounted with the wrong content or none at all.

 

 

Thanks for the hint.

New Member
Posts: 11
Registered: ‎09-06-2013
Kudos: 4
Solutions: 1

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

@sxpert 

Thank you for your shell script. I "repaired" two EdgeRouters with your script today.

New Member
Posts: 8
Registered: ‎10-08-2014

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

dmbaturin you are the man!  I used this to quickly "unbrick" my ERL.

New Member
Posts: 18
Registered: ‎07-08-2014
Kudos: 1
Solutions: 1

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

+1, saved me from an RMA. :D

New Member
Posts: 13
Registered: ‎01-22-2013
Kudos: 17
Solutions: 1

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

Thanks very much for this....... saved my butt!

New Member
Posts: 33
Registered: ‎02-04-2014
Kudos: 3
Solutions: 1

Re: EdgeMax rescue kit - my experience rescuing a corrupt drive

So yesterday I went to make some changes to my ER Lite (I was going to enable sFlow); however, it complained when saving the settings (I forget the error). Then I tried to issue a reboot and it wouldn't reboot. I ran dmesg and saw a bunch of squashfs errors. Unfortunately I didn't record the output.

Next I issued a shutdown; this worked and the unit shut down.

 

This is when the fun began. The unit wasn't booting up anymore. I connected my console cable and got this:

Looking for valid bootloader image....
Jumping to start of image at address 0xbfc80000


U-Boot 1.1.1 (UBNT Build ID: 4493936-g009d77b) (Build time: Sep 20 2012 - 15:48:51)

BIST check passed.
UBNT_E100 r1:2, r2:14, serial #: DC9FDB8012D5
Core clock: 500 MHz, DDR clock: 266 MHz (532 Mhz data rate)
DRAM:  512 MB
Clearing DRAM....... done
Flash:  4 MB
Net:   octeth0, octeth1, octeth2

USB:   (port 0) scanning bus for devices... 1 USB Devices found
       scanning bus for storage devices...
  Device 0: Vendor:          Prod.: USB DISK 2.0     Rev: PMAP
            Type: Removable Hard Disk
            Capacity: 3700.6 MB = 3.6 GB (7579008 x 512)                      0
reading vmlinux.64
............................

5683792 bytes read
argv[2]: coremask=0x3
argv[3]: root=/dev/sda2
argv[4]: rootdelay=15
argv[5]: rw
argv[6]: rootsqimg=squashfs.img
argv[7]: rootsqwdir=w
argv[8]: mtdparts=phys_mapped_flash:512k(boot0),512k(boot1),64k@3072k(eeprom)
ELF file is 64 bit
Allocating memory for ELF segment: addr: 0xffffffff80100000 (adjusted to: 0x100000), size 0x59c5f0
Allocated memory for ELF segment: addr: 0xffffffff80100000, size 0x59c5f0
Processing PHDR 0
  Loading 567400 bytes at ffffffff80100000
  Clearing 351f0 bytes at ffffffff80667400
## Loading Linux kernel with entry point: 0xffffffff804585b0 ...
Bootloader: Done loading app on coremask: 0x3
Linux version 3.4.27-UBNT (ancheng@ubnt-builder2) (gcc version 4.7.0 (Cavium Inc. Version: SDK_3_0_0 build 16) ) #1 SMP Thu Jun 19 18:21:34 PDT 2014
CVMSEG size: 2 cache lines (256 bytes)
Cavium Inc. SDK-3.0
bootconsole [early0] enabled
CPU revision is: 000d0601 (Cavium Octeon+)
Checking for the multiply/shift bug... no.
Checking for the daddiu bug... no.
Determined physical RAM map:
 memory: 0000000007800000 @ 0000000000700000 (usable)
 memory: 0000000007c00000 @ 0000000008200000 (usable)
 memory: 000000000fc00000 @ 0000000410000000 (usable)
 memory: 0000000000047000 @ 0000000000629000 (usable after init)
Wasting 88312 bytes for tracking 1577 unused pages

That's where the output stops. At this point I was ready to file an RMA. However, while searching the forum I came across this thread and decided to try the rescue script.

I got it running and it reported some concerning information:

Loading EMRK 0.9a
Mounting filesystems
Bringing up eth0

Checking boot partition
Boot partition looks intact
Attempting to mount boot partition
Boot partition successfully mounted
Looking for kernel file
Found a kernel
Checking kernel MD5 sum file
Found kernel MD5 sum file
Checking kernel MD5 sum
Kernel MD5 sum is not correct! Your kernel may be corrupted.

Checking root partition
Root partition looks intact
Attempting to mount root partition
kjournald starting.  Commit interval 5 seconds
EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
EXT3 FS on sda2, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with writeback data mode.
Root partition successfully mounted
Looking for system image file
Found a system image file
Checking system image MD5 sum file
Found system image MD5 sum file
Checking system image MD5 sum
System image MD5 sum is not correct! Your image may be corrupted.

So I decided to try an emrk-reinstall.

EMRK>emrk-reinstall
WARNING: This script will reinstall EdgeOS from scratch
If you have any usable data on your router storage,
it will be irrecoverably destroyed!
Do you want to continue?
yes or no: yes
Unmounting boot partition
Unmounting root partition
Re-creating partition table
sd 0:0:0:0: [sda] Unhandled sense code
sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
sd 0:0:0:0: [sda] Sense Key : 0x3 [current]
sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x0
sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 00 73 a5 68 00 00 08 00
end_request: I/O error, dev sda, sector 7578984
Error: Input/output error during write on /dev/sda
Creating boot partition
Error: /dev/sda: unrecognised disk label
Formatting boot partition
mkfs.vfat 3.0.9 (31 Jan 2010)
Creating root partition
Error: /dev/sda: unrecognised disk label
Formatting root partition
sd 0:0:0:0: [sda] Unhandled sense code
sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
sd 0:0:0:0: [sda] Sense Key : 0x3 [current]
sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x0
sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 00 10 78 00 00 00 08 00
end_request: I/O error, dev sda, sector 1079296

Warning, had trouble writing out superblocks.Mounting boot parition
Mounting root partition
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda2, internal journal
EXT3-fs: mounted filesystem with writeback data mode.
Enter EdgeOS image url: http://dl.ubnt.com/firmwares/edgemax/v1.5.0/ER-e100.v1.5.0.4677648.tar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 68.9M  100 68.9M    0     0  3386k      0  0:00:20  0:00:20 --:--:-- 1530k
Unpacking EdgeOS release image
Verifying EdgeOS kernel
Copying EdgeOS kernel to boot partition
Verifying EdgeOS system image
Copying EdgeOS system image to root partition
Copying version file to the root partition
Creating EdgeOS writable data directory
Cleaning up
Installation finished
Please reboot your router
EMRK>reboot
starting pid 289, tty '': '/bin/umount -a -r'
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system reboot
Restarting system.

Looking for valid bootloader image....
Jumping to start of image at address 0xbfc80000


U-Boot 1.1.1 (UBNT Build ID: 4493936-g009d77b) (Build time: Sep 20 2012 - 15:48:51)

BIST check passed.
UBNT_E100 r1:2, r2:14, serial #: DC9FDB8012D5
Core clock: 500 MHz, DDR clock: 266 MHz (532 Mhz data rate)
DRAM:  512 MB
Clearing DRAM....... done
Flash:  4 MB
Net:   octeth0, octeth1, octeth2

USB:   (port 0) scanning bus for devices... 1 USB Devices found
       scanning bus for storage devices...
  Device 0: Vendor:          Prod.: USB DISK 2.0     Rev: PMAP
            Type: Removable Hard Disk
            Capacity: 3700.6 MB = 3.6 GB (7579008 x 512)                                                                                                                       0
** No FAT signature found on device 0 **

** Unable to use usb 0:1 for fatload **
argv[2]: coremask=0x3
argv[3]: root=/dev/sda2
argv[4]: rootdelay=15
argv[5]: rw
argv[6]: rootsqimg=squashfs.img
argv[7]: rootsqwdir=w
argv[8]: mtdparts=phys_mapped_flash:512k(boot0),512k(boot1),64k@3072k(eeprom)
## No elf image at address 0x09f00000
Octeon ubnt_e100#

I decided to run the rescue again and this time it did not report any errors while creating the partition and writing the image. I am not sure how the drive got corrupted and I am not sure I can trust that it is "fixed" now.
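
For what it's worth, the kernel messages in the reinstall log above can be decoded. As far as I know the SCSI tables, sense key 0x3 is "medium error" and ASC 0x11 / ASCQ 0x0 is "unrecovered read error", which points at the flash itself. The CDB starts with 0x28, a READ(10) command, so bytes 2-5 are the block address and bytes 7-8 the block count. Below is a small Python sketch of that arithmetic; the log lines and the sector count are copied from the output above, and the helper names are made up.

```python
import re

TOTAL_SECTORS = 7579008  # from the U-Boot line "Capacity: ... (7579008 x 512)"

# Lines copied from the emrk-reinstall output above.
LOG = """\
sd 0:0:0:0: [sda] Sense Key : 0x3 [current]
sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x0
sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 00 73 a5 68 00 00 08 00
end_request: I/O error, dev sda, sector 7578984
"""


def decode_read10(cdb_hex: str) -> tuple[int, int]:
    """Return (logical block address, block count) of a READ(10) command block."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) command block")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: big-endian block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, blocks


def failed_sectors(log: str) -> list[int]:
    """Collect the sector numbers from kernel 'I/O error' lines."""
    return [int(s) for s in re.findall(r"I/O error, dev \w+, sector (\d+)", log)]


if __name__ == "__main__":
    cdb_hex = re.search(r"CDB: cdb\[0\]=0x28: ([0-9a-f ]+)", LOG).group(1)
    lba, blocks = decode_read10(cdb_hex)
    print(lba, blocks)            # 7578984 8
    print(failed_sectors(LOG))    # [7578984]
    print(TOTAL_SECTORS - lba)    # 24
```

So the address in the CDB matches the sector in the I/O error line, and the failing read sits 24 sectors before the end of the stick. The second error in the log (CDB 28 00 00 10 78 00 ...) decodes the same way to sector 1079296.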

Member
Posts: 203
Registered: ‎10-18-2010
Kudos: 325
Solutions: 2
Contributions: 1

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

I wouldn't trust it. Replace it. I just had my s3 on flash drive die in mine yesterday. Time to get a decent SanDisk Cruzer Fit.
New Member
Posts: 30
Registered: ‎09-22-2013
Kudos: 6

Re: EdgeMax rescue kit - my experience rescuing a corrupt drive

I would definitely RMA it too if you are within the 1 year warranty period. This has happened to two of mine now. If you are past the warranty period I would replace the internal flash drive.

Veteran Member
Posts: 5,417
Registered: ‎03-12-2011
Kudos: 2708
Solutions: 128

Re: EdgeMax rescue kit - my experience rescuing a corrupt drive


@hilo90mhz wrote:

I would definitely RMA it too if you are within the 1 year warranty period. This has happened to two of mine now. If you are past the warranty period I would replace the internal flash drive.


That said, the time and effort to RMA an ERL vs the cost of a Sandisk Cruzer Fit means I usually just replace it myself.

New Member
Posts: 30
Registered: ‎09-22-2013
Kudos: 6

Re: EdgeMax rescue kit - my experience rescuing a corrupt drive

@NVX wrote:

That said, the time and effort to RMA an ERL vs the cost of a Sandisk Cruzer Fit means I usually just replace it myself.


In your experience, NVX, does replacing the flash drive usually fix this sort of error? How many units/times have you had to replace them? Do you think it will happen just as often/easily with the Cruzer Fit as with the original OEM drive? Do you think the included flash drives are inferior? Or is the OS doing too many writes to them or something? I am just not sure why a lot of them would be failing. Both of mine were manufactured in the middle of 2013, but I bought them after that.

Cost of shipping was $6 for a USPS flat rate box, so about the same as the Cruzer Fit. Time was minimal as I already ship products for my business. It would have taken longer to order the flash drive, open up the ERL (and void the warranty) and then reinstall the OS on it. But to each his own :) I like fixing things too, and will probably have to do this sooner or later if the flash drives keep failing, as my units will be out of the 1 year period soon.

Veteran Member
Posts: 5,417
Registered: ‎03-12-2011
Kudos: 2708
Solutions: 128

Re: EdgeMax rescue kit - my experience rescuing a corrupt drive


@hilo90mhz wrote:

In your experience, NVX, does replacing the flash drive usually fix this sort of error? How many units/times have you had to replace them? Do you think it will happen just as often/easily with the Cruzer Fit as with the original OEM drive? Do you think the included flash drives are inferior? Or is the OS doing too many writes to them or something? I am just not sure why a lot of them would be failing. Both of mine were manufactured in the middle of 2013, but I bought them after that.

Cost of shipping was $6 for a USPS flat rate box, so about the same as the Cruzer Fit. Time was minimal as I already ship products for my business. It would have taken longer to order the flash drive, open up the ERL (and void the warranty) and then reinstall the OS on it. But to each his own :) I like fixing things too, and will probably have to do this sooner or later if the flash drives keep failing, as my units will be out of the 1 year period soon.


I live in Australia where shipping is a tad more expensive annoyingly...

I've repaired 3-4 units that failed because the USB stick got corrupted. None of the repaired ones has failed again so far.

I have looked at how much EdgeOS writes to flash, and it's fairly minimal. I did this mainly because I run a bit of custom software on them and wanted to make sure I wasn't doing anything silly that was causing them to die, more so than because I thought EdgeOS was doing something silly. At the end of the day neither EdgeOS nor my custom stuff seemed to be writing much at all to the flash; even the total over a full year was a pretty small amount. I don't have the numbers on hand though.

I'm guessing the flash they purchased was just cheap stuff (can't really blame them for a $99 MSRP router). SanDisk drives are pretty reliable, so I suspect a unit repaired with a SanDisk would probably have a better lifespan than a replacement from UBNT (which would probably have a similar cheap flash drive anyway).

New Member
Posts: 30
Registered: ‎09-22-2013
Kudos: 6

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

Greetings from Hawaii! We have expensive shipping for most larger items too, but luckily USPS flat rate boxes are the same price anywhere in the US.

Thanks for your insight into the issue and for your write testing. That does make me want to replace the flash drives in all my important EdgeRouters now. I use only SanDisk flash drives for most things too. I am still hoping that Ubiquiti has started using a higher quality flash drive. It seems like it would be worth it to them if they didn't have to deal with all the RMAs, and reliability is always a big deal in networking.

They seem to work fine for months, then one power cycle or reset and they never boot up again. I'm guessing that once the OS is in RAM it just keeps going along happily. Then as soon as it has to reload the image, it finds that the flash drive has become corrupted.

The ERLs do get quite hot too, so I have fans on all of mine now to maybe help out the flash drives and other components.


Veteran Member
Posts: 5,417
Registered: ‎03-12-2011
Kudos: 2708
Solutions: 128

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)


@hilo90mhz wrote:
That does make me want to replace the flash drives in all my important EdgeRouters now. I use only SanDisk flash drives for most things too. I am still hoping that Ubiquiti has started using a higher quality flash drive. It seems like it would be worth it to them if they didn't have to deal with all the RMAs, and reliability is always a big deal in networking.

They seem to work fine for months, then one power cycle or reset and they never boot up again. I'm guessing that once the OS is in RAM it just keeps going along happily. Then as soon as it has to reload the image, it finds that the flash drive has become corrupted.

Worth noting: I've never had an ER-PoE fail, and all the ERLs I've used thus far were from some of the original batches (plastic case). We've only recently gotten a new shipment of metal-cased ERLs. The point is that the newer flash drives (the ones used in the ER-PoEs and the metal-cased ERLs) could very well be more reliable, but time will tell. I also use more ERLs than ER-PoEs, and the ER-PoEs are newer, so it could just be that not enough time has passed yet for one to fail.

Another trick: you'll usually find ext3 and/or squashfs errors in the dmesg output before the thing gives up the ghost completely. You can use this either to re-image the device at that stage (copy the squashfs and kernel images over), which sometimes works and sometimes doesn't, or at least to know "hmm, this isn't going to come back up if I reboot it". You can also take an md5 of the squashfs and kernel images (the latter requires manually mounting the partition) and compare it to the .md5 file to see if they're corrupt.
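
The md5 comparison could be sketched in Python roughly like this. It assumes the .md5 file begins with the hex digest (optionally followed by a file name, md5sum style), which I haven't verified for EdgeOS. The helper names and the paths in the usage note are placeholders.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """MD5 hex digest of a file, read in chunks so a large image isn't loaded at once."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(image: Path, md5_file: Path) -> bool:
    """Compare an image against the digest stored in its .md5 file."""
    # Assumption: the first whitespace-separated token is the hex digest.
    tokens = md5_file.read_text().split()
    if not tokens:
        return False  # an empty .md5 file cannot confirm anything
    return md5_of(image) == tokens[0].lower()
```

For example, matches_md5_file(Path("/mnt/root/squashfs.img"), Path("/mnt/root/squashfs.img.md5")) for the system image, and the same for the kernel once the boot partition is mounted. A False result means the image no longer matches what was written at install time.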

At my important sites I have ER8/ER8Pros, as they use on-board eMMC flash instead of USB sticks for storage. To date I haven't seen anyone complain of those failing the way the ERLs do, nor have I personally seen it on the ER8/ER8Pros I run (and I've still got an Alpha unit running fine!), so it seems safe to say that the eMMC used on those is more reliable. If you really care about a site, it's probably worth paying that little extra for an ER8 or ER8Pro. Plus they're more easily rack mounted and have a beefier CPU (seriously, configuring them is much faster; it feels really snappy compared to ERLs), more RAM, more ports, etc.

New Member
Posts: 33
Registered: ‎02-04-2014
Kudos: 3
Solutions: 1

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

The question is will Ubiquiti even do an RMA on my unit if it appears to be working fine?

I've got until April to file an RMA.

Maybe it'll brick itself again by then.

New Member
Posts: 30
Registered: ‎09-22-2013
Kudos: 6

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)

Yeah, hopefully it keeps working ;) If not, you may be better advised to take NVX's advice and replace the flash drive with a known quality one like SanDisk. I may open the RMA unit they send me and see if the flash drive looks any different from the old type. They are just a metal box with no marking, though, so it might be hard to tell.

Veteran Member
Posts: 5,417
Registered: ‎03-12-2011
Kudos: 2708
Solutions: 128

Re: EdgeMax rescue kit (now you can reinstall EdgeOS from scratch)


@hilo90mhz wrote:

Yeah, hopefully it keeps working ;) If not, you may be better advised to take NVX's advice and replace the flash drive with a known quality one like SanDisk. I may open the RMA unit they send me and see if the flash drive looks any different from the old type. They are just a metal box with no marking, though, so it might be hard to tell.


If you take the USB stick out there's a little plastic retainer end cap. Pull that off and the whole thing comes apart. On the black ceramic thing that actually has the USB pins on it the reverse side usually has a manufacturer logo. On all the dead ones I've pulled apart they had "Toshiba" branding.

Reassembly is easy, just put the 3 parts back together and et voila.
