02/24/2018
Distant Curve - A Story About Coming Unstuck...
Reason for Installation
A series of radio repeaters to take high speed internet to an industrial site in the Northern Territory
Used Products
airFiber 5X ×28
Location
Tennant Creek NT, Australia
Description

Distant Curve Remote Area Communications prides itself on considered and redundant design for remote areas and challenging environments. Having said that, we've learnt to expect the unexpected, from cockatoos eating GPS units to wallabies jumping on our solar panels...

We had one such event today... The attached graph shows the temperature inside 2 of our sites in the central-north Northern Territory (about 30 km apart). There are a couple of unusual things about this graph. The blue line (a site called 'Yakulla') shows a dramatic drop in temperature just after sunrise - that's the fans activating just before sunrise and pulling in cooler air, as they're meant to. The orange line (Kankawalla) shows no such drop - in fact it just continued to climb until reaching a peak of nearly 50 degrees Celsius at about lunchtime. It looked to us like all 5 fans had stopped working, which is unusual in itself because they're on two separate circuits for redundancy and we use expensive fans with ceramic bearings for long life.

Logging into the cameras at about 7am this morning, we confirmed it - we could not hear the fans. With the equipment inside the boxes looking like it might reach 70 degrees Celsius today, we knew we needed to take some action. Flying out there is incredibly expensive - around $3000 return. But what else could we do? All of the gear inside each control box is industrial rated (we tend to use Rockwell, Allen-Bradley, Hirschmann etc.), but a total ventilation failure is nonetheless not only unusual but potentially problematic. So, time for some analysis - what was the cause? Stuck fan control relays? A problem with the controller? Hot temperatures causing a false trigger of the polyswitch fuses?

We custom designed the controllers for each site - we call these site controllers 'CurveIQ'. They're an embedded expert system that controls all the subsystems at each site, they're based around an Atmel processor, and they're at the heart of what makes our systems so reliable. They do things like control battery charging very accurately, monitor state of charge, autonomously interrogate devices and troubleshoot in the event of a link failure, and fix and report problems with the redundancy subsystems.

Amongst many other things they control the fans based upon temperature, but we can also switch the fans on and off remotely, with about a 2 minute delay between 'executing the instruction' and the fans following it. So we tried power cycling the fans a few times... Using the camera, we could hear the fan control switches clicking when we executed these commands, but the fans still wouldn't start. That left us with 1 of 3 causes - a thermal effect on our (self-resetting) fuses, poor contact on our relays, or perhaps a progressive failure of our fans.

So what could we do? If it was a polyswitch problem, it would probably resolve itself in a week or so when the temps get lower - but it seemed really unlikely that this was the cause.

One huge benefit of having designed CurveIQ and written its code ourselves is that we can get it to do novel stuff... So after a bit of thinking, we wrote an extra function in C++ and uploaded the new firmware through an encrypted tunnel, 3000 km into the desert, onto the Atmel chip of the controller. The code looked something like this:

 

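void rattleRelays()
{
    // 200 cycles x (100 ms + 100 ms) = 40 seconds, about 5 toggles per second
    for (int i = 0; i < 200; i++) {
        wdt_reset();          // reset the watchdog so it doesn't reboot the chip mid-rattle
        PORTC = B11111100;    // drive the two lowest PORTC pins low (the two fan relay circuits)
        delay(100);
        PORTC = B11111111;    // drive them high again
        wdt_reset();
        delay(100);
    }
}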

The basic idea (or 'hope') of the code was to rapidly open and close the electronic switches (relays) on both sets of fans about 5 times a second for 40 seconds - to 'rattle' the fans and hopefully get them turning again. After uploading, we told the new program to execute. Straight away we could hear the rattling of the relays through the cameras - a sound a bit like a woodpecker - and gradually, after about 30 seconds, we started to hear the familiar 'hum' of the fans kicking in.
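As a rough check on those numbers, here is a minimal host-side sketch that works out how long a loop like this runs and how fast it toggles. It is not part of the CurveIQ firmware; the `RattleTiming` struct and the `rattleTiming` function are names made up for illustration, and it ignores the small time spent on the port writes and watchdog resets.

```cpp
#include <cassert>
#include <cstdint>

// Timing summary for a relay "rattle" loop (hypothetical helper type).
struct RattleTiming {
    std::uint32_t total_ms;     // total run time of the loop, in milliseconds
    double toggles_per_second;  // full open/close cycles per second
};

// cycles:  number of loop iterations
// low_ms:  delay after driving the relay pins low
// high_ms: delay after driving the relay pins high
RattleTiming rattleTiming(std::uint32_t cycles,
                          std::uint32_t low_ms,
                          std::uint32_t high_ms) {
    assert(low_ms + high_ms > 0);  // avoid dividing by zero below
    const std::uint32_t period_ms = low_ms + high_ms;
    return { cycles * period_ms, 1000.0 / period_ms };
}
```

Calling `rattleTiming(200, 100, 100)` gives 40,000 ms at 5 toggles per second, which matches the 40 seconds described above. A loop written with `i <= 200` runs 201 cycles, so it would take one extra cycle (200 ms) longer.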

If you have a look at the graph you'll see the temps dropped substantially after 12pm (when we executed 'the rattle') - and they continue to drop. It was a success.

We're now fairly certain that the problem was due to the recent dust storms in the area. It seems that the very fine desert dust (like talcum powder) had got through the filters and been drawn into the fans' ceramic bearings, which have very fine tolerances, and clogged them. With the benefit of hindsight, the power consumption of the box seems to indicate that the fans had probably been failing one at a time over the last few days, until this morning all 5 had failed.

We thought we were doing the right thing by using open ceramic bearings for long life, but it seems to me that in the future we'll be using more traditional grease-encapsulated, sealed sleeve-bearing fans. Managing these remote sites sometimes feels a bit like managing something on Mars - but we've got enough redundancy in each system that we can address almost any problem. At no time during this event did our client lose their connectivity, so for that reason I consider this unexpected failure to be a success. We've got a trip planned out there in March and the weather is due to cool down over the next 2 weeks or so, so we'll replace these fans when we're out there instead of being forced to rush out. In the meantime we'll just leave the fans running 24/7 - momentum usually keeps a dicky fan going.

 

Attached images: Kankawalla North Facing - Feb 24 '18 01_14_06 PM.jpg, PR1 - Looking East - Jan 20 '18 07_07_37 AM.jpg, 20151123_121305.jpg, Graph.jpg, console.jpg

 

 


by ‎02-24-2018 03:41 PM - edited ‎02-24-2018 08:55 PM



Comments
by
on ‎02-24-2018 05:45 PM

AWESOME Story , Thank you for sharing!!! 

by
on ‎02-25-2018 08:22 AM

@doc_karl You guyz are so cool !!!

"desert dust had entered through the filters". Why not try other filters then? Or can't you just add grease to the ceramic bearings?

by
‎02-25-2018 09:55 AM - edited ‎02-25-2018 09:56 AM

Desert dust (from dry lake beds and clay deposits especially) gets into the low and sub-micron range. Very tough to filter that without significant amounts of fan power.

by Ubiquiti Employee
on ‎02-25-2018 11:52 AM

@doc_karl thanks for the story on the radios deployed, definitely appreciated that you share!

by
‎02-25-2018 04:46 PM - edited ‎02-25-2018 04:49 PM

Thanks guys - they're still going strong. 

 

@jjonsson @MimCom - dust ingress is one of those funny things. Given that we aim to visit these sites no more than once a year, and given the highlighted deficiencies of the cabinet filters, I'm almost beginning to consider doing away with the fine filters altogether and using a coarse filter that's basically there only to keep out vertebrate and invertebrate pests (e.g. snakes and ants). My reasoning is that the gear inside the boxes (industrial ethernet switches, control gear, power conversion gear) is all designed to withstand a dusty environment anyway. A lot of the 'received knowledge' about the need for air filtration probably harks back to the days of spinning platter hard drives, when a dust particle could bring down a server because the particle was wider than the gap between the platter and the head. In these environments, airflow is king - so MimCom makes a good point about the potential for filters to get clogged. There's almost nothing I could do remotely if the filters got clogged, and the effect would be as bad or worse. I think I just need to find fans that can handle dust a bit better.

 

Having said that, I'm working on an alternate hypothesis about what may have happened. We had about 2 weeks running where night time temps never went below about 100 degrees Fahrenheit - it was a long heatwave. Polyswitches have a heat derating curve. Looking at the specs for these polyswitches, there should be plenty of overhead to handle the startup pulse of the fans each morning, but perhaps not in this extreme case. It still wouldn't really explain how the fan rattle got them spinning, but it's interesting to consider. We'll find out when we get out there.
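To make the derating idea concrete, here is a minimal sketch of the kind of check involved, assuming the curve is available as a table of (ambient temperature, percent of rated hold current) points read off a datasheet. The `DeratingPoint`, `deratingPercent` and `holdsAt` names are invented for illustration, and no real polyswitch figures are included; any numbers fed in would have to come from the actual part's datasheet.

```cpp
#include <cassert>
#include <cstddef>

// One point on a derating curve: at ambient temperature temp_c the fuse
// holds `percent` of its rated hold current (hypothetical helper type).
struct DeratingPoint {
    double temp_c;
    double percent;
};

// Linearly interpolate the derating percentage at ambient_c.
// `curve` must be sorted by ascending temp_c and contain n >= 2 points;
// temperatures outside the table are clamped to the nearest end point.
double deratingPercent(const DeratingPoint* curve, std::size_t n, double ambient_c) {
    assert(curve != nullptr && n >= 2);
    if (ambient_c <= curve[0].temp_c) return curve[0].percent;
    if (ambient_c >= curve[n - 1].temp_c) return curve[n - 1].percent;
    for (std::size_t i = 1; i < n; ++i) {
        if (ambient_c <= curve[i].temp_c) {
            const double span = curve[i].temp_c - curve[i - 1].temp_c;
            const double frac = (ambient_c - curve[i - 1].temp_c) / span;
            return curve[i - 1].percent +
                   frac * (curve[i].percent - curve[i - 1].percent);
        }
    }
    return curve[n - 1].percent;  // not reached for a sorted table
}

// True if the derated hold current still exceeds the load current.
bool holdsAt(double rated_hold_a, double load_a,
             const DeratingPoint* curve, std::size_t n, double ambient_c) {
    return rated_hold_a * deratingPercent(curve, n, ambient_c) / 100.0 > load_a;
}
```

The comparison that matters for this hypothesis is the derated hold current at the cabinet's overnight temperature against the fans' startup current, rather than against their steady running current.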

 

by Ubiquiti Employee
on ‎02-26-2018 08:52 AM

Very cool, always nice to see a challenging deployment like this done right.

by
on ‎03-01-2018 02:17 AM

rattleRelays() !!!!

Pure bloody genius!   

 

The fact you have managed to get these links up and working at all, let alone for over a year, is very impressive. Fixing a set of fans with a remote rattle routine?  **Words fail. Tips cap**

by
on ‎03-03-2018 08:12 AM

Congratulations, and thanks so much for sharing your issues with us, from Ecuador, Latin America. God bless you!

 

I hope you go on talking about your experiences and solutions!

 

regards ! from Ecuador!

 

by
on ‎03-03-2018 10:55 AM

Your company's design skills are impressive. For the sake of a redundant ventilation system, have you considered using both the ceramic bearing fans and the sealed lubricated fans, running only one system at a time? Since the ceramics potentially have a better life span if protected from desert dust, use those as primary, switching over to the lubricated fans when dust storms or other conditions indicate the need to protect the ceramics. Without the ceramics running during heavy dust conditions, you would probably reduce the chances of clogging the fine particulate filters as well.

 

Just a thought.

by
on ‎03-03-2018 12:15 PM

Very nicely done - reminds me of some of the things deep space satellite and Mars Rover folks have had to do over the years...

 

Anything that's moving/mechanical is a problem in certain environments. If it was indeed the bearings, about all you can do is use sealed ones instead. Perhaps add something like felt donut pads to the motor shafts to keep the dust out, or maybe look into fans used by the oil and gas industry - they see conditions like yours a lot. The cabinets that MetroLink/Ricochet used here in the US for their outdoor equipment included Peltier effect heat pumps between an outer and an inner sealed chamber, with inches of rigid insulation around the equipment chamber to make sure no dust got onto the equipment - I know that won't work in your case due to power considerations. You might just have to accept replacing the fans every year to prevent them seizing - sometimes you just have to declare something a consumable part and go on down the road...

Jim