Randy Yates recently started a thread on programming flash that had an
interesting tangent into watchdog timers. I thought it was interesting
enough that I'm starting a thread here.
I had stated in Randy's thread that I avoid watchdogs, because they
mostly seem to be a source of erroneous behavior to me.
However, on reflection I realized that I lied: I _do_ use watchdog
timers, but not automatically. To date I've only used them when the
processor is spinning a motor that might crash into something or
otherwise engage in damaging behavior if the processor goes nuts.
In general, my rule on watchdogs, as with any other feature, is "use it
if using it is better", which means that I think about the consequences
of the thing popping off when I don't want it to (as during a code update
or during development when I hit a breakpoint) vs. the consequences of
not having the thing when the processor goes haywire.
Furthermore, if I use a watchdog I don't just treat updating the thing as
a requirement check-box -- so you won't find a timer ISR in my code that
unconditionally kicks the dog. Instead, I'll usually have just one task
(the motor control one, on most of my stuff) kick the dog when it feels
it's operating correctly. If I've got more than one critical task (i.e.,
if I'm running more than one motor out of one processor) I'll have a low-
priority built-in-test task that kicks the dog, but only if it's getting
periodic assurances of health from the (multiple) critical tasks.
Generally, in my systems, the result of the watchdog timer popping off is
that the system will no longer work quite correctly, but it will operate
safely.
So -- what do you do with watchdogs, and how, and why? Always use 'em?
Never use 'em? Use 'em because the boss says so, but twiddle them in a
"last part to break" bit of code?
Would you use a watchdog in a fly-by-wire system? A pacemaker? Why?
Why not? Could you justify _not_ using a watchdog in the top-level
processor of a Mars rover or a satellite?
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Reply by Paul Rubin●May 9, 2016
Tim Wescott <seemywebsite@myfooter.really> writes:
> To date I've only used them when the processor is spinning a motor
> that might crash into something or otherwise engage in damaging
> behavior if the processor goes nuts.
I haven't done anything with motors, but have used watchdogs in comm
gear that typically sits in a customer's closet somewhere. The concern
is less about the code going nuts and damaging something, than about
getting wedged somehow so that the box stops working. Typically a
customer with a wedged box would call the help desk, and the support rep
would first tell them to try going to the box and power cycling it. The
watchdog does essentially the same thing, without the customer getting
involved. They at worst experience a short service outage and hopefully
shrug it off. But usually it would happen without them even noticing.
> Would you use a watchdog in a fly-by-wire system? A pacemaker? Why?
Yes. The idea again is not to prevent something from going nuts, but to
restore the system from a wedged or broken state to a known good state
if it gets in trouble somehow.
That's actually part of "official" Erlang programming methodology: to
not engage in recovering from bad inputs or other sorts of defensive
programming. If the system passed QA and still managed to reach some
scroggled state where unexpected input reached your little function deep
in the weeds, who knows what else is borked that got it like that? The
Erlang motto is "let it crash", which means allow unexpected errors to
unceremoniously blow away your whole process. The Erlang supervision
system sees the crash and restarts/reinitializes the process so it's
again in a known good state, and that (reliability studies show) usually
fixes things.
The above might sound cavalier but Erlang was developed for serious
high-end telecom gear that's supposed to keep running for decades at a
time, purport to have nine 9's of uptime, allow code upgrades without
interrupting any phone calls, etc. Another Erlang saying is that a
reliable system must be distributed: if it runs on only one CPU, the
power cord is a single point of failure. So yeah, if a CPU fails, the
supervision system sees that too, and transfers control to another one.
> Could you justify _not_ using a watchdog in the top-level processor of
> a Mars rover or a satellite?
Reply by rickman●May 9, 2016
> Randy Yates recently started a thread on programming flash that had an
> interesting tangent into watchdog timers. I thought it was interesting
> enough that I'm starting a thread here.
>
> I had stated in Randy's thread that I avoid watchdogs, because they
> mostly seem to be a source of erroneous behavior to me.
>
> However, on reflection I realized that I lied: I _do_ use watchdog
> timers, but not automatically. To date I've only used them when the
> processor is spinning a motor that might crash into something or
> otherwise engage in damaging behavior if the processor goes nuts.
>
> In general, my rule on watchdogs, as with any other feature, is "use it
> if using it is better", which means that I think about the consequences
> of the thing popping off when I don't want it to (as during a code update
> or during development when I hit a breakpoint) vs. the consequences of
> not having the thing when the processor goes haywire.
>
> Furthermore, if I use a watchdog I don't just treat updating the thing as
> a requirement check-box -- so you won't find a timer ISR in my code that
> unconditionally kicks the dog. Instead, I'll usually have just one task
> (the motor control one, on most of my stuff) kick the dog when it feels
> it's operating correctly. If I've got more than one critical task (i.e.,
> if I'm running more than one motor out of one processor) I'll have a low-
> priority built-in-test task that kicks the dog, but only if it's getting
> periodic assurances of health from the (multiple) critical tasks.
>
> Generally, in my systems, the result of the watchdog timer popping off is
> that the system will no longer work quite correctly, but it will operate
> safely.
>
> So -- what do you do with watchdogs, and how, and why? Always use 'em?
> Never use 'em? Use 'em because the boss says so, but twiddle them in a
> "last part to break" bit of code?
>
> Would you use a watchdog in a fly-by-wire system? A pacemaker? Why?
> Why not? Could you justify _not_ using a watchdog in the top-level
> processor of a Mars rover or a satellite?
Watchdog timers are not often used in FPGAs. I guess that's because
processes in HDL seldom get stuck or lost in the weeds. ;)
When I did design a software project we had multiple tasks each kicking
another task which would track what was going on and "pet" the watch dog
to keep it from barking. The various tasks had periods of "interest"
different from the watch dog timeout, so this process dealt with the
appropriate time period of each of the tasks being watched. Only this
task needed to actually deal with the watch dog period.
--
Rick C
Reply by Rob Gaddi●May 9, 2016
rickman wrote:
> On 5/9/2016 1:06 PM, Tim Wescott wrote:
>> Randy Yates recently started a thread on programming flash that had an
>> interesting tangent into watchdog timers. I thought it was interesting
>> enough that I'm starting a thread here.
>>
>> I had stated in Randy's thread that I avoid watchdogs, because they
>> mostly seem to be a source of erroneous behavior to me.
>>
>> However, on reflection I realized that I lied: I _do_ use watchdog
>> timers, but not automatically. To date I've only used them when the
>> processor is spinning a motor that might crash into something or
>> otherwise engage in damaging behavior if the processor goes nuts.
>>
>> In general, my rule on watchdogs, as with any other feature, is "use it
>> if using it is better", which means that I think about the consequences
>> of the thing popping off when I don't want it to (as during a code update
>> or during development when I hit a breakpoint) vs. the consequences of
>> not having the thing when the processor goes haywire.
>>
>> Furthermore, if I use a watchdog I don't just treat updating the thing as
>> a requirement check-box -- so you won't find a timer ISR in my code that
>> unconditionally kicks the dog. Instead, I'll usually have just one task
>> (the motor control one, on most of my stuff) kick the dog when it feels
>> it's operating correctly. If I've got more than one critical task (i.e.,
>> if I'm running more than one motor out of one processor) I'll have a low-
>> priority built-in-test task that kicks the dog, but only if it's getting
>> periodic assurances of health from the (multiple) critical tasks.
>>
>> Generally, in my systems, the result of the watchdog timer popping off is
>> that the system will no longer work quite correctly, but it will operate
>> safely.
>>
>> So -- what do you do with watchdogs, and how, and why? Always use 'em?
>> Never use 'em? Use 'em because the boss says so, but twiddle them in a
>> "last part to break" bit of code?
>>
>> Would you use a watchdog in a fly-by-wire system? A pacemaker? Why?
>> Why not? Could you justify _not_ using a watchdog in the top-level
>> processor of a Mars rover or a satellite?
>
> Watchdog timers are not often used in FPGAs. I guess that's because
> processes in HDL seldom get stuck or lost in the weeds. ;)
>
> When I did design a software project we had multiple tasks each kicking
> another task which would track what was going on and "pet" the watch dog
> to keep it from barking. The various tasks had periods of "interest"
> different from the watch dog timeout, so this process dealt with the
> appropriate time period of each of the tasks being watched. Only this
> task needed to actually deal with the watch dog period.
>
I'd say the FPGA equivalent to a watchdog is integrity checking
hardware, like ECC RAM, state machines with explicit invalid state
checking, all the way up to triple-modular redundancy. I've never
needed any of that nonsense because everything I work on remains
pleasantly surrounded by atmosphere, but it's definitely out there.
--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.
Reply by Don Y●May 9, 2016
On 5/9/2016 10:06 AM, Tim Wescott wrote:
> In general, my rule on watchdogs, as with any other feature, is "use it
> if using it is better", which means that I think about the consequences
> of the thing popping off when I don't want it to (as during a code update
> or during development when I hit a breakpoint) vs. the consequences of
> not having the thing when the processor goes haywire.
The problem is that you don't often know at design time which (if any)
failures will require this sort of protection. Even "bug free" code
can reside in a system that experiences hardware faults (power
supply fluctuations, input latchup, etc.).
So, do you try to bolt this capability on, after the fact? Or,
design around it from the start (hoping not to need it)?
> Furthermore, if I use a watchdog I don't just treat updating the thing as
> a requirement check-box -- so you won't find a timer ISR in my code that
> unconditionally kicks the dog. Instead, I'll usually have just one task
> (the motor control one, on most of my stuff) kick the dog when it feels
> it's operating correctly. If I've got more than one critical task (i.e.,
> if I'm running more than one motor out of one processor) I'll have a low-
> priority built-in-test task that kicks the dog, but only if it's getting
> periodic assurances of health from the (multiple) critical tasks.
There's no hard and fast rule for how you should implement a watchdog.
It's a component in your system, just like any other component.
Putting the stroking of the watchdog in the idle task can leave your system
vulnerable to any sort of momentary overload; or, necessitate an unduly
long timeout (to accommodate short overloads).
Putting it in an ISR is almost always silly -- for obvious reasons.
OTOH, I currently use the software equivalent of that mechanism by
having my "watchdog monitor" run as a very HIGH priority task! But,
one that spends most of its life blocking awaiting "sanity messages"
from the various tasks that are trying to stroke this *virtual*
watchdog.
Putting all of the watchdog (hardware) interface in one task allows
a more consistent -- and discerning -- implementation.
First, it ensures any such activities will get logged! If you've got
lots of independent/autonomous tasks stroking the watchdog, you never
know which one FORGOT to do so. As a result, you can't reconstruct
(post mortem) what went wrong when the device comes out of reset.
Second, it allows the "stroking" to be smarter and more demonstrable of
sentience on the part of the individual "strokers". I.e., instead of
just twiddling a bit, you can engage the other party in a dialog
and place further constraints to verify its sanity. ("Why are you
sending me these keep-alive messages at such an alarming rate?
I was only EXPECTING to receive them from you at a more modest
rate. Perhaps something has gone wrong in your implementation
or process state??")
Third, it allows for tasks to *request* a watchdog intervention!
("OhMiGsh! The motor is ignoring my commands to turn off!
Somebody pull the plug -- NOW!!!!") And, this can be logged
for post mortem.
> Generally, in my systems, the result of the watchdog timer popping off is
> that the system will no longer work quite correctly, but it will operate
> safely.
>
> So -- what do you do with watchdogs, and how, and why? Always use 'em?
> Never use 'em? Use 'em because the boss says so, but twiddle them in a
> "last part to break" bit of code?
(sigh) I have a lengthy paper/tutorial I wrote many years ago on
the subject as I'd had the "argument" with clients many times over
the years. People seem to have a naive concept of what watchdogs
(sentries) can and can't do -- as well as when they are indicated
vs. contraindicated.
[One of these days, I'll set up a web site and push all these
documents out there. But, far more interesting things to do
with the few hours present in each day :-/ ]
Watchdogs take many forms -- hardware and software. A process that
deliberately KILL's processes that it suspects of being corrupt is
just as much a watchdog as a piece of hardware that tugs on /RESET.
Communication happens both in-band and out-of-band. The former,
of course, tends to rely on "some (software)" remaining operational.
The latter works around it.
A watchdog plays a LAGGING role in a system (it "happens" AFTER
something has already gone wrong) as well as a LEADING role
(it informs the user/environment of a potential "more significant"
failure that hasn't percolated through the "system", yet!).
This role should not be glossed over. INFORMATION IS CONVEYED
by these mechanisms. Simply ignoring that information (i.e.,
letting the device reset itself) is usually not a very good idea.
[Consider what happens when you have a device that is eager to start
up quickly. So, if the device has incurred a watchdog upset, everything
appears to shut down, unceremoniously. Then, as the device starts
up, again, it rushes to get everything running, again -- just in time
for it to be (possibly) shut down by the same, persistent failure
retriggering the watchdog overrun. SOMETHING wants to be able to
detect when a watchdog event has occurred and adjust the RESTART
procedure (different from the START procedure) accordingly.]
I'm currently working on ways to signal remote devices when a
watchdog event has been triggered in some OTHER remote device;
without relying on in-band signalling (if the device is misbehaving,
how do I know it will be ABLE to inform others that it has just
been watchdogged?). The point being so those other devices can
adjust to this INFORMATION -- instead of wondering why some
service/capability (in which the failed node played a part) isn't
working properly AFTER SOME ARTIFICIAL DELAY.
> Would you use a watchdog in a fly-by-wire system? A pacemaker? Why?
> Why not? Could you justify _not_ using a watchdog in the top-level
> processor of a Mars rover or a satellite?
What's the reliability of each system and PROTECTION system?
I'd surely not want a watchdog on a Mars rover that resets
more frequently than the round trip radio delay to its earth
station!
(some hand-waving, there, but the point should be obvious)
Reply by Tim Wescott●May 9, 2016
On Mon, 09 May 2016 15:07:08 -0400, rickman wrote:
> On 5/9/2016 1:06 PM, Tim Wescott wrote:
>> Randy Yates recently started a thread on programming flash that had an
>> interesting tangent into watchdog timers. I thought it was interesting
>> enough that I'm starting a thread here.
>>
>> I had stated in Randy's thread that I avoid watchdogs, because they
>> mostly seem to be a source of erroneous behavior to me.
>>
>> However, on reflection I realized that I lied: I _do_ use watchdog
>> timers, but not automatically. To date I've only used them when the
>> processor is spinning a motor that might crash into something or
>> otherwise engage in damaging behavior if the processor goes nuts.
>>
>> In general, my rule on watchdogs, as with any other feature, is "use it
>> if using it is better", which means that I think about the consequences
>> of the thing popping off when I don't want it to (as during a code
>> update or during development when I hit a breakpoint) vs. the
>> consequences of not having the thing when the processor goes haywire.
>>
>> Furthermore, if I use a watchdog I don't just treat updating the thing
>> as a requirement check-box -- so you won't find a timer ISR in my code
>> that unconditionally kicks the dog. Instead, I'll usually have just
>> one task (the motor control one, on most of my stuff) kick the dog when
>> it feels it's operating correctly. If I've got more than one critical
>> task (i.e.,
>> if I'm running more than one motor out of one processor) I'll have a
>> low-
>> priority built-in-test task that kicks the dog, but only if it's
>> getting periodic assurances of health from the (multiple) critical
>> tasks.
>>
>> Generally, in my systems, the result of the watchdog timer popping off
>> is that the system will no longer work quite correctly, but it will
>> operate safely.
>>
>> So -- what do you do with watchdogs, and how, and why? Always use 'em?
>> Never use 'em? Use 'em because the boss says so, but twiddle them in a
>> "last part to break" bit of code?
>>
>> Would you use a watchdog in a fly-by-wire system? A pacemaker? Why?
>> Why not? Could you justify _not_ using a watchdog in the top-level
>> processor of a Mars rover or a satellite?
>
> Watchdog timers are not often used in FPGAs. I guess that's because
> processes in HDL seldom get stuck or lost in the weeds. ;)
I've spent lab time next to unhappily cursing FPGA guys (good ones)
trying to determine why their state machines have wedged.
So I'm not sure that's an entirely accurate statement.
> When I did design a software project we had multiple tasks each kicking
> another task which would track what was going on and "pet" the watch dog
> to keep it from barking. The various tasks had periods of "interest"
> different from the watch dog timeout, so this process dealt with the
> appropriate time period of each of the tasks being watched. Only this
> task needed to actually deal with the watch dog period.
That's more or less what I do if I need to keep watch on multiple tasks.
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Reply by rickman●May 9, 2016
On 5/9/2016 5:13 PM, Tim Wescott wrote:
> On Mon, 09 May 2016 15:07:08 -0400, rickman wrote:
>
>> On 5/9/2016 1:06 PM, Tim Wescott wrote:
>>> Randy Yates recently started a thread on programming flash that had an
>>> interesting tangent into watchdog timers. I thought it was interesting
>>> enough that I'm starting a thread here.
>>>
>>> I had stated in Randy's thread that I avoid watchdogs, because they
>>> mostly seem to be a source of erroneous behavior to me.
>>>
>>> However, on reflection I realized that I lied: I _do_ use watchdog
>>> timers, but not automatically. To date I've only used them when the
>>> processor is spinning a motor that might crash into something or
>>> otherwise engage in damaging behavior if the processor goes nuts.
>>>
>>> In general, my rule on watchdogs, as with any other feature, is "use it
>>> if using it is better", which means that I think about the consequences
>>> of the thing popping off when I don't want it to (as during a code
>>> update or during development when I hit a breakpoint) vs. the
>>> consequences of not having the thing when the processor goes haywire.
>>>
>>> Furthermore, if I use a watchdog I don't just treat updating the thing
>>> as a requirement check-box -- so you won't find a timer ISR in my code
>>> that unconditionally kicks the dog. Instead, I'll usually have just
>>> one task (the motor control one, on most of my stuff) kick the dog when
>>> it feels it's operating correctly. If I've got more than one critical
>>> task (i.e.,
>>> if I'm running more than one motor out of one processor) I'll have a
>>> low-
>>> priority built-in-test task that kicks the dog, but only if it's
>>> getting periodic assurances of health from the (multiple) critical
>>> tasks.
>>>
>>> Generally, in my systems, the result of the watchdog timer popping off
>>> is that the system will no longer work quite correctly, but it will
>>> operate safely.
>>>
>>> So -- what do you do with watchdogs, and how, and why? Always use 'em?
>>> Never use 'em? Use 'em because the boss says so, but twiddle them in a
>>> "last part to break" bit of code?
>>>
>>> Would you use a watchdog in a fly-by-wire system? A pacemaker? Why?
>>> Why not? Could you justify _not_ using a watchdog in the top-level
>>> processor of a Mars rover or a satellite?
>>
>> Watchdog timers are not often used in FPGAs. I guess that's because
>> processes in HDL seldom get stuck or lost in the weeds. ;)
>
> I've spent lab time next to unhappily cursing FPGA guys (good ones)
> trying to determine why their state machines have wedged.
>
> So I'm not sure that's an entirely accurate statement.
Ask them why their FSMs got stuck. In development they may make
mistakes, but you don't use watchdogs for debugging. In fact they get
in the way.
I've never had an FSM failure in the field, but I suppose there is a
first time. I did say "seldom", not never. An FSM in an FPGA is a
separate entity. No other process in the FPGA can step on its memory or
cause it to miss a deadline. CPUs are shared, which hugely complicates
multi-process designs in all aspects. You just don't have that in an
FPGA. By comparison FPGAs are simple. But maybe I've just not worked
on an FPGA design that was complicated enough to compare to what the
software guys do...
>> When I did design a software project we had multiple tasks each kicking
>> another task which would track what was going on and "pet" the watch dog
>> to keep it from barking. The various tasks had periods of "interest"
>> different from the watch dog timeout, so this process dealt with the
>> appropriate time period of each of the tasks being watched. Only this
>> task needed to actually deal with the watch dog period.
>
> That's more or less what I do if I need to keep watch on multiple tasks.
>
--
Rick C
Reply by o pere o●May 10, 2016
On 09/05/16 19:06, Tim Wescott wrote:
> Randy Yates recently started a thread on programming flash that had an
> interesting tangent into watchdog timers. I thought it was interesting
> enough that I'm starting a thread here.
>
> I had stated in Randy's thread that I avoid watchdogs, because they
> mostly seem to be a source of erroneous behavior to me.
>
> However, on reflection I realized that I lied: I _do_ use watchdog
> timers, but not automatically. To date I've only used them when the
> processor is spinning a motor that might crash into something or
> otherwise engage in damaging behavior if the processor goes nuts.
>
> In general, my rule on watchdogs, as with any other feature, is "use it
> if using it is better", which means that I think about the consequences
> of the thing popping off when I don't want it to (as during a code update
> or during development when I hit a breakpoint) vs. the consequences of
> not having the thing when the processor goes haywire.
>
> Furthermore, if I use a watchdog I don't just treat updating the thing as
> a requirement check-box -- so you won't find a timer ISR in my code that
> unconditionally kicks the dog. Instead, I'll usually have just one task
> (the motor control one, on most of my stuff) kick the dog when it feels
> it's operating correctly. If I've got more than one critical task (i.e.,
> if I'm running more than one motor out of one processor) I'll have a low-
> priority built-in-test task that kicks the dog, but only if it's getting
> periodic assurances of health from the (multiple) critical tasks.
>
> Generally, in my systems, the result of the watchdog timer popping off is
> that the system will no longer work quite correctly, but it will operate
> safely.
>
> So -- what do you do with watchdogs, and how, and why? Always use 'em?
> Never use 'em? Use 'em because the boss says so, but twiddle them in a
> "last part to break" bit of code?
>
> Would you use a watchdog in a fly-by-wire system? A pacemaker? Why?
> Why not? Could you justify _not_ using a watchdog in the top-level
> processor of a Mars rover or a satellite?
>
Quoting Tim Williams' book "The most cost-effective way to ensure the
reliability of a microprocessor-based product is to accept that the
program (or data or both, my addition) *will* occasionally be corrupted,
and to provide a means whereby the program flow can be automatically
recovered, preferably transparently to the user. This is the function of
the microprocessor watchdog."
So, the whole thing is what to do "when" (not "if") shit (the
unexpected) happens.
Pere
Reply by rickman●May 10, 2016
On 5/10/2016 9:36 AM, o pere o wrote:
>
> Quoting Tim Williams' book "The most cost-effective way to ensure the
> reliability of a microprocessor-based product is to accept that the
> program (or data or both, my addition) *will* occasionally be corrupted,
> and to provide a means whereby the program flow can be automatically
> recovered, preferably transparently to the user. This is the function of
> the microprocessor watchdog."
>
> So, the whole thing is what to do "when" (not "if") shit (the
> unexpected) happens.
That's an interesting approach, just give up on making the system
reliable and instead make it recover from a failure. You do realize
that just because Tim Williams said this, it doesn't make it gospel. It
*is* possible to make programs that work and in some cases a program can
be *proven* to work. But those are rare.
Sure, it's great if you can make your system recover from a catastrophic
failure. But there are many systems where that is not remotely a
solution. Virtually any real-time control needs to work and the only
other solution is to shut it down, preferably safely. Even that is not
always possible.
For any system where there is potential for harm to people or even
equipment (depending on the cost) the best approach is an independent
monitor that disconnects the errant controller. In other words, when
safety is important, a processor watchdog timer may not be adequate.
--
Rick C
Reply by Rob Gaddi●May 10, 2016
rickman wrote:
> On 5/9/2016 5:13 PM, Tim Wescott wrote:
>> On Mon, 09 May 2016 15:07:08 -0400, rickman wrote:
>>
>>
>> I've spent lab time next to unhappily cursing FPGA guys (good ones)
>> trying to determine why their state machines have wedged.
>>
>> So I'm not sure that's an entirely accurate statement.
>
> Ask them why their FSMs got stuck. In development they may make
> mistakes, but you don't use watchdogs for debugging. In fact they get
> in the way.
>
Oh, that's easy. Because of either:
An error in the synchronous logic, leaving it in a defined state with no
way out (20% chance).
An unsynchronized async input causing a race condition that static
timing couldn't catch (80% chance).
Or a single event upset (0.0001% chance).
--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.