Hi Simon,
On 5/29/2015 9:15 AM, Simon Clubley wrote:
> On 2015-05-29, Don Y <this@is.not.me.com> wrote:
>>
>> Technician had placed a Black Cat with nichrome wire across the power
>> supply for the "Bang!" -- and a flashbulb for the "<flash>".
>>
>> He took great pleasure in commenting about how shook up I was!
>>
>> Then, I drew his attention to the fact that the machine wasn't
>> powering up: "Ooops!"
>>
>> Suddenly, *he* was the one who was shook up! (How to explain to the
>> boss that his practical joke had cost us THE prototype! :> )
> If I had done that, I would have expected to have been fired. :-)
<shrug> For the most part, I've been fortunate to work with people
who "didn't take themselves too seriously". This, IME, makes a huge
difference in how "creative" people can get in their solutions...
less worried about failing or "doing something that, in hindsight,
was obviously pretty 'stupid'". OTOH, ripe for coming up with really
*clever* approaches to problems that "less inspired" designs would
stumble on. Not the sort of environment for folks with big egos.
>> When I worked on The Reading Machine, one of the basic tests we would do while
>> bringing the system up was to push phonemes at the speech synthesizer to
>> verify the data path was intact, synthesizer functional, amplifier, etc.
>> These were all incredibly short bits of code because you had to "bit switch"
>> them into core (minicomputer-based). So, you just had a crib sheet of
>> octal codes that you'd quickly toggle into the machine, hit RUN and watch
>> (listen) what happens.
>
> I'm young enough to have missed the machines which needed a full bootstrap
> routinely keyed into them, but old enough to have run across (as a student)
> machines with a full console front panel.
>
> So yes, I understand the _strong_ desire to have kept this stuff short. :-)
The "normal" application was obviously too long to bit-switch in like this.
A tiny bipolar ROM (I think 16x16 -- or maybe 32x16?) did the normal
bootstrap... which loaded the image from a "data cassette" (the
"Compact Cassette" format that was popular for music, at the time).
Once loaded (into *core*), it was persistent, of course. So, subsequent
power-ups just caused the code to start running immediately (cassette load
was pretty slow).
> BTW, I think it also makes you reflect that you have knowledge and
> experience of a way of doing things that today's newcomers will never
> experience - at least it does for me.
The biggest take-away is learning to *think* about a problem before
just flailing away at it: "Let me try this, recompile... nope, that
wasn't it!" I think a lot of bugs creep into code because people
only partially think through their proposed remedies -- it's too
easy to just make a change, recompile and see the code (*appear*!)
to work... then, move on as if that problem was solved. As if each
problem was nothing more than a "typo".
At one point, I was working for a firm that had subcontracted some
defense work from big blue. I was responsible for debugging the
"processor" in the device.
We got a new device and their engineer came to help get the first
machine up and running. A "Series 1" minicomputer was used to drive
the test harness. The comms path (hardware) between the S1 and UUT
was physically long (30 or 40 feet) and had to go through various
gyrations to get to the proper logic levels, etc.
*LOTS* of one-shots (though they don't like calling them that!)
to account for delays in various level translators, etc. This one
triggers that one which, in turn, triggers this OTHER one, etc.
At one point, we couldn't get the two devices to communicate. I
was convinced the problem was an insufficient delay in one stage of
the "one-shot chain". Their engineer sat down, did the math and
convinced himself that this was NOT the problem. So, dismissed the
idea and went chasing other possible problems "on paper".
Never one to blindly "defer to my elders", I just walked off, grabbed
a honking BIG capacitor that was lying on a nearby bench (without
concern for its actual *rating*), held it across the timing capacitor
for the one-shot that I suspected and, voila! Everything started
working!
"What did you just do??"
I showed him the cap. His eyes went wide when he saw that it
was like 1000 times larger than the circuit required...
"Well, that's way too big!"
"Yes, I know. But, obviously, the one that's *in* there isn't
big enough! Now, we can sort out why that's the case! (Wrong
component installed? Tolerances? Some other issue that the
design failed to take into consideration?)"
The current approach to much debugging seems to be "slap the
big capacitor in the circuit and, if it works, LEAVE IT THERE!"
> This even shows up in silly little ways; for example, I sometimes miss
> the ability to physically write protect a drive in certain situations.
Possible with most of my SCSI drives (via a jumper). The issue
is then whether the OS will gag when it encounters this restriction!
We used to code with the KNOWLEDGE/ASSURANCE that the executable would
be installed in R/O memory. E.g., using 16rFF as a terminator because
it could easily be checked (with an "increment the byte that this register
is pointing at" opcode).
When we started building SRAM modules to *emulate* EPROMs, we had to
include a "write protect" switch to ensure the SRAM behaved *like*
an EPROM once the software image was installed. You quickly learned
that failing to flip the switch caused your code to get clobbered
really quickly! ("Hmmm... why are the data in all of these memory
locations exactly +1 from what they *should* be?")
> I also suspect that my code is tighter as a result of growing up on more
> resource limited machines.
The "attitude" also extends to other aspects of design, beyond "software".
E.g., a medical device I designed many years ago had to maintain an
internal database that would be served up via a pair of serial ports
and a query language that I had designed. At the time, DRAM was
small (16Kx1, 64Kx1) and EXPENSIVE! Stuffing 64K devices would
add considerably to the cost. Yet, restricting the design to 16K
devices could later require a redesign of the PCB and/or software.
My solution was to stuff 16K parts -- but, allow any or all of them to
be replaced with 64K parts. And, the software treated the first 16K
of that address space as "complete words"; but, all addresses beyond
that were treated as "N-(possibly non-contiguous)-bit wide".
During POST, the system would clarify the types of memory devices
present in each "bit position" -- in effect, creating a mask that
indicated where the bits were valid in this "beyond 16K space".
All accesses to the "data store" would occur through:
result_t get_word(addr_t address, word_t &word)
result_t put_word(addr_t address, word_t &word)
accessors. Of course, much slower than doing a memory cycle on
a specific address! But, infinitely faster than the data rate
that the query interface encountered (serial ports).
OTOH, I can recall another early design where the hardware guy had
opted to save the cost of a shift register -- forcing the software
to do shift-store cycles in a tight loop. It was *embarrassing*
to see how much that "savings" COST the design!
[hardware and software folks tend not to overlap, IME]
The problem with this sort of mindset is that it is REALLY hard
to shake! I recall designing an interface to a PROM programmer
and found myself INSTINCTIVELY writing (in C) things like:
   put_nybble(value)  { put(value & 0x0F); }
   put_byte(value)    { put_nybble(value >> 4); put_nybble(value); }
   put_word(value)    { put_byte(value >> 8);   put_byte(value); }
   put_long(value)    { put_word(value >> 16);  put_word(value); }
without thinking about whether this was *necessary* or *clear*!
I'm now working in a resource rich (more like "resource gluttony!")
environment and it is REALLY hard to discipline myself not to micro-manage
aspects of the design: "Burn a few million cycles, who cares! Use
them to improve reliability and ease maintenance efforts!"
I think going from scarce to plentiful is considerably easier than
trying to do it the other way around. I suspect most folks who write code
for desktops haven't a clue as to maximum stack penetration, etc. They
just tweak things until they *appear* to work -- and hope that they have
encountered (purely by CHANCE!) the worst case scenario at some point
"on the bench" (instead of designing *for* it!). So, "flukes" just get
shrugged off -- instead of explored in detail: "That SHOULDN'T happen!
So, why *did* it? (you saw it, too, didn't you??)"