In reply to "Anders.Montonen@kapsi.spam.stop.fi.invalid" who wrote the
following:
> joolzg <joolzg@btinternet.com> wrote:
> > Anybody got a simple 8052 emulator in C source, im trying to reverse
> > engineer some code and would like to emulate/simulate the code to get
> > better understanding as it looks like it was written in C and compiled
> > by a very bad compiler
>
> There's the Daniel's s51 simulator[1] which is used in the SDCC[2]
> debugger.
>
> -a
>
> [1] <http://mazsola.iit.uni-miskolc.hu/~drdani/embedded/ucsim/>
> [2] <http://sdcc.sourceforge.net/>
Thanks, but it's in Delphi/Pascal, so I would have to learn Pascal again to do my
mods.
joolz
Reply by ●May 25, 2011
On Tue, 24 May 2011 13:46:31 -0600, hamilton <hamilton@nothere.com>
wrote:
>On 5/24/2011 1:30 PM, D Yuniskis wrote:
>> Hi Hamilton,
>>
>> On 5/24/2011 12:23 PM, hamilton wrote:
>>> On 5/24/2011 2:11 AM, joolzg wrote:
>>>> Anybody got a simple 8052 emulator in C source, im trying to reverse
>>>> engineer
>>>> some code and would like to emulate/simulate the code to get a better
>>>> understanding as it looks like it was written in C and compiled by a
>>>> very bad compiler
>>>
>>> You don't what a emulator, you want a de-compiler or reverse compiler.
>>>
>>> An emulator will just execute the binary code as the real hardware would.
>>>
>>> Using the binary to get the C back is impossible !!!!
>>
>> Actually, for some simple-minded compilers, you can often reverse
>> engineer the code to get much of the "C" source (neglecting
>> variable names, some expressions, etc.). This is especially
>> true of old/early compilers that didn't do much optimization.
>
>For years I have heard that story.
>
>I have always asked to show me any links with the compiler in question,
>So I will ask if you have any links to this "simple compiler" ?
>
>I took a compiler class 30 years ago, and my professor at the time
>stated that it was not possible.
>With the better compiler available today it would be even more impossible.
I guess "simple compiler" refers to some 1970s compilers for the PDP-11,
Intel Intellec and Motorola EXORciser systems.
Writing compilers for these platforms was problematic due to the 64
KiB address space limit. Overlay loading helped a lot (each
compilation phase in a separately loaded overlay branch), but you
still had to reserve space for the symbol table, which had to stay
resident in memory. Overlay loading from floppies was also very
slow, so little optimization could be done. For the same reason,
compilers did not normally produce assembly output.
I once wrote an object code disassembler for PDP-11. Compared to
ordinary disassemblers, the object code disassembler can also display
the global symbols defined in this module as well as displaying any
external function names (including library function names) in plain
text.
I analyzed quite a few object files generated by Fortran, Pascal and C
compilers and was able to work out by "organic matching" how
each compiler generated code. After this, it was quite easy to
reverse engineer some algorithms.
These days, with good compilers, it is much harder to reverse
engineer things based purely on the executable code.
Reply by ●May 25, 2011
joolzg <joolzg@btinternet.com> wrote:
> In reply to "Anders.Montonen@kapsi.spam.stop.fi.invalid" who wrote the
> following:
>> joolzg <joolzg@btinternet.com> wrote:
>> > Anybody got a simple 8052 emulator in C source, im trying to reverse
>> > engineer some code and would like to emulate/simulate the code to get
>> > better understanding as it looks like it was written in C and compiled
>> > by a very bad compiler
>> There's the Daniel's s51 simulator[1] which is used in the SDCC[2]
>> debugger.
>> [1] <http://mazsola.iit.uni-miskolc.hu/~drdani/embedded/ucsim/>
>> [2] <http://sdcc.sourceforge.net/>
> thanks but its in delphi, pascal so would have to learn pascal again to do my
> mods
The DOS version was written in Pascal, the Unix version is written in C++
as you would have noticed if you'd downloaded the source code.
-a
Reply by Chris H●May 25, 2011
In message <4ddcbba5$0$1206$c3e8da3$4e334b76@news.astraweb.com>, joolzg
<joolzg@btinternet.com> writes
>In reply to "hamilton" who wrote the following:
>
>> On 5/24/2011 2:11 AM, joolzg wrote:
>> > Anybody got a simple 8052 emulator in C source, im trying to reverse
>> > engineer
>> > some code and would like to emulate/simulate the code to get a better
>> > understanding as it looks like it was written in C and compiled by a very
>> > bad
>> > compiler
>> >
>> > joolz
>> >
>> >
>> >
>> You don't what a emulator, you want a de-compiler or reverse compiler.
>>
>> An emulator will just execute the binary code as the real hardware would.
>>
>> Using the binary to get the C back is impossible !!!!
>>
>> Except for very simple programs.
>>
>> Even if you have the compiler sources and understood the compile
>> process, you still would not be able to get the binary -> C conversion
>> to work.
>>
>> But, have fun and good luck.
>>
>> hamilton
>
>Ive got that already, i want to SIMULATE THE CODE and give the code
>real inputs
>so i can validate my findings
>
>I will be rewriting it for another cpu as well so want to find out as much
>
>joolz
Use the Keil simulator
--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills Staffs England /\/\/\/\/
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Reply by Chris H●May 25, 2011
In message <irhc0q$p5d$1@speranza.aioe.org>, Anders.Montonen@kapsi.spam.
stop.fi.invalid writes
>joolzg <joolzg@btinternet.com> wrote:
>> Anybody got a simple 8052 emulator in C source, im trying to reverse
>> engineer some code and would like to emulate/simulate the code to get a
>> better understanding as it looks like it was written in C and compiled
>> by a very bad compiler
>
>There's the Daniel's s51 simulator[1] which is used in the SDCC[2]
>debugger.
I doubt it will work.
Reply by Chris H●May 25, 2011
In message <4ddcbb4c$0$1469$c3e8da3$fb483528@news.astraweb.com>, joolzg
<joolzg@btinternet.com> writes
>In reply to "Chris H" who wrote the following:
>
>> In message <4ddb6833$0$1509$c3e8da3$efbdef2c@news.astraweb.com>, joolzg
>> <joolzg@btinternet.com> writes
>> > Anybody got a simple 8052 emulator in C source, im trying to reverse
>> > engineer
>> > some code and would like to emulate/simulate the code to get a better
>> > understanding as it looks like it was written in C and compiled by a very
>> > bad
>> > compiler
>>
>> What is the target MCU? The 51 family is huge (over 600 variants) and
>> whilst the cores are similar there are some big differences.
>>
>Analog Devices ADuC84x
This is NOT a true 8051/52 core. Read the documentation: it is "based
on" an 8052. Not all 8051 simulators will handle non-standard
8051 parts like this one.
>> Why do you want the source of the simulator?
>>
>So i can add in a serial driver, also the output display, you know make the
>simulator behave like the real thing with inputs and outputs
Then use the Keil Simulator that can do this already.
>> How do you know the binary was written in C?
>I can tell from the way the code is written!! cant you tell the differnece
>between human and machine created code
Yes... However, you cannot tell which HLL was used.
>> How big is the binary?
>64k but not all used
>> What is it supposed to do?
>cant say
Use the Keil Simulator.
Reply by Walter Banks●May 25, 2011
joolzg wrote:
> In reply to "Chris H" who wrote the following:
>
> > In message <4ddb6833$0$1509$c3e8da3$efbdef2c@news.astraweb.com>, joolzg
> > <joolzg@btinternet.com> writes
> > > Anybody got a simple 8052 emulator in C source, im trying to reverse
> > > engineer
> > > some code and would like to emulate/simulate the code to get a better
> > > understanding as it looks like it was written in C and compiled by a very
> > > bad
> > > compiler
> >
> > What is the target MCU? The 51 family is huge (over 600 variants) and
> > whilst the cores are similar there are some big differences.
> >
> Analog Devices ADuC84x
>
> > Why do you want the source of the simulator?
> >
> So i can add in a serial driver, also the output display, you know make the
> simulator behave like the real thing with inputs and outputs
>
>
> > How do you know the binary was written in C?
> >
> I can tell from the way the code is written!! cant you tell the differnece
> between human and machine created code
>
> > How big is the binary?
> >
> 64k but not all used
>
> > What is it supposed to do?
> >
> cant say
You are going to a lot of work to reverse engineer an application.
Why is this needed?
w..
Reply by ●May 25, 2011
Chris H <chris@phaedsys.org> wrote:
> In message <irhc0q$p5d$1@speranza.aioe.org>, Anders.Montonen@kapsi.spam.
> stop.fi.invalid writes
>>joolzg <joolzg@btinternet.com> wrote:
>>> Anybody got a simple 8052 emulator in C source, im trying to reverse
>>> engineer some code and would like to emulate/simulate the code to get a
>>> better understanding as it looks like it was written in C and compiled
>>> by a very bad compiler
>>There's the Daniel's s51 simulator[1] which is used in the SDCC[2]
>>debugger.
> I doubt it will work.
Of course you do.
-a
Reply by David Brown●May 25, 2011
On 24/05/11 10:11, joolzg wrote:
> Anybody got a simple 8052 emulator in C source, im trying to reverse engineer
> some code and would like to emulate/simulate the code to get a better
> understanding as it looks like it was written in C and compiled by a very bad
> compiler
>
> joolz
>
It shouldn't be too hard to write a simulator yourself for a processor
like this. It's quite an effort if you want it to be fast, or to
accurately simulate interrupts and peripherals, but the core itself is
easy - you have an array to hold "ram", an array for "flash", a struct
holding the registers, and a huge switch statement interpreting each
instruction.
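The structure described above (arrays for memory, a struct for registers, one big switch) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names cpu51, cpu51_step, fetch and direct are made up for this sketch, only a handful of opcodes are decoded, only the carry flag is modelled, and nothing specific to the ADuC84x (or any other derivative) is handled.

```c
#include <stddef.h>
#include <stdint.h>

enum { CODE_SIZE = 65536 };

/* Hypothetical minimal core state; a real 8051 also maps ACC, PSW, SP
   and so on into SFR space, which this sketch ignores. */
typedef struct {
    uint8_t  a;                /* accumulator */
    uint8_t  psw;              /* status word; only CY (bit 7) is modelled */
    uint16_t pc;               /* program counter, wraps at 64 KiB */
    uint8_t  iram[256];        /* internal "ram" */
    uint8_t  sfr[128];         /* direct addresses 0x80..0xFF */
    uint8_t  code[CODE_SIZE];  /* "flash" image */
} cpu51;

static uint8_t fetch(cpu51 *c) { return c->code[c->pc++]; }

/* Direct addressing: below 0x80 is internal RAM, the rest is SFR space. */
static uint8_t *direct(cpu51 *c, uint8_t addr) {
    return addr < 0x80 ? &c->iram[addr] : &c->sfr[addr - 0x80];
}

/* Execute one instruction. Returns 0, or -1 for an opcode not decoded here. */
int cpu51_step(cpu51 *c) {
    uint8_t op = fetch(c);
    switch (op) {
    case 0x00: break;                               /* NOP */
    case 0x04: c->a++; break;                       /* INC A */
    case 0x24: {                                    /* ADD A,#imm */
        unsigned sum = (unsigned)c->a + fetch(c);
        c->psw = (uint8_t)((c->psw & 0x7F) | (sum > 0xFF ? 0x80 : 0));
        c->a = (uint8_t)sum;
        break;
    }
    case 0x70: {                                    /* JNZ rel */
        int8_t rel = (int8_t)fetch(c);
        if (c->a != 0) c->pc = (uint16_t)(c->pc + rel);
        break;
    }
    case 0x74: c->a = fetch(c); break;              /* MOV A,#imm */
    case 0x80: {                                    /* SJMP rel */
        int8_t rel = (int8_t)fetch(c);
        c->pc = (uint16_t)(c->pc + rel);
        break;
    }
    case 0xE5: c->a = *direct(c, fetch(c)); break;  /* MOV A,direct */
    case 0xF5: *direct(c, fetch(c)) = c->a; break;  /* MOV direct,A */
    default: return -1;
    }
    return 0;
}
```

To use it, copy the binary into the code array of a zeroed cpu51 and call cpu51_step() in a loop until it returns -1. Simulated inputs and outputs would most likely hook into direct(), for example by intercepting accesses to the serial buffer SFR (SBUF, 0x99 on a standard 8051).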
Reply by D Yuniskis●May 25, 2011
Hi Hamilton,
On 5/24/2011 12:46 PM, hamilton wrote:
>>> Using the binary to get the C back is impossible !!!!
>>
>> Actually, for some simple-minded compilers, you can often reverse
>> engineer the code to get much of the "C" source (neglecting
>> variable names, some expressions, etc.). This is especially
>> true of old/early compilers that didn't do much optimization.
>
> For years I have heard that story.
>
> I have always asked to show me any links with the compiler in question,
> So I will ask if you have any links to this "simple compiler" ?
>
> I took a compiler class 30 years ago, and my professor at the time
> stated that it was not possible.
<grin> It's relatively easy to disprove a negative. :> I'll
drag out some examples and post them here. I think you'll see that
most of these early compilers were pretty "straightforward" in the
way they emitted code. You could look at stanzas and deduce from
what they were created (of course, you couldn't tell "a == b"
from "b == a" -- though sometimes you could distinguish "a > b"
from "b < a"!).
I remember thinking about "peephole optimizers" and wondering how
they could be effective ("Shirley the compiler knows what code it
*just* emitted? Why would it ever do something as inane as
'STORE X; LOAD X'?"). But, if you saw how stanzas were "pasted"
together, you could see lots of opportunities for this kind
of micro-optimization!
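As a rough illustration of the 'STORE X; LOAD X' case, here is a minimal peephole pass over a made-up accumulator-machine instruction list. The insn type, the three opcodes and the peephole() function are all hypothetical. The sketch also assumes the LOAD is not a branch target and X is not volatile, which a real optimizer would have to check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical instruction form for a single-accumulator machine. */
typedef enum { OP_LOAD, OP_STORE, OP_ADD } opcode;
typedef struct { opcode op; const char *operand; } insn;

/* Drops any "LOAD X" that directly follows "STORE X": the accumulator
   still holds the value just stored. Compacts in place and returns the
   new instruction count. */
size_t peephole(insn *code, size_t n) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (code[i].op == OP_LOAD && out > 0 &&
            code[out - 1].op == OP_STORE &&
            strcmp(code[out - 1].operand, code[i].operand) == 0)
            continue;                  /* redundant reload, skip it */
        code[out++] = code[i];
    }
    return out;
}
```

If each C statement is emitted as its own stanza (load operands, operate, store result), then "x = a + b; y = x + c;" naturally yields STORE x followed by LOAD x at the seam between the two stanzas, which is exactly the pattern this pass removes.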
Perhaps Walter can shed some light on what his products were doing
in the mid 80's and how they've progressed (along with *why*)?
> With the better compiler available today it would be even more impossible.
A lot depends on the code being compiled, the level of optimization
used, the optimizations *available* and the actual target itself.
E.g., older "single register" machines required lots of shuffling
to get arguments into an accumulator where they could be operated
on.
Also, older devices didn't have niceties like "MUL" or (gasp!) "DIV".
So, the repertoire of "helper functions" gave you lots of insight
into what the code was actually doing. And, those helpers didn't
have "short-circuits" where the compiler could do a "partial"
operation, etc.
>> I was able to recreate C source for a client's libraries from
>> binaries using this approach. Though it required a fair bit of
>> "organic computing" to recognize the "patterns" in the code
>> (a decompiler wasn't available). Of course, familiarity with
>> the product (application) goes a long way -- especially when
>> it comes to annotating the sources!
>
> Being familiar with the code is the only way to get back the C code.
I disagree. You can get back code that will recompile into the
same binary. You can further embellish that with some ideas as
to what the code is *likely* doing. As far as the ultimate
application... <shrug>
If you have the compiler (and binary libraries) available, you have
a huge head start. You can feed it test cases to see what the code
looks like for various C constructs. You can see which helper
functions get dragged in and, thus, start giving those real "names".
If you have the hardware available (or at least the memory map),
you have known starting points for the code -- instead of picking
a spot "at random".
Chances are, it uses some part of the standard libraries. These
are relatively easy to recognize. So, you can put names on their
entry points and back-annotate all references to them as they
are encountered.
It's trivial to identify the strings in most applications (though
some might go to some lengths to protect or hide them -- but that
is rare and starts competing with the compiler since *it* has
a notion of what constitutes a "string"). So, library functions
that use strings (e.g., printf et al.) can be identified. Also,
strings often give you information about the data *referenced*
there (e.g., "%d records processed.\n").
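A minimal sketch of that kind of string hunt, assuming the image stores plain NUL-terminated ASCII (an assumption; an image could use another encoding or length-prefixed strings). The function name find_strings and its parameters are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Scans a binary image for NUL-terminated runs of printable ASCII that
   are at least min_len bytes long. Stores up to max_hits start offsets
   in hits[] and returns the total number of runs found. */
size_t find_strings(const uint8_t *img, size_t n, size_t min_len,
                    size_t *hits, size_t max_hits) {
    size_t found = 0, start = 0;
    for (size_t i = 0; i < n; i++) {
        int printable = (img[i] >= 0x20 && img[i] <= 0x7E) ||
                        img[i] == '\n' || img[i] == '\t';
        if (printable)
            continue;                  /* still inside a candidate run */
        if (img[i] == 0 && i - start >= min_len) {
            if (found < max_hits)
                hits[found] = start;   /* remember where the string begins */
            found++;
        }
        start = i + 1;                 /* next run begins after this byte */
    }
    return found;
}
```

The offsets it reports are the addresses to search for in the disassembly: code that loads one of them just before a call is a good candidate for a printf-style library routine. Expect some false positives, since opcode bytes can happen to be printable.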
Finally, most older processors used in embedded systems were small.
Few systems could afford gobs of (EP)ROM for multimegabyte images.
Likewise, tens of KB of RAM was a lot. It's not like trying to
reverse engineer MSBloatware...
> But the OP seems to have no knowledge of the application.
>
> I have lost sources in disk crashes and have had to re-create the C
> sources by watching the operation of the application.
>
> reverse-engineering is always easier when you have a good idea of what
> is suppose to happen.
Sure. But it isn't a necessary prerequisite.
There are (big name) firms whose businesses are based on reverse
engineering other people's products -- e.g., to make something
"compatible" with a closed system.
In the process, one can often find obvious "mistakes" or
opportunities for improvement that the original designers
overlooked.
One of my first jobs was at a firm that designed marine navigation
equipment (among other things). I recall the "excitement" when a
Japanese firm expressed an interest in one of our RADAR sets. I
think they purchased 25 of them "for evaluation".
Some time later, *they* produced a similar product. It was very
obvious that it was "heavily inspired" (avoiding the term "copied")
by our set.
My boss grumbled at the lost business and having been "suckered".
In the next breath, he pointed out how the "competing design" had
lots of little changes that were incredibly obvious after-the-fact...
but, that had been omitted in our design!
E.g., the antenna (rotor) emitted rotational pulses to tell the
display which way it was pointed. This allowed the sweep in the
display to be synchronized (angularly) to the antenna's position.
Of course, this was done by mounting an optointerrupter and
encoder wheel (slotted disc) on the antenna's shaft. I think
the encoder had perhaps 1 degree azimuth resolution -- or something
like that. It was relatively costly to manufacture the disc since
it was done photographically, etc.
The competing product had a crude disc with perhaps 9 (!) slots
cut in it. It looked like something that a child would fashion
out of cardboard. *But*, the disc was mounted on the high side
of the reducing gearbox that drove the antenna shaft. So, it
rotated 40 times faster than the antenna! (i.e., same sort of
information coming from the antenna but much lower manufacturing
costs).
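The arithmetic behind that trick is short enough to write down. The figures are the approximate ones recalled above (about 9 slots, a 40:1 reduction), and the function names are made up for this sketch.

```c
/* Pulses seen per antenna revolution when the slotted disc sits on the
   fast side of a reduction gearbox with the given ratio. */
unsigned pulses_per_rev(unsigned slots, unsigned gear_ratio) {
    return slots * gear_ratio;
}

/* Azimuth resolution, in degrees of antenna rotation per pulse. */
double degrees_per_pulse(unsigned slots, unsigned gear_ratio) {
    return 360.0 / (double)pulses_per_rev(slots, gear_ratio);
}
```

With 9 slots and a 40:1 gearbox this gives 360 pulses per antenna revolution, or 1 degree per pulse, which matches the resolution of the photographically produced disc on the antenna shaft.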
Without seeing "our design" with that modification made to it,
I doubt it ever would have occurred to anyone! <:-(