EmbeddedRelated.com

Assembly language is best - except when it isn’t

Colin Walls, July 27, 2023

I started out in embedded programming on small, 8-bit devices with limited memory. High-level languages were not an option, so I became proficient in assembly language. For modest-sized applications it was a reasonable choice and, in any case, there were no tools to support any other language. I was always happy with that: I could readily access all the functionality of the processor and, since I knew what it could do best, I assumed my code would always be the most efficient.

In more recent years, I have gradually learnt that my confidence then was somewhat misplaced. Although my code may well have been efficient, it did not follow that assembly language was always the best way to achieve this result. There was another factor that I was not taking into account ...

Nowadays, most code is written in high-level languages. The size of applications alone is justification for this choice, and with very powerful processors a degree of inefficiency is acceptable. But this is no excuse to be completely relaxed: embedded systems always have some limit on their available resources, so wasting execution time or memory remains undesirable. With this in mind, I always advocate that developers should have some understanding of assembly language. They do not need to be able to write it, but an ability to read it is useful, because it lets you keep an eye on what the compiler is doing on your behalf.


I would like to look at a couple of examples of compiler behavior and see what we can learn from them …

First off, consider a simple loop that initializes an array:

#define SIZE 4
char arr[SIZE];

for (int i = 0; i < SIZE; i++)
    arr[i] = 0;

The obvious compiler output would be a loop that iterates four times. However, any decent compiler (for a 32-bit target processor) would clear the whole array with a single 32-bit store, which is both faster and smaller. If you gradually increase the value of SIZE, the compiler will generate a suitable sequence of clear instructions until, eventually, a loop (of 32-bit clears) becomes the most efficient choice.
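As a rough illustration of that transformation, the two functions below have the same effect on a 4-byte array. The names clear_by_loop and clear_by_word are invented for this sketch, and the memcpy of a 32-bit zero merely stands in for the single store instruction an optimizing compiler may emit; the real output depends on the compiler, the target and the optimization level.

```c
#include <stdint.h>
#include <string.h>

#define SIZE 4

/* What the source code says: four separate byte writes. */
void clear_by_loop(char arr[SIZE])
{
    for (int i = 0; i < SIZE; i++)
        arr[i] = 0;
}

/* Roughly what an optimizer can reduce it to on a 32-bit target:
   one 32-bit write. memcpy() keeps the sketch valid C whatever the
   alignment of arr; compilers commonly turn a fixed 4-byte memcpy
   into a single store. */
void clear_by_word(char arr[SIZE])
{
    const uint32_t zero = 0;
    memcpy(arr, &zero, sizeof zero);
}
```

Compiling both functions with optimization enabled and comparing the generated assembly is a quick way to check whether your own compiler treats them alike.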

Next, let’s look at the switch statement in C. This is my favorite construct, as it offers the opportunity to craft very clear and easy to understand code - an alternative to complex if…else sequences that can be very challenging to untangle. What does a compiler do with a switch? I will offer some examples:

switch (index)
{
case 1: ...
    break;
case 3: ... 
    break;
case 9: ... 
    break;
}

In this example, the matched values of index are scattered and there are only a few of them. A compiler will most likely generate the equivalent if…else sequence of comparisons.
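Written out by hand, that comparison sequence might look like the sketch below. The function name dispatch_sparse and the return codes are hypothetical stand-ins for the elided case bodies.

```c
/* Hypothetical stand-in for the "..." case bodies: return a code
   identifying which case ran, or 0 if no case matched. */
int dispatch_sparse(int index)
{
    if (index == 1)
        return 10;      /* case 1 */
    else if (index == 3)
        return 30;      /* case 3 */
    else if (index == 9)
        return 90;      /* case 9 */
    return 0;           /* no case matched */
}
```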

switch (index)
{
case 2: ... 
    break;
case 3: ... 
    break;
case 4: ... 
    break;
case 5: ... 
    break;
}

Here, the matched values of index are contiguous, and the compiler will create a simple table of code addresses, using index to compute an offset into it.
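The same idea can be expressed in C with an array of function pointers. This is only a sketch of the technique: the handler names, the return codes and dispatch_contiguous are all invented for illustration, and a compiler would store raw code addresses rather than call functions.

```c
typedef int (*handler_t)(void);

/* Hypothetical case bodies, one per matched value. */
static int case2(void) { return 20; }
static int case3(void) { return 30; }
static int case4(void) { return 40; }
static int case5(void) { return 50; }

/* One table slot per value in the range 2..5. */
static const handler_t table[] = { case2, case3, case4, case5 };

int dispatch_contiguous(int index)
{
    /* Subtract the lowest case value so that index 2 maps to slot 0.
       Working in unsigned arithmetic makes values below 2 wrap to a
       huge number, so a single range check covers both ends. */
    unsigned offset = (unsigned)index - 2u;

    if (offset >= sizeof table / sizeof table[0])
        return 0;               /* out of range: no case matched */
    return table[offset]();
}
```

The cost is one subtraction, one comparison and one indexed jump, however many cases there are.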

switch (index)
{
case 2: ... 
    break;
case 4: ... 
    break;
case 5: ... 
    break;
case 6: ... 
    break;
}

Now, the matched values of index are almost contiguous, and the compiler will create a simple table of code addresses, with a dummy entry for the missing value, again using index to compute an offset.
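A hand-written equivalent, again using function pointers as a stand-in for code addresses, could look like this. The names on2, on4, on5, on6, no_match and dispatch_with_gap are hypothetical; the point is the extra slot that fills the hole at 3.

```c
typedef int (*gap_handler_t)(void);

/* Hypothetical case bodies. */
static int on2(void) { return 20; }
static int on4(void) { return 40; }
static int on5(void) { return 50; }
static int on6(void) { return 60; }

/* Dummy target: where control goes when no case matches. */
static int no_match(void) { return 0; }

/* Slots for 2..6; the unused value 3 gets the dummy entry. */
static const gap_handler_t gap_table[] = { on2, no_match, on4, on5, on6 };

int dispatch_with_gap(int index)
{
    unsigned offset = (unsigned)index - 2u;   /* index 2 -> slot 0 */

    if (offset >= sizeof gap_table / sizeof gap_table[0])
        return no_match();
    return gap_table[offset]();
}
```

One wasted table slot is a small price for keeping the constant-time dispatch.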

switch (index)
{
case 2:  ... 
    break;
case 11: ... 
    break;
case 7:  ... 
    break;
case 9:  ... 
    break;
case 49: ... 
    break;
case 1:  ... 
    break;
}

In this last example, the matched values of index are scattered and there are a significant number of them. A compiler will generate a look-up table that pairs each value with a code address.
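A minimal sketch of such a value-to-address table follows. The struct lut_entry, the handler names and dispatch_lookup are assumptions made for illustration, and the linear search is simply the easiest form to read; some compilers emit a binary tree of comparisons for this kind of switch instead.

```c
#include <stddef.h>

typedef int (*lut_handler_t)(void);

/* One table row: a case value and the code to run for it. */
struct lut_entry {
    int           value;
    lut_handler_t handler;
};

/* Hypothetical case bodies. */
static int h1(void)  { return 10; }
static int h2(void)  { return 20; }
static int h7(void)  { return 70; }
static int h9(void)  { return 90; }
static int h11(void) { return 110; }
static int h49(void) { return 490; }

/* Same order as the case labels in the switch. */
static const struct lut_entry lut[] = {
    { 2, h2 }, { 11, h11 }, { 7, h7 }, { 9, h9 }, { 49, h49 }, { 1, h1 },
};

int dispatch_lookup(int index)
{
    /* Search the table for a matching value. */
    for (size_t i = 0; i < sizeof lut / sizeof lut[0]; i++)
        if (lut[i].value == index)
            return lut[i].handler();
    return 0;                   /* no case matched */
}
```

Adding or removing a case only means editing the table, which is why this form is the one that copes with any set of values.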

What we can learn from both the loop and the switch examples is that a small change to the high-level code can have a significant impact on the resulting assembly language output from the compiler, as it seeks to produce the most efficient outcome. Of course, an assembly language programmer could take the same approach. But the question is: would they?

Determining which code structure is most efficient is quite complex, but deterministic. Working it out by hand every time would be hard for a human programmer, but a compiler can be developed to encapsulate this expertise. There is also the question of human nature. Most programmers will have experienced a situation, at some point in their career, when someone changed the rules or the specification and they needed to go back and rework some code. As a result, experienced developers tend to be somewhat defensive and write flexible code that can accommodate future changes with ease. In the switch example, this would mean coding the look-up table, as this covers all eventualities, but it would sometimes be inefficient. A compiler does not mind rewriting the code every time something changes …

(BTW, I do know that a switch should always have a default clause, but I omitted it from these examples for simplicity.)



Comment by CustomSarge, August 10, 2023

I'll parse the topic: for human interface and/or complex tasks, a higher level language is FAR easier in pretty much every way.

For speed critical and/or compact embedded with minimum interface nothing beats well crafted assembler. Well crafted is KEY, spaghetti is for eating. I've done things no compiler would Think of allowing. Some would qualify as "bad or dubious practice" but the worst "practice" I saw, all too often, was poor documentation in several "flavors".

I'm oversimplifying this, but I started on Z-80s in 1978 and only stopped taking new projects this year. The vast majority were pure assembler.
