EmbeddedRelated.com

Arduino APIs, performance, and C++ templates (long)

Started by Clifford Heath May 18, 2016
Folk,

I'm doing a little project, my first with Arduinos. I had
assumed that the choice to use C++ meant that the APIs would
be fancy object-oriented APIs that generate inline assembly
code for performance. I normally do more bare-metal stuff,
including building C++ APIs for the peripherals of the
MC68HC11 more than a decade ago, so I was keen to see what can
be achieved using more modern C++ compilers.

To say I've been disappointed is an understatement. The
standard of the code is simply awful. The g++ compiler is
fantastic, but the Arduino APIs just don't use that power.

As an example, "digitalWrite" takes over 50 cycles, compared
to the expected 2. I know that there are libraries that work
faster, but why are the default libraries so bad? Even calling
these methods takes at least *three* times the code space
that's required. I drilled in to see what's going on, but
that's not the topic here. I wanted to show how things could
be better, and to see if anyone here is interested in making
it happen (personally I actually want to do this for ST's ARM
range, but will assist if someone wants to do AVR versions).

Using template metaprogramming, we can get nice object-oriented
APIs that also map directly to the hardware instructions.
Unfortunately it's not easy to use the existing Arduino port
definitions as template parameters, which might mean having to
redefine some of the #defines of the low-level hardware (more
below). So here's a minimal example that works, and shows what
could be achieved by following this route:

template <int Port, uint8_t Mask>
class Pin
{
         public:
         Pin& operator=(bool b)
         {
                 if (b)
                         *(volatile uint8_t*)Port |= Mask;
                 else
                         *(volatile uint8_t*)Port &= ~Mask;
                 return *this;
         }
};

Pin<0x25, 0x01> portBp0;

Note that the 0x25 is the memory-mapped address of PORTB (its
I/O address is 0x05, but memory-mapping adds an offset of
0x20, if I understand the AVR hardware correctly).

Now, when I write "portBp0 = 1;" I get exactly one instruction
emitted ("sbi") which takes the expected 2 cycles (1 in -Mega).
Same deal for "portBp0 = 0;", the instruction is "cbi". Both
are single-word instructions, whereas a call to digitalWrite
takes three or four words of code space.

Note that I would have preferred to define the template like
this:

template <volatile uint8_t* Port, uint8_t Mask>
class Pin
{...};

Which allows removing the casts on uses of Port, but to be
able to instantiate the template requires a cast:

Pin<PORTB, 0x01> portBp0;

which translates roughly to:

Pin<(volatile uint8_t*)0x25, 0x01> portBp0;

... and that's not valid for a template parameter. The only
method I know that does work is to define the port variable as
extern, in a particular section, and use the linker script or
the linker option --just-symbols to define the location. This
means we can also use a C++ reference instead of a pointer:

extern volatile uint8_t PortB; // address provided to the linker
template <volatile uint8_t& Port, uint8_t Mask>
class Pin
{...};

Pin<PortB, 0x01> portBp1;

It's quite a lot of fiddling to use a linker script, but
using --just-symbols is easy enough; either way you can't
use the standard AVR header files for the values :(.
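
A minimal host-side sketch of the reference-parameter form follows. It
assumes a stand-in variable, FakePortB (a hypothetical name, not from any
AVR header), that is defined in the program itself rather than placed by
the linker. It only shows that a reference to an object with external
linkage is accepted as a template argument and that the member functions
then need no casts; it says nothing about which instructions avr-gcc emits.

```cpp
#include <stdint.h>

// Hypothetical stand-in for a port register. On a real target this would be
// declared 'extern' and its address supplied to the linker.
volatile uint8_t FakePortB = 0;

// The port is a reference template parameter, so no casts are needed inside.
template <volatile uint8_t& Port, uint8_t Mask>
class Pin
{
public:
        Pin& operator=(bool b)
        {
                // Explicit read-modify-write on the volatile register.
                if (b)
                        Port = Port | Mask;
                else
                        Port = Port & static_cast<uint8_t>(~Mask);
                return *this;
        }

        // Read back the current state of the bit.
        operator bool() const { return (Port & Mask) != 0; }
};

Pin<FakePortB, 0x01> fakeBp0;   // bit 0 of the stand-in register
Pin<FakePortB, 0x04> fakeBp2;   // bit 2 of the stand-in register
```

Writing "fakeBp0 = true;" sets bit 0 of FakePortB and leaves the other
bits alone, which is the behaviour the integer-address version has on PORTB.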

One option might be to define a structure for all the
registers in a given AVR variant (and just locate the
structure using --just-symbols), e.g.

extern struct {
         ...
         volatile uint8_t PortB; // ... at address 0x25 in the structure.
         ...
} CPU;

void clear_B()
{
         CPU.PortB = 0;
}

The other advantage of using templates is that we can
specialise them to set up the port correctly, and to check for
collisions in port usage:

template <volatile uint8_t& Port, uint8_t Mask>
class OutputPin : public Pin<Port, Mask>
{
         public:
         OutputPin()
         {
                 // (Check with a pin registry that this pin
                 // isn't already assigned to something else?)
                 // Set up port direction...
         }
         ...
};

This also means that you can dynamically assign port pins just
by defining a local variable in a function, and the pin will
be set up for you when you hit that function.
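
The following is a sketch of that idea under stated assumptions: the two
registers are ordinary variables (FakeDdrB and FakePortB, both hypothetical
names) so it can run on a host, the direction register is passed as an
extra template parameter, and the destructor returns the pin to an input,
which is only one possible policy.

```cpp
#include <stdint.h>

// Hypothetical stand-ins for a data-direction register and a data register.
volatile uint8_t FakeDdrB  = 0;
volatile uint8_t FakePortB = 0;

template <volatile uint8_t& Ddr, volatile uint8_t& Port, uint8_t Mask>
class OutputPin
{
public:
        // Constructing the object configures the pin as an output.
        OutputPin() { Ddr = Ddr | Mask; }

        // Destroying it turns the pin back into an input (assumed policy).
        ~OutputPin() { Ddr = Ddr & static_cast<uint8_t>(~Mask); }

        OutputPin& operator=(bool b)
        {
                if (b)
                        Port = Port | Mask;
                else
                        Port = Port & static_cast<uint8_t>(~Mask);
                return *this;
        }
};

// The pin is an output only while this function is running.
void pulseFakePin()
{
        OutputPin<FakeDdrB, FakePortB, 0x08> strobe;   // direction set here
        strobe = true;
        strobe = false;
}                                                      // direction released here
```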

With more work, you could set up templates for whole ports, or
for ranges of pins on the same port:

template <volatile uint8_t& Port, uint8_t Mask, int Shift>
class PinRange
{
         public:
         operator int()
         {
                 return (Port&Mask) >> Shift;
         };

         PinRange& operator=(int val)
         {
                 Port = (Port&~Mask) | ((val << Shift)&Mask);
                 return *this;
         };

         PinRange& operator++()
         {
                 *this = (int)*this + 1;
                 return *this;
         };
         // ..., etc
};

PinRange<CPU.PortB, 0x1C, 2> portBpins23and4;
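
Since the mask and the shift have to agree, one variation is to derive the
mask from a first bit and a width. This is a host-side sketch, not the
template above: FakePortB is a hypothetical stand-in variable, and the
Shift/Width parameter pair is an assumption made for the example.

```cpp
#include <stdint.h>

volatile uint8_t FakePortB = 0;   // hypothetical stand-in register

// A run of 'Width' adjacent bits starting at bit 'Shift'.
template <volatile uint8_t& Port, unsigned Shift, unsigned Width>
class PinRange
{
        static_assert(Width >= 1 && Shift + Width <= 8,
                      "range must fit in an 8-bit port");

        // Mask derived from Shift and Width, e.g. Shift=2, Width=3 gives 0x1C.
        static constexpr uint8_t Mask =
                static_cast<uint8_t>(((1u << Width) - 1u) << Shift);

public:
        operator int() const { return (Port & Mask) >> Shift; }

        PinRange& operator=(int val)
        {
                // Bits outside the range are preserved.
                Port = static_cast<uint8_t>((Port & ~Mask) | ((val << Shift) & Mask));
                return *this;
        }

        PinRange& operator++()
        {
                *this = static_cast<int>(*this) + 1;   // wraps within the field
                return *this;
        }
};

PinRange<FakePortB, 2, 3> fakeBpins2to4;   // bits 2, 3 and 4
```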

The G++ compiler is theoretically quite capable of turning all
these templates and meta-programming into the most efficient
possible inline assembly code, with none of the downsides of
the Arduino approach.

I spent a few hours playing with this approach, and when you
go to "extern" definitions with the address provided to the
linker, the compiler no longer recognises that it can
substitute "sbi" for "ldw", "or" and "stw", so you get
long-form code.

I tried to force the issue using inline "asm" calls to the SBI
instruction, but then gcc won't coerce the (unknown, but
possibly 16-bit) address into the 6-bit field, even when I try
various ways to force it. I think that Atmel have hacked gcc
just enough to work for the cases they care about.

The upshot of that is I can't make proper use of a "struct"
(because I can't locate it in memory). The address must be
a constant whose value is visible to the compiler, not just
constant at link time.

I.e. I can't see any way to use SBI/CBI instructions on
registers in this struct:

struct __attribute__((packed)) AvrIOPort {
        uint8_t         pin;
        uint8_t         ddr;
        uint8_t         data;
};
extern  volatile AvrIOPort      PortB;  // Address set by a linker option

or the low 0x20 bytes of my much larger "CPU" structure (which
maps the entire 0xFF block).

Here is the code which fails:

template <volatile AvrIOPort& Port, uint8_t Number>
class Pin
{
public:
        Pin& operator=(bool b)
        {
                if (b)
                        // Port.data |= (01<<Number);
                        asm volatile(
                                " sbi %[portdata],%[portbit]\n"
                                : // Output Operands
                                : // Input Operands
                                  [portdata] "I" (&Port.data),
                                  [portbit] "I" (01<<Number)
                                :
                        );
                else
                        // Port.data &= ~(01<< Number);
                        asm volatile(
                                " cbi %[portdata],%[portbit]\n"
                                : // Output Operands
                                : // Input Operands
                                  [portdata] "I" (&Port.data),
                                  [portbit] "I" (01<<Number)
                                : // Clobbers
                        );
                return *this;
        }
        void    output() { }
};

Pin<PortB, 0> portBp0;

The compiler can't see that the (external) address of
"Port.data" can be fit into a 6-bit field (specified by the
"I" parameter type), so it complains "impossible constraint".
It's still faster than calling digitalWrite, but not much
smaller.

I can still make this all work using #defines for all the
register addresses, but it's a lot uglier than using structs.

I hope I don't have the same problem with the ARM version of
gcc.

Anyhow, I hope I've piqued someone's interest. Your comments
would be welcome.

Clifford Heath.
On 18-May-16 at 6:58 AM, Clifford Heath wrote:
> I'm doing a little project, my first with Arduinos. I had
> assumed that the choice to use C++ meant that the APIs would
> be fancy object-oriented APIs that generate inline assembly
> code for performance. I normally do more bare-metal stuff,
> including building C++ APIs for the peripherals of the
> MC68HC11 more than a decade ago, so I was keen to see what can
> be achieved using more modern C++ compilers.

You might check my "Objects? No Thanks" approach (being as
efficient as C but with compile-time polymorphism) and Odin
Holmes' Kvasir library (more complex, but potentially faster and
more compact than C).

Keep me posted if you find more such libraries (I try to gather
them on http://www.voti.nl/blog/?page_id=144), or if you want to
cooperate.

Let's make embedded more ++ !

Wouter

On 18/05/16 06:58, Clifford Heath wrote:

(I'm snipping your post here, because it was quite long and because you
appear to have duplicated a few parts by copy and paste.)

First, I believe the Arduino uses C++ more as "a better C" than taking
advantage of C++'s features.  The point is to help make it easier to
write correct code (and to get compiler warnings on incorrect code),
rather than to be a good C++ framework or generate efficient code.  I am
sure it is possible to get a better compromise here, but it is not the
first nor the last framework to be ridiculously inefficient at something
as simple as setting an IO pin.  I have seen many others where, in the
interest of "abstraction", "layers", "drivers", etc., setting a pin
means passing up and down through half a dozen function calls from
different files.  As well as being painfully slow, it is also extremely
difficult to actually see what is happening in the code, and to trace
problems.

So I am all in favour of templates as the modern way to handle this sort
of thing.  Currently, I use a set of macros - I have used basically the
same macros, adjusted and tweaked for different targets, for C and
assembly on a wide range of systems.  But templates have many
advantages, especially in light of the improvements in C++11 and C++14
on type safety and compile-time features (explicit conversion operators,
constexpr, etc.).

I have a couple of general comments on your approach here, if I may.

First, you seem to want to get rid of the "*(volatile uint8_t*)" cast
inside the template function.  I would say this is no problem at all -
casts like this are part of the low-level machinery, and putting them
inside the template class is exactly where they should go.  When defining
lots of member functions, you can reduce the clutter a little by
wrapping the cast parts inside inline private functions.

It is also possible to make the template with a "volatile uint8_t*"
pointer (or reference) rather than an integer, but as you saw that
brings new complications - you cannot instantiate the template with
"PORTB" or any other casts from a constant integer.

The rules for where you can make casts and conversions like this are
fiddly, and you need to be quite experienced to understand them all.
But a bit of trial and error, guessing possible solutions and looking at
what compiles, is also workable!

While a direct cast would not work, as far as I could tell, this
constexpr function seems fine:

int constexpr intAddressOf(volatile uint8_t* p) {
  return (int) (intptr_t) p;
}

Pin<intAddressOf(&PORTB), 4> portBp4;
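
Another route, which is not one proposed in this thread, is to pass a type
instead of a value: a small struct whose static member function returns the
register reference. The cast (or the PORTB macro, which expands to a
dereferenced volatile pointer) is then evaluated in an ordinary function
body, where it is allowed, and after inlining the compiler should still be
able to see the constant address. The sketch below runs on a host, so the
accessor returns a stand-in variable; FakePortBReg is a hypothetical name.

```cpp
#include <stdint.h>

// Hypothetical register tag type. On a real AVR the body could simply be
// 'return PORTB;' instead of returning a host-side stand-in.
struct FakePortBReg
{
        static volatile uint8_t& ref()
        {
                static volatile uint8_t storage = 0;   // host-side stand-in
                return storage;
        }
};

// The register is named by a type, so no address is a template argument.
template <class Reg, uint8_t Mask>
class Pin
{
public:
        Pin& operator=(bool b)
        {
                volatile uint8_t& r = Reg::ref();
                if (b)
                        r = r | Mask;
                else
                        r = r & static_cast<uint8_t>(~Mask);
                return *this;
        }
};

Pin<FakePortBReg, 0x10> fakeBp4;   // bit 4 of the stand-in register
```

The same struct could later grow ddr() and pin() accessors, which gives the
template a description of the whole port rather than a single number.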



Second, one of the most important points in using templates like this is
to be able to give your pins appropriate names.  So you would be doing
something like this:

Pin<0x25, 0x04> statusLedPin;

I'd also add an optional template parameter for active low or active
high, and consider named member functions rather than overloading the
assignment operator, thus letting you write:

statusLedPin.on();

If a new hardware revision changes the polarity, or the pin number, you
only need to change the declaration of statusLedPin, not its usage.
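
A sketch of what that could look like, assuming a stand-in register
variable (FakePortB) and a hypothetical Polarity enum as the optional
parameter:

```cpp
#include <stdint.h>

volatile uint8_t FakePortB = 0;   // hypothetical stand-in register

enum class Polarity { ActiveHigh, ActiveLow };

template <volatile uint8_t& Port, uint8_t Mask,
          Polarity P = Polarity::ActiveHigh>
class OutputPin
{
public:
        // "on" and "off" are logical states; the polarity picks the level.
        void on()  { drive(P == Polarity::ActiveHigh); }
        void off() { drive(P != Polarity::ActiveHigh); }

private:
        // Drive the electrical level: true = high, false = low.
        static void drive(bool high)
        {
                if (high)
                        Port = Port | Mask;
                else
                        Port = Port & static_cast<uint8_t>(~Mask);
        }
};

// Only this declaration changes if a board revision flips the polarity.
OutputPin<FakePortB, 0x04, Polarity::ActiveLow> statusLedPin;
```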



On 18/05/16 09:05, Clifford Heath wrote:
> On 18/05/16 17:51, David Brown wrote:
>> On 18/05/16 06:58, Clifford Heath wrote:
>> (I'm snipping your post here, because it was quite long and because you
>> appear to have duplicated a few parts by copy and paste.)
>
> You're right, I did - it was separately authored and I had to
> re-wrap it to avoid a narrower wrap margin in Thunderbird :(.
>
>> First, I believe the Arduino uses C++ more as "a better C" than taking
>> advantage of C++'s features. The point is to help make it easier to
>> write correct code (and to get compiler warnings on incorrect code),
>> rather than to be a good C++ framework or generate efficient code.
>
> Yes, I get that. But even as a "better C" there's no reason to
> be 5x slower than possible.
>
>> I have a couple of general comments on your approach here, if I may.
>
> Much appreciated - it's exactly what I was hoping for by reposting here.
>
>> First, you seem to want to get rid of the "*(volatile uint8_t*)" cast
>> inside the template function. I would say this is no problem at all -
>> casts like this are part of the low-level machinery, and putting them
>> inside the template class is exactly where they should go. When defining
>> lots of member functions, you can reduce the clutter a little by
>> wrapping the cast parts inside inline private functions.
>
> Yes, I just wanted to minimize clutter.
>
>> It is also possible to make the template with a "volatile uint8_t*"
>> pointer (or reference) rather than an integer, but as you saw that
>> brings new complications - you cannot instantiate the template with
>> "PORTB" or any other casts from a constant integer.
>>
>> The rules for where you can make casts and conversions like this are
>> fiddly, and you need to be quite experienced to understand them all.
>> But a bit of trial and error, guessing possible solutions and looking at
>> what compiles, is also workable!
>>
>> While a direct cast would not work, as far as I could tell, this
>> constexpr function seems fine:
>>
>> int constexpr intAddressOf(volatile uint8_t* p) {
>>   return (int) (intptr_t) p;
>> }
>>
>> Pin<intAddressOf(&PORTB), 4> portBp4;
>
> Great, I'll give that a try. My last major foray into C++
> templates was in 2007, and things have advanced considerably
> since then.

There may be other ways to achieve this too - it's just the first
successful method from my trial-and-error probing. You might consider
wrapping things in a macro.

>> Second, one of the most important points in using templates like this is
>> to be able to give your pins appropriate names. So you would be doing
>> something like this:
>>
>> Pin<0x25, 0x04> statusLedPin;
>
> Of course - But I wanted to make my example more easily
> comprehensible to someone who hadn't seen templates before.

That's fair enough. But if you are making some sort of tutorial or
example, then I would put this at the top of the page. The last thing
you want is people to have a new way of writing:

PORTB &= ~0x04;    // Turn status led on

Changing that to:

portBp4 = 0;       // Turn status led on

is no improvement.

So I would structure your tutorial or documentation in terms of first
stating that you want to be able to write:

statusLedPin.on();

And then consider how to achieve this in a safe and efficient manner.

>> I'd also add an optional template parameter for active low or active
>> high, and consider named member functions rather than overloading the
>> assignment operator, thus letting you write:
>>
>> statusLedPin.on();
>
> Right; once you have the hardware mapped, you can write
> peripherals and solution-specific code.

Certainly there may be some solution-specific code - it can be made as
template specialisations. But a fair number of member functions can be
standard, such as for port direction, pull-ups, etc.

You might also want to consider your operator choice. Do you really
want to use boolean assignment? It is not a bad choice, IMHO, but there
are other options. What about:

statusLedPin << true;
statusLedPin << pins.active;
statusLedPin.set(true);
activate(statusLedPin);

Should you distinguish between input pins and output pins? Should
bool(statusLedPin) return the value driven out, or the current input?

What happens if you write:

Pin<0x25, 0x04> statusLedPin;

in two modules? Should you really be writing:

extern Pin<0x25, 0x04> statusLedPin;

in a header, and

Pin<0x25, 0x04> statusLedPin;

in one source file - so that the constructor can handle setting up the
pin direction register?

>> If a new hardware revision changes the polarity, or the pin number, you
>> only need to change the declaration of statusLedPin, not its usage.
>
> Finally, on a CPU like the STM32F3 which has an I/O crossbar,
> you need to configure the crossbar using a template parameter
> too. So PortB, Bit3 might be mapped to pin 22 using the crossbar.
> It might be a big challenge to *statically* establish that
> all crossbar configuration is valid and non-conflicting. I was
> thinking about ways to do it by defining external symbols so
> the linker would reject conflicts, but even an error that's not
> thrown until program initialization would be better than just
> stomping on the same hardware.

There comes a point when the best method is a pre-processor (written for
the host - in C++, Python, or whatever) that generates the C/C++ header
and source file handling this sort of thing. It is /possible/ to do
many weird and exciting things with the linker, but even when you can
get it to spot conflicts, it can be hard for the user to convert
linker errors into what actually went wrong.
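
Short of a generator, a limited form of the conflict check can be done by
the compiler itself when all assignments are listed in one table. This is
a sketch only: it assumes C++14 constexpr, and the PinUse record, the
allDistinct function and the board table are hypothetical names invented
for the example. It does not model a crossbar.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical record of one pin assignment: which port and which bit.
struct PinUse
{
        char    port;   // 'B', 'C', 'D', ...
        uint8_t bit;    // 0..7
};

// Returns true if no two entries name the same port and bit.
template <size_t N>
constexpr bool allDistinct(const PinUse (&uses)[N])
{
        for (size_t i = 0; i < N; ++i)
                for (size_t j = i + 1; j < N; ++j)
                        if (uses[i].port == uses[j].port &&
                            uses[i].bit == uses[j].bit)
                                return false;
        return true;
}

// Hypothetical board description; every pin in use is listed once.
constexpr PinUse boardPins[] = {
        { 'B', 2 },   // status LED
        { 'B', 3 },   // strobe
        { 'D', 2 },   // button
};

// Compilation stops here if two entries claim the same port bit.
static_assert(allDistinct(boardPins), "two functions claim the same port bit");
```

The error then comes from the compiler with a message chosen by the
author, at the cost of keeping all the assignments in a single place.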

On 18-May-16 at 8:29 AM, Clifford Heath wrote:
> On 18/05/16 16:46, Wouter van Ooijen wrote:
>> On 18-May-16 at 6:58 AM, Clifford Heath wrote:
>>> I'm doing a little project, my first with Arduinos. I had
>>> assumed that the choice to use C++ meant that the APIs would
>>> be fancy object-oriented APIs that generate inline assembly
>>> code for performance. I normally do more bare-metal stuff,
>>> including building C++ APIs for the peripherals of the
>>> MC68HC11 more than a decade ago, so I was keen to see what can
>>> be achieved using more modern C++ compilers.
>>
>> You might check my "Objects? No Thanks" approach (being as efficient as
>> C but with compile-time polymorphism) and Odin Holmes' Kvasir library
>> (more complex, but potentially faster and more compact than C).
>
> I don't mind objects - they can be as efficient as C also,
> but like all things, you need to understand how they work.
>
>> Keep me posted if you find more such libraries (I try to gather them on
>> http://www.voti.nl/blog/?page_id=144), or if you want to cooperate.
>>
>> Let's make embedded more ++ !
>
> Indeed :).
>
> Have you seen Andy Brown's excellent stm32plus:
> <https://github.com/andysworkshop/stm32plus>

Thanks, added to the list. I'll study his techniques when I have some
spare time. Might even be before I die of old age...

Wouter

Much snippage, leaving (hopefully) just the important context:

On 18/05/16 19:22, David Brown wrote:
> On 18/05/16 09:05, Clifford Heath wrote:
>> On 18/05/16 17:51, David Brown wrote:
>>> It is also possible to make the template with a "volatile uint8_t*"
>>> pointer (or reference) rather than an integer, but as you saw that
>>> brings new complications - you cannot instantiate the template with
>>> "PORTB" or any other casts from a constant integer.

I tried this anyway, see the attachment. You're right, it didn't
work, which leaves me without any comfortable solution to my
original problem. It's not nice for *users* to have to include
the intAddressOf() call - and that only passes a number, not
a structure that describes a whole device.

>>> The rules for where you can make casts and conversions like this are
>>> fiddly, and you need to be quite experienced to understand them all.
>>> But a bit of trial and error, guessing possible solutions and looking at
>>> what compiles, is also workable!
>>>
>>> While a direct cast would not work, as far as I could tell, this
>>> constexpr function seems fine:
>>> int constexpr intAddressOf(volatile uint8_t* p) {
>>>   return (int) (intptr_t) p;
>>> }
>>> Pin<intAddressOf(&PORTB), 4> portBp4;

I tried the equivalent of that, but the other way, from an
integer to a reference or to a pointer... no dice.

> There may be other ways to achieve this too - it's just the first
> successful method from my trial-and-error probing.

If you can see any way to make the templates in the attachment
work, I'd be most appreciative.

> Should you distinguish between input pins and output pins? Should
> bool(statusLedPin) return the value driven out, or the current input?

Yes, of course. I would specialize the template for
Input, Output, and Bi-Dir pins; perhaps even for
source-only and sink-only outputs. And signal
polarity, as you suggest.

If a pin is declared as an auto, arguably it should
revert the pin to its previous state on destruction, in
case it was also used as a global. Or have an "in-use"
bitmask to notify that.

> What happens if you write:
> Pin<0x25, 0x04> statusLedPin;
> in two modules? Should you really be writing:
> extern Pin<0x25, 0x04> statusLedPin;
> in a header, and
> Pin<0x25, 0x04> statusLedPin;
> in one source file - so that the constructor can handle setting up the
> pin direction register?

The latter, as for any variable, because the former is a duplicate
definition.

>> Finally, on a CPU like the STM32F3 which has an I/O crossbar,
>> you need to configure the crossbar using a template parameter
>> too. So PortB, Bit3 might be mapped to pin 22 using the crossbar.

Actually I think I made an error. The port bits should get
allocated as we've discussed, and the crossbar should have
separate configuration templates; it assigns pins to port-bits.
So the above template should be PortBit (not Pin) and the Pin
template should assign physical pins to PortBits.

User programs would need PortBit instances, and also pin
assignment ones. Double definition, but that reflects the
hardware.

The nice thing is to do that in a way which knows which ports
and pins exist on a specific device so you can't use hardware
which doesn't exist, and can't (inadvertently) double up on
the same hardware.

>> It might be a big challenge to *statically* establish that
>> all crossbar configuration is valid and non-conflicting. I was
>> thinking about ways to do it by defining external symbols so
>> the linker would reject conflicts, but even an error that's not
>> thrown until program initialization would be better than just
>> stomping on the same hardware.
> There comes a point when the best method is a pre-processor (written for
> the host - in C++, Python, or whatever) that generates the C/C++ header
> and source file handling this sort of thing. It is /possible/ to do
> many weird and exciting things with the linker, but even when you can
> get it to spot conflicts, it can be hard for the user to convert
> linker errors into what actually went wrong.

Right, but even if collisions in the cross-bar are detected as
a failure at program initialization, that's ok as long as the
definitions are easy to understand and to get right.

Can a template with an integer parameter export a symbol which
includes that number in the symbol name? If so, you could have

template <int pin, PortBit& bit>
class Pin
{
        static void pin_<pin>_is_in_use;
        ...
};

to fail at link-time if you re-use a pin. The linker would
tell you where the duplicate definition is.

Clifford Heath.

[Attachment: UnoPortTemplates.ino]

On 19/05/16 15:32, Clifford Heath wrote:
> Much snippage, leaving (hopefully) just the important context:
>
> On 18/05/16 19:22, David Brown wrote:
>> On 18/05/16 09:05, Clifford Heath wrote:
>>> On 18/05/16 17:51, David Brown wrote:
>>>> It is also possible to make the template with a "volatile uint8_t*"
>>>> pointer (or reference) rather than an integer, but as you saw that
>>>> brings new complications - you cannot instantiate the template with
>>>> "PORTB" or any other casts from a constant integer.
>
> I tried this anyway, see the attachment. You're right, it didn't
> work, which leaves me without any comfortable solution to my
> original problem. It's not nice for *users* to have to include
> the intAddressOf() call - and that only passes a number, not
> a structure that describes a whole device.
>
>>>> The rules for where you can make casts and conversions like this are
>>>> fiddly, and you need to be quite experienced to understand them all.
>>>> But a bit of trial and error, guessing possible solutions and
>>>> looking at what compiles, is also workable!
>>>>
>>>> While a direct cast would not work, as far as I could tell, this
>>>> constexpr function seems fine:
>>>>     int constexpr intAddressOf(volatile uint8_t* p) {
>>>>         return (int) (intptr_t) p;
>>>>     }
>>>>     Pin<intAddressOf(&PORTB), 4> portBp4;
>
> I tried the equivalent of that, but the other way, from an
> integer to a reference or to a pointer... no dice.
>
>> There may be other ways to achieve this too - it's just the first
>> successful method from my trial-and-error probing.
>
> If you can see any way to make the templates in the attachment
> work, I'd be most appreciative.
>
>> Should you distinguish between input pins and output pins? Should
>> bool(statusLedPin) return the value driven out, or the current input?
>
> Yes, of course. I would specialize the template for
> Input, Output, and Bi-Dir pins; perhaps even for
> source-only and sink-only outputs. And signal
> polarity, as you suggest.
>
> If a pin is declared as an auto, arguably it should
> revert the pin to its previous state on destruction, in
> case it was also used as a global. Or have an "in-use"
> bitmask to notify that.
>
>> What happens if you write:
>>     Pin<0x25, 0x04> statusLedPin;
>> in two modules? Should you really be writing:
>>     extern Pin<0x25, 0x04> statusLedPin;
>> in a header, and
>>     Pin<0x25, 0x04> statusLedPin;
>> in one source file - so that the constructor can handle setting up the
>> pin direction register?
>
> The latter, as for any variable, because the former is a duplicate
> definition.
>
>>> Finally, on a CPU like the STM32F3 which has an I/O crossbar,
>>> you need to configure the crossbar using a template parameter
>>> too. So PortB, Bit3 might be mapped to pin 22 using the crossbar.
>
> Actually I think I made an error. The port bits should get
> allocated as we've discussed, and the crossbar should have
> separate configuration templates; it assigns pins to port-bits.
> So the above template should be PortBit (not Pin) and the Pin
> template should assign physical pins to PortBits.
>
> User programs would need PortBit instances, and also pin
> assignment ones. Double definition, but that reflects the
> hardware.
>
> The nice thing is to do that in a way which knows which ports
> and pins exist on a specific device so you can't use hardware
> which doesn't exist, and can't (inadvertently) double up on
> the same hardware.
>
>>> It might be a big challenge to *statically* establish that
>>> all crossbar configuration is valid and non-conflicting. I was
>>> thinking about ways to do it by defining external symbols so
>>> the linker would reject conflicts, but even an error that's not
>>> thrown until program initialization would be better than just
>>> stomping on the same hardware.
>>
>> There comes a point when the best method is a pre-processor (written for
>> the host - in C++, Python, or whatever) that generates the C/C++ header
>> and source file handling this sort of thing. It is /possible/ to do
>> many weird and exciting things with the linker, but even when you can
>> get it to spot conflicts, it can be hard for the user to convert
>> linker errors into what actually went wrong.
>
> Right, but even if collisions in the cross-bar are detected as
> a failure at program initialization, that's ok as long as the
> definitions are easy to understand and to get right.
>
> Can a template with an integer parameter export a symbol which
> includes that number in the symbol name? If so, you could have
>
>     template <int pin, PortBit& bit>
>     class Pin
>     {
>         static void pin_<pin>_is_in_use;
>         ...
>     };
>
> to fail at link-time if you re-use a pin. The linker would
> tell you where the duplicate definition is.
>
> Clifford Heath.
No takers? It seems that modern C++ is still the same exploding ball of rotating knives, except now, some of the knives have been deliberately blunted "to make it safer". Humbug.
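For what it's worth, the fallback mentioned in the quoted text (an error at program initialization rather than at link time) can be sketched with an "in-use" bitmask. Everything named here (PinRegistry, PinClaim, requirePinsConflictFree) is hypothetical and only illustrates the idea; it assumes at most 32 physical pins and a single source file holding the static definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical registry: one bit per physical pin (up to 32 in this sketch).
struct PinRegistry {
    static uint32_t in_use;     // constant-initialized to 0 before any constructor runs
    static uint32_t conflicts;  // pins that have been claimed more than once
};
uint32_t PinRegistry::in_use = 0;
uint32_t PinRegistry::conflicts = 0;

// Hypothetical claim object: constructing one marks pin PinNo as taken.
template <int PinNo>
class PinClaim {
    static_assert(PinNo >= 0 && PinNo < 32, "pin number out of range");
public:
    PinClaim() {
        const uint32_t bit = uint32_t(1) << PinNo;
        if (PinRegistry::in_use & bit)
            PinRegistry::conflicts |= bit;  // second claim of the same pin
        PinRegistry::in_use |= bit;
    }
};

// Start-up check: call first thing in main() or setup().
inline void requirePinsConflictFree() {
    assert(PinRegistry::conflicts == 0);
}

PinClaim<3>  statusLed;
PinClaim<22> uartTx;
PinClaim<22> spareOutput;  // deliberate clash with uartTx
```

With the three claims above, the registry records pin 22 as a conflict once static construction has finished, so a call to requirePinsConflictFree() would trip. This costs a few bytes of RAM and some start-up code, which is the price of not catching the clash at build time.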
On 20-May-16 at 9:08 PM, Clifford Heath wrote:
> On 19/05/16 15:32, Clifford Heath wrote:
>> (snip)
>>
>> Can a template with an integer parameter export a symbol which
>> includes that number in the symbol name? If so, you could have
>>
>>     template <int pin, PortBit& bit>
>>     class Pin
>>     {
>>         static void pin_<pin>_is_in_use;
>>         ...
>>     };
>>
>> to fail at link-time if you re-use a pin. The linker would
>> tell you where the duplicate definition is.
>>
>> Clifford Heath.
>
> No takers?
Within the C++ language there is, AFAIK, no bridge between values (even
compile-time constants) and names. The preprocessor has no such barrier.

But I think that still would not achieve your aim, because it is legal to
instantiate a class template twice with the same parameters. That is simply
the same thing twice, and the linker would ignore all but one.

Wouter "Objects? No Thanks!" van Ooijen
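Both halves of that point can be seen in a small sketch. The Pin template, its marker member, and the CLAIM_PIN macro are made up for illustration and are not anyone's real API. The template part shows that a repeated specialization is merged rather than rejected. The macro part shows the preprocessor pasting a number into an ordinary symbol name, which is what a link-time clash would need.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical Pin template; the name and member are illustrative only.
template <int PinNo>
struct Pin {
    static uint8_t marker;  // one object per distinct PinNo in the whole program
};
template <int PinNo>
uint8_t Pin<PinNo>::marker = 0;

// Two objects of the same specialization are legal. Both refer to the
// single Pin<22>::marker, so nothing here can clash at link time.
Pin<22> ledA;
Pin<22> ledB;

// Token pasting builds an ordinary external name from the number.
// Using CLAIM_PIN(22) in a second source file defines pin_22_is_in_use
// again, which linkers normally reject as a multiple definition.
#define CLAIM_PIN(n) extern "C" { char pin_##n##_is_in_use = 1; }

CLAIM_PIN(22)

// Confirms that both objects share the one static member.
inline void checkSharedMarker() {
    assert(&ledA.marker == &ledB.marker);
}
```

The catch is that the macro needs a literal pin number at the point of use. It cannot pick the number up from a template parameter, so it would have to sit beside the template rather than inside it.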
Clifford Heath <no.spam@please.net> wrote:
> (snip)
Here's another way to do template-based register access:
<https://github.com/andersm/register_templates>

The code is based on this lightning talk from CppCon 2014 by Ken Smith:
<https://www.youtube.com/watch?v=lrrQaa_-hzU>

-a
On Wednesday, 18 May 2016 11:22:27 UTC+2, David Brown wrote:
>
> There comes a point when the best method is a pre-processor (written for
> the host - in C++, Python, or whatever) that generates the C/C++ header
> and source file handling this sort of thing. It is /possible/ to do
> many weird and exciting things with the linker, but even when you can
> get it to spot conflicts, it can be hard for the user to convert
> linker errors into what actually went wrong.
Over time there have been several attempts in that direction. There was a
tool, "Dave", from Siemens (it still exists at Infineon) for their
microcontrollers, and PSoC Creator goes in that direction. For high-end
devices, NXP/Freescale's "DPAA Expert" is such a tool.

In my opinion, it would take a community effort to allow such preprocessing,
allocation, and configuration in a way that is independent of both host and
target platform, say XML + Python/Java. I think there are some free tools
from academia, but you would need enough acceptance to get the vendors on
board.

Andreas
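To make the host-side generator idea concrete, here is a minimal sketch in C++ of the kind of pre-processor David Brown describes. It is not modelled on any of the tools named above. The Assignment record, the generateHeader function, and the emitted macro names are all assumptions chosen for illustration. The point is only that a host program can report a doubly assigned pin in plain words before any target code is compiled.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical description of one crossbar assignment.
struct Assignment {
    std::string name;  // identifier to emit, e.g. "statusLed"
    int pin;           // physical pin number
    char port;         // port letter, e.g. 'B'
    int bit;           // bit within the port
};

// Returns the header text, or throws an error naming both users of a
// pin that has been assigned twice.
std::string generateHeader(const std::vector<Assignment>& table) {
    std::map<int, std::string> owner;  // pin -> first name that claimed it
    std::ostringstream out;
    out << "// Generated file, do not edit.\n";
    for (const Assignment& a : table) {
        assert(!a.name.empty());  // every entry needs an identifier
        auto result = owner.emplace(a.pin, a.name);
        if (!result.second)
            throw std::runtime_error("pin " + std::to_string(a.pin) +
                                     " assigned to both " +
                                     result.first->second + " and " + a.name);
        out << "#define " << a.name << "_PIN " << a.pin << "\n"
            << "#define " << a.name << "_PORT '" << a.port << "'\n"
            << "#define " << a.name << "_BIT " << a.bit << "\n";
    }
    return out.str();
}
```

A real tool would read the table from a device description (the XML mentioned above, for instance) and would also check that each pin and port exists on the chosen part. This sketch only covers the duplicate-pin case.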