EmbeddedRelated.com
Forums
The 2024 Embedded Online Conference

64-bit embedded computing is here and now

Started by James Brakefield June 7, 2021
Theo <theom+news@chiark.greenend.org.uk> writes:
>> Buy yourself a Raspberry Pi 4 and set it up to run your fish tank via a
>> remote web browser. There's your 64 bit embedded system.
> I suppose there's a question of what embedded tasks intrinsically require >4GiB
> RAM, and those that do so because it makes programmers' lives easier?
You can buy a Raspberry Pi 4 with up to 8 GB of RAM, but the most common configuration is 2 GB. The CPU is 64-bit anyway because why not?
> There are obviously plenty of computer systems doing that, but the
> question I don't know is what applications can be said to be
> 'embedded' but need that kind of RAM.
Lots of stuff is using 32-bit CPUs with a few KB of RAM these days. 32 bits is displacing 8 bits in the MCU world. Is 64-bit displacing 32-bit in application processors like the Raspberry Pi, even when less than 4 GB of RAM is involved? I think yes, at least to some extent, and it will continue. My fairly low-end mobile phone has 2 GB of RAM and a 64-bit 4-core processor, I think.

Will 64-bit MCUs displace 32-bit MCUs? I don't know, maybe not. Are application processors displacing MCUs in embedded systems? Not much in portable and wearable stuff (other than phones), at least for now, but in larger devices I think yes, at least somewhat for now, and probably more going forward. Even if you're not using networking, it makes software and UI development a heck of a lot easier.
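The 4 GiB figure that keeps coming up in this exchange is just the reach of a flat 32-bit byte address. As a rough sketch of that arithmetic (the helper names are made up for illustration, and GB is treated as GiB for simplicity):

```python
GIB = 2**30  # bytes in one gibibyte


def addressable_bytes(pointer_bits):
    """Bytes reachable with a flat byte address of the given width."""
    return 2**pointer_bits


def fits_flat(ram_bytes, pointer_bits):
    """True if every byte of RAM can get its own address at this pointer width."""
    return ram_bytes <= addressable_bytes(pointer_bits)


if __name__ == "__main__":
    print("32-bit reach:", addressable_bytes(32) // GIB, "GiB")
    # The two Raspberry Pi 4 memory sizes mentioned in the post.
    for ram_gib in (2, 8):
        print(ram_gib, "GiB fits in 32 bits:", fits_flat(ram_gib * GIB, 32),
              "| fits in 64 bits:", fits_flat(ram_gib * GIB, 64))
```

This ignores everything else that shares the address space on a real chip (peripherals, ROM and so on), so the usable RAM under 32-bit addressing is in practice somewhat below the 4 GiB ceiling.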
Paul Rubin wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> I can't really tell what kinds of designs you are discussing here. When
>> I talk about embedded systems in general, I mean microcontrollers
>> running specific programs - not general-purpose computers in embedded
>> formats (such as phones).
>
> Philip Munts made a comment a while back that stayed with me: that these
> days, in anything mains powered, there is usually little reason to use
> an MCU instead of a Linux board.
Except that if it has a network connection, you have to patch it unendingly or suffer the common-as-dirt IoT security nightmares.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
On 6/9/2021 4:29, Don Y wrote:
> On 6/8/2021 3:01 PM, Dimiter_Popoff wrote:
>
>>> Am trying to puzzle out what a 64-bit embedded processor should look
>>> like.
>>> At the low end, yeah, a simple RISC processor. And support for
>>> complex arithmetic using 32-bit floats? And support for pixel alpha
>>> blending using quad 16-bit numbers?
>>> 32-bit pointers into the software?
>>
>> The real value in 64 bit integer registers and 64 bit address space is
>> just that, having an orthogonal "endless" space (well I remember some
>> 30 years ago 32 bits seemed sort of "endless" to me...).
>>
>> Not needing to assign overlapping logical addresses to anything
>> can make a big difference to how the OS is done.
>
> That depends on what you expect from the OS. If you are
> comfortable with the possibility of bugs propagating between
> different subsystems, then you can live with a logical address
> space that exactly coincides with a physical address space.
So how does the linear 64 bit address space get in the way of any protection you want to implement? Pages are still 4 k and each has its own protection attributes governed by the OS; it is like that with 32 bit processors as well (I am talking about Power; I am not interested in half-baked stuff like ARM, RISC-V etc., and I don't know if there could be a problem like that with one of these).

There is *nothing* to gain on a 64 bit machine from segmentation, assigning overlapping address spaces to tasks etc.

Notice I am talking *logical* addresses, I was explicit about that.

Dimiter

======================================================
Dimiter Popoff, TGI http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/
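The point about per-page protection in one linear logical space can be sketched with a toy model. This is only an illustration under stated assumptions: the table layout, the base addresses and the function names are invented here, and it does not represent how any particular MMU or any poster's OS works.

```python
PAGE_SIZE = 4096                          # 4 KiB pages, as in the post
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # 12 bits of offset within a page

# Hypothetical protection table: page number -> set of allowed access kinds.
page_attrs = {}


def split(addr):
    """Split a logical address into (page number, offset within the page)."""
    return addr >> OFFSET_BITS, addr & (PAGE_SIZE - 1)


def map_page(addr, attrs):
    """Record the protection attributes for the page containing addr."""
    page, _ = split(addr)
    page_attrs[page] = frozenset(attrs)


def allowed(addr, access):
    """True if the access kind ('r', 'w' or 'x') is permitted at addr."""
    page, _ = split(addr)
    return access in page_attrs.get(page, frozenset())


# Two regions at distinct, non-overlapping logical addresses above 4 GiB
# (made-up values), each with its own attributes.
CODE_BASE = 0x1_0000_0000
DATA_BASE = 0x2_0000_0000
map_page(CODE_BASE, "rx")
map_page(DATA_BASE, "rw")

if __name__ == "__main__":
    print("pages in a 64-bit space:", 2**64 // PAGE_SIZE)  # 2**52
    print(allowed(CODE_BASE + 0x10, "x"), allowed(CODE_BASE + 0x10, "w"))
    print(allowed(DATA_BASE + 0x10, "w"), allowed(0x3_0000_0000, "r"))
```

In this model nothing has to overlap: every region gets its own slice of the linear space, and protection is decided purely by the attributes of the page an address falls in.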
On 6/9/2021 11:59, David Brown wrote:
> On 08/06/2021 22:39, Dimiter_Popoff wrote:
>> On 6/8/2021 23:18, David Brown wrote:
>>> On 08/06/2021 16:46, Theo wrote:
>>>> ......
>>>
>>>> Memory bus/cache width
>>>
>>> No, that is not a common way to measure cpu "width", for many reasons.
>>> A chip is likely to have many buses outside the cpu core itself (and the
>>> cache(s) may or may not be considered part of the core). It's common to
>>> have 64-bit wide buses on 32-bit processors, it's also common to have
>>> 16-bit external databuses on a microcontroller. And the cache might be
>>> 128 bits wide.
>>
>> I agree with your points and those of Theo, but the cache is basically
>> as wide as the registers? Logically, that is; a cacheline is several
>> times that, probably you refer to that.
>> Not that it makes much of a difference to the fact that 64 bit data
>> buses/registers in an MCU (apart from FPU registers, 32 bit FPUs are
>> useless to me) are unlikely to attract much interest, nothing of
>> significance to be gained as you said.
>> To me 64 bit CPUs are of interest of course and thankfully there are
>> some available, but this goes somewhat past what we call "embedded".
>> Not long ago in a chat with a guy who knew some of ARM 64 bit I gathered
>> there is some real mess with their out of order execution, one needs to
>> do... hmmmm.. "sync", whatever they call it, all the time and there is
>> a huge performance cost because of that. Anybody heard anything about
>> it? (I only know what I was told).
>>
>
> sync instructions of various types can be needed to handle
> thread/process synchronisation, atomic accesses, and coordination
> between software and hardware registers. Software normally runs with
> the idea that it is the only thing running, and the cpu can re-order and
> re-arrange the instructions and execution as long as it maintains the
> illusion that the assembly instructions in the current thread are
> executed one after the other. These re-arrangements and parallel
> execution can give very large performance benefits.
>
> But it also means that when you need to coordinate with other things,
> you need syncs, perhaps cache flushes, etc. Full syncs can take
> hundreds of cycles to execute on large processors. So you need to
> distinguish between reads and writes, acquires and releases, syncs on
> single addresses or general memory syncs. Big processors are optimised
> for throughput, not latency or quick reaction to hardware events.
>
> There are good reasons why big cpus are often paired with a Cortex-M
> core in SOCs.
Of course I know all that David, I have been using power processors which do things out of order for over 20 years now.

What I was told was something about a real mess, like system memory accesses getting wrong because of out of order execution hence plenty of syncs needed to keep the thing working. I have not even tried to verify that, only someone with experience with 64 bit ARM can do that - so far none here seems to have that.

Dimiter

======================================================
Dimiter Popoff, TGI http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/
On 09/06/2021 20:00, Dimiter_Popoff wrote:
> Of course I know all that David, I have been using power processors
> which do things out of order for over 20 years now.
It depends on the actual PPC's in question - with single core devices targeted for embedded systems, you don't need much of that at all. Perhaps an occasional sync of some sort in connection with using DMA, but that's about it. Key to this is, of course, having your MPU set up right to make sure hardware register accesses are in-order and not cached.
> What I was told was something about a real mess, like system memory
> accesses getting wrong because of out of order execution hence
> plenty of syncs needed to keep the thing working. I have not
> even tried to verify that, only someone with experience with 64 bit
> ARM can do that - so far none here seems to have that.
If the person programming the device has made incorrect assumptions, or incorrect setup, then yes, things can go wrong if something other than the current core is affected by the reads or writes.
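How "things can go wrong" for an observer outside the current core can be illustrated with the classic message-passing case: one side writes data and then sets a flag, the other checks the flag and then reads the data. The sketch below is a deliberately crude model, not a description of ARM or Power behaviour: it assumes memory is a single shared dictionary, that without barriers each side's two accesses may be performed in either order, and that with barriers they stay in program order. All names are invented for the illustration.

```python
from itertools import permutations

# Program order for each side: the writer publishes data, then sets the flag;
# the reader checks the flag, then reads the data.
WRITER = [("store", "data"), ("store", "flag")]
READER = [("load", "flag"), ("load", "data")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(sequence):
    """Execute one global order of accesses; return what the reader saw."""
    memory = {"data": 0, "flag": 0}
    seen = {}
    for kind, var in sequence:
        if kind == "store":
            memory[var] = 1
        else:
            seen[var] = memory[var]
    return seen["flag"], seen["data"]


def outcomes(ordered):
    """All (flag, data) pairs the reader can observe.

    ordered=True  models barriers that keep both sides in program order.
    ordered=False lets each side's two accesses happen in either order.
    """
    writers = [WRITER] if ordered else [list(p) for p in permutations(WRITER)]
    readers = [READER] if ordered else [list(p) for p in permutations(READER)]
    results = set()
    for w in writers:
        for r in readers:
            for seq in interleavings(w, r):
                results.add(run(seq))
    return results


if __name__ == "__main__":
    print("with ordering:   ", sorted(outcomes(True)))
    print("without ordering:", sorted(outcomes(False)))
```

The interesting outcome is `(1, 0)`: the reader sees the flag set but still reads stale data. In this model it only appears when the accesses are allowed to be reordered. On real hardware the ordering would come from the appropriate barrier or acquire/release operation (for example `lwsync` on Power or `dmb` on ARM), and real memory models are considerably more subtle than this enumeration.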
On 6/9/2021 20:44, Phil Hobbs wrote:
> Paul Rubin wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>>> I can't really tell what kinds of designs you are discussing here. When
>>> I talk about embedded systems in general, I mean microcontrollers
>>> running specific programs - not general-purpose computers in embedded
>>> formats (such as phones).
>>
>> Philip Munts made a comment a while back that stayed with me: that these
>> days, in anything mains powered, there is usually little reason to use
>> an MCU instead of a Linux board.
>
> Except that if it has a network connection, you have to patch it
> unendingly or suffer the common-as-dirt IoT security nightmares.
>
> Cheers
>
> Phil Hobbs
Those nightmares do not apply if you are in complete control of your firmware - which few people are nowadays indeed.

I have had netMCA devices on the net for over 10 years now in many countries, the worst problem I have seen was some Chinese IP hanging on port 80 to no consequences.

Dimiter

======================================================
Dimiter Popoff, TGI http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/
On 6/9/2021 21:55, David Brown wrote:
>> Of course I know all that David, I have been using power processors
>> which do things out of order for over 20 years now.
>
> It depends on the actual PPC's in question - with single core devices
> targeted for embedded systems, you don't need much of that at all.
You *do* need it enough to know what is there to know about it, I have been through it all. How big a latency there is is irrelevant to the point.
>> What I was told was something about a real mess, like system memory
>> accesses getting wrong because of out of order execution hence
>> plenty of syncs needed to keep the thing working. I have not
>> even tried to verify that, only someone with experience with 64 bit
>> ARM can do that - so far none here seems to have that.
>
> If the person programming the device has made incorrect assumptions, or
> incorrect setup, then yes, things can go wrong if something other than
> the current core is affected by the reads or writes.
Maybe the assumptions of the person were wrong. Or maybe your assumption that their assumptions were wrong is wrong. Neither of us knows which it is.

Dimiter

======================================================
Dimiter Popoff, TGI http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/
Dimiter_Popoff wrote:
> On 6/9/2021 20:44, Phil Hobbs wrote:
>> Paul Rubin wrote:
>>> David Brown <david.brown@hesbynett.no> writes:
>>>> I can't really tell what kinds of designs you are discussing here.
>>>> When I talk about embedded systems in general, I mean microcontrollers
>>>> running specific programs - not general-purpose computers in embedded
>>>> formats (such as phones).
>>>
>>> Philip Munts made a comment a while back that stayed with me: that these
>>> days, in anything mains powered, there is usually little reason to use
>>> an MCU instead of a Linux board.
>>
>> Except that if it has a network connection, you have to patch it
>> unendingly or suffer the common-as-dirt IoT security nightmares.
>
> Those nightmares do not apply if you are in complete control of your
> firmware - which few people are nowadays indeed.
>
> I have had netMCA devices on the net for over 10 years now in many
> countries, the worst problem I have seen was some Chinese IP hanging
> on port 80 to no consequences.
But if you're using a RasPi or Beaglebone or something like that, you need a reasonably well-upholstered Linux distro, which has to be patched regularly. At very least it'll need a kernel, and kernel patches affecting security are not exactly rare.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
Phil Hobbs <pcdhSpamMeSenseless@electrooptical.net> writes:
> But if you're using a RasPi or Beaglebone or something like that, you
> need a reasonably well-upholstered Linux distro, which has to be
> patched regularly. At very least it'll need a kernel, and kernel
> patches affecting security are not exactly rare.
You're in the same situation with almost anything else connected to the internet. Think of the notorious "smart light bulbs". On the other hand, you are in reasonable shape if the Raspberry Pi running your fish tank is only reachable through a LAN or VPN. Non-networked low-end Linux boards are also a thing.
On 6/9/2021 22:22, Phil Hobbs wrote:
> Dimiter_Popoff wrote:
>> On 6/9/2021 20:44, Phil Hobbs wrote:
>>> Paul Rubin wrote:
>>>> David Brown <david.brown@hesbynett.no> writes:
>>>>> I can't really tell what kinds of designs you are discussing here.
>>>>> When I talk about embedded systems in general, I mean microcontrollers
>>>>> running specific programs - not general-purpose computers in embedded
>>>>> formats (such as phones).
>>>>
>>>> Philip Munts made a comment a while back that stayed with me: that
>>>> these days, in anything mains powered, there is usually little reason
>>>> to use an MCU instead of a Linux board.
>>>
>>> Except that if it has a network connection, you have to patch it
>>> unendingly or suffer the common-as-dirt IoT security nightmares.
>>
>> Those nightmares do not apply if you are in complete control of your
>> firmware - which few people are nowadays indeed.
>>
>> I have had netMCA devices on the net for over 10 years now in many
>> countries, the worst problem I have seen was some Chinese IP hanging
>> on port 80 to no consequences.
>
> But if you're using a RasPi or Beaglebone or something like that, you
> need a reasonably well-upholstered Linux distro, which has to be patched
> regularly. At very least it'll need a kernel, and kernel patches
> affecting security are not exactly rare.
>
> Cheers
>
> Phil Hobbs
Oh if you use one of these all you can rely on is prayer, I don't think there is *one* person knowing everything which goes on within such a system. Basically it is impossible to know, even if you have all the manpower to dissect all the code you can still be taken by surprise by something a compiler has inserted somewhere etc., your initial point is well taken here.

If you ask *me* if I am 100% sure what my devices might do - and I have written every single bit of code running on them, which has been compiled by a compiler I have written every single bit of - I might still be scratching my head. We buy our silicon, you know...

Dimiter

======================================================
Dimiter Popoff, TGI http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/
