timestamp in ms and 64-bit counter

Started by pozz February 6, 2020
On Sat, 8 Feb 2020 19:57:48 +0100, David Brown
<david.brown@hesbynett.no> wrote:

>> Never used NT, but I used W2k and it was great! W2k was widely
>> pirated so MS started a phone home type of licensing with XP which
>> was initially not well received, but over time became accepted. Now
>> people reminisce about the halcyon days of XP.
>
>Did you not use NT 4.0 ? It was quite solid. W2K was also good, but XP
>took a few service packs before it became reliable enough for serious use.

NT 4.0 solid ??

NT4 moved graphical functions to kernel mode to speed up window updates.

When doing some operations in kernel mode on behalf of a user mode function, the first thing that the kernel mode routine should do is to check that the parameters passed to it were accessible from _user_ mode. Unfortunately this was not done initially, so accidentally passing a NULL pointer to these functions crashed the whole computer, not just the application. SP1 added these checks.

In general, each NT4 service pack introduced new bugs, and the next SP was soon released to correct the bugs introduced by the previous one. Thus only every other SP was actually usable.

Even the NT5 beta was more stable than NT4 with the most recent SP. The NT5 beta was renamed Windows 2000 before final release.

On Sat, 08 Feb 2020 14:30:53 -0600, Robert Wessel
<robertwessel2@yahoo.com> wrote:

>On Sat, 08 Feb 2020 18:37:35 +0200, upsidedown@downunder.com wrote:
>
>>Some earlier Windows versions used 55 Hz (or was it 55 ms) clock
>>interrupt rate, so I really don't understand from where the 1 ms clock
>>tick or 49 days is from.
>
>
>The "tick count" in the (Win32) OS was always 1000Hz (as reported by
>GetTickCount(), for example). The physical ticks were massaged to
>correctly update that count.

Yes, but the hardware tick was at 18Hz (~55ms) up until XP and the introduction of "multimedia" timers. At first those "multimedia" timers were implemented by a realtime priority thread using the CPU's cycle counter. In a quiet system you could get down to ~50us.

However, 10+MHz HPET hardware timers were introduced in 2005 and quickly became standard on retail systems. Support for HPET based multimedia timers came in XPsp3 (2008).

Since Vista, if HPET is available, one channel of the timer is used to support the system clock at 1KHz.

George

On 09/02/2020 07:35, upsidedown@downunder.com wrote:
> On Sat, 8 Feb 2020 19:57:48 +0100, David Brown
> <david.brown@hesbynett.no> wrote:
>
>>> Never used NT, but I used W2k and it was great! W2k was widely
>>> pirated so MS started a phone home type of licensing with XP which
>>> was initially not well received, but over time became accepted. Now
>>> people reminisce about the halcyon days of XP.
>>
>> Did you not use NT 4.0 ? It was quite solid. W2K was also good, but XP
>> took a few service packs before it became reliable enough for serious use.
>
> NT 4.0 solid ??
>
> NT4 moved graphical functions to kernel mode to speed up window
> updates.

Yes. And that meant bugs in the graphics drivers could kill the whole system, unlike in NT 3.x. And bugs in the graphics drivers were certainly not unknown. However, with a little care it could run reliably for long times. I don't remember ever having a software or OS related crash or halt on our little NT 4 server. (My NT 4 workstation eventually decided to wipe my start menu and replace it with a single entry "eject computer", complete with icon. And it kept asking me to insert a disk in drive C: and close the door. But that was after many years of use and abuse.)
> When doing some operations in kernel mode on behalf of a user
> mode function, the first thing that the kernel mode routine should do
> is to check that the parameters passed to it were accessible from
> _user_ mode. Unfortunately this was not done initially, so passing by
> accident a NULL pointer to these functions crashed the whole computer,
> not just the application. SP1 added these checks.
>
> In general, each NT4 service pack introduced new bugs and soon the
> next SP was released to correct the bugs introduced by the previous
> SP. Thus every other SPs were actually usable.
>
> Even NT5 beta was more stable than NT4 with most recent SP. NT5 beta
> was renamed Windows 2000 before final release.
>

I certainly liked W2K, and found it quite reliable. But I still remember NT 4.0 as good too.
On 2/8/20 2:33 PM, Kent Dickey wrote:
> In article <5jC%F.78169$8Y7.67931@fx05.iad>,
> Richard Damon <Richard@Damon-Family.org> wrote:
>> On 2/8/20 12:03 PM, Kent Dickey wrote:
>>> Shown more explicitly, the following are all valid states (let's assume
>>> ticks_high is 0, read_low32() just ticked to 0xffff_fffe):
>>>
>>> Time          read_low32()    ticks_high
>>> -------------------------------------------------
>>> 0             0xffff_fffe     0
>>> 1ms           0xffff_ffff     0
>>> 1.99999ms     0xffff_ffff     0
>>> 2ms           0x0000_0000     0
>>>       Interrupt is sent and is now pending
>>> 2ms+delta     0x0000_0000     1
>>>
>>> The issue is: what is "delta", and can other code (including your GetTick()
>>> function) run between "2ms" and "2ms+delta"? And the answer is almost
>>> assuredly "yes". This is a problem.
>>
>> But, as long as the timing is such that we can not do BOTH the
>> read_low32() and the read of ticks_high in that delta, we can't get the
>> wrong number.
>>
>> This is somewhat a function of the processor, and how much the
>> instruction pipeline 'skids' when an interrupt occurs. The processor
>> that he mentioned, A STM32L4R9, which uses an M4 processor, doesn't have
>> this much of a skid, so that can't be a problem unless you do something
>> foolish like disable the interrupts while doing the sequence.
>
> The interrupt skid matters for how large the window is, but the problem
> happens even if the "skid" was 0.
>
> Look at it this way: the hardware counter logic is something like:
>
>     always @(posedge clk) begin
>         if(do_inc) begin
>             cntr += 1;
>             if(cntr == 0) begin
>                 interrupt = 1;
>             end
>         end
>     end
>
> Then at cycle 0 cntr=ffff_ffff and do_inc=0. At cycle 1, do_inc=1 and cntr=0
> and interrupt=1.
>
> In that cycle, software could read cntr=0. The interrupt CANNOT have taken
> place yet since interrupts aren't instantaneous--the signal hasn't even made it
> to the interrupt controller yet, it's just this clock module has decided to
> request an interrupt. (The ARM GIC supports asynchronous interrupts, so it
> takes several clocks just for it to register the interrupt).
>
> This is always somewhat a function of the processor, but the problem is
> inherent to all CPUs. A simple 6502 or 8086 or whatever has the same problem
> and cannot fix it easily either.
>
> The hardware cannot get this case right without some extreme craziness. That
> would be a pre-interrupt detection circuit, prepared to drive the interrupt
> early so the CPU reacts in time.
>
> The right way to look at it--hardware interrupts are delayed tens or
> hundreds of cycles always from when you think they happen to when you receive
> it. Then you'll get your algorithms right.
>
> Kent
>

I did forget the delay in the interrupt controller. With that delay, you do have a fundamental issue between reading hardware registers and the software counter.

A couple of solutions, some of which have been mentioned:

Have a 1ms interrupt and keep the 64-bit counter in software.

I believe you can also program the counters to generate multiple interrupts in the count cycle. If you generate one at 0 and one at the halfway point, then knowing which interrupt was last seen you can tell if one is 'pending' based on the lower counter read.

Another option on that processor is to chain a couple of timers together, so when the lower counter rolls over the upper counter counts automatically, and I believe it handles it so there isn't a skew between the counters. Then the read of the upper count, going directly to the hardware, won't have the issue.
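
The two-interrupt idea can be sketched roughly as below. This is only an illustration under assumptions that are not from the thread: one software counter (hypothetically named half_count) is incremented by both the interrupt at 0 and the one at the halfway point 0x80000000, the interrupt latency is far shorter than a quarter of the counter period, and the reader samples the hardware counter first and half_count second. The function takes both samples as arguments so the logic can be checked without hardware.

```c
#include <stdint.h>

/*
 * Combine a 32-bit hardware count with a software count of half-periods.
 * half_count is assumed to be incremented by the ISR on both the
 * wrap-to-0 interrupt and the 0x80000000 interrupt, so when nothing is
 * pending its bit 0 equals bit 31 of the hardware counter.
 * Assumption: low was sampled BEFORE half_count.
 */
uint64_t combine_half_periods(uint32_t half_count, uint32_t low)
{
    if ((half_count & 1u) != (low >> 31)) {
        if (low & 0x40000000u) {
            /* low is late in its half: the boundary was crossed and
             * serviced after low was sampled, so undo that increment. */
            half_count -= 1u;
        } else {
            /* low is early in its half: the boundary interrupt is
             * still pending, so account for it here. */
            half_count += 1u;
        }
    }
    /* Two half-periods make one full wrap of the 32-bit counter. */
    return ((uint64_t)(half_count >> 1) << 32) | low;
}
```

With half_count = 1 and low = 0 (the counter has just wrapped but the ISR has not run yet) the function returns 0x100000000 rather than 0, which is the pending-interrupt case discussed in this thread.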
On 06/02/2020 19:02, Rick C wrote:
 > [...]
> Is the "call this code at least once in 3 years" requirement reasonable? In the systems I design that would not be a problem.
Of course this limitation isn't usually a real problem. However, there could be situations where GetTick() is next called only after more than 49 days.

For example, you can have an IoT device that starts sending data (with timestamps) after the user makes a request, and timestamps/GetTick() are used only in the routine that sends data. Maybe the user, excited about the gadget after purchasing it, makes the request multiple times every day. After some weeks, he could forget about the gadget and remember it only many days later...
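
As a quick check of the 49-day figure: a 32-bit counter ticking at 1 kHz wraps after 2^32 ms, which is about 4,294,967 s or roughly 49.7 days. A small helper (the name wrap_period_days is hypothetical, just for illustration) makes the arithmetic explicit:

```c
#include <stdint.h>

/* Days until an N-bit counter ticking at tick_hz wraps (N must be < 64). */
double wrap_period_days(unsigned counter_bits, double tick_hz)
{
    uint64_t ticks = (uint64_t)1 << counter_bits;   /* 2^N ticks per wrap */
    return (double)ticks / tick_hz / 86400.0;       /* 86400 s per day */
}
```

Calling wrap_period_days(32, 1000.0) gives a value a little above 49.7.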
On 07/02/2020 10:43, David Brown wrote:
> On 07/02/2020 01:29, Rick C wrote:
>> On Thursday, February 6, 2020 at 4:35:56 PM UTC-5, David Brown
>> wrote:
>>> On 06/02/2020 19:02, Rick C wrote:
>
>>
>>> I think it's quite likely that the code already has a 1 KHz
>>> interrupt routine doing other things, so incrementing a "high"
>>> counter there would not be an issue.
>>
>> Actually, there is no need for a 1 kHz interrupt. I believe the OP
>> has a hardware counter ticking at 1 kHz so the overflow event would
>> be 49 days. There may be a faster interrupt for some other purpose,
>> but not needed for the hardware timer which may well be run on a low
>> power oscillator and backup battery.
>
> We don't know that - the OP hasn't told us all the details. An
> interrupt that hits every millisecond (or some other regular time), used
> as part of the global timebase and for executing regular functions, is
> very common. Maybe he has something like this, maybe not.

I think FreeRTOS is already configured to have a fast interrupt, something like 1ms. I suspect it is used to check whether tasks that are blocked waiting for a timer to expire must be activated.

My first idea was to implement the 64-bit, ms-resolution timestamp counter as something completely separate from the OS ticks, but I think I could add some code to the OS tick interrupt.

>>> But if there is no interrupt for other purposes, then it is a nice
>>> idea to do the update during the GetTick call like this. However,
>>> you need locking (or a more advanced lockfree solution, but that
>>> would likely be a fair bit less efficient on a microcontroller. It
>>> might be worth considering on a multi-processor system).
>>
>> My intent was to do something just plain simple, but in multitasking
>> system it is not so simple. I don't typically use complications like
>> interrupts. I do FPGA work where I roll the hardware for the
>> peripherals as well as designing the instruction set for the
>> processor. I seldom use multitasking other than potentially
>> interrupts which usually don't need to use locking and such. If
>> resources are not shared locking is not required.
>
> Fair enough. And we don't know if there is multi-tasking involved here
> or not. If GetTick is only called from one thread, there is no problem.
> Also if he has cooperative multitasking rather than pre-emptive, there
> will be no problem. Your suggested solution is good (IMHO), but it is
> important to understand its restrictions too.

Yes, I have a preemptive multi-tasking system (FreeRTOS).
On 08/02/2020 20:43, Ed Prochak wrote:
> On Thursday, February 6, 2020 at 1:02:49 PM UTC-5, Rick C wrote:
>> On Thursday, February 6, 2020 at 7:43:35 AM UTC-5, pozz wrote:
>>> I need a timestamp in millisecond in linux epoch. It is a number that
>>> doesn't fit in a 32-bits number.
>>>
>>> I'm using a 32-bit MCU (STM32L4R9...) so I don't have a 64-bits hw
>>> counter. I need to create a mixed sw/hw 64-bits counter. It's very
>>> simple, I configure a 32-bits hw timer to run at 1kHz and increment an
>>> uint32_t variable in timer overflow ISR.
>>>
>>> Now I need to implement a GetTick() function that returns a uint64_t. I
>>> know it could be difficult, because of race conditions. One solutions is
>>> to disable interrupts, but I remember another solution.
>>>
>>> extern volatile uint32_t ticks_high;
>>>
>>> uint64_t
>>> GetTick(void)
>>> {
>>>     uint32_t h1 = ticks_high;
>>>     uint32_t l1 = hwcnt_get();
>>>     uint32_t h2 = ticks_high;
>>>
>>>     if (h1 == h2) return ((uint64_t)h1 << 32) | l1;
>>>     else          return ((uint64_t)h1 << 32);
>>> }
>>>
>>> Is it correct in single-tasking? In the else branch, I decided to set
>>> the low part to zero. I think it's acceptable, because if h1!=h2, hw
>>> counter has just wrapped-around, so it is 0... maybe 1.
>>>
>>> What about preemptive multi-tasking? What happens if GetTick() is
>>> preempted by another higher-priority task that calls GetTick()?
>>>
>>> I think it's better to fix the else branch, because the higher-priority
>>> task could take more than a few milliseconds, so the previous assumption
>>> that hw counter is 0 (maybe 1) can be incorrect. The fix could be:
>>>
>>> uint64_t
>>> GetTick(void)
>>> {
>>>     uint32_t h1 = ticks_high;
>>>     uint32_t l1 = hwcnt_get();
>>>     uint32_t h2 = ticks_high;
>>>
>>>     if (h1 == h2) return ((uint64_t)h1 << 32) | l1;
>>>     else          return ((uint64_t)h1 << 32) | hwcnt_get();
>>> }
>>
>> All this seems to be more complex than it needs to be. You guys are
>> focused on limitations when things happen fast. Do you know how
>> slow things can happen?
>>
>> A 32 bit counter incremented at 1 kHz will roll over every 3 years
>> or so. If you can assure that the GetTick is called once every
>> 3 years simpler code can be used.
>>
>> Have a 64 bit counter value which is the 32 bit counter incremented
>> by the 1kHz interrupt and another 32 bit counter (the high part of
>> the 64 bits) which is incremented when needed in the GetTick code.
>>
>> uint64_t
>> GetTick(void)
>> {
>>     static uint32_t ticks_high;
>>     uint32_t ticks_hw= hwcnt_get();
>>     static uint32_t ticks_last;
>>
>>     if (ticks_last > ticks_hw) ticks_high++;
>>     ticks_last = ticks_hw;
>>     return ((uint64_t)ticks_high << 32) | ticks_hw;
>> }
>>
>> I'm not so conversant in C and I'm not familiar with the
>> conventions of using time variables. Clearly the time will
>> need to be initialized by some means and ticks_high would
>> need to be initialized to correspond to the current time/date,
>> unless this is a run time variable only tracking time since boot up.
>>
>> Is the "call this code at least once in 3 years" requirement reasonable?
>> In the systems I design that would not be a problem.
>>
>> --
>>
>> Rick C.
>>
>> - Get 1,000 miles of free Supercharging
>> - Tesla referral code - https://ts.la/richard11209
>
> Good points, Rick, but this conversation has me wonder:
>
> Why use a design that is handling the high 32bits in the
> application layer and the low 32bits separately in at the ISR?

I think Rick suggested his solution, where the high 32 bits are incremented in the application layer, because it is simpler. As you have read, incrementing the high 32 bits in the ISR forces us to implement a trickier GetTick() with multiple reads of the high counter.

Unfortunately his solution doesn't work as is with a preemptive scheduler when multiple tasks call GetTick().
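
To make the preemption problem concrete, here is a small host-side replay of one bad interleaving. It is only a sketch: the function names are hypothetical, and the hardware reads are replaced by fixed values so the sequence is deterministic. Task A reads the hardware counter just before the wrap and is preempted; task B then makes a complete call after the wrap; task A resumes with its stale reading.

```c
#include <stdint.h>

/* Shared state, as in Rick's GetTick() quoted above. */
static uint32_t ticks_high;
static uint32_t ticks_last;

/* Everything GetTick() does after the hardware counter has been read. */
static uint64_t finish_get_tick(uint32_t ticks_hw)
{
    if (ticks_last > ticks_hw) ticks_high++;
    ticks_last = ticks_hw;
    return ((uint64_t)ticks_high << 32) | ticks_hw;
}

/* Replays the interleaving; returns how many wraps were counted. */
uint32_t replay_preempted_calls(void)
{
    ticks_high = 0u;
    ticks_last = 0xFFFFFFF0u;

    uint32_t hw_seen_by_a = 0xFFFFFFFFu; /* task A reads, then is preempted */
    (void)finish_get_tick(3u);           /* task B, after the wrap: high = 1 */
    (void)finish_get_tick(hw_seen_by_a); /* task A resumes: ticks_last goes
                                            back to 0xFFFFFFFF */
    (void)finish_get_tick(4u);           /* next call sees a second "wrap" */
    return ticks_high;
}
```

Only one real wrap happened, but the replay ends with ticks_high equal to 2, and task A's own return value is already 2^32 ms too large.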
> Apparently you are using this only for interval timing?
> If you are looking to maintain calendar time, then you will
> need to store the high 32bits as Rick mentioned.
>
> restoring the high 32bits from nonvolatile storage is a boot up
> issue, and storing the value may require work outside the ISR.
> But is required only once in 3 years as Rick pointed out.
> But you would have some drift anyway since you have no way to
> measure the time the system is down.

At every bootup and at a regular interval I use NTP to synchronize the internal calendar time.
On 08/02/2020 18:03, Kent Dickey wrote:
> [...]
> Unfortunately, with this design, I believe it is not possible to implement
> a GetTick() function which does not sometimes fail to return a correct time.
> There is a fundamental race between the interrupt and the timer value rolling
> to 0 which software cannot account for.

Good point, Kent. Thank you for your post, which helps to fix some critical bugs.

You're right, ISRs aren't executed immediately after the corresponding event occurs. We should assume that the ISR code runs many cycles after the interrupt event.
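
The effect of that latency on the GetTick() from the opening post can be shown with a small model. This is a sketch only: the three reads (ticks_high, hardware counter, ticks_high again) are passed in as arguments, and the function name is hypothetical.

```c
#include <stdint.h>

/* Decision logic of the first GetTick() in the opening post, with the
 * three reads supplied by the caller instead of taken from hardware. */
uint64_t get_tick_from_reads(uint32_t h1, uint32_t l1, uint32_t h2)
{
    if (h1 == h2) return ((uint64_t)h1 << 32) | l1;
    return (uint64_t)h1 << 32;
}
```

One millisecond before the wrap the reads are (0, 0xFFFFFFFF, 0) and the result is 0xFFFFFFFF. Right after the wrap, while the overflow interrupt is still pending, the reads are (0, 0, 0) and the result is 0, so the timestamp jumps backwards by almost 2^32 ms even though h1 == h2.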
> 1) Have a single GetTick() routine, which is single-tasking (by
> disabling interrupts, or a mutex if there are multiple processors).
> This requires something to call GetTick() at least once every 49 days
> (worst case). This is basically the Rich C./David Brown solution, but
> they don't mention that you need to remove the interrupt on 32-bit overflow.

I think you suggested disabling interrupts to avoid any preemption by the RTOS scheduler, effectively blocking the scheduler altogether. However, I know it's a bad idea to enable/disable interrupts "manually" with an RTOS. Maybe a mutex for GetTick() is a better idea, something similar to this:

    uint64_t
    GetTick(void)
    {
        static uint32_t ticks_high;
        static uint32_t ticks_last;

        mutex_take();
        uint32_t ticks_hw = hwcnt_get();
        if (ticks_last > ticks_hw) ticks_high++;
        ticks_last = ticks_hw;
        /* build the result while still holding the mutex, so another
           task cannot change ticks_high in between */
        uint64_t ticks64 = ((uint64_t)ticks_high << 32) | ticks_hw;
        mutex_give();

        return ticks64;
    }

> 2) Use a higher interrupt rate. For instance, if we can take the interrupt
> when read_low32() has carry from bit 28 to bit 29, then we can piece together
> code which can work as long as GetTick() isn't delayed by more than 3-4 days.
> This require GetTick() to change using code given under #4 below.
>
> 3) Forget the hardware counter: just take an interrupt every 1ms, and
> increment a global variable uint64_t ticks64 on each interrupt, and then
> GetTick just returns ticks64. This only works if the CPU hardware supports
> atomic 64-bit accesses. It's not generally possible to write C code for a
> 32-bit processor which can guarantee 64-bit atomic ops, so it's best to have
> the interrupt handler deal with two 32-bit variables ticks_low and
> ticks_high, and then you still need the GetTicks() to have a while loop to
> read the two variables.

What about this?

    static volatile uint64_t ticks64;

    void timer_isr(void) {
        ticks64++;
    }

    uint64_t
    GetTick(void)
    {
        uint64_t t1 = ticks64;
        uint64_t t2;
        while((t2 = ticks64) - t1 > 100) {
            t1 = t2;
        }
        return t2;
    }

If dangerous things happen (the ISR executes during GetTick), t2-t1 is a very big number. 100ms represents the worst-case maximum duration of the ISRs/tasks that could preempt/interrupt GetTick. We could increase 100 even more.
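
For comparison, the variant described in the quoted option 3 (two 32-bit variables plus a retry loop in the reader) might look like the sketch below. The function names are hypothetical, and it assumes a single core where the ISR always runs to completion before the interrupted reader resumes.

```c
#include <stdint.h>

static volatile uint32_t ticks_low;
static volatile uint32_t ticks_high;

/* Called once per millisecond from the timer interrupt. */
void timer_isr_1ms(void)
{
    uint32_t low = ticks_low + 1u;
    ticks_low = low;
    if (low == 0u) ticks_high++;    /* carry into the high word */
}

/* Reader: retry if the high word changed while the pair was being read. */
uint64_t get_tick_split(void)
{
    uint32_t high, low;
    do {
        high = ticks_high;
        low  = ticks_low;
    } while (high != ticks_high);
    return ((uint64_t)high << 32) | low;
}
```

Because each variable is read with a single 32-bit access and the loop only exits when the high word was stable around the low read, no threshold such as the 100 above is needed.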
> 4) Use a regular existing interrupt which occurs at any rate, as long as it's
> well over 1ms, and well under 49 days. Let's assume you have a 1-second
> interrupt. This can be asynchronous to the 1ms timer. In that interrupt
> handler, you sample the 32-bit hardware counter, and if you notice it
> wrapping (previous read value > new value), increment ticks_high.
> You need to update the global volatile variable ticks_low as well as the
> current hw count. And this interrupt handler needs to be the only code
> changing ticks_low and ticks_high. Then, GetTick() does the following:
>
>     uint32_t local_ticks_low, local_ticks_high;
>     [ while loop to read valid ticks_low and ticks_high value into the
>         local_* variables ]
>     uint64_t ticks64 = ((uint64_t)local_ticks_high << 32) | local_ticks_low;
>     ticks64 += (int32_t)(read_low32() - local_ticks_low);
>     return ticks64;

Do you mean...?

    volatile uint32_t ticks_low;
    volatile uint32_t ticks_high;

    void interrupt_at_every_second(void) {
        uint32_t tl = get_low_ticks();  // from free-running 1ms counter
        if (ticks_low > tl) {
            ticks_high++;
        }
        ticks_low = tl;
    }

    uint64_t
    GetTick(void)
    {
        uint32_t h2;
        uint32_t local_ticks_high = ticks_high;
        uint32_t local_ticks_low = ticks_low;
        while((h2 = ticks_high) != local_ticks_high) {
            local_ticks_high = h2;
            local_ticks_low = ticks_low;
        }
        uint64_t ticks64 = ((uint64_t)local_ticks_high << 32) | local_ticks_low;
        ticks64 += (int32_t)(read_low32() - local_ticks_low);
        return ticks64;
    }

> Basically, we return the ticks64 from the last regular interrupt, which could
> be 1 second ago, and we add in the small delta from reading the hw counter.
> Again, this requires the 1-second interrupt to be guaranteed to happen before
> we get close to 49 days since the last 1-second interrupt (if it's really
> a 1-second interrupt, it easily meets that criteria. If you try to pick
> something irregular, like a keypress interrupt, then that won't work). It
> does not depend on the exact rate of the interrupt at all.
>
> I wrote it above with extra safety--It subtracts two 32-bit unsigned variables,
> gets a 32-bit unsigned result, treats that as a 32-bit signed result, and adds
> that to the 64-bit unsigned ticks count. It's not strictly necessary to do
> the 32-bit signed result cast: it just makes the code more robust in case
> the HW timer moves backwards slightly. Imagine some code tries to adjust the
> current timer value by setting it backwards slightly (say, some code trying
> to calibrate the timer with the RTC or something). Without the cast to
> 32-bit signed int, this slight backwards move would result in ticks64
> jumping ahead 49 days, which would be bad. In C, this is pretty easy, but it
> should be carefully commented so no one removes any important casts.
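
The effect of the signed cast described in the quote can be checked in isolation. The sketch below uses hypothetical function names and takes the snapshot and the current hardware count as plain arguments. It assumes the usual two's-complement behaviour when converting an out-of-range uint32_t to int32_t, which is implementation-defined in older C standards but is what common compilers do.

```c
#include <stdint.h>

/* With the cast: a small backwards step of the hardware counter yields
 * a small negative delta. */
uint64_t add_hw_delta(uint64_t snapshot64, uint32_t snapshot_low,
                      uint32_t hw_now)
{
    int32_t delta = (int32_t)(hw_now - snapshot_low);
    return snapshot64 + (uint64_t)(int64_t)delta;
}

/* Without the cast: the same backwards step looks like almost 2^32 ms. */
uint64_t add_hw_delta_unsigned(uint64_t snapshot64, uint32_t snapshot_low,
                               uint32_t hw_now)
{
    return snapshot64 + (uint32_t)(hw_now - snapshot_low);
}
```

With a snapshot of 1000 and a hardware counter that has been set back to 998, the first function returns 998, while the second returns 4294968294, i.e. about 49 days ahead. A forward step across the 32-bit wrap (snapshot 0xFFFFFFF0, counter 0x10) gives 0x100000010 in both versions.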
On 09/02/2020 21:55, Richard Damon wrote:
 > [...]
> Another option on that processor is to chain a couple of timers
> together, so when the lower counter rolls over the upper counter counts
> automatically, and I believe it handles it so there isn't a skew between
> the counters. Then the read upper going direct to the hardware won't
> have the issue.

I *believe* too, but are we completely sure?
On 2/9/20 6:58 PM, pozz wrote:
> On 09/02/2020 21:55, Richard Damon wrote:
>  > [...]
>> Another option on that processor is to chain a couple of timers
>> together, so when the lower counter rolls over the upper counter
>> counts automatically, and I believe it handles it so there isn't a
>> skew between the counters. Then the read upper going direct to the
>> hardware won't have the issue.
>
> I *believe* too, but are we completely sure?

It has been a while since I have been through that processor's documentation, but I remember a choice of synchronization modes, one of which made the update simultaneous, at the cost of a bit more delay from the trigger pulse (you would typically run the counter on the system clock with a once-per-millisecond trigger pulse). If I am right about what the documentation says, it is a clear guarantee, but the person designing the system should look that up to make sure they do it right.
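
Even if the two chained counters update without skew, a 32-bit CPU still reads them with two separate bus accesses, so the read sequence itself has to tolerate a rollover between the two reads. A minimal sketch, assuming the two count registers are memory-mapped and passed in as pointers (the function name and parameters are hypothetical, not taken from any vendor library):

```c
#include <stdint.h>

/* Read a 64-bit value made of two chained 32-bit hardware counters.
 * The upper counter is read before and after the lower one; if it
 * changed, the lower counter rolled over in between and we retry. */
uint64_t read_chained_counters(const volatile uint32_t *upper,
                               const volatile uint32_t *lower)
{
    uint32_t hi, lo;
    do {
        hi = *upper;
        lo = *lower;
    } while (hi != *upper);
    return ((uint64_t)hi << 32) | lo;
}
```

No software counter or overflow interrupt is involved here, so the interrupt-latency race does not arise as long as the hardware really does update both counters together.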