timestamp in ms and 64-bit counter

Reply by ●February 9, 2020

On Sat, 8 Feb 2020 19:57:48 +0100, David Brown
<david.brown@hesbynett.no> wrote:

>> Never used NT, but I used W2k and it was great!  W2k was widely
>> pirated so MS started a phone home type of licensing with XP which
>> was initially not well received, but over time became accepted.  Now
>> people reminisce about the halcyon days of XP.
>
>Did you not use NT 4.0 ?  It was quite solid.  W2K was also good, but XP 
>took a few service packs before it became reliable enough for serious use.

NT 4.0 solid ??

NT4 moved graphical functions to kernel mode to speed up window
updates. When doing some operations in kernel mode on behalf of a user
mode function, the first thing that the kernel mode routine should do
is to check that the parameters passed to it were accessible from
_user_ mode. Unfortunately this was not done initially, so accidentally
passing a NULL pointer to these functions crashed the whole computer,
not just the application. SP1 added these checks.

In general, each NT4 service pack introduced new bugs, and soon the
next SP was released to correct the bugs introduced by the previous
one. Thus only every other SP was actually usable.

Even the NT5 beta was more stable than NT4 with the most recent SP. The
NT5 beta was renamed Windows 2000 before final release.

Reply by George Neuner ●February 9, 2020

On Sat, 08 Feb 2020 14:30:53 -0600, Robert Wessel
<robertwessel2@yahoo.com> wrote:

>On Sat, 08 Feb 2020 18:37:35 +0200, upsidedown@downunder.com wrote:
>
>>Some earlier Windows versions used a 55 Hz (or was it 55 ms) clock
>>interrupt rate, so I really don't understand where the 1 ms clock
>>tick or the 49 days comes from.
>
>
>The "tick count" in the (Win32) OS was always 1000Hz (as reported by
>GetTickCount(), for example).  The physical ticks were massaged to
>correctly update that count.

Yes, but the hardware tick was at 18Hz (~55ms) up until XP and the
introduction of "multimedia" timers.  

At first those "multimedia" timers were implemented by a realtime
priority thread using the CPU's cycle counter.  In a quiet system you
could get down to ~50us.

However, 10+MHz HPET hardware timers were introduced in 2005 and
quickly became standard on retail systems.  Support for HPET based
multimedia timers came in XPsp3 (2008).

Since Vista, if HPET is available, one channel of the timer is used to
support the system clock at 1KHz.  

George

Reply by David Brown ●February 9, 2020

On 09/02/2020 07:35, upsidedown@downunder.com wrote:
> On Sat, 8 Feb 2020 19:57:48 +0100, David Brown
> <david.brown@hesbynett.no> wrote:
> 
>>> Never used NT, but I used W2k and it was great!  W2k was widely
>>> pirated so MS started a phone home type of licensing with XP which
>>> was initially not well received, but over time became accepted.  Now
>>> people reminisce about the halcyon days of XP.
>>
>> Did you not use NT 4.0 ?  It was quite solid.  W2K was also good, but XP
>> took a few service packs before it became reliable enough for serious use.
> 
> NT 4.0 solid ??
> 
> NT4 moved graphical functions to kernel mode to speed up window
> updates.

Yes.  And that meant bugs in the graphics drivers could kill the whole 
system, unlike in NT 3.x.  And bugs in the graphics drivers were 
certainly not unknown.  However, with a little care it could run 
reliably for long times.  I don't remember ever having a software or OS 
related crash or halt on our little NT 4 server.

(My NT 4 workstation eventually decided to wipe my start menu and 
replace it with a single entry "eject computer", complete with icon. 
And it kept asking me to insert a disk in drive C: and close the door. 
But that was after many years of use and abuse.)

> When doing some operations in kernel mode on behalf of a user
> mode function, the first thing that the kernel mode routine should do
> is to check that the parameters passed to it were accessible from
> _user_ mode. Unfortunately this was not done initially, so accidentally
> passing a NULL pointer to these functions crashed the whole computer,
> not just the application. SP1 added these checks.
> 
> In general, each NT4 service pack introduced new bugs, and soon the
> next SP was released to correct the bugs introduced by the previous
> one. Thus only every other SP was actually usable.
> 
> Even the NT5 beta was more stable than NT4 with the most recent SP. The
> NT5 beta was renamed Windows 2000 before final release.
> 

I certainly liked W2K, and found it quite reliable.  But I still 
remember NT 4.0 as good too.

Reply by Richard Damon ●February 9, 2020

On 2/8/20 2:33 PM, Kent Dickey wrote:
> In article <5jC%F.78169$8Y7.67931@fx05.iad>,
> Richard Damon  <Richard@Damon-Family.org> wrote:
>> On 2/8/20 12:03 PM, Kent Dickey wrote:
>>> Shown more explicitly, the following are all valid states (let's assume
>>> ticks_high is 0, read_low32() just ticked to 0xffff_fffe):
>>>
>>> Time            read_low32()            ticks_high
>>> -------------------------------------------------
>>> 0               0xffff_fffe             0
>>> 1ms             0xffff_ffff             0
>>> 1.99999ms       0xffff_ffff             0
>>> 2ms             0x0000_0000             0
>>> Interrupt is sent and is now pending
>>> 2ms+delta       0x0000_0000             1
>>>
>>> The issue is: what is "delta", and can other code (including your GetTick()
>>> function) run between "2ms" and "2ms+delta"?  And the answer is almost
>>> assuredly "yes".  This is a problem.
>>
>> But, as long as the timing is such that we can not do BOTH the
>> read_low32() and the read of ticks_high in that delta, we can't get the
>> wrong number.
>>
>> This is somewhat a function of the processor, and how much the
>> instruction pipeline 'skids' when an interrupt occurs. The processor
>> that he mentioned, an STM32L4R9, which uses a Cortex-M4 core, doesn't have
>> this much of a skid, so that can't be a problem unless you do something
>> foolish like disable the interrupts while doing the sequence.
> 
> The interrupt skid matters for how large the window is, but the problem
> happens even if the "skid" was 0.
> 
> Look at it this way: the hardware counter logic is something like:
> 
> 	always @(posedge clk) begin
> 		if(do_inc) begin
> 			cntr += 1;
> 			if(cntr == 0) begin
> 				interrupt = 1;
> 			end
> 		end
> 	end
> 
> Then at cycle 0 cntr=ffff_ffff and do_inc=0.  At cycle 1, do_inc=1 and cntr=0
> and interrupt=1.
> 
> In that cycle, software could read cntr=0.  The interrupt CANNOT have taken
> place yet since interrupts aren't instantaneous--the signal hasn't even made it
> to the interrupt controller yet, it's just that this clock module has decided to
> request an interrupt.  (The ARM GIC supports asynchronous interrupts, so it
> takes several clocks just for it to register the interrupt).
> 
> This is always somewhat a function of the processor, but the problem is
> inherent to all CPUs.  A simple 6502 or 8086 or whatever has the same problem
> and cannot fix it easily either.
> 
> The hardware cannot get this case right without some extreme craziness.  That
> would be a pre-interrupt detection circuit, prepared to drive the interrupt
> early so the CPU reacts in time.
> 
> The right way to look at it--hardware interrupts are always delayed by tens
> or hundreds of cycles between when you think they happen and when you receive
> them.  Then you'll get your algorithms right.
> 
> Kent
> 

I did forget the delay in the interrupt controller. With that delay, you 
do have a fundamental issue between reading hardware registers and the 
software counter.

A couple of solutions, some that have been mentioned:

Have a 1ms interrupt and in software keep the 64 bit counter.

I believe you can also program the counters to generate multiple 
interrupts in the count cycle. If you generate one at 0 and one at the 
half-way point, then knowing which interrupt was last seen you can tell 
if one is 'pending' based on the lower counter read.

Another option on that processor is to chain a couple of timers 
together, so when the lower counter rolls over the upper counter counts 
automatically, and I believe it handles it so there isn't a skew between 
the counters. Then the read upper going direct to the hardware won't 
have the issue.
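
The half-way interrupt idea could look roughly like the sketch below. It is only an illustration under assumptions that are not in the post: the counter is simulated by a variable (sim_counter), the ISR (timer_half_isr, a hypothetical name) fires once when the counter reaches 0x80000000 and once when it wraps to 0, the counter and half_periods both start at 0, and the ISR latency is far shorter than half a counter period.

```c
#include <stdint.h>

/* Simulated state; on real hardware sim_counter would be the timer's
   count register.  All names here are hypothetical. */
static volatile uint32_t sim_counter;
static volatile uint32_t half_periods;  /* bumped by the ISR at 0x80000000 and at 0 */

static uint32_t hwcnt_get(void) { return sim_counter; }

/* Assumed to run once for every half-way and every wrap interrupt. */
void timer_half_isr(void)
{
    half_periods = half_periods + 1u;
}

uint64_t GetTick(void)
{
    uint32_t p, low;

    /* Retry if the ISR ran between the two reads of half_periods. */
    do {
        p   = half_periods;
        low = hwcnt_get();
    } while (p != half_periods);

    /* With no interrupt pending, bit 0 of half_periods equals bit 31 of
       the counter.  A mismatch means the counter crossed a boundary but
       the ISR has not run yet, so account for that interrupt here. */
    if ((p & 1u) != (low >> 31)) {
        p++;
    }
    return ((uint64_t)(p >> 1) << 32) | low;
}
```

In this sketch the upper word is half_periods / 2, so it is only 31 bits wide. The parity check is what replaces the "which interrupt was last seen" bookkeeping.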

Reply by pozz ●February 9, 2020

Il 06/02/2020 19:02, Rick C ha scritto:
 > [...]
> Is the "call this code at least once in 3 years" requirement reasonable?  In the systems I design that would not be a problem.

Of course this limitation isn't usually a real problem.

However, there could be situations where GetTick() is first called after 
more than 49 days. For example, you can have an IoT device that starts 
sending data (with timestamps) only after the user makes a request, and 
timestamps/GetTick() are used only in the routine that sends data.

Maybe the user, after purchasing, is excited about this gadget and makes 
the request multiple times every day. After some weeks, he could forget 
he has the gadget and remember it only after many days...
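
For reference, the 49-day figure follows directly from the counter width and the tick rate. A small sketch of the arithmetic (wrap_period_days is a hypothetical helper name, not something from the thread):

```c
/* Days until a free-running 32-bit counter wraps at the given tick rate. */
double wrap_period_days(double tick_hz)
{
    const double counts          = 4294967296.0;  /* 2^32 */
    const double seconds_per_day = 86400.0;

    return counts / tick_hz / seconds_per_day;
}
```

At 1 kHz this gives roughly 49.7 days, so any scheme that detects the wrap only inside GetTick() needs a call at least that often.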

Reply by pozz ●February 9, 2020

Il 07/02/2020 10:43, David Brown ha scritto:
> On 07/02/2020 01:29, Rick C wrote:
>> On Thursday, February 6, 2020 at 4:35:56 PM UTC-5, David Brown
>> wrote:
>>> On 06/02/2020 19:02, Rick C wrote:
> 
>>
>>> I think it's quite likely that the code already has a 1 KHz
>>> interrupt routine doing other things, so incrementing a "high"
>>> counter there would not be an issue.
>>
>> Actually, there is no need for a 1 kHz interrupt.  I believe the OP
>> has a hardware counter ticking at 1 kHz so the overflow event would
>> be 49 days.  There may be a faster interrupt for some other purpose,
>> but not needed for the hardware timer which may well be run on a low
>> power oscillator and backup battery.
> 
> We don't know that - the OP hasn't told us all the details.  An
> interrupt that hits every millisecond (or some other regular time), used
> as part of the global timebase and for executing regular functions, is
> very common.  Maybe he has something like this, maybe not.

I think FreeRTOS is already configured to have a fast interrupt, 
something close to 1ms. I suspect it is used to check whether some tasks, 
blocked waiting for the expiration of a timer, must be activated.

My first idea was to implement the 64-bit ms-resolution timestamp 
counter completely separately from the OS ticks, but I think I could add 
some code to the OS tick interrupt.


>>> But if there is no interrupt for other purposes, then it is a nice
>>> idea to do the update during the GetTick call like this. However,
>>> you need locking (or a more advanced lockfree solution, but that
>>> would likely be a fair bit less efficient on a microcontroller.  It
>>>   might be worth considering on a multi-processor system).
>>
>> My intent was to do something just plain simple, but in multitasking
>> system it is not so simple.  I don't typically use complications like
>> interrupts.  I do FPGA work where I roll the hardware for the
>> peripherals as well as designing the instruction set for the
>> processor.  I seldom use multitasking other than potentially
>> interrupts which usually don't need to use locking and such.  If
>> resources are not shared locking is not required.
> 
> Fair enough.  And we don't know if there is multi-tasking involved here
> or not.  If GetTick is only called from one thread, there is no problem.
>   Also if he has cooperative multitasking rather than pre-emptive, there
> will be no problem.  Your suggested solution is good (IMHO), but it is
> important to understand its restrictions too.

Yes, I have a preemptive multi-tasking system (FreeRTOS).

Reply by pozz ●February 9, 2020

Il 08/02/2020 20:43, Ed Prochak ha scritto:
> On Thursday, February 6, 2020 at 1:02:49 PM UTC-5, Rick C wrote:
>> On Thursday, February 6, 2020 at 7:43:35 AM UTC-5, pozz wrote:
>>> I need a timestamp in millisecond in linux epoch. It is a number that
>>> doesn't fit in a 32-bits number.
>>>
>>> I'm using a 32-bit MCU (STM32L4R9...) so I don't have a 64-bits hw
>>> counter. I need to create a mixed sw/hw 64-bits counter. It's very
>>> simple, I configure a 32-bits hw timer to run at 1kHz and increment an
>>> uint32_t variable in timer overflow ISR.
>>>
>>> Now I need to implement a GetTick() function that returns a uint64_t. I
>>> know it could be difficult, because of race conditions. One solutions is
>>> to disable interrupts, but I remember another solution.
>>>
>>> extern volatile uint32_t ticks_high;
>>>
>>> uint64_t
>>> GetTick(void)
>>> {
>>>     uint32_t h1 = ticks_high;
>>>     uint32_t l1 = hwcnt_get();
>>>     uint32_t h2 = ticks_high;
>>>
>>>     if (h1 == h2) return ((uint64_t)h1 << 32) | l1;
>>>     else          return ((uint64_t)h1 << 32);
>>> }
>>>
>>> Is it correct in single-tasking? In the else branch, I decided to set
>>> the low part to zero. I think it's acceptable, because if h1!=h2, hw
>>> counter has just wrapped-around, so it is 0... maybe 1.
>>>
>>> What about preemptive multi-tasking? What happens if GetTick() is
>>> preempted by another higher-priority task that calls GetTick()?
>>>
>>> I think it's better to fix the else branch, because the higher-priority
>>> task could take more than a few milliseconds, so the previous assumption
>>> that hw counter is 0 (maybe 1) can be incorrect. The fix could be:
>>>
>>> uint64_t
>>> GetTick(void)
>>> {
>>>     uint32_t h1 = ticks_high;
>>>     uint32_t l1 = hwcnt_get();
>>>     uint32_t h2 = ticks_high;
>>>
>>>     if (h1 == h2) return ((uint64_t)h1 << 32) | l1;
>>>     else          return ((uint64_t)h1 << 32) | hwcnt_get();
>>> }
>>
>> All this seems to be more complex than it needs to be.  You guys are
>> focused on limitations when things happen fast.  Do you know how
>> slow things can happen?
>>
>> A 32 bit counter incremented at 1 kHz will roll over every 3 years
>> or so.  If you can assure that the GetTick is called once every
>> 3 years simpler code can be used.
>>
>> Have a 64 bit counter value which is the 32 bit counter incremented
>> by the 1kHz interrupt and another 32 bit counter (the high part of
>> the 64 bits) which is incremented when needed in the GetTick code.
>>
>> uint64_t
>> GetTick(void)
>> {
>>     static uint32_t ticks_high;
>>     uint32_t ticks_hw= hwcnt_get();
>>     static uint32_t ticks_last;
>>
>>     if (ticks_last > ticks_hw)  ticks_high++;
>>     ticks_last = ticks_hw;
>>     return ((uint64_t)ticks_high << 32) | ticks_hw;
>> }
>>
>> I'm not so conversant in C and I'm not familiar with the
>> conventions of using time variables.  Clearly the time will
>> need to be initialized by some means and ticks_high would
>> need to be initialized to correspond to the current time/date,
>> unless this is a run time variable only tracking time since boot up.
>>
>> Is the "call this code at least once in 3 years" requirement reasonable?
>>   In the systems I design that would not be a problem.
>>
>> -- 
>>
>>    Rick C.
>>
>>    - Get 1,000 miles of free Supercharging
>>    - Tesla referral code - https://ts.la/richard11209
> 
> Good points, Rick, but this conversation has me wonder:
> 
>   Why use a design that is handling the high 32bits in the
>   application layer and the low 32bits separately in at the ISR?

I think Rick suggested his solution, where the high 32 bits are incremented 
in the application layer, because it is simpler. As you have read, 
incrementing the high 32 bits in the ISR forces us to implement a trickier 
GetTick() with multiple reads of the high counter.

Unfortunately his solution doesn't work as is with a preemptive scheduler 
when multiple tasks call GetTick().


> Apparently you are using this only for interval timing?
> If you are looking to maintain calendar time, then you will
> need to store the high 32bits as Rick mentioned.
> 
> restoring the high 32bits from nonvolatile storage is a boot up
> issue, and storing the value may require work outside the ISR.
> But is required only once in 3 years as Rick pointed out.
> But you would have some drift anyway since you have no way to
> measure the time the system is down.

At every bootup and at a regular interval I use NTP to synchronize the 
internal calendar time.

Reply by pozz ●February 9, 2020

Il 08/02/2020 18:03, Kent Dickey ha scritto:
> [...]
> Unfortunately, with this design, I believe it is not possible to implement
> a GetTick() function which does not sometimes fail to return a correct time.
> There is a fundamental race between the interrupt and the timer value rolling
> to 0 which software cannot account for.

Good point, Kent. Thank you for your post, which helps to fix some 
critical bugs.

You're right, ISRs aren't executed immediately after the corresponding 
event occurs. We should assume that ISR code runs many cycles after the 
interrupt event.


> 1) Have a single GetTick() routine, which is single-tasking (by
> disabling interrupts, or a mutex if there are multiple processors).
> This requires something to call GetTick() at least once every 49 days
> (worst case).  This is basically the Rick C./David Brown solution, but
> they don't mention that you need to remove the interrupt on 32-bit overflow.

I think you mentioned disabling interrupts to avoid any preemption by the 
RTOS scheduler, effectively blocking the scheduler altogether.
However, I know it's a bad idea to enable/disable interrupts "manually" 
with an RTOS.
Maybe a mutex for GetTick() is a better idea, something similar to this:

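uint64_t
GetTick(void)
{
    static uint32_t ticks_high;
    static uint32_t ticks_last;
    uint64_t ticks64;

    mutex_take();

    uint32_t ticks_hw = hwcnt_get();

    /* The hardware counter wrapped since the previous call. */
    if (ticks_last > ticks_hw)  ticks_high++;
    ticks_last = ticks_hw;

    /* Build the result while still holding the mutex, so another task
       cannot change ticks_high between the update and this read. */
    ticks64 = ((uint64_t)ticks_high << 32) | ticks_hw;

    mutex_give();

    return ticks64;
}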

> 2) Use a higher interrupt rate.  For instance, if we can take the interrupt
> when read_low32() has carry from bit 28 to bit 29, then we can piece together
> code which can work as long as GetTick() isn't delayed by more than 3-4 days.
> This requires GetTick() to change to use the code given under #4 below.
> 
> 3) Forget the hardware counter: just take an interrupt every 1ms, and
> increment a global variable uint64_t ticks64 on each interrupt, and then
> GetTick just returns ticks64.  This only works if the CPU hardware supports
> atomic 64-bit accesses.  It's not generally possible to write C code for a
> 32-bit processor which can guarantee 64-bit atomic ops, so it's best to have
> the interrupt handler deal with two 32-bit variables ticks_low and
> ticks_high, and then you still need GetTick() to have a while loop to
> read the two variables.

What about this?

static volatile uint64_t ticks64;
void timer_isr(void) {
   ticks64++;
}
uint64_t GetTick(void) {
   uint64_t t1 = ticks64;
   uint64_t t2;
   while((t2 = ticks64) - t1 > 100) {
     t1 = t2;
   }
   return t2;
}

If dangerous things happen (the ISR executes during GetTick), t2-t1 is a 
very big number. 100 ms represents the worst-case maximum duration of the 
ISRs/tasks that could preempt/interrupt GetTick. We could increase 100 
even more.
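
For comparison, the variant described in the quoted point 3 (two 32-bit variables updated by the ISR, plus a loop in the reader) could be sketched as below. This is only an illustration; it assumes a single core on which 32-bit loads and stores are atomic and on which the ISR cannot itself be interrupted by another caller of GetTick().

```c
#include <stdint.h>

static volatile uint32_t ticks_low;
static volatile uint32_t ticks_high;

/* Assumed to be called from the 1 ms timer interrupt. */
void timer_isr(void)
{
    uint32_t low = ticks_low + 1u;

    ticks_low = low;
    if (low == 0u) {
        /* Carry from the low word into the high word. */
        ticks_high = ticks_high + 1u;
    }
}

uint64_t GetTick(void)
{
    uint32_t high, low;

    /* If the ISR carried into ticks_high while we were reading, the
       second read of ticks_high differs and the pair is read again. */
    do {
        high = ticks_high;
        low  = ticks_low;
    } while (high != ticks_high);

    return ((uint64_t)high << 32) | low;
}
```

Unlike the threshold of 100 in the loop above, this version needs no assumption about how long GetTick() can be preempted: it only retries when the high word actually changed.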


> 4) Use a regular existing interrupt which occurs at any rate, as long as it's
> well over 1ms, and well under 49 days.  Let's assume you have a 1-second
> interrupt.  This can be asynchronous to the 1ms timer.  In that interrupt
> handler, you sample the 32-bit hardware counter, and if you notice it
> wrapping (previous read value > new value), increment ticks_high.
> You need to update the global volatile variable ticks_low as well as the
> current hw count.  And this interrupt handler needs to be the only code
> changing ticks_low and ticks_high.  Then, GetTick() does the following:
> 
>          uint32_t local_ticks_low, local_ticks_high;
>          [ while loop to read valid ticks_low and ticks_high value into the
>                  local_* variables ]
>          uint64_t ticks64 = ((uint64_t)local_ticks_high << 32) | local_ticks_low;
>          ticks64 += (int32_t)(read_low32() - local_ticks_low);
>          return ticks64;

Do you mean...?

volatile uint32_t ticks_low;
volatile uint32_t ticks_high;

void interrupt_at_every_second(void)
{
     uint32_t tl = read_low32();   // from free-running 1ms counter
     if (ticks_low > tl) {
       ticks_high++;
     }
     ticks_low = tl;
}

uint64_t GetTick(void)
{
     uint32_t h2;

     uint32_t local_ticks_high = ticks_high;
     uint32_t local_ticks_low = ticks_low;
     while((h2 = ticks_high) != local_ticks_high) {
         local_ticks_high = h2;
         local_ticks_low = ticks_low;
     }

     uint64_t ticks64 = ((uint64_t)local_ticks_high << 32) | local_ticks_low;
     ticks64 += (int32_t)(read_low32() - local_ticks_low);
     return ticks64;
}


> Basically, we return the ticks64 from the last regular interrupt, which could
> be 1 second ago, and we add in the small delta from reading the hw counter.
> Again, this requires the 1-second interrupt to be guaranteed to happen before
> we get close to 49 days since the last 1-second interrupt (if it's really
> a 1-second interrupt, it easily meets that criteria.  If you try to pick
> something irregular, like a keypress interrupt, then that won't work).  It
> does not depend on the exact rate of the interrupt at all.
> 
> I wrote it above with extra safety--It subtracts two 32-bit unsigned variables,
> gets a 32-bit unsigned result, treats that as a 32-bit signed result, and adds
> that to the 64-bit unsigned ticks count.  It's not strictly necessary to do
> the 32-bit signed result cast: it just makes the code more robust in case
> the HW timer moves backwards slightly.  Imagine some code tries to adjust the
> current timer value by setting it backwards slightly (say, some code trying
> to calibrate the timer with the RTC or something).  Without the cast to
> 32-bit signed int, this slight backwards move would result in ticks64
> jumping ahead 49 days, which would be bad.  In C, this is pretty easy, but it
> should be carefully commented so no one removes any important casts.
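
The effect of the signed cast can be shown with a small worked sketch. The helper names (extend_ticks, extend_ticks_unsigned) are hypothetical and exist only for this illustration. The conversion of an out-of-range value to int32_t is implementation-defined in C; the sketch assumes the usual two's complement behaviour.

```c
#include <stdint.h>

/* Extend a 64-bit snapshot with a fresh 32-bit counter read, treating
   the difference as signed so that a small backwards step stays small. */
uint64_t extend_ticks(uint64_t snap64, uint32_t snap_low, uint32_t now_low)
{
    /* Unsigned subtraction wraps modulo 2^32, so a counter wrap between
       the snapshot and the new read is handled automatically. */
    int32_t delta = (int32_t)(now_low - snap_low);

    /* The negative delta is sign-extended before the 64-bit addition. */
    return snap64 + (uint64_t)(int64_t)delta;
}

/* Same calculation without the signed cast, for comparison only. */
uint64_t extend_ticks_unsigned(uint64_t snap64, uint32_t snap_low, uint32_t now_low)
{
    return snap64 + (uint32_t)(now_low - snap_low);
}
```

With a snapshot of 0x100000010 taken at a low count of 0x10 and a new read of 0x0e (the timer was set back by two ticks), the signed version gives 0x10000000e, while the unsigned version gives 0x20000000e, which is the jump of about 49 days described above.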

Reply by pozz ●February 9, 2020

Il 09/02/2020 21:55, Richard Damon ha scritto:
 > [...]
> Another option on that processor is to chain a couple of timers 
> together, so when the lower counter rolls over the upper counter counts 
> automatically, and I believe it handles it so there isn't a skew between 
> the counters. Then the read upper going direct to the hardware won't 
> have the issue.

I *believe* too, but are we completely sure?

Reply by Richard Damon ●February 9, 2020

On 2/9/20 6:58 PM, pozz wrote:
> Il 09/02/2020 21:55, Richard Damon ha scritto:
>  > [...]
>> Another option on that processor is to chain a couple of timers 
>> together, so when the lower counter rolls over the upper counter 
>> counts automatically, and I believe it handles it so there isn't a 
>> skew between the counters. Then the read upper going direct to the 
>> hardware won't have the issue.
> 
> I *believe* too, but are we completely sure?

It has been a while since I went through that processor's 
documentation, but I remember a choice of synchronization modes, one of 
which made the update simultaneous, at the cost of a bit more delay from 
the trigger pulse (you would typically run the counter on the system 
clock with a once per millisecond trigger pulse). If I am right about 
what the documentation says, it is a clear guarantee, but the person 
designing the system should look that up to make sure they do it right.