Re: Proposal: add the timestamp64 type (Noam Yorav-Raphael)


Re: Proposal: add the timestamp64 type (Noam Yorav-Raphael)

Stefano Miccoli


On 11 Nov 2020, at 18:00, [hidden email] wrote:

I propose to add a new type called "timestamp64". It will be a pure timestamp, meaning that it represents a moment in time (as seconds/ms/us/ns since the epoch), without any timezone information. 

Sorry, but I really don't see the usefulness of another time-stamping format based on POSIX time. Indeed, POSIX time is based on a naive approximation of UTC and is ambiguous across leap seconds. Quoting from Wikipedia <https://en.wikipedia.org/wiki/Unix_time#Leap_seconds>:

The Unix time number 1483228800 is thus ambiguous: it can refer either to start of the leap second (2016-12-31 23:59:60) or the end of it, one second later (2017-01-01 00:00:00). In the theoretical case when a negative leap second occurs, no ambiguity is caused, but instead there is a range of Unix time numbers that do not refer to any point in UTC time at all.
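A minimal sketch of the ambiguity described in the quote, using only the Python standard library. It relies on the fact that `calendar.timegm` applies the plain "seconds since the epoch" arithmetic and does not range-check the seconds field; the variable names are illustrative only.

```python
import calendar

# Time tuples are (year, month, day, hour, minute, second, wday, yday, isdst);
# calendar.timegm only uses the first six fields and treats them as UTC.
leap_second = calendar.timegm((2016, 12, 31, 23, 59, 60, 0, 0, 0))
next_midnight = calendar.timegm((2017, 1, 1, 0, 0, 0, 0, 0, 0))

# Two distinct UTC instants collapse onto a single Unix time number.
print(leap_second, next_midnight)
print(leap_second == next_midnight)
```

Going the other way, from the number back to UTC, there is no way to tell which of the two instants was meant.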

Precision time stamping is quite a complex task: you can use UTC, TAI, GPS, just to mention the most used timescales. And how do you deal with timestamps in the past, when timekeeping was based on earth rotation, and not atomic clocks ticking at (approximately) 1 SI-second frequency?

In my opinion time-stamping should be application dependent, and I doubt that the new “timestamp64” could be beneficial to the numpy community.

Best regards,

Stefano

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Proposal: add the timestamp64 type (Noam Yorav-Raphael)

mattip
Administrator

On 11/12/20 6:04 PM, Stefano Miccoli wrote:

>
>
>> On 11 Nov 2020, at 18:00, [hidden email]
>> <mailto:[hidden email]> wrote:
>>
>> I propose to add a new type called "timestamp64". It will be a pure
>> timestamp, meaning that it represents a moment in time (as
>> seconds/ms/us/ns since the epoch), without any timezone information.
>
> Sorry, but I really don't see the usefulness of another time stamping
> format based on POSIX time. Indeed POSIX time is based on a naive
> approximation of UTC and is ambiguous across leap seconds. Quoting
> from Wikipedia <https://en.wikipedia.org/wiki/Unix_time#Leap_seconds>
>
> ...


In a one-on-one discussion with Noam in a pre-community call (which,
ironically, we had time for because we both messed up the meeting
time-zone change) we reached the conclusion that the request is to
clarify whether NumPy's datetime64 represents TAI time [0] or POSIX
time, with a preference for TAI time. The documentation mentions POSIX
time [1]. As Stefano points out, there is a difference of tens of
seconds (37 s since 2017) between POSIX (or Unix) time and TAI time.
In practice numpy simply stores an int64 value to represent the
datetime64, and relies on others to convert it. The leap second might
be getting lost in the conversions. So it might make sense to clarify
exactly how those conversions deal with leap seconds and choose which
one we mean when we use datetime64. Noam, please correct me if I am
mistaken.
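A small sketch of what "stores an int64" looks like in practice, assuming a seconds-resolution datetime64. It only illustrates the stored integer and is not a statement about which timescale NumPy intends.

```python
import numpy as np

t = np.datetime64("2017-01-01T00:00:00", "s")

# The underlying storage is a 64-bit integer count of units since 1970-01-01.
raw_seconds = int(t.astype("int64"))

# The same instant expressed as whole days since the epoch.
raw_days = int(np.datetime64("2017-01-01", "D").astype("int64"))

# Every day contributes exactly 86400 to the count, so no leap seconds
# are included in the stored value.
print(raw_seconds, raw_days * 86400)
```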


Matti


[0] https://en.wikipedia.org/wiki/International_Atomic_Time

[1]
https://numpy.org/doc/stable/reference/arrays.datetime.html#datetime-units


Re: Proposal: add the timestamp64 type (Noam Yorav-Raphael)

Noam Yorav-Raphael
Hi Matti and Stefano,

My understanding is that datetime64 was decided to be neither TAI nor posix time, but rather to represent an abstract calendar point, like datetime.datetime without a specified timezone. This can usually be converted into posix time given a timezone (although in the "repeated" hour between DST and winter time there will be ambiguity!). If it is agreed by all users that a datetime64 represents the time in UTC, it is the same as posix time.
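To illustrate why an abstract calendar point needs a timezone before it becomes posix time, here is a minimal standard-library sketch. The date and the fixed UTC+2 offset are arbitrary choices for the example.

```python
from datetime import datetime, timedelta, timezone

# A calendar point with no timezone attached.
naive = datetime(2020, 11, 12, 18, 0, 0)

# The same wall-clock reading, interpreted in two different zones.
as_utc = naive.replace(tzinfo=timezone.utc).timestamp()
as_utc_plus_2 = naive.replace(tzinfo=timezone(timedelta(hours=2))).timestamp()

# 18:00 at UTC+2 is 16:00 UTC, so the two posix times are two hours apart.
print(as_utc - as_utc_plus_2)
```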

I would like to have a type that is defined to be equivalent to posix time. I don't agree with Stefano: I think that posix time is very useful (as its ubiquity shows), and I think that a type that is defined to be posix time would also be very useful. I think that posix time is well suited for the vast majority of use cases. Indeed, there are use cases where you should take leap seconds into account, but those are rare. In practice, a leap second would be presented by the OS as a second that actually takes more than a second. This actually happens all the time without leap seconds: when your computer automatically syncs with ntp, it adjusts the time continuously, so applications will not experience "time bumps". If you want to make sure that the intervals you measure are correct, you should use something like time.monotonic().
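A minimal sketch of measuring an interval with time.monotonic(); the 10 ms sleep just stands in for whatever work is being timed.

```python
import time

start = time.monotonic()
time.sleep(0.01)  # placeholder for the work being timed
elapsed = time.monotonic() - start

# The monotonic clock never goes backwards, even if the wall clock is
# adjusted while the measurement is running.
print(f"elapsed: {elapsed:.3f} s")
```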

So, most users are not interested in very precise time measurements, but rather in knowing what happened before what, and roughly when. For this, posix time is great - it's very simple, and does the job. In some cases you need to take into account leap seconds, but in those cases, just using the computer clock will not give you the precision you need no matter what - so you'll need specialized software anyway.

I think that posix time is great, and since it's very easy to make wrong decisions that seem to work until you discover they don't (such as discovering too late that local time won't work when you are not sure of the time zone, or when you switch from DST to winter time), a sane and simple default is important.

Cheers,
Noam






Re: Proposal: add the timestamp64 type (Noam Yorav-Raphael)

Daniele Nicolodi
In reply to this post by mattip
On 12/11/2020 17:40, Matti Picus wrote:

> In a one-on-one discussion with Noam in a pre-community call (which,
> ironically, we had time for because we both messed up the meeting
> time-zone change) we reached the conclusion that the request is to
> clarify whether NumPy's datetime64 represents TAI time [0] or POSIX
> time, with a preference for TAI time. The documentation mentions POSIX
> time [1]. As Stefano points out, there is a difference of tens of
> seconds (37 s since 2017) between POSIX (or Unix) time and TAI time.
> In practice numpy simply stores an int64 value to represent the
> datetime64, and relies on others to convert it. The leap second might
> be getting lost in the conversions. So it might make sense to clarify
> exactly how those conversions deal with leap seconds and choose which
> one we mean when we use datetime64. Noam, please correct me if I am
> mistaken.

Unix time is a representation of the UTC timescale that counts
one-second intervals starting from a defined epoch. It deals with leap
seconds by either skipping one interval (which has never happened so
far) or repeating an interval, so that two moments in time that are
separated by one second on the UTC timescale (for example 2016-12-31
23:59:59 and 2016-12-31 23:59:60) are represented in the same way, and
thus the conversion from Unix time to UTC is ambiguous during this one
second. This has happened 27 times since 1972.

This comes with the nice property that minutes, hours and days always
have the same duration (in Unix time), so converting from the Unix
time representation to a date and time and vice versa is fairly easy.
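A sketch of how simple that conversion is when every day is exactly 86400 seconds long. The helper name `unix_to_utc` is made up for this example and it handles whole seconds only.

```python
from datetime import date, timedelta

SECONDS_PER_DAY = 86400  # constant in Unix time, leap seconds or not


def unix_to_utc(ts):
    """Split a Unix time (whole seconds) into (date, hour, minute, second)."""
    days, rem = divmod(ts, SECONDS_PER_DAY)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return date(1970, 1, 1) + timedelta(days=days), hours, minutes, seconds


last_second_of_2016 = unix_to_utc(1483228799)
print(last_second_of_2016)
```

No leap-second table is needed at any point, which is exactly the convenience being described.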

The drawbacks are, as seen above, an ambiguity on leap seconds and the
fact that the trivial computation of time intervals does not take
leap seconds into account and thus may be short by a few seconds (any
time interval across 2016-12-31 23:59:59 is off by at least one second
if computed by simply subtracting Unix times).
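A small standard-library sketch of the interval problem around the leap second inserted at the end of 2016. The constant `LEAP_SECONDS_IN_BETWEEN` is supplied by hand here; nothing in the Unix times themselves carries that information.

```python
from datetime import datetime, timezone

before = datetime(2016, 12, 31, 23, 59, 58, tzinfo=timezone.utc)
after = datetime(2017, 1, 1, 0, 0, 1, tzinfo=timezone.utc)

# Plain subtraction of Unix times.
naive_interval = after.timestamp() - before.timestamp()

# One leap second (23:59:60) was inserted between the two instants, so the
# physically elapsed time is one second longer than the naive result.
LEAP_SECONDS_IN_BETWEEN = 1
elapsed_si_seconds = naive_interval + LEAP_SECONDS_IN_BETWEEN

print(naive_interval, elapsed_si_seconds)
```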

I don't think these two drawbacks are important for Numpy (or any other
general purpose library). As things stand, it is not even possible, in
Python, with or without Numpy, to create a datetime or datetime64 object
from the time "2016-12-31 23:59:60" (neither accepts the existence of a
minute with 61 seconds) thus the ambiguity issue is not an issue in
practice. The time interval issue may matter for some applications, but
the ones affected are aware of the issue and have means to deal with it
(the most common one being taking a day off on the days leap seconds are
introduced).
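A quick way to check the claim about the leap second not being representable; the helper `accepts` is made up for this example, and it assumes both constructors signal the problem with a ValueError.

```python
from datetime import datetime

import numpy as np


def accepts(constructor, *args):
    """Return True if constructor(*args) succeeds, False on ValueError."""
    try:
        constructor(*args)
    except ValueError:
        return False
    return True


# Both constructors reject a seconds field of 60.
stdlib_ok = accepts(datetime, 2016, 12, 31, 23, 59, 60)
numpy_ok = accepts(np.datetime64, "2016-12-31T23:59:60")

print(stdlib_ok, numpy_ok)
```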

I think documenting that datetime64 is a representation of fixed time
intervals since a conventional epoch, neglecting leap seconds, is easy
to explain and implement and allows for easy interoperability with the
rest of the world.

What advantage would making datetime64 explicitly a representation of
TAI bring?

One disadvantage would be that `np.datetime64(datetime.now())` would be
harder to support, as we would be trying to match a point in time on
the UTC time scale to a point in time on the TAI time scale. This is
trivial for past times (we just need to adjust for the right offset)
but it is impossible to do correctly for dates in the future because
we cannot predict future leap second insertions. This would, for
example, make timestamp conversions not reproducible across
announcements of leap second insertions.
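A rough sketch of what such a UTC to TAI conversion involves. The table `_TAI_MINUS_UTC` is deliberately truncated to the three most recent offsets and `utc_to_tai` is a made-up helper, not an existing NumPy or Python API.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# (UTC instant from which the offset applies, TAI - UTC in seconds).
# Truncated for illustration: a real table goes back to 1972 and has to be
# updated whenever a new leap second is announced.
_TAI_MINUS_UTC = [
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
    (datetime(2017, 1, 1), 37),
]


def utc_to_tai(utc):
    """Shift a naive UTC datetime onto the TAI scale using the table above."""
    starts = [start for start, _ in _TAI_MINUS_UTC]
    index = bisect_right(starts, utc) - 1
    if index < 0:
        raise ValueError("instant precedes the truncated example table")
    return utc + timedelta(seconds=_TAI_MINUS_UTC[index][1])


past = utc_to_tai(datetime(2016, 6, 1))
# For a future date the last known offset is used; the result would change
# if a leap second were announced and the table extended.
future = utc_to_tai(datetime(2030, 1, 1))
print(past, future)
```

The second call is the reproducibility problem in miniature: the same input gives a different answer depending on which version of the table is installed.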

Cheers,
Dan