number datetime64 dtypes

number datetime64 dtypes

Mark Mikofski-2
Hi,

Thank you for your time.

A colleague asked me about creating a range of numpy datetime64 at 15-day increments.

This works:
np.arange(np.datetime64('2008-04-01'), np.datetime64('2020-09-01'), np.timedelta64(15, 'D'))
but they also showed me this, which gives a surprising result:

np.arange(np.datetime64('2008-04-01'), np.datetime64('2020-09-01'), dtype="datetime64[15D]")
Out[50]:
array(['2008-03-27', '2008-04-11', '2008-04-26', '2008-05-11',
       '2008-05-26', '2008-06-10', '2008-06-25', '2008-07-10',
...
       '2020-05-23', '2020-06-07', '2020-06-22', '2020-07-07',
       '2020-07-22', '2020-08-06'], dtype='datetime64[15D]')


See how the 1st day is March 27th?
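
A minimal sketch that puts the two calls side by side, assuming only that NumPy is installed; the variable names are illustrative. Passing a `timedelta64` step keeps the start date, while the `datetime64[15D]` dtype converts the start to the 15-day unit first:

```python
import numpy as np

start = np.datetime64('2008-04-01')
stop = np.datetime64('2020-09-01')

# Step given as a timedelta: values stay in day units and begin at `start`.
by_step = np.arange(start, stop, np.timedelta64(15, 'D'))

# Unit given through the dtype: `start` is converted to datetime64[15D] first.
by_dtype = np.arange(start, stop, dtype='datetime64[15D]')

print(by_step[0], by_step.dtype)    # first element is 2008-04-01
print(by_dtype[0], by_dtype.dtype)  # first element is 2008-03-27
```

Both arrays advance in 15-day steps; only the anchor of the sequence differs.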

I couldn't find a reference to this dtype ("datetime64[15D]") in the NumPy docs, but I think it's a common pattern in pandas to put a number in front of a frequency to get a multiple of it; for example, "5T" is 5 minutes.

The datetimes & timedeltas doc page has an example of arange with a dtype, but there the unit is 1 day, "datetime64[D]".

Is this the intended outcome? Or is it a side effect?

I wonder if others have tried to adapt pandas patterns to NumPy datetimes, and if it's an issue for anyone else.

I've advised my colleague not to use NumPy datetimes like this, on the assumption (based on the docs) that pandas-style offsets do not translate to NumPy datetimes.

thanks!

--
Mark Mikofski, PhD (2005)
Fiat Lux

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion
Re: number datetime64 dtypes

Eric Wieser
It's interesting to confirm that people are aware of this syntax!

This is intended but perhaps not useful behavior.

`datetime64[15D]` is a type that stores dates as a whole number of 15-day steps from the Unix epoch, so each date is rounded down to a multiple of 15 days.
Arguably there isn't a situation where using `15D` makes a whole lot of sense, but the generalization is useful - `datetime64[15m]` stores times rounded down to the quarter hour, which is somewhat sensible.
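
A small check of the quarter-hour case, as a sketch rather than a statement of the documented contract. The two timestamps are made up for illustration. For these post-epoch values the conversion appears to round down to the earlier multiple rather than to the closest one, and the stored integer is simply the number of 15-minute steps since the epoch:

```python
import numpy as np

# Hypothetical timestamps chosen for illustration.
results = {}
for text in ('2020-09-10T18:43', '2020-09-10T18:59'):
    t = np.datetime64(text, 'm')
    snapped = t.astype('datetime64[15m]')

    # Underlying integers: minutes since the epoch vs. 15-minute steps.
    minutes = int(t.astype(np.int64))
    steps = int(snapped.astype(np.int64))
    assert steps == minutes // 15

    # Convert back to minute units to see which instant is stored.
    results[text] = snapped.astype('datetime64[m]')
    print(text, '->', results[text])
```

Note that 18:59 comes back as 18:45, not 19:00.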

Perhaps we should have added support for a custom epoch, which would make your problem go away...
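
Without custom epoch support, one possible workaround is to do the snapping by hand against an origin of your choosing. The helper `snap_to_grid` below is hypothetical (it is not a NumPy function) and the sample dates are invented; it is a sketch that relies only on `datetime64`/`timedelta64` arithmetic:

```python
import numpy as np

def snap_to_grid(dates, origin, step):
    """Round each date down to origin + k * step (hypothetical helper)."""
    # timedelta64 // timedelta64 gives the integer number of whole steps.
    offsets = (dates - origin) // step
    return origin + offsets * step

origin = np.datetime64('2008-04-01')   # plays the role of a custom epoch
step = np.timedelta64(15, 'D')
dates = np.array(['2008-04-01', '2008-04-10', '2008-04-16', '2008-05-02'],
                 dtype='datetime64[D]')

snapped = snap_to_grid(dates, origin, step)
print(snapped)
# ['2008-04-01' '2008-04-01' '2008-04-16' '2008-05-01']
```

The result keeps the plain `datetime64[D]` dtype, so the grid is anchored at `origin` instead of at 1970-01-01.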

Re: number datetime64 dtypes

Mark Mikofski-2
Hi Eric,

Thank you so much for your answer!

That explains this interesting behavior:
>>> [np.datetime64('2008-04-01', f'{x}D') for x in range(1, 16)]
[numpy.datetime64('2008-04-01'),
 numpy.datetime64('2008-04-01','2D'),
 numpy.datetime64('2008-03-30','3D'),
 numpy.datetime64('2008-03-30','4D'),
 numpy.datetime64('2008-04-01','5D'),
 numpy.datetime64('2008-03-30','6D'),
 numpy.datetime64('2008-03-27','7D'),
 numpy.datetime64('2008-03-30','8D'),
 numpy.datetime64('2008-03-30','9D'),
 numpy.datetime64('2008-04-01','10D'),
 numpy.datetime64('2008-04-01','11D'),
 numpy.datetime64('2008-03-30','12D'),
 numpy.datetime64('2008-03-24','13D'),
 numpy.datetime64('2008-03-20','14D'),
 numpy.datetime64('2008-03-27','15D')]
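
The pattern in this list can be checked with integer arithmetic. 2008-04-01 is day 13970 since the Unix epoch, and 13970 = 2 × 5 × 11 × 127, which is why the 2D, 5D, 10D and 11D units leave the date unchanged. A sketch, assuming the round-down behavior described above; the variable names are illustrative:

```python
import numpy as np

day = np.datetime64('2008-04-01', 'D')
days_since_epoch = int(day.astype(np.int64))  # 13970

lost = {}
for x in range(1, 16):
    snapped = np.datetime64('2008-04-01', f'{x}D')
    back = snapped.astype('datetime64[D]')
    # Days dropped by the conversion to the x-day unit.
    lost[x] = int((day - back) / np.timedelta64(1, 'D'))
    # Matches the remainder of the day count divided by x.
    assert lost[x] == days_since_epoch % x
    print(f'{x:>2}D  {back}  lost {lost[x]} day(s)')
```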
Is there something to be done? Perhaps a change to the documentation? I'm willing to open a PR with your notes from this thread.
Thanks!


