

Hi All,
I've been using numpy array objects to store collections of 2D (and soon ND) variables. When iterating through these collections, I often found it useful to use `ndindex`, which in `for` loops behaves much like `range` with only a `stop` parameter.
That said, a few features that are now present in `range` are missing from `ndindex`, most notably the ability to iterate over a subset of the indices.
I found myself often writing `itertools.product(range(1, data.shape[0]), range(3, data.shape[2]))` for custom iterations. While it does flatten out the for loop, it is arguably less readable than having 1 or 2 levels of nested for loops.
It is quite possible that `nditer` would solve my problems, but unfortunately I am still not able to make sense of the numerous options it has.
I propose an `ndrange` class that can be used to iterate over ndcollections mimicking the API of `range` as much as possible and adapting it to the ND case (i.e. returning tuples instead of singletons).
Since this is an enhancement proposal, I am bringing the idea to the mailing list for reactions.
The implementation in this PR https://github.com/numpy/numpy/pull/12094 is based on keeping track of a tuple of python `range` objects. The `__iter__` method returns the result of `itertools.product(*self._ranges)`.
By leveraging python's `range` implementation, operations like containment, `index`, `reversed`, equality, and most importantly slicing of the ndrange object are possible to offer to the general numpy audience.
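To make the mechanism concrete, here is a minimal, purely illustrative sketch of that idea (not the code from the PR): a tuple of `range` objects gives iteration, containment, slicing, and equality almost for free. Only slice keys are handled in `__getitem__` to keep the sketch short.

```python
import itertools


class ndrange:
    """Illustrative sketch only -- not the code from the numpy PR.

    Wraps a tuple of python range objects; iteration, containment,
    slicing, and equality all delegate to them.
    """

    def __init__(self, shape):
        self._ranges = tuple(range(s) for s in shape)

    def __iter__(self):
        # Yields index tuples in C order, like np.ndindex
        return itertools.product(*self._ranges)

    def __contains__(self, index):
        return (len(index) == len(self._ranges)
                and all(i in r for i, r in zip(index, self._ranges)))

    def __getitem__(self, key):
        # N-D slicing: slicing a python range returns another range
        if not isinstance(key, tuple):
            key = (key,)
        new = ndrange.__new__(ndrange)
        new._ranges = (tuple(r[k] for r, k in zip(self._ranges, key))
                       + self._ranges[len(key):])
        return new

    def __eq__(self, other):
        return (isinstance(other, ndrange)
                and self._ranges == other._ranges)
```

With this, slicing such as `ndrange((4, 4))[:, 1:-1]` simply slices each underlying `range`, which is where the `range`-like behaviour comes from.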
For example, iterating through a 2D collection but avoiding indexing the first and last column used to look like this:
```
c = np.empty((4, 4), dtype=object)
# ... compute on c
for j in range(c.shape[0]):
    for i in range(1, c.shape[1] - 1):
        c[j, i]  # = compute on c[j, i] that depends on the index i, j
```
With `np.ndrange` it can look something like this:
```
c = np.empty((4, 4), dtype=object)
# ... compute on c
for i in np.ndrange(c.shape)[:, 1:-1]:
    c[i]  # = some operation on c[i] that depends on the index i
```
very pythonic, very familiar to numpy users
Thank you for the feedback,
Mark
_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion


On 10/07/2018 10:32 AM, Mark Harfouche wrote:
> With `np.ndrange` it can look something like this:
>
> ```
> c = np.empty((4, 4), dtype=object)
> # ... compute on c
> for i in np.ndrange(c.shape)[:, 1:-1]:
>     c[i]  # = some operation on c[i] that depends on the index i
> ```
>
> very pythonic, very familiar to numpy users
So if I understand, this does the same as `np.ndindex` but allows
numpy-like slicing of the returned generator object, as requested in #6393.
I don't like the duplication in functionality between ndindex and
ndrange here. Better to add the slicing functionality to ndindex
than create a whole new nearly-identical function. np.ndindex is
already a somewhat obscure and discouraged method since it is usually
better to find a vectorized numpy operation instead of a for loop, and I
don't like adding more obscure functions.
But as an improvement to np.ndindex, I think adding this functionality
seems good if it can be nicely implemented. Maybe there is a way to use
the same optimization tricks as in the current implementation of ndindex
but allow different stop/step? A simple wrapper of ndindex?
Cheers,
Allan


Allan,
Sorry for the delay. I had my mailing list preferences set to digest. I changed them for now. (I hope this message continues that thread).
Thank you for your feedback. You are correct in identifying that the real feature is expanding the `ndindex` API to support slicing. See comments on the separate points you raised below.
## Expanding the API of ndindex
> Better to add the slicing functionality to ndindex, than create a whole new nearly-identical function.
I ran into 2 issues:
1. Getting around the catch-all positional argument is annoying, and the logic to do that will likely be error prone. Peculiarities in how we implement it might cause some very strange behaviour for `tuple-like` inputs that we don't expect.
2. `ndindex` is an iterator itself. As proposed, `ndrange`, like `range`, is not an iterator. Changing this behaviour would likely lead to breaking code that uses that assumption, for example anybody using introspection, or code like:
```
indx = np.ndindex(5, 5)
next(indx)  # Don't look at the (0, 0) coordinate
for i in indx:
    print(i)
```
would break if `ndindex` becomes "not an iterator".
For these two reasons, I thought it was easier to simply have a new class that seems like a close sibling to `ndindex`.
I personally don't care about point 1 so much. In my mind, start, stop, and step are confusing in ND, but maybe some might find them useful? Point 1 also makes it harder to make `ndrange` more familiar to `range` users.
> I don't like adding more obscure functions
Hopefully the name `ndrange` makes it easier to find?
## Writing vectorized code
> np.ndindex is already a somewhat obscure and discouraged method since it is usually better to find a vectorized numpy operation instead of a for loop
I understand that this kind of function is not focused on numerical operations on the elements of the matrix itself. It really is there to help fill the void of any useful multidimensional python container.
I think `ndrange`/`ndindex` is there to be used like `np.vectorize`. I've tried to use `np.vectorize` in my own code, but quickly found that making logic fit into vectorize's requirements was often more complicated than writing my own multi-nested loops. In my opinion, nested `range` loops or `ndrange`/`ndindex` are a much more natural way to loop over collections compared to `np.vectorize`.
I'm glad to add warnings to the docs.
## Implementation detail: itertools.product + range vs nditer
> Maybe there is a way to use the same optimization tricks as in the current implementation of ndindex but allow different stop/step?
My primary goal here is to make `ndrange` behave much like `range`. By implementing it on top of `range`, it makes it obvious to me how to enforce that behaviour as the API of range gets expanded (though it seems to have settled since Python 3.3). Whatever we decide to call `ndrange`/`ndindex`, the tests I wrote can help ensure we have good range-API coverage (for now).
itertools.product + range seems to be much faster than the current implementation of ndindex (python 3.6):
```
%%timeit
for i in np.ndindex(100, 100):
    pass
3.94 ms ± 19.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
import itertools
for i in itertools.product(range(100), range(100)):
    pass
231 µs ± 1.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
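For anyone outside IPython, a comparison in the same spirit can be run with the stdlib `timeit` module. This sketch compares the flattened `itertools.product` loop against plain nested loops (absolute timings will vary by machine; `np.ndindex` is left out so the snippet has no numpy dependency):

```python
import itertools
import timeit


def product_loop():
    # Flattened iteration over a 100x100 index space
    for i in itertools.product(range(100), range(100)):
        pass


def nested_loop():
    # The equivalent 2-level nested loop
    for j in range(100):
        for i in range(100):
            pass


t_product = timeit.timeit(product_loop, number=100)
t_nested = timeit.timeit(nested_loop, number=100)
print(f"product: {t_product:.4f}s, nested: {t_nested:.4f}s")
```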


On 10/8/18 12:21 PM, Mark Harfouche wrote:
> 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like
> `range`, is not an iterator. Changing this behaviour would likely lead
> to breaking code that uses that assumption. For example anybody using
> introspection or code like:
>
> ```
> indx = np.ndindex(5, 5)
> next(indx) # Don't look at the (0, 0) coordinate
> for i in indx:
>     print(i)
> ```
> would break if `ndindex` becomes "not an iterator"
OK, I see now. Just like python3 has separate range and range_iterator
types, where range is sliceable, we would have separate ndrange and
ndindex types, where ndrange is sliceable. You're just copying the
python3 api. That justifies it pretty well for me.
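The python3 analogy can be checked directly in the interpreter: `range` is a sliceable, reversible sequence, while the separate `range_iterator` type is a one-shot object:

```python
# range is a sliceable, reversible sequence...
r = range(10)
assert r[2:5] == range(2, 5)               # slicing returns another range
assert list(reversed(r))[:3] == [9, 8, 7]  # reversal is supported
assert 7 in r                              # so is containment

# ...while the separate range_iterator type is a one-shot consumer:
r_iter = iter(r)
assert type(r_iter).__name__ == 'range_iterator'
assert next(r_iter) == 0
assert not hasattr(r_iter, '__getitem__')  # no slicing on the iterator
```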
I still think we shouldn't have two functions which do nearly the same
thing. We should only have one, and get rid of the other. I see two ways
forward:
* replace ndindex by your ndrange code, so it is no longer an iter. This would require some deprecation cycles for the cases that break.
* deprecate ndindex in favor of a new function ndrange. We would keep ndindex around for back-compatibility, with a dep warning to use ndrange instead.
Doing a code search on github, I can see that a lot of people's code
would break if ndindex no longer was an iter. I also like the name
ndrange for its allusion to python3's range behavior. That makes me lean
towards the second option of a separate ndrange, with possible
deprecation of ndindex.
> itertools.product + range seems to be much faster than the current
> implementation of ndindex
>
> (python 3.6)
> ```
> %%timeit
> for i in np.ndindex(100, 100):
>     pass
> 3.94 ms ± 19.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>
> %%timeit
> import itertools
> for i in itertools.product(range(100), range(100)):
>     pass
> 231 µs ± 1.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
> ```
If the new code ends up faster than the old code, that's great, and
further justification for using ndrange instead of ndindex. I had
thought using nditer in the old code was fastest.
So as far as I am concerned, I say go ahead with the PR the way you are
doing it.
Allan


I'm open to adding ndrange, and "soft-deprecating" ndindex (i.e., discouraging its use in our docs, but not actually deprecating it). Certainly ndrange seems like a small but meaningful improvement in the interface.
That said, I'm not convinced this is really worth the trouble. I think the nested loop is still pretty readable/clear, and there are few times when I've actually found ndindex() to be useful.


Since ndrange is a superset of the features of ndindex, we can implement ndindex with ndrange or keep it as is. ndindex is now a glorified `nditer` object anyway, so it isn't much of a maintenance burden.
As for how ndindex is implemented, I'm a little worried about python 2 performance, seeing as range is a list there.
I would wait on changing the way ndindex is implemented for now.
I agree with Stephan that ndindex should be kept in. Many want backward-compatible code. It would be hard for me to justify why a dependency should be bumped up to bleeding edge numpy just for a convenience iterator.
Honestly, I was really surprised to see such a speed difference; I thought it would have been closer.
Allan, I decided to run a few more benchmarks; nditer just seems slow for single array access for some reason. Maybe a bug?
```
import numpy as np
import itertools

a = np.ones((1000, 1000))

b = {}
for i in np.ndindex(a.shape):
    b[i] = i

%%timeit
# op_flag=('readonly',) doesn't change performance
for a_value in np.nditer(a):
    pass
109 ms ± 921 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    a_value = a[i]
113 ms ± 1.72 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    c = b[i]
193 ms ± 3.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
for a_value in a.flat:
    pass
25.3 ms ± 278 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
for k, v in b.items():
    pass
19.9 ms ± 675 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    pass
28 ms ± 715 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```
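As a sanity check (correctness, not speed), the access patterns being timed above do visit the same elements in the same C order; the sketch below demonstrates this on a small array:

```python
import itertools

import numpy as np

a = np.arange(12.0).reshape(3, 4)

# 1. nditer yields 0-d arrays for each element, in C order
nditer_vals = [float(v) for v in np.nditer(a)]

# 2. itertools.product over ranges yields index tuples; each a[i] is a
#    python-level indexing operation, which is where the time goes
product_vals = [float(a[i])
                for i in itertools.product(range(3), range(4))]

# 3. a.flat is a fast flat iterator over the same elements
flat_vals = [float(v) for v in a.flat]

assert nditer_vals == product_vals == flat_vals
```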


The speed difference is interesting but really a different question than the public API.
I'm coming around to ndrange(). I can see how it could be useful for symbolic manipulation of arrays and indexing operations, similar to what we do in dask and xarray.


One thing that worries me here: in python, range(...) in essence generates a lazy list, so I'd expect ndrange to generate a lazy ndarray. In practice, that means it would be a duck-type defining an __array__ method to evaluate it, and only implement methods already present in numpy.
It's not clear to me what the datatype of such an array-like would be. Candidates I can think of are:
- [('i0', intp), ('i1', intp), ...], but this makes tuple coercion a little awkward
- (intp, (N,)), which collapses into a shape + (N,) array
- object_
- Some new np.tuple_ dtype, a heterogeneous tuple, which is like the structured np.void but without field names. I'm not sure how vectorized element indexing would be spelt though.
Eric


Eric,
Great point. The multidimensional slicing and sequence return type is definitely strange. I was thinking about that last night. I'm a little new to the __array__ methods. Are you saying that the sequence behaviour would stay the same (i.e. __iter__, __reversed__, __contains__), but np.asarray(np.ndrange((3, 3))) would return something like an array of tuples? I'm not sure this is something that anybody can't already do with meshgrid + stack.
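For reference, the meshgrid + stack construction mentioned here could look something like the following sketch; unlike a lazy ndrange, it materializes the full index array up front:

```python
import numpy as np

shape = (3, 3)
# indexing='ij' keeps axis order consistent with np.ndindex (C order)
idx = np.stack(np.meshgrid(*(np.arange(s) for s in shape),
                           indexing='ij'), axis=-1)

assert idx.shape == (3, 3, 2)        # shape + (ndim,)
assert tuple(idx[1, 2]) == (1, 2)    # each cell holds its own index
```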
> and only implement methods already present in numpy.
I'm not sure what this means.
I'll note that in Python 3, range is its own thing. It is still a sequence type, but it doesn't support addition. I'm kinda ok with ndrange/ndindex being a sequence type, supporting ND slicing, but not being an array ;)
I’m kinda warming up to the idea of expanding ndindex .
- The additional start and step can be omitted from ndindex for a while (indefinitely?). Slicing is way more convenient anyway.
- Warnings can help people move from ndindex(1, 2, 3) to ndindex((1, 2, 3)).
- ndindex can return a separate iterator, but the ndindex object would hold a reference to it. Calls to ndindex.__next__ would simply return next(of_that_object). Note: this would break introspection since the iterator is no longer of ndindex type. I'm kinda OK with this though, but breaking code is never nice :(
- Benchmarking can help motivate the choice of iterator used for step=(1,) * N, start=(0,) * N.
- Wait until 2019, because I don't want to deal with performance regressions of potentially using range in Python 2, and I don't want this to motivate any implementation details.
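The idea of an ndindex that holds a reference to a separate iterator could be sketched in pure python roughly as follows; the class name and details here are illustrative, not a proposed implementation:

```python
import itertools


class NdIndexLike:
    """Sketch: a re-iterable sequence-like object that also answers
    next() calls by delegating to a private iterator, so existing
    iterator-style callers keep working."""

    def __init__(self, *shape):
        self._ranges = tuple(range(s) for s in shape)
        self._iter = None  # created lazily on the first next() call

    def __iter__(self):
        # A fresh, independent iterator each time, like range
        return itertools.product(*self._ranges)

    def __next__(self):
        # Back-compat: next(obj) consumes the private iterator
        if self._iter is None:
            self._iter = iter(self)
        return next(self._iter)


idx = NdIndexLike(2, 2)
assert next(idx) == (0, 0)       # old iterator-style use still works
assert list(idx) == [(0, 0), (0, 1), (1, 0), (1, 1)]  # fresh pass
```

As noted above, `next(idx)` now advances an object whose type is not `NdIndexLike` itself, which is exactly the introspection break described.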
Mark
One thing that worries me here  in python, range(...) in essence generates a lazy list  so I’d expect ndrange to generate a lazy ndarray . In practice, that means it would be a ducktype defining an __array__ method to evaluate it, and only implement methods already present in numpy.
It’s not clear to me what the datatype of such an arraylike would be. Candidates I can think of are:
[('i0', intp), ('i1', intp), ...] , but this makes tuple coercion a little awkward
(intp, (N,))  which collapses into a shape + (3,) array
object_ .
 Some new
np.tuple_ dtype, a heterogenous tuple, which is like the structured np.void but without field names. I’m not sure how vectorized element indexing would be spelt though.
Eric
The speed difference is interesting but really a different question than the public API.
I'm coming around to ndrange(). I can see how it could be useful for symbolic manipulation of arrays and indexing operations, similar to what we do in dask and xarray.
since ndrange is a superset of the features of ndindex, we can implement ndindex with ndrange or keep it as is. ndindex is now a glorified `nditer` object anyway. So it isn't so much of a maintenance burden.
As for how ndindex is implemented, I'm a little worried about python 2 performance seeing as range is a list.
I would wait on changing the way ndindex is implemented for now.
I agree with Stephan that ndindex should be kept in. Many want backward compatible code. It would be hard for me to justify why a dependency should be bumped up to bleeding edge numpy just for a convenience iterator.
Honestly, I was really surprised to see such a speed difference, I thought it would have been closer.
Allan, I decided to run a few more benchmarks, the nditer just seems slow for single array access some reason. Maybe a bug?
```
import numpy as np
import itertools
a = np.ones((1000, 1000))

b = {}
for i in np.ndindex(a.shape):
    b[i] = i

%%timeit
# op_flag=('readonly',) doesn't change performance
for a_value in np.nditer(a):
    pass
109 ms ± 921 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    a_value = a[i]
113 ms ± 1.72 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    c = b[i]
193 ms ± 3.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
for a_value in a.flat:
    pass
25.3 ms ± 278 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
for k, v in b.items():
    pass
19.9 ms ± 675 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    pass
28 ms ± 715 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```
I'm open to adding ndrange and "soft-deprecating" ndindex (i.e., discouraging its use in our docs, but not actually deprecating it). Certainly ndrange seems like a small but meaningful improvement in the interface.
That said, I'm not convinced this is really worth the trouble. I think the nested loop is still pretty readable/clear, and there are few times when I've actually found ndindex() to be useful.
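For concreteness, the two spellings being compared are equivalent; a minimal illustration (my own, not from the thread):

```python
import numpy as np

a = np.empty((3, 4))

# nested for-loops over each axis
nested = [(j, i) for j in range(a.shape[0]) for i in range(a.shape[1])]

# the flat ndindex spelling
flat = list(np.ndindex(*a.shape))

# same indices, same (C) order
assert nested == flat
```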
On 10/8/18 12:21 PM, Mark Harfouche wrote:
> 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like
> `range`, is not an iterator. Changing this behaviour would likely lead
> to breaking code that uses that assumption. For example anybody using
> introspection or code like:
>
> ```
> indx = np.ndindex(5, 5)
> next(indx) # Don't look at the (0, 0) coordinate
> for i in indx:
>     print(i)
> ```
> would break if `ndindex` becomes "not an iterator"
OK, I see now. Just like python3 has separate range and range_iterator
types, where range is sliceable, we would have separate ndrange and
ndindex types, where ndrange is sliceable. You're just copying the
python3 api. That justifies it pretty well for me.
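A quick illustration of the python3 analogy drawn here (a sketch of standard-library behavior): `range` is a sliceable, reusable sequence, while its iterator is one-shot, mirroring the proposed ndrange/ndindex split.

```python
r = range(5)

# range: sliceable, reusable, O(1) containment
assert r[1:4] == range(1, 4)
assert 3 in r

# range_iterator: one-shot, not sliceable
it = iter(r)
assert type(it).__name__ == 'range_iterator'
assert next(it) == 0            # consumes the first element
assert list(it) == [1, 2, 3, 4]  # the rest; `it` is now exhausted
```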
I still think we shouldn't have two functions which do nearly the same
thing. We should only have one, and get rid of the other. I see two ways
forward:
* replace ndindex by your ndrange code, so it is no longer an iter.
This would require some deprecation cycles for the cases that break.
* deprecate ndindex in favor of a new function ndrange. We would keep
ndindex around for backwards compatibility, with a deprecation warning
to use ndrange instead.
Doing a code search on github, I can see that a lot of people's code
would break if ndindex no longer was an iter. I also like the name
ndrange for its allusion to python3's range behavior. That makes me lean
towards the second option of a separate ndrange, with possible
deprecation of ndindex.
> itertools.product + range seems to be much faster than the current
> implementation of ndindex
>
> (python 3.6)
> ```
> %%timeit
>
> for i in np.ndindex(100, 100):
>     pass
> 3.94 ms ± 19.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>
> %%timeit
> import itertools
> for i in itertools.product(range(100), range(100)):
>     pass
> 231 µs ± 1.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
> ```
If the new code ends up faster than the old code, that's great, and
further justification for using ndrange instead of ndindex. I had
thought using nditer in the old code was fastest.
So as far as I am concerned, I say go ahead with the PR the way you are
doing it.
Allan
_______________________________________________
NumPyDiscussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpydiscussion


One thing that worries me here: in Python, range(...) in essence generates a lazy list, so I'd expect ndrange to generate a lazy ndarray. In practice, that means it would be a duck type defining an __array__ method to evaluate it, and only implement methods already present in numpy.
It's not clear to me what the datatype of such an array-like would be. Candidates I can think of are:
- [('i0', intp), ('i1', intp), ...], but this makes tuple coercion a little awkward
I think this would be the appropriate choice. What about it makes tuple coercion awkward? If you use this as the dtype, you both set and get elements as tuples.
In particular, I would say that ndrange() should be a lazy equivalent to the following explicit constructor:
```
def ndrange(shape):
    dtype = [('i' + str(i), np.intp) for i in range(len(shape))]
    array = np.empty(shape, dtype)
    for indices in np.ndindex(*shape):
        array[indices] = indices
    return array

>>> ndrange((2,))
array([(0,), (1,)], dtype=[('i0', '<i8')])
>>> ndrange((2, 3))
array([[(0, 0), (0, 1), (0, 2)],
       [(1, 0), (1, 1), (1, 2)]], dtype=[('i0', '<i8'), ('i1', '<i8')])
```
The one deviation in behavior would be that ndrange() iterates over flattened elements rather than over the first axis.
It is indeed a little awkward to have field names, but given that NumPy creates those automatically when you supply a dtype like 'i8,i8' this is probably a reasonable choice.
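To make the set/get behavior concrete, a small sketch of my own (note that reads come back as np.void structured scalars, which need an explicit tuple() to become tuples):

```python
import numpy as np

dtype = np.dtype([('i0', np.intp), ('i1', np.intp)])
arr = np.empty((2, 3), dtype)
for idx in np.ndindex(2, 3):
    arr[idx] = idx            # assignment from a tuple works directly

elem = arr[1, 2]              # a structured scalar (np.void), not a tuple
assert tuple(elem) == (1, 2)  # explicit coercion recovers the tuple
```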
- (intp, (N,)), which collapses into a shape + (N,) array
- object_
- Some new np.tuple_ dtype, a heterogeneous tuple, which is like the structured np.void but without field names. I'm not sure how vectorized element indexing would be spelt, though.
Eric


On 10/10/18 12:34 AM, Eric Wieser wrote:
> One thing that worries me here: in Python, range(...) in essence
> generates a lazy list, so I'd expect ndrange to generate a lazy
> ndarray. In practice, that means it would be a duck type defining an
> __array__ method to evaluate it, and only implement methods already
> present in numpy.
Isn't that what arange is for?
It seems like there are two uses of python3's range: 1. creating a 1d
iterable of indices for use in for-loops, and 2. creating a sequence of
integers via list(range).
Numpy can extend this in two directions:
* ndrange returns an iterable of nd indices (for for-loops).
* arange returns a 1d ndarray of integers instead of a list.
The application of for-loops, which is more niche, doesn't need
ndarray's vectorized properties, so I'm not convinced it should return
an ndarray. It certainly seems simpler not to return an ndarray, due to
the dtype question.
arange on its own seems to cover the need for a vectorized version of range.
Allan


I'm really open to these kinds of array extensions, but I (personally) just don't know how to do this efficiently. I feel like ogrid and mgrid are probably enough for people that want this kind of feature.
My implementation would just be based on python primitives, which would yield performance similar to:
```
In [2]: %timeit np.arange(1000)
1.25 µs ± 4.01 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [4]: %timeit np.asarray(range(1000))
99.6 µs ± 1.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
Here is how mgrid can be used to return something similar to the indices from ndrange:
```
In [10]: np.mgrid[1:10:3, 2:10:3][:, 1, 1]
Out[10]: array([4, 5])

In [13]: np.ndrange((10, 10))[1::3, 2::3][1, 1]
Out[13]: (4, 5)
```
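To make the tuple-of-ranges design from the top of the thread concrete, here is a minimal, hypothetical sketch of my own (not the PR's actual code); mixed integer/slice indexing is omitted for brevity:

```python
import itertools

class ndrange:
    # Sketch: each axis is a python `range`, iteration is
    # itertools.product, and slicing slices each range.
    def __init__(self, shape):
        self._ranges = tuple(range(n) for n in shape)

    @classmethod
    def _from_ranges(cls, ranges):
        obj = cls(())
        obj._ranges = tuple(ranges)
        return obj

    def __iter__(self):
        # C-order iteration, yielding index tuples
        return itertools.product(*self._ranges)

    def __getitem__(self, key):
        key = key if isinstance(key, tuple) else (key,)
        parts = [r[k] for r, k in zip(self._ranges, key)]
        parts += list(self._ranges[len(key):])
        if all(isinstance(p, range) for p in parts):
            return ndrange._from_ranges(parts)  # all slices: a sub-range
        return tuple(parts)  # all integers: a concrete index tuple

    def __contains__(self, index):
        return (len(index) == len(self._ranges)
                and all(i in r for i, r in zip(index, self._ranges)))

    def __len__(self):
        n = 1
        for r in self._ranges:
            n *= len(r)
        return n

# Reproduces Out[13] above:
assert ndrange((10, 10))[1::3, 2::3][1, 1] == (4, 5)
assert list(ndrange((2, 2))) == [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Because each axis delegates to python's `range`, containment, length, equality, and slicing come almost for free.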


> If you use this as the dtype, you both set and get elements as tuples.
Elements are not retrieved as tuples, but they can be explicitly cast.
> What about it makes tuple coercion awkward?
This explicit cast:
```
>>> dt_ind2d = np.dtype([('i0', np.intp), ('i1', np.intp)])
>>> ind = np.zeros(1, dt_ind2d)[0]
>>> ind, type(ind)
((0, 0), <class 'numpy.void'>)
>>> m[ind]
Traceback (most recent call last):
  File "<pyshell#17>", line 1, in <module>
    m[ind]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
>>> m[tuple(ind)]
1.0
```
On Wed, 10 Oct 2018 at 09:08 Stephan Hoyer shoyer@... wrote:


> Isn't that what arange is for?
Imagining ourselves in python2 land for now: I'm proposing that arange is to range as ndrange is to xrange.
> I'm not convinced it should return an ndarray
I agree. I think it should return a range-like object that:
- Is convertible via __array__ if needed
- Looks like an ndarray, with:
  - a .dtype attribute
  - a __getitem__(Tuple[int]) which returns numpy scalars
  - .ravel() and .flat for choosing iteration order
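A hypothetical sketch of such a range-like duck type (names and details are my own, combining these bullet points with the structured dtype discussed above, not an agreed design):

```python
import numpy as np

class NdRangeLike:
    # Lazy, array-like index range: materializes only via __array__.
    def __init__(self, shape):
        self.shape = tuple(shape)
        self.dtype = np.dtype([('i%d' % k, np.intp)
                               for k in range(len(self.shape))])

    def __array__(self, dtype=None, copy=None):
        # Evaluate on demand into a structured index array.
        arr = np.empty(self.shape, self.dtype)
        for idx in np.ndindex(*self.shape):
            arr[idx] = idx
        return arr if dtype is None else arr.astype(dtype)

    def __getitem__(self, idx):
        # Returns a numpy structured scalar; a real lazy implementation
        # would compute this without materializing the whole array.
        return np.array(self)[idx]

r = NdRangeLike((2, 3))
a = np.asarray(r)               # conversion via __array__
assert a.shape == (2, 3)
assert tuple(r[1, 2]) == (1, 2)
```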
On Wed, 10 Oct 2018 at 11:21 Allan Haldane allanhaldane@... wrote:


Eric, interesting ideas.
> __getitem__(Tuple[int]) which returns numpy scalars
I'm not sure what you mean. Even if you supply a numpy uint8 to range, it still returns the python int class.
Would you like ndrange to return a tuple of `uint8` in this case?
```
In [3]: a = iter(range(np.uint8(10)))

In [4]: next(a).__class__
Out[4]: int

In [5]: np.uint8(10).__class__
Out[5]: numpy.uint8
```
Ravel seems like a cool way to choose iteration order. In the PR, I mentioned two reasons why I removed `'F'` order:
1. My implementation was not competitive with the `C` order implementation in terms of speed (this can be fixed).
2. I don't know if it is something people really need when iterating over collections (annoying to maintain if unused).
Instead, I just showed an example of how people could iterate in `F` order should they need to.
I'm not sure if we ever want the `ndrange` object to return a full matrix. It seems like we would be creating a custom tuple class just for this, which seems pretty niche.
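The kind of F-order workaround alluded to above can be spelled with itertools alone (my own sketch): iterate over the reversed ranges, then un-reverse each emitted tuple.

```python
import itertools

shape = (2, 3)

# C order: the last index varies fastest
c_order = list(itertools.product(*(range(n) for n in shape)))

# F order: the first index varies fastest; reverse the ranges,
# then un-reverse each emitted tuple
f_order = [t[::-1]
           for t in itertools.product(*(range(n) for n in reversed(shape)))]

assert c_order[:3] == [(0, 0), (0, 1), (0, 2)]
assert f_order[:3] == [(0, 0), (1, 0), (0, 1)]
```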


> I'm not sure if we ever want the ndrange object to return a full matrix.
np.array(ndrange(...)) should definitely return a full array, because that's what the user asked for.
> Even if you supply a numpy uint8 to range, it still returns a python int class.
If we want to design ndrange with the intent of indexing only, then it should probably always use np.intp, whatever the type of the provided arguments.
> Would you like ndrange to return a tuple of uint8 in this case?
Tuples are just one of the four options I listed in a previous message. The downside of tuples is there's no easy way to say "take just the first axis of this range". Whatever we pick, the return value should be such that np.array(ndrange(...))[idx] == ndrange(...)[idx].
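This invariant can be checked against the structured-dtype constructor proposed earlier in the thread (a sketch; note it holds only up to the tuple() coercion discussed above):

```python
import numpy as np

def materialized_ndrange(shape):
    # The explicit structured-dtype constructor from earlier in the thread
    dtype = [('i' + str(i), np.intp) for i in range(len(shape))]
    array = np.empty(shape, dtype)
    for indices in np.ndindex(*shape):
        array[indices] = indices
    return array

arr = materialized_ndrange((2, 3))
idx = (1, 2)
# np.array(ndrange(...))[idx] == ndrange(...)[idx], modulo coercion:
assert tuple(arr[idx]) == idx
```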


> If we want to design ndrange with the intent of indexing only
This is the only use I had in mind, but I feel like you are able to envision different use cases.
> Whatever we pick, the return value should be such that np.array(ndrange(...))[idx] == ndrange(...)[idx]
I can see the appeal of this.

