Incremental rewrite of numpy.pad (gh-11358)

Lars Grüter
Dear community,

I started the rewrite of "numpy.pad" [1] some time ago and feel like it
has now left the WIP status. All tests are green and the benchmarks
promise significant improvements for large arrays [2].

However, progress on this PR seems to have stalled somewhat. A
reasonable concern, brought up multiple times, is that introducing
these changes incrementally would be preferable to a single PR. This
was partly possible (see [3]) but, as explained further down, it is not
trivial for the remaining changes.

Concerning the suggestion to split the remaining changes, I proposed
four ways forward (copied from comment [4]):

Option 1: Split this PR into smaller PRs with the requirement to pass
the test suite. I expect this will make things a lot more complicated.
At some points I will have to introduce extra logic just to control the
old and new code paths at the same time. I don't see another way
because the two approaches are so different. I'm really not a fan of
creating functions that work with both approaches if this leads to a
weird architecture in the long run. Also, while this will split the
review burden into smaller pieces, I think it will increase the total
amount of changes to review.

Option 2: Split this PR into smaller PRs without the requirement to pass
the test suite (mark appropriate tests as xfail). Create a new branch in
numpy's repo just for the rewrite. I could then make my PRs against this
branch without master suffering. I'm thinking of first wiping the slate,
introducing shared functions (for all modes) and then incrementally
introducing each "new" mode with a new PR. This would keep the review
burden for each PR and the rewrite overall small and still allow
benchmarking of all combined changes against master.

Option 3: Squash changes in this PR and continue with a new PR to
maintain clarity as this PR has gotten quite big while I was still
making changes and addressing reviews.

Option 4: Continue this PR.

I myself think option 2 is the most elegant approach and the best
compromise between all concerns. However, I can't move forward on my own
as this option requires a maintainer to set up a temporary branch for me
to make PRs against.

So I'd like to kindly ask for your thoughts and guidance.

Best regards,
Lars

[1] https://github.com/numpy/numpy/pull/11358
[2] https://github.com/numpy/numpy/pull/11358#issuecomment-441246090
[3] https://github.com/numpy/numpy/pull/11966
[4] https://github.com/numpy/numpy/pull/11358#issuecomment-441362401