

Hi folks,
I'm working on a livevideo display for some microscope control tools
I'm building. For this, I need a fast histogram function to work on
largeish images (1000x2000 or so) at video rate, with cycles left
over for more interesting calculations (like autofocus).
Now, numpy.histogram is a bit slower than I'd like, probably because
it's pretty general (and of course cf. the recent discussion about its
speed). I just need even bins within a set range. This is easy enough
to do with a Cextension, or perhaps even cython, but before I go
there, I was wondering if there's a numpy function that can help.
Here's what I have in mind:
def histogram(arr, bins, range):
min, max = range
indices = numpy.clip(((arr.astype(float)  min) * bins / (max 
min)).astype(int), 0, bins1)
histogram = numpy.zeros(bins, numpy.uint32)
for i in indices:
histogram[i] += 1
Now, clearly, the last loop is what needs speeding up. Are there any
numpy functions that can do this kind of operation? Also perhaps
unnecessarily slow is the conversion of 'arr' to a float  I do this
to avoid overflow issues with integer arrays.
Any advice? Should I go ahead and write this up in C (easy enough), or
can I do this in numpy? Probably the indicescomputation line I'll
speed up with numexpr, if I use a purenumpy approach.
Thanks,
Zach
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


On Thu, Apr 17, 2008 at 10:02 AM, Zachary Pincus < [hidden email]> wrote:
Hi folks,
I'm working on a livevideo display for some microscope control tools
I'm building. For this, I need a fast histogram function to work on
largeish images (1000x2000 or so) at video rate, with cycles left
over for more interesting calculations (like autofocus).
Now, numpy.histogram is a bit slower than I'd like, probably because
it's pretty general (and of course cf. the recent discussion about its
speed). I just need even bins within a set range. This is easy enough
to do with a Cextension, or perhaps even cython, but before I go
there, I was wondering if there's a numpy function that can help.
Here's what I have in mind:
def histogram(arr, bins, range):
min, max = range
indices = numpy.clip(((arr.astype(float)  min) * bins / (max 
min)).astype(int), 0, bins1)
histogram = numpy.zeros(bins, numpy.uint32)
for i in indices:
histogram[i] += 1
Now, clearly, the last loop is what needs speeding up. Are there any
numpy functions that can do this kind of operation? Also perhaps
unnecessarily slow is the conversion of 'arr' to a float  I do this
to avoid overflow issues with integer arrays.
How about a combination of sort, followed by searchsorted right/left using the bin boundaries as keys? The difference of the two resulting vectors is the bin value. Something like:
In [1]: data = arange(100)
In [2]: bins = [0,10,50,70,100]
In [3]: lind = data.searchsorted(bins)
In [4]: print lind[1:]  lind[:1]
[10 40 20 30]
This won't be as fast as a c implementation, but at least avoids the loop.
Chuck
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


On Thu, Apr 17, 2008 at 6:18 PM, Charles R Harris
< [hidden email]> wrote:
>
>
> On Thu, Apr 17, 2008 at 10:02 AM, Zachary Pincus < [hidden email]>
> wrote:
> > Hi folks,
> >
> > I'm working on a livevideo display for some microscope control tools
> > I'm building. For this, I need a fast histogram function to work on
> > largeish images (1000x2000 or so) at video rate, with cycles left
> > over for more interesting calculations (like autofocus).
> >
> > Now, numpy.histogram is a bit slower than I'd like, probably because
> > it's pretty general (and of course cf. the recent discussion about its
> > speed). I just need even bins within a set range. This is easy enough
> > to do with a Cextension, or perhaps even cython, but before I go
> > there, I was wondering if there's a numpy function that can help.
> >
> > Here's what I have in mind:
> >
> > def histogram(arr, bins, range):
> > min, max = range
> > indices = numpy.clip(((arr.astype(float)  min) * bins / (max 
> > min)).astype(int), 0, bins1)
> > histogram = numpy.zeros(bins, numpy.uint32)
> > for i in indices:
> > histogram[i] += 1
> >
> > Now, clearly, the last loop is what needs speeding up. Are there any
> > numpy functions that can do this kind of operation? Also perhaps
> > unnecessarily slow is the conversion of 'arr' to a float  I do this
> > to avoid overflow issues with integer arrays.
> >
>
> How about a combination of sort, followed by searchsorted right/left using
> the bin boundaries as keys? The difference of the two resulting vectors is
> the bin value. Something like:
>
> In [1]: data = arange(100)
>
> In [2]: bins = [0,10,50,70,100]
>
> In [3]: lind = data.searchsorted(bins)
>
> In [4]: print lind[1:]  lind[:1]
> [10 40 20 30]
>
> This won't be as fast as a c implementation, but at least avoids the loop.
>
> Chuck
How many bits per pixel does your camera actually generate !?
If its for example a 12 bit camera, you could just fill in directly
into 4096 preallocated bins.
You would not need any sorting !!
That's what I did for a 16 bit camera  but I wrote it in C and I had
4 cameras at 30 Hz.
Sebastian Haase
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


Hi,
> How about a combination of sort, followed by searchsorted right/left
> using the bin boundaries as keys? The difference of the two
> resulting vectors is the bin value. Something like:
>
> In [1]: data = arange(100)
>
> In [2]: bins = [0,10,50,70,100]
>
> In [3]: lind = data.searchsorted(bins)
>
> In [4]: print lind[1:]  lind[:1]
> [10 40 20 30]
>
> This won't be as fast as a c implementation, but at least avoids the
> loop.
This is, more or less, what the current numpy.histogram does, no? I
was hoping to avoid the O(n log n) sorting, because the image arrays
are pretty big, and numpy.histogram doesn't get close to video rate
for me...
Perhaps, though, some of the slowdown from numpy.histogram is from
other overhead, and not the sorting. I'll try this, but I think I'll
probably just have to write the c loop...
Zach
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


Hi, and thanks for the suggestion!
> How many bits per pixel does your camera actually generate !?
> If its for example a 12 bit camera, you could just fill in directly
> into 4096 preallocated bins.
> You would not need any sorting !!
> That's what I did for a 16 bit camera  but I wrote it in C and I had
> 4 cameras at 30 Hz.
That approach avoids the binindex calculation line:
indices = numpy.clip(((array.astype(float)  min) * bins / (max 
min)).astype(int), 0, bins1)
But even if indices = array, one still needs to do something like:
for index in indices: histogram[index] += 1
Which is slow in python and fast in C.
I'm guessing that there's no utility function in numpy that does a
loop like this? If so, that would be handy, but if now, I think I need
to dig out the numpy book and write a little extension...
Zach
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


On Thu, Apr 17, 2008 at 6:54 PM, Zachary Pincus < [hidden email]> wrote:
> Hi, and thanks for the suggestion!
>
>
> > How many bits per pixel does your camera actually generate !?
> > If its for example a 12 bit camera, you could just fill in directly
> > into 4096 preallocated bins.
> > You would not need any sorting !!
> > That's what I did for a 16 bit camera  but I wrote it in C and I had
> > 4 cameras at 30 Hz.
>
> That approach avoids the binindex calculation line:
> indices = numpy.clip(((array.astype(float)  min) * bins / (max 
>
> min)).astype(int), 0, bins1)
>
> But even if indices = array, one still needs to do something like:
> for index in indices: histogram[index] += 1
>
> Which is slow in python and fast in C.
>
> I'm guessing that there's no utility function in numpy that does a
> loop like this? If so, that would be handy, but if now, I think I need
> to dig out the numpy book and write a little extension...
>
>
I thought of a broadcasting approach... what are the chances that a simple
bins[:] = 0
bins[ img.flat ] += 1
works !? Or something along those lines ....
Sebastian
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


>> But even if indices = array, one still needs to do something like:
>> for index in indices: histogram[index] += 1
>>
>> Which is slow in python and fast in C.
>>
>>
> I thought of a broadcasting approach... what are the chances that a
> simple
>
> bins[:] = 0
> bins[ img.flat ] += 1
That doesn't work cumulatively, it seems. Some bins get a value of
'1', but it never gets incremented past that.
It was worth a try, though... in some cases inplace updating seems to
work this way (which is usually a source of confusion on this list,
not the desired result!)
Zach
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


> But even if indices = array, one still needs to do something like:
> for index in indices: histogram[index] += 1
>
> Which is slow in python and fast in C.
>
> I'm guessing that there's no utility function in numpy that does a
> loop like this? If so, that would be handy, but if now, I think I need
> to dig out the numpy book and write a little extension...
>
numpy.bincount?
Docstring:
bincount(x,weights=None)
Return the number of occurrences of each value in x.
x must be a list of nonnegative integers. The output, b[i],
represents the number of times that i is found in x. If weights
is specified, every occurrence of i at a position p contributes
weights[p] instead of 1.
See also: histogram, digitize, unique.
JJ
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


On 17/04/2008, Zachary Pincus wrote:
> But even if indices = array, one still needs to do something like:
> for index in indices: histogram[index] += 1
>
> Which is slow in python and fast in C.
>
I haven't tried this, but if you want the sum in C you could do
for x in unique(indices):
histogram[x] = (indices==x).sum()
Of course, this just replaces an O(N log N) algorithm by an O(N * M)
(M is the number of bins), which is only going to be worth for very
small M.
Peter
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


On 17/04/2008, Zachary Pincus wrote:
> But even if indices = array, one still needs to do something like:
> for index in indices: histogram[index] += 1
>
> Which is slow in python and fast in C.
>
I haven't tried this, but if you want the sum in C you could do
for x in unique(indices):
histogram[x] = (indices==x).sum()
Of course, this just replaces an O(N log N) algorithm by an O(N * M)
(M is the number of bins), which is only going to be worth it for very
small M.
Peter
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


Hi Zach,
I have a similar loop which I wrote using scipy.weave. This was my first
foray into weave, and I had to dig through the intermediate C sources to
find the macros that did the indexing in the way I make use of here, but
this snipped may get you started. There are 2 functions, which each do
the same thing, but one is in python, the other is in C. Note that this
is for a 3D histogram  presumably you could remove B and C from this
example.
I'm sure there are better (more documented) ways to do this using weave
 but I had this code written, it works, and it appears it may be
useful to you... (Sorry it's not documented, however.)
Andrew
Zachary Pincus wrote:
> Hi,
>
>
>> How about a combination of sort, followed by searchsorted right/left
>> using the bin boundaries as keys? The difference of the two
>> resulting vectors is the bin value. Something like:
>>
>> In [1]: data = arange(100)
>>
>> In [2]: bins = [0,10,50,70,100]
>>
>> In [3]: lind = data.searchsorted(bins)
>>
>> In [4]: print lind[1:]  lind[:1]
>> [10 40 20 30]
>>
>> This won't be as fast as a c implementation, but at least avoids the
>> loop.
>>
>
> This is, more or less, what the current numpy.histogram does, no? I
> was hoping to avoid the O(n log n) sorting, because the image arrays
> are pretty big, and numpy.histogram doesn't get close to video rate
> for me...
>
> Perhaps, though, some of the slowdown from numpy.histogram is from
> other overhead, and not the sorting. I'll try this, but I think I'll
> probably just have to write the c loop...
>
> Zach
> _______________________________________________
> Numpydiscussion mailing list
> [hidden email]
> http://projects.scipy.org/mailman/listinfo/numpydiscussion>
from scipy import weave
import numpy
def increment_slow(final_array_shape, A,B,C):
counts = numpy.zeros(final_array_shape, dtype=numpy.uint64)
for i in range(len(A)):
a=A[i]; b=B[i]; c=C[i]
counts[a,b,c] += 1
return counts
def increment_fast(final_array_shape, idxa,idxb,idxc):
counts = numpy.zeros(final_array_shape, dtype=numpy.uint64)
assert len(idxa.shape)==1
assert len(idxa)==len(idxb)
assert len(idxa)==len(idxc)
code = r"""
for (int i=0; i<Nidxa[0]; i++) {
COUNTS3( IDXA1(i), IDXB1(i), IDXC1(i) )++;
}
"""
weave.inline( code, ['counts', 'idxa','idxb','idxc'])
return counts
#increment = increment_slow
increment = increment_fast
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


Hello,
>> But even if indices = array, one still needs to do something like:
>> for index in indices: histogram[index] += 1
> numpy.bincount?
That is indeed what I was looking for! I knew I'd seen such a function.
However, the speed is a bit disappointing. I guess the sorting isn't
too much of a penalty:
def histogram(array, bins, range):
min, max = range
indices = numpy.clip(((array.astype(float)  min) * bins / (max 
min)).astype(int), 0, bins1).flat
return numpy.bincount(indices)
import numexpr
def histogram_numexpr(array, bins, range):
min, max = range
min = float(min)
max = float(max)
indices = numexpr.evaluate('(array  min) * bins / (max  min)')
indices = numpy.clip(indices.astype(int), 0, bins1).flat
return numpy.bincount(indices)
>>> arr.shape
(1300, 1030)
>>> timeit numpy.histogram(arr, 12, [0, 5000])
10 loops, best of 3: 99.9 ms per loop
>>> timeit histogram(arr, 12, [0, 5000])
10 loops, best of 3: 127 ms per loop
>>> timeit histogram_numexpr(arr, 12, [0, 5000])
10 loops, best of 3: 109 ms per loop
>>> timeit numpy.histogram(arr, 5000, [0, 5000])
10 loops, best of 3: 111 ms per loop
>>> timeit histogram(arr, 5000, [0, 5000])
10 loops, best of 3: 127 ms per loop
>>> timeit histogram_numexpr(arr, 5000, [0, 5000])
10 loops, best of 3: 108 ms per loop
So, they're all quite close, and it seems that numpy.histogram is the
definite winner. Huh. I guess I will have to go to C or maybe weave to
get up to videorate, unless folks can suggest some further
optimizations...
Zach
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion


Combining Sebastian and JaeJoon's suggestions, I have something that
might work:
>>> timeit numpy.bincount(array.flat)
10 loops, best of 3: 28.2 ms per loop
This is close enough to videorate... And I can then combine bins as
needed to get a particular bin count/range after the fact.
Thanks, everyone,
Zach
_______________________________________________
Numpydiscussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpydiscussion

