ANN: NumPy/SciPy Documentation Marathon 2008


ANN: NumPy/SciPy Documentation Marathon 2008

Joe Harrington
           NUMPY/SCIPY DOCUMENTATION MARATHON 2008

As we all know, the state of the numpy and scipy reference
documentation (aka the docstrings) is best described as "incomplete".
Most functions have docstrings shorter than 5 lines, whereas our
competitors IDL and Matlab usually have a concise and well-written
page or two per function.  The (wonderful) categorized list of
functions is very new and isn't included in the package yet.  There
isn't even a "Getting Started"-type of document you can hand a new
user so they can dive right in.  Documentation tools are limited to
plain-text paginators, while our competition enjoys HTML-based
documents with formulae, images, search capability, and cross linking.

Tales of woe abound.  A university class switched to Numpy and got
hopelessly bogged down because students couldn't find out how to call
the functions.  A developer looked something up while giving a
presentation and the words "Blah, Blah, Blah" stared down at the
audience in response.

To head off another pedagogical meltdown, the University of Central
Florida has hired Stefan van der Walt full time to coordinate a
community documentation effort to write reference documentation and
tools.  The project starts now and continues through the summer.  The
goals:

1. Produce complete docstrings for all numpy functions and as much of
   scipy as possible,

2. Produce an 8-15 page Getting Started tutorial that is not
   discipline-specific,

3. Write reference sections on topics in numpy, such as slicing and
   the principles of using the modules,

4. Complete a first edition, in both PDF and HTML, of a NumPy
   Reference Manual, and

5. Check everything into the sources by 1 August 2008 so that the
   Packaging Team can cut a release and have it available in time for
   Fall 2008 classes.

Even Stefan could not document the hundreds of functions that need it
by himself, and in any case such a large contribution requires
community review.  To make it easy for everyone to contribute, Pauli
Virtanen and Emmanuelle Guillart have provided a wiki system for
editing reference documentation.  The idea was developed by Fernando
Perez, Stefan, and Gael Varoquaux.  We encourage community members to
write, review, and proofread reference pages on this wiki.  Stefan
will check updates into the sources roughly weekly.  Near the end of
the project, we will put these wiki pages through a vetting process
and then check them into the sources a final time for a release that
we hope will occur in early August.

Meanwhile, Perry Greenfield has taken the lead on task 3, writing
reference docs for things that currently don't have docstrings, such
as basic concepts like slicing.

We have proposed two small extensions to the current docstring format,
for images (to be used sparingly) and indexing.  These appear in
updated versions of the doc standard, which are linked from the wiki
frontpage.  Please take a look and comment on these if you like.  All
docstrings will remain readable in plain text, but we are now
generating a full reference guide in PDF and HTML (you guessed it,
linked from the wiki).  These are searchable formats.

There are several ways you can help:

1. Write some docstrings on the wiki!  Many people can do this, many
more than can write code for the package itself.  However, you must
know numpy, the function group, and the function you are documenting well.
You should be familiar with the concept of a reference page and write
in that concise style.  We'll do tutorial docs in another project at a
later date.  See the instructions on the wiki for guidelines and
format.

2. Review others' docstrings and leave comments on their wiki pages.

3. Proofread docstrings.  Make sure they are correct, complete, and
concise.  Fix grammar.

4. Write examples ("doctests").  Even if you are not a top-notch
English writer, you can help by producing a code snippet of a few
lines that demonstrates a function.  It is fine for them to go into
the docstring templates before the actual text.

5. Write a new help function that optionally produces ASCII or points
the user's PDF or HTML reader to the right page (either local or
global).

6. If you are in a position to hire someone, such as a knowledgeable
student or short-term consultant, hire them to work on the tasks above
for the summer.  We can provide supervision to them or guidance to you
if you like.
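
As an illustration of item 4, here is a minimal sketch of a docstring
that carries its own examples.  The function clip_value is hypothetical
(it is not part of numpy), and the section layout is only assumed to
resemble the doc standard linked from the wiki.  The standard-library
doctest module replays each ">>>" line and compares the result with the
text printed beneath it.

```python
import doctest


def clip_value(x, lo, hi):
    """
    Limit a number to the closed interval [lo, hi].

    Parameters
    ----------
    x : number
        Value to limit.
    lo, hi : number
        Lower and upper bounds, with ``lo <= hi``.

    Returns
    -------
    number
        ``lo`` if ``x < lo``, ``hi`` if ``x > hi``, otherwise ``x``.

    Examples
    --------
    >>> clip_value(5, 0, 3)
    3
    >>> clip_value(-2, 0, 3)
    0
    >>> clip_value(1.5, 0, 3)
    1.5
    """
    return max(lo, min(x, hi))


# Replay the examples in the docstring; nothing is printed when they all pass.
doctest.run_docstring_examples(clip_value, {"clip_value": clip_value})
```

Because the examples are executed, a snippet that drifts out of date
shows up as a reported failure instead of silently misleading readers.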

The home for this project is here:

http://scipy.org/Developer_Zone/DocMarathon2008

This is not a sprint.  It is a marathon, and this time we are going to
finish.  We hope you will join us!

--jh-- and Stefan and Perry and Pauli and Emmanuelle...and you!
Joe Harrington
Stefan van der Walt
Perry Greenfield
Pauli Virtanen
Emmanuelle Guillart
...and you!
_______________________________________________
Numpy-discussion mailing list
[hidden email]
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [SciPy-user] ANN: NumPy/SciPy Documentation Marathon 2008

Joe Harrington
Ryan writes:
> This is very good news.  I will find some way to get involved.

Great!  Please dive right in, and sign up on the Developer_Zone page
so we can keep track of who's involved.

One thing I forgot to mention in my too-wordy announcement was that
discussion of documentation is on the scipy-dev mailing list.  We had
to pick one spot and decided that since we are going after scipy as
soon as numpy is done, we'd like to use that list rather than
numpy-discussion.  We also wanted to keep it on a development list
rather than polluting the new users' discussion space.

--jh--

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Andreas Klöckner-3
In reply to this post by Joe Harrington
On Saturday 17 May 2008, Joe Harrington wrote:
> To head off another pedagogical meltdown, the University of Central
> Florida has hired Stefan van der Walt full time to coordinate a
> community documentation effort to write reference documentation and
> tools.

This is truly excellent news. One question though: I didn't see Travis's Numpy
book mentioned at all in your writeup, so I am wondering where its role in
the doc effort is. Its home page states that it will be opened on Sep 1,
2008, apparently in time for classes, and it already provides parts of what
you propose.

Mainly: while we need to respect Travis's copyright, a duplication of the
massive effort that went into the book hardly seems sensible. One initial
question is therefore: Is it OK to copy material out of the book and into
other parts of the documentation?

Andreas


Re: ANN: NumPy/SciPy Documentation Marathon 2008

Stéfan van der Walt
Hi Andreas

2008/5/17 Andreas Klöckner <[hidden email]>:

> On Saturday 17 May 2008, Joe Harrington wrote:
>> To head off another pedagogical meltdown, the University of Central
>> Florida has hired Stefan van der Walt full time to coordinate a
>> community documentation effort to write reference documentation and
>> tools.
>
> This is truly excellent news. One question though: I didn't see Travis's Numpy
> book mentioned at all in your writeup, so I am wondering where its role in
> the doc effort is. Its home page states that it will be opened on Sep 1,
> 2008, apparently in time for classes, and it already provides parts of what
> you propose.

Travis has generously permitted us to use any part of his book in the
documentation.  We shall be making use of his kind offer!  As far as I
am aware, the code to his book will be released at SciPy 2008
(http://conference.scipy.org).

Regards
Stéfan

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Joe Harrington
In reply to this post by Joe Harrington
> I didn't see Travis's Numpy book mentioned at all in your writeup, so
> I am wondering where its role in the doc effort is.

> Is it OK to copy material out of the book and into
> other parts of the documentation?

No worries, Travis is on board here.  We included him and others on
the Steering Committee in planning this effort.

Travis's book overlaps the current effort to some extent.  The
function descriptions in the book are the numpy docstrings, such as
they currently exist.  The docstrings are open source software and the
book is a work derived from them.  The current effort is essentially
to fill in the docstrings to the full expectation of professional
reference documentation.  If you compare the docstring example on the
wiki (for multivariate_normal) with the current page for that
function, you'll see the difference.  The multivariate_normal
docstring is actually pretty good among current docstrings, but even
for this function we're aiming for a big change.  Collected, the new
docstrings will make a reference manual very much like those you'll
find for other scientific languages, with similar format for the
pages.  The choice of a ReST-based docstring format some time ago was
to support producing such a manual.

The rest of Travis's book is still critical information and we're not
contemplating replacing it at this point.  Much of it is on the
technical end, and our goal is to address the general user,
particularly students learning to do data analysis, so I think even
the eventual User Guide, whatever form it takes, will not encroach on
its technical focus.  Of course, he's welcome to include the improved
docstrings in his book if he wants to (as is anyone), or to exclude
them and make a tighter book aimed at extension programmers, or
whatever.

Let's continue discussion on scipy-dev, just to keep it all in one
place.

--jh--

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Steven H. Rogers
In reply to this post by Joe Harrington
Joe Harrington wrote:
>   NUMPY/SCIPY DOCUMENTATION MARATHON 2008
> ...
> 5. Write a new help function that optionally produces ASCII or points
> the user's PDF or HTML reader to the right page (either local or
> global).
>  
I can work on this.  Fernando suggested this at the IPython sprint in
Boulder last year, so I've given it some thought and started a wiki page:
http://ipython.scipy.org/moin/Developer_Zone/SearchDocs
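
One possible shape for such a help function, as a rough sketch only:
the names show_help and doc_url, the example.org base address, and the
"module.name.html" page layout are all assumptions made for
illustration, not an existing numpy or IPython interface.  Plain text
comes from the docstring itself; the HTML route just builds a page
address and hands it to a browser.

```python
import inspect
import webbrowser

# Hypothetical location of the generated HTML pages (assumption).
BASE_URL = "http://example.org/refguide"


def doc_url(obj, base_url=BASE_URL):
    """Build the assumed HTML page address for a function or class."""
    module = getattr(obj, "__module__", None) or "builtins"
    return "%s/%s.%s.html" % (base_url, module, obj.__name__)


def show_help(obj, html=False, opener=webbrowser.open):
    """Return plain-text help, or pass the page address to a browser."""
    if html:
        url = doc_url(obj)
        opener(url)  # opener is replaceable, e.g. for a local PDF viewer
        return url
    return inspect.getdoc(obj) or "No documentation found."


# Plain-text mode: print the cleaned-up docstring of a builtin.
print(show_help(len))
```

Keeping the opener as a parameter leaves room for either a local copy
or a global one, as item 5 asks, without changing the lookup logic.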

Regards,
Steve


Re: ANN: NumPy/SciPy Documentation Marathon 2008

Pauli Virtanen-3
Hi,

On Sun, 2008-05-18 at 07:16 -0600, Steven H. Rogers wrote:

> Joe Harrington wrote:
> >   NUMPY/SCIPY DOCUMENTATION MARATHON 2008
> > ...
> > 5. Write a new help function that optionally produces ASCII or points
> > the user's PDF or HTML reader to the right page (either local or
> > global).
> >  
> I can work on this.  Fernando suggested this at the IPython sprint in
> Boulder last year, so I've given it some thought and started a wiki page:
> http://ipython.scipy.org/moin/Developer_Zone/SearchDocs

In Numpy SVN/1.1 there is a function "lookfor" that searches the
docstrings for a substring (no stemming etc. is done). A similar
"%lookfor" magic command was accepted into IPython0 as the extension
ipy_lookfor.py. Improvements to these would surely be appreciated.
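
To make the idea concrete, here is a small stand-in for that kind of
search, written against the standard library only.  It is not numpy's
lookfor implementation; search_docstrings is a hypothetical name, and
the sketch does plain case-insensitive substring matching over one
module's callables, with no ranking or stemming.

```python
import inspect
import math


def search_docstrings(module, what):
    """Return names of public callables in `module` whose docstring contains `what`."""
    needle = what.lower()
    hits = []
    for name, obj in inspect.getmembers(module):
        # Skip private names and plain constants such as math.pi.
        if name.startswith("_") or not callable(obj):
            continue
        doc = inspect.getdoc(obj) or ""
        if needle in doc.lower():
            hits.append(name)
    return sorted(hits)


print(search_docstrings(math, "hyperbolic"))
```

A search like this is only as good as the docstrings it reads, which
is one more reason to fill them in.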

I think Sphinx also supports searching, so the generated HTML
docs [1] are searchable, as is the generated PDF output.

        Pauli


.. [1] http://mentat.za.net/numpy/refguide/
   So far, this preview contains only docs for ndarray, though.



Re: ANN: NumPy/SciPy Documentation Marathon 2008

Steven H. Rogers
Pauli Virtanen wrote:

> Hi,
>
> On Sun, 2008-05-18 at 07:16 -0600, Steven H. Rogers wrote:
>  
>> Joe Harrington wrote:
>>    
>>>   NUMPY/SCIPY DOCUMENTATION MARATHON 2008
>>> ...
>>> 5. Write a new help function that optionally produces ASCII or points
>>> the user's PDF or HTML reader to the right page (either local or
>>> global).
>>>  
>>>      
>> I can work on this.  Fernando suggested this at the IPython sprint in
>> Boulder last year, so I've given it some thought and started a wiki page:
>> http://ipython.scipy.org/moin/Developer_Zone/SearchDocs
>>    
>
> In Numpy SVN/1.1 there is a function "lookfor" that searches the
> docstrings for a substring (no stemming etc. is done). Similar
> "%lookfor" magic command got accepted into IPython0 as an extension
> ipy_lookfor.py. Improvements to these would be surely appreciated.
>
> I think that also Sphinx supports searching, so that the generated HTML
> docs [1] are searchable, as is the generated PDF output.
>
> Pauli
>
>
> .. [1] http://mentat.za.net/numpy/refguide/
>    So far, this preview contains only docs for ndarray, though.
>
>  

Thanks Pauli.  Looking at these.

# Steve

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Stéfan van der Walt
2008/5/20 Steven H. Rogers <[hidden email]>:
>> .. [1] http://mentat.za.net/numpy/refguide/
>>    So far, this preview contains only docs for ndarray, though.

The reference guide has been updated to contain the entire numpy.
Once we've applied indexing tags to functions, those will be sorted in
a more coherent manner.  Also, the math role and directive, i.e.
:math:`\int_0^\infty` and

.. math:: \int_0^\infty

now render correctly.  This is achieved using MathML in the XHTML
files (so you need to install a MathML plugin if you use Internet
Explorer).  For an example, see "bartlett" (use the index to find it;
quicksearch is currently broken).
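
For readers who have not used the math role and directive inside a
docstring, the sketch below shows both in place.  bartlett_window is a
pure-Python stand-in written for this illustration, not the numpy
function, and the section layout is assumed.  The raw-string prefix
keeps Python from interpreting the backslashes in the formula.

```python
def bartlett_window(M):
    r"""
    Return the M-point Bartlett (triangular) window as a list.

    Notes
    -----
    The window is defined as

    .. math::

       w(n) = \frac{2}{M-1} \left( \frac{M-1}{2} - \left| n - \frac{M-1}{2} \right| \right)

    for :math:`0 \le n \le M-1`.

    Examples
    --------
    >>> bartlett_window(5)
    [0.0, 0.5, 1.0, 0.5, 0.0]
    """
    if M < 1:
        return []      # no samples requested
    if M == 1:
        return [1.0]   # avoid dividing by M - 1 == 0
    half = (M - 1) / 2.0
    return [(2.0 / (M - 1)) * (half - abs(n - half)) for n in range(M)]


print(bartlett_window(5))
```

The docstring stays readable as plain text, while a renderer that
understands the directive can typeset the same formula.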

Regards
Stéfan

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Alan G Isaac
On Tue, 20 May 2008, Stéfan van der Walt apparently wrote:
> Also, the math role and directive, i.e.
> :math:`\int_0^\infty` and
> .. math:: \int_0^\infty
> now render correctly.

Is this being done with Jens's writers?
If not, I'd like to know how.

Thank you,
Alan Isaac

PS There is currently active discussion among the docutils
developers about moving the math role and directive into
docutils.  Any issues you discover could usefully be shared
right now!




Re: ANN: NumPy/SciPy Documentation Marathon 2008

Stéfan van der Walt
Hi Alan

Yes, the one discussed in this thread:

http://groups.google.com/group/sphinx-dev/browse_thread/thread/ef74352b9f196002/0e257bc8c116f73f

I've only had to make one change so far, to parse '*' as in 'A^*'
(patch attached).  Unfortunately, the author chose incomprehensible
variable names like 'mo', 'mi', and 'mn', so I'm not sure I fixed it
the right way.

I'd be very glad if these directives became part of docutils -- I'd be
glad if you could monitor that situation for us.

Regards
Stéfan

2008/5/20 Alan G Isaac <[hidden email]>:

> On Tue, 20 May 2008, Stéfan van der Walt apparently wrote:
>> Also, the math role and directive, i.e.
>> :math:`\int_0^\infty` and
>> .. math:: \int_0^\infty
>> now render correctly.
>
> Is this being done with Jens's writers?
> If not, I'd like to know how.
>
> Thank you,
> Alan Isaac
>
> PS There is currently active discussion among the docutils
> developers about moving the math role and directive into
> docutils.  Any issues you discover could usefully be shared
> right now!

[Attachment: mathml.patch (562 bytes)]

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Rob Hetland

I would like to help, but it's not clear to me exactly how to do that  
from the wiki.  What are the steps?

-Rob


Re: ANN: NumPy/SciPy Documentation Marathon 2008

Stéfan van der Walt
Hi Rob

Which of the instructions are not clear?  We'd like to make this as
accessible as possible.

In order to start editing, you need to complete step 5, which is to
register on the wiki and send us your UserName.

Regards
Stéfan

2008/5/20 Rob Hetland <[hidden email]>:
>
> I would like to help, but it's not clear to me exactly how to do that
> from the wiki.  What are the steps?
>
> -Rob

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Jon Wright
In reply to this post by Joe Harrington
Joe Harrington wrote:
>   NUMPY/SCIPY DOCUMENTATION MARATHON 2008
>  
On the wiki it says: "Writers should be fluent in English"

In case someone is working on the dynamic docstring magic, is this a
good moment to mention "internationalisation" and "world domination" in
the same sentence?

-Jon

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Rob Hetland
In reply to this post by Stéfan van der Walt

On May 20, 2008, at 7:30 PM, Stéfan van der Walt wrote:

>  ...and send us your UserName.

This is the part I skipped over...  I registered, and wondered why  
everything was not editable.

-Rob

----
Rob Hetland, Associate Professor
Dept. of Oceanography, Texas A&M University
http://pong.tamu.edu/~rob
phone: 979-458-0096, fax: 979-845-6331




Re: ANN: NumPy/SciPy Documentation Marathon 2008

Rob Hetland
In reply to this post by Stéfan van der Walt

On May 20, 2008, at 7:30 PM, Stéfan van der Walt wrote:

>  and send us your UserName.


Oh, and my username is RobHetland

-Rob

----
Rob Hetland, Associate Professor
Dept. of Oceanography, Texas A&M University
http://pong.tamu.edu/~rob
phone: 979-458-0096, fax: 979-845-6331




Re: ANN: NumPy/SciPy Documentation Marathon 2008

Pauli Virtanen-3
On Tue, 2008-05-20 at 20:06 +0200, Rob Hetland wrote:
> On May 20, 2008, at 7:30 PM, Stéfan van der Walt wrote:
>
> >  and send us your UserName.
>
>
> Oh, and my username is RobHetland

You're in now.

Regards,
Pauli



Re: ANN: NumPy/SciPy Documentation Marathon 2008

Stéfan van der Walt
In reply to this post by Jon Wright
2008/5/20 Jonathan Wright <[hidden email]>:
> Joe Harrington wrote:
>>          NUMPY/SCIPY DOCUMENTATION MARATHON 2008
>>
> On the wiki it says: "Writers should be fluent in English"
>
> In case someone is working on the dynamic docstring magic, is this a
> good moment to mention "internationalisation" and "world domination" in
> the same sentence?

I think we'll stick to English for now (I don't think I have the
motivation to do an Afrikaans translation!).

As for internationali(s/z)ation, we'll see who writes the most
docstrings.  In a fortuitous twist of events, I find myself able to
read American as well :)

Cheers
Stéfan

Re: ANN: NumPy/SciPy Documentation Marathon 2008

Jon Wright
Stéfan van der Walt wrote:
 > As for internationali(s/z)ation, we'll see who writes the most
 > docstrings.

Indeed. There are some notes on the OLPC wiki at

http://wiki.laptop.org/go/Python_i18n

It seems to be just a question of adding at the top of add_newdocs.py

from gettext import gettext as _

... and putting the docstrings in a _() function call, although perhaps
I miss something important, like a performance hit? This would catch
everything in add_newdocs at least. It seems like a relatively minor
change if you are overhauling anyway?

Jon







Re: ANN: NumPy/SciPy Documentation Marathon 2008

Robert Kern-2
On Tue, May 20, 2008 at 5:55 PM, Jonathan Wright <[hidden email]> wrote:

> Stéfan van der Walt wrote:
>  > As for internationali(s/z)ation, we'll see who writes the most
>  > docstrings.
>
> Indeed. There are some notes on the OLPC wiki at
>
> http://wiki.laptop.org/go/Python_i18n
>
> It seems to be just a question of adding at the top of add_newdocs.py
>
> from gettext import gettext as _
>
> ... and putting the docstrings in a _() function call, although perhaps
> I miss something important, like a performance hit?

Possibly a significant one. This could affect startup times, which I
am hesitant to make worse.

> This would catch
> everything in add_newdocs at least. It seems like a relatively minor
> change if you are overhauling anyway?

add_newdocs() could do that, but the usual function docstrings can't.
The rule is that if the first statement in a function is a literal
string, then the compiler will assign it to func.__doc__. Expressions
are just treated as expressions in the function body and have no
effect on func.__doc__.
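
The rule is easy to check with a short, self-contained sketch.  The
two function names are hypothetical, and no translation catalog is
installed, so gettext simply returns its argument unchanged.

```python
from gettext import gettext as _


def literal_doc():
    "Add two numbers."       # bare string literal first: becomes __doc__


def wrapped_doc():
    _("Add two numbers.")    # call expression: evaluated, then discarded


print(literal_doc.__doc__)   # prints: Add two numbers.
print(wrapped_doc.__doc__)   # prints: None

# Remember the state before patching, then assign the docstring
# explicitly after the function has been defined.
before = wrapped_doc.__doc__
wrapped_doc.__doc__ = _("Add two numbers.")
print(wrapped_doc.__doc__)   # prints: Add two numbers.
```

Assigning __doc__ after definition works for ordinary Python
functions, which is why a translation hook is plausible wherever
docstrings are attached after the fact rather than written inline.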

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco