Style guide for numpy code?

Chris Barker - NOAA Federal
Hey all,

Do any of you know of a style guide for computational / numpy code?

I don't mean code that will go into numpy itself, but rather, users' code that uses numpy (and scipy, and...)

I know about (am a proponent 

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[hidden email]

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Style guide for numpy code?

Chris Barker - NOAA Federal
Oops,

Somehow that got sent before I was done. (Like my use of the passive voice there?)

Here is a complete message:

Do any of you know of a style guide for computational / numpy code?

I don't mean code that will go into numpy itself, but rather, users' code that uses numpy (and scipy, and...)

I know about (am a proponent of) PEP8, but it doesn’t address the unique needs of scientific programming.

This is mostly about variable names. In scientific code, we often want:

- variable names that match the math notation, so single-character names, maybe with upper and lower case meaning different things (in ocean wave mechanics, “h” is often the water depth and “H” the wave height)

- to distinguish between scalar, vector, and matrix values; often UpperCase means an array or matrix, for instance.

But despite (or because of) these unique needs, a style guide would be really helpful.

Anyone have one? Or even any notes on what you do yourself?

Thanks,
-CHB





Re: Style guide for numpy code?

Joe Harrington
In reply to this post by Chris Barker - NOAA Federal

I have a handout for my PHZ 3150 Introduction to Numerical Computing course that includes some rules:

(a) All integer-valued floating-point numbers should have decimal points after them. For
example, if you have a time of 10 sec, do not use

y = np.e**10 # sec

use

y = np.e**10. # sec

instead.  For example, an item count is always an integer, but a distance is always a float.  A decimal in the range (-1,1) must always have a zero before the decimal point, for readability:

x = 0.23 # Right!
x = .23 # WRONG

The purpose of this one is simply to build the decimal-point habit.  In Python it's less of an issue now, but sometimes code is translated, and integer division is still out there.  For that reason, in other languages, it may be desirable to use a decimal point even for counts, unless integer division is wanted.  Make a comment whenever you intend integer division and the language uses the same symbol (/) for both kinds of division.
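To make the integer-division point in (a) concrete, here is a small sketch in Python 3, where `/` is always true division and `//` is floor division. The variable names are made up for illustration and are not from the handout.

```python
n_items  = 7      # item count: always an integer
distance = 10.    # m, a measured quantity: always a float

# Python 3: "/" is true division, even between two ints.
half_true  = n_items / 2     # 3.5

# Integer (floor) division is intended here, so say so in a comment.
half_floor = n_items // 2    # 3

# A float operand gives a float result with either operator.
step = distance / 4          # 2.5 m

print(half_true, half_floor, step)
```

In a language where `/` between two integers truncates, the first division would silently give 3, which is the case the decimal-point habit guards against.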

(b) Use spaces around binary operations and relations (=<>+-*/). Put a space after “,”.
Do not put space around “=” in keyword arguments, or around “ ** ”.
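A short sketch of the spacing rules in (b), using only the standard library. The values and names are arbitrary and chosen just to exercise each rule.

```python
import math

a, b, x = 2., 3., 4.

y      = a * x**2 + b                        # spaces around = * +, none around **
ok     = math.isclose(y, 35., rel_tol=1e-9)  # no spaces around = in keyword args
big    = y > 10.                             # spaces around relations, too
coords = (x, y)                              # a space after each comma

print(y, ok, big, coords)
```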

(c) Do not put plt.show() in your homework file! You may put it in a comment if you
like, but it is not necessary. Just save the plot. If you say

plt.ion()

plots will automatically show while you are working.

(d) Use:

import matplotlib.pyplot as plt

NOT:

import matplotlib.pylab as plt

(e) Keep lines to 80 characters, max, except in rare cases that are well justified, such as
very long strings. If you make comments on the same line as code, keep them short or
break them over more than a line:

code = code2   # set code equal to code2

# Longer comment requiring much more space because
# I'm explaining something complicated.
code = code2

code = code2   # Another way to do a very long comment,
               # like this one, which runs over more than
               # one line.

(f) Keep blocks of similar lines internally lined up on decimals, comments, and = signs.  This makes them easier to read and verify.  There will be some cases when this is impractical.  Use your judgment (you're not a computer, you control the computer!):

x    =   1.      # this is a comment
y    = 378.2345  # here's another
fred = chuck     # note how the decimals, = signs, and
                 # comments line up nicely...
alacazamshmazooboloid = 2721 # but not always!

(g) Put the units and sources of all values in comments:

t_planet = 523.     # K, Smith and Jones (2016, ApJ 234, 22)

(h) I don't mean to start a religious war, but I emphasize the alignment of similar adjacent code lines to make differences pop out and reduce the likelihood of bugs.  For example, it is much easier to verify the correctness of:

a     = 3 * x + 3 * 8. *     short    - 5. * np.exp(np.pi * omega * t)
a_alt = 3 * x + 3 * 8. * anotshortvar - 5. * np.exp(np.pi * omega * t)

than:

a = 3 * x + 3 * 8. * short - 5. * np.exp(np.pi * omega * t)
a_altvarname = 3 * x + 3*9*anotshortvar - 5. * np.exp(np.pi * omega * i)

(i) Assign values to meaningful variables, and use them in formulae and functions:

ny = 512
nx = 512
image = np.zeros((ny, nx))
expr1 = ny * 3
expr2 = nx * 4

Otherwise, later on when you upgrade to 2560x1440 arrays, you won't know which of the 512s are in the x direction and which are in the y direction.  Or, the student you (now a senior researcher) assign to code the upgrade won't!  Also, it reduces bugs arising from the order of arguments to functions if the args have meaningful names.  This is not to say that you should assign all numbers to variables.  This is fine:

circ = 2 * np.pi * r
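As a sketch of why the named sizes in (i) help, suppose the image is later made non-square. The 2560x1440 size comes from the text above; the helper `make_image` and its argument names are hypothetical.

```python
import numpy as np

ny = 1440    # pixels, rows (y direction)
nx = 2560    # pixels, columns (x direction)

def make_image(ny, nx):
    """Return a blank image with ny rows and nx columns."""
    return np.zeros((ny, nx))

# Passing by keyword makes the call correct regardless of argument order.
image = make_image(nx=nx, ny=ny)

print(image.shape)
```

With bare numbers, `np.zeros((2560, 1440))` and `np.zeros((1440, 2560))` look equally plausible; with `ny` and `nx`, a swapped pair is easy to spot.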

(j) All functions assigned for grading must have full docstrings in numpy's format, as well as internal comments.  Utility functions not requested in the assignment and that the user will never see can have reduced docstrings if the functions are simple and obvious, but at least give the one-line summary.
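For reference, a minimal sketch of what a docstring in numpy's format might look like, reusing the circumference formula from (i). The function name `circumference` is made up for illustration, and only a few of the standard sections are shown.

```python
import math

def circumference(r):
    """
    Compute the circumference of a circle.

    Parameters
    ----------
    r : float
        Radius of the circle, in any length unit.

    Returns
    -------
    circ : float
        Circumference, in the same unit as `r`.

    Examples
    --------
    >>> circumference(1.)
    6.283185307179586
    """
    # The 2 is an exact count, so it carries no decimal point.
    circ = 2 * math.pi * r
    return circ
```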

(k) If you modify an existing function, you must either make a Git entry or, if it is not under revision control, include a Revision History section in your docstring and record your name, the date, the version number, your email, and the nature of the change you made.

(l) Choose variable names that are meaningful and consistent in style.  Document your style either at the head of a module or in a separate text file for the project.  For example, if you use CamelCaps with initial capital, say that.  If you reserve initial capitals for classes, say that.  If you use underscores for variable subscripts and camelCaps for the base variables, say that.  If you accept some other style and build on that, say that.  There are too many good reasons to have such styles for only one to be the community standard.  If certain kinds of values should get the same variable or base variable, such as fundamental constants or things like amplitudes, say that.

(m) It's best if variables that will appear in formulae are short, so more terms can fit in one 80-character line.

Overall, having and following a style makes code easier to read.  And, as an added bonus, if you take care to be consistent, you will write slower, view your code more times, and catch more bugs as you write them.  Thus, for codes of any significant size, writing pedantically commented and aligned code is almost always faster than blast coding, if you include debugging time.

Did you catch both bugs in item h?
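For anyone who wants to check their answer, here is one way to compare the two unaligned lines from item (h) token by token, ignoring spacing. This is only a sketch using the standard `tokenize` module; the helper name `tokens` is made up.

```python
import io
import tokenize

line1 = "a = 3 * x + 3 * 8. * short - 5. * np.exp(np.pi * omega * t)"
line2 = "a_altvarname = 3 * x + 3*9*anotshortvar - 5. * np.exp(np.pi * omega * i)"

def tokens(src):
    """Return the token strings of one line of Python source, ignoring spacing."""
    toks = tokenize.generate_tokens(io.StringIO(src).readline)
    skip = (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER)
    return [t.string for t in toks if t.type not in skip]

# Pair up the tokens of the two lines and keep only the pairs that differ.
diffs = [(t1, t2) for t1, t2 in zip(tokens(line1), tokens(line2)) if t1 != t2]
print(diffs)
```

The pairs that differ are the two variable renames, which look intended, plus `8.` versus `9` and `t` versus `i`.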

--jh--


    
On 5/9/19 11:25 AM, Chris Barker - NOAA Federal [hidden email] wrote:
Do any of you know of a style guide for computational / numpy code?

Re: Style guide for numpy code?

Eric Wieser
Joe,

While most of your style suggestions are reasonable, I would actually
recommend the opposite of the first point you make in (a), especially
if you're trying to write generic reusable code.

> For example, an item count is always an integer, but a distance is always a float.

This is close, but `int` and `float` are implementation details. I
think a more precise way to state this is _"an item count is a
`numbers.Integral`, a distance is a `numbers.Real`"_.

Where this distinction matters is if you start using `decimal.Decimal`
or `fractions.Fraction` for your distances. `Fraction` is a
`numbers.Real` (`Decimal` is registered only as `numbers.Number`), but
if you mix either with floats, you lose precision or get a `TypeError`
because the types refuse to mix:
```python
In [11]: Fraction(1, 3) + 1.0
Out[11]: 1.3333333333333333

In [12]: Fraction(1, 3) + 1
Out[12]: Fraction(4, 3)

In [15]: Decimal('0.1') + 0
Out[15]: Decimal('0.1')

In [16]: Decimal('0.1') + 0.
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'
```

For an example of this coming up in real-world functions, look at
https://github.com/numpy/numpy/pull/13390
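
As a rough illustration of the point (this is not the code from the linked PR), a generic function can stay in the caller's numeric type by avoiding float literals. The function `mean` below is a hypothetical example:

```python
from decimal import Decimal
from fractions import Fraction

def mean(values):
    """Return the arithmetic mean, staying in the input's numeric type."""
    values = list(values)
    # len() is an int count; dividing by it does not force a float.
    return sum(values) / len(values)

frac_mean = mean([Fraction(1, 3), Fraction(2, 3)])   # Fraction(1, 2)
dec_mean  = mean([Decimal('0.1'), Decimal('0.3')])   # Decimal('0.2')
flt_mean  = mean([1., 2.])                           # 1.5
```

Writing `sum(values) / float(len(values))` instead would turn the `Fraction` result into a float and raise a `TypeError` for the `Decimal` inputs.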

Eric

On Thu, 9 May 2019 at 11:19, Joe Harrington <[hidden email]> wrote:

>
> I have a handout for my PHZ 3150 Introduction to Numerical Computing course that includes some rules:

Re: Style guide for numpy code?

Evgeni Burovski
In reply to this post by Joe Harrington
Hi Joe,

Thanks for sharing!

I'm going to use your handout as a base for my numerical computing classes, (with an appropriate citation, of course :-)).




Thu, 9 May 2019, 21:19 Joe Harrington <[hidden email]>:

I have a handout for my PHZ 3150 Introduction to Numerical Computing course that includes some rules:

Re: Style guide for numpy code?

Chris Barker - NOAA Federal
Thanks Joe,

Looks like a good list, though I personally would not recommend that students pick their own style.

I tell my students (general-purpose Python, not numerical work per se):

If your organization has a style guide, use that. If it doesn’t, use PEP8.

In your case, you ARE the organization, so you might consider defining a style.

But I’ll read over this; you have some add-ons and deviations from PEP8 that make sense for computational code.

-Chris


On May 10, 2019, at 12:30 AM, Evgeni Burovski <[hidden email]> wrote:

Hi Joe,

Thanks for sharing!

I'm going to use your handout as a base for my numerical computing classes, (with an appropriate citation, of course :-)).



