Problem with np.savetxt

Stephen P. Molnar
I am embarrassed to be asking this question, but I have exhausted Google
at this point.

I have a number of identically formatted text files from which I want to
extract data, as an example (hopefully, putting these in as quotes will
preserve the format):

> =======================================================================
> PSOVina version 2.0
> Giotto H. K. Tai & Shirley W. I. Siu
>
> Computational Biology and Bioinformatics Lab
> University of Macau
>
> Visit http://cbbio.cis.umac.mo for more information.
>
> PSOVina was developed based on the framework of AutoDock Vina.
>
> For more information about Vina, please visit http://vina.scripps.edu.
>
> =======================================================================
>
> Output will be 13-7_out.pdbqt
> Reading input ... done.
> Setting up the scoring function ... done.
> Analyzing the binding site ... done.
> Using random seed: 1828390527
> Performing search ... done.
>
> Refining results ... done.
>
> mode |   affinity | dist from best mode
>      | (kcal/mol) | rmsd l.b.| rmsd u.b.
> -----+------------+----------+----------
>    1    -8.862004149      0.000      0.000
>    2    -8.403522829      2.992      6.553
>    3    -8.401384636      2.707      5.220
>    4    -7.886402037      4.907      6.862
>    5    -7.845519031      3.233      5.915
>    6    -7.837434227      3.954      5.641
>    7    -7.834584887      3.188      7.294
>    8    -7.694395765      3.746      7.553
>    9    -7.691211177      3.536      5.745
>   10    -7.670759445      3.698      7.587
>   11    -7.661882758      4.882      7.044
>   12    -7.636280303      2.347      3.284
>   13    -7.635788052      3.511      6.250
>   14    -7.611175249      2.427      3.449
>   15    -7.586368357      2.142      2.864
>   16    -7.531307666      2.976      4.980
>   17    -7.520501084      3.085      5.775
>   18    -7.512906514      4.220      7.672
>   19    -7.307403528      3.240      4.354
>   20    -7.256063348      3.694      7.252
> Writing output ... done.

At this point, my Python script consists of only the following:

> #!/usr/bin/env python3
> # -*- coding: utf-8 -*-
> """
>
> Created on Tue Sep 24 07:51:11 2019
>
> """
> import numpy as np
>
> data = []
>
> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
> skip_header=27, skip_footer=1, encoding=None)
>
> print(data)
>
> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')

The problem lies in the np.savetxt line; on execution I get:

> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
> current_namespace=True)
> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>  '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>  '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>  '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>  '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
> Traceback (most recent call last):
>
>   File
> "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
> line 16, in <module>
>     np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
>
>   File "<__array_function__ internals>", line 6, in savetxt
>
>   File
> "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
> line 1438, in savetxt
>     % (str(X.dtype), format))
>
> TypeError: Mismatch between array dtype ('<U12') and format specifier
> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> %16.9f')

The data is in the data file, but the only entry in '13-7', the saved
file, is the label. Obviously, the error is in the format argument.

Help will be much appreciated.

Thanks in advance.

--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Problem with np.savetxt

Andras Deak
Hi,

One problem is the format: the error is telling you that you have
strings in your array (compare the `'<U12'` dtype and the output of
your `print(data)` call with strings inside), whereas %16.9f can only
be used to format floats (f for float). You would first have to
convert your array of strings to an array of numbers. I don't usually use
genfromtxt so I'm not sure how you can make it return floats for you
in the first place, but I suspect `dtype=None` in the call to
genfromtxt might be responsible. In any case, making it return numbers
should be the easier route.

The second problem is that you should make sure you mean `[data]` in
the call to savetxt. As it is now this would give you a 2d array of
shape (1, 20), and the output would correspondingly contain a single
row of 20 values (hence the 20 instances of '%16.9f' in the error
message). In case you meant to print one value per row in a single
column, you should drop the brackets around `data`:

    np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
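
As a minimal sketch of the conversion described above, the string array can be cast with `astype(float)` before it is written. The three sample strings are copied from the printed output in the question, and the in-memory buffer is only a stand-in for the real output file, so treat this as an illustration under those assumptions rather than the exact fix for the original script:

```python
import io

import numpy as np

# A few of the strings printed in the question (dtype '<U12').
data = np.array(['-8.839713733', '-8.743377250', '-8.151051167'])

# Cast the strings to floats so that a float format such as %16.9f applies.
values = data.astype(float)

# An in-memory buffer stands in for the output file (assumption for the demo).
buf = io.StringIO()

# Passing the 1-D array itself (no brackets) writes one value per line.
np.savetxt(buf, values, fmt='%16.9f', header='13-7')

print(buf.getvalue())
```

With a real file, the buffer would simply be replaced by a file name.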

And just a personal note, but I'd find an output file named '13-7' to
be a bit surprising. Perhaps some extension or prefix would help
organize these files?
Regards,

András

Re: Problem with np.savetxt

Andras Deak
PS. if you just want to specify the width of the fields you wouldn't
have to convert anything, because you can specify the size and
justification of a %s format. But arguably having float data as floats
is more natural anyway.
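
A small sketch of the `%s` approach mentioned in this PS, assuming the array still holds strings. The sample values and the in-memory buffer are illustrative stand-ins; the width of 16 is borrowed from the format in the traceback:

```python
import io

import numpy as np

# The data stays as strings; no conversion to float is done here.
data = np.array(['-8.839713733', '-8.743377250'])

buf = io.StringIO()  # stand-in for an output file (assumption for the demo)

# '%16s' right-justifies each string in a field of 16 characters.
np.savetxt(buf, data, fmt='%16s')

# '%-16s' left-justifies instead (padding goes on the right).
np.savetxt(buf, data, fmt='%-16s')

print(buf.getvalue())
```

This only controls field width and justification; the number of decimals is whatever the strings already contain.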

Re: Problem with np.savetxt

Fabrice Silva-2
In reply to this post by Stephen P. Molnar
On Tuesday, 8 October 2019, Stephen P. Molnar wrote:

> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
> skip_header=27, skip_footer=1, encoding=None)
> print(data)
> [...]
>
> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>   '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>   '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>   '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>   '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']

Hi,

Note that your data array is made of strings, not floats.
The default value of the dtype argument is float, which you override with None.
Remove the 'dtype=None' part to load the data correctly.

You will then have no problem saving your data in the format you want.

Fabrice

PS: be aware that [data] is a 2D row array, which will end up on a single line with the command

    np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')

Remove the brackets for one-value-per-line output:

    np.savetxt('13-7', data, fmt='%15.9f', header='13-7')
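
The two points above can be sketched together. This example assumes a shortened, in-memory copy of the log (three table rows, three header lines), so `skip_header=3` here is specific to the sample text and not the value needed for the real file:

```python
import io

import numpy as np

# Shortened stand-in for the tail of a PSOVina log (assumption for the demo).
log = """mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1    -8.862004149      0.000      0.000
   2    -8.403522829      2.992      6.553
   3    -8.401384636      2.707      5.220
Writing output ... done.
"""

# No dtype argument: genfromtxt falls back to its default, float.
data = np.genfromtxt(io.StringIO(log), usecols=1,
                     skip_header=3, skip_footer=1)
print(data.dtype, data.shape)

# 1-D array: one value per line below the header comment.
col = io.StringIO()
np.savetxt(col, data, fmt='%15.9f', header='13-7')

# [data] is a 2-D row: all values end up on a single line.
row = io.StringIO()
np.savetxt(row, [data], fmt='%15.9f', header='13-7')

print(col.getvalue())
print(row.getvalue())
```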

Re: Problem with np.savetxt

Stephen P. Molnar
In reply to this post by Andras Deak
Many thanks for your kind replies.

I really appreciate your suggestions.

Re: Problem with np.savetxt

Stephen P. Molnar
In reply to this post by Fabrice Silva-2
Thanks for the replies. All is now well!

I'm thankful that this list is so very patient with ROFs (retired old fools) struggling to learn a new programming language.


Re: Problem with np.savetxt

Stephen P. Molnar
In reply to this post by Stephen P. Molnar
I am slowly stumbling forward, but at this point my
degree of mental entropy (confusion) is monumental.

This works:

> import numpy as np
>
> print('${d}')
>
> data = np.genfromtxt("14-7.log", usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
>
> print(data)
>
> np.savetxt('14-7.dG', data, fmt='%12.9f', header='14-7')
> print(data)

which produces:

> runfile('/home/comp/Apps/Python/PsoVina/DeltaGTable_V_s.py',
> wdir='/home/comp/Apps/Python/PsoVina', current_namespace=True)
> ${d}
> [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
>  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
>  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
>  -7.72254029 -7.72034674]
> [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
>  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
>  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
>  -7.72254029 -7.72034674]

Note: the print statements are for a quick check of the output, which is:

> # 14-7
> -9.960902669
> -8.979504781
> -8.942611364
> -8.915523010
> -8.736508831
> -8.663387139
> -8.410739711
> -8.389146347
> -8.296798909
> -8.168454106
> -8.127990818
> -8.127103774
> -7.979090739
> -7.941872682
> -7.900766215
> -7.881485228
> -7.837826485
> -7.815909505
> -7.722540286
> -7.720346742

Also, this bash script works:

> #!/bin/bash
>
> # Run.dG.list_1
>
> while IFS= read -r d
> do
>     echo "${d}.log"
>
> done <ligand.list

which returns the four log file names:

> 14-7.log
> 15-7.log
> 18-7.log
> C-VX3.log


But, if I run this bash script:

> #!/bin/bash
>
> # Run.dG.list_1
>
> while IFS= read -r d
> do
>     echo "${d}.log"
>     python3 DeltaGTable_V_sl.py
>
>
> done <ligand.list

where DeltaGTable_V_sl.py is:

> import numpy as np
>
> print('${d}')
>
> data = np.genfromtxt('${d}.log', usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
> print(data)
>
> np.savetxt('${d}.dG', data, fmt='%12.9f', header='${d}')
> print(data.dG)

I get:

> (base) comp@AbNormal:~/Apps/Python/PsoVina$ sh ./Run.dG.list_1.sh
> 14-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> 15-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> 18-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> C-VX3.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory

So, it would appear that the log file labels are in the workspace, but
'${d}.log' is not being recognized as fname by genfromtxt. Although I
have googled every combination of terms I can think of, I am obviously
missing something.

As I have potentially hundreds of files to process, I would appreciate
pointers towards a solution to the problem.

Thanks in advance.
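
One possible direction, sketched under stated assumptions: the "No such file" message above names the Python script itself rather than a log file, and Python does not expand `${d}` inside a string literal, so the ligand name has to be handed over explicitly. The sketch assumes the shell loop would call `python3 DeltaGTable_V_sl.py "$d"`; the function name `extract_affinities` and its keyword defaults are hypothetical, chosen to mirror the values used in the working script:

```python
import sys

import numpy as np


def extract_affinities(name, skip_header=27, skip_footer=1):
    """Read column 1 of '<name>.log' and write it to '<name>.dG'."""
    data = np.genfromtxt(f"{name}.log", usecols=1,
                         skip_header=skip_header, skip_footer=skip_footer)
    # One value per line, with the ligand name as the header comment.
    np.savetxt(f"{name}.dG", data, fmt='%12.9f', header=name)
    return data


if __name__ == "__main__" and len(sys.argv) == 2:
    # The ligand name arrives as the first command-line argument,
    # e.g. python3 DeltaGTable_V_sl.py 14-7
    print(extract_affinities(sys.argv[1]))
```

The script also has to exist under the name used in the loop and be in the directory the loop runs from (or be called with its full path).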

On 10/08/2019 10:49 AM, Stephen P. Molnar wrote:

> Many thanks or your kind replies.
>
> I really appreciate your suggestions.
>
> On 10/08/2019 09:44 AM, Andras Deak wrote:
>> PS. if you just want to specify the width of the fields you wouldn't
>> have to convert anything, because you can specify the size and
>> justification of a %s format. But arguably having float data as floats
>> is more natural anyway.
>>
>> On Tue, Oct 8, 2019 at 3:42 PM Andras Deak <[hidden email]>
>> wrote:
>>> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar
>>> <[hidden email]> wrote:
>>>> I am embarrassed to be asking this question, but I have exhausted
>>>> Google
>>>> at this point .
>>>>
>>>> I have a number of identically formatted text files from which I
>>>> want to
>>>> extract data, as an example (hopefully, putting these in as quotes
>>>> will
>>>> persevere the format):
>>>>
>>>>> =======================================================================
>>>>>
>>>>> PSOVina version 2.0
>>>>> Giotto H. K. Tai & Shirley W. I. Siu
>>>>>
>>>>> Computational Biology and Bioinformatics Lab
>>>>> University of Macau
>>>>>
>>>>> Visit http://cbbio.cis.umac.mo for more information.
>>>>>
>>>>> PSOVina was developed based on the framework of AutoDock Vina.
>>>>>
>>>>> For more information about Vina, please visit
>>>>> http://vina.scripps.edu.
>>>>>
>>>>> =======================================================================
>>>>>
>>>>>
>>>>> Output will be 13-7_out.pdbqt
>>>>> Reading input ... done.
>>>>> Setting up the scoring function ... done.
>>>>> Analyzing the binding site ... done.
>>>>> Using random seed: 1828390527
>>>>> Performing search ... done.
>>>>>
>>>>> Refining results ... done.
>>>>>
>>>>> mode |   affinity | dist from best mode
>>>>>       | (kcal/mol) | rmsd l.b.| rmsd u.b.
>>>>> -----+------------+----------+----------
>>>>>     1    -8.862004149      0.000      0.000
>>>>>     2    -8.403522829      2.992      6.553
>>>>>     3    -8.401384636      2.707      5.220
>>>>>     4    -7.886402037      4.907      6.862
>>>>>     5    -7.845519031      3.233      5.915
>>>>>     6    -7.837434227      3.954      5.641
>>>>>     7    -7.834584887      3.188      7.294
>>>>>     8    -7.694395765      3.746      7.553
>>>>>     9    -7.691211177      3.536      5.745
>>>>>    10    -7.670759445      3.698      7.587
>>>>>    11    -7.661882758      4.882      7.044
>>>>>    12    -7.636280303      2.347      3.284
>>>>>    13    -7.635788052      3.511      6.250
>>>>>    14    -7.611175249      2.427      3.449
>>>>>    15    -7.586368357      2.142      2.864
>>>>>    16    -7.531307666      2.976      4.980
>>>>>    17    -7.520501084      3.085      5.775
>>>>>    18    -7.512906514      4.220      7.672
>>>>>    19    -7.307403528      3.240      4.354
>>>>>    20    -7.256063348      3.694      7.252
>>>>> Writing output ... done.
>>>>    At this point, my python script consists of only the following:
>>>>
>>>>> #!/usr/bin/env python3
>>>>> # -*- coding: utf-8 -*-
>>>>> """
>>>>>
>>>>> Created on Tue Sep 24 07:51:11 2019
>>>>>
>>>>> """
>>>>> import numpy as np
>>>>>
>>>>> data = []
>>>>>
>>>>> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
>>>>> skip_header=27, skip_footer=1, encoding=None)
>>>>>
>>>>> print(data)
>>>>>
>>>>> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
>>>> The problem lies in the np.savetxt line; on execution I get:
>>>>
>>>>> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
>>>>>
>>>>> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
>>>>>
>>>>> current_namespace=True)
>>>>> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>>>>>   '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>>>>>   '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>>>>>   '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>>>>>   '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
>>>>> Traceback (most recent call last):
>>>>>
>>>>>    File
>>>>> "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
>>>>>
>>>>> line 16, in <module>
>>>>>      np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
>>>>>
>>>>>    File "<__array_function__ internals>", line 6, in savetxt
>>>>>
>>>>>    File
>>>>> "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
>>>>>
>>>>> line 1438, in savetxt
>>>>>      % (str(X.dtype), format))
>>>>>
>>>>> TypeError: Mismatch between array dtype ('<U12') and format specifier
>>>>> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>>> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>>> %16.9f')
>>>> The data is in the data file, but the only entry in '13-7', the saved
>>>> file, is the label. Obviously, the error is in the format argument.
>>> Hi,
>>>
>>> One problem is the format: the error is telling you that you have
>>> strings in your array (compare the `'<U12'` dtype and the output of
>>> your `print(data)` call with strings inside), whereas %16.9f can only
>>> be used to format floats (f for float). You would first have to
>>> convert your array of strings to an array of numbers. I don't usually use
>>> genfromtxt so I'm not sure how you can make it return floats for you
>>> in the first place, but I suspect `dtype=None` in the call to
>>> genfromtxt might be responsible. In any case making it return numbers
>>> should be the easier case.
>>> The second problem is that you should make sure you mean `[data]` in
>>> the call to savetxt. As it is now this would give you a 2d array of
>>> shape (1, 20), and the output would correspondingly contain a single
>>> row of 20 values (hence the 20 instances of '%16.9f' in the error
>>> message). In case you meant to print one value per row in a single
>>> column, you should drop the brackets around `data`:
>>> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
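The two points above can be reproduced in a small sketch. It assumes three made-up string values in place of the real log column and writes to in-memory `io.StringIO` buffers rather than to a file named '13-7'; the variable names are illustrative only:

```python
import io

import numpy as np

# Hypothetical values standing in for the affinity column read as strings.
strings = np.array(["-8.862004149", "-8.403522829", "-8.401384636"])
print(strings.dtype)  # <U12

# A float format cannot be applied to a string array: savetxt raises TypeError.
error_message = None
try:
    np.savetxt(io.StringIO(), strings, fmt="%16.9f")
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Converting to float first makes '%16.9f' valid.
values = strings.astype(float)

# A 1-D array is written as one value per line.
column = io.StringIO()
np.savetxt(column, values, fmt="%16.9f", header="13-7")
column_text = column.getvalue()

# Wrapping the array in a list gives shape (1, 3): a single row of 3 values.
row = io.StringIO()
np.savetxt(row, [values], fmt="%16.9f", header="13-7")
row_text = row.getvalue()

print(column_text)
print(row_text)
```

If `genfromtxt` is called without `dtype=None`, it should return floats directly and the `astype` step becomes unnecessary.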
>>>
>>> And just a personal note, but I'd find an output file named '13-7' to
>>> be a bit surprising. Perhaps some extension or prefix would help
>>> organize these files?
>>> Regards,
>>>
>>> András
>>>
>>>> Help will be much appreciated.
>>>>
>>>> Thanks in advance.
>>>>
>>>> --
>>>> Stephen P. Molnar, Ph.D.
>>>> www.molecular-modeling.net
>>>> 614.312.7528 (c)
>>>> Skype:  smolnar1
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> [hidden email]
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>

--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1


Re: Problem with np.savetxt

Eric Wieser
You're trying to read a file literally named `${d}.log`, which is unlikely to be the name of your file. `${}` is bash syntax, not Python syntax.

This has drifted out of NumPy territory and into "how to coordinate between bash and Python" territory. I'd recommend asking a wider Python audience on Stack Overflow, where you'll get a faster response.
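One common way to bridge the two, sketched here under assumptions: the shell loop passes the ligand name as a command-line argument and the Python script reads it from `sys.argv`. The function name `write_affinities` and the `directory` parameter are hypothetical. The `genfromtxt` and `savetxt` arguments are copied from the quoted script.

```python
import sys
from pathlib import Path

import numpy as np


def write_affinities(name, directory="."):
    """Read <name>.log and write its affinity column to <name>.dG."""
    directory = Path(directory)
    # 'name' is an ordinary Python string here; no ${d} expansion is involved.
    data = np.genfromtxt(directory / f"{name}.log", usecols=1,
                         skip_header=27, skip_footer=1, encoding=None)
    out_path = directory / f"{name}.dG"
    np.savetxt(out_path, data, fmt="%12.9f", header=name)
    return out_path


if __name__ == "__main__" and len(sys.argv) > 1:
    # Intended to be called from the shell loop with the ligand name, e.g.
    #   python3 DeltaGTable_V_sl.py "${d}"
    print(write_affinities(sys.argv[1]))
```

Separately, the `can't open file 'DeltaGTable_V_sl.py'` messages in the quoted output mean Python could not find a script with that name in the current directory. The earlier `runfile` line refers to `DeltaGTable_V_s.py`, so the file name may be worth checking first.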

Eric

On Thu, 10 Oct 2019 at 15:11, Stephen P. Molnar <[hidden email]> wrote:
I am slowly and not quickly stumbling forward, but at this point my
degree of mental entropy (confusion) is monumental.

This works:

> import numpy as np
>
> print('${d}')
>
> data = np.genfromtxt("14-7.log", usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
>
> print(data)
>
> np.savetxt('14-7.dG', data, fmt='%12.9f', header='14-7')
> print(data)

which produces:

> runfile('/home/comp/Apps/Python/PsoVina/DeltaGTable_V_s.py',
> wdir='/home/comp/Apps/Python/PsoVina', current_namespace=True)
> ${d}
> [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
>  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
>  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
>  -7.72254029 -7.72034674]
> [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
>  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
>  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
>  -7.72254029 -7.72034674]
Note: the print statements are a quick check of the output, which is:

> # 14-7
> -9.960902669
> -8.979504781
> -8.942611364
> -8.915523010
> -8.736508831
> -8.663387139
> -8.410739711
> -8.389146347
> -8.296798909
> -8.168454106
> -8.127990818
> -8.127103774
> -7.979090739
> -7.941872682
> -7.900766215
> -7.881485228
> -7.837826485
> -7.815909505
> -7.722540286
> -7.720346742
Also, this bash script works:

> #!/bin/bash
>
> # Run.dG.list_1
>
> while IFS= read -r d
> do
>     echo "${d}.log"
>
> done <ligand.list
which returns the four log file names:

> 14-7.log
> 15-7.log
> 18-7.log
> C-VX3.log


But, if I run this bash script:

> #!/bin/bash
>
> # Run.dG.list_1
>
> while IFS= read -r d
> do
>     echo "${d}.log"
>     python3 DeltaGTable_V_sl.py
>
>
> done <ligand.list
>
where DeltaGTable_V_sl.py is:

> import numpy as np
>
> print('${d}')
>
> data = np.genfromtxt('${d}.log', usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
> print(data)
>
> np.savetxt('${d}.dG', data, fmt='%12.9f', header='${d}')
> print(data.dG)

I get:

> (base) comp@AbNormal:~/Apps/Python/PsoVina$ sh ./Run.dG.list_1.sh
> 14-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> 15-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> 18-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> C-VX3.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory

So, it would appear that the log file labels are in the workspace, but
'${d}.log' is not being recognized as fname by genfromtxt. Although I
have googled every combination of terms I can think of, I am obviously
missing something.

As I have potentially hundreds of files to process, I would appreciate
pointers towards a solution to the problem.

Thanks in advance.


Fwd: Re: Problem with np.savetxt

Stephen P. Molnar
In reply to this post by Stephen P. Molnar



-------- Forwarded Message --------
Subject: Re: [Numpy-discussion] Problem with np.savetxt
Date: Thu, 10 Oct 2019 10:10:58 -0400
From: Stephen P. Molnar [hidden email]
To: [hidden email]

