Np.genfromtxt Problem

Stephen P. Molnar

I have a snippet of code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""

Created on Tue Sep 24 07:51:11 2019

"""
import numpy as np

files = []

data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
skip_footer=1, encoding=None)

print(data)


If `files` is a single file, the code generates the data that I want.
However, I have a list of files that I want to process. According to the
numpy.genfromtxt documentation, fname can be a "File, filename, list, or
generator to read."  If I use [13-7a_apo-1acl.RMSD    13-7_apo-1acl.RMSD
14-7_apo-1acl.RMSD    15-7_apo-1acl.RMSD    17-7_apo-1acl.RMSD ], I get the
error:

runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test/DeltaGTable_s.py',
wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test',
current_namespace=True)
Traceback (most recent call last):

   File
"/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test/DeltaGTable_s.py",
line 12, in <module>
     data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
skip_footer=1, encoding=None)

   File
"/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
line 1762, in genfromtxt
     next(fhd)

StopIteration

I have tried every combination of search terms that I can think of,
without success, to find an example of how to make this work.

How can I make this work?

Thanks in advance.

--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1

_______________________________________________
NumPy-Discussion mailing list
[hidden email]
https://mail.python.org/mailman/listinfo/numpy-discussion

Re: Np.genfromtxt Problem

Andras Deak
On Fri, Oct 4, 2019 at 7:31 PM Stephen P. Molnar <[hidden email]> wrote:

>
>
> I have a snippet of code
>
> #!/usr/bin/env python3
> # -*- coding: utf-8 -*-
> """
>
> Created on Tue Sep 24 07:51:11 2019
>
> """
> import numpy as np
>
> files = []
>
> data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
> skip_footer=1, encoding=None)
>
> print(data)
>
>
> If `files` is a single file, the code generates the data that I want.
> However, I have a list of files that I want to process. According to the
> numpy.genfromtxt documentation, fname can be a "File, filename, list, or
> generator to read."  If I use [13-7a_apo-1acl.RMSD    13-7_apo-1acl.RMSD
> 14-7_apo-1acl.RMSD    15-7_apo-1acl.RMSD    17-7_apo-1acl.RMSD ], I get the
> error:

Hi Stephen,

As far as I know genfromtxt is designed to read the contents of a
single file. Consider this quote from the docs for the first
parameter:
"The strings in a list or produced by a generator are treated as lines."
And the general description of the function says
"Load data from a text file, with missing values handled as specified."
("a text file", singular)
So if I understand correctly the list case is there so that you can
pass `f.readlines()` or equivalent into genfromtxt. From a
higher-level standpoint, how would reading multiple files behave if
the files have different structure, and what type and shape should the
function return in that case?
If one file can be read just fine then I suggest looping over them to
read each, one after the other. You can then tell python what to do
with each returned array and so it doesn't have to guess.
Regards,

András
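
A minimal sketch of the two points in this reply, assuming each file holds eight header lines, whitespace-separated data rows, and one footer line, as the `skip_header=8, skip_footer=1` arguments in the question imply. The helper name `read_column` is made up for illustration and is not part of NumPy. The first part shows that a list of strings is parsed as lines of text, not as filenames. That is presumably why, in the NumPy version shown in the traceback, a list of five filenames combined with `skip_header=8` runs out of "lines" and raises `StopIteration`. The second part loops over the filenames and calls genfromtxt once per file.

```python
import numpy as np

# A list of strings is treated as the lines of one text "file",
# so these two strings are parsed as two rows of four columns.
lines = ["1 2 3 4", "5 6 7 8"]
print(np.genfromtxt(lines, usecols=(3,)))


def read_column(filenames, column=3, skip_header=8, skip_footer=1):
    """Call genfromtxt once per file and return {filename: array}.

    `read_column` is a hypothetical helper; the defaults mirror the
    arguments used in the original question.
    """
    results = {}
    for name in filenames:
        results[name] = np.genfromtxt(
            name,
            usecols=(column,),
            dtype=None,
            skip_header=skip_header,
            skip_footer=skip_footer,
            encoding=None,
        )
    return results
```

Keeping one array per filename leaves the decision about how to combine them (stacking, concatenating, or comparing) to the caller.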





Re: Np.genfromtxt Problem

Derek Homeier
On 5 Oct 2019, at 12:15 am, Andras Deak <[hidden email]> wrote:

> Hi Stephen,
>
> As far as I know genfromtxt is designed to read the contents of a
> single file. Consider this quote from the docs for the first
> parameter:
> "The strings in a list or produced by a generator are treated as lines."
> And the general description of the function says
> "Load data from a text file, with missing values handled as specified."
> ("a text file", singular)
> So if I understand correctly the list case is there so that you can
> pass `f.readlines()` or equivalent into genfromtxt. From a
> higher-level standpoint, how would reading multiple files behave if
> the files have different structure, and what type and shape should the
> function return in that case?
> If one file can be read just fine then I suggest looping over them to
> read each, one after the other. You can then tell python what to do
> with each returned array and so it doesn't have to guess.

The above is correct in that genfromtxt expects a single file or file-like object.
That said, assuming all input files have compatible format (i.e. identical no. of
columns with matching dtypes), which really is the only case that would make
sense to pass to genfromtxt, you could try creating a pipe to concatenate all
input files into a single object. Something like this might work:

import os

fobj = os.popen('cat 1[3457]-7a_apo-1acl.RMSD')
data = np.genfromtxt(fobj, usecols=(3), dtype=None, …)

However, the multiple headers and footers in your concatenated file may cause
trouble here - maybe you can find a way to remove them in the popen call with some
'[e]grep -v' artistry. Depending on this, the loop over input files might be the easier
solution.

HTH,
                                        Derek
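
A pure-Python variant of the concatenation idea, sketched under the assumption that every file has exactly eight header lines and one footer line (taken from the `skip_header=8, skip_footer=1` arguments in the question) and no trailing blank lines. The helper names `data_lines` and `read_concatenated` are made up for illustration. Instead of a shell pipe, it slices the header and footer off each file and hands the remaining lines to genfromtxt as a single list:

```python
import numpy as np


def data_lines(filenames, skip_header=8, skip_footer=1):
    """Yield only the data lines of each file, in the order given.

    Hypothetical helper: assumes a fixed number of header and footer
    lines per file.
    """
    for name in filenames:
        with open(name, encoding="utf-8") as handle:
            lines = handle.readlines()
        # Drop the per-file header and footer before concatenating.
        yield from lines[skip_header:len(lines) - skip_footer]


def read_concatenated(filenames, column=3):
    """Parse the data lines of all files as if they were one file."""
    # genfromtxt treats the strings in a list as lines of text.
    combined = list(data_lines(filenames))
    return np.genfromtxt(combined, usecols=(column,), dtype=None,
                         encoding=None)
```

The result is one flat array holding the selected column of every file in order. Which file each value came from is lost, which is one reason the per-file loop may be the easier solution.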


Re: Np.genfromtxt Problem

Stefan van der Walt
In reply to this post by Stephen P. Molnar
On Fri, Oct 4, 2019, at 10:31, Stephen P. Molnar wrote:
> data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
> skip_footer=1, encoding=None)

This seems like a good use case for `dask.dataframe.read_csv` [0].

Stéfan

[0] https://examples.dask.org/dataframes/01-data-access.html#Read-CSV-files

Re: Np.genfromtxt Problem

Stephen P. Molnar


On 10/04/2019 08:19 PM, Stefan van der Walt wrote:

> On Fri, Oct 4, 2019, at 10:31, Stephen P. Molnar wrote:
>> data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
>> skip_footer=1, encoding=None)
> This seems like a good use case for `dask.dataframe.read_csv` [0].
>
> Stéfan
>
> [0] https://examples.dask.org/dataframes/01-data-access.html#Read-CSV-files

I appreciate the responses that I've received.

I feel that I must apologize for the one important fact that, it would
appear, I failed to mention: all of the files that I wish to process are
identical.

--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1
