The program reads some .csv data, performs computations on it, and then saves the resulting dataframe in a loop using the function
df.to_csv(filename, index=False, mode='a', compression="gzip")
When the saved file is read back in Python, the column types that the dataframe had in the loop should be preserved.
How can I read such a large file into one dataframe?
I tried to do this with read_csv from dask.dataframe (imported as dd), but the program fails with the following error:
Traceback (most recent call last):
  File "c:/Users/proeh/Downloads/tt/repo/main_programm.py", line 254, in <module>
    df = dd.read_csv("./genfiles/13_Apr_2022_17_18_04.gz")
  File "C:\Users\proeh\AppData\Local\Programs\Python\Python35\lib\site-packages\dask\dataframe\io\csv.py", line 578, in read
    **kwargs
  File "C:\Users\proeh\AppData\Local\Programs\Python\Python35\lib\site-packages\dask\dataframe\io\csv.py", line 444, in read_pandas
    head = reader(BytesIO(b_sample), **kwargs)
  File "C:\Users\proeh\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\proeh\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 429, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Users\proeh\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "C:\Users\proeh\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 1122, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Users\proeh\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\io\parsers.py", line 1853, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 542, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 782, in pandas._libs.parsers.TextReader._get_header
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
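The byte 0x8b at position 1 suggests that the reader received the raw gzip stream instead of decompressed text, since every gzip file starts with the two bytes 0x1f 0x8b. The following minimal sketch uses only the standard library and a made-up CSV payload (not the real file) to reproduce the same message:

```python
import gzip

# Hypothetical CSV payload standing in for the real file contents.
csv_text = "id,value\n1,3.5\n2,4.25\n"
raw = gzip.compress(csv_text.encode("utf-8"))

# Every gzip stream starts with the two magic bytes 0x1f 0x8b.
is_gzip = raw[:2] == b"\x1f\x8b"

# Decoding the compressed bytes directly as UTF-8 fails at byte 1 (0x8b),
# because 0x1f is valid ASCII but 0x8b cannot start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decompressing first gives back the original text.
restored = gzip.decompress(raw).decode("utf-8")
```

If this is what happens here, the file itself is probably fine and the reader only needs to be told that the input is gzip-compressed.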
Perhaps the problem is in how the data is saved, but the main goal is to be able to read the data back afterwards and use all the benefits of dataframes on a large amount of data (about a million records).
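One possible way to write the file in a loop and read it back with the column types intact is sketched below with plain pandas. The chunk contents, the column names, the dtype mapping, and the temporary file path are all assumptions made for illustration; the header is written only on the first append so that later chunks do not insert header rows into the middle of the data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical chunks standing in for the results produced in the loop.
chunks = [
    pd.DataFrame({"id": [1, 2], "score": [0.5, 1.5]}),
    pd.DataFrame({"id": [3, 4], "score": [2.5, 3.5]}),
]

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "result.csv.gz")

for i, chunk in enumerate(chunks):
    # Append each chunk as gzip; write the header only once.
    chunk.to_csv(path, index=False, mode="a", header=(i == 0), compression="gzip")

# CSV does not store column types, so pass them explicitly on read
# (this mapping is an assumption for the example).
dtypes = {"id": "int32", "score": "float32"}
df = pd.read_csv(path, compression="gzip", dtype=dtypes)
```

A million records normally fits in memory with pandas alone. If dask is still preferred, dd.read_csv also accepts compression="gzip"; because gzip files cannot be split into blocks, blocksize=None is usually needed as well, though this has not been checked against the dask version shown in the traceback.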