1 year ago
#357118
lok6666
Error shows up when using df.to_parquet("filename")
I want to save the dataset as a parquet file called power.parquet, so I use df.to_parquet(<filename>). But it gives me this error: "ValueError: Error converting column "Global_reactive_power" to bytes using encoding UTF8. Original error: bad argument type for built-in operation". I have installed the fastparquet package.
from fastparquet import write, ParquetFile
dat.to_parquet("power.parquet")
df_parquet = ParquetFile("power.parquet").to_pandas()
df_parquet.head() # Test your final value
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 259, in convert
out = array_encode_utf8(data)
File "fastparquet/speedups.pyx", line 50, in fastparquet.speedups.array_encode_utf8
TypeError: bad argument type for built-in operation
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/folders/4f/bm2th1p56tz4rq_zffc8g3940000gn/T/ipykernel_85477/3080656655.py", line 1, in <module>
dat.to_parquet("power.parquet", compression="GZIP")
File "/opt/anaconda3/lib/python3.9/site-packages/dask/dataframe/core.py", line 4560, in to_parquet
return to_parquet(self, path, *args, **kwargs)
File "/opt/anaconda3/lib/python3.9/site-packages/dask/dataframe/io/parquet/core.py", line 732, in to_parquet
return compute_as_if_collection(
File "/opt/anaconda3/lib/python3.9/site-packages/dask/base.py", line 315, in compute_as_if_collection
return schedule(dsk2, keys, **kwargs)
File "/opt/anaconda3/lib/python3.9/site-packages/dask/threaded.py", line 79, in get
results = get_async(
File "/opt/anaconda3/lib/python3.9/site-packages/dask/local.py", line 507, in get_async
raise_exception(exc, tb)
File "/opt/anaconda3/lib/python3.9/site-packages/dask/local.py", line 315, in reraise
raise exc
File "/opt/anaconda3/lib/python3.9/site-packages/dask/local.py", line 220, in execute_task
result = _execute_task(task, data)
File "/opt/anaconda3/lib/python3.9/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/opt/anaconda3/lib/python3.9/site-packages/dask/utils.py", line 35, in apply
return func(*args, **kwargs)
File "/opt/anaconda3/lib/python3.9/site-packages/dask/dataframe/io/parquet/fastparquet.py", line 1167, in write_partition
rg = make_part_file(
File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 716, in make_part_file
rg = make_row_group(f, data, schema, compression=compression,
File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 701, in make_row_group
chunk = write_column(f, coldata, column,
File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 554, in write_column
repetition_data, definition_data, encode[encoding](data, selement), 8 * b'\x00'
File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 354, in encode_plain
out = convert(data, se)
File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 284, in convert
raise ValueError('Error converting column "%s" to bytes using '
ValueError: Error converting column "Global_reactive_power" to bytes using encoding UTF8. Original error: bad argument type for built-in operation
I tried adding object_encoding = "bytes". How can I solve this problem?
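The traceback shows fastparquet trying to encode "Global_reactive_power" as UTF8 text, which suggests (this is an assumption, not something confirmed above) that the column has object dtype and mixes strings with non-string values such as floats or None. A minimal sketch for checking that, using a hypothetical sample list in place of the real column:

```python
from collections import Counter
import math


def type_counts(values):
    """Count how many values of each Python type an iterable contains."""
    return Counter(type(v).__name__ for v in values)


def to_float_or_nan(value):
    """Return value as a float, or NaN if it cannot be parsed (e.g. "?")."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan


# Hypothetical sample standing in for the "Global_reactive_power" column;
# the real data may look different.
sample = ["0.418", "0.436", "?", 0.502, None]

counts = type_counts(sample)                    # which types are mixed together
cleaned = [to_float_or_nan(v) for v in sample]  # every entry becomes a float
cleaned_counts = type_counts(cleaned)

print(counts)
print(cleaned_counts)
```

If the real column turns out to be mixed like this, converting it to a numeric dtype before writing, for example with pandas' pd.to_numeric(..., errors="coerce"), may avoid the UTF8 encoding step altogether.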
python
dataframe
data-science
parquet
writefile
0 Answers