
#357118


lok6666

Error shows up when using df.to_parquet("filename")

I want to save the dataset as a Parquet file called power.parquet, and I use df.to_parquet(&lt;filename&gt;). But it gives me this error: "ValueError: Error converting column "Global_reactive_power" to bytes using encoding UTF8. Original error: bad argument type for built-in operation". I have the fastparquet package installed.

from fastparquet import ParquetFile

# Write the DataFrame to Parquet (this is where the error is raised)
dat.to_parquet("power.parquet")

# Read it back to check the result
df_parquet = ParquetFile("power.parquet").to_pandas()
df_parquet.head()  # Test your final value

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 259, in convert
    out = array_encode_utf8(data)
  File "fastparquet/speedups.pyx", line 50, in fastparquet.speedups.array_encode_utf8
TypeError: bad argument type for built-in operation

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/folders/4f/bm2th1p56tz4rq_zffc8g3940000gn/T/ipykernel_85477/3080656655.py", line 1, in <module>
    dat.to_parquet("power.parquet", compression="GZIP")
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/dataframe/core.py", line 4560, in to_parquet
    return to_parquet(self, path, *args, **kwargs)
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/dataframe/io/parquet/core.py", line 732, in to_parquet
    return compute_as_if_collection(
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/base.py", line 315, in compute_as_if_collection
    return schedule(dsk2, keys, **kwargs)
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/threaded.py", line 79, in get
    results = get_async(
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/local.py", line 507, in get_async
    raise_exception(exc, tb)
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/local.py", line 315, in reraise
    raise exc
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/local.py", line 220, in execute_task
    result = _execute_task(task, data)
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/utils.py", line 35, in apply
    return func(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.9/site-packages/dask/dataframe/io/parquet/fastparquet.py", line 1167, in write_partition
    rg = make_part_file(
  File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 716, in make_part_file
    rg = make_row_group(f, data, schema, compression=compression,
  File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 701, in make_row_group
    chunk = write_column(f, coldata, column,
  File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 554, in write_column
    repetition_data, definition_data, encode[encoding](data, selement), 8 * b'\x00'
  File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 354, in encode_plain
    out = convert(data, se)
  File "/opt/anaconda3/lib/python3.9/site-packages/fastparquet/writer.py", line 284, in convert
    raise ValueError('Error converting column "%s" to bytes using '
ValueError: Error converting column "Global_reactive_power" to bytes using encoding UTF8. Original error: bad argument type for built-in operation

I tried adding object_coding = "bytes", but that did not help. How can I solve this problem?
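For what it's worth, this error usually means the column has object dtype but contains values that are not all Python strings, so fastparquet's UTF8 encoder chokes on the non-string entries. A pandas-only sketch of diagnosing and fixing such a column (the sample data below is made up to mimic the problem, not taken from the actual dataset):

```python
import pandas as pd

# Hypothetical stand-in data: an object-dtype column mixing str values,
# a float, and a "?" placeholder, as often happens after reading a CSV
# with missing-value markers (an assumption about the real data).
dat = pd.DataFrame({"Global_reactive_power": ["0.418", 0.436, "?"]})

# fastparquet's UTF8 encoder needs every value in an object column to be
# a Python str; listing the element types exposes the offending entries.
print(dat["Global_reactive_power"].map(type).value_counts())

# Coercing to numeric (invalid entries such as "?" become NaN) yields a
# float64 column, which is written as a numeric Parquet type instead of
# UTF8-encoded strings.
dat["Global_reactive_power"] = pd.to_numeric(
    dat["Global_reactive_power"], errors="coerce"
)
print(dat["Global_reactive_power"].dtype)
```

After a conversion like this, dat.to_parquet("power.parquet") should no longer need to UTF8-encode the column at all.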

Tags: python, dataframe, data-science, parquet, writefile

0 Answers
