How can I read a large number of files with Pandas?
Number of files: 894
Total file size: 22.2 GB
I have to do machine learning on many CSV files, but there is not enough memory to read them all at once.
HOON
Votes: 0
Answers: 2
Converting Spark dataframe to Dask dataframe
First, I'm writing a Spark DataFrame named calendar as a parquet file named cal.
calendar.write.parquet("/user/vusal.babashov/dataset/cal", mode="overwrite")
Then, I'm copying it from Hadoo...
Vusal
Votes: 0
Answers: 1
Does dask compute store results?
Consider the following code:
import dask
import dask.dataframe as dd
import pandas as pd
data_dict = {'data1':[1,2,3,4,5,6,7,8,9,10]}
df_pd = pd.DataFrame(data_dict)
df_dask = dd.from_pandas(df...
NNN
Votes: 0
Answers: 2
Using dask to return more than one dataframe
I am using read_csv() to read a long list of CSV files and return two dataframes.
I have managed to speed up this action by using dask. Unfortunately, I have not been able to return multiple variables...
Tanjil
Votes: 0
Answers: 1
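One way to get more than one dataframe back from a parallel read is `dask.delayed`: a delayed function may return a tuple, and `dask.compute` resolves all the tuples at once. The sketch below uses a hypothetical `load` function that fabricates a frame in place of the real `read_csv` call, then splits it into the two dataframes the question asks for.

```python
import dask
import pandas as pd


@dask.delayed
def load(path):
    # Hypothetical loader; a real version would call pd.read_csv(path).
    df = pd.DataFrame({"x": [1, 2, 3]})
    # Return the two frames as a tuple from one delayed task.
    return df[df.x < 3], df[df.x >= 3]


pairs = [load(f"file{i}.csv") for i in range(2)]

# dask.compute resolves every delayed object, preserving the tuple structure.
results = dask.compute(*pairs)
firsts, seconds = zip(*results)
print(len(firsts), len(seconds))  # 2 2
```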