1 year ago
#332404
user19930511
pyspark.pandas.exceptions.PandasNotImplementedError: The method `pd.Series.__iter__()` is not implemented
I am trying to replace the pandas library with the pyspark.pandas library.
I tried the following (note: df is a pyspark.pandas DataFrame):
import pyspark.pandas as pd
print(set(df["horizon"].unique()))
But I got the error below:
    print(set(df["horizon"].unique()))
  File "C:\Users\abc\Anaconda3\envs\env1\lib\site-packages\pyspark\pandas\series.py", line 6328, in __iter__
    return MissingPandasLikeSeries.__iter__(self)
  File "C:\Users\abc\Anaconda3\envs\env1\lib\site-packages\pyspark\pandas\missing\__init__.py", line 24, in unsupported_function
    class_name=class_name, method_name=method_name, reason=reason
pyspark.pandas.exceptions.PandasNotImplementedError: The method `pd.Series.__iter__()` is not implemented. If you want to collect your data as an NumPy array, use 'to_numpy()' instead.
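For reference, the hint at the end of the error message can be followed roughly like this. This is only a minimal sketch, not a confirmed fix: the helper name `unique_as_set` is hypothetical, and it assumes the argument is a pyspark.pandas Series, where `unique()` returns another pandas-on-Spark Series that cannot be iterated directly.

```python
def unique_as_set(series):
    """Return the distinct values of a pandas-on-Spark Series as a Python set.

    Assumption: `series` is a pyspark.pandas Series, so unique() gives back
    another pandas-on-Spark Series rather than a NumPy array.
    """
    # to_numpy() collects the distinct values onto the driver, so this is
    # only sensible when the number of distinct values fits in memory.
    return set(series.unique().to_numpy())


# Hypothetical usage with the DataFrame from the question:
# print(unique_as_set(df["horizon"]))
```

Calling `set()` on the collected array works because a NumPy array is iterable, whereas `set()` on the Series itself triggers the unimplemented `__iter__` shown in the traceback.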
python
pandas
dataframe
apache-spark
pyspark-pandas
0 Answers