1 year ago

#332404


user19930511

pyspark.pandas.exceptions.PandasNotImplementedError: The method `pd.Series.__iter__()` is not implemented

I am trying to replace the pandas library with pyspark.pandas.

I tried the following (note: df is a pyspark.pandas DataFrame):

import pyspark.pandas as pd 
print(set(df["horizon"].unique()))

But I got the following error:

   print(set(df["horizon"].unique()))
  File "C:\Users\abc\Anaconda3\envs\env1\lib\site-packages\pyspark\pandas\series.py", line 6328, in __iter__
    return MissingPandasLikeSeries.__iter__(self)
  File "C:\Users\abc\Anaconda3\envs\env1\lib\site-packages\pyspark\pandas\missing\__init__.py", line 24, in unsupported_function
    class_name=class_name, method_name=method_name, reason=reason
pyspark.pandas.exceptions.PandasNotImplementedError: The method `pd.Series.__iter__()` is not implemented. If you want to collect your data as an NumPy array, use 'to_numpy()' instead.
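
The error message itself points at a possible workaround: `set()` needs to iterate over the Series, and pyspark.pandas does not implement `__iter__`, so the values have to be collected explicitly first. A minimal sketch, assuming the number of distinct values in the column is small enough to fit in driver memory (the helper name `unique_values_as_set` is hypothetical, not part of pyspark.pandas):

```python
def unique_values_as_set(series):
    """Return the distinct values of a pyspark.pandas Series as a Python set.

    Hypothetical helper, not a pyspark.pandas API. It relies on
    Series.unique() returning a pyspark.pandas Series and on
    Series.to_numpy() collecting that Series to the driver as a NumPy array.
    """
    # unique() still runs on Spark; only the distinct values are collected.
    distinct = series.unique()
    # to_numpy() pulls the data to the driver, so this assumes the result is small.
    values = distinct.to_numpy()
    # tolist() converts NumPy scalars to plain Python objects before building the set.
    return set(values.tolist())
```

With the `df` from the question, this would be called as `print(unique_values_as_set(df["horizon"]))`. Calling `unique()` before `to_numpy()` keeps the de-duplication on the Spark side, so only the distinct values are transferred rather than the whole column.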

python

pandas

dataframe

apache-spark

pyspark-pandas

0 Answers
