1 year ago
#385030
jharna
Can't figure out how to fix the error "An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe."
How can I fix this error, which appears when I write a mapping function in PySpark? The error I get is: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
def user_rep(x):
    acc = str(x[x.find('" Id="')+6:x.find('" LastAccessDate=')].encode('utf-8'))
    if x.find('Reputation="')==-1:
        return [acc,0]
    else:
        rep = int(x[x.find('Reputation="')+12:x.find('" UpVotes=')])
        return [acc,rep]

lines_user = sc.textFile(localpath('/spark-stats-data/allUsers/'))
rows_user=lines_user.filter(lambda x: isrow(x))
reputation = rows_user.map(user_rep)
reputation.count()
reputation_user = reputation.map(lambda acc,rep:(rep,acc)).sortByKey(ascending=False)
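One likely cause, judging only from the snippet above: `map` passes each record to the function as a single argument, so a lambda declared as `lambda acc,rep: ...` fails inside the worker with a Python `TypeError`, and Spark reports it wrapped in a `Py4JJavaError`. The sketch below is plain Python so it can be checked without a cluster. The sample row and the `swap` helper are hypothetical, made up for illustration, and the sketch assumes Python 3, where `str(s.encode('utf-8'))` would yield text like `b'42'`, so the `encode` call is dropped.

```python
def user_rep(line):
    """Return (account_id, reputation) parsed from one row of the users file."""
    # The account id sits between '" Id="' and '" LastAccessDate='.
    acc = line[line.find('" Id="') + 6:line.find('" LastAccessDate=')]
    if line.find('Reputation="') == -1:
        # No Reputation attribute on this row: treat it as zero.
        return (acc, 0)
    rep = int(line[line.find('Reputation="') + 12:line.find('" UpVotes=')])
    return (acc, rep)


def swap(pair):
    """Hypothetical helper: take ONE record and unpack it inside the function."""
    acc, rep = pair
    return (rep, acc)


# Hypothetical sample rows, shaped to match the attribute order the parser expects.
sample = '<row DownVotes="0" Id="42" LastAccessDate="2014-01-01" Reputation="101" UpVotes="3" />'
no_rep = '<row DownVotes="0" Id="7" LastAccessDate="2014-01-01" />'

record = user_rep(sample)
swapped = swap(record)
missing = user_rep(no_rep)
```

If this is indeed the problem, the Spark line would become something like `reputation.map(lambda pair: (pair[1], pair[0])).sortByKey(ascending=False)`. The Python traceback further down in the `Py4JJavaError` output usually names the actual exception, so it is worth reading past the first line of the error.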
dictionary
pyspark
syntax
rdd
0 Answers