1 year ago

#385030


jharna

Can't figure out how to fix the error "An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe."

How can I fix this error, which I get when I write a mapping function in PySpark? The error is: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.

def user_rep(x):
    acc = str(x[x.find('" Id="')+6:x.find('" LastAccessDate=')].encode('utf-8'))

    if x.find('Reputation="')==-1:
        return [acc,0]
    else:
        rep = int(x[x.find('Reputation="')+12:x.find('" UpVotes=')])
        return [acc,rep]
lines_user = sc.textFile(localpath('/spark-stats-data/allUsers/'))
rows_user=lines_user.filter(lambda x: isrow(x)) 
reputation = rows_user.map(user_rep)
reputation.count()

reputation_user = reputation.map(lambda acc,rep:(rep,acc)).sortByKey(ascending=False)
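One possible cause, offered as a sketch rather than a confirmed diagnosis: `RDD.map` passes each element to the function as a single argument, so `lambda acc,rep: ...` raises a `TypeError` on the workers, and that can surface on the driver as a `Py4JJavaError`. The version below unpacks the pair inside the function instead. It also parses the attributes with regular expressions rather than fixed offsets, which is a swapped-in technique, not the original slicing. `SAMPLE_ROW` and `swap` are hypothetical names, and the attribute layout is assumed from the slicing in `user_rep`.

```python
import re

# Hypothetical sample row; the real attribute order in allUsers may differ.
SAMPLE_ROW = '<row Id="42" Reputation="101" LastAccessDate="2014-01-01" UpVotes="3" />'

def user_rep(line):
    """Return (account_id, reputation); reputation defaults to 0 when absent."""
    acc = re.search(r'\bId="([^"]*)"', line)
    rep = re.search(r'\bReputation="(-?\d+)"', line)
    return (acc.group(1) if acc else None, int(rep.group(1)) if rep else 0)

def swap(pair):
    """Take one (acc, rep) pair and return (rep, acc) so rep becomes the key."""
    acc, rep = pair
    return (rep, acc)

# With the RDDs from the question, these would be used as:
#   reputation = rows_user.map(user_rep)
#   reputation_user = reputation.map(swap).sortByKey(ascending=False)
```

The full Python traceback is usually embedded further down in the `Py4JJavaError` message, and it shows which function actually failed.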

dictionary

pyspark

syntax

rdd

0 Answers

