1 year ago

#338645


Rishabhg

KeyError: 'PYSPARK_GATEWAY_SECRET' when creating spark context inside aws lambda code

I have deployed a Lambda function, packaged as a Docker container, which uses Spark NLP. Spark NLP needs a Spark context, so my code starts with

sc = pyspark.SparkContext.getOrCreate()

I tested my Lambda locally and it worked fine.

On AWS I got this error:

Java gateway process exited before sending its port number

even though JAVA_HOME was properly set.

I found in the source code (https://github.com/apache/spark/blob/master/python/pyspark/java_gateway.py), in the launch_gateway method, that PySpark creates a temporary connection file and raises the above error if that file never appears (line 105).
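For context, the relevant branch of launch_gateway checks for the gateway environment variables first, and reads the secret unconditionally once the port is present. This is a paraphrased sketch of that logic (not the actual Spark source), which reproduces the KeyError from the title:

```python
import os

def read_gateway_env(environ):
    """Paraphrased sketch of the env-var shortcut in pyspark's launch_gateway:
    once PYSPARK_GATEWAY_PORT is set, the secret is read unconditionally."""
    if "PYSPARK_GATEWAY_PORT" in environ:
        port = int(environ["PYSPARK_GATEWAY_PORT"])
        secret = environ["PYSPARK_GATEWAY_SECRET"]  # raises KeyError if missing
        return port, secret
    # otherwise pyspark spawns its own JVM and waits for a temp connection file
    return None

# Setting only the port, as in the question, reproduces the error:
try:
    read_gateway_env({"PYSPARK_GATEWAY_PORT": "25333"})
except KeyError as e:
    print(e)  # prints 'PYSPARK_GATEWAY_SECRET'
```

So the port alone is not enough; the code path requires both variables or neither.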

Lambda's filesystem is read-only (only /tmp is writable), so the file cannot be created.
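Note that Lambda does expose one writable path, /tmp, and Python's tempfile module honors the TMPDIR variable, so the temp-file path used by launch_gateway can in principle be redirected there. A minimal sketch, assuming a Linux runtime:

```python
import os
import tempfile

# Lambda only allows writes under /tmp; tempfile consults TMPDIR when
# picking its default directory, so point it there before pyspark runs.
os.environ["TMPDIR"] = "/tmp"
tempfile.tempdir = None  # force tempfile to re-read the environment

# mkdtemp() is what pyspark's launch_gateway uses for its connection file dir
conn_dir = tempfile.mkdtemp()
print(conn_dir)  # e.g. /tmp/tmpXXXXXXXX
```

Whether this is enough for the full gateway launch inside Lambda is untested here; it only shows that the temp directory itself is controllable.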

So I am trying to pass the gateway port and gateway secret as environment variables instead.

I have set PYSPARK_GATEWAY_PORT=25333, which is the default value, but I cannot figure out how to get the PYSPARK_GATEWAY_SECRET.
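For anyone reading along: as far as I understand, PYSPARK_GATEWAY_SECRET is not a fixed value you can look up; it is the Py4J auth token the JVM gateway was started with, so whoever launches the gateway chooses it and must hand the same value to the Python side. A hedged sketch of that idea (secrets.token_hex is an illustration here, not PySpark's own generator):

```python
import os
import secrets

# The secret is an arbitrary shared token, not a well-known constant.
# If you launch the JVM gateway yourself, you pick the token...
token = secrets.token_hex(16)

# ...export it so pyspark's launch_gateway picks it up...
os.environ["PYSPARK_GATEWAY_PORT"] = "25333"
os.environ["PYSPARK_GATEWAY_SECRET"] = token

# ...and start the JVM gateway (e.g. a py4j GatewayServer) configured
# with the same token, so both sides authenticate against each other.
```

The open question is therefore how to start the JVM side with a known token inside the Lambda container.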

That is why I am getting the error:

KeyError: 'PYSPARK_GATEWAY_SECRET'

Tags: apache-spark, pyspark, aws-lambda, py4j, johnsnowlabs-spark-nlp

0 Answers
