1 year ago

#243734

test-img

Ben Konz

Python Beam SqlTransform unknown coder exception when getting data from PTransform

There's this PTransform that is mapping data to a beam.Row:

class MapToBeamRow(beam.PTransform):
    def expand(self, pcoll: PCollection[Any]) -> PCollection[beam.Row]:
        return (
            pcoll
            | beam.Map(lambda x: beam.Row(foo='foo'))
        )

and a PTransform that is using that beam.Row to apply a SqlTransform

class FilterValuesInSegment(beam.PTransform):
    def __init__(self, where_clause: str):
        self.where_clause = where_clause

    def expand(self, pcoll: PCollection[beam.Row]) -> PCollection[Any]:
        return (
                pcoll
                # | beam.Map(lambda x: beam.Row(foo='foo'))
                | SqlTransform("SELECT * FROM PCOLLECTION")
        )

it's being called in a unit test via:

with TestPipeline() as p:
  p
  | beam.Create(records)
  | MapToBeamRow()
  | FilterValuesInSegment()
  | beam.Map(print)

running this code produces a java.lang.IllegalArgumentException: Unknown Coder URN beam:coder:pickled_python:v1. Known URNs: [...] exception

the strange thing is that if I uncomment the line in FilterValuesInSegment, the code does work. Why is it running into a coder exception?

python

apache-beam

beam-sql

0 Answers

Your Answer

Accepted video resources