1 year ago
#122829

Murli Krishnan
Apache Beam Python - SQL Transform with named PCollection Issue
I am trying to execute the below code in which I am using Named Tuple for PCollection and SQL transform for doing a simple select.
As per the video link (4:06) : https://www.youtube.com/watch?v=zx4p-UNSmrA.
Instead of using PCOLLECTION in SQLTransform query, named PCollections can also be provided as below.
Code Block
class EmployeeType(typing.NamedTuple):
name:str
age:int
beam.coders.registry.register_coder(EmployeeType, beam.coders.RowCoder)
pcol = p | "Create" >> beam.Create([EmployeeType(name="ABC", age=10)]).with_output_types(EmployeeType)
(
{'a':pcol} | SqlTransform(
""" SELECT age FROM a """)
| "Map" >> beam.Map(lambda row: row.age)
| "Print" >> beam.Map(print)
)
p.run()
However the below code block errors out with error
Caused by: org.apache.beam.vendor.calcite.v1_28_0.org.apache.calcite.sql.validate.SqlValidatorException: Object 'a' not found
Apache Beam SDK used is 2.35.0, are there any known limitation in using named PCollection
google-cloud-dataflow
apache-beam
dataflow
apache-beam-internals
0 Answers
Your Answer