1 year ago
#388842
DarkLeafyGreen
How to specify input locations parameter in dataflow pipeline?
I have a Dataflow template that I originally created via Dataprep. Now I want to move away from Dataprep, use plain Dataflow, and schedule jobs with Cloud Scheduler. In the GCP console, under Dataflow -> Jobs -> (select a job), there is an "Import as pipeline" feature, which I used to create a batch job pipeline.
In the multi-step form, I cannot get past the step that specifies the input locations:
It wants me to specify locations matching this regex:
[ \t\n\x0B\f\r]*\{[ \t\n\x0B\f\r]*((.|\r|\n)*".*"[ \t\n\x0B\f\r]*:[ \t\n\x0B\f\r]*".*"(.|\r|\n)*){0}[ \t\n\x0B\f\r]*\}[ \t\n\x0B\f\r]*
I tried with:
{
"location1": "project:bq_dataset.bq_table1",
"location10": "project:bq_dataset.bq_table10",
"location17": "project:bq_dataset.bq_table17"
}
https://regex101.com/r/rTUfHH/1
In fact, as others pointed out in the comments, it seems that the only input it accepts is {}
Any ideas why that is?
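Note that the `{0}` quantifier in the regex repeats the key/value group exactly zero times, which would mean only whitespace is allowed between the braces. A minimal sketch that checks candidate inputs against the pattern (assuming Python's `re` treats this pattern the same way as the form's validator):

```python
import re

# The regex from the Dataflow form, copied verbatim.
PATTERN = (
    r'[ \t\n\x0B\f\r]*\{[ \t\n\x0B\f\r]*'
    r'((.|\r|\n)*".*"[ \t\n\x0B\f\r]*:[ \t\n\x0B\f\r]*".*"(.|\r|\n)*){0}'
    r'[ \t\n\x0B\f\r]*\}[ \t\n\x0B\f\r]*'
)

def accepts(candidate: str) -> bool:
    """Return True if the whole string matches the form's regex."""
    return re.fullmatch(PATTERN, candidate) is not None

print(accepts('{}'))    # True  - empty braces pass
print(accepts('{ }'))   # True  - whitespace inside braces passes
print(accepts('{"location1": "project:bq_dataset.bq_table1"}'))  # False
```

This is consistent with the observation above: any input containing an actual `"key": "value"` pair is rejected, and only an empty object gets through.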
More details:
The flow is quite simple:
Data is loaded from GCS, transformed, and written into BigQuery.
The loading of the data is parameterized.
Here is the dataflow template: https://gist.github.com/arturozz/68fc482dba53ee2ab45f08b768e2d7cb
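As a workaround while the pipeline form rejects the input, a job can also be launched directly from the template on the command line. This is only a sketch: the job name, bucket, region, and the `inputLocations` parameter name are assumptions, and the parameter names must match whatever the template in the gist actually declares.

```shell
# Run a Dataflow job from an existing template (hypothetical names;
# check the template's declared parameters before using).
gcloud dataflow jobs run my-dataprep-job \
  --gcs-location gs://my-bucket/templates/my-template \
  --region us-central1 \
  --parameters inputLocations='{"location1":"project:bq_dataset.bq_table1"}'
```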
regex
google-cloud-dataflow
0 Answers