[EDP] Allow default parameter names for input/output to be overridden per job
Sahara is hardwired to generate "INPUT" and "OUTPUT" parameters for Pig and Hive jobs in an Oozie workflow. This blueprint proposes adding the ability to override those default parameter names on a per-job basis.
For example, a case was recently seen where an existing Pig script had been written to expect the "$input" and "$output" symbol names at runtime. This script could not be run from Sahara without modification (unless additional "input" and "output" parameters were added to the job, specifying the paths explicitly instead of indirectly via data sources).
Potentially, we could allow "edp.input_name" and "edp.output_name" config parameters to be added to a job, which would specify the parameter names to use for input and output. The path values would still be taken from the data sources.
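As a rough illustration of the proposal, the sketch below shows one way the parameter names might be resolved. It is not Sahara's actual code: the function name build_io_params, the layout of job_configs (a dict with "configs" and "params" keys), and the choice to let data source paths overwrite any same-named explicit parameter are all assumptions made for the example. Only the "edp.input_name" and "edp.output_name" keys and the "INPUT"/"OUTPUT" defaults come from the description above.

```python
# Hypothetical sketch of the proposed behaviour; not Sahara's implementation.
DEFAULT_INPUT_NAME = "INPUT"
DEFAULT_OUTPUT_NAME = "OUTPUT"


def build_io_params(job_configs, input_path, output_path):
    """Return the parameter dict passed to a Pig or Hive job.

    job_configs is assumed to look like
    {"configs": {...}, "params": {...}}; both keys are optional.
    """
    configs = job_configs.get("configs", {})

    # Use the per-job override if present, otherwise the current defaults.
    input_name = configs.get("edp.input_name", DEFAULT_INPUT_NAME)
    output_name = configs.get("edp.output_name", DEFAULT_OUTPUT_NAME)

    # Copy any explicit job parameters so the caller's dict is not mutated.
    params = dict(job_configs.get("params", {}))

    # The path values still come from the data sources (assumption: they
    # take precedence over an explicit parameter with the same name).
    params[input_name] = input_path
    params[output_name] = output_path
    return params
```

With this sketch, a job whose configs contain "edp.input_name": "input" and "edp.output_name": "output" would receive "input" and "output" parameters, so a script that references "$input" and "$output" could run unmodified; a job without those configs would still receive "INPUT" and "OUTPUT".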
Blueprint information
- Status:
- Not started
- Approver:
- Sergey Lukjanov
- Priority:
- Undefined
- Drafter:
- Trevor McKay
- Direction:
- Needs approval
- Assignee:
- None
- Definition:
- New
- Series goal:
- None
- Implementation:
- Unknown
- Milestone target:
- None
- Started by
- Completed by