File name for pipeline

Is there a way to have the the name of the file that the pipeline is executing?

As an example I have a lot of files with a pattern like: file_name__.gz
depending on the ID and date value I need to enhance the data loaded trough the pipeline.

I was thinking to create multiple pipelines but the variation is quite large.

1 Like

Yes! In 6.8 and later, you can use pipeline_source_file() builtin in the set clause. For instance, if you want to set a column in your table called source_file to the source file, you can do

create table t(col1 bigint, col2 bigint, col3 bigint, source_file blob);

create pipeline p
as load data fs '/path/to/data'
into table t(col_1, col_2, col_3)
set source_file = pipeline_source_file();
1 Like