How to Split a File into Multiple Files in Azure Data Factory

Azure Data Factory is a data orchestration tool for ETL processes. When moving data from a source to a target host, you may run into size limitations at the target location because of the maximum file size it can accommodate. This happens when the file Azure Data Factory sinks to your target location is too large.

In this post, I will walk you through the process of automatically splitting a file into multiple smaller files using Azure Data Factory's Max rows per file option.

How to split an ADF pipeline output file by row count

In your Copy data activity, click the Sink tab and, in Max rows per file, enter the number of rows you want in each split file.
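If you prefer authoring the pipeline JSON directly instead of using the designer, the same option maps to the maxRowsPerFile property under the sink's format settings. Below is a minimal sketch for a delimited text sink writing to Azure Data Lake; the 10000 row count and the write settings type are assumed placeholder values:

    "sink": {
        "type": "DelimitedTextSink",
        "storeSettings": {
            "type": "AzureBlobFSWriteSettings"
        },
        "formatSettings": {
            "type": "DelimitedTextWriteSettings",
            "fileExtension": ".csv",
            "maxRowsPerFile": 10000
        }
    }

The File name prefix discussed below lands in this same formatSettings block, as fileNamePrefix.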

How to name Azure Data Factory pipeline Max rows per file output in Azure Data Lake or an SFTP Server (Target Host/Folder)

How to use a file name prefix in Azure Data Factory when exporting data into Azure Data Lake.

In the Copy data activity, click Sink, click File name prefix, click Add dynamic content, enter one of the two expressions below as needed, then click OK.

@concat(replace(item().TableName,'.parquet',''), '_', '.parquet')

@concat('prefix_', replace(item().name,'.csv',''), '_', formatDateTime(utcnow(),'yyyyMMdd HHmmss'), '.csv')

Remember to change the file extension in the expression to the output format you want created (.csv, .parquet, and so on).
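As a quick sanity check, assuming the pipeline is iterating over a file named sales.csv (a hypothetical item().name) and the run happens at 2024-05-01 13:45:30 UTC, the second expression evaluates to:

    prefix_sales_20240501 134530.csv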

With a File name prefix configured, the files Azure Data Factory writes to the sink will carry your specific file name followed by an automatically appended sequence number, as in the example below.
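For example, assuming a File name prefix of myexport (a placeholder name) and a .csv extension, the sink folder would contain files like:

    myexport_00000.csv
    myexport_00001.csv
    myexport_00002.csv

The zero-padded sequence number is appended by the Copy activity itself, one file per Max rows per file batch.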

For more practical Azure Data Factory knowledge, subscribe to our blog to be notified of future posts.