How to Split a File into Multiple Files in Azure Data Factory

Splitting large files into smaller, manageable chunks is a pattern every data engineer eventually needs, and Azure Data Factory (ADF) gives you a clean, low‑code way to do it. This blog post focuses on a very practical approach: using the built‑in “Max rows per file” option in a Copy activity and combining it with dynamic file naming so the outputs are both controlled and easy to work with. Azure Data Factory is a data orchestration tool for ETL processes, and while moving data from source to target you may hit size limitations on the target side: by default the sink writes one large file, which can exceed what the destination location can accommodate.

From an engineering perspective, the core problem is straightforward: your source system (or ADF copy sink) produces a file that’s too large for the target system’s limits—maybe a downstream SFTP server, a legacy app, or a partner’s ingestion process can only handle files up to a certain size or row count. Instead of trying to manually pre‑split data or write custom code, you can let ADF partition the file for you during the copy operation.

In the Copy Data activity, under the Sink tab, ADF exposes a Max rows per file setting. This is a simple but powerful control. You specify the maximum number of rows you want in each output file—say 100,000 rows—and ADF automatically breaks the incoming dataset into multiple files that each honor that limit. For example, if your source has 1.2 million rows and you set max rows per file to 100,000, you’ll end up with around 12 output files, each roughly 100k rows, instead of a single massive file.

In this post, I will walk you through the process of automatically splitting a file into multiple smaller files using Azure Data Factory's Max rows per file option.

How to split ADF pipeline output file by row count

In your Copy data activity, click the Sink tab and, in Max rows per file, enter the maximum number of rows you want in each split file.
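Behind the UI, this setting lands in the Copy activity's sink format settings in the pipeline JSON. Below is a minimal sketch for a delimited-text sink; the property names follow the Copy activity schema for DelimitedTextWriteSettings, but treat the surrounding structure as illustrative and adjust the store settings and row limit to your environment:

```
"sink": {
    "type": "DelimitedTextSink",
    "storeSettings": {
        "type": "AzureBlobFSWriteSettings"
    },
    "formatSettings": {
        "type": "DelimitedTextWriteSettings",
        "fileExtension": ".csv",
        "maxRowsPerFile": 100000
    }
}
```

With maxRowsPerFile set to 100000, a 1.2‑million‑row source copy produces roughly 12 output files instead of one.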

How to name Azure Data Factory pipeline Max rows per file output in Azure Data Lake or an SFTP Server (Target Host/Folder)

How to use file name prefix in Azure Data Factory when exporting data into Azure Data Lake.

In the Copy data activity, click Sink, click File name prefix, click Add dynamic content, enter one of these two expressions as needed, then hit OK.

@concat(replace(item().TableName,'.parquet',''), '_', '.parquet')

@concat('prefix_', replace(item().name,'.csv',''), '_', formatDateTime(utcnow(),'yyyyMMdd HHmmss'), '.csv')

Remember to change the file extension in the expression to match the output format you want created (.csv, .parquet, and so on).
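In the pipeline JSON, the prefix sits alongside the row limit in the same formatSettings block. Here is a hedged sketch combining both options; the dynamic expression mirrors the second example above, except that I drop the trailing '.csv' because the extension is supplied separately by fileExtension (the item().name reference assumes this Copy activity runs inside a ForEach loop over source files):

```
"formatSettings": {
    "type": "DelimitedTextWriteSettings",
    "fileExtension": ".csv",
    "maxRowsPerFile": 100000,
    "fileNamePrefix": {
        "value": "@concat('prefix_', replace(item().name,'.csv',''), '_', formatDateTime(utcnow(),'yyyyMMdd HHmmss'))",
        "type": "Expression"
    }
}
```

When a prefix is set together with Max rows per file, ADF typically appends a sequence number to each split file (a pattern along the lines of prefix_00000.csv, prefix_00001.csv), so the pieces sort naturally in the target folder.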

In your sink, the output files will display with the File name prefix you specified.

I use Azure Data Factory’s built‑in row-based file splitting and dynamic file naming so I don’t have to maintain custom splitting code. That keeps the pipelines simpler, cuts down on operational issues, and lets me handle large files in a clean, repeatable way using what the platform already does well.

For more practical Azure Data Factory knowledge, subscribe to our blog to be notified of future posts.