Understanding DFSORT Control Statements: Syntax and Usage

Introduction:

In the world of mainframe computing, sorting and manipulating large volumes of data efficiently is a critical task. One of the most powerful and widely used tools for sorting on IBM mainframes is DFSORT. DFSORT, which stands for Data Facility Sort, is a utility provided by IBM that offers a comprehensive set of capabilities for sorting, merging, and manipulating data sets.

In this blog post, we will explore the syntax and usage of DFSORT control statements. We will start with the basics and gradually progress to intermediate and expert-level concepts. Along the way, we will provide practical examples to illustrate the usage of DFSORT and demonstrate its versatility.

Basic Syntax:

DFSORT control statements define the sort, merge, copy, and reformatting operations to be performed on the input data set. DFSORT reads them from the SYSIN DD statement of the job step, usually as in-stream data. The general layout is as follows:

//SYSIN DD *
  control statements
/*

The control statements go between the //SYSIN DD * line and the /* delimiter. Each statement must start in column 2 or later and end by column 71. A line that ends with a comma followed by a blank is continued on the next line.
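The SYSIN block only makes sense inside a job step, so here is a minimal sketch of a complete one. The data set names MY.INPUT.DATA and MY.OUTPUT.DATA and the space values are hypothetical, and a real job also needs a JOB statement and allocation parameters that suit your site.

```jcl
//* Hypothetical names: MY.INPUT.DATA and MY.OUTPUT.DATA
//SORTSTEP EXEC PGM=SORT
//* SYSOUT receives DFSORT messages
//SYSOUT   DD SYSOUT=*
//* SORTIN is the default input DD, SORTOUT the default output DD
//SORTIN   DD DSN=MY.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=MY.OUTPUT.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//* Control statements start in column 2 or later
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
```

The examples in the rest of this post show only the SYSIN portion and assume a surrounding step similar to this one.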

SORT Statement:

The SORT statement is the fundamental control statement for a sort operation. It names the sort keys. The input and output data sets themselves are identified in the JCL, by default through the SORTIN and SORTOUT DD statements. Let’s consider a simple example to demonstrate the SORT statement:

//SYSIN DD *
  SORT FIELDS=(1,10,CH,A)
/*

Each key in the FIELDS parameter is described by four values: starting position, length, data format, and order. Here the key starts at byte 1, is 10 bytes long, holds character data (CH), and is sorted in ascending order (A; D would request descending order). The input data set is therefore sorted on its first 10 characters in ascending order.

SORT Examples:

Now, let’s explore some more examples to illustrate the usage of DFSORT and its various features.

Example 1: Sorting a Dataset

Suppose we have a dataset called INPUT.DATA that contains records with a fixed-length format. We want to sort the records based on the values in a particular field. Here’s an example control statement to achieve that:

//SYSIN DD *
  INREC BUILD=(1,10,20,5)
  SORT FIELDS=(11,5,PD,A)
  OUTFIL FNAMES=OUTPUT
/*

In the above example, the INREC statement reformats each input record before it is sorted: it extracts the first 10 bytes and bytes 20 to 24 and combines them into a 15-byte record. Because INREC is processed before SORT, the SORT statement must refer to positions in the reformatted record, so the 5-byte packed decimal (PD) key that began at byte 20 of the input is now addressed at position 11. The OUTFIL statement names the output through FNAMES, which takes a DD name rather than a data set name. The OUTPUT DD statement in the JCL points to the actual output data set, such as OUTPUT.DATA.
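A job step for a sort like this one might look as follows. This is a sketch, not a tested job: it assumes the input is cataloged as INPUT.DATA, that the output goes to a new data set OUTPUT.DATA through a DD named OUTPUT, and that the space values are placeholders. The sort key is given at position 11 because that is where the packed field sits after INREC has rebuilt the record.

```jcl
//* Hypothetical step: DD allocation values are assumptions
//STEP1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=INPUT.DATA,DISP=SHR
//* FNAMES=OUTPUT below refers to this DD name
//OUTPUT   DD DSN=OUTPUT.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
  INREC BUILD=(1,10,20,5)
  SORT FIELDS=(11,5,PD,A)
  OUTFIL FNAMES=OUTPUT
/*
```

DFSORT applies INREC before the sort regardless of the order in which the statements appear in SYSIN; listing INREC first simply mirrors the processing order.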

Example 2: Merging Multiple Datasets

DFSORT also provides the capability to merge multiple data sets into a single sorted output data set. Let’s consider an example where we want to merge three input data sets, INFILE1, INFILE2, and INFILE3, into a single sorted output data set called MERGED.OUT:

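//SYSIN DD *
  MERGE FIELDS=(1,10,CH,A)
  OUTFIL FNAMES=MERGED
/*

In the above example, the MERGE statement takes the place of the SORT statement (the two cannot be used together) and names the merge key: the first 10 characters of each record, in ascending order. The input data sets are not listed in the control statements. They are allocated in the JCL to the SORTIN01, SORTIN02, and SORTIN03 DD statements, and each one must already be in sequence on the merge key. The merged output is written to the data set that the MERGED DD statement points to, such as MERGED.OUT.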
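For a merge, the inputs are wired up in the JCL rather than in the control statements. The sketch below assumes the three inputs are cataloged as INFILE1, INFILE2, and INFILE3, that each is already sorted on bytes 1 to 10, and that the output DD is named MERGED; the space values are placeholders.

```jcl
//* Hypothetical step: each input must already be sorted on the key
//MERGE1   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Merge inputs use the SORTINnn DD names
//SORTIN01 DD DSN=INFILE1,DISP=SHR
//SORTIN02 DD DSN=INFILE2,DISP=SHR
//SORTIN03 DD DSN=INFILE3,DISP=SHR
//MERGED   DD DSN=MERGED.OUT,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
  OUTFIL FNAMES=MERGED
/*
```

If the inputs are not yet in key order, one option is to concatenate them on SORTIN and use a SORT statement instead of MERGE.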

Intermediate-Level Concepts:

As we progress to the intermediate level, let’s explore some advanced concepts and techniques that DFSORT offers.

Conditional Sorting:

DFSORT can restrict a sort to the records that meet specific criteria. The INCLUDE statement keeps only the records that satisfy a condition, and the OMIT statement drops them. Either one can be combined with a SORT statement that has several keys. Here’s an example:

//SYSIN DD *
  SORT FIELDS=(1,5,CH,A,6,10,CH,D)
  INCLUDE COND=(1,5,CH,EQ,C'OPENA')
/*

In the above example, we sort the records on two keys: the first 5 bytes in ascending order and the next 10 bytes (bytes 6 to 15) in descending order. The INCLUDE statement is applied before the sort, so only records whose first 5 characters equal ‘OPENA’ take part in it. Because every selected record then has the same value in the first key, the output order is in practice determined by the second key.
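Conditions can also be combined with AND and OR. The sketch below is an illustration only: it assumes a hypothetical layout in which bytes 20 to 24 hold a packed decimal amount, and it keeps the ‘OPENA’ records whose amount is greater than zero before sorting them on bytes 6 to 15.

```jcl
//SYSIN    DD *
* Hypothetical layout: status in bytes 1-5, PD amount in bytes 20-24
  INCLUDE COND=(1,5,CH,EQ,C'OPENA',AND,20,5,PD,GT,0)
  SORT FIELDS=(6,10,CH,D)
/*
```

Replacing INCLUDE with OMIT inverts the selection, dropping the matching records and keeping the rest. Only one of the two statements can be used in a given sort.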

Advanced Usage:

At the expert level, DFSORT offers a wide range of features and options for sophisticated data manipulation tasks. Let’s explore a couple of examples to highlight these capabilities.

Example 1: Generating a Report with Breaks

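Suppose we have a dataset containing sales data for different regions, and we want to generate a report with a subtotal for each region and a grand total at the end. The OUTFIL statement provides the SECTIONS parameter for this kind of control break, together with the TRAILER3 (section trailer) and TRAILER1 (report trailer) parameters. Here’s an example set of control statements:

//SYSIN DD *
  SORT FIELDS=(1,10,CH,A)
  OUTFIL FNAMES=REPORT,REMOVECC,
    SECTIONS=(1,10,
      TRAILER3=(1:'SUBTOTAL',20:TOT=(20,5,ZD,EDIT=(IIIIIIIT)))),
    TRAILER1=(1:'GRAND TOTAL',20:TOT=(20,5,ZD,EDIT=(IIIIIIIT)))
/*

In the above example, we sort the records on the first 10 characters, which hold the region. SECTIONS=(1,10,...) starts a new section each time that field changes, and TRAILER3 writes a line after each section with the total (TOT) of the zoned decimal (ZD) values in bytes 20 to 24. TRAILER1 writes one line at the end of the report with the total across all records. REMOVECC suppresses the carriage control character that OUTFIL reports otherwise add, and FNAMES=REPORT names the DD statement for the report data set. (A SUM FIELDS statement, by contrast, would collapse all records with the same key into a single record carrying the total, so the detail lines would be lost.)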

Example 2: Joining Data Sets

DFSORT provides the JOINKEYS statement to perform sophisticated data set joins. Let’s consider an example where we have two input data sets, INFILE1 and INFILE2, and we want to join them based on a common key field:

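//SYSIN DD *
  JOINKEYS FILE=F1,FIELDS=(1,10,A)
  JOINKEYS FILE=F2,FIELDS=(1,10,A)
  REFORMAT FIELDS=(F1:1,80,F2:11,70)
  SORT FIELDS=COPY
  OUTFIL FNAMES=JOINED
/*

In the above example, each JOINKEYS statement specifies the key field for one input file: bytes 1 to 10, in ascending order. FILE=F1 and FILE=F2 refer to the SORTJNF1 and SORTJNF2 DD statements, which point to INFILE1 and INFILE2. The REFORMAT statement defines which fields from each file make up the joined record; the one shown assumes 80-byte records and keeps the whole F1 record followed by the non-key part of the F2 record. By default only records with matching keys in both files are joined. The SORT statement with FIELDS=COPY copies the joined records without sorting them further, and the output is written to the data set that the JOINED DD statement points to, such as JOINED.OUT.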
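As with the other examples, the files are connected in the JCL. The sketch below makes several assumptions that are not in the example itself: both inputs hold 80-byte fixed-length records, they are cataloged as INFILE1 and INFILE2, and the output DD is named JOINED. With FILE=F1 and FILE=F2, DFSORT reads the inputs from the SORTJNF1 and SORTJNF2 DD statements.

```jcl
//* Hypothetical step: assumes 80-byte fixed-length input records
//JOIN1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* FILE=F1 reads SORTJNF1, FILE=F2 reads SORTJNF2
//SORTJNF1 DD DSN=INFILE1,DISP=SHR
//SORTJNF2 DD DSN=INFILE2,DISP=SHR
//JOINED   DD DSN=JOINED.OUT,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
  JOINKEYS FILE=F1,FIELDS=(1,10,A)
  JOINKEYS FILE=F2,FIELDS=(1,10,A)
  REFORMAT FIELDS=(F1:1,80,F2:11,70)
  SORT FIELDS=COPY
  OUTFIL FNAMES=JOINED
/*
```

With these assumed lengths, each joined record is 150 bytes: the full F1 record followed by bytes 11 to 80 of the matching F2 record. A JOIN UNPAIRED statement can be added when unmatched records from one or both files should also be kept.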

Conclusion:

DFSORT is a powerful and versatile tool for sorting, merging, and manipulating large volumes of data on IBM mainframes. In this blog post, we covered the basics of DFSORT control statements and gradually progressed to intermediate and expert-level concepts.

We explored various examples to illustrate the syntax and usage of DFSORT, including sorting data sets, merging multiple data sets, conditional sorting, generating reports with breaks, and performing data set joins. These examples showcase the flexibility and advanced capabilities of DFSORT for efficiently manipulating mainframe data.

DFSORT continues to be widely used in the mainframe industry due to its performance, scalability, and rich feature set. By understanding DFSORT control statements and their usage, mainframe professionals can leverage this powerful tool to efficiently process and analyze large volumes of data.

References:

  1. IBM DFSORT Application Programming Guide
  2. https://zmainframes.com/ (Example source: https://www.zmainframes.com/viewforum.php?f=28)