Outputting data and creating new datasets

There are usually two options when producing output from a program: you can create a new data set (or Unix file) to send the data to, or you can write to an existing one.

Outputting to existing datasets

When the DISP parameter in your DD statement is set to OLD, you have exclusive write access to the data set you specified, and the output will overwrite its current contents. If the DISP parameter is set to MOD, the output is appended to the end of the existing data instead of overwriting it.

Overwrite

This dataset will be completely overwritten

Append

This dataset will be appended to
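
The two cases can be sketched as a pair of DD statements. This is only an illustration: the DD names REPORT and AUDIT and the data set name PROD.PEN.TRANS.LOG are made up, and both data sets are assumed to exist and be cataloged already.

```jcl
//* Overwrite: exclusive access, existing contents are replaced
//REPORT   DD DSN=PROD.PEN.TRANS.SORTED,DISP=OLD
//*
//* Append: new records are written after the existing ones
//AUDIT    DD DSN=PROD.PEN.TRANS.LOG,DISP=MOD
```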

Not enough space

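A common issue when outputting data is that the data set we are trying to add to or overwrite might not have enough space to hold our data. This causes the step to abend with a space-related system completion code, most commonly SB37 (the related codes SD37 and SE37 also point to space problems).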

Creating a new dataset

If you want to create a new file, you can specify the name of it with the DSN parameter and use the (NEW,CATLG) value for the DISP parameter.
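
A minimal sketch of such a DD statement might look like the following. The DD name NEWOUT, the third DISP value and the UNIT and SPACE values are assumptions chosen for illustration; your installation may require different ones.

```jcl
//* Create and catalog a new data set; delete it if the step fails
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(10,2))
```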

Naming the new dataset

Keep in mind that the name you supply must be a maximum of 44 characters (35 for generation data sets) and you must be authorized to use the name that you supplied.

Unqualified names

Unqualified names of the data sets are simple 1-8 character strings.

Qualified names

Qualified names are multiple unqualified names joined by dots. You will most likely use qualified names.
For example
PROD.PEN.TRANS.SORTED

When creating a new data set you might not always know how much space you will need. Even so, unless you are placing the data on tape, you need to specify the SPACE parameter.

Generation Data sets

If you simply want to create a new generation of an existing generation data group (GDG) and send the output to it, specify the base name as usual and add the relative generation number +1 in parentheses after it.
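
As a hedged sketch, a new generation could be requested like this. The GDG base PROD.PEN.DAILY and the DD name GDGOUT are hypothetical; the base is assumed to be defined already, and the record attributes are assumed to come from elsewhere (for example an SMS data class).

```jcl
//* (+1) creates the next generation of the GDG PROD.PEN.DAILY
//GDGOUT   DD DSN=PROD.PEN.DAILY(+1),
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(10,2))
```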

Temporary Data sets

Creating temporary data sets is also simple. A temporary data set name begins with && followed by a 1-8 character name. Coding such a name as the DSN indicates that the data set is meant to be temporary.

Another way of doing it is to not code a DSN parameter at all. In this case the system will name the temporary data set itself. This is nice if you don't need to reference the data set in later parts of the job.
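
Both styles can be sketched as follows. The names SORTOUT, WORK01 and &&SORTED are invented for the example, and the UNIT and SPACE values are placeholders.

```jcl
//* Named temporary data set, passed on to a later step
//SORTOUT  DD DSN=&&SORTED,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(TRK,(8,1))
//*
//* Unnamed temporary data set: no DSN, the system generates a name
//WORK01   DD UNIT=SYSDA,SPACE=(TRK,(8,1))
```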

DISP

Pay close attention to the DISP parameter when creating new data sets.

DISP

This parameter describes the disposition, or status, of the data set.
It has up to three subparameters, which describe the following:

The status of the data (does it exist, is it used exclusively)
How is the data set to be handled after the step completes successfully
How is the data set to be handled on a failed step

In many cases (for example when a data set is used for input) you will not need to do anything special on either success or failure. In this case you can code just the first subparameter and let the others default (for an existing data set the default is KEEP, which keeps the file).

The first subparameter

The first subparameter will indicate the status of the data set that is to be used:

SHR - Share. The data set can be used by many programs at the same time, so this is essentially read-only access to the data.
OLD - Exclusive. This program is the only one that will use the data set, so it can perform operations like writing (which would not be safe while multiple programs are accessing the file).
MOD - Also allows operations like writing, but instructs the system to append anything written to the end of the data set, instead of overwriting it as OLD would.
NEW - The data set is to be created for use in this step. This is the default if the first subparameter is not coded.

The second subparameter

The second subparameter tells how to handle the data set after the step completes with success:

CATLG - the data set is retained and an entry is placed in the catalog so it can be easily located
KEEP - In SMS environments KEEP implies CATLG, because SMS needs to know about all data sets on the system. In a non-SMS environment the data set is retained but no catalog entry is created for it, which can make it harder to find later on. If a generation data set is referenced by its relative generation number, this value will not catalog the data set; it is only kept.
PASS - the data set will be passed to the next step of the job. This is commonly used for temporary data sets.
DELETE - the data set is no longer needed after the step and will be removed

If you omit this parameter for a new data set, the default will be DELETE.
If you omit it for an existing data set, KEEP will be the default.

The Third subparameter

This one will determine what happens to the data set when the step fails

For the third subparameter, all values from the second subparameter are allowed except PASS.
If omitted, the default is whatever was specified or implied for the second subparameter. The exception is PASS: in that case the default is DELETE for a data set created in the job and KEEP for one that already existed.
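
The three subparameters can be combined as in the sketch below. All DD names and the REPORT and WORK data set names are hypothetical, and the UNIT and SPACE values are placeholders.

```jcl
//* Existing input: shared, kept by default on success and failure
//INFILE   DD DSN=PROD.PEN.TRANS.SORTED,DISP=SHR
//*
//* New output: catalog on success, delete on failure
//OUTFILE  DD DSN=PROD.PEN.TRANS.REPORT,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1))
//*
//* Existing work file: delete on success, keep on failure
//OLDWORK  DD DSN=PROD.PEN.TRANS.WORK,DISP=(OLD,DELETE,KEEP)
```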

New data sets

When creating a new file you can use the (NEW,CATLG) value for the parameter.

If the DISP parameter is not coded at all, the default (NEW,DELETE,DELETE) is used. The data set then effectively behaves as a temporary file that exists only for the duration of the step, which can lead to errors if a later step expects to find it.


Where will the data set go?

When making a new data set it is important to know what type of device it will be on.

UNIT

When creating a data set the system will need to know what type of device it will be stored on.

You can specify the type of device that will be used to store the data set

CART can be used for a tape cartridge
SYSDA can be used for a system disk (DASD)
SYSALLDA is also a group referencing all DASD devices

You can be more specific and code a generic device type

UNIT=3390 is the device type commonly used for DASD
UNIT=3400-5 is a device type commonly used for tape drives

You can also code a specific device number. A 4-digit hexadecimal device number must be preceded by a slash

UNIT=/04DE

It is likely that the system administrator has put defaults in place so that something appropriate is selected if the parameter is not specified, but try to not forget it!

UNIT defines where the data set will go in broad strokes. VOL can be used with DASD devices and tape cartridges to name the specific volume to use, for example when you are writing a set of data to a cartridge that will be sent to another organization later on.

VOL

This parameter complements UNIT. It can be used for DASD but is more commonly used for tape cartridges, and it lets you name the specific volume to be used
VOL=SER=TX2018

It can also refer back to a previous DD statement with the REF subparameter, to use the same volume that was defined there.
VOL=REF=*.STEP1.SYSUT2

If the specified volume is not available, the operator will receive a prompt to take action (specify another one or cancel the job)
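
Put together, UNIT and VOL might be used as in this sketch. The COPY1 and COPY2 data set names are made up, the volume TX2018 is assumed here to be a DASD volume, and the remaining DD statements each step would need are left out.

```jcl
//STEP1    EXEC PGM=IEBGENER
//* Other DD statements for the step are omitted in this sketch
//SYSUT2   DD DSN=PROD.PEN.TRANS.COPY1,
//            DISP=(NEW,KEEP),
//            UNIT=3390,VOL=SER=TX2018,
//            SPACE=(TRK,(8,1))
//*
//STEP2    EXEC PGM=IEBGENER
//* Reuse whatever volume STEP1.SYSUT2 was placed on
//SYSUT2   DD DSN=PROD.PEN.TRANS.COPY2,
//            DISP=(NEW,KEEP),
//            VOL=REF=*.STEP1.SYSUT2,
//            SPACE=(TRK,(8,1))
```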

Space allocation for the new data set

SPACE

This is the parameter that specifies how much space your newly created data set will use

The syntax is:
SPACE=(unit,(primary,secondary,dir),RLSE)

Unit

Unit can have a few values

CYL for cylinders SPACE=(CYL,(10,2))
TRK for tracks SPACE=(TRK,(8,1))
block length in bytes SPACE=(200,(500,100))

Although modern storage systems only emulate the tracks and cylinders of older disk devices, these units remain valid and are still widely used.

Primary allocation

Primary is the number of space units allocated initially for the data set.
If you specify 10 tracks, the data set is allocated 10 tracks as soon as it is created.

Secondary allocation

Secondary is the amount by which the data set is allowed to grow beyond the primary allocation. The system will allocate up to 15 secondary extents if required, making 16 extents overall. They are only allocated when needed.

In the example of SPACE=(CYL,(10,2)) the maximum amount of space that the data set can occupy is 10 + 15 * 2 cylinders, making the total 40.

Partitioned Data Sets

The dir subparameter is only used for a PDS and defines the number of directory blocks.
Those blocks contain the names of the members in the PDS and their location within the data set.
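
A hedged sketch of allocating a PDS with directory blocks follows. The data set name PROD.PEN.TRANS.LIB, the DD name NEWPDS and the value of 5 directory blocks are assumptions for illustration.

```jcl
//* 5 directory blocks are reserved for member names
//NEWPDS   DD DSN=PROD.PEN.TRANS.LIB,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(TRK,(8,1,5))
```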


If you code a number of bytes instead of CYL or TRK, the system treats it as a block length unless you also code the AVGREC parameter, in which case it is the average record length.

RLSE

RLSE can also be coded in the SPACE parameter, after the primary, secondary and directory quantities.

This instructs the system to release any space allocated for a new data set that is still unused when the data set is closed. This is great for saving system space. However, it can create issues when trying to add to this data set later on, as there might not be enough space left for that.
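
For illustration, RLSE could be coded like this. The DD name and the space quantities are placeholders.

```jcl
//* Unused space is released when the data set is closed
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(10,2),RLSE)
```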

AVGREC

This parameter tells the system that the first SPACE subparameter is an average record length, and whether the primary and secondary quantities are counted in single records (U), thousands of records (K) or millions of records (M)

For example, with an average record size of 340 bytes, a primary quantity of 5 and AVGREC=K, the primary space holds 5 * 1024 = 5120 records.

The multiplier is needed because tracks and cylinders are fixed amounts of space, while a record count can be expressed in several denominations.
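
A sketch of a record-based allocation is shown below. The secondary quantity of 1 and the DD name are assumptions, and AVGREC is assumed to be usable on your system (it depends on SMS being active).

```jcl
//* 340-byte average records; 5 * 1024 primary, 1 * 1024 secondary
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(340,(5,1)),AVGREC=K
```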

You also need to provide information to the system regarding how data is read from and written to a data set

You can do this with the DCB parameter

DCB

This parameter is used to give information about how data is read from and written to a data set

DCB=(DSORG=xx,RECFM=yy,LRECL=zz,BLKSIZE=nnn)
It takes a few subparameters, each described in the Subparameters section

You used to need to code all of those subparameters inside the DCB parameter, but you can now also code them individually as DD parameters
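
The two styles might look like this side by side. The data set names OUT1 and OUT2 and the attribute values are illustrative only.

```jcl
//* Attributes grouped inside the DCB parameter
//OUT1     DD DSN=PROD.PEN.TRANS.OUT1,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            DCB=(DSORG=PS,RECFM=FB,LRECL=80,BLKSIZE=800)
//*
//* The same attributes coded as individual DD parameters
//OUT2     DD DSN=PROD.PEN.TRANS.OUT2,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            DSORG=PS,RECFM=FB,LRECL=80,BLKSIZE=800
```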

Copying from previous statement

As with some other parameters, you can copy the DCB attributes from a previous DD statement using a backward reference, for example DCB=*.STEP1.SYSUT2

Importing from data set

If you need to create multiple data sets with the same DCB characteristics, you can name an existing cataloged data set in the DCB parameter (DCB=dsname), and the system copies the DCB information from that data set's label.
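
Both ways of copying DCB attributes are sketched below. The step names, DD names and data set names (including the model PROD.PEN.TRANS.MODEL) are hypothetical, and the other DD statements each step needs are omitted.

```jcl
//STEP1    EXEC PGM=IEBGENER
//SYSUT2   DD DSN=PROD.PEN.TRANS.OUT1,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=800)
//*
//STEP2    EXEC PGM=IEBGENER
//* Copy the DCB attributes from the DD statement in STEP1
//SYSUT2   DD DSN=PROD.PEN.TRANS.OUT2,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            DCB=*.STEP1.SYSUT2
//*
//STEP3    EXEC PGM=IEBGENER
//* Copy the DCB attributes from a cataloged model data set
//SYSUT2   DD DSN=PROD.PEN.TRANS.OUT3,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            DCB=PROD.PEN.TRANS.MODEL
```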

Subparameters

Data Set Organization

DSORG is the Data Set Organization, this defines the type of the data set

PS - Sequential data set
PO - Partitioned Data set
DA - Direct Access Data set

If you don't specify this, the system will attempt to determine it itself.

Record Format

RECFM is the Record Format. It is used to define attributes associated with the records in the new data set

F - fixed length records that are not grouped to form a block
FB - fixed length records that are forming blocks
V - variable length records that do not form blocks
VB - variable length records that do form blocks
U - undefined length
You can also add A to the end of those values (for example FBA) to indicate that the records contain ANSI control characters used for printing purposes

Logical Record Length

LRECL is the logical record length. It defines the length of the records in the data set.

The value is in bytes. For variable length records it should be the length of the longest record plus 4 bytes for the record descriptor word.
Sometimes when dealing with libraries (such as load libraries), you will see the record format as U and the logical record length set to 0, because there is no logical record associated with a block.

Block Size

BLKSIZE is the block size. It specifies the size, in bytes, kilobytes, megabytes or gigabytes, of the block of data that is read from or written to the device.

Fixed length records

For fixed length records this should be a multiple of the LRECL.

For example fixed length records of 80 bytes with a block size of 800 would allow 10 records per block.

Variable length records

For variable length records this value should be at least LRECL + 4 bytes. LRECL already includes the 4-byte record descriptor word, and the extra 4 bytes hold the block descriptor word.

Zero value

If the block size is coded as 0 the system will determine the optimal block size based on the LRECL and the physical characteristics of the disk device.

Omitting the sub parameter will have the same result.
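
The block size rules above can be illustrated with two DD statements. The DD names, the FIXED and VAR data set names and the LRECL of 255 are assumptions for the example.

```jcl
//* Fixed blocked: 800 / 80 = 10 records per block
//FIXOUT   DD DSN=PROD.PEN.TRANS.FIXED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            RECFM=FB,LRECL=80,BLKSIZE=800
//*
//* Variable blocked: BLKSIZE=0 lets the system pick the block size
//VAROUT   DD DSN=PROD.PEN.TRANS.VAR,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            RECFM=VB,LRECL=255,BLKSIZE=0
```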

Copying parameters from another data set

LIKE

If you are in an SMS-managed environment and you know of a data set whose attributes you want to copy, you can use the LIKE parameter with the name of that data set, for example LIKE=PROD.PEN.TRANS.SORTED
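
In a full DD statement this might look like the sketch below. The new data set name PROD.PEN.TRANS.COPY is made up, and the model data set is assumed to be cataloged.

```jcl
//* Attributes such as RECFM, LRECL and SPACE come from the model
//NEWOUT   DD DSN=PROD.PEN.TRANS.COPY,
//            DISP=(NEW,CATLG,DELETE),
//            LIKE=PROD.PEN.TRANS.SORTED
```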

Retention period and expiration date

RETPD

This parameter specifies the Retention period for the data in the data set you're creating.

This might be useful for governance or regulatory purposes.
It is coded as a number of up to 5 digits that specifies how many days the data in the data set should be retained.

The maximum is 93000 days. The system adds this number to the current system date to determine the end of the retention period.

For example, RETPD=90 retains the data set for 90 days

EXPDT

The expiration date specifies the date after which the file can be deleted.

It is coded as a combination of year and day of year, for example
EXPDT=2025/032 lets the file be deleted from the 32nd day of 2025 (which is February 1st)

The special values 99365 and 99366 can be used to define permanent retention
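
Both forms of EXPDT are sketched here. The DD names and the DATED and PERM data set names are hypothetical, and the UNIT and SPACE values are placeholders.

```jcl
//* Eligible for deletion from day 32 of 2025 (February 1st)
//DATED    DD DSN=PROD.PEN.TRANS.DATED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            EXPDT=2025/032
//*
//* Permanent retention using the special value 99365
//KEEPER   DD DSN=PROD.PEN.TRANS.PERM,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1)),
//            EXPDT=99365
```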

This can be helpful to prevent accidental deletion of the data sets, for example during a job. If an operator attempts to manually delete a file with a permanent expiration date, they will need to manually override it in a confirmation window.

If you need to use JCL to delete a file protected by an expiration date, you can run the IDCAMS utility with a DELETE command that includes the PURGE parameter
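
A minimal sketch of such a step follows. The step name and the data set name being deleted are made up for the example.

```jcl
//PURGE    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DELETE PROD.PEN.TRANS.DATED PURGE
/*
```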

Simplifying with classes

Many organizations will have SMS implemented, which also allows you to simplify some of your JCL by using SMS class parameters

DATACLAS

A data class can contain many of the DCB attributes discussed above, as well as retention and expiration periods.

MGMTCLAS

This class can be used to define the details of how the data set should be migrated between different storage mediums

STORCLAS

Storage class can be used to replace the UNIT and VOL parameters
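
The three class parameters could be combined as in this sketch. The class names DCSEQ80, MCSTD and SCFAST are invented; real class names are defined by your storage administrator, and installation routines may override what you code.

```jcl
//* DCSEQ80, MCSTD and SCFAST are made-up class names
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            DATACLAS=DCSEQ80,
//            MGMTCLAS=MCSTD,
//            STORCLAS=SCFAST
```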

DSNTYPE

This parameter can also make things easier for us. It differentiates between a regular PDS and a PDSE, a z/OS Unix named pipe, and basic, large and extended format data sets.
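
As one hedged example, requesting a PDSE instead of a regular PDS might look like this. The DD name, data set name and space values are placeholders.

```jcl
//* DSNTYPE=LIBRARY requests a PDSE rather than a regular PDS
//NEWPDSE  DD DSN=PROD.PEN.TRANS.PDSE,
//            DISP=(NEW,CATLG,DELETE),
//            DSNTYPE=LIBRARY,
//            UNIT=SYSDA,SPACE=(TRK,(8,1,5))
```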