There are usually two options when producing data from a program: you can either create a new data set or Unix file to send the data to, or add the data to an existing file.
Outputting to existing datasets
When the DISP parameter in your DD statement is set to OLD, you have exclusive write access to the data set you have specified, and the output replaces its existing contents. If the DISP parameter is set to MOD, the data is appended to the end of the current data set instead of overwriting it.
Overwrite
This dataset will be completely overwritten
Append
This dataset will be appended to
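The two cases above might be coded roughly as follows. This is only a sketch: the DD names and the second data set name are hypothetical, and only the DISP values matter here.

```jcl
//* OLD: exclusive access, existing contents are replaced.
//OUTOLD   DD DSN=PROD.PEN.TRANS.SORTED,DISP=OLD
//* MOD: exclusive access, new records go after the old ones.
//OUTMOD   DD DSN=PROD.PEN.TRANS.LOG,DISP=MOD
```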
Not enough space
A common issue when writing output is that the data set we are trying to add to or overwrite does not have enough space to hold the data. This causes the job to fail with abend code SB37.
Creating a new dataset
If you want to create a new file, you can specify the name of it with the DSN parameter and use the (NEW,CATLG) value for the DISP parameter.
Naming the new dataset
Keep in mind that the name you supply must be a maximum of 44 characters (35 for generation data sets) and you must be authorized to use the name that you supplied.
Unqualified names
Unqualified names of the data sets are simple 1-8 character strings.
Qualified names
Qualified names are multiple unqualified names joined by dots. You will most likely use qualified names.
For example PROD.PEN.TRANS.SORTED
When creating a new data set you might not always know how much space you will need. Even so, unless you are placing the data on tape, you need to specify the SPACE parameter.
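Putting the name, disposition and space together, a DD statement for a new data set could look something like the sketch below. The DD name NEWOUT and the UNIT and SPACE values are assumptions chosen for illustration, not recommendations.

```jcl
//* Create and catalog a new data set with a qualified name.
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG),
//            UNIT=SYSDA,
//            SPACE=(TRK,(8,1))
```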
Generation Data sets
If you simply want to create a new generation of an existing generation data set and send the output to it, you can specify the name as usual and code (+1) as the relative generation number.
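As a rough sketch, assuming a generation data group called PROD.PEN.TRANS.DAILY has already been defined (the name, UNIT and SPACE values are hypothetical), the next generation could be requested like this:

```jcl
//* (+1) asks for a new generation of the existing group.
//GDGOUT   DD DSN=PROD.PEN.TRANS.DAILY(+1),
//            DISP=(NEW,CATLG),
//            UNIT=SYSDA,
//            SPACE=(TRK,(8,1))
```

Depending on how the installation is set up, further attributes such as DCB information may also be needed.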
Temporary Data sets
Creating temporary data sets is also simple. A temporary data set name begins with && followed by a 1-8 character name. Coding such a name as the DSN indicates that the data set is meant to be temporary.
Another way of doing it is to not code a DSN parameter at all. In this case the system names the temporary data set itself. This is convenient if you don't need to reference the data set in later parts of the job.
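Both styles are sketched below. The program names, DD names, the data set name &&SORTWK and the space values are all hypothetical.

```jcl
//STEP1    EXEC PGM=MYPROG1
//* Named temporary data set, passed on to the next step.
//WORKOUT  DD DSN=&&SORTWK,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(TRK,(5,1))
//* Unnamed temporary data set; the system generates a name.
//SCRATCH  DD UNIT=SYSDA,SPACE=(TRK,(5,1))
//STEP2    EXEC PGM=MYPROG2
//* Picks up the passed data set and deletes it afterwards.
//WORKIN   DD DSN=&&SORTWK,DISP=(OLD,DELETE)
```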
DISP
It is important to pay attention to the DISP parameter when creating new data sets.
DISP
This parameter describes the disposition, or status, of the data set.
It normally has three subparameters that describe the following:
The status of the data set (does it exist, is it used exclusively)
How the data set is to be handled after the step completes successfully
How the data set is to be handled if the step fails
In many cases (for example, when a data set is used for input) nothing special needs to happen on either success or failure. You can then code only the first subparameter and let the others default (for an existing data set the default is KEEP, which retains it).
The first subparameter
The first subparameter will indicate the status of the data set that is to be used:
SHR - Shared. Many programs can use the data set at the same time, so this is essentially read-only access to the data.
OLD - Exclusive. This program is the only one using the data set, so it can perform operations such as writing, which would not be possible while other programs were accessing it.
MOD - Also allows writing, but instructs the system to append anything written to the end of the data set instead of overwriting it as OLD would.
NEW - The data set is created for use in this program. This is the default if the first subparameter is not coded.
The second subparameter
The second subparameter tells how to handle the data set after the step completes with success:
CATLG - the data set is retained and an entry is placed in the catalog so it can be easily located
KEEP - In SMS environments KEEP implies CATLG because SMS needs to know all data sets on the system. In a non-SMS environment the data set is retained but no catalog entry is created for it, which can make it harder to find later on. If generation data sets are referenced by relative generation number, this value results in the data set being kept but not cataloged.
PASS - the data set is passed to the next step of the job. This is commonly used for temporary data sets.
DELETE - the data set is no longer needed after the step and will be removed
If you omit this parameter for a new data set, the default will be DELETE.
If you omit it for an existing data set, KEEP will be the default.
The third subparameter
This one determines what happens to the data set when the step fails.
The third subparameter accepts the same values as the second, except for PASS.
If it is omitted, the default is whatever is specified or implied for the second subparameter. The exception is when the second subparameter is PASS: in that case a new data set is deleted and an existing data set is kept.
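To make the three subparameters concrete, here is a small sketch. The DD names, the input data set name and the UNIT and SPACE values are assumptions.

```jcl
//* Create; catalog on success; delete if the step fails.
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(8,1))
//* Input only: code the status and let the rest default.
//INFILE   DD DSN=PROD.PEN.TRANS.INPUT,DISP=SHR
```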
New data sets
When creating a new file you can use the (NEW,CATLG) value for the parameter.
If the DISP parameter is not coded at all, the default (NEW,DELETE,DELETE) is used. This effectively creates a temporary data set that exists only while the step runs, and it can also lead to errors, for example when a later step expects the data set to exist.
When making a new data set it is important to know what type of device it will be on.
UNIT
When creating a data set the system will need to know what type of device it will be stored on.
You can specify the type of device that will be used to store the data set
CART can be used for a tape cartridge
SYSDA can be used for a system disk (DASD)
SYSALLDA is also a group referencing all DASD devices
You can be more specific and code a device type:
UNIT=3390 is the device type commonly used for DASD
UNIT=3400-5 is a device type commonly used for tape drives
You can also code a device number (a 4-digit hexadecimal number), which must be preceded by a slash:
UNIT=/04DE
It is likely that the system administrator has put defaults in place so that something appropriate is selected if the parameter is not specified, but try not to forget it!
UNIT defines where the data set will go in broad strokes. VOL can be used with DASD devices and tape cartridges to name a specific volume. This is useful, for example, when the data is going to be sent to another organization later on.
VOL
This parameter complements UNIT. It can be used for DASD but is more commonly used for tape cartridges, and it lets you name the specific volume to be used, for example VOL=SER=TX2018.
It can also refer back to a previous DD statement to use the same volume that was defined there, for example VOL=REF=*.STEP1.SYSUT2.
If the specified volume is not available, the operator will receive a prompt to take action (specify another one or cancel the job)
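UNIT and VOL are typically coded together. The sketch below uses the volume serial TX2018 from the text; the DD names, the export data set name and the DASD volume serial WORK01 are hypothetical.

```jcl
//* Tape cartridge with a specific volume serial.
//TAPEOUT  DD DSN=PROD.PEN.TRANS.EXPORT,
//            DISP=(NEW,KEEP),
//            UNIT=CART,VOL=SER=TX2018
//* DASD variant: WORK01 is a made-up volume serial.
//DISKOUT  DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG),
//            UNIT=3390,VOL=SER=WORK01,
//            SPACE=(CYL,(10,2))
```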
Space allocation for the new data set
SPACE
This parameter specifies how much space your newly created data set can use.
The syntax is: SPACE=(unit,(primary,secondary,dir))
Unit
Unit can have a few values
CYL for cylinders SPACE=(CYL,(10,2))
TRK for tracks SPACE=(TRK,(8,1))
block length in bytes SPACE=(200,(500,100))
Although tracks and cylinders no longer correspond directly to the physical layout of modern storage, these units are still supported for consistency.
Primary allocation
Primary is the number of units allocated to the data set initially.
If you specify 10 tracks, the data set is allocated 10 tracks as soon as it is created.
Secondary allocation
Secondary is the amount by which the data set may grow beyond the primary allocation each time it runs out of space. The system allows up to 15 secondary extents, making 16 extents overall. Secondary extents are only allocated when needed.
For SPACE=(CYL,(10,2)) the maximum space the data set can occupy is therefore 10 + 15 * 2 = 40 cylinders.
Partitioned Data Sets
The dir subparameter is only used for a PDS and defines the number of directory blocks.
Those blocks contain the names of the members in the PDS and their location within the set.
If you code a length in bytes as the unit and want it to be treated as an average record length, with the quantities counted in records, you also need to add the AVGREC parameter.
RLSE
RLSE can also be coded as the last subparameter.
This instructs the system to release any space that was allocated to a new data set but not used when the data set is closed. This is great for saving space. However, it can cause problems when you try to add to the data set later on, as it might no longer have enough space.
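The directory and RLSE subparameters described above could be coded along these lines. The DD names, the library data set name and all quantities are assumptions for illustration.

```jcl
//* New PDS: 10 tracks primary, 5 secondary, 20 dir blocks.
//NEWPDS   DD DSN=PROD.PEN.TRANS.LIB,
//            DISP=(NEW,CATLG),
//            UNIT=SYSDA,
//            SPACE=(TRK,(10,5,20))
//* Sequential data set that releases unused space at close.
//NEWSEQ   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG),
//            UNIT=SYSDA,
//            SPACE=(CYL,(10,2),RLSE)
```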
AVGREC
This parameter defines whether the quantities you supplied in the SPACE parameter are counted in single units (U), in thousands (K, a multiplier of 1024) or in millions (M, a multiplier of 1048576) of records.
For example, with an average record length of 340 bytes, a primary quantity of 5 and AVGREC=K, the primary space holds 5 * 1K = 5120 records.
This is needed because, unlike a quantity of tracks or cylinders, a record count can be expressed in several denominations.
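A sketch of the 340-byte case might look like this. The DD name and the secondary quantity are assumptions, and AVGREC is assumed to be available (it depends on SMS being active on the system).

```jcl
//* 340-byte average records: 5K primary, 1K secondary.
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG),
//            UNIT=SYSDA,
//            SPACE=(340,(5,1)),AVGREC=K
```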
You also need to provide information to the system regarding how data is read from and written to a data set
You can do this with the DCB parameter
DCB
This parameter gives the system information about how data is read from and written to a data set.
DCB=(DSORG=xx,RECFM=yy,LRECL=zz,BLKSIZE=nnn)
It takes a few sub parameters
You used to need to code all of those parameters as part of the DCB parameter, though you can now also code them individually
Copying from previous statement
As with a few other parameters, you can copy the DCB attributes from a previous DD statement using a backward reference, for example DCB=*.STEP1.SYSUT2.
Importing from data set
If you need to create multiple data sets with the same DCB characteristics, you can keep a data set that has those characteristics and copy them from it by coding its name on the DCB parameter.
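The two ways of reusing DCB attributes described above might be coded as follows. The step and DD names and all data set names other than PROD.PEN.TRANS.SORTED are hypothetical, and the model data set is assumed to be cataloged on DASD.

```jcl
//* Copy DCB attributes from the SYSUT2 DD in STEP1.
//OUT1     DD DSN=PROD.PEN.TRANS.COPY1,
//            DISP=(NEW,CATLG),UNIT=SYSDA,
//            SPACE=(TRK,(8,1)),
//            DCB=*.STEP1.SYSUT2
//* Copy DCB attributes from a cataloged model data set.
//OUT2     DD DSN=PROD.PEN.TRANS.COPY2,
//            DISP=(NEW,CATLG),UNIT=SYSDA,
//            SPACE=(TRK,(8,1)),
//            DCB=PROD.PEN.TRANS.MODEL
```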
Subparameters
Data Set Organization
DSORG is the data set organization. It defines the type of the data set:
PS - Sequential data set
PO - Partitioned data set
DA - Direct access data set
If you don't specify this, the system will attempt to determine it.
Record Format
RECFM is the record format. It defines attributes associated with the records in the new data set:
F - fixed length records that are not grouped to form a block
FB - fixed length records that are forming blocks
V - variable length records that do not form blocks
VB - variable length records that do form blocks
U - undefined length
You can also add A to the end of those values to indicate that the records contain ANSI control characters used for printing purposes.
Logical Record Length
LRECL is the logical record length. It defines the length of the records in the data set.
The value is in bytes. For variable-length records it should be the length of the longest record plus 4 bytes for the record descriptor word.
Libraries sometimes have a record format of U and a logical record length of 0, because there is no logical record associated with a block.
Block Size
BLKSIZE is the block size. It specifies the size, in bytes, kilobytes, megabytes or gigabytes, of the block of data that is read or written.
Fixed length records
For fixed length records this should be a multiple of the LRECL.
For example fixed length records of 80 bytes with a block size of 800 would allow 10 records per block.
Variable length records
For variable-length records this value must be at least the LRECL plus 4 bytes, which leaves room for the block descriptor word.
Zero value
If the block size is coded as 0 the system will determine the optimal block size based on the LRECL and the physical characteristics of the disk device.
Omitting the sub parameter will have the same result.
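As an illustration of the subparameters above, the sketch below codes the 80-byte fixed-length case twice: once inside DCB with an explicit block size, and once with the attributes coded individually and a system-determined block size. The DD names and data set names are hypothetical.

```jcl
//* 80-byte fixed records, 10 records per 800-byte block.
//FIXOUT   DD DSN=PROD.PEN.TRANS.FIXED,
//            DISP=(NEW,CATLG),UNIT=SYSDA,
//            SPACE=(TRK,(8,1)),
//            DCB=(DSORG=PS,RECFM=FB,LRECL=80,BLKSIZE=800)
//* Coded individually; BLKSIZE=0 lets the system choose.
//FIXOUT2  DD DSN=PROD.PEN.TRANS.FIXED2,
//            DISP=(NEW,CATLG),UNIT=SYSDA,
//            SPACE=(TRK,(8,1)),
//            DSORG=PS,RECFM=FB,LRECL=80,BLKSIZE=0
```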
Copying parameters from another data set
LIKE
If you are in an SMS-managed environment and you know of a data set whose attributes you want to copy, you can use the LIKE parameter to copy those attributes.
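A minimal sketch, assuming SMS is active and that PROD.PEN.TRANS.SORTED is an existing cataloged data set to model on (the DD name and the new data set name are hypothetical):

```jcl
//* Attributes such as RECFM, LRECL and SPACE come from
//* the data set named on LIKE.
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED2,
//            DISP=(NEW,CATLG),
//            LIKE=PROD.PEN.TRANS.SORTED
```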
Retention period and expiration date
RETPD
This parameter specifies the Retention period for the data in the data set you're creating.
This might be useful for governance or regulatory purposes.
It is coded as a number of up to 5 digits that specifies how many days the data in the data set should be retained.
The maximum is 93000 days. The system adds this number to the current system date to determine the end of the retention period.
For example, RETPD=90 retains the data set for 90 days.
EXPDT
The expiration date specifies the date after which the data set can be deleted.
It is coded as a year and a day of the year. For example, EXPDT=2025/032 sets the expiration to the 32nd day of 2025 (February 1st).
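The two ways of setting a retention limit could be coded like this. The DD names and archive data set names are hypothetical. RETPD and EXPDT are alternatives, so each DD statement uses only one of them.

```jcl
//* Keep for 90 days from creation.
//KEEP90   DD DSN=PROD.PEN.TRANS.ARCH1,
//            DISP=(NEW,CATLG),UNIT=SYSDA,
//            SPACE=(TRK,(8,1)),RETPD=90
//* Expire on day 32 of 2025 (February 1st).
//KEEPDT   DD DSN=PROD.PEN.TRANS.ARCH2,
//            DISP=(NEW,CATLG),UNIT=SYSDA,
//            SPACE=(TRK,(8,1)),EXPDT=2025/032
```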
The special values 99365 and 99366 can be used to define permanent retention.
This can help prevent accidental deletion of data sets, for example during a job. If an operator attempts to manually delete a file with a permanent expiration date, they will need to manually override it in a confirmation window.
If you need to use JCL to delete a data set protected by an expiration date, you can run the IDCAMS utility with a DELETE control statement that specifies PURGE.
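Such a delete step might look like the sketch below. The step name is hypothetical, and the data set name is reused from the earlier example purely for illustration.

```jcl
//DELSTEP  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DELETE PROD.PEN.TRANS.SORTED PURGE
/*
```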
Simplifying with classes
Many organizations will have SMS implemented, which also allows you to simplify some of your JCL by using SMS class parameters.
DATACLAS
A data class can contain many of the DCB attributes discussed above, as well as retention and expiration periods.
MGMTCLAS
This class can be used to define the details of how the data set should be migrated between different storage media.
STORCLAS
A storage class can be used to replace the UNIT and VOL parameters.
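With the three classes in place, a DD statement for a new data set can become much shorter. In this sketch the class names DCSEQ80, MCSTD and SCFAST are invented; real class names are defined by the storage administrator at each installation.

```jcl
//* Class names below are placeholders, not real classes.
//NEWOUT   DD DSN=PROD.PEN.TRANS.SORTED,
//            DISP=(NEW,CATLG),
//            DATACLAS=DCSEQ80,
//            MGMTCLAS=MCSTD,
//            STORCLAS=SCFAST
```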
DSNTYPE
This parameter can also make things easier. It distinguishes between a regular PDS and a PDSE, a z/OS Unix named pipe, and basic, large and extended format data sets.
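For instance, a PDSE could be requested with DSNTYPE=LIBRARY, roughly as sketched here. The DD name, data set name and the space and record attributes are assumptions.

```jcl
//* DSNTYPE=LIBRARY asks for a PDSE instead of a PDS.
//NEWPDSE  DD DSN=PROD.PEN.TRANS.LIB,
//            DISP=(NEW,CATLG),
//            DSNTYPE=LIBRARY,
//            UNIT=SYSDA,
//            SPACE=(CYL,(5,1,10)),
//            DSORG=PO,RECFM=FB,LRECL=80
```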