The config.yaml file is not shipped with the S2PX distribution media. You must create this file yourself using your favourite text editor.
You can find the example file described on this page available here.
All configurable aspects of the S2PX conversion process are guided by settings in a YAML-formatted configuration file. This file can be named anything, but for the purposes of this documentation we will refer to it as config.yaml. This file has a number of distinct sections, responsible for different areas of the S2PX conversion process.
The transfer section
This section of the config.yaml file is mandatory.
transfer: # General S2PX settings
strategy: RENAME_PARALLEL_JOB # The job naming strategy: Valid values are RENAME_PARALLEL_JOB or BACKUP_SERVER_JOB
suffix: Px # The job name suffix to be used when renaming assets
The transfer section allows you to specify how the S2PX conversion process handles the potential naming conflict between Server assets and their Parallel equivalents. Valid strategy values are:
- RENAME_PARALLEL_JOB: The generated Parallel Jobs are assigned new names, and the existing Server Jobs are unaffected, or
- BACKUP_SERVER_JOB: The original Server Jobs are renamed, and the generated Parallel assets (Jobs or Sequences, as appropriate) assume the name of the Server Job used to generate them.
S2PX derives a unique name for the new Parallel assets by appending a suffix (specified in the transfer section’s suffix value) to the end of the original asset name. These names are also used by newly-created Job references. Job Sequences generated using the RENAME_PARALLEL_JOB naming strategy, for example, will refer to the generated Parallel Jobs by their suffix-bearing names.
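To make the naming behaviour concrete, here is a sketch using the RENAME_PARALLEL_JOB strategy; the job name LoadCustomers is invented for illustration:

```yaml
transfer:
  strategy: RENAME_PARALLEL_JOB   # Leave the existing Server Jobs untouched
  suffix: Px                      # Appended to derive the Parallel asset names
# With a Server Job named 'LoadCustomers' (a hypothetical name), the generated
# Parallel Job is named 'LoadCustomersPx', and any generated Job Sequence
# refers to it by that suffix-bearing name.
```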
The decomposition section
This section of the config.yaml file is mandatory.
decomposition:
bufferDirectory: /data/project/sv2px/transient # A directory for the storage of temporary information during conversion
mode: OPTIMIZE_RUNTIME # (OPTIONAL) Optimise performance or minimize jobs. Valid values are OPTIMIZE_RUNTIME and MINIMIZE_JOBS
Job decomposition requires a path (accessible from DataStage jobs) in which to place any ‘buffer’ DataSets required during decomposition. For the vast majority of Server Jobs this will never be used, but the path must be specified in the configuration regardless. The optional decomposition mode (specifying whether decomposition optimizes runtime performance or minimizes the number of generated jobs) defaults to OPTIMIZE_RUNTIME if no value for mode is provided.
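Because mode is optional, a minimal decomposition section needs only the buffer directory (the path below is the one from the example above):

```yaml
decomposition:
  bufferDirectory: /data/project/sv2px/transient  # mode defaults to OPTIMIZE_RUNTIME
```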
The hashedFiles section
The hashedFiles section allows you to set defaults for the Parallel DRS stage used to replace Server Hashed Files.
Example:
hashedFiles: # Special settings for handling the translation of Server Hashed Files
type: ORACLE # The type setting to be used for the generated DRS Connector stages. Valid values are DB2, ORACLE or ODBC
variant: 11 # the DRS variant setting
connection: MCIDEMO # the DRS connection name
username: MCITESTDEV # the DRS database username (this can also be a job parameter name if required)
password: '{iisenc}CeGmkL8fluDU7OVw==' # the DRS database password (this can also be a job parameter name if required)
schema: myuser # (OPTIONAL) Prefix all Hashed File tablenames with the schema value (e.g. myuser.tablename)
For more information about where these settings are applied within the resulting stage’s configuration, refer to Defining a DRS Connector stage connection to the database - IBM Documentation.
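Since the username and password values can also be given as job parameter names rather than literals, a sketch such as the following is possible; the parameter names DBUser and DBPassword are invented for illustration:

```yaml
hashedFiles:
  type: ORACLE
  variant: 11
  connection: MCIDEMO
  username: DBUser      # A job parameter name (hypothetical) instead of a literal username
  password: DBPassword  # A job parameter name (hypothetical) instead of a literal password
```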
The customRoutines section
The customRoutines section supports two distinct aspects of S2PX conversion:
- mapping: Maps Server Routine calls to equivalent Parallel expressions. This mechanism is also used by S2PX itself to map Server Functions without an identically-named Parallel equivalent to a Parallel expression which achieves the same effect.
- libraries: Provides the specification for new Parallel Routines to act as user-created replacements for Server Custom Functions.
The customRoutines mapping section
The mapping clause within the customRoutines section of config.yaml allows you to map Server Functions without an identically-named Parallel equivalent to a Parallel expression which achieves the same effect. For example:
customRoutines: # A list of mappings from legacy Server functions to Parallel equivalent expressions
mapping: #
AddOneToNum: '{0} + 1' # 'AddOneToNum(MyParam)' --becomes--> 'MyParam + 1'
GetBalanceInUSD: '{0} * 0.75' # 'GetBalanceInUSD(MyParam)' --becomes--> 'MyParam * 0.75'
This mapping mechanism is also used by S2PX to perform its out-of-the-box function mappings, which you can edit if desired.
Mappings are NOT iterative, so mapping entries may not include a reference to another mapping. Here’s an example of an invalid mapping:
customRoutines: # A list of mappings from legacy Server functions to Parallel equivalent expressions
mapping: #
AddOneToNum: '{0} + 1' # 'AddOneToNum(MyParam)' is translated to 'MyParam + 1'
AddTwoToNum: 'AddOneToNum(AddOneToNum({0}))' # 'AddTwoToNum(MyParam)' is invalid, as it references another mapping
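Because mappings cannot reference one another, the intended composition must be written out inline. A valid equivalent of the invalid entry above would be:

```yaml
customRoutines:
  mapping:
    AddOneToNum: '{0} + 1'        # 'AddOneToNum(MyParam)' --becomes--> 'MyParam + 1'
    AddTwoToNum: '({0} + 1) + 1'  # The AddOneToNum logic expanded inline, applied twice
```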
The customRoutines libraries section
Where native DataStage Parallel engine functions do not provide a functional equivalent to a Server routine, you may need to specify a mapping to a new Parallel function which you will create in C++. The libraries clause within the customRoutines section of config.yaml allows you to describe the type signature of each of your newly-created Parallel functions, along with details of the library in which they can be found.
Here’s an annotated example illustrating the file’s structure and available settings:
libraries: # A list of libraries within which custom routines are defined
-
name: mylib # The name of a library: 'mylib'
path: /path/to/lib # The location of the mylib library
type: SHARED_LIBRARY # The library type. Valid values are SHARED_LIBRARY (for .so files) and OBJECT_FILE (for .o files)
routines: # A list of the routines available within the mylib library
ACustomRoutine: # A routine within the mylib library
type: STRING # The routine's return type
externalName: a_custom_func # Name of the C/C++ function in the specified library
arguments: # The arguments (parameters) to the function, described as pairs of...
-
name: arg1 # parameter names, and
type: FLOAT # parameter datatype. Valid values are CHAR, STRING, DOUBLE, FLOAT, INT, LONG, SHORT, UCHAR, UINT, ULONG or USHORT.
-
name: arg2 # etc.
type: STRING # etc.
See Custom Routine conversion for more details.