XM4DB2 is an exception master for DB2. It continuously inspects each mission-critical DB2 system and proactively looks for indicators of current or potential problems, so-called "exceptions". An exception is an unacceptable situation that neither DB2 nor another z/OS component can resolve automatically. XM4DB2 alerts DBA staff and can optionally take action to rectify the situation. Working in the background, it continually checks the availability of DB2 objects and the operational readiness of utilities, plans and packages.
Many monitoring programs for DB2 and z/OS deliver detailed raw statistics, but they cannot intelligently review those observations to determine when a true exception or error situation is about to occur. The primary purpose of these monitoring tools is to make data available to the experts who analyze DB2 system failures after the fact. The UBS Hainer approach is to keep avoidable problems from affecting the company in the first place by automatically tracking suspicious and potentially harmful events in real time.
Checking DB2 systems is a repetitive and tiring task, so why not automate it? XM4DB2 maintains and displays a table of current exceptions across all DB2 subsystems, enterprise-wide. Developed for Operations and Production DBA staff, it delivers a timely, proactive status update on the DB2 systems to improve and guarantee the availability of all objects to their dependent applications.
XM4DB2 offers predefined best-practice solutions and also supports customized exception definitions. An exception that repeatedly leads to a particular error and that appears unavoidable in the short term should receive a standard treatment. For this purpose, XM4DB2 offers prepared jobs for repair and other functions. Staff can check, edit and release these jobs, or submit them to a scheduler for execution.
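The idea of exception definitions that feed a table of current exceptions can be pictured with a minimal Python sketch. It is an illustration only and does not show XM4DB2 interfaces: the names ExceptionRule, ExceptionRow and evaluate, the snapshot keys, the subsystem names and the job name COPYJOB are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Everything here is a hypothetical illustration, not an XM4DB2 API.


@dataclass
class ExceptionRule:
    name: str
    # Returns True when the snapshot shows an exception for this rule.
    condition: Callable[[Dict[str, object]], bool]
    # Name of a prepared job that staff could review and release (assumed).
    repair_job: Optional[str] = None


@dataclass
class ExceptionRow:
    subsystem: str
    rule: str
    repair_job: Optional[str]


def evaluate(rules: List[ExceptionRule],
             snapshots: Dict[str, Dict[str, object]]) -> List[ExceptionRow]:
    """Apply every rule to every subsystem snapshot and collect the hits."""
    rows = []
    for subsystem, snapshot in snapshots.items():
        for rule in rules:
            if rule.condition(snapshot):
                rows.append(ExceptionRow(subsystem, rule.name, rule.repair_job))
    return rows


RULES = [
    ExceptionRule("missing image copy",
                  lambda s: s["days_since_image_copy"] is None,
                  repair_job="COPYJOB"),
    ExceptionRule("stopped utility",
                  lambda s: s["stopped_utilities"] > 0),
]

# Made-up status snapshots for two made-up subsystems.
SNAPSHOTS = {
    "DB2A": {"days_since_image_copy": None, "stopped_utilities": 0},
    "DB2B": {"days_since_image_copy": 1, "stopped_utilities": 2},
}

if __name__ == "__main__":
    for row in evaluate(RULES, SNAPSHOTS):
        print(row.subsystem, row.rule, row.repair_job or "-")
```

In this sketch a rule only proposes a repair job; releasing or scheduling it remains a separate step, in line with the staff review described above.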
Some of the exceptions that are continuously checked include:
• SQL statements created without parameter markers
• Bad access path changes of SQL statements
• Most CPU consuming SQL statements
• Objects in an unexpected restricted state and all their affected plans/packages
• Foreseeable lack of space for objects, whether attributable to z/OS, ICF or DB2
• Unrecoverable tablespaces due to missing image copies
• Possible infringement of SLAs with regard to recovery time
• Missing archive logs; inconsistencies with the BSDS
• Stopped procedures, e.g. REJECT, or QUEUE with queue length reached
• Stopped utilities and affected objects
• In-doubt threads and related system implications
• DDF status with regard to connections and threads
• Plans/packages that are invalid or inoperative or impacted by other failures
• Buffer pool related problems, e.g. avoidable re-reads, thresholds hit, etc.
• Unexpected load peaks in getpage activity and CPU consumption
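
To make one of these checks concrete, a "foreseeable lack of space" can be thought of as a projection from past usage. The Python sketch below fits a straight line to daily usage samples and estimates the days left before a limit is reached. This is an assumed, simplified technique chosen for illustration, not a description of how XM4DB2 computes its forecast; the function name days_until_full and the sample figures are hypothetical.

```python
from typing import Optional, Sequence


def days_until_full(used_mb: Sequence[float], limit_mb: float) -> Optional[float]:
    """Estimate the days until usage reaches limit_mb.

    used_mb holds one sample per day, oldest first. Growth per day is the
    slope of a least-squares line through the samples; the projection then
    starts from the most recent sample. Returns None when there are fewer
    than two samples or when usage is not growing.
    """
    n = len(used_mb)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(used_mb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_mb))
    slope = sxy / sxx  # growth in MB per day
    if slope <= 0:
        return None
    remaining = max(limit_mb - used_mb[-1], 0.0)
    return remaining / slope


if __name__ == "__main__":
    # Made-up samples: 10 MB growth per day with 70 MB left.
    print(days_until_full([100, 110, 120, 130], 200))
```

A check of this kind would raise an exception once the estimate falls below a threshold chosen by the DBA staff, for example the lead time needed to extend the object.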