Process Scheduler

Process Scheduler alerts: long-running processes, process errors, backlogged processes, locked operators, and critical process monitoring.

Process Scheduler alerts monitor your PeopleSoft batch processing environment for errors, stalls, and missing critical runs.

| Alert | Description |
| --- | --- |
| Long-Running Processes | Processes that have been running (Initiated or Processing) longer than the configured threshold |
| Process Errors | Processes that ended in Error, Not Successful, or Unable to Post status within the lookback window |
| Backlogged Processes | Queued or blocked processes whose scheduled run time has passed by more than the configured threshold |
| Locked OPRID Scheduled Processes | Queued or scheduled processes where the submitting operator account is locked |
| Process Run Check | Configured critical processes that have not run successfully within their expected time window |
| No Process Completed | No process has successfully completed within the lookback window. May indicate the scheduler is down. |
| Process Scheduler Down | Schedulers that have not updated their heartbeat status in PSSERVERSTAT recently |

1 - Long-Running Processes

Long-Running Processes Alert

Alert ID: long_running_processes | Category: Process Scheduler | Default threshold: 20 minutes

What This Alert Detects

This alert finds Process Scheduler requests that are currently in Initiated or Processing status and have been running longer than the configured time threshold. A process that has been running for a long time may be stuck, consuming excessive server resources, or waiting on a lock or resource that will never become available.

Severity Logic

| Condition | Severity |
| --- | --- |
| Running longer than thresholdMinutes | Warning |

For example, with the default threshold of 20 minutes:

  • A process that has been running for 25 minutes (more than the 20-minute threshold) → Warning

What Gets Checked

The alert queries the Process Scheduler request table for processes in run status 6 (Initiated) or 7 (Processing). For each result, it calculates how long the process has been running based on its BeginDttm (begin datetime) and the current server time.

Processes with no BeginDttm value are skipped (the process hasn’t truly started yet).
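The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_long_running` and the dictionary keys are hypothetical, and the real alert reads its rows from the Process Scheduler request table rather than an in-memory list.

```python
from datetime import datetime, timedelta

RUNNING_STATUSES = {6, 7}  # 6 = Initiated, 7 = Processing (codes from the text above)

def find_long_running(requests, now, threshold_minutes=20, exclude=()):
    """Return (process name, instance, whole minutes running) for requests over the threshold."""
    flagged = []
    for req in requests:
        if req["run_status"] not in RUNNING_STATUSES:
            continue
        if req["begin_dttm"] is None:  # no begin time: the process has not truly started
            continue
        if req["prcsname"] in exclude:  # mirrors the excludeProcesses setting
            continue
        minutes = (now - req["begin_dttm"]).total_seconds() / 60
        if minutes > threshold_minutes:
            flagged.append((req["prcsname"], req["instance"], int(minutes)))
    return flagged

# Hypothetical sample rows for illustration only.
now = datetime(2024, 1, 1, 12, 0)
requests = [
    {"prcsname": "JOB_A", "instance": 101, "run_status": 7, "begin_dttm": now - timedelta(minutes=25)},
    {"prcsname": "JOB_B", "instance": 102, "run_status": 6, "begin_dttm": None},
    {"prcsname": "JOB_C", "instance": 103, "run_status": 7, "begin_dttm": now - timedelta(minutes=5)},
]
print(find_long_running(requests, now))  # [('JOB_A', 101, 25)]
```

JOB_B is skipped because it has no begin time, and JOB_C is under the 20-minute threshold.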

Alert Details

Each alert item includes:

  • Process name (PRCSNAME)
  • Process instance number
  • How long the process has been running (in minutes)
  • The operator who submitted the request
  • A link to the Process Monitor detail page for that instance

Configuration

alerts:
  checks:
    long_running_processes:
      enabled: true
      thresholdMinutes: 20       # Minutes before flagging as Warning
      excludeProcesses:          # Process names to skip
        - SOME_LONG_BATCH_JOB

| Setting | Default | Description |
| --- | --- | --- |
| thresholdMinutes | 20 | Minutes a process must be running to trigger a Warning alert. |
| excludeProcesses | [] | List of process names to exclude from this check. Use for known long-running processes that are expected to take a long time. |

How to Respond

  1. Click the alert link to go directly to the Process Monitor entry for the flagged process
  2. Review the process details: what it is, who submitted it, when it started
  3. Check whether the process appears to be making progress or is stuck
  4. If the process is genuinely stuck, you may need to cancel it from PeopleSoft’s Process Monitor
  5. Investigate why it got stuck: look for locks, resource contention, or data issues

Tuning the Threshold

The right threshold depends on your environment. If you have batch jobs that are expected to run for 30-60 minutes, set thresholdMinutes to something higher than your longest expected normal runtime. You can also use excludeProcesses to exclude specific jobs from the check rather than raising the threshold for everything.

2 - Process Errors

Process Errors Alert

Alert ID: process_errors | Category: Process Scheduler | Default lookback: 24 hours

What This Alert Detects

This alert finds Process Scheduler requests that have failed within a configurable lookback window. It catches processes that ended in one of three error statuses:

| Run Status | PeopleSoft Code | Meaning |
| --- | --- | --- |
| Error | 3 | The process ended with an error condition |
| Not Successful | 10 | The process ran but reported a non-success result |
| Unable to Post | 12 | The process output could not be delivered |

Severity Logic

| Process Type | Status | Severity |
| --- | --- | --- |
| Recurring (on a recurrence schedule) | Error (3), Not Successful (10), Unable to Post (12) | Critical |
| Non-Recurring (ad-hoc execution) | Error (3), Not Successful (10), Unable to Post (12) | Warning |

  • Recurring Processes: Any failure fires Critical immediately.
  • Non-Recurring Processes: Fire Warning after the thresholdMinutes grace period.
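The severity rules above could be expressed roughly as follows. This is a sketch, not the product's implementation: the function name `error_severity` is hypothetical, and it assumes the grace period is counted from the time the process ended, which the text does not state.

```python
from datetime import datetime, timedelta

FAILED_STATUSES = {3: "Error", 10: "Not Successful", 12: "Unable to Post"}

def error_severity(run_status, is_recurring, ended_at, now, threshold_minutes=0):
    """Return 'Critical', 'Warning', or None for a single process request."""
    if run_status not in FAILED_STATUSES:
        return None
    if is_recurring:
        return "Critical"  # recurring failures fire immediately
    # Assumption: the grace period starts when the process ended.
    if now - ended_at >= timedelta(minutes=threshold_minutes):
        return "Warning"
    return None

now = datetime(2024, 1, 1, 12, 0)
print(error_severity(3, True, now, now, threshold_minutes=15))                            # Critical
print(error_severity(10, False, now - timedelta(minutes=5), now, threshold_minutes=15))   # None
print(error_severity(12, False, now - timedelta(minutes=20), now, threshold_minutes=15))  # Warning
```

With the default grace period of 0, a non-recurring failure raises a Warning as soon as it is seen.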

Alert Details

Each alert item includes:

  • Process name and instance number
  • Run status label (Error, Not Successful, Unable to Post)
  • The operator who submitted the request
  • When the process ran
  • A link to the Process Monitor detail page for that instance

Configuration

alerts:
  checks:
    process_errors:
      enabled: true
      lookbackHours: 24        # How far back to look for failures
      thresholdMinutes: 15     # Grace period in minutes for non-recurring errors (default: 0)
      excludeProcesses:        # Process names to skip
        - KNOWN_FLAKY_PROCESS

| Setting | Default | Description |
| --- | --- | --- |
| lookbackHours | 24 | Number of hours back to search for failed processes |
| thresholdMinutes | 0 | Grace period buffer (in minutes) for non-recurring process errors before they raise a Warning alert. |
| excludeProcesses | [] | List of process names to exclude from this check |

How to Respond

  1. Click the alert link to go directly to the Process Monitor entry for the failed process
  2. Review the process details: run status, begin and end times, server
  3. Look for output files or log information that might explain the failure
  4. Check whether this is a one-time failure or a repeating issue
  5. If the process needs to be rerun, submit a new request from PeopleSoft

Common Causes of Process Failures

  • Data errors: The process encountered unexpected data (null values, bad formats, constraint violations)
  • Resource issues: The server ran out of memory or disk space
  • Timeout: The process exceeded its allowed run time
  • Configuration problems: A required configuration parameter is missing or incorrect
  • Dependency failures: A process that runs after another failed because the first one didn’t complete correctly

Reducing Alert Noise

If certain processes fail regularly and you’re already tracking them separately, add them to excludeProcesses to keep the alert list focused on unexpected failures.

3 - Backlogged Processes

Backlogged Processes Alert

Alert ID: backlogged_processes | Category: Process Scheduler | Default threshold: 30 minutes

What This Alert Detects

This alert finds Process Scheduler requests that are in Queued or Blocked status and whose scheduled run time (RUNDTTM) has already passed by more than the configured threshold.

Severity Logic

| Condition | Severity |
| --- | --- |
| Overdue by more than thresholdMinutes | Warning |
| Overdue by more than thresholdMinutes × 2 | Critical |

For example, with the default threshold of 30 minutes:

  • A process scheduled 40 minutes ago that is still queued → Warning
  • A process scheduled 65 minutes ago that is still queued → Critical
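The two-tier rule can be written as a small function. The name `backlog_severity` is hypothetical and the sketch only restates the table above.

```python
def backlog_severity(minutes_overdue, threshold_minutes=30):
    """Map minutes past RUNDTTM to a severity, or None if not yet overdue enough."""
    if minutes_overdue > threshold_minutes * 2:
        return "Critical"
    if minutes_overdue > threshold_minutes:
        return "Warning"
    return None

print(backlog_severity(40))  # Warning
print(backlog_severity(65))  # Critical
print(backlog_severity(20))  # None
```

Checking the Critical condition first matters, because any value above 2× the threshold also satisfies the Warning condition.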

What Gets Checked

The alert queries the Process Scheduler request table for processes in run status 5 (Queued) or 18 (Blocked) whose RUNDTTM (scheduled run datetime) is in the past. For each result, it calculates how far past the scheduled time the process is based on RUNDTTM and the current server time.

Processes with no RUNDTTM value are skipped.

Alert Details

Each alert item includes:

  • Process name (PRCSNAME)
  • Process instance number
  • How long the process is overdue (in minutes)
  • Current run status (Queued or Blocked)
  • The operator who submitted the request
  • A link to the Process Monitor detail page for that instance

Configuration

alerts:
  checks:
    backlogged_processes:
      enabled: true
      thresholdMinutes: 30         # Minutes overdue before flagging as Warning
      excludeProcesses:            # Process names to skip
        - SOME_LOW_PRIORITY_JOB

| Setting | Default | Description |
| --- | --- | --- |
| thresholdMinutes | 30 | Minutes past the scheduled run time before a queued/blocked process triggers a Warning alert. Critical fires at 2× this value. |
| excludeProcesses | [] | List of process names to exclude from this check. Use for processes that are known to queue for a long time and are not a concern. |

How to Respond

  1. Click the alert link to go directly to the Process Monitor entry for the flagged process
  2. Check whether the Process Scheduler server is running and accepting work
  3. Look at how many processes are currently running on the server. It may have hit its concurrency limit
  4. Check if the process type or class has reached its maximum allowed concurrent instances
  5. For blocked processes, investigate what is blocking them (dependencies, server restrictions, etc.)
  6. If the Process Scheduler server is down, restart it from PeopleSoft’s Process Scheduler administration

Tuning the Threshold

The right threshold depends on how busy your Process Scheduler is. In environments where many jobs are submitted at once, some queuing is normal. Set thresholdMinutes high enough to avoid false positives during peak batch windows but low enough to catch genuine problems. You can also use excludeProcesses to exclude specific low-priority processes that are known to queue for long periods.

4 - Locked OPRID Scheduled Processes

Locked OPRID Scheduled Processes Alert

Alert ID: locked_oprid_processes | Category: Process Scheduler

What This Alert Detects

This alert finds queued or scheduled Process Scheduler requests where the submitting operator’s account (OPRID) is currently locked in PSOPRDEFN (ACCTLOCK = 1).

When an operator account is locked after a process has been queued, PeopleSoft will either refuse to run the process or run it under the locked account, where it fails immediately. PeopleSoft does not surface this condition anywhere obvious: Process Monitor shows the job as queued and the operator’s user page shows the account as locked, but nothing connects the two. This alert does.

Common scenarios:

  • A service or batch account had its password expire and was locked
  • An employee left and their account was locked, but scheduled jobs were not transferred
  • A security lockout from failed login attempts affected a batch account

Severity Logic

All findings are reported at Warning severity. Every queued or scheduled process with a locked submitting account is flagged.

What Gets Checked

The alert queries PSPRCSRQST joined to PSOPRDEFN for process requests in Queued or Scheduled run status where the submitting OPRID has ACCTLOCK = 1.
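The join can be illustrated against an in-memory SQLite database. This is a simplified sketch: the two tables carry only the columns needed here, the sample rows are invented, and the helper `locked_oprid_requests` is hypothetical. Only run status 5 (Queued) appears elsewhere in this documentation, so the status codes are passed in as a parameter rather than guessing the code for Scheduled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PSPRCSRQST (PRCSINSTANCE INTEGER, PRCSNAME TEXT, OPRID TEXT, RUNSTATUS INTEGER);
    CREATE TABLE PSOPRDEFN (OPRID TEXT, ACCTLOCK INTEGER);
    INSERT INTO PSPRCSRQST VALUES (201, 'NIGHTLY_LOAD', 'BATCHUSR', 5);
    INSERT INTO PSPRCSRQST VALUES (202, 'AD_HOC_RPT', 'JSMITH', 5);
    INSERT INTO PSPRCSRQST VALUES (203, 'OLD_JOB', 'BATCHUSR', 9);
    INSERT INTO PSOPRDEFN VALUES ('BATCHUSR', 1);
    INSERT INTO PSOPRDEFN VALUES ('JSMITH', 0);
""")

def locked_oprid_requests(conn, pending_statuses=(5,)):
    """Return pending requests whose submitting operator has ACCTLOCK = 1."""
    placeholders = ",".join("?" for _ in pending_statuses)
    sql = f"""
        SELECT r.PRCSINSTANCE, r.PRCSNAME, r.OPRID
        FROM PSPRCSRQST r
        JOIN PSOPRDEFN o ON o.OPRID = r.OPRID
        WHERE o.ACCTLOCK = 1 AND r.RUNSTATUS IN ({placeholders})
        ORDER BY r.PRCSINSTANCE
    """
    return conn.execute(sql, tuple(pending_statuses)).fetchall()

print(locked_oprid_requests(conn))  # [(201, 'NIGHTLY_LOAD', 'BATCHUSR')]
```

Instance 203 belongs to the locked account but has already finished (status 9), so it is not reported.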

Alert Details

Each alert item includes:

  • Process name and instance number
  • Submitting OPRID (with link to User detail page)
  • Current run status (Queued, Scheduled, etc.)
  • Scheduled run date/time
  • Recurrence name (if applicable)

Configuration

alerts:
  checks:
    locked_oprid_processes:
      enabled: true
      excludeProcesses: []   # Process names to ignore

| Setting | Default | Description |
| --- | --- | --- |
| excludeProcesses | [] | List of process names to exclude from this check |

How to Respond

  1. Click the alert link to open the Process Monitor detail page for the affected instance
  2. Identify the locked OPRID shown in the alert
  3. Navigate to the User detail page to review the account lock status
  4. Either unlock the account (if appropriate) or re-queue the process under an active operator account
  5. For recurring processes, update the recurrence definition to use a non-locked operator
  6. Investigate why the account was locked. If it was a failed login lockout, check the Failed Logins alert for additional context

Tables Queried

| Table | Description |
| --- | --- |
| PSPRCSRQST | Process Scheduler request queue |
| PSOPRDEFN | Operator definitions (user accounts) |

5 - Process Run Check

Process Run Check Alert

Alert ID: process_run_check | Category: Process Scheduler

What This Alert Detects

This alert monitors configured critical processes and fires when one has not completed successfully within its expected time window. It is the alert equivalent of the Process Run Check report, except that it runs automatically on every check cycle and surfaces failures on the dashboard without any manual action.

Use this alert for processes that must run on a regular cadence, such as:

  • Nightly batch jobs that must complete before business hours
  • Data synchronization processes that run every few hours
  • Critical integrations that should run multiple times per day
  • Post-maintenance verification of essential processes

Severity Logic

| Condition | Severity |
| --- | --- |
| Process has run recently but not successfully in the configured window | Warning |
| Process has no run history at all | Critical |

Configuration

Process checks are configured per process name in config.yaml. Each entry specifies the process name and the number of hours within which a successful run is expected.

alerts:
  checks:
    process_run_check:
      enabled: true
      processChecks:
        SOMEJOBNAME: 24      # Must run successfully within 24 hours
        ANOTHERJOB: 8        # Must run successfully within 8 hours
        NIGHTLY_ETL: 12      # Must run successfully within 12 hours

| Setting | Default | Description |
| --- | --- | --- |
| processChecks | {} | Map of process name to expected run window in hours |

If a process name is listed with 0 or a negative value, the check defaults to a 24-hour window.

What Gets Checked

For each configured process, psLens queries PSPRCSRQST for successful runs (RunStatus = 9 / Success) within the configured time window. If none are found, it then checks for any run history to determine severity:

  • No successful run in window + recent run history found: Warning
  • No run history at all: Critical
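The decision for one configured process might look like the sketch below. The function `run_check_severity` and its input shape are hypothetical. As a simplifying assumption, any run history at all counts as "recent" history, since the text does not define how far back the history lookup goes.

```python
from datetime import datetime, timedelta

SUCCESS = 9  # RunStatus code for Success, as stated above

def run_check_severity(runs, now, window_hours):
    """runs: list of (run_status, end_dttm) tuples for a single process name."""
    if window_hours <= 0:
        window_hours = 24  # documented fallback for 0 or negative values
    cutoff = now - timedelta(hours=window_hours)
    if any(status == SUCCESS and ended >= cutoff for status, ended in runs):
        return None  # a successful run exists inside the window
    # Assumption: any history at all is treated as "recent run history".
    return "Warning" if runs else "Critical"

now = datetime(2024, 1, 1, 12, 0)
print(run_check_severity([(9, now - timedelta(hours=2))], now, 8))   # None
print(run_check_severity([(3, now - timedelta(hours=2))], now, 8))   # Warning
print(run_check_severity([(9, now - timedelta(hours=10))], now, 8))  # Warning
print(run_check_severity([], now, 8))                                # Critical
```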

Alert Details

Each alert item includes:

  • Process name
  • Configured threshold (hours)
  • Last known run status (if any history exists)
  • Last known run time (if any history exists)
  • Link to the Process Definition detail page

How to Respond

  1. Click the alert link to open the Process Definition detail page for the affected process
  2. Review recent run history to understand what happened. Did the process run but fail, or did it not run at all?
  3. Check the Process Scheduler server configuration if the process never ran
  4. Investigate error logs if the process ran but ended in a failed state
  5. If the process ran and succeeded but outside the expected window, consider adjusting the threshold in config.yaml

Tables Queried

| Table | Description |
| --- | --- |
| PSPRCSRQST | Process Scheduler request queue and run history |

6 - Process Scheduler Down

Process Scheduler Down Alert

Alert ID: process_scheduler_down | Category: Process Scheduler | Default threshold: 10 minutes

What This Alert Detects

This alert triggers when any active Process Scheduler server registered in PSSERVERSTAT has not reported a status update (heartbeat) within the configured amount of time.

Severity Logic

| Condition | Severity |
| --- | --- |
| Heartbeat stale by more than thresholdMinutes | Warning |
| Heartbeat stale by more than thresholdMinutes × 2 | Critical |

For example, with the default threshold of 10 minutes:

  • A scheduler that has not sent a heartbeat for 12 minutes → Warning
  • A scheduler that has not sent a heartbeat for 22 minutes → Critical

What Gets Checked

The alert queries the PSSERVERSTAT table to retrieve all server status definitions. For each active scheduler (status not Down/Offline), it calculates the elapsed time since its LASTUPDDTTM timestamp. If that time exceeds the configured threshold, the alert fires.
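A rough sketch of the staleness calculation follows. The function `stale_schedulers`, the tuple layout, and the status labels used as plain strings are assumptions for illustration; the actual status codes stored in PSSERVERSTAT are not shown in this documentation.

```python
from datetime import datetime, timedelta

def stale_schedulers(servers, now, threshold_minutes=10, exclude=()):
    """servers: list of (server name, status label, LASTUPDDTTM) tuples."""
    findings = []
    for name, status, last_upd in servers:
        # Skip schedulers that are not active, and names listed in excludeProcesses.
        if status in ("Down", "Offline") or name in exclude:
            continue
        stale_minutes = (now - last_upd).total_seconds() / 60
        if stale_minutes > threshold_minutes * 2:
            findings.append((name, "Critical"))
        elif stale_minutes > threshold_minutes:
            findings.append((name, "Warning"))
    return findings

# Hypothetical sample rows for illustration only.
now = datetime(2024, 1, 1, 12, 0)
servers = [
    ("PSUNX", "Running", now - timedelta(minutes=12)),
    ("PSNT", "Running", now - timedelta(minutes=22)),
    ("PSUNX2", "Running", now - timedelta(minutes=1)),
    ("PSUNX_OLD", "Running", now - timedelta(days=90)),
    ("PSNT2", "Down", now - timedelta(days=3)),
]
print(stale_schedulers(servers, now, exclude=("PSUNX_OLD",)))
# [('PSUNX', 'Warning'), ('PSNT', 'Critical')]
```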

Alert Details

Each alert item includes:

  • Server name (SERVERNAME)
  • Current status code and its human-readable label (e.g., Running, Error, Suspended)
  • Last heartbeat timestamp (LASTUPDDTTM)
  • Host name (SRVRHOSTNAME)
  • A detailed explanation of how long the heartbeat has been stale
  • A link to the Server Definition detail page for that server

Configuration

alerts:
  checks:
    process_scheduler_down:
      enabled: true
      thresholdMinutes: 10          # Minutes stale before flagging as Warning
      excludeProcesses:             # Server names (e.g., PSUNX, PSNT) to skip
        - PSUNX_OLD

| Setting | Default | Description |
| --- | --- | --- |
| thresholdMinutes | 10 | Minutes of stale heartbeat status updates before a scheduler triggers a Warning alert. Critical fires at 2× this value. |
| excludeProcesses | [] | List of server names to exclude from this check. Use for retired scheduler definitions that linger in PSSERVERSTAT but aren’t cleaned up. |

How to Respond

  1. Click the alert link to go directly to the Server Definition detail page for the affected scheduler.
  2. Check the Host Name where the Process Scheduler daemon runs.
  3. Access the server host and verify whether the Process Scheduler server processes (e.g., PSPRCSRV, PSAESRV) are running.
  4. Review the Process Scheduler logs (e.g., TUXLOG, SCHED_*.LOG) on the host machine to diagnose why the process has hung or crashed.
  5. If the scheduler has hung, stop the process scheduler daemon and restart it using psadmin.
  6. If the server definition is obsolete or decommissioned, consider deleting it in PeopleSoft Server Definitions configuration to clean up the PSSERVERSTAT row.

7 - No Process Completed

No Process Completed Alert

Alert ID: no_process_completed | Category: Process Scheduler | Default lookback: 1 hour

What This Alert Detects

This alert fires when no process has successfully completed within the configured lookback window. It is a broad scheduler health check. If nothing has finished successfully in the past hour, the Process Scheduler may be down, stalled, or not dispatching jobs.

This is distinct from the Process Run Check, which monitors specific named processes. This alert monitors overall scheduler activity.

Severity Logic

| Condition | Severity |
| --- | --- |
| Zero successful completions in the lookback window | Warning |

What Gets Checked

The alert queries PSPRCSRQST for any process with RunStatus = 9 (Success) and an end datetime within the lookback window. If no rows are returned, the alert fires.

Only one result is needed to resolve the alert. The check uses a limit of 1 for efficiency.
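The existence check with a limit of 1 can be sketched against an in-memory SQLite table. This is illustrative only: the table is reduced to three columns, the sample rows are invented, the helper `no_process_completed` is hypothetical, and the end datetime is stored as text in `YYYY-MM-DD HH:MM:SS` form so that string comparison matches chronological order.

```python
import sqlite3
from datetime import datetime, timedelta

def no_process_completed(conn, now, lookback_hours=1):
    """Return True (alert fires) when no successful run ended inside the window."""
    cutoff = (now - timedelta(hours=lookback_hours)).isoformat(sep=" ")
    row = conn.execute(
        "SELECT 1 FROM PSPRCSRQST WHERE RUNSTATUS = 9 AND ENDDTTM >= ? LIMIT 1",
        (cutoff,),
    ).fetchone()
    return row is None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PSPRCSRQST (PRCSINSTANCE INTEGER, RUNSTATUS INTEGER, ENDDTTM TEXT)")
now = datetime(2024, 1, 1, 12, 0, 0)

conn.execute("INSERT INTO PSPRCSRQST VALUES (301, 9, '2024-01-01 09:30:00')")
print(no_process_completed(conn, now))  # True: the only success is older than 1 hour

conn.execute("INSERT INTO PSPRCSRQST VALUES (302, 9, '2024-01-01 11:45:00')")
print(no_process_completed(conn, now))  # False: a success ended 15 minutes ago
```

Because only the existence of a row matters, the query can stop at the first match instead of scanning every successful run in the window.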

Alert Details

When firing, the alert produces a single item:

  • Summary: No process completed successfully in the last N hour(s)
  • Lookback hours used for the check

Configuration

alerts:
  checks:
    no_process_completed:
      enabled: true
      lookbackHours: 1    # How far back to look for completed processes

| Setting | Default | Description |
| --- | --- | --- |
| lookbackHours | 1 | How many hours back to look for a successfully completed process. |

How to Respond

  1. Check PeopleSoft’s Process Monitor to see if any processes are running, queued, or have recently completed
  2. Verify the Process Scheduler server is running (PeopleSoft > PeopleTools > Process Scheduler > Servers)
  3. If processes are queued but not running, the scheduler daemon may need to be restarted
  4. If this fires regularly during off-hours when no jobs run, increase lookbackHours or disable the alert for those periods

Tuning

If your environment has periods where no batch jobs are expected to run (e.g., overnight maintenance windows), consider increasing lookbackHours to cover those gaps, or disable the alert entirely during those windows.