Process Scheduler Down

Process Scheduler Down Alert

Alert ID: process_scheduler_down
Category: Process Scheduler
Default threshold: 10 minutes

What This Alert Detects

This alert triggers when any active Process Scheduler server registered in PSSERVERSTAT has not reported a status update (heartbeat) within the configured amount of time.

Note

The alert automatically ignores servers whose status is explicitly set to "1" (Down) or "7" (Suspended - Offline), as these represent intentionally stopped or offline schedulers. It will only flag active server configurations (e.g., Running, Suspended, Error, Overloaded) that have stalled or stopped updating.

Severity Logic

Condition	Severity
Heartbeat stale by more than `thresholdMinutes`	Warning
Heartbeat stale by more than `thresholdMinutes × 2`	Critical

For example, with the default threshold of 10 minutes:

A scheduler whose last heartbeat was 12 minutes ago → Warning
A scheduler whose last heartbeat was 22 minutes ago → Critical
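The severity rule can be sketched in a few lines of Python. This is an illustration of the table above, not psLens source code; the function name `classify_staleness` is hypothetical, and the strict "greater than" comparison is an assumption based on the wording "more than".

```python
from typing import Optional


def classify_staleness(stale_minutes: float, threshold_minutes: float = 10) -> Optional[str]:
    """Map a heartbeat age in minutes to a severity, or None if it is still fresh."""
    # Critical is checked first because it is the stricter condition (2x threshold).
    if stale_minutes > threshold_minutes * 2:
        return "Critical"
    if stale_minutes > threshold_minutes:
        return "Warning"
    return None


print(classify_staleness(12))  # Warning
print(classify_staleness(22))  # Critical
print(classify_staleness(8))   # None
```

Under this reading, a heartbeat exactly at the threshold (10 minutes) does not yet fire.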

What Gets Checked

The alert queries the PSSERVERSTAT table to retrieve the status row for every registered server. For each active scheduler (status other than Down or Suspended - Offline), it calculates the time elapsed since the LASTUPDDTTM timestamp. If that elapsed time exceeds the configured threshold, the alert fires.
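A minimal sketch of that check, assuming the rows have already been fetched from PSSERVERSTAT into dictionaries. The column name `SERVERSTATUS`, the function name `find_stale_schedulers`, and the sample status code "3" are assumptions for illustration; only codes "1" and "7" are documented on this page.

```python
from datetime import datetime, timedelta

# Status codes the alert ignores: "1" (Down) and "7" (Suspended - Offline).
IGNORED_STATUSES = {"1", "7"}


def find_stale_schedulers(rows, now, threshold_minutes=10):
    """Return (server name, minutes stale) for active schedulers past the threshold."""
    stale = []
    for row in rows:
        if row["SERVERSTATUS"] in IGNORED_STATUSES:
            continue  # intentionally stopped or offline, never flagged
        elapsed = now - row["LASTUPDDTTM"]
        if elapsed > timedelta(minutes=threshold_minutes):
            stale.append((row["SERVERNAME"], elapsed.total_seconds() / 60))
    return stale


now = datetime(2024, 1, 1, 12, 0)
rows = [
    # "3" is a placeholder for some active status code (assumption).
    {"SERVERNAME": "PSUNX", "SERVERSTATUS": "3", "LASTUPDDTTM": now - timedelta(minutes=15)},
    {"SERVERNAME": "PSNT", "SERVERSTATUS": "3", "LASTUPDDTTM": now - timedelta(minutes=2)},
    {"SERVERNAME": "PSUNX2", "SERVERSTATUS": "1", "LASTUPDDTTM": now - timedelta(hours=5)},
]
print(find_stale_schedulers(rows, now))  # [('PSUNX', 15.0)]
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the check deterministic and easy to test.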

Alert Details

Each alert item includes:

Server name (SERVERNAME)
Current status code and its human-readable label (e.g., Running, Error, Suspended)
Last heartbeat timestamp (LASTUPDDTTM)
Host name (SRVRHOSTNAME)
A detailed explanation of how long the heartbeat has been stale
A link to the Server Definition detail page for that server
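As a rough illustration of how these fields fit together, the sketch below models one alert item and derives the staleness explanation from the last heartbeat. The class `SchedulerAlertItem`, its field names, and the message wording are hypothetical and do not reflect the actual psLens payload; the detail-page link is omitted because its format is not documented here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SchedulerAlertItem:
    server_name: str          # SERVERNAME
    status_code: str          # raw status code
    status_label: str         # human-readable label, e.g. "Running"
    last_heartbeat: datetime  # LASTUPDDTTM
    host_name: str            # SRVRHOSTNAME

    def explanation(self, now: datetime) -> str:
        """Describe how long the heartbeat has been stale, in whole minutes."""
        minutes = int((now - self.last_heartbeat).total_seconds() // 60)
        return (
            f"{self.server_name} on {self.host_name} ({self.status_label}) "
            f"has not updated its heartbeat for {minutes} minutes"
        )


item = SchedulerAlertItem("PSUNX", "3", "Running", datetime(2024, 1, 1, 11, 45), "apphost01")
print(item.explanation(datetime(2024, 1, 1, 12, 0)))
```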

Configuration

alerts:
  checks:
    process_scheduler_down:
      enabled: true
      thresholdMinutes: 10          # Minutes stale before flagging as Warning
      excludeProcesses:             # Server names (e.g., PSUNX, PSNT) to skip
        - PSUNX_OLD

Setting	Default	Description
`thresholdMinutes`	`10`	Minutes a scheduler's heartbeat may be stale before it triggers a Warning alert. Critical fires at 2× this value.
`excludeProcesses`	`[]`	List of server names to exclude from this check. Use for retired scheduler definitions that still have a row in `PSSERVERSTAT`.
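The way these two settings combine can be sketched as follows, assuming the YAML above has already been parsed into a dictionary. The helper names `resolve_settings` and `is_excluded`, the derived `criticalMinutes` key, and exact (case-sensitive) name matching are all assumptions for illustration, not documented psLens behavior.

```python
# Documented defaults for this check.
DEFAULTS = {"thresholdMinutes": 10, "excludeProcesses": []}


def resolve_settings(check_config=None):
    """Merge user-supplied settings over the defaults and derive the Critical cutoff."""
    settings = {**DEFAULTS, **(check_config or {})}
    # Copy the list so callers cannot mutate the shared default.
    settings["excludeProcesses"] = list(settings["excludeProcesses"])
    # Hypothetical derived key: Critical fires at twice the Warning threshold.
    settings["criticalMinutes"] = settings["thresholdMinutes"] * 2
    return settings


def is_excluded(server_name, settings):
    """True if the server name is listed under excludeProcesses."""
    return server_name in settings["excludeProcesses"]


settings = resolve_settings({"excludeProcesses": ["PSUNX_OLD"]})
print(settings["thresholdMinutes"], settings["criticalMinutes"])  # 10 20
print(is_excluded("PSUNX_OLD", settings))  # True
```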

How to Respond

Click the alert link to go directly to the Server Definition detail page for the affected scheduler.
Check the Host Name where the Process Scheduler daemon runs.
Log in to the server host and verify whether the Process Scheduler server processes (e.g., PSPRCSRV, PSAESRV) are running.
Review the Process Scheduler logs (e.g., TUXLOG, SCHDLR_*.LOG) on the host machine to diagnose why the process has hung or crashed.
If the scheduler has hung, stop the Process Scheduler and restart it using psadmin.
If the server definition is obsolete or decommissioned, consider deleting it in the PeopleSoft Server Definitions configuration to clean up its PSSERVERSTAT row.