This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Alerts

psLens real-time alerts for PeopleSoft: monitor long-running processes, process errors, stalled Integration Broker messages, failed logins, and more.

Alerts

Alert checks run on a 5-minute timer against every connected database. Findings appear on the dashboard with severity, and clear automatically when the underlying condition resolves. There is no acknowledge or dismiss.

How the Alert System Works

  1. psLens runs a set of alert checks on a timer (default: every 5 minutes)
  2. Each check queries one or more connected databases
  3. If a check finds something worth noting, it creates alert items with a severity level
  4. Alert results appear on the dashboard immediately
  5. When the underlying issue resolves, the alert clears automatically on the next check cycle

Alerts always reflect the current state of the system.

Alert Severity Levels

SeverityColorMeaning
CriticalRedSomething is actively wrong and needs immediate attention
WarningYellowSomething should be investigated. It may become a problem.
InfoBlueLow-priority finding; worth noting but not urgent

Alert Data Retention

Alert results are stored for 15 minutes. This means the dashboard shows findings from the most recent check cycle. Once an issue is resolved and the next check runs cleanly, the alert data expires.

Configuring Alerts

Alerts are configured in config.yaml. You can:

  • Enable or disable the entire alert system (alerts.enabled)
  • Set how often checks run (alerts.intervalMinutes)
  • Enable or disable individual checks
  • Set thresholds for stalled/long-running checks
  • Set lookback windows for error checks
  • Exclude specific process names or IB operation names from checks

See Configuration for the full configuration reference.


Alert Categories

Browse alerts by category:

  • Process Scheduler: Long-running processes, errors, backlogged jobs, locked operators, critical process monitoring
  • Integration Broker: Operation errors, contract errors, stalled messages, volume anomalies, sync exceptions
  • Web Server / WebLib: Alerts when the PeopleSoft Web Server or WebLib endpoints fail to respond
  • Security: Failed login detection and authentication monitoring
  • Generic SWS Alerts: Define custom, queryable alert rules using PsoftQL against any whitelisted tables

1 - Process Scheduler

Process Scheduler alerts: long-running processes, process errors, backlogged processes, locked operators, and critical process monitoring.

Process Scheduler alerts monitor your PeopleSoft batch processing environment for errors, stalls, and missing critical runs.

AlertDescription
Long-Running ProcessesProcesses that have been running (Initiated or Processing) longer than the configured threshold
Process ErrorsProcesses that ended in Error, Not Successful, or Unable to Post status within the lookback window
Backlogged ProcessesQueued or blocked processes whose scheduled run time has passed by more than the configured threshold
Locked OPRID Scheduled ProcessesQueued or scheduled processes where the submitting operator account is locked
Process Run CheckConfigured critical processes that have not run successfully within their expected time window
No Process CompletedNo process has successfully completed within the lookback window. May indicate the scheduler is down.
Process Scheduler DownSchedulers that have not updated their heartbeat status in PSSERVERSTAT recently

1.1 - Long-Running Processes

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Long-Running Processes Alert

Alert ID: long_running_processes Category: Process Scheduler Default threshold: 20 minutes

What This Alert Detects

This alert finds Process Scheduler requests that are currently in Initiated or Processing status and have been running longer than the configured time threshold. A process that has been running for a long time may be stuck, consuming excessive server resources, or waiting on a lock or resource that will never become available.

Severity Logic

ConditionSeverity
Running longer than thresholdMinutesWarning

For example, with the default threshold of 20 minutes:

  • A process running for 25 minutes or more → Warning

What Gets Checked

The alert queries the Process Scheduler request table for processes in run status 6 (Initiated) or 7 (Processing). For each result, it calculates how long the process has been running based on its BeginDttm (begin datetime) and the current server time.

Processes with no BeginDttm value are skipped (the process hasn’t truly started yet).

Alert Details

Each alert item includes:

  • Process name (PRCSNAME)
  • Process instance number
  • How long the process has been running (in minutes)
  • The operator who submitted the request
  • A link to the Process Monitor detail page for that instance

Configuration

alerts:
  checks:
    long_running_processes:
      enabled: true
      thresholdMinutes: 20       # Minutes before flagging as Warning
      excludeProcesses:          # Process names to skip
        - SOME_LONG_BATCH_JOB
SettingDefaultDescription
thresholdMinutes20Minutes a process must be running to trigger a Warning alert.
excludeProcesses[]List of process names to exclude from this check. Use for known long-running processes that are expected to take a long time.

How to Respond

  1. Click the alert link to go directly to the Process Monitor entry for the flagged process
  2. Review the process details: what it is, who submitted it, when it started
  3. Check whether the process appears to be making progress or is stuck
  4. If the process is genuinely stuck, you may need to cancel it from PeopleSoft’s Process Monitor
  5. Investigate why it got stuck: look for locks, resource contention, or data issues

Tuning the Threshold

The right threshold depends on your environment. If you have batch jobs that are expected to run for 30-60 minutes, set thresholdMinutes to something higher than your longest expected normal runtime. You can also use excludeProcesses to exclude specific jobs from the check rather than raising the threshold for everything.

1.2 - Process Errors

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Process Errors Alert

Alert ID: process_errors Category: Process Scheduler Default lookback: 24 hours

What This Alert Detects

This alert finds Process Scheduler requests that have failed within a configurable lookback window. It catches processes that ended in one of three error statuses:

Run StatusPeopleSoft CodeMeaning
Error3The process ended with an error condition
Not Successful10The process ran but reported a non-success result
Unable to Post12The process output could not be delivered

Severity Logic

Process TypeStatusSeverity
Recurring (on a recurrence schedule)Error (3), Not Successful (10), Unable to Post (12)Critical
Non-Recurring (ad-hoc execution)Error (3), Not Successful (10), Unable to Post (12)Warning
  • Recurring Processes: Any failure fires Critical immediately.
  • Non-Recurring Processes: Fire Warning after the thresholdMinutes grace period.

Alert Details

Each alert item includes:

  • Process name and instance number
  • Run status label (Error, Not Successful, Unable to Post)
  • The operator who submitted the request
  • When the process ran
  • A link to the Process Monitor detail page for that instance

Configuration

alerts:
  checks:
    process_errors:
      enabled: true
      lookbackHours: 24        # How far back to look for failures
      thresholdMinutes: 15     # Grace period buffer in minutes for non-recurring errors
      excludeProcesses:        # Process names to skip
        - KNOWN_FLAKY_PROCESS
SettingDefaultDescription
lookbackHours24Number of hours back to search for failed processes
thresholdMinutes0Grace period buffer (in minutes) for non-recurring process errors before they raise a Warning alert.
excludeProcesses[]List of process names to exclude from this check

How to Respond

  1. Click the alert link to go directly to the Process Monitor entry for the failed process
  2. Review the process details: run status, begin and end times, server
  3. Look for output files or log information that might explain the failure
  4. Check whether this is a one-time failure or a repeating issue
  5. If the process needs to be rerun, submit a new request from PeopleSoft

Common Causes of Process Failures

  • Data errors: The process encountered unexpected data (null values, bad formats, constraint violations)
  • Resource issues: The server ran out of memory or disk space
  • Timeout: The process exceeded its allowed run time
  • Configuration problems: A required configuration parameter is missing or incorrect
  • Dependency failures: A process that runs after another failed because the first one didn’t complete correctly

Reducing Alert Noise

If certain processes fail regularly and you’re already tracking them separately, add them to excludeProcesses to keep the alert list focused on unexpected failures.

1.3 - Backlogged Processes

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Backlogged Processes Alert

Alert ID: backlogged_processes Category: Process Scheduler Default threshold: 30 minutes

What This Alert Detects

This alert finds Process Scheduler requests that are in Queued or Blocked status and whose scheduled run time (RUNDTTM) has already passed by more than the configured threshold.

Severity Logic

ConditionSeverity
Overdue by more than thresholdMinutesWarning
Overdue by more than thresholdMinutes × 2Critical

For example, with the default threshold of 30 minutes:

  • A process scheduled 40 minutes ago that is still queued → Warning
  • A process scheduled 65 minutes ago that is still queued → Critical

What Gets Checked

The alert queries the Process Scheduler request table for processes in run status 5 (Queued) or 18 (Blocked) whose RUNDTTM (scheduled run datetime) is in the past. For each result, it calculates how far past the scheduled time the process is based on RUNDTTM and the current server time.

Processes with no RUNDTTM value are skipped.

Alert Details

Each alert item includes:

  • Process name (PRCSNAME)
  • Process instance number
  • How long the process is overdue (in minutes)
  • Current run status (Queued or Blocked)
  • The operator who submitted the request
  • A link to the Process Monitor detail page for that instance

Configuration

alerts:
  checks:
    backlogged_processes:
      enabled: true
      thresholdMinutes: 30         # Minutes overdue before flagging as Warning
      excludeProcesses:            # Process names to skip
        - SOME_LOW_PRIORITY_JOB
SettingDefaultDescription
thresholdMinutes30Minutes past the scheduled run time before a queued/blocked process triggers a Warning alert. Critical fires at 2× this value.
excludeProcesses[]List of process names to exclude from this check. Use for processes that are known to queue for a long time and are not a concern.

How to Respond

  1. Click the alert link to go directly to the Process Monitor entry for the flagged process
  2. Check whether the Process Scheduler server is running and accepting work
  3. Look at how many processes are currently running on the server. It may have hit its concurrency limit
  4. Check if the process type or class has reached its maximum allowed concurrent instances
  5. For blocked processes, investigate what is blocking them (dependencies, server restrictions, etc.)
  6. If the Process Scheduler server is down, restart it from PeopleSoft’s Process Scheduler administration

Tuning the Threshold

The right threshold depends on how busy your Process Scheduler is. In environments where many jobs are submitted at once, some queuing is normal. Set thresholdMinutes high enough to avoid false positives during peak batch windows but low enough to catch genuine problems. You can also use excludeProcesses to exclude specific low-priority processes that are known to queue for long periods.

1.4 - Locked OPRID Scheduled Processes

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Locked OPRID Scheduled Processes Alert

Alert ID: locked_oprid_processes Category: Process Scheduler

What This Alert Detects

This alert finds queued or scheduled Process Scheduler requests where the submitting operator’s account (OPRID) is currently locked in PSOPRDEFN (ACCTLOCK = 1).

When an operator account is locked after a process has been queued, PeopleSoft will refuse to run the process, or run it under the locked account and immediately fail. PeopleSoft does not surface this condition anywhere obvious: Process Monitor shows the job queued, the operator’s user page shows them locked, but nothing connects the two. This alert does.

Common scenarios:

  • A service or batch account had its password expire and was locked
  • An employee left and their account was locked, but scheduled jobs were not transferred
  • A security lockout from failed login attempts affected a batch account

Severity Logic

All findings are reported at Warning severity. Every queued or scheduled process with a locked submitting account is flagged.

What Gets Checked

The alert queries PSPRCSRQST joined to PSOPRDEFN for process requests in Queued or Scheduled run status where the submitting OPRID has ACCTLOCK = 1.

Alert Details

Each alert item includes:

  • Process name and instance number
  • Submitting OPRID (with link to User detail page)
  • Current run status (Queued, Scheduled, etc.)
  • Scheduled run date/time
  • Recurrence name (if applicable)

Configuration

alerts:
  checks:
    locked_oprid_processes:
      enabled: true
      excludeProcesses: []   # Process names to ignore
SettingDefaultDescription
excludeProcesses[]List of process names to exclude from this check

How to Respond

  1. Click the alert link to open the Process Monitor detail page for the affected instance
  2. Identify the locked OPRID shown in the alert
  3. Navigate to the User detail page to review the account lock status
  4. Either unlock the account (if appropriate) or re-queue the process under an active operator account
  5. For recurring processes, update the recurrence definition to use a non-locked operator
  6. Investigate why the account was locked. If it was a failed login lockout, check the Failed Logins alert for additional context

Tables Queried

TableDescription
PSPRCSRQSTProcess Scheduler request queue
PSOPRDEFNOperator definitions (user accounts)

1.5 - Process Run Check

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Process Run Check Alert

Alert ID: process_run_check Category: Process Scheduler

What This Alert Detects

This alert monitors configured critical processes and fires when one has not completed successfully within its expected time window. It is the alert equivalent of the Process Run Check report. The difference is that this runs automatically on every check cycle and surfaces failures on the dashboard without any manual action.

Use this alert for processes that must run on a regular cadence, such as:

  • Nightly batch jobs that must complete before business hours
  • Data synchronization processes that run every few hours
  • Critical integrations that should run multiple times per day
  • Post-maintenance verification of essential processes

Severity Logic

ConditionSeverity
Process has run recently but not successfully in the configured windowWarning
Process has no run history at allCritical

Configuration

Process checks are configured per process name in config.yaml. Each entry specifies the process name and the number of hours within which a successful run is expected.

alerts:
  checks:
    process_run_check:
      enabled: true
      processChecks:
        SOMEJOBNAME: 24      # Must run successfully within 24 hours
        ANOTHERJOB: 8        # Must run successfully within 8 hours
        NIGHTLY_ETL: 12      # Must run successfully within 12 hours
SettingDefaultDescription
processChecks{}Map of process name to expected run window in hours

If a process name is listed with 0 or a negative value, the check defaults to a 24-hour window.

What Gets Checked

For each configured process, psLens queries PSPRCSRQST for successful runs (RunStatus = 9 / Success) within the configured time window. If none are found, it then checks for any run history to determine severity:

  • No successful run in window + recent run history found: Warning
  • No run history at all: Critical

Alert Details

Each alert item includes:

  • Process name
  • Configured threshold (hours)
  • Last known run status (if any history exists)
  • Last known run time (if any history exists)
  • Link to the Process Definition detail page

How to Respond

  1. Click the alert link to open the Process Definition detail page for the affected process
  2. Review recent run history to understand what happened. Did the process run but fail, or did it not run at all?
  3. Check the Process Scheduler server configuration if the process never ran
  4. Investigate error logs if the process ran but ended in a failed state
  5. If the process ran and succeeded but outside the expected window, consider adjusting the threshold in config.yaml

Tables Queried

TableDescription
PSPRCSRQSTProcess Scheduler request queue and run history

1.6 - Process Scheduler Down

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Process Scheduler Down Alert

Alert ID: process_scheduler_down Category: Process Scheduler Default threshold: 10 minutes

What This Alert Detects

This alert triggers when any active Process Scheduler server registered in PSSERVERSTAT has not reported a status update (heartbeat) within the configured amount of time.

Severity Logic

ConditionSeverity
Heartbeat stale by more than thresholdMinutesWarning
Heartbeat stale by more than thresholdMinutes × 2Critical

For example, with the default threshold of 10 minutes:

  • A scheduler that hasn’t heartbeat’ed for 12 minutes → Warning
  • A scheduler that hasn’t heartbeat’ed for 22 minutes → Critical

What Gets Checked

The alert queries the PSSERVERSTAT table to retrieve all server status definitions. For each active scheduler (status not Down/Offline), it calculates the elapsed time since its LASTUPDDTTM timestamp. If that time exceeds the configured threshold, the alert fires.

Alert Details

Each alert item includes:

  • Server name (SERVERNAME)
  • Current status code and friendly string status (e.g., Running, Error, Suspended)
  • Last heartbeat timestamp (LASTUPDDTTM)
  • Host name (SRVRHOSTNAME)
  • A detailed explanation of how long the heartbeat has been stale
  • A link to the Server Definition detail page for that server

Configuration

alerts:
  checks:
    process_scheduler_down:
      enabled: true
      thresholdMinutes: 10          # Minutes stale before flagging as Warning
      excludeProcesses:             # Server names (e.g., PSUNX, PSNT) to skip
        - PSUNX_OLD
SettingDefaultDescription
thresholdMinutes10Minutes of stale heartbeat status updates before a scheduler triggers a Warning alert. Critical fires at 2× this value.
excludeProcesses[]List of server names to exclude from this check. Use for retired scheduler definitions that linger in PSSERVERSTAT but aren’t cleaned up.

How to Respond

  1. Click the alert link to go directly to the Server Definition detail page for the affected scheduler.
  2. Check the Host Name where the Process Scheduler daemon runs.
  3. Access the server host and verify whether the Process Scheduler processes (e.g., psadmin, PSAESRV, etc.) are running.
  4. Review the Process Scheduler logs (e.g., TUXLOG, SCHED_*.LOG) on the host machine to diagnose why the process has hung or crashed.
  5. If the scheduler has hung, stop the process scheduler daemon and restart it using psadmin.
  6. If the server definition is obsolete or decommissioned, consider deleting it in PeopleSoft Server Definitions configuration to clean up the PSSERVERSTAT row.

1.7 - No Process Completed

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

No Process Completed Alert

Alert ID: no_process_completed Category: Process Scheduler Default lookback: 1 hour

What This Alert Detects

This alert fires when no process has successfully completed within the configured lookback window. It is a broad scheduler health check. If nothing has finished successfully in the past hour, the Process Scheduler may be down, stalled, or not dispatching jobs.

This is distinct from the Process Run Check, which monitors specific named processes. This alert monitors overall scheduler activity.

Severity Logic

ConditionSeverity
Zero successful completions in the lookback windowWarning

What Gets Checked

The alert queries PSPRCSRQST for any process with RunStatus = 9 (Success) and an end datetime within the lookback window. If no rows are returned, the alert fires.

Only one result is needed to resolve the alert. The check uses a limit of 1 for efficiency.

Alert Details

When firing, the alert produces a single item:

  • Summary: No process completed successfully in the last N hour(s)
  • Lookback hours used for the check

Configuration

alerts:
  checks:
    no_process_completed:
      enabled: true
      lookbackHours: 1    # How far back to look for completed processes
SettingDefaultDescription
lookbackHours1How many hours back to look for a successfully completed process.

How to Respond

  1. Check PeopleSoft’s Process Monitor to see if any processes are running, queued, or have recently completed
  2. Verify the Process Scheduler server is running (PeopleSoft > PeopleTools > Process Scheduler > Servers)
  3. If processes are queued but not running, the scheduler daemon may need to be restarted
  4. If this fires regularly during off-hours when no jobs run, increase lookbackHours or disable the alert for those periods

Tuning

If your environment has periods where no batch jobs are expected to run (e.g., overnight maintenance windows), consider increasing lookbackHours to cover those gaps, or disable the alert entirely during those windows.

2 - Integration Broker

Integration Broker alerts: operation errors, contract errors, stalled messages, abnormal volume detection, and sync exceptions.

Integration Broker alerts monitor your PeopleSoft IB infrastructure for errors, stalled messages, volume anomalies, and sync exceptions.

Where to start. For a new environment, enable IB Down, IB Dispatcher Down, IB No Active Domain, and IB Operation Errors first. These four catch the conditions that produce the most pages. Add stalled checks after tuning thresholds for your operations. Volume checks need 24 hours of history before they fire usefully.

AlertDescription
IB Operation ErrorsAsync IB operation instances in Error or Timeout status
IB Publication Contract ErrorsIB publication contracts in Error or Timeout status
IB Subscription Contract ErrorsIB subscription contracts in Error or Timeout status
IB Operations StalledAsync IB operations stuck in New or Working status longer than the threshold
IB Publication Contracts StalledIB publication contracts stuck in New or Working status
IB Subscription Contracts StalledIB subscription contracts stuck in New or Working status
Abnormal IB Operation VolumeIB operation instance volume exceeds the rolling historical average by the configured percentage
Abnormal IB Publication Contract VolumePublication contract volume exceeds the rolling historical average by the configured percentage
Abnormal IB Subscription Contract VolumeSubscription contract volume exceeds the rolling historical average by the configured percentage
IB Sync Operation ExceptionsSynchronous service operations with errors in the sync transaction log (PSIBLOGHDR)
Integration Broker DownSWS connection failure indicating Integration Broker is down
IB No Active DomainDetects when no active message domains exist in PSAPMSGDOMSTAT
IB Dispatcher DownActive domain dispatcher processes that are inactive or stalled
IB Nodes DownMessage nodes registered as down in PSNODESDOWN with blocked transactions

2.1 - IB Operation Errors

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Operation Errors Alert

Alert ID: ib_operation_errors Category: Integration Broker Default lookback: 24 hours

What This Alert Detects

This alert finds asynchronous Integration Broker operation instances that are in Error or Timeout status within a configurable lookback window. These are messages that attempted to process but did not complete successfully.

An IB operation instance represents a single execution of a Service Operation through the Integration Broker. When an instance errors, the message did not reach its destination. The original publish/subscribe data is preserved in the IB tables and can usually be resubmitted from the Service Operations Monitor.

Severity Logic

StatusSeverity
ErrorCritical
TimeoutWarning

Error status means the processing actively failed. Timeout means it ran out of time, which may be a transient issue but still warrants investigation.

Alert Details

Each alert item includes:

  • Operation instance ID
  • Service operation name
  • Status (Error or Timeout)
  • The originating node
  • When the instance was created
  • A link to the IB Monitor detail page

Configuration

alerts:
  checks:
    ib_operation_errors:
      enabled: true
      lookbackHours: 24          # How far back to look for errors
      excludeOperations:         # Operation names to skip
        - SOME_NOISY_OPERATION
SettingDefaultDescription
lookbackHours24Number of hours back to search for error/timeout instances
excludeOperations[]List of IB operation names to exclude from this check

How to Respond

  1. Click the alert link to go to the IB Monitor entry for the failed operation
  2. Review the error details. The IB Monitor shows the error message or exception
  3. Check whether this is a configuration issue (wrong endpoint, auth failure) or a data issue
  4. If the message needs to be reprocessed, you can do so from PeopleSoft’s Service Operations Monitor
  5. If this is a recurring operation, investigate the root cause before errors pile up

Relationship to Other IB Alerts

This alert finds operations that have already ended in error. For operations that are stuck in progress, see IB Operations Stalled.

For similar alerts on publication and subscription contracts, see:

2.2 - IB Operations Stalled

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Operations Stalled Alert

Alert ID: ib_operation_stalled Category: Integration Broker Default threshold: 30 minutes

What This Alert Detects

This alert finds asynchronous Integration Broker operation instances that are stuck in New or Working status and have been in that state longer than the configured threshold. These are messages that started processing (or are waiting to be processed) but have not completed in a reasonable amount of time.

Cross-reference with IB Dispatcher Down first. A stalled queue plus a down dispatcher is almost always the dispatcher.

Severity Logic

ConditionSeverity
Stuck longer than thresholdMinutesWarning
Stuck longer than thresholdMinutes × 2Critical

For example, with the default threshold of 30 minutes:

  • An operation stuck for 35 minutes → Warning
  • An operation stuck for 65 minutes or more → Critical

Alert Details

Each alert item includes:

  • Operation instance ID
  • Service operation name
  • Current status (New or Working)
  • How long it has been stuck (in minutes)
  • The originating node
  • A link to the IB Monitor detail page

Configuration

alerts:
  checks:
    ib_operation_stalled:
      enabled: true
      thresholdMinutes: 30       # Minutes before flagging as Warning
      excludeOperations:         # Operation names to skip
        - BULK_SYNC_OPERATION
SettingDefaultDescription
thresholdMinutes30Minutes an operation must be stuck to trigger a Warning. Critical fires at 2× this value.
excludeOperations[]List of IB operation names to exclude from this check. Use for known long-running operations.

How to Respond

  1. Click the alert link to go to the IB Monitor entry for the stalled operation
  2. Check whether the IB dispatcher/handlers are running on the PeopleSoft application server
  3. Look for signs of a larger IB backlog (many operations in New status)
  4. Check the gateway and connector configuration if the operation can’t reach a node
  5. If the operation is safe to reprocess, you can cancel and resubmit from PeopleSoft’s Service Operations Monitor

Relationship to Other IB Alerts

This alert finds operations that are stuck in progress. For operations that have already ended in error, see IB Operation Errors.

For similar alerts on publication and subscription contracts, see:

2.3 - IB Publication Contract Errors

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Publication Contract Errors Alert

Alert ID: ib_pub_contract_errors Category: Integration Broker Default lookback: 24 hours

What This Alert Detects

This alert finds Integration Broker publication contracts that are in Error or Timeout status within a configurable lookback window.

In the PeopleSoft Integration Broker architecture, a publication contract tracks the delivery of a published message to a specific subscribing node. When a publication contract fails, it means a message that PeopleSoft published was not successfully delivered to one or more subscribers.

When This Matters

Publication contract errors typically mean:

  • An outbound message from PeopleSoft was not delivered to a downstream system
  • An integration partner did not receive data it was expecting
  • A workflow or data sync that depends on this message may be incomplete

Severity Logic

StatusSeverity
ErrorCritical
TimeoutWarning

Alert Details

Each alert item includes:

  • Publication contract ID
  • Service operation name
  • Status (Error or Timeout)
  • The target subscribing node
  • When the contract was created
  • A link to the IB Monitor detail page

Configuration

alerts:
  checks:
    ib_pub_contract_errors:
      enabled: true
      lookbackHours: 24          # How far back to look for errors
      excludeOperations:         # Operation names to skip
        - HIGH_VOLUME_SYNC
SettingDefaultDescription
lookbackHours24Number of hours back to search for error/timeout contracts
excludeOperations[]List of IB operation names to exclude from this check

How to Respond

  1. Click the alert link to go to the IB Monitor entry for the failed contract
  2. Review the error details. The IB Monitor shows the error message
  3. Check whether the target subscribing node is reachable and its connector is configured correctly
  4. Review the node’s authentication settings (see the Nodes with No Password report if auth may be the issue)
  5. If the message can be safely reprocessed, resubmit from PeopleSoft’s Publication Contracts Monitor

Relationship to Other IB Alerts

For pub contracts stuck in progress, see IB Publication Contracts Stalled.

For similar alerts on async operations and subscription contracts:

2.4 - IB Publication Contracts Stalled

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Publication Contracts Stalled Alert

Alert ID: ib_pub_contract_stalled Category: Integration Broker Default threshold: 30 minutes

What This Alert Detects

This alert finds Integration Broker publication contracts that are stuck in New or Working status and have not progressed beyond that state within the configured threshold.

A stalled publication contract means PeopleSoft has published a message to a subscriber, but the delivery has not completed. The message is either waiting to be picked up (New) or is in the process of being delivered but taking too long (Working).

When This Matters

Check IB Nodes Down and IB Dispatcher Down first. A stalled pub contract is usually one of those two conditions.

Severity Logic

ConditionSeverity
Stuck longer than thresholdMinutesWarning
Stuck longer than thresholdMinutes × 2Critical

Alert Details

Each alert item includes:

  • Publication contract ID
  • Service operation name
  • Current status (New or Working)
  • How long it has been stuck (in minutes)
  • The target subscribing node
  • A link to the IB Monitor detail page

Configuration

alerts:
  checks:
    ib_pub_contract_stalled:
      enabled: true
      thresholdMinutes: 30       # Minutes before flagging as Warning
      excludeOperations:         # Operation names to skip
        - LARGE_BATCH_SYNC
SettingDefaultDescription
thresholdMinutes30Minutes a contract must be stuck to trigger a Warning. Critical fires at 2× this value.
excludeOperations[]List of IB operation names to exclude. Use for operations that legitimately take a long time.

How to Respond

  1. Click the alert link to go to the IB Monitor entry for the stalled contract
  2. Check whether the IB dispatchers are running on the PeopleSoft application server
  3. Look for a larger backlog (many contracts in New status may mean the dispatcher is down)
  4. Check whether the target node’s endpoint is reachable
  5. Review connector configuration for the target node

Relationship to Other IB Alerts

For pub contracts that have already ended in error, see IB Publication Contract Errors.

For similar stalled alerts on operations and subscription contracts:

2.5 - IB Subscription Contract Errors

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Subscription Contract Errors Alert

Alert ID: ib_sub_contract_errors Category: Integration Broker Default lookback: 24 hours

What This Alert Detects

This alert finds Integration Broker subscription contracts that are in Error or Timeout status within a configurable lookback window.

In the PeopleSoft Integration Broker architecture, a subscription contract tracks the processing of an inbound message by a subscribing handler. When a subscription contract fails, it means PeopleSoft received a message from an external system but was unable to process it completely.

When This Matters

Subscription contract errors typically mean:

  • An inbound message from an integration partner was not fully processed
  • PeopleSoft was unable to apply the data changes the message carried
  • A business process that depends on this message may be incomplete or in an error state

Severity Logic

StatusSeverity
ErrorCritical
TimeoutWarning

Alert Details

Each alert item includes:

  • Subscription contract ID
  • Service operation name
  • Status (Error or Timeout)
  • The originating (publishing) node
  • When the contract was created
  • A link to the IB Monitor detail page

Configuration

alerts:
  checks:
    ib_sub_contract_errors:
      enabled: true
      lookbackHours: 24          # How far back to look for errors
      excludeOperations:         # Operation names to skip
        - KNOWN_RETRY_OPERATION
SettingDefaultDescription
lookbackHours24Number of hours back to search for error/timeout contracts
excludeOperations[]List of IB operation names to exclude from this check

How to Respond

  1. Click the alert link to go to the IB Monitor entry for the failed contract
  2. Review the error details. The IB Monitor typically shows the exception or error message from the handler PeopleCode
  3. Check the subscription handler code for the operation (viewable in the Service Operation detail page in psLens)
  4. Investigate whether the data in the message is valid. Handler errors often come from unexpected data
  5. If the subscription can be safely reprocessed, resubmit from PeopleSoft’s Subscription Contracts Monitor

Relationship to Other IB Alerts

For sub contracts stuck in progress, see IB Subscription Contracts Stalled.

For similar alerts on operations and publication contracts:

2.6 - IB Subscription Contracts Stalled

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Subscription Contracts Stalled Alert

Alert ID: ib_sub_contract_stalled Category: Integration Broker Default threshold: 30 minutes

What This Alert Detects

This alert finds Integration Broker subscription contracts that are stuck in New or Working status and have not progressed beyond that state within the configured threshold.

A stalled subscription contract means PeopleSoft received an inbound message from an external system, but has not yet finished processing it. The message is either waiting to be handled (New) or is in the process of being handled but taking too long (Working).

When This Matters

For sub contracts, the most common cause specific to this alert is subscription handler PeopleCode that runs unusually long or waits on an external resource. For broader IB-side causes (dispatcher down, backlog), cross-reference IB Dispatcher Down and IB Operations Stalled.

Severity Logic

ConditionSeverity
Stuck longer than thresholdMinutesWarning
Stuck longer than thresholdMinutes × 2Critical

Alert Details

Each alert item includes:

  • Subscription contract ID
  • Service operation name
  • Current status (New or Working)
  • How long it has been stuck (in minutes)
  • The originating (publishing) node
  • A link to the IB Monitor detail page

Configuration

alerts:
  checks:
    ib_sub_contract_stalled:
      enabled: true
      thresholdMinutes: 30       # Minutes before flagging as Warning
      excludeOperations:         # Operation names to skip
        - BULK_INBOUND_SYNC
SettingDefaultDescription
thresholdMinutes30Minutes a contract must be stuck to trigger a Warning. Critical fires at 2× this value.
excludeOperations[]List of IB operation names to exclude. Use for known long-running handlers.

How to Respond

  1. Click the alert link to go to the IB Monitor entry for the stalled contract
  2. Check whether the IB dispatchers are running on the PeopleSoft application server
  3. Look for a broader backlog. Many New contracts may mean no dispatcher is running
  4. Check application server logs for errors in the subscription handler
  5. For Working contracts, check whether the handler PeopleCode is looping or waiting on an external resource

Relationship to Other IB Alerts

For sub contracts that have already ended in error, see IB Subscription Contract Errors.

For similar stalled alerts on operations and publication contracts:

2.7 - Abnormal IB Operation Volume

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Abnormal IB Operation Volume Alert

Alert ID: ib_operation_volume Category: Integration Broker

What This Alert Detects

This alert detects when the volume of IB async operation instances (PSAPMSGPUBHDR) is significantly higher than the rolling historical average.

A sudden volume spike may indicate a runaway integration sending messages in a loop, an upstream system retrying failed messages, or an unusually large but legitimate batch publish event.

How Baselining Works

psLens maintains a rolling history of up to 288 volume snapshots (approximately 24 hours at a 5-minute check interval). Each check cycle records the current message count for the lookback window.

Once at least 6 baseline snapshots have accumulated, the alert begins comparing the current count against the historical average. This prevents false alerts during the first few minutes after psLens starts.

Severity Logic

ConditionSeverity
Volume exceeds average by >= thresholdPercentWarning
Volume exceeds average by >= thresholdPercent x 2Critical
Historical average is 0 and current count is > 0Warning

For example, with the default threshold of 50%:

  • Current count is 75% above average → Warning
  • Current count is 100%+ above average → Critical

Configuration

alerts:
  checks:
    ib_operation_volume:
      enabled: true
      lookbackHours: 1         # Window for counting current messages
      thresholdCount: 50       # Percentage increase to trigger Warning
SettingDefaultDescription
lookbackHours1Hours to look back when counting current message volume
thresholdCount50Percentage increase over historical average to trigger a Warning alert. Critical fires at 2x this value.

Alert Details

Each alert item includes:

  • Current message count for the lookback window
  • Historical average count
  • Percentage increase above average
  • Number of baseline samples used
  • Link to the IB Monitor page

How to Respond

  1. Navigate to the IB Monitor in psLens to see which operations are generating the volume
  2. Check if any operations are in Error or Stalled status (see related IB alerts)
  3. Review the specific operations with high message counts to determine if the volume is expected
  4. If a runaway integration is identified, investigate the upstream system sending the messages
  5. Consider using the Daily IB Volume report to review historical patterns

Tables Queried

TableDescription
PSAPMSGPUBHDRIB async operation instance headers

2.8 - Abnormal IB Publication Contract Volume

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Abnormal IB Publication Contract Volume Alert

Alert ID: ib_pub_contract_volume Category: Integration Broker

What This Alert Detects

This alert detects when the volume of IB publication contracts (PSAPMSGPUBCON) is significantly higher than the rolling historical average. Publication contracts represent messages being delivered to subscribing nodes, so a spike here can indicate unexpected fan-out, retries, or a high-volume event.

A sudden volume spike may indicate repeated retries of failed contracts inflating the count, an upstream system publishing in a loop, or a legitimate but unusually large batch publication event.

How Baselining Works

psLens maintains a rolling history of up to 288 volume snapshots (approximately 24 hours at a 5-minute check interval). Each check cycle records the current publication contract count for the lookback window.

Once at least 6 baseline snapshots have accumulated, the alert begins comparing the current count against the historical average.

Severity Logic

ConditionSeverity
Volume exceeds average by >= thresholdPercentWarning
Volume exceeds average by >= thresholdPercent x 2Critical
Historical average is 0 and current count is > 0Warning

For example, with the default threshold of 50%:

  • Current count is 75% above average → Warning
  • Current count is 100%+ above average → Critical

Configuration

alerts:
  checks:
    ib_pub_contract_volume:
      enabled: true
      lookbackHours: 1         # Window for counting current contracts
      thresholdCount: 50       # Percentage increase to trigger Warning
SettingDefaultDescription
lookbackHours1Hours to look back when counting current contract volume
thresholdCount50Percentage increase over historical average to trigger a Warning alert. Critical fires at 2x this value.

Alert Details

Each alert item includes:

  • Current publication contract count for the lookback window
  • Historical average count
  • Percentage increase above average
  • Number of baseline samples used
  • Link to the IB Monitor page

How to Respond

  1. Navigate to the IB Monitor in psLens to see which publication contracts are generating the volume
  2. Check for contracts in Error status (see IB Publication Contract Errors alert)
  3. Review whether retries are inflating the count. Repeated errors create new contract instances.
  4. Consider using the Daily IB Volume report to compare against historical patterns

Tables Queried

TableDescription
PSAPMSGPUBCONIB publication contract records

2.9 - Integration Broker Down

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Integration Broker Down Alert

Alert ID: ib_down Category: Integration Broker Default threshold: Immediate

What This Alert Detects

This alert fires when the SWS endpoint returns a connection error, timeout, 404, or 5xx. psLens cannot fetch IB data without SWS, so this also blocks every other psLens IB check. A connection failure indicates that the Integration Broker gateway, local node, or application server is down.

Severity Logic

ConditionSeverity
SWS connection attempt failsCritical

What Gets Checked

The alert invokes the standard SWS connection test method. If it receives any network connection error (e.g., HTTP gateway timeout, connection refused, dns resolving failure, or HTTP 404/500 errors), a Critical alert is raised immediately.

Alert Details

Each alert item includes:

  • The base REST URL of the SWS endpoint (baseURL)
  • The raw network or connection error message
  • Troubleshooting links to verify connection settings

Configuration

alerts:
  checks:
    ib_down:
      enabled: true
SettingDefaultDescription
enabledtrueWhether this check is active.

How to Respond

  1. Verify that the PeopleSoft Web Server and PIA are up and running.
  2. Check the Integration Gateway web application status (typically /PSIGW/PeopleSoftListeningConnector).
  3. Ensure that the application server domain is booted and the Integration Broker handlers/dispatchers are active.
  4. Verify there are no firewalls, proxies, or security policies blocking outbound HTTPS traffic from the psLens server to the Integration Broker gateway port.
  5. Inspect the basic auth credentials in the config.yaml to ensure the API service user has not expired or been locked.

2.10 - Abnormal IB Subscription Contract Volume

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Abnormal IB Subscription Contract Volume Alert

Alert ID: ib_sub_contract_volume Category: Integration Broker

What This Alert Detects

This alert detects when the volume of IB subscription contracts (PSAPMSGSUBCON) is significantly higher than the rolling historical average. Subscription contracts represent inbound messages being processed by local subscribers, so a spike here can indicate a flood of inbound messages, excessive retries, or an integration firing more frequently than expected.

How Baselining Works

psLens maintains a rolling history of up to 288 volume snapshots (approximately 24 hours at a 5-minute check interval). Each check cycle records the current subscription contract count for the lookback window.

Once at least 6 baseline snapshots have accumulated, the alert begins comparing the current count against the historical average.

Severity Logic

ConditionSeverity
Volume exceeds average by >= thresholdPercentWarning
Volume exceeds average by >= thresholdPercent x 2Critical
Historical average is 0 and current count is > 0Warning

For example, with the default threshold of 50%:

  • Current count is 75% above average → Warning
  • Current count is 100%+ above average → Critical

Configuration

alerts:
  checks:
    ib_sub_contract_volume:
      enabled: true
      lookbackHours: 1         # Window for counting current contracts
      thresholdCount: 50       # Percentage increase to trigger Warning
SettingDefaultDescription
lookbackHours1Hours to look back when counting current contract volume
thresholdCount50Percentage increase over historical average to trigger a Warning alert. Critical fires at 2x this value.

Alert Details

Each alert item includes:

  • Current subscription contract count for the lookback window
  • Historical average count
  • Percentage increase above average
  • Number of baseline samples used
  • Link to the IB Monitor page

How to Respond

  1. Navigate to the IB Monitor in psLens to see which subscription contracts are generating the volume
  2. Check for contracts in Error status (see IB Subscription Contract Errors alert)
  3. Identify which operations are receiving the high volume of inbound messages
  4. Contact the sending system’s administrators if the volume is unexpected
  5. Consider using the Daily IB Volume report to compare against historical patterns

Tables Queried

TableDescription
PSAPMSGSUBCONIB subscription contract records

2.11 - IB No Active Domain

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB No Active Domain Alert

Alert ID: ib_no_active_domain Category: Integration Broker Default threshold: Immediate

What This Alert Detects

This alert triggers when no active message domains are found in the Integration Broker. In PeopleSoft, message domains correspond to individual application servers running the Integration Broker background processes. If there are no active domains, Integration Broker cannot route, dispatch, or process any asynchronous publication or subscription messages.

Severity Logic

ConditionSeverity
No message domains are in “Active” statusCritical

What Gets Checked

The alert queries the PSAPMSGDOMSTAT table to retrieve all registered domains. It counts the domains where the status is "A" (Active). If the count of active domains is zero, it triggers a Critical alert.

Alert Details

Each alert item includes:

  • The total count of registered domains
  • A list of inactive domains along with their machine names and app server paths
  • A link to the Integration Broker Monitor on the dashboard to inspect the domain statuses

Configuration

alerts:
  checks:
    ib_no_active_domain:
      enabled: true
SettingDefaultDescription
enabledtrueWhether this check is active.

How to Respond

  1. Log into the PeopleSoft server or use PSAdmin to check the status of the application server domains.
  2. Verify if the Quick-Start or normal app server configuration has Integration Broker (Pub/Sub) processes enabled.
  3. If the application server was recently restarted, check if the domains were configured to boot automatically or if they need to be booted manually.
  4. Check system logs for application server boot crashes, memory errors, or database connection failures.

2.12 - IB Dispatcher Down

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Dispatcher Down Alert

Alert ID: ib_dispatcher_down Category: Integration Broker Default threshold: 10 minutes

What This Alert Detects

This alert monitors Integration Broker dispatcher processes (such as the publication dispatcher, subscription dispatcher, or handler dispatchers) on active domains and triggers when any dispatcher is inactive or has stopped reporting health updates.

When a dispatcher process is down, messages assigned to that dispatcher fail to process and queue up indefinitely, leading to a backlog.

Severity Logic

ConditionSeverity
Dispatcher status is not active, or no health update within thresholdMinutesWarning
Dispatcher has not updated health status for more than thresholdMinutes × 2Critical

For example, with the default threshold of 10 minutes:

  • No health update for 11 minutes → Warning
  • No health update for 20+ minutes → Critical

What Gets Checked

The alert queries PSAPMSGDSPSTAT for dispatcher process statuses. For each dispatcher associated with an Active domain (from PSAPMSGDOMSTAT), it verifies:

  1. The status string is "ACT" (Active).
  2. The health timestamp (DspHealthDttm) has been updated within the threshold window.

Note: Dispatchers on inactive domains are ignored by this check (they are covered by the IB No Active Domain check).

Alert Details

Each alert item includes:

  • Dispatcher process name
  • Physical machine/host name
  • App server path
  • The last updated health timestamp
  • Reason for the down status (e.g. status not active, or elapsed minutes since last update)
  • A link to the Integration Broker Monitor page

Configuration

alerts:
  checks:
    ib_dispatcher_down:
      enabled: true
      thresholdMinutes: 10
SettingDefaultDescription
thresholdMinutes10Minutes a dispatcher can go without updating health before raising a Warning. Critical fires at 2× this threshold.

How to Respond

  1. Go to the Integration Broker Monitor on the psLens dashboard to identify which specific dispatcher on which host is failing.
  2. Log into the affected PeopleSoft application server and check the status of the dispatcher processes via PSAdmin.
  3. Review the application server and Pub/Sub subdirectories log files (such as APPSRV.log, TUXLOG, or stderr / stdout logs) for crashes, Tuxedo errors, or database lockups.
  4. Restart the Pub/Sub processes on the application server if the dispatcher has locked up or crashed.

2.13 - IB Sync Operation Exceptions

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Sync Operation Exceptions Alert

Alert ID: ib_sync_exceptions Category: Integration Broker

What This Alert Detects

This alert detects synchronous IB service operations that have logged errors in the sync transaction log (PSIBLOGHDR) within the configured lookback window. Results are aggregated by operation name to avoid noise from high-volume sync operations that may have individual errors but function normally overall.

Unlike async IB errors (which create persistent queue entries), sync operation errors are logged transiently in PSIBLOGHDR. Without active monitoring, these errors are often missed until a consuming system reports a problem.

Severity Logic

All findings are reported at Critical severity. Sync errors are always surfaced immediately because they affect real-time integrations.

What Gets Checked

The alert queries PSIBLOGHDR for records with error status within the lookback window, aggregated by IB operation name. Only operations with at least one error are reported. Operations in the exclude list are skipped.

Alert Details

Each alert item includes:

  • Service operation name (with link to Service Operation detail page)
  • Number of errors in the lookback window
  • Lookback window in hours

Configuration

alerts:
  checks:
    ib_sync_exceptions:
      enabled: true
      lookbackHours: 24          # How far back to check for sync errors
      excludeOperations: []      # Operation names to ignore
SettingDefaultDescription
lookbackHours24Hours to look back for sync operation errors
excludeOperations[]List of operation names to exclude from this check

How to Respond

  1. Click the alert link to open the Service Operation detail page
  2. Review the operation’s handler configuration and routing
  3. Check PSIBLOGHDR directly for the specific error messages (psLens links to the operation, not individual log entries)
  4. Verify that the operation’s service handler is still functional and the underlying code has not changed
  5. Contact the calling system to understand whether they are seeing failures on their end
  6. Use the Sync Operations Without Logging report to identify any sync operations that are not logging and may have undetected errors

Prerequisites

The PSIBLOGHDR table must be whitelisted in the PeopleSoft SWS framework on each target environment. Sync logging must be enabled on the relevant routings. If logging is disabled, errors will not appear in PSIBLOGHDR. See the Sync Operations Without Logging report to find operations where logging may be disabled.

Tables Queried

TableDescription
PSIBLOGHDRIB sync transaction log headers

2.14 - IB Nodes Down

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

IB Nodes Down Alert

Alert ID: ib_nodes_down Category: Integration Broker Default threshold: Immediate

What This Alert Detects

This alert triggers when there are entries in the PeopleSoft table PSNODESDOWN. An entry in this table indicates that a message node is blocked or offline.

When the Integration Broker attempts to publish a message to an external or remote node and the connection fails, the system registers the node as “down” in PSNODESDOWN. While the node is down, all subsequent publication contracts to that node are automatically paused/blocked and remain queued in the database until the node status is resolved.

Severity Logic

ConditionSeverity
Message node entry exists in PSNODESDOWNCritical

What Gets Checked

The alert queries PSNODESDOWN for any active rows. For each row found, it groups the results by message node and counts the number of blocked transactions.

Alert Details

Each alert item includes:

  • The name of the offline message node
  • The count of blocked transactions queueing for the node
  • A link to the Node detail page in the browser showing connection diagnostics

Configuration

alerts:
  checks:
    ib_nodes_down:
      enabled: true
SettingDefaultDescription
enabledtrueWhether this check is active.

How to Respond

  1. Click the node link in the alert to inspect the node configuration and test connectivity.
  2. Check if the external target service (e.g. an external API gateway, third-party system, or another PeopleSoft application node) is offline or undergoing maintenance.
  3. Verify network routing and gateway connector configurations.
  4. Once the external endpoint is confirmed to be healthy, delete the down-node status entry in PeopleSoft (typically via the Service Operations Monitor > Administration > Nodes Down page) and resubmit or force retry the stalled publication contracts.

3 - Security

Security alerts: failed login detection and authentication monitoring.

Security alerts monitor your PeopleSoft environment for authentication issues and suspicious login activity.

AlertDescription
Failed LoginsUsers with excessive failed login attempts in PSPTLOGINAUDIT

3.1 - Failed Logins

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Failed Logins Alert

Alert ID: failed_logins Category: Security Default threshold: 5 failed attempts

What This Alert Detects

This alert finds PeopleSoft users with excessive failed login attempts by querying the PSPTLOGINAUDIT table. It only reports users whose most recent login attempt was a failure (PT_SIGNON_STATUS = 1).

PSPTLOGINAUDIT stores only the last login state per user. Once a user successfully logs in, their failure count resets. This means the alert reflects the current state: users who are actively failing to log in right now.

A high number of failed logins may indicate:

  • A brute-force attack against a user account
  • A user who has forgotten their password
  • An integration or batch account with stale credentials
  • An account lockout situation that needs admin attention

Severity Logic

ConditionSeverity
Failed logins >= thresholdCountWarning
Failed logins >= thresholdCount x 2Critical

For example, with the default threshold of 5:

  • A user with 6 failed logins -> Warning
  • A user with 10 or more failed logins -> Critical

What Gets Checked

The alert queries PSPTLOGINAUDIT for rows where:

  • PT_SIGNON_STATUS = '1' (last attempt was a failure)
  • FAILEDLOGINS >= threshold (failed count meets or exceeds the configured threshold)

Results are ordered by FAILEDLOGINS descending (highest failure counts first).

Alert Details

Each alert item includes:

  • Signon ID (PTSIGNONID) — the username entered at the login screen
  • OPRID — the resolved PeopleSoft user ID
  • Number of failed login attempts
  • Authentication type (Token/SSO, Signon PeopleCode, or Standard)
  • Timestamp of the last failed login attempt
  • A link to the User detail page (when the OPRID is resolved)

Configuration

alerts:
  checks:
    failed_logins:
      enabled: true
      thresholdCount: 5    # Failed attempts before flagging as Warning
SettingDefaultDescription
thresholdCount5Number of failed logins to trigger a Warning alert. Critical fires at 2x this value.

How to Respond

  1. Click the alert link to go to the User detail page for the affected account
  2. Check the authentication type. Token/SSO failures may indicate a misconfigured integration
  3. Review the timestamp. Recent failures are more concerning than old ones
  4. Check if the user’s account is locked (ACCTLOCK in PSOPRDEFN)
  5. If the failures look like a brute-force attempt, consider locking the account and investigating the source
  6. For legitimate users, help them reset their password and unlock their account

PeopleSoft Table Reference

This alert queries the PSPTLOGINAUDIT Tools table. For more details on this table, see Exploring the PSPTLOGINAUDIT Tools Table.

Prerequisites

The PSPTLOGINAUDIT table must be whitelisted in the PeopleSoft SWS framework on each target environment. If the table is not whitelisted, this alert will log an error on each check cycle but will not affect other alerts.

4 - Web Server / WebLib Down

Tailored Operational Context
  • Target Database:
  • Context Type:
  • Alert Severity:
  • Triggered Time:
  • Firing Context:

Web Server / WebLib Down Alert

Alert ID: weblib_down Category: Web Server / WebLib Default threshold: Immediate

What This Alert Detects

psLens POSTs to one or more configured WebLib or IScript URLs every check cycle. If any target connection fails or returns a 5xx, the alert fires Critical.

By default, the checker tests the standard delivered WebLib endpoint: WEBLIB_PTBR.ISCRIPT1.FieldFormula.IScript_StartPage

If the Web Server is down, or if the whitelisted API service account loses security access to the tested WebLib, the alert triggers immediately.

Severity Logic

ConditionSeverity
Target URL is unreachable or connection times outCritical
Target WebLib returns an HTTP status code >= 500 (Internal Server Error)Critical

Note: HTTP statuses like 200 (Success), 401 (Unauthorized), 403 (Forbidden), or 302 (Redirect) indicate the Web Server is active and processing requests; they do not trigger a down alert.

What Gets Checked

For each target URL configured in weblibTestTargets:

  1. It sends an HTTP POST request with a 10-second timeout.
  2. It applies HTTP Basic Authentication using either target-specific credentials or default database connection credentials.
  3. It registers a failure if the request fails to connect or returns a status code in the 5xx range.

Configuration

alerts:
  checks:
    weblib_down:
      enabled: true
      weblibTestTargets:
        - url: "https://pia.yourcompany.com/psc/ps/s/WEBLIB_PTBR.ISCRIPT1.FieldFormula.IScript_StartPage"
          username: "PIA_TEST_USER"
          password: "secure-password"
SettingDefaultDescription
enabledtrueWhether this check is active.
weblibTestTargets[]List of target configurations. Each target requires a url and optional username / password overrides.

How to Respond

  1. Verify whether the PeopleSoft Web Server (WebLogic or WebSphere) process is booted and running.
  2. Check if there are network outages, load balancer failures, or firewall changes between the psLens server and the PIA URL.
  3. If the server is reachable but returning a weblib_down alert, verify that the configured PeopleSoft service account has security clearance (assigned Permission Lists) to access the target WebLib.

5 - Generic SWS Alerts

Generic SWS Alerts

Generic SWS Alerts let you define alert checks in YAML using PsoftQL queries against whitelisted PeopleSoft records. Use these when you need a one-off check that the built-in alert types don’t cover.

The scheduler runs each query on the database’s checking interval and triggers alerts based on the resulting row counts.


Configuration Properties

Generic alerts are configured under the genericSWSAlerts list, either globally under alerts or overridden per-database under databases[].alerts.

PropertyTypeRequiredDefaultDescription
idStringYes-Unique alphanumeric identifier. The system registers the alert internally as generic_sws_<id>.
nameStringYes-Friendly name shown on the dashboard and in reports (e.g. Stale Admins).
enabledBooleanNotrueToggle execution of this generic alert.
severityStringNowarningSeverity of the alert when triggered: info, warning, or critical.
alertOnStringNorow_foundCondition to trigger the alert: row_found (trigger if row count > 0) or no_result_found (trigger if row count == 0).
messageStringYes-Summary message shown on the dashboard and sent in notifications when the alert triggers.
queryObjectYes-A complete PsoftQL query request payload. See PsoftQL Query Structure.

Whitelisting Security Requirement


PsoftQL Query Structure

The query property follows the exact structure of a psLens PsoftQLRequest query:

PropertyTypeDescription
recordsArrayList of record configurations to query (can be nested for joins).
rowLimitIntegerMax rows to return (recommended to keep low, e.g. 5 or 10).
orderByStringSQL ORDER BY clause for sorting findings.
noEffectiveDateLogicBooleanSet true to skip automatic EFFDT filtering logic.
noEffectiveStatusLogicBooleanSet true to skip automatic EFF_STATUS = 'A' filtering logic.

Record Configuration (records[])

  • recordName (String, Required): PeopleSoft record name (e.g., PSOPRDEFN).
  • sqlWhereClause (String, Optional): Filter criteria SQL fragment (e.g., ACCTLOCK = 1).
  • excludeFields (List, Optional): Field names to exclude from results.

Practical Examples

Example 1: Critical Administrative Account Access (Row Found)

This alert triggers a Critical warning if an administrator account has been modified recently, or if a locked/inactive operator is seen initiating processes.

alerts:
  genericSWSAlerts:
    - id: "locked_oprid_activity"
      name: "Locked Admin Activity"
      enabled: true
      severity: "critical"
      alertOn: "row_found"
      message: "Security warning: Activity detected from locked operator accounts!"
      query:
        records:
          - recordName: "PSPRCSRQST"
            sqlWhereClause: "RUNDTTM > CAST(SYSDATE - 1 AS DATE) AND OPRID IN (SELECT OPRID FROM PSOPRDEFN WHERE ACCTLOCK = 1)"
        rowLimit: 5

Example 2: Process Scheduler Daemon Down (No Result Found)

This alert triggers a Critical warning if no process scheduler daemon has updated its status in the last 15 minutes, indicating that the scheduler might be down.

alerts:
  genericSWSAlerts:
    - id: "scheduler_daemon_down"
      name: "Process Scheduler Daemon Status"
      enabled: true
      severity: "critical"
      alertOn: "no_result_found"
      message: "Alert: No active process scheduler daemons detected in the last 15 minutes!"
      query:
        records:
          - recordName: "PSSERVERDEFN"
            sqlWhereClause: "LASTUPDDTTM > CAST(SYSDATE - 1/96 AS DATE)" # 15 minutes lookback
        rowLimit: 1

Notification Routing

To route notifications for a generic SWS alert, use its registered ID (generic_sws_<id>) in the alertTypes property of your notification subscription:

notifications:
  subscriptions:
    - id: "critical-teams-webhooks"
      enabled: true
      alertTypes:
        - "generic_sws_locked_oprid_activity"
        - "generic_sws_scheduler_daemon_down"
      databases: ["*"]
      type: "webhook"
      target: "https://hooks.slack.com/services/..."