# Deployment & Operations

> How psLens is deployed, plus its network requirements, sizing, backup, upgrade, DR, and monitoring, consolidated for ops and security reviewers.

---


This page consolidates the operational story that an IT or DBA reviewer needs in one place. It overlaps with [Installation](/docs/getting-started/installation/) and [Deployment Options](/docs/getting-started/deployment-options/); those are the *how-to* references, and this page is the *what-to-expect* security-review companion.

---

## 1. Deployment Model

- **One container per customer.** Each customer gets a dedicated psLens deployment: separate process, separate NATS instance, separate `/data` volume.
- **No shared multi-tenant backend.** There is no Cedar Hills Group SaaS plane that customer instances talk to. Your psLens instance talks to your PeopleSoft and (optionally) your SMTP, and that is it.
- **Two hosting options:**

  |    Mode     |    Operated by    |                                           Where it runs                                           |
  | ----------- | ----------------- | ------------------------------------------------------------------------------------------------- |
  | Managed     | Cedar Hills Group | fly.io, in the region you choose at provisioning                                                  |
  | Self-hosted | You               | Docker, docker-compose, Kubernetes, or systemd on a Linux VM. Your cloud, on-prem, or air-gapped. |

The choice is reversible. You can start managed and migrate to self-hosted (or vice versa). The data volume is portable and the configuration travels.

---

## 2. Network Requirements

### Inbound

|          From          |                       To                       |              Why               |
| ---------------------- | ---------------------------------------------- | ------------------------------ |
| Your users (browsers)  | psLens UI on HTTPS (port per your TLS pattern) | The whole point                |
| Your monitoring system | `GET /healthz`                                 | Liveness check, returns 200 OK |

### Outbound

|  From  |            To             |                                     When                                     |
| ------ | ------------------------- | ---------------------------------------------------------------------------- |
| psLens | Your SWS endpoint (HTTPS) | Every request that hits PeopleSoft                                           |
| psLens | Your SMTP server          | Only if `auth.enabled: true` and you use the built-in magic-link authentication |
| psLens | Anywhere else             | **Never.** No telemetry, no update checks, no callback to Cedar Hills Group. |

This means psLens runs cleanly behind strict egress filtering. Allow it the SWS hostname (and SMTP host if applicable) and deny everything else.
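One way to confirm that an egress policy matches the table above is to probe from the host or network namespace where psLens runs: the allowed destinations should connect and anything else should not. The sketch below assumes an `nc` build that supports `-z` and `-w`; the function names and the hostnames in the usage comment are placeholders for illustration.

```bash
#!/usr/bin/env bash
# Sketch: verify egress filtering from where psLens runs.
# Assumes an nc (netcat) that supports -z (probe only) and -w (timeout).

# can_connect HOST PORT: succeeds if a TCP connection opens within 5 seconds.
can_connect() {
  nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# check_egress EXPECT HOST PORT, where EXPECT is "open" or "blocked".
# Prints PASS or FAIL and returns non-zero on a mismatch.
check_egress() {
  local expect="$1" host="$2" port="$3" actual
  if can_connect "$host" "$port"; then
    actual=open
  else
    actual=blocked
  fi
  if [ "$actual" = "$expect" ]; then
    echo "PASS ${host}:${port} is ${actual}"
  else
    echo "FAIL ${host}:${port} is ${actual}, expected ${expect}"
    return 1
  fi
}

# Example (placeholder hostnames):
#   check_egress open    sws.example.internal 443
#   check_egress blocked example.com          443
```

A destination that the firewall silently drops will take the full five-second timeout to report as blocked.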

### TLS termination

Cross-link: [Deployment Options → TLS](/docs/getting-started/deployment-options/) compares six patterns (native cert, Let's Encrypt, Caddy, nginx, Traefik, Tailscale Serve). Pick the one that matches your existing edge.

---

## 3. Sizing

Recommended minimums:

| Resource |     Minimum     |                                 Notes                                 |
| -------- | --------------- | --------------------------------------------------------------------- |
| CPU      | 1 vCPU          | Matches the smallest fly.io machine class currently in production     |
| RAM      | 512 MB          | Most requests are well under this; reports can spike briefly          |
| Disk     | 1 GB persistent | NATS KV + uploaded project archives; grow if you upload many projects |
| Network  | ~negligible     | Metadata queries are small; most traffic is HTML rendering            |

For deployments handling many concurrent users or running heavy reports, scale up to 2 vCPU / 1 GB RAM. psLens is a single Go process; vertical scaling is the path, and there is no clustering model today.
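For self-hosted Docker deployments, the figures above map onto container resource limits. The sketch below is illustrative only: the tier names `baseline` and `heavy` and the function `sizing_flags` are invented for this example, while `--cpus` and `--memory` are standard `docker run` options.

```bash
# sizing_flags TIER: prints docker run resource flags for a sizing tier.
# Tier names are hypothetical labels for the two sizes described above.
sizing_flags() {
  case "$1" in
    baseline) echo "--cpus=1 --memory=512m" ;;
    heavy)    echo "--cpus=2 --memory=1g" ;;
    *) echo "unknown tier: $1 (expected baseline or heavy)" >&2; return 1 ;;
  esac
}

# Example (volume name and image are placeholders):
#   docker run -d --name pslens $(sizing_flags baseline) -v pslens-data:/data <your-pslens-image>
```

A hard memory limit set exactly at the minimum leaves no headroom for brief report spikes, so it may be worth starting from the larger tier if the container is ever killed for exceeding its limit.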

---

## 4. Backup and Restore

### What to back up

|             Path              |                                    Contents                                    |                                                            Loss impact                                                            |
| ----------------------------- | ------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------- |
| `/data/nats`                  | Alert history, report output, encrypted credentials, sessions, recently-viewed | Lose alert/report history; users re-enter PS DB passwords                                                                         |
| `/data/projects`              | Uploaded PS project archives                                                   | Re-upload from source                                                                                                             |
| `/app/config.yaml`            | Configuration                                                                  | Re-create from your config-management tooling                                                                                     |
| `PSLENS_MASTER_KEY` (env var) | Encryption key for credentials in `/data/nats`                                 | **Without it, the encrypted credentials in NATS KV are unreadable.** Back up out of band (password manager, vault, secret store). |

### Suggested cadence

- **Nightly tarball** of `/data/nats` and `/data/projects` to your backup target.
- **14–30 day retention** for daily snapshots; longer if your audit policy requires.
- **Master key** stored separately (so a single-system compromise can't yield both).

Example backup script (Docker host):

```bash
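#!/usr/bin/env bash
set -euo pipefail
# Fail early with a clear message if the backup target is not set.
: "${BACKUP_DIR:?set BACKUP_DIR to the host directory that receives backups}"

# UTC timestamp keeps archive names sortable.
TS=$(date -u +%Y%m%dT%H%M%SZ)

# Mount the volumes of the "pslens" container read-only in a throwaway
# Alpine container and archive /data (members are stored as data/...).
docker run --rm \
  --volumes-from pslens:ro \
  -v "$BACKUP_DIR":/backup \
  alpine \
  tar czf "/backup/pslens-${TS}.tar.gz" -C / data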
```
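The retention window from the suggested cadence can be enforced with a small pruning step that runs after each nightly backup. This is a sketch, assuming archives follow the `pslens-<timestamp>.tar.gz` naming used above and sit directly in the backup directory; the function name `prune_backups` is made up for illustration.

```bash
#!/usr/bin/env bash
# prune_backups DIR DAYS: delete pslens-*.tar.gz archives directly inside DIR
# whose modification time is more than DAYS days in the past.
# Prints each path it removes so the cron log shows what was pruned.
prune_backups() {
  local dir="$1" days="$2"
  find "$dir" -maxdepth 1 -type f -name 'pslens-*.tar.gz' \
    -mtime +"$days" -print -delete
}

# Example, keeping 30 days of nightly archives:
#   prune_backups "$BACKUP_DIR" 30
```

Only files matching the archive pattern are touched, so other content in the backup directory is left alone.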

### Restore

1. Stop the container.
2. Restore the tarball into a fresh `/data` volume.
3. Ensure `PSLENS_MASTER_KEY` matches the key that encrypted the credentials.
4. Start the container.
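
On a Docker host, step 2 might look like the sketch below. It rests on assumptions: the archive was produced by running `tar` over `/data` (so members are stored as `data/...`), and the volume name `pslens-data-restored` and the function `restore_data` are hypothetical.

```bash
#!/usr/bin/env bash
# restore_data ARCHIVE VOLUME: extract a backup tarball into a fresh named
# Docker volume that will later be mounted at /data.
restore_data() {
  local archive="$1" volume="$2" dir file
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # Docker bind mounts need an absolute host path.
  dir=$(cd "$(dirname "$archive")" && pwd)
  file=$(basename "$archive")

  docker volume create "$volume" >/dev/null
  # The archive holds data/..., so extracting at / fills the volume at /data.
  docker run --rm \
    -v "$volume":/data \
    -v "$dir":/backup:ro \
    alpine \
    tar xzf "/backup/$file" -C /
}

# Example (run with the psLens container stopped):
#   restore_data /backups/pslens-20250101T000000Z.tar.gz pslens-data-restored
```

After the extraction, point the psLens container at the restored volume, confirm that `PSLENS_MASTER_KEY` is the key that was in use when the backup was taken, and start the container.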

---

## 5. Disaster Recovery

Today's posture:

- For **self-hosted**: DR is whatever your existing container DR posture is. psLens fits the same pattern as any small stateful Go service: restore the data volume, restart.
- For **managed (fly.io)**: redeployment is fast because all state fits in one volume. Cedar Hills Group operates the deployment; RTO/RPO targets for managed deployments are disclosed during contracting.
- No multi-region replication of psLens state today. The single-customer scope makes this rarely worth the complexity, but if your contract requires it, raise it on the demo call.

If you need a higher DR posture than this, the answer is usually "self-host and use your existing DR tooling for the volume." Cedar Hills Group can help structure that.

---

## 6. Monitoring

### What ships today

- **`GET /healthz`** returns 200 OK if the process is alive. Liveness check only; no readiness signal beyond startup completion.
- **Structured `slog` output to stderr** for every request and every error. Ship it to your SIEM via your container runtime's log driver.
- **Startup banner** in logs: version, commit SHA, build timestamp. Useful for confirming an upgrade landed.
- **Connection-status UI** inside psLens shows live SWS reachability for each configured database.
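
As a quick sanity check before the SIEM pipeline is wired up, error-level lines can be counted straight from the container log stream. The sketch assumes the log lines use one of the two layouts produced by Go's standard `log/slog` handlers (`"level":"ERROR"` in JSON, `level=ERROR` in text). The actual psLens log layout is not specified on this page, so adjust the pattern to what your instance emits. The function name `count_errors` is illustrative.

```bash
# count_errors: reads log lines on stdin and prints how many are error-level.
count_errors() {
  # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
  # keeps that from being treated as a failure under "set -e".
  grep -Ec '"level":"ERROR"|(^| )level=ERROR( |$)' || true
}

# Example: errors logged in the last five minutes by a container named pslens.
# docker logs replays the container's stderr on stderr, hence the 2>&1.
#   docker logs --since 5m pslens 2>&1 | count_errors
```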

### What doesn't ship today

- **No Prometheus `/metrics` endpoint.** Planned. No committed date.
- **No built-in alerting**, in the sense of "psLens noticing it has a problem and notifying you." That is your monitoring system's job, fed by `/healthz` and the log stream.

Recommended wiring for a customer environment:

|      Signal      |                                         How to monitor                                         |
| ---------------- | ---------------------------------------------------------------------------------------------- |
| Process alive    | `GET /healthz` from your uptime monitor every 30–60 s                                          |
| Errors           | Forward stderr to your log aggregation; alert on error-rate spikes                             |
| Auth failures    | Same: the request log includes status codes                                                    |
| SWS reachability | psLens already shows this in its connection-status UI; `/healthz` reports process liveness only, so to externalize it, build a small probe against your SWS endpoint |
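
For the "Process alive" row, a probe can be as small as the sketch below, run from cron or wrapped by your uptime monitor. It assumes `curl` is available; the function name `probe_healthz` and the URL in the usage comment are placeholders.

```bash
# probe_healthz BASE_URL: succeeds only if GET BASE_URL/healthz returns 200.
probe_healthz() {
  local base_url="${1%/}" code
  # -s silences progress, -o discards the body, -w prints only the status.
  # curl prints 000 when no HTTP response was received at all.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    "${base_url}/healthz") || true
  if [ "$code" = "200" ]; then
    echo "psLens alive (HTTP 200)"
  else
    echo "psLens health check failed (HTTP ${code:-none})" >&2
    return 1
  fi
}

# Example (placeholder URL):
#   probe_healthz https://pslens.example.internal
```

This confirms liveness only; it says nothing about whether psLens can reach PeopleSoft.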

---

## 7. Upgrades and Rollback

Pin production to a `vMAJOR.MINOR` tag. To upgrade:

```bash
docker compose pull
docker compose up -d
```

The data volume survives the restart. Schema migrations on the NATS KV layer (if any) run automatically on first start of the new version.

**To roll back**, pin to the prior exact `vMAJOR.MINOR.PATCH` tag and run `docker compose up -d`. Downgrade compatibility within a `vMAJOR.MINOR` is guaranteed; across major versions, check the release notes.
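
A rollback helper could guard against pinning to a floating tag by accident. This is a sketch under a loud assumption: the compose file takes its image tag from an environment variable, here called `PSLENS_TAG`, which is a hypothetical name and not something psLens defines. The function names are likewise illustrative.

```bash
#!/usr/bin/env bash
# is_patch_tag TAG: succeeds if TAG has the form vMAJOR.MINOR.PATCH.
is_patch_tag() {
  [[ "$1" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]
}

# rollback_to TAG: re-create the service on an exact earlier release.
# Assumes the compose file reads the image tag from ${PSLENS_TAG} (hypothetical).
rollback_to() {
  local tag="$1"
  if ! is_patch_tag "$tag"; then
    echo "expected a vMAJOR.MINOR.PATCH tag, got: $tag" >&2
    return 1
  fi
  PSLENS_TAG="$tag" docker compose up -d
}

# Example:
#   rollback_to v1.4.2
```

Rejecting anything that is not an exact patch tag keeps a rollback from silently resolving to the same release that caused the problem.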

**Breaking change contract:** breaking changes only land in major version bumps and are called out explicitly in release notes. Patch and minor releases preserve config compatibility.

Cross-link: [Deployment Options → Image Tags](/docs/getting-started/deployment-options/) for the full tag-flavor table.

---

## 8. Multi-Environment Support (DEV / TEST / PROD)

A single psLens instance can be configured to connect to multiple PeopleSoft databases (DEV, TEST, PROD, demo) via entries in `config.yaml`. The database selector appears on every search and report page.

This is usually the right deployment shape: one psLens instance per *psLens user community*, configured to see every PS environment that community needs. It is rarely useful to run multiple psLens instances unless those communities have non-overlapping access requirements that need to be enforced at the network layer.

---

## 9. DBA Concerns: What psLens Does to Your PS Server

For PeopleSoft admins who need to sign off on the load profile:

|          Property           |                                                         Behavior                                                          |
| --------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| Connection type             | HTTPS to SWS, basic auth, no JDBC, no direct DB connection                                                                |
| Read or write               | Read-only. No code path issues `INSERT`, `UPDATE`, or `DELETE`.                                                           |
| Table coverage              | Only whitelisted PeopleTools metadata tables (`PSRECDEFN`, `PSPNLDEFN`, `PSAUTHITEM`, process scheduler, IB tables, etc.) |
| Query shape                 | Mostly short metadata reads. Reports use the `QuerySlow` path, which is capped at 90 seconds; nothing runs longer.        |
| Concurrency                 | At most one concurrent request per active user; typically only a handful in flight at any moment                          |
| Service account permissions | Read on whitelisted tables; no write, no execute                                                                          |

The PS-side audit trail (SWS query log) captures every query psLens issues with timestamps and the SWS service-account OPRID. If psLens is misbehaving, that log shows what it actually did.

Cross-link: [Installation → Whitelisting Tables](/docs/getting-started/installation/) for the full list and SQL.

---

## Related

- [Installation](/docs/getting-started/installation/). First-time setup with copy-pasteable commands.
- [Deployment Options](/docs/getting-started/deployment-options/). TLS, image distribution, configuration.
- [Data Handling & Logging](/security/data-and-logging/). What's in `/data/nats` and how it's encrypted.
- [Compliance & Vendor](/security/compliance-and-vendor/). Sub-processors, residency, contract.
