PostgreSQL Alerts & Rules
Setup alert rules for your PostgreSQL deployment to monitor key metrics, disk space, and server role change for custom thresholds, triggers, and time parameters to improve your app performance.
ScaleGrid's alerts and rules for PostgreSQL hosting allow you to stay on top of your deployment activity. Every database type has a different set of alert rules that can be configured as per your convenience. You can get alerts sent directly to you via SMS, email, PagerDuty, Opsgenie or Slack.
PostgreSQL Global Alert Rules
These are the default alert rules automated by ScaleGrid for all PostgreSQL clusters:
- CPU - Total (%) greater than 60%
- Available disk space (%) less than 20%
- Elected a new master
- Replication Lag (secs) greater than 60 secs
- Deadlocks greater than 0
- Conflicts greater than 0
Override Global Rules
You can override a Global Alert Rule by creating a new Cluster-Level Alert Rule or an Account-Level Alert Rule. For example, our CPU usage rule is set to 60% by default, but if you create a new Cluster-Level Alert and set it to 80%, this will override the Global Alert Rules for this cluster.
Account Level Rules
These rules are enforced by you and apply to all PostgreSQL clusters in your ScaleGrid account based on database type. You can configure account level rules under:
Settings > Global Rules > Alert Rules
Create PostgreSQL Account-Level Alert Rules
These rules are enforced by you and apply to all clusters in your ScaleGrid account based on database type. Account-Level rules override Global Rules and are overridden by Cluster-Level rules.
Cluster Level Rules
These rules are enforced by you and apply to specific clusters in your ScaleGrid account. You can configure cluster level rules on the details page:
Details Page > Alerts > Rules > Create New Rule
Create PostgreSQL Cluster-Level Alert Rules
These rules are enforced by you and apply to all clusters in your ScaleGrid account based on database type. Cluster-Level rules override Global Rules and Account-Level rules.
PostgreSQL Alert Types
ScaleGrid supports three top-level alert types:
- Metrics
- Disk Free
- Server Role Change
If you create an alert rule based on a specific metric, you have the option set conditions:
- Threshold
- Trigger if threshold greater than or less than the current value
- Time for which the condition lasts for: 2 minutes, 6 minutes, or 1 hour
PostgreSQL & OS Metrics
Here is a list of PostgreSQL and OS metrics that you can use to create alerts:
CPU - User (%)
API Enum: CPU_USER
The percentage of time the CPU spent on user applications.
CPU - System (%)
API Enum: CPU_SYSTEM
The percentage of time the CPU spent in the operating system.
CPU - Nice (%)
API Enum: CPU_NICE
The percentage of time the CPU spent in nice mode.
CPU - IO Wait (%)
API Enum: CPU_IOWAIT
The percentage of time the CPU spent waiting for IO operations to complete.
CPU - Total (%)
API Enum: CPU_TOTAL
The total percentage of time the CPU spent on user applications, operating systems, nice mode, and IO wait.
Memory - Total (KB)
API Enum: MEM_TOTAL
Total available system memory in KB.
Memory - Used (KB)
API Enum: MEM_USED
KB of system memory used.
Memory - Free (KB)
API Enum: MEM_FREE
KB of system memory free.
Memory - Buffers (KB)
API Enum: MEM_BUFFERS
KB of system memory used for buffers
Memory - Cached (KB)
API Enum: MEM_CACHED
KB of system memory used for page cache
Disk - Read (KB/sec)
API Enum: DISK_READ
Total disk reads in KB/sec.
Disk - Write (KB/sec)
API Enum: DISK_WRITE
Total disk writes in KB/sec.
Operation - Select (per sec)
API Enum: POSTGRES_OPCOUNTERS_SELECT
Number of Select operations/sec.
Operation - Update (per sec)
API Enum: POSTGRES_OPCOUNTERS_UPDATE
Number of Update operations/sec.
Operation - Insert (per sec)
API Enum: POSTGRES_OPCOUNTERS_INSERT
Number of Insert operations/sec.
Operation - Commit (per sec)
API Enum: POSTGRES_OPCOUNTERS_COMMIT
Number of Commit operations/sec.
Operation - Delete (per sec)
API Enum: POSTGRES_OPCOUNTERS_DELETE
Number of Delete operations/sec.
Operation - Rollback (per sec)
API Enum: POSTGRES_OPCOUNTERS_ROLLBACK
Number of currently active connections to the server.
Active connections
API Enum: POSTGRES_CONNECTION_ACTIVE
Number of Flush operations/sec.
Idle Connections
API Enum: POSTGRES_CONNECTION_IDLE
Number of idle connections to the server.
Locks - Access Share
API Enum: POSTGRES_LOCK_ACCESS_SHARE
Total number of locks held in acess_share mode. The SELECT command acquires a lock of this mode on referenced tables. In general, any query that only reads a table and does not modify it will acquire this lock mode.
Locks - Rows Share
API Enum: POSTGRES_LOCK_ROW_SHARE
Total number of locks held in row_share mode. The SELECT FOR UPDATE and SELECT FOR SHARE commands acquire a lock of this mode on the target table(s).
Locks - Rows Exclusive
API Enum: POSTGRES_LOCK_ROW_EXCLUSIVE
Total number of locks held in row_share mode. The SELECT FOR UPDATE and SELECT FOR SHARE commands acquire a lock of this mode on the target table(s).
Locks - Share Update Exclusive
API Enum: POSTGRES_LOCK_SHARE_UPDATE_EXCLUSIVE
Total number of held in share_update_exclusive mode. This mode protects a table against concurrent schema changes and VACUUM runs.Acquired by VACUUM (without FULL), ANALYZE, CREATE INDEX CONCURRENTLY, and some ALTER TABLE variants.
Locks - Share
API Enum: POSTGRES_LOCK_SHARE
Total number of locks held in share mode. This mode protects a table against concurrent data changes. Acquired by CREATE INDEX (without CONCURRENTLY).
Locks - Row Exclusive
API Enum: POSTGRES_LOCK_SHARE_ROW_EXCLUSIVE
Total number of locks held in share_row_exclusive mode. This mode protects a table against concurrent data changes, and is self-exclusive so that only one session can hold it at a time. Acquired by CREATE TRIGGER and many forms of ALTER TABLE.
Locks - Exclusive
API Enum: POSTGRES_LOCK_EXCLUSIVE
Total number of locks held in exclusive mode. This mode allows only concurrent ACCESS SHARE locks, i.e., only reads from the table can proceed in parallel with a transaction holding this lock mode.
Locks - Access Exclusive
API Enum: POSTGRES_LOCK_ACCESS_EXCLUSIVE
Total number of locks held in access_exclusive mode. This mode guarantees that the holder is the only transaction accessing the table in any way. Acquired by the DROP TABLE, TRUNCATE, REINDEX, CLUSTER, VACUUM FULL, etc.
Locks - Total
API Enum: POSTGRES_LOCK_TOTAL
Total of all the Locks.
Scheduled Checkpoints
API Enum: POSTGRES_CHECKPOINT_SCHEDULED
Number of scheduled checkpoints that were performed in the interval.
Requested Checkpoints
API Enum: POSTGRES_CHECKPOINT_REQUESTED
Number of requested checkpoints that were performed in the interval.
Temporary Bytes Written to Disk
API Enum: POSTGRES_TEMP_BYTES
The amount of data written temporarily to disk to execute queries: PostgreSQL reserves a certain amount of memory—specified by work_mem (4 MB by default)—to perform sort operations and hash tables needed to execute queries. When a complex query requires access to more memory than work_mem allows, it has to write some data temporarily to disk in order to do its work. This impacts performance, and might indicate a need to increase the work_mem. However, it’s important not to set this too high, because it can encourage the query planner to choose more inefficient queries.
Replication Lag (sec)
API Enum: POSTGRES_REPLICATION_LAG
Delay between an operation on the master and the application of that operation on the standby.
Cache Hit Ratio
API Enum: POSTGRES_CACHE_HIT
Ratio of reads-from-buffer-cache to total-reads. This does not take in to account hits on the file system's cache.
Deadlocks
API Enum: POSTGRES_DEADLOCK_COUNT
Number of deadlocks currently in the server.
Conflicts
API Enum: POSTGRES_CONFLICT_COUNT
Number of operations that were aborted due to lock conflicts.
Write Ahead Log Size (b)
API Enum: POSTGRES_WAL_SIZE
The size of the transaction log.
Memory Used by Postgres process (KB)
API Enum: POSTGRES_MEM_USED
The amount of memory in KB used by the Postgres processes.
Network In (KB/s)
API Enum:
Averages KBytes/sec sent into database server.
Network Out (KB/s)
API Enum:
Averages KBytes/sec sent out of the database server.
Loag Avg - 1m
API Enum:
Average system load for the last 1 minute.
Load Avg - 5m
API Enum:
Average system load for the last 5 minutes.
Load Avg - 15m
API Enum:
Average system load for the last 15 minutes.
Updated about 2 years ago