Amazon DocumentDB Monitoring Integration

Amazon DocumentDB is a document database service compatible with MongoDB workloads for managing JSON data at scale. With Site24x7's integration, you can monitor the health and performance of your Amazon DocumentDB clusters and instances.

Setup
Permissions
Poll frequency
Licensing
Supported metrics
Threshold configuration
Site24x7's DocumentDB monitoring interface

Setup

Please provide Site24x7 access to your AWS account by either creating an IAM User or IAM Role. Learn more.
On the Integrate AWS Account page, please make sure the DocumentDB checkbox is selected in the Services to be Discovered field. Learn more.

Permissions

Please make sure the following read-level actions are present in the IAM policy assigned to the IAM User or IAM Role created for Site24x7. Learn more.

"rds:DescribeDBClusters",
"rds:DescribeDBInstances",
"rds:ListTagsForResource",
"rds:DescribeCertificates",
"rds:DescribeEvents",
"rds:DescribeGlobalClusters",
"logs:DescribeLogStreams",
"logs:GetLogEvents",
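
As a minimal sketch, the read-level actions above could be assembled into an IAM policy document as shown below. The `Sid` value and the wildcard `Resource` are assumptions made for illustration; scope the resource to match your own security requirements.

```python
import json

# Read-level actions listed above for DocumentDB monitoring.
DOCUMENTDB_READ_ACTIONS = [
    "rds:DescribeDBClusters",
    "rds:DescribeDBInstances",
    "rds:ListTagsForResource",
    "rds:DescribeCertificates",
    "rds:DescribeEvents",
    "rds:DescribeGlobalClusters",
    "logs:DescribeLogStreams",
    "logs:GetLogEvents",
]


def build_policy(actions):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Site24x7DocumentDBReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": sorted(set(actions)),  # de-duplicate and sort for stable output
                "Resource": "*",  # assumption: narrow this where possible
            }
        ],
    }


policy_json = json.dumps(build_policy(DOCUMENTDB_READ_ACTIONS), indent=2)
print(policy_json)
```

The printed JSON can be pasted into the policy editor for the IAM User or IAM Role created for Site24x7.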

Poll Frequency

Aggregated DocumentDB metric data is collected as per the poll frequency set (1 minute to a day). Learn more.

Licensing

Each DocumentDB monitor is considered a basic monitor.

Supported Metrics

DocumentDB Cluster and Instance Metrics

Attribute	Description	Statistics	Unit
Backup Retention Period Storage Used	The total amount of backup storage in GiB used to support the point-in-time restore feature within the Amazon DocumentDB retention window.	Maximum	GB, Bytes
Change Stream Log Size	The amount of storage used by your cluster to store the change stream log in megabytes.	Average	MB
CPU Utilization	The percentage of CPU used by a cluster.	Maximum	Percent
Database Connections	The number of connections open on a cluster, taken at a one-minute frequency.	Average, Sum, Maximum	Count
Database Connections Max	The maximum number of open database connections on a cluster in a one-minute period.	Average, Sum, Maximum	Count
Database Cursors	The number of cursors open on a cluster, taken at a one-minute frequency.	Average, Sum, Maximum	Count
Database Cursors Max	The maximum number of open cursors on a cluster in a one-minute period.	Average, Sum, Maximum	Count
Database Cursors Timed Out	The number of cursors that timed out in a one-minute period.	Sum	Count
Freeable Memory	The amount of available random access memory.	Average	Bytes
Free Local Storage	This metric reports the amount of storage available to each instance for temporary tables and logs.	Average	MB
Low Memory Throttle Queue Depth	The queue depth for requests that are throttled due to low available memory	Sum	Count
Low Memory Throttle Max Queue Depth	The maximum queue depth for requests that are throttled due to low available memory	Sum	Count
Low Memory Number Operations Throttled	The number of requests that are throttled due to low available memory	Sum	Count
Snapshot Storage Used	The total amount of backup storage in GiB consumed by all snapshots for a given Amazon DocumentDB cluster outside its backup retention window	Average	GB, Bytes
Total Backup Storage Billed	The total amount of backup storage in GiB for which you are billed for a given Amazon DocumentDB cluster	Maximum	GB, Bytes
Transactions Open	The number of transactions open on an instance	Average, Sum, Maximum	Count
Transactions Open Max	The maximum number of transactions open on an instance	Average, Sum, Maximum	Count
Volume Bytes Used	The amount of storage used by your cluster in bytes	Average	MB
DB Cluster Replica Lag Maximum	The maximum amount of lag, in milliseconds, between the primary instance and each Amazon DocumentDB instance in the cluster	Maximum	ms
DB Cluster Replica Lag Minimum	The minimum amount of lag, in milliseconds, between the primary instance and each replica instance in the cluster.	Minimum	ms
DB Instance Replica Lag	The amount of lag, in milliseconds, when replicating updates from the primary instance to a replica instance.	Average	ms
Read Latency	The average amount of time taken per disk I/O operation.	Average	ms
Write Latency	The average amount of time, in milliseconds, taken per disk I/O operation.	Average	ms
Low Memory Number Operations Timed Out	Number of operations timed out due to low available memory	Sum	Count
Documents Deleted	The number of deleted documents	Sum	Count
Documents Inserted	The number of inserted documents	Sum	Count
Documents Returned	The number of returned documents	Sum	Count
Documents Updated	The number of updated documents	Sum	Count
Opcounters Command	The number of commands	Sum	Count
Opcounters Delete	The number of delete operations	Sum	Count
Opcounters Getmore	The number of getmores	Sum	Count
Opcounters Insert	The number of insert operations	Sum	Count
Opcounters Query	The number of queries issued	Sum	Count
Opcounters Update	The number of update operations issued	Sum	Count
Transactions Started	The number of transactions started	Sum	Count
Transactions Committed	The number of transactions committed	Sum	Count
Transactions Aborted	The number of transactions aborted	Sum	Count
TTL Deleted Documents	The number of documents deleted by the TTL (time to live) monitor	Sum	Count
Network Receive Throughput	The amount of network throughput, in bytes per second, received from clients by each instance in the cluster	Average	mb/sec
Network Throughput	The amount of network throughput, in bytes per second, both received from and transmitted to clients by each instance in the Amazon DocumentDB cluster.	Average	mb/sec
Network Transmit Throughput	The amount of network throughput, in bytes per second, sent to clients by each instance in the cluster.	Average	mb/sec
Read IOPS	The average number of disk read I/O operations per second.	Average	Count
Write IOPS	The average number of disk write I/O operations per second.	Average	Count
Read Throughput	The average number of bytes read from disk per second.	Average	Bytes/sec
Write Throughput	The average number of bytes written to disk per second.	Average	Bytes/sec
Volume Read IOPs	The average number of billed read I/O operations from a cluster volume	Average	Count
Volume Write IOPs	The average number of billed write I/O operations from a cluster volume	Average	Count
Buffer Cache Hit Ratio	The percentage of requests that are served by the buffer cache.	Average	Percent
Disk Queue Depth	The number of concurrent write requests to the distributed storage volume.	Sum	Count
Engine Uptime	The amount of time, in seconds, that the instance has been running.	Average	Seconds
Index Buffer Cache Hit Ratio	The percentage of index requests that are served by the buffer cache.	Average	Percent
CPU Credit Usage	The number of CPU credits spent during the measurement period.	Average	Count
CPU Credit Balance	The number of CPU credits that an instance has accrued.	Average	Count
CPU Surplus Credit Balance	The number of surplus CPU credits spent to sustain CPU performance when the CPUCreditBalance value is zero.	Average	Count
CPU Surplus Credits Charged	The number of surplus CPU credits exceeding the maximum number of CPU credits that can be earned in a 24-hour period, and thus attracting an additional charge.	Average	Count
Swap Usage	The amount of swap space used on the instance.	Average	Bytes
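
To cross-check a value from this table outside Site24x7, the corresponding Amazon CloudWatch metric in the AWS/DocDB namespace could be queried directly. The sketch below is not how Site24x7 collects data. The cluster identifier is hypothetical, and the mapping of the CPU Utilization attribute to the CloudWatch metric name `CPUUtilization` is an assumption. Calling `fetch` requires boto3, AWS credentials, and a configured region.

```python
from datetime import datetime, timedelta, timezone


def build_request(metric_name, cluster_id, statistic="Maximum",
                  period_seconds=300, lookback_minutes=60, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DocDB",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "StartTime": end - timedelta(minutes=lookback_minutes),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": [statistic],
    }


def fetch(request):
    """Send the request with boto3 and return data points ordered by time."""
    import boto3  # imported lazily so build_request works without boto3

    response = boto3.client("cloudwatch").get_metric_statistics(**request)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


request = build_request("CPUUtilization", "my-docdb-cluster")  # hypothetical cluster ID
print(request["Namespace"], request["MetricName"], request["Statistics"])
```

Keeping the request builder separate from the network call makes the parameters easy to inspect before anything is sent to AWS.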

DocumentDB Global Cluster Metrics

Attribute	Description	Statistics	Unit
Global Cluster Replicated Write IO	The average number of billed write I/O operations replicated from the cluster volume in the primary AWS Region to the cluster volume in a secondary AWS Region	Average	Count
GlobalClusterDataTransferBytes	The amount of data transferred from the primary cluster’s AWS Region to a secondary cluster’s AWS Region	Average	MB
GlobalClusterReplicationLag	The amount of lag, in milliseconds, when replicating change events from the primary cluster’s AWS Region to a secondary cluster’s AWS Region	Average	ms

To View Data

Sign in to the Site24x7 console. Click on AWS. Choose the monitored AWS account.
Choose DocumentDB from the menu dropdown.
From the list of monitored resources, choose the DocumentDB resource for which you want to view metrics.

Threshold Configuration

Set thresholds for the various performance metrics related to DocumentDB and get alerts when they exceed the configured values.

Go to Admin > Configuration Profiles > Threshold and Availability > (+). You can also navigate via Cloud > AWS > click on the AWS account > DocumentDB Cluster/DocumentDB Instance/DocumentDB Global Clusters > hover on the hamburger icon beside the display name > Edit > Threshold and Availability > click on the pencil icon.
In the Add Threshold and Availability form, select DocumentDB Cluster, DocumentDB Global Clusters, or DocumentDB Instance.
Set threshold values for the required metrics.
Save your changes.
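
Conceptually, a threshold check compares each collected value with its configured limit and raises an alert when the limit is exceeded. The sketch below illustrates that idea only. It is not Site24x7's implementation, and the limits and sample values are hypothetical.

```python
def find_breaches(samples, thresholds):
    """Return {metric: (value, limit)} for every metric whose value exceeds its limit."""
    breaches = {}
    for metric, limit in thresholds.items():
        value = samples.get(metric)
        # Skip metrics with no collected value; flag only a strict exceedance.
        if value is not None and value > limit:
            breaches[metric] = (value, limit)
    return breaches


# Hypothetical thresholds and one poll's worth of collected values.
thresholds = {
    "CPU Utilization": 80.0,
    "Database Connections": 500,
    "DB Instance Replica Lag": 100,
}
samples = {"CPU Utilization": 91.5, "Database Connections": 120}

for metric, (value, limit) in find_breaches(samples, thresholds).items():
    print(f"{metric}: {value} exceeds {limit}")
```

A metric with no sample for the poll, such as DB Instance Replica Lag here, is skipped rather than treated as a breach.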

Site24x7's DocumentDB Monitoring Interface

Summary

This section provides operational details such as CPU utilization, database connections, database connections max, database cursors, database cursors max, freeable memory, buffer cache hit ratio, number of operations timed out due to low memory, snapshot and backup storage, and other metrics.

Configuration Details

Get details including the cluster ID, status, availability zone, region, backup retention period, engine name and its version, master username, port, subnet group details, and other configuration details.

Monitored Resources

Resource availability statuses are provided here, with information on the associated DocumentDB clusters and instances, resource name, type, display name, status, and action. The Action column allows you to set alerts and add automations for when a monitored resource is marked as Down, Critical, or Trouble.

Audit Logs and Profiler Logs

View audit events and profiler events to monitor the execution time and details of operations performed on your cluster. These logs help you identify slow operations on the cluster and improve both individual query performance and overall cluster performance.
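
To illustrate how profiler events can point to slow operations, here is a minimal sketch that filters JSON log lines by execution time. The field names `op`, `ns`, and `millis`, the 100 ms cutoff, and the sample events are assumptions made for illustration; check them against the events shown in your own logs.

```python
import json


def slow_operations(log_lines, min_millis=100):
    """Return (millis, op, namespace) tuples for slow events, slowest first."""
    slow = []
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if not isinstance(event, dict):
            continue  # ignore JSON values that are not objects
        millis = event.get("millis")
        if isinstance(millis, (int, float)) and millis >= min_millis:
            slow.append((millis, event.get("op", "unknown"), event.get("ns", "unknown")))
    return sorted(slow, key=lambda item: item[0], reverse=True)


# Hypothetical profiler events, one JSON object per line.
sample_lines = [
    '{"op": "query", "ns": "shop.orders", "millis": 250}',
    '{"op": "update", "ns": "shop.carts", "millis": 40}',
    '{"op": "command", "ns": "shop.orders", "millis": 900}',
    "not a JSON line",
]

for millis, op, ns in slow_operations(sample_lines):
    print(f"{millis} ms  {op}  {ns}")
```

Sorting slowest first puts the operations most worth investigating at the top of the output.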

Cluster Events

View events related to your clusters, instances, snapshots, security groups, and cluster parameter groups. Get details including the date and time of the event, source name and source type of the event, and a message that is associated with the event. This tab is available only for DocumentDB Cluster and DocumentDB Instance monitors.

Outages

A history of your resources’ various states, like down, trouble, critical, or maintenance, is displayed in the Outages tab. Details on the start time and end time of an outage, duration, and comments (if any) are provided in this section. You can also edit or delete comments.

Log Report

Here you can view the audit log data for DocumentDB clusters and DocumentDB instances, along with details on the timestamp, status, CPU utilization, database connections sum, and database cursors sum.