Amazon DocumentDB Monitoring Integration

Amazon DocumentDB is a document database service, compatible with MongoDB workloads, for managing JSON data at scale. With Site24x7's integration, you can monitor the health and performance of your Amazon DocumentDB clusters and instances.

Setup

  • Please provide Site24x7 access to your AWS account by either creating an IAM User or IAM Role. Learn more.
  • On the Integrate AWS Account page, please make sure the DocumentDB checkbox is selected in the Services to be Discovered field. Learn more.

Permissions

Please make sure the following read-level actions are present in the IAM policy assigned to the IAM User or IAM Role created for Site24x7. Learn more.

  • "rds:DescribeDBClusters",
  • "rds:DescribeDBInstances",
  • "rds:ListTagsForResource",
  • "rds:DescribeCertificates",
  • "rds:DescribeEvents",
  • "rds:DescribeGlobalClusters",
  • "logs:DescribeLogStreams",
  • "logs:GetLogEvents",
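
As an illustration, the actions listed above could be assembled into an IAM policy document along the following lines. This is a minimal sketch, not Site24x7's official policy: the statement ID is hypothetical, and the wildcard `Resource` is an assumption that you may want to narrow for your account.

```python
import json

# Read-level actions taken from the list above.
DOCUMENTDB_READ_ACTIONS = [
    "rds:DescribeDBClusters",
    "rds:DescribeDBInstances",
    "rds:ListTagsForResource",
    "rds:DescribeCertificates",
    "rds:DescribeEvents",
    "rds:DescribeGlobalClusters",
    "logs:DescribeLogStreams",
    "logs:GetLogEvents",
]


def build_policy(actions):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Site24x7DocumentDBReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": sorted(set(actions)),  # de-duplicate and order
                "Resource": "*",  # assumption: not scoped to specific ARNs
            }
        ],
    }


policy_json = json.dumps(build_policy(DOCUMENTDB_READ_ACTIONS), indent=2)
print(policy_json)
```

The printed JSON can be reviewed and merged into the policy attached to the IAM User or IAM Role created for Site24x7.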

Poll Frequency

Aggregated DocumentDB metric data is collected according to the configured poll frequency (from one minute to one day). Learn more.

Licensing

  • Each DocumentDB monitor is considered a basic monitor. 

Supported Metrics

DocumentDB Cluster and Instance Metrics

Attribute | Description | Statistics | Unit
Backup Retention Period Storage Used | The total amount of backup storage, in GiB, used to support the point-in-time restore feature within the Amazon DocumentDB retention window. | Maximum | GB, Bytes
Change Stream Log Size | The amount of storage, in megabytes, used by your cluster to store the change stream log. | Average | MB
CPU Utilization | The percentage of CPU used by a cluster. | Maximum | Percent
Database Connections | The number of connections open on a cluster, taken at a one-minute frequency. | Average, Sum, Maximum | Count
Database Connections Max | The maximum number of open database connections on a cluster in a one-minute period. | Average, Sum, Maximum | Count
Database Cursors | The number of cursors open on a cluster, taken at a one-minute frequency. | Average, Sum, Maximum | Count
Database Cursors Max | The maximum number of open cursors on a cluster in a one-minute period. | Average, Sum, Maximum | Count
Database Cursors Timed Out | The number of cursors that timed out in a one-minute period. | Sum | Count
Freeable Memory | The amount of available random access memory. | Average | Bytes
Free Local Storage | The amount of storage available to each instance for temporary tables and logs. | Average | MB
Low Memory Throttle Queue Depth | The queue depth for requests that are throttled due to low available memory. | Sum | Count
Low Memory Throttle Max Queue Depth | The maximum queue depth for requests that are throttled due to low available memory. | Sum | Count
Low Memory Number Operations Throttled | The number of requests that are throttled due to low available memory. | Sum | Count
Snapshot Storage Used | The total amount of backup storage, in GiB, consumed by all snapshots for a given Amazon DocumentDB cluster outside its backup retention window. | Average | GB, Bytes
Total Backup Storage Billed | The total amount of backup storage, in GiB, for which you are billed for a given Amazon DocumentDB cluster. | Maximum | GB, Bytes
Transactions Open | The number of transactions open on an instance. | Average, Sum, Maximum | Count
Transactions Open Max | The maximum number of transactions open on an instance. | Average, Sum, Maximum | Count
Volume Bytes Used | The amount of storage used by your cluster in bytes. | Average | MB
DB Cluster Replica Lag Maximum | The maximum amount of lag, in milliseconds, between the primary instance and each Amazon DocumentDB instance in the cluster. | Maximum | ms
DB Cluster Replica Lag Minimum | The minimum amount of lag, in milliseconds, between the primary instance and each replica instance in the cluster. | Minimum | ms
DB Instance Replica Lag | The amount of lag, in milliseconds, when replicating updates from the primary instance to a replica instance. | Average | ms
Read Latency | The average amount of time, in milliseconds, taken per disk I/O operation. | Average | ms
Write Latency | The average amount of time, in milliseconds, taken per disk I/O operation. | Average | ms
Low Memory Number Operations Timed Out | The number of operations that timed out due to low available memory. | Sum | Count
Documents Deleted | The number of deleted documents. | Sum | Count
Documents Inserted | The number of inserted documents. | Sum | Count
Documents Returned | The number of returned documents. | Sum | Count
Documents Updated | The number of updated documents. | Sum | Count
Opcounters Command | The number of commands issued. | Sum | Count
Opcounters Delete | The number of delete operations issued. | Sum | Count
Opcounters Getmore | The number of getmore operations issued. | Sum | Count
Opcounters Insert | The number of insert operations issued. | Sum | Count
Opcounters Query | The number of queries issued. | Sum | Count
Opcounters Update | The number of update operations issued. | Sum | Count
Transactions Started | The number of transactions started. | Sum | Count
Transactions Committed | The number of transactions committed. | Sum | Count
Transactions Aborted | The number of transactions aborted. | Sum | Count
TTL Deleted Documents | The number of documents deleted through TTL (time-to-live) expiry. | Sum | Count
Network Receive Throughput | The amount of network throughput, in bytes per second, received from clients by each instance in the cluster. | Average | mb/sec
Network Throughput | The amount of network throughput, in bytes per second, both received from and transmitted to clients by each instance in the Amazon DocumentDB cluster. | Average | mb/sec
Network Transmit Throughput | The amount of network throughput, in bytes per second, sent to clients by each instance in the cluster. | Average | mb/sec
Read IOPS | The average number of disk read I/O operations per second. | Average | Count
Write IOPS | The average number of disk write I/O operations per second. | Average | Count
Read Throughput | The average number of bytes read from disk per second. | Average | Bytes/sec
Write Throughput | The average number of bytes written to disk per second. | Average | Bytes/sec
Volume Read IOPs | The average number of billed read I/O operations from a cluster volume. | Average | Count
Volume Write IOPs | The average number of billed write I/O operations from a cluster volume. | Average | Count
Buffer Cache Hit Ratio | The percentage of requests that are served by the buffer cache. | Average | Percent
Disk Queue Depth | The number of concurrent write requests to the distributed storage volume. | Sum | Count
Engine Uptime | The amount of time, in seconds, that the instance has been running. | Average | Seconds
Index Buffer Cache Hit Ratio | The percentage of index requests that are served by the buffer cache. | Average | Percent
CPU Credit Usage | The number of CPU credits spent during the measurement period. | Average | Count
CPU Credit Balance | The number of CPU credits that an instance has accrued. | Average | Count
CPU Surplus Credit Balance | The number of surplus CPU credits spent to sustain CPU performance when the CPUCreditBalance value is zero. | Average | Count
CPU Surplus Credits Charged | The number of surplus CPU credits exceeding the maximum number of CPU credits that can be earned in a 24-hour period, and thus attracting an additional charge. | Average | Count
Swap Usage | The amount of swap space used on the instance. | Average | Bytes
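
To make the Statistics column concrete, the sketch below shows how a set of one-minute samples might be reduced to Average, Sum, and Maximum values for a single poll interval. The sample values and the `aggregate` helper are hypothetical and only illustrate the arithmetic; they are not Site24x7's collection logic.

```python
from statistics import mean

# Hypothetical one-minute samples of open database connections.
samples = [12, 15, 11, 18, 14]

# Map each statistic name used in the table to a reducing function.
AGGREGATORS = {
    "Average": mean,
    "Sum": sum,
    "Maximum": max,
    "Minimum": min,
}


def aggregate(values, statistics):
    """Return {statistic: value} for each requested statistic."""
    return {name: AGGREGATORS[name](values) for name in statistics}


result = aggregate(samples, ["Average", "Sum", "Maximum"])
print(result)
```

A metric listed with several statistics, such as Database Connections, would yield one value per statistic for each poll.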

DocumentDB Global Cluster Metrics

Attribute | Description | Statistics | Unit
Global Cluster Replicated Write IO | The average number of billed write I/O operations replicated from the cluster volume in the primary AWS Region to the cluster volume in a secondary AWS Region. | Average | Count
Global Cluster Data Transfer Bytes | The amount of data transferred from the primary cluster’s AWS Region to a secondary cluster’s AWS Region. | Average | MB
Global Cluster Replication Lag | The amount of lag, in milliseconds, when replicating change events from the primary cluster’s AWS Region to a secondary cluster’s AWS Region. | Average | ms

To View Data

  • Sign in to the Site24x7 console. Click on AWS. Choose the monitored AWS account.
  • Choose DocumentDB from the menu dropdown.
  • From the list of monitored resources, choose the DocumentDB resource whose metrics you want to view.

Threshold Configuration

Set thresholds for the various performance metrics related to DocumentDB and get alerts when they exceed the configured values.

  1. Go to Admin > Configuration Profiles > Threshold and Availability > (+). Alternatively, navigate to Cloud > AWS > click the AWS account > DocumentDB Cluster/DocumentDB Instance/DocumentDB Global Clusters > hover over the hamburger icon beside the display name > Edit > Threshold and Availability > click the pencil icon.
  2. In the Add Threshold and Availability form, select DocumentDB Cluster, DocumentDB Global Clusters, or DocumentDB Instance.
  3. Set threshold values for the required metrics.
  4. Save your changes.
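
Conceptually, a threshold check compares the latest polled value of each metric against its configured limit. The sketch below illustrates that comparison; the metric values, limits, and the `find_breaches` helper are hypothetical and do not represent Site24x7's alerting engine.

```python
def find_breaches(metrics, thresholds):
    """Return the metrics whose current value exceeds its configured threshold."""
    breaches = {}
    for name, limit in thresholds.items():
        value = metrics.get(name)
        # Skip metrics with no data in this poll.
        if value is not None and value > limit:
            breaches[name] = (value, limit)
    return breaches


# Hypothetical threshold profile and latest poll values.
thresholds = {"CPU Utilization": 80.0, "Database Connections": 500}
latest = {"CPU Utilization": 91.5, "Database Connections": 120}

breaches = find_breaches(latest, thresholds)
for name, (value, limit) in breaches.items():
    print(f"{name}: {value} exceeds threshold {limit}")
```

In this example only CPU Utilization would raise an alert, because Database Connections stays below its limit.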

Site24x7's DocumentDB Monitoring Interface

Summary

This section provides operational details such as CPU utilization, database connections, database connections max, database cursors, database cursors max, freeable memory, buffer cache hit ratio, number of operations timed out due to low memory, snapshot and backup storage, and many more metrics.

Configuration Details

Get details including the cluster ID, status, availability zone, region, backup retention period, engine name and its version, master username, port, subnet group details, and other configuration details. 

Monitored Resources

Various resource availability statuses are provided here, with information on the associated DocumentDB clusters and instances, resource name, type, display name, status, and action. The Action column allows you to set alerts and add automations for when a monitored resource is marked as Down, Critical, or Trouble.

Audit Logs and Profiler Logs

View audit events and profiler events to monitor the execution time and details of operations performed on your cluster. These logs help you identify slow operations on the cluster and improve both individual query performance and overall cluster performance.

Cluster Events

View events related to your clusters, instances, snapshots, security groups, and cluster parameter groups. Get details including the date and time of the event, source name and source type of the event, and a message that is associated with the event. This tab is available only for DocumentDB Cluster and DocumentDB Instance monitors.

Outages

A history of your resources’ various states, like down, trouble, critical, or maintenance, is displayed in the Outages tab. Details on the start time and end time of an outage, duration, and comments (if any) are provided in this section. You can also edit or delete comments.

Log Report

Here you can view the audit log data for DocumentDB clusters and DocumentDB instances, along with details on the timestamp, status, CPU utilization, database connections sum, and database cursors sum.
