Amazon Neptune Cluster Monitoring Integration
Amazon Neptune is a fully managed graph database service used to build and run applications that work with highly connected datasets. An Amazon Neptune cluster contains one or more Neptune DB instances.
Setup and configuration
1. If you haven't already, enable access to your AWS resources between your AWS account and Site24x7's AWS account by either:
- Creating Site24x7 as an IAM user
- Creating a cross-account IAM role
2. On the Integrate AWS Account page, check the box next to Amazon Neptune Cluster.
Policies and permissions
Site24x7 requires the following permissions to discover Amazon Neptune clusters and collect their configuration information.
- rds:DescribeDBInstances
- rds:ListTagsForResource
- rds:DescribeDBClusters
- rds:DescribeEvents
- logs:DescribeLogStreams
- logs:GetLogEvents
- rds:DescribeDBClusterParameterGroups
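As an illustration, the permissions listed above could be assembled into an IAM policy document like this. This is a minimal sketch, not a policy published by Site24x7: the `build_policy` helper is hypothetical, and `"Resource": "*"` is an assumption that you may want to narrow to specific ARNs.

```python
import json

# Actions copied from the permission list above.
NEPTUNE_ACTIONS = [
    "rds:DescribeDBInstances",
    "rds:ListTagsForResource",
    "rds:DescribeDBClusters",
    "rds:DescribeEvents",
    "logs:DescribeLogStreams",
    "logs:GetLogEvents",
    "rds:DescribeDBClusterParameterGroups",
]


def build_policy(actions):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(actions),
                # Assumption: all resources. Restrict to specific ARNs if required.
                "Resource": "*",
            }
        ],
    }


policy = build_policy(NEPTUNE_ACTIONS)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor when creating the IAM user or cross-account role.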
Polling Frequency
Site24x7 queries AWS to collect Neptune Cluster performance metrics according to the configured polling frequency. The minimum poll interval supported is one minute, and the maximum is 24 hours.
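The supported range can be expressed as a simple bounds check. The sketch below is illustrative only: the function name is hypothetical, and it assumes any value inside the range is acceptable, whereas the product may offer only a fixed set of intervals.

```python
MIN_POLL_SECONDS = 60            # one minute
MAX_POLL_SECONDS = 24 * 60 * 60  # 24 hours


def is_supported_poll_interval(seconds):
    """Return True if the interval lies within the documented range."""
    return MIN_POLL_SECONDS <= seconds <= MAX_POLL_SECONDS


for candidate in (30, 60, 3600, 86400, 90000):
    print(candidate, is_supported_poll_interval(candidate))
```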
Supported metrics
Attribute | Description | Statistic | Unit |
---|---|---|---|
Cluster ReplicaLag Maximum | The maximum amount of lag between the primary instance and each Neptune DB instance in the DB cluster | Maximum | Milliseconds |
Cluster ReplicaLag Minimum | The minimum amount of lag between the primary instance and each Neptune DB instance in the DB cluster | Minimum | Milliseconds |
Engine Up Time | The amount of time that the instance has been running | Maximum | Seconds |
Freeable Memory | The amount of random access memory available | Minimum | MB |
Free Local Storage | The amount of storage available for temporary tables and logs | Minimum | MB |
Gremlin Errors | The number of errors in Gremlin traversals | Sum | Count |
Gremlin Requests | The number of requests to the Gremlin engine | Sum | Count |
Gremlin Requests Per Sec | The number of requests to the Gremlin engine per second | Sum | Count/sec |
Gremlin WebSocket Available Connections | The number of potential WebSocket connections currently available | Sum | Count |
Gremlin WebSocket Client Errors | The number of WebSocket client errors on the Gremlin endpoint per second | Sum | Count/sec |
Gremlin WebSocket Server Errors | The number of WebSocket server errors on the Gremlin endpoint per second | Sum | Count/sec |
Gremlin WebSocket Success | The number of successful WebSocket connections to the Gremlin endpoint per second | Sum | Count/sec |
Loader Errors | The number of errors from Loader requests | Sum | Count |
Loader Requests | The number of Loader Requests | Sum | Count |
Network Receive Throughput | The incoming network traffic on the DB instance, including both customer database traffic and Neptune traffic used for monitoring and replication | Average | MB/sec |
Network Throughput | The amount of network throughput both received from and transmitted to clients by each instance in the Neptune DB cluster | Average | MB/sec |
Network Transmit Throughput | The outgoing network traffic on the DB instance, including both customer database traffic and Neptune traffic used for monitoring and replication | Average | MB/sec |
SPARQL Errors | The number of errors in SPARQL queries | Sum | Count |
SPARQL Requests | The number of requests to the SPARQL engine | Sum | Count |
SPARQL Requests Per Sec | The number of requests to the SPARQL engine per second | Sum | Count/sec |
Status Errors | The number of errors from the status endpoint | Sum | Count |
Status Requests | The number of requests to the status endpoint | Sum | Count |
Http1xx | The number of HTTP 1xx responses for the endpoint per second | Sum | Count/sec |
Http2xx | The number of HTTP 2xx responses for the endpoint per second | Sum | Count/sec |
Http4xx | The number of HTTP 4xx responses for the endpoint per second | Sum | Count/sec |
Http5xx | The number of HTTP 5xx responses for the endpoint per second | Sum | Count/sec |
Gremlin Http1xx | The number of HTTP 1xx responses for the Gremlin endpoint per second | Sum | Count/sec |
Gremlin Http2xx | The number of HTTP 2xx responses for the Gremlin endpoint per second | Sum | Count/sec |
Gremlin Http4xx | The number of HTTP 4xx responses for the Gremlin endpoint per second | Sum | Count/sec |
Gremlin Http5xx | The number of HTTP 5xx responses for the Gremlin endpoint per second | Sum | Count/sec |
Sparql Http1xx | The number of HTTP 1xx responses for the SPARQL endpoint per second | Sum | Count/sec |
Sparql Http2xx | The number of HTTP 2xx responses for the SPARQL endpoint per second | Sum | Count/sec |
Sparql Http4xx | The number of HTTP 4xx responses for the SPARQL endpoint per second | Sum | Count/sec |
Sparql Http5xx | The number of HTTP 5xx responses for the SPARQL endpoint per second | Sum | Count/sec |
Backup Retention Period Storage Used | The amount of billed backup storage used to support the point-in-time restore feature within the backup retention window. | Maximum | MB |
Cluster Replica Lag | For a read replica, the amount of lag when replicating updates from the primary instance. | Average | Milliseconds |
Total Backup Storage Billed | The total amount of billed backup storage. | Maximum | MB |
Volume Read IOPs | The average number of billed read I/O operations from a cluster volume. | Sum | Count |
Volume Write IOPs | The average number of write disk I/O operations to the cluster volume. | Sum | Count |
Volume Bytes Used | The amount of storage used by your Neptune DB cluster. | Sum | MB |
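To cross-check a value from the table against AWS directly, you could build a CloudWatch `GetMetricStatistics` request whose `Statistics` entry matches the Statistic column. The sketch below assumes the metrics are published in the `AWS/Neptune` CloudWatch namespace with a `DBClusterIdentifier` dimension; the helper name, the cluster name `my-neptune-cluster`, and the default window and period are hypothetical. It only builds the request parameters, so it runs without AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def neptune_metric_request(cluster_id, metric_name, statistic,
                           minutes=60, period=300, now=None):
    """Build keyword arguments for a CloudWatch GetMetricStatistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Neptune",  # assumption: Neptune's CloudWatch namespace
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,            # seconds per datapoint
        "Statistics": [statistic],   # e.g. "Sum", "Average", "Minimum", "Maximum"
    }


params = neptune_metric_request("my-neptune-cluster", "GremlinRequestsPerSec", "Sum")
print(params["MetricName"], params["Statistics"])

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   datapoints = cloudwatch.get_metric_statistics(**params)["Datapoints"]
```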
Forecast
Estimate future values of the following performance metrics and make informed decisions about adding capacity or scaling your AWS infrastructure.
- CPU Utilization
- Gremlin Errors
- Gremlin Requests
- SPARQL Errors
- SPARQL Requests
- Volume Bytes Used
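To give a feel for what estimating a future value means, the sketch below fits a straight line to equally spaced samples and extrapolates it. This is not Site24x7's forecasting model, which is not described here; it is a plain least-squares trend, and the function name and sample values are made up for illustration.

```python
def linear_forecast(values, steps_ahead):
    """Extrapolate a least-squares line fitted to equally spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2          # mean of the sample indices 0..n-1
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Predict the value steps_ahead intervals after the last sample.
    return intercept + slope * (n - 1 + steps_ahead)


# Hypothetical Volume Bytes Used samples in MB, one per day.
samples = [100, 110, 120, 130]
print(linear_forecast(samples, steps_ahead=2))
```

A straight line suits steadily growing metrics such as storage usage; bursty metrics such as request or error counts usually need a model that accounts for seasonality.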
Site24x7's Amazon Neptune Cluster monitoring tabs
Summary
Get an overview of the activity within each cluster through time series charts that show the timeline of Gremlin Requests, Gremlin Errors, Network Throughput, SPARQL Errors, and SPARQL Requests.
Neptune Instances
If you are monitoring your Neptune instances with Site24x7, their status will be listed in the Neptune Instances tab. You can click on any instance to view its detailed metrics. You can also set thresholds and be notified when an instance fails by clicking the pencil icon under Action.
Configuration Details
The configuration details of a cluster are provided under this tab. The details you may find here include cluster state, cluster ARN, endpoint URL, engine version, allocated storage space, and more.
Events
The events tab contains information on events related to DB instances, DB security groups, DB snapshots, and DB parameter groups for the past.
Recent Logs
Here, you can view the audit log data of a Neptune DB cluster that has been published to CloudWatch Logs.