Data lake
A data lake is a central repository where you can securely store large volumes of structured and unstructured data. The data is typically stored in its raw, original format and can be used for big data analytics, machine learning, data mining, and other processing tasks.
Site24x7's Data Lake feature supports the creation of Custom Metrics monitors, allowing data to be pushed through API or SDK.
Table of contents
- Use case
- Benefits of a data lake
- Add Data Lake monitor
- Data lake dashboard
- Import data to the created monitor
- Performance metrics
- Next Step: Custom Attributes
Use case
Consider an e-commerce platform where critical user journey metrics like product searches, shopping cart activity, check-in, and checkout processing can be stored. Customized dashboards can be created to visualize these metrics through charts, graphs, and tables, aiding in the identification of performance trends and bottlenecks. Furthermore, with the Custom Attributes feature, custom alerts and notifications can also be configured to proactively address issues, such as high cart abandonment rates or slow payment processing, ensuring a seamless shopping experience.
Benefits of a data lake
The Site24x7 Data Lake feature enables you to define and track metrics that are specific to your application and its unique requirements.
- Tailored monitoring: You can track any application-specific metric in addition to what Site24x7 provides by default.
- Granular visibility: You can monitor specific parts of your application's code, third-party integrations, or other components that are particularly important to your application.
- Customized dashboard: You can create your own dashboard that includes various visualization options like numeric charts, graphs, and tables to generate illustrations that help you understand the metrics.
- Alerting and notifications: You can set up custom alerts and notifications based on these metrics.
- Flexibility: You can adapt and create new custom metrics to monitor different aspects of your application that become important over time.
Add Data Lake monitor
To add a Data Lake monitor, follow the steps below:
- Log in to the Site24x7 web client.
- Navigate to Metrics > Data Lake and then click the (+) icon.
- Provide a Display Name, which will be the name of your monitor.
- To receive a template of your monitor with the field details by email, set Send template to mail to Yes; otherwise, set it to No.
Add fields
Fields can be added individually by selecting Add Field, or in bulk by selecting Bulk Addition.
Add Field option
This option is used to add fields one by one.
- Name
  Provide a name for the field.
- Type
  String: String fields, also called tags, can be used to filter numerical data.
  Numeric: Each monitor should contain at least one numeric field.
  Note: Each Data Lake monitor can accommodate a maximum of 50 string fields and 1 numeric field.
- Format
  Note: This option is enabled only if the Numeric type is selected.
  - Time is measured in milliseconds, seconds, and so on.
  - Size is measured in bytes, KB, MB, GB, and so on.
  - Count is measured in ones, thousands, lakhs, millions, billions, and trillions.
- Metric Type
  Note: This option is enabled only if the Numeric type is selected.
Count
The Count metric indicates the total number of event occurrences within a specific time frame.
Rate
The Rate metric measures the number of event occurrences per unit of time within a specific time interval.
Gauge
The Gauge metric represents a snapshot of events for a given time interval. This value is the last value submitted to the SDK during that time interval.
Histogram
The Histogram metric calculates and stores various metrics such as Total, Count, and Percentiles (0, 50, 95, 99, 100) for an array of events. This type of metric can help identify anomalies during specific periods of the event.
A detailed description of Metric Types can be found in the UI.
- Click Save.
- If you want to add more fields, repeat the above steps.
- The list of fields added will be displayed under Fields Configuration.
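To make the difference between the four metric types concrete, here is a minimal Java sketch that computes each one locally from the raw values received in one interval. It is only an illustration, not the Site24x7 implementation: the class and method names are hypothetical, Rate is assumed to mean occurrences per second, and the percentile uses the nearest-rank method, which may differ from what the service does.

```java
import java.util.Arrays;

// Illustrative local aggregations; NOT the Site24x7 implementation.
// All names here are hypothetical. Each method assumes at least one value.
final class MetricTypeSketch {

    // Count: sum of the occurrences reported during the interval.
    static double count(double[] values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Rate: occurrences per second over the interval (assumed definition).
    static double rate(double[] values, double intervalSeconds) {
        return count(values) / intervalSeconds;
    }

    // Gauge: the last value submitted during the interval.
    static double gauge(double[] values) {
        return values[values.length - 1];
    }

    // Histogram percentile, nearest-rank method (assumed method).
    // p = 0 returns the minimum and p = 100 returns the maximum.
    static double percentile(double[] values, int p) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

For the values 5, 1, 3, 2, 4 received in that order over a 5-second interval, this sketch gives a Count of 15, a Rate of 3 per second, a Gauge of 4, and a 50th percentile of 3.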
Bulk Addition option
If you want to add a large number of fields at once, follow the steps below:
- Click on Bulk Addition.
- Paste the JSON code and click Validate.
Sample code for bulk addition:
[
  {
    "name": "users count",
    "type": "numeric",
    "format": "count",
    "metric_type": "count"
  },
  {
    "name": "Type",
    "type": "string"
  },
  {
    "name": "Floor",
    "type": "string"
  }
]
- If the code is error-free after validation, the Save button will be enabled. If the Save button is not enabled after you click Validate, recheck the code for errors and update it.
- Click Save.
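If you maintain many fields, you may prefer to generate the bulk-addition JSON rather than write it by hand. The following Java sketch shows one way to do that. The class and method names are hypothetical; only the keys (name, type, format, metric_type) and the values used in the example call come from the sample. Joining the entries with a separator avoids a trailing comma after the last object, which strict JSON parsers reject.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds the bulk-addition JSON array.
final class BulkFieldsJson {
    private final List<String> entries = new ArrayList<>();

    // Adds a string (tag) field.
    BulkFieldsJson string(String name) {
        entries.add("{\"name\": \"" + escape(name) + "\", \"type\": \"string\"}");
        return this;
    }

    // Adds a numeric field with its format and metric type.
    BulkFieldsJson numeric(String name, String format, String metricType) {
        entries.add("{\"name\": \"" + escape(name) + "\", \"type\": \"numeric\", "
                + "\"format\": \"" + escape(format) + "\", "
                + "\"metric_type\": \"" + escape(metricType) + "\"}");
        return this;
    }

    // Joins the entries with commas, so no comma follows the last object.
    String build() {
        return "[" + String.join(", ", entries) + "]";
    }

    // Escapes backslashes and double quotes inside a JSON string value.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

For example, new BulkFieldsJson().numeric("users count", "count", "count").string("Type").string("Floor").build() produces an array with the same three fields as the sample, ready to paste into the Bulk Addition box.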
Fields configuration
The various fields that have been added are listed here, along with their Type.
- Once you finish adding the fields, click Save.
You will be redirected to the dashboard, where you can see the newly created monitor.
- You are all set. You can now push data to the newly added monitor.
Data lake dashboard
Once you've successfully added a data lake monitor, you'll be directed to the data lake dashboard. You can also access it by navigating to Metrics > Data Lake.
Get a holistic view of your Data Lake monitors. The dashboard categorizes your monitors based on their status (Up, Trouble, Critical, or Down).
Monitors view
Gives an overall summary of the Fields Count, Custom Attributes Monitors Count, and Custom Attributes Count.
Edit monitor
You can add/remove fields or update the details of the existing fields.
- Click the hamburger icon beside the monitor name, and then click Edit.
- In the Edit Data Lake Monitor page, you can add or remove the required fields.
- Save your changes.
Delete monitor
You can delete any data lake monitor that is no longer needed.
- Click the hamburger icon beside the monitor name.
- Click Delete.
Fields view
Gives a comprehensive list of all the fields from all monitors. You can view details like Monitor Name, Field Type, Last Polled Value, and Last Polled Time.
Import data to the created monitor
You can push data via API or SDK:
Push data via API
This option is used to push the aggregated data.
- Go to the data lake monitor listed and click on it.
- Hover over the hamburger icon next to the monitor's name.
- Select Show Post URL from the drop-down menu.
- Follow the on-screen instructions.
Note: Sample payload:
{"values":[{"numeric_data":
[{"name":"Users Count","count":1,"value":0}],
"time_stamp":1697107024462,
"tags":{"Type":"data"}}]}
Where,
- count is the number of values aggregated. This field is optional, and the default value is 1.
- value is the aggregated value. The aggregate calculation should be done based on the Metric Type (Count, Rate, Gauge, and Histogram) you chose.
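As a rough illustration of pushing through the API, the Java sketch below builds a payload with the same structure as the sample and sends it with the JDK's built-in HTTP client (Java 11 or later). The class and method names are hypothetical, the URL must be the one shown under Show Post URL, and the Content-Type header is an assumption; follow the on-screen instructions if they differ.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for pushing one aggregated value to a Data Lake monitor.
final class DataLakePush {

    // Builds a payload with one numeric field and one tag.
    // Names and tag values are assumed to contain no characters that need JSON escaping.
    static String payload(String field, int count, double value,
                          long timestampMs, String tagName, String tagValue) {
        return "{\"values\":[{\"numeric_data\":[{\"name\":\"" + field
                + "\",\"count\":" + count + ",\"value\":" + value + "}],"
                + "\"time_stamp\":" + timestampMs
                + ",\"tags\":{\"" + tagName + "\":\"" + tagValue + "\"}}]}";
    }

    // Posts the payload and returns the HTTP status code.
    // postUrl is the URL copied from Show Post URL.
    static int post(String postUrl, String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(postUrl))
                .header("Content-Type", "application/json") // assumed header
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }
}
```

A call such as DataLakePush.post("<Your Post URL>", DataLakePush.payload("Users Count", 1, 0.0, System.currentTimeMillis(), "Type", "data")) would send one data point stamped with the current time.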
Push data via SDK
This option is used to send raw data (unaggregated data), and the aggregation will be done at our end.
Step 1: Add/Set up the SDK to your application
- Go to the Maven Central Repository.
- Add the dependency into your application based on your build framework.
Note: You can also download and add the SDK jar to your application.
Step 2: Create the object of SDK MetricProvider
Create a MetricProvider object with the Site24x7 exporter using the MetricProviderBuilder class. This object is used to push metric data.
MetricProviderBuilder metricProviderBuilder = new MetricProviderBuilder();
metricProviderBuilder.withExporter(new Site24x7Exporter("<Your license key>"));
MetricProvider metricProvider = metricProviderBuilder.build();
Step 3: Provide raw data to the SDK
Call the method that matches the Metric Type of the defined field so that its values are aggregated correctly.
Count
// Requires: import java.util.Collections;
metricProvider.count(
    "<app key>", // The app key associated with the Data Lake monitor
    "db_write", // The metric name (field name)
    2.0, // Numerical value of the metric
    Collections.singletonMap("host", "192.168.10.254") // Tags to associate with the metric
);
Rate
metricProvider.rate(
"<app key>",
"disk_free",
16868264.0,
Collections.singletonMap("host", "192.168.10.254")
);
Gauge
metricProvider.gauge(
"<app key>",
"disk_free",
16868264.0,
Collections.singletonMap("host", "192.168.10.254")
);
Histogram
metricProvider.histogram(
"<app key>",
"disk_free",
16868264.0,
Collections.singletonMap("host", "192.168.10.254")
);
Where,
- db_write and disk_free are the numeric fields.
- Collections.singletonMap("host", "192.168.10.254") is the string field (tag), which is optional.
- The default units are as follows:
  - ms for Time
  - Bytes for Size
  - Ones for Count
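If your application measures values in other units, you may want to convert them to the default units before passing them to the SDK. The Java sketch below shows three such conversions. It assumes that the SDK interprets raw values in the default units listed above and that 1 KB equals 1,024 bytes; the class and method names are hypothetical and not part of the SDK.

```java
// Hypothetical conversions to the default units (ms, bytes, ones).
final class DefaultUnits {

    // Time: seconds to milliseconds.
    static double secondsToMillis(double seconds) {
        return seconds * 1000.0;
    }

    // Size: kilobytes to bytes, assuming 1 KB = 1,024 bytes.
    static double kilobytesToBytes(double kilobytes) {
        return kilobytes * 1024.0;
    }

    // Count: lakhs to ones (1 lakh = 100,000).
    static double lakhsToOnes(double lakhs) {
        return lakhs * 100_000.0;
    }
}
```

For instance, a response time measured as 2.5 seconds would be converted with DefaultUnits.secondsToMillis(2.5) before being passed as the numerical value.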
Performance metrics
Click the required monitor to view the performance metrics of the fields added to it. In the right corner, you can also select the time frame for which you need the metrics.
Dashboard tab
A default dashboard view will be displayed, showing the number of fields, custom attributes, and custom attribute monitors present.
It also displays numeric widgets and single-line charts for all numeric fields. You can also customize the dashboard using the Edit Dashboard option.
Edit Dashboard
You can build your own dashboard that includes various visualization options like numeric charts, graphs, and tables, which help you visualize and understand the metrics data. This visual representation makes it easier to spot trends, anomalies, and correlations.
Follow the steps below to create a custom dashboard:
- Click on Edit Dashboard.
- Provide a suitable name and description for your dashboard.
- Under Widget Type, choose from Numeric, Graph, or Table.
  - Numeric chart: Select the required Field, Aggregation, Unit, and Time Period. Finally, drag and drop the widget onto the working space to visualize the chart.
  - Graph: Select the required Graph Type, Label Name, Field, Aggregation, and Time Period. Finally, drag and drop the widget onto the working space to visualize the graph.
  - Table: Select the string field by which the rows should be split. Provide the column properties, Aggregation, and Time Period. Finally, drag and drop the widget onto the working space to visualize the table.
- When you're done, click Done customizing at the top of the page.
The new widgets added are displayed on the Dashboard tab.
Filter
The Add Filters option combines its conditions with AND, so only data that satisfies every condition is included. The filter conditions are applied to all the widgets present, and the results are shown accordingly.
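The effect of AND conditions can be pictured with a small Java sketch that filters tagged records locally. It is only an analogy for how the dashboard narrows data, not the product's implementation: the class name is hypothetical, and exact equality is assumed as the only comparison.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration of AND filtering on string fields (tags).
final class AndFilterSketch {

    // Keeps only the records whose tags match every condition.
    static List<Map<String, String>> apply(List<Map<String, String>> records,
                                           Map<String, String> conditions) {
        return records.stream()
                .filter(tags -> conditions.entrySet().stream()
                        .allMatch(c -> c.getValue().equals(tags.get(c.getKey()))))
                .collect(Collectors.toList());
    }
}
```

With the conditions Type = data and Floor = 1, a record tagged Type = data and Floor = 2 is excluded, because it meets only one of the two conditions.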
Fields tab
This tab gives a comprehensive list of all the fields of your monitor. You can view details like monitor name, field type, last polled value, and last polled time.
Alerts tab
The list of custom attributes configured is displayed here. For each custom attribute, you can also view its real-time status, aggregated value, the name of the data lake monitor to which it belongs, the configured filter, and the aggregation type.
Query Data tab
This tab gives a detailed data report of the respective data lake monitor for the chosen time period.
You can perform a quick search by simply entering a valid SQL query into the filter box.
Next Step
You can also create a Custom Attributes monitor for certain key metrics of a data lake monitor for which you need alerts.