Grid Monitoring Accelerator

Monitor and manage computational data grids like TIBCO GridServer with the Grid Monitoring Accelerator. View the telemetry in real-time dashboards and predict failures before they occur using data science models.

Compatible Products

TIBCO Spotfire®, TIBCO® Streaming

Provider

TIBCO Software

Compatible Versions

Software                                      Minimum Version
TIBCO DataSynapse GridServer (optional)       7.1.0
TIBCO Spotfire Desktop                        11.8.0
TIBCO Streaming                               10.6.2
TIBCO Streaming Artifact Management Server    1.6.2

License

TIBCO Component Exchange License

Overview

The Grid Monitoring Accelerator contains components that capture and analyze telemetry about the performance of individual engines, brokers, and drivers from TIBCO GridServer and equivalent entities in other computational grids. It presents this information for analytics purposes, which can provide insights into the grid health and performance.

Documentation

The Accelerator distribution contains a Quick Start guide in MD format.

 

Support Details

The Compatible Versions listed above are the TIBCO product versions that were used to build the currently released version of this Accelerator. Newer versions of these products are expected to work as well. See the Accelerator's wiki page for any further details on product versions.

Accelerators are provided as fast start templates and design pattern examples and are supported as delivered. Please join the Community to discuss the use and implementation of the Grid Monitoring Accelerator.
 

License Details

Component Exchange License

Release(s)

Release 1.0.0

Published: April 2022

Initial release

Grid Monitoring Accelerator


Overview

The Grid Monitoring Accelerator provides a reference architecture and code assets for monitoring and managing computational data grids. It uses rule processing and data science models to alert on and predict anomalies before they prevent a processing run from completing, giving operations staff the opportunity to intervene in a timely manner.

What's New

April 20, 2022, initial release of Grid Monitoring Accelerator 1.0.0

Business Scenario

A data grid is a software architecture that allows for highly distributed processing. It is often applied in situations where there are large amounts of data, and computations can be broken down into small, individual units of work. The individual computation results are then aggregated together to produce a final computed result. Data grids can be located on a single site with many physical or virtual machines, or geographically distributed. Monitoring and managing the performance of data grids is a complex problem.

Data grids are managed by supervising software, such as TIBCO GridServer®. The supervisor can capture telemetry about the performance of the individual engines, brokers, and drivers that compose the grid, and present this information for analytics purposes. This telemetry can provide insights into grid health and performance.

Concepts

The Accelerator was written specifically with TIBCO GridServer® as the data source, but the same principles can apply to any grid supervisor, as long as its data can be supplied in the correct format.

For TIBCO GridServer®, the following components are involved:

  • Grid Clients -- components that submit service requests into the grid, also known as Drivers
  • Engines -- processes that host and run services on grid nodes (the workers)
  • Brokers -- provide request queuing, scheduling, and load balancing, as well as Engine management
  • Directors -- components that assign Grid Clients to Brokers based on policies, such as the installed capabilities of a Broker's Engines and how busy those Engines are

The Accelerator captures telemetry from each of these components and transforms it into a standard data format. The data can then be viewed on live dashboards implemented using TIBCO Spotfire®. In addition, the Accelerator builds a task state model for each of the submitted tasks. There are 3 different task notifications used to determine state:

  • Task Submitted -- the task has been submitted to the grid for processing
  • Task Assigned -- the task has been allocated to an engine for execution
  • Task Completed -- the engine has completed executing the task

Under normal processing, these 3 events occur in sequence and in a timely manner. A gap between Submitted and Assigned means the task was queued because the grid was too busy to accept it at that time. Tasks can also be rescheduled or reassigned, both of which indicate non-optimal grid health.
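
The task state model described above can be sketched roughly as follows. This is a minimal illustration, not the Accelerator's actual implementation: the notification names, the `TaskState` class, the `(task_id, kind, timestamp)` tuple layout, and the 5-second queue limit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical notification names; the real GridServer event names may differ.
SUBMITTED, ASSIGNED, COMPLETED = "submitted", "assigned", "completed"


@dataclass
class TaskState:
    task_id: str
    submitted_at: Optional[float] = None
    assigned_at: Optional[float] = None
    completed_at: Optional[float] = None
    assign_count: int = 0  # more than 1 means the task was reassigned

    def apply(self, kind: str, timestamp: float) -> None:
        """Update the state from one notification."""
        if kind == SUBMITTED:
            self.submitted_at = timestamp
        elif kind == ASSIGNED:
            self.assigned_at = timestamp  # keep the latest assignment
            self.assign_count += 1
        elif kind == COMPLETED:
            self.completed_at = timestamp
        else:
            raise ValueError(f"unknown notification: {kind}")

    def queue_delay(self) -> Optional[float]:
        """Seconds between Submitted and the latest Assigned, if both are known."""
        if self.submitted_at is None or self.assigned_at is None:
            return None
        return self.assigned_at - self.submitted_at

    def was_queued(self, max_delay: float) -> bool:
        """True when the Submitted-to-Assigned gap exceeds max_delay seconds."""
        delay = self.queue_delay()
        return delay is not None and delay > max_delay


def build_states(events):
    """Fold (task_id, kind, timestamp) tuples into one TaskState per task."""
    states = {}
    for task_id, kind, timestamp in events:
        states.setdefault(task_id, TaskState(task_id)).apply(kind, timestamp)
    return states


# Made-up sample events: t1 runs normally, t2 waits and is assigned twice.
events = [
    ("t1", SUBMITTED, 0.0), ("t1", ASSIGNED, 0.2), ("t1", COMPLETED, 3.0),
    ("t2", SUBMITTED, 1.0), ("t2", ASSIGNED, 9.0), ("t2", ASSIGNED, 12.0),
]
states = build_states(events)
queued = [s.task_id for s in states.values() if s.was_queued(max_delay=5.0)]
```

In this sketch a task with more than one Assigned notification is treated as reassigned, and a long Submitted-to-Assigned gap as queued. Both signals could feed a dashboard or a rule, but the thresholds would need tuning for a real grid.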

Since data grids produce different types of events, with many dozens of parameters per event, it is difficult to inspect the data manually, or even to build simple rules-based systems that detect anomalies. Anomaly detection models can automate this process. By applying unsupervised modeling techniques to grid data streams, outliers can be identified and flagged to operations staff for investigation.
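
As a rough illustration of unsupervised outlier flagging, the sketch below scores each reading of a single metric by its distance from the median, measured in units of the median absolute deviation (MAD). This is a deliberately simple stand-in chosen for the example, not the model shipped with the Accelerator. The metric, the sample values, and the cutoff of 5.0 are assumptions.

```python
from statistics import median


def robust_scores(values):
    """Score each value by its distance from the median, in units of MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half of the values are identical; this simple method
        # cannot rank deviations in that case, so report no outliers.
        return [0.0 for _ in values]
    return [abs(v - med) / mad for v in values]


def flag_outliers(values, cutoff=5.0):
    """Return the indexes of values whose robust score exceeds the cutoff."""
    return [i for i, score in enumerate(robust_scores(values)) if score > cutoff]


# Hypothetical per-engine memory readings in MB; the last one looks like a leak.
memory_mb = [512, 530, 498, 505, 521, 515, 2048]
outliers = flag_outliers(memory_mb)
```

No labeled examples of "bad" grid states are needed here, which is what makes the approach unsupervised. A real deployment would have to handle many parameters per event rather than one metric at a time.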

Benefits and Business Value

Data grids are used for complex calculations in large global financial institutions. These platforms are critical for nightly reconciliation of positions and reporting to government regulators. Failure to report in a timely manner can result in fines and costly adverse publicity.

When grids go wrong, it is often difficult to detect the problem early enough to take corrective action. Since the underlying engines execute code created by data analysts and programmers, that code is subject to the same quality control issues as any other piece of software. Memory leaks, crashing nodes, and incomplete calculations are all issues that can adversely impact grid health. The Accelerator provides an intelligent platform for capturing grid telemetry and presenting it to operations staff so that potential issues are flagged before they consume a large amount of time and processing power.

Technical Scenario

The Accelerator demonstrates grid monitoring using a recorded dataset produced from a real TIBCO GridServer® implementation. Using a recorded dataset allows users to try out the Accelerator without having to spin up an entire data grid. In a real implementation, an integration between the data grid and the Accelerator would be necessary.

A Spotfire® dashboard is provided to show key grid metrics and task states. The Accelerator also executes an anomaly detection model in Python to produce an anomaly score called Loss MAE. Once this value exceeds a configurable threshold, the grid state is declared anomalous, which signals operations staff to begin investigating.
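
The threshold check can be sketched as below. The sketch assumes that Loss MAE is the mean absolute error between an observed feature vector and a model's reconstruction of it, which the text above does not confirm. The function names, the sample vectors, and the threshold of 0.25 are hypothetical.

```python
def loss_mae(observed, reconstructed):
    """Mean absolute error between an input vector and its reconstruction."""
    if len(observed) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    if not observed:
        raise ValueError("vectors must not be empty")
    total = sum(abs(o - r) for o, r in zip(observed, reconstructed))
    return total / len(observed)


def is_anomalous(score, threshold):
    """The grid state is flagged once the score exceeds the threshold."""
    return score > threshold


THRESHOLD = 0.25  # hypothetical; the real value is configurable

# Made-up scaled features and a made-up model reconstruction of them.
observed = [0.25, 0.5, 0.75, 1.0]
reconstructed = [0.25, 0.5, 0.25, 0.0]

score = loss_mae(observed, reconstructed)
flagged = is_anomalous(score, THRESHOLD)
```

Keeping the threshold as a separate, configurable value lets operations staff trade false alarms against missed anomalies without retraining the model.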

Components

Software                                      Minimum Version
TIBCO DataSynapse GridServer (optional)       7.1.0
TIBCO Spotfire Desktop                        11.8.0
TIBCO Streaming                               10.6.2
TIBCO Streaming Artifact Management Server    1.6.2

Documentation

The Quick Start Guide is available by downloading the full Accelerator distribution package from here. The license file for this component is here.
