TAF21 Hackathon in Review

Published:
11:12pm Jun 15, 2021

Overview

This year’s hackathon focused on analyzing wind power production in Texas. The year 2020 marked a record year for the global wind industry. Participants used real-world data from multiple sources, such as the Electric Reliability Council of Texas (ERCOT) and the National Oceanic and Atmospheric Administration (NOAA), to model wind power production from weather patterns and wind farms’ metadata.

The open-ended challenges prompted participants to build visuals and highlight analytics from both geospatial and time series data. By taking data from disparate weather stations around Texas, participants needed to interpolate and transfer wind data spatially onto individual wind farms, and then merge that environmental data with wind turbine power curves to accurately estimate power generated. Some of the tricks planted along the way required participants to join together datasets, construct layers on map charts, create calculated columns, and configure data functions from R code. In the end, successful participants could clearly show and predict locations and wind farms in Texas producing the most wind power, and the top submissions also took advantage of Spotfire’s Mods, scripting, and maps.

Winners

Hackathon winners were selected based on their ability to take data visualization to places we’ve never seen before, and to communicate advanced analytics effectively to inform their audiences. There were several creative entries making it very difficult to choose a winner! Below we detail what we liked most across several entries.

1st Place | Jolene Robertson
“Most Analytical”

Jolene demonstrated a clear handle on multiple components of Spotfire and Visual Data Communication. The display does not show too much or too little information, and the insights conveyed are to the point.

On the left, our eyes are drawn directly to an R-generated contour map of wind speed, and the contour fitting is adjustable to different smoothing parameter values with the slider below. Jolene’s was the only submission to offer this sort of dynamic flexibility of direct calculation with Spotfire Data Functions. The markers on the map represent the wind power generated at each wind farm, with color and sizing used for intuitive interpretation.

At the top, we are easily able to select among different Turbine Models for drill down in the power curves and associated wind farms, a useful interaction for understanding power generation equipment in practice.

Next, we liked the use of subtle reference information to add a layer of detail without obscuring the main insights. The ERCOT CDR Zones are shown gently in the map background, vertical reference lines were added to the Power Curves on the bottom right to show turbine cutoff speeds, and horizontal lines show peak power generating capacity. These small touches were not part of the instructions; Jolene chose them entirely on her own.

Lastly, the custom menu buttons and popups viewable from the top right add another layer of detail without taking up more screen real estate, and give the consumer more control over information of interest. This type of layered approach to an analysis allows consumers to see the most critical information immediately, then drill through the data intuitively to explore more information for better decision making.

>> See Jolene’s analysis here

2nd Place | Ryan Hartquist
“Most Complete App”

“My approach was to try to leverage spotfire’s ability to produce a highly analytical product while still giving users a polished and usable experience.” --Ryan Hartquist

Ryan delivered a highly complete application. Collapsible menu accordions on the left give the consumer access to various data filterings. A navigation pane at the top helps the audience move along in a storyboard format. Standard page layouts across the dashboard create a sense of familiarity that drives intuitive interaction for any first time viewer.

Ryan spent considerable effort making his analysis usable for a variety of audiences, regardless of their background. Such a dashboard would likely require few changes or improvements over time, as the audience has enough flexibility and control to continue answering business questions at increasing levels of detail.

Visually, we really enjoyed Ryan’s consistent use of a primary blue color across the pages, not adding too much color to confuse data interpretations. The use of the side-by-side bar chart to demonstrate each turbine’s Rated Power Capacity against its actual produced power was very clever, interpretable, and effective.

>> See Ryan’s analysis here

3rd Place | Jade Liu
“Most Targeted”

“I was really impressed with the Hackathon challenges! It was the perfect balance of difficulty and creativity - nothing was designed for you to get stuck or waste a lot of time on one task.” --Jade Liu

When we spoke to Jade after the competition, she communicated very concisely and to the point. It’s no wonder that her analytics show the same directness and efficiency (and that hers was our fastest submission!).

Jade did not add too much extra pizazz or superfluous elements to her analysis. Everything included is useful and carefully chosen, from the subtle white map background and slightly visible US County borders (that do not interfere with the subtle contour lines) to color usage that is clear and consistent across the heatmap and markers. Such large arrows for wind speed and direction may seem stark at first, but in our opinion they do a very good job of conveying wind behavior, which is the key insight of the map.

Throughout Jade’s analysis, we repeatedly saw challenges addressed directly and efficiently, leading us to give her the “Most Targeted” award. Congrats, Jade!

>> See Jade’s analysis here

4th Place | Braeden Gilchrist
“Most Advanced”

Braeden chose to go above and beyond to provide advanced analytics from out-of-the-box Spotfire capabilities, and used them to enhance his understanding of the data during the Exploratory Data Analysis portion.

He started with a quick linear regression model of temperature vs wind speed to understand if there was any relationship, and how that might vary seasonally. Then, using that confirmed insight, he chose to use a sophisticated Spotfire Mod to convey these patterns over time with animation:

The insight from this animation was very interesting and justifiably used additional dimensions of data to communicate a deeper understanding of the environmental patterns.

>> See Braeden’s analysis here

Honorable Mention | Jon Henderson
“Most Aesthetic”

Enertel’s Jon Henderson has a special talent for visual aesthetics and communication -- he often puts Dr. Spotfire to shame!

Using radiantly beautiful color schemes across his heatmap, he clearly communicates the areas of highest wind speed and minimizes the noise of areas with little wind. The custom map background Jon made from scratch creates deep cohesion, while the bright green primary color pulls the data off the page without being distracting or jarring.

Special uses of typography, action shortcuts, and Spotfire Mods are next-level wizardry that rounded out what is easily the “Most Aesthetic” dashboard we’ve seen. As mentioned, Jon has a special knack for this level of detail -- if you want to learn more about his tips and guidance, check out our recent Dr. Spotfire session with Jon as a guest speaker!

>> See Jon’s analysis here

Other Notable Submissions

Ricky Barthel - “Clearest Exploratory Analysis”

Ricky Barthel, who placed 4th at our last TAF Hackathon in 2019, returned this year and wowed us with an elegant and creative Exploratory Analysis in the first exercise. A creative use of a heatmap visual at the top right shows seasonal patterns by month and season. This is reinforced by the line chart and bar chart below, using data for each weather station, and again reinforced geospatially with the nice, narrow map on the right. Great ideas, Ricky!

>> See Ricky’s analysis here

James Kim - “Beautiful Maps”

James Kim used a full range of color to communicate wind patterns in both the heatmap and on wind farm markers, with a subtle white indication of weather station locations as well. He had very beautiful maps we wanted to share. Nice job James!

>> See James’ analysis here

Tyler Kendle - “The Calendar Chart”

With a really creative use of a custom Calendar Chart, Tyler showed us how wind speeds varied throughout the year in the above heatmap. A very cool visual by Tyler!

>> See Tyler’s analysis here

The Solutions

Key steps in the solution included:
•    Visualizing environmental data for context
•    Mapping Weather Stations and Wind Farms from disconnected data sets
•    Computationally generating a 3-dimensional continuous surface for wind speed
•    Using resulting surface to interpolate wind speeds at specific Wind Farms
•    Merging wind speeds with turbine power curve data to determine actual power generated at each wind farm

Exercise 1: Visualize contextual environmental data to explore patterns in wind speed, time, and seasonality.

>> Solution: This exercise was intentionally left very open ended for participants to flex their creativity in Exploratory Data Analysis. The goal was for the grader (analytics consumer) to be able to easily and effectively understand the patterns in data without confusion, and elements of visual perception and communication were highly rewarded. This exercise should set the stage for deeper analysis into wind energy generation in subsequent exercises, by allowing the participant and consumer to understand available data and its limitations.

Note that Exploratory Data Analysis can often be very ad-hoc and difficult to interpret by others if not designed with key questions, insights, and usability in mind. The exercise was graded from the perspective of a consumer that had no awareness of the data available.

Exercise 2: Public data at Weather Stations and Wind Farms are typically decoupled. Create a Smoothed 3-Dimensional Surface to represent wind speed at weather stations, with the wind farms then overlaid.

>> Solution: Spatial interpolation requires data to interpolate from, and it is a nuanced concept that many may not be familiar with. Such interpolation was required in the following exercise; this exercise was a good intermediary step to help visualize what is happening mathematically.

The interpolation in the following exercise uses LOESS smoothing, which essentially fits a local regression around each location of interest, weighting nearby observations more heavily than distant ones:

The above example shows two dimensions (x=E and y=NOx) with different smoothing parameters creating differing smoothed red lines. In our use case, however, we have 3 dimensions of interest: latitude, longitude, and wind speed.
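To make the role of the smoothing parameter concrete, here is a minimal sketch of a two-dimensional LOESS-style fit in plain Python. It is not the R code used in the hackathon: the function names, the tricube weighting, and the sample data are assumptions chosen for illustration. Each estimate comes from a weighted straight-line fit to the points nearest the location being estimated, and `span` sets what fraction of the data falls inside that window.

```python
def tricube(u):
    """Tricube kernel: weight 1 at u = 0, falling to 0 once |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess_point(xs, ys, x0, span=0.5):
    """Estimate y at x0 from a weighted straight-line fit to nearby points."""
    k = min(len(xs), max(2, round(span * len(xs))))   # neighbours in the window
    h = sorted(abs(x - x0) for x in xs)[k - 1]        # distance to the k-th nearest point
    h = h * 1.0001 + 1e-12                            # widen slightly so that point keeps a small weight
    w = [tricube((x - x0) / h) for x in xs]

    # Weighted least-squares line through the neighbourhood.
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0         # fall back to a weighted mean
    return my + slope * (x0 - mx)


# Made-up noisy data, for illustration only.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8, 10.1, 10.9]
tight = [loess_point(xs, ys, x, span=0.3) for x in xs]   # small window, tracks the data closely
loose = [loess_point(xs, ys, x, span=1.0) for x in xs]   # large window, smoother curve
```

A smaller `span` uses fewer neighbours per fit and follows local variation; a larger one averages over more of the data. The same idea extends to the spatial case by measuring distance in latitude and longitude instead of along a single axis.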

Using a freely available Spotfire Data Function on the TIBCO Exchange, participants were able to simply “plug-and-play” to create a 3-dimensional surface that represents wind speed across the spatial data. This surface represents the average wind speed at individual grid points within the perimeter of available weather stations. Wind Farms were overlaid for context; the actual wind speed data from Weather Stations had yet to be combined with the farms, which was done in the next exercise.

Exercise 3: Using a provided R script, create a Spotfire Data Function to spatially interpolate wind speed data onto individual wind farms using the LOESS smoothing method.

(Note the circle markers for wind farms are now colored by the average wind speed that was interpolated from the triangle weather stations)

>> Solution: The exact script needed to complete this exercise was provided; participants only needed to configure it correctly as a data function. One challenge was knowing whether each data function parameter was passed in or out as a Value, Column, or Data Table (but a careful reading of instructions gave strong hints!). Another challenge was knowing which data tables to work with. The Weather Station data set supplied the inputs for fitting the surface, while the Wind Farm data set supplied the locations to interpolate at (wind farm latitude and longitude) and served as the destination for the interpolated wind speed values (an output column of wind speeds on the Wind Farm data set).

Once done correctly, a new column of data (average wind speed) was available at each Wind Farm and could be easily visualized with a sequential gradient color scheme.
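The data flow described above (station table in, farm locations in, a new farm column out) can be sketched as follows. This is an illustration only, not the provided R script: it uses a distance-weighted local average, which is a simpler stand-in for the full LOESS surface fit, treats latitude and longitude as flat coordinates, and all field names and sample values are hypothetical.

```python
import math


def tricube(u):
    """Tricube kernel: weight 1 at u = 0, falling to 0 once |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def interpolate_wind_speed(stations, farms, span=0.75):
    """Return the farm rows with an added 'avg_wind_speed' column."""
    k = min(len(stations), max(1, round(span * len(stations))))
    result = []
    for farm in farms:
        # Distance from this farm to every station (degrees, flat approximation).
        d = [math.hypot(s["lon"] - farm["lon"], s["lat"] - farm["lat"])
             for s in stations]
        h = sorted(d)[k - 1] * 1.0001 + 1e-12      # window reaches the k-th nearest station
        w = [tricube(di / h) for di in d]
        est = sum(wi * s["wind_speed"] for wi, s in zip(w, stations)) / sum(w)
        result.append({**farm, "avg_wind_speed": est})   # output column on the farm table
    return result


# Hypothetical rows standing in for the Weather Station and Wind Farm tables.
stations = [
    {"lat": 31.0, "lon": -100.0, "wind_speed": 6.0},
    {"lat": 32.0, "lon": -101.0, "wind_speed": 10.0},
    {"lat": 33.0, "lon": -99.0, "wind_speed": 8.0},
]
farms = [{"name": "Farm A", "lat": 32.0, "lon": -101.0}]
farms_with_speed = interpolate_wind_speed(stations, farms, span=1.0)
```

The station table is only read, and the farm table gains one column, which mirrors how the data function's inputs and outputs had to be mapped in Spotfire.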

Exercise 4: Visualize Turbine Power Curves for each available turbine model

>> Solution: Wind Turbines produce varying power at different wind speeds, and are optimized in design for certain wind speed ranges. This concept can be quite technical so this exercise served as an intermediary to first get the consumers acquainted with the power curve concept by simply plotting a line chart for each model, illustrating power generated on the y-axis and wind speeds on the x-axis.

As wind speeds increase, the power generated climbs along these curves until it plateaus at the turbine’s peak capacity; at very high wind speeds the turbine shuts down to prevent equipment damage. The minimal requirements in this exercise were designed to be quite simple.
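The shape of a power curve can be captured in a small lookup function. The sketch below is an illustration under stated assumptions: the curve points are invented (they are not from any turbine model in the hackathon data), and power is linearly interpolated between tabulated wind speeds, with zero output below the first point and above the last.

```python
# Hypothetical power curve: (wind speed in mph, power in kW). Values are made up.
CURVE = [(7, 0), (10, 300), (15, 1200), (25, 2000), (55, 2000)]


def power_at(curve, wind_speed):
    """Interpolate power from sorted (wind_speed, power) points."""
    cut_in, cut_out = curve[0][0], curve[-1][0]
    if wind_speed < cut_in or wind_speed > cut_out:
        return 0.0                      # too little wind, or shut down for safety
    for (s0, p0), (s1, p1) in zip(curve, curve[1:]):
        if s0 <= wind_speed <= s1:
            # Straight line between the two neighbouring curve points.
            return p0 + (p1 - p0) * (wind_speed - s0) / (s1 - s0)
    return 0.0
```

With this made-up curve, output rises between 7 and 25 mph, stays flat at 2000 kW up to 55 mph, and drops to zero beyond that.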

Exercise 5: Now that wind speeds are available at each Wind Farm, use the Generator IDs for the turbines at each Wind Farm to join power curve data with wind speed data.

>> Solution: A trick here is knowing that you cannot join on a calculated column or a data function output column, but you can freeze such columns to join, or link to the table, or even export the table and re-import it as a static table.

So, by having a data function output column for wind speed AND the Generator ID in the Wind Farm data set, then having the Generator ID and corresponding power generated at 0.5 mph intervals of Wind Speed on the power curve, one could simply join the two data sets on Generator ID and wind speed to bring in the resulting Power Generated at each wind farm. A trick to note was that the data function output for wind speed must be rounded to the nearest whole number, and there are multiple turbines at each Wind Farm, so the power generation number found for each turbine must then be multiplied by the number of turbines at that farm.
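The join logic can be sketched outside Spotfire as a keyed lookup. This is a minimal illustration, not the Data Canvas configuration itself: the column names, generator ID, and numbers are hypothetical, and the power unit is assumed to be kW.

```python
def estimate_farm_power(farms, power_curve):
    """Join farms to the power curve on (generator ID, rounded wind speed)."""
    lookup = {(row["generator_id"], row["wind_speed"]): row["power_kw"]
              for row in power_curve}
    out = []
    for farm in farms:
        # Round the interpolated speed so it matches a power curve row.
        # Note: Python's round() sends exact .5 values to the nearest even number.
        key = (farm["generator_id"], round(farm["avg_wind_speed"]))
        per_turbine = lookup.get(key)            # None when no curve row matches
        total = None if per_turbine is None else per_turbine * farm["turbine_count"]
        out.append({**farm, "power_kw": total})  # per-turbine power times turbine count
    return out


# Hypothetical rows for illustration only.
power_curve = [
    {"generator_id": "GEN-A", "wind_speed": 7, "power_kw": 500},
    {"generator_id": "GEN-A", "wind_speed": 8, "power_kw": 700},
]
farms = [
    {"name": "Farm A", "generator_id": "GEN-A", "avg_wind_speed": 7.4, "turbine_count": 10},
    {"name": "Farm B", "generator_id": "GEN-B", "avg_wind_speed": 7.4, "turbine_count": 5},
]
farm_power = estimate_farm_power(farms, power_curve)
```

Farm A matches the 7 mph row and gets 500 kW per turbine times 10 turbines; Farm B has no matching generator in the curve table, so its power is left empty rather than guessed.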

After using the Spotfire Data Canvas for this visual data wrangling exercise, a single and simple data visualization could be created for showing the power generated across all wind farms in the state of Texas during this period (these are of course only estimates, but educated estimates at that).

It’s not over yet!

TIBCO Analytics Forum 2021 is still live through the end of June 2021! Click here to view all the great content recorded and on-demand, and also be sure to visit us for more great analytics content at TIBCO NOW 2021 on September 27-30!

 

Authors:

Sweta Kotha is a Data Scientist at TIBCO and a recent graduate from Carnegie Mellon University. Her experience spans data science, natural language processing, and biostatistics. She likes trying out new technologies and methods to address analytics challenges and is interested in effectively communicating with data. She enjoys reading, running, and traveling.

Neil Kanungo is a Data Scientist at TIBCO and specializes in data visualization and business analytics. He helps deliver unique solutions to industry’s biggest challenges. Neil takes a special interest in operationalizing analytics across organizations at multiple levels, and in fostering user engagement. In his free time, Neil enjoys hiking with his dog, live music, and playing pinball.