AWS Thrifty Mod for Flowpipe

turbot/aws_thrifty

Version

v1.0.1

Overview

EMR clusters that are live but not currently running tasks should be reviewed to check whether they have been idle for more than 30 minutes. Terminating such clusters avoids paying for capacity that is doing no work.

This query trigger identifies EMR clusters that have been idle for more than 30 minutes and then either sends a notification or attempts a predefined corrective action.

Getting Started

By default, this trigger is disabled, but it can be configured by setting the following variables:

emr_clusters_idle_30_mins_trigger_enabled should be set to true as the default is false.
emr_clusters_idle_30_mins_trigger_schedule should be set to your preferred schedule.
emr_clusters_idle_30_mins_default_action should be set to the desired action (e.g., "notify" for notifications or "terminate_cluster" to terminate the cluster).

Then start the server:

flowpipe server

or if you've set the variables in a .fpvars file:

flowpipe server --var-file=/path/to/your.fpvars
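
As an illustration, a minimal .fpvars file that enables this trigger might look like the following. The variable names are the three listed above; the values are examples only ("15m" matches the schedule shown for this trigger, and "notify" is one of the two actions named above), and the file name is hypothetical.

```hcl
# your.fpvars (example values only; adjust to your environment)

# Enable the trigger; it is disabled (false) by default.
emr_clusters_idle_30_mins_trigger_enabled = true

# How often the trigger's query runs.
emr_clusters_idle_30_mins_trigger_schedule = "15m"

# "notify" only sends a notification; "terminate_cluster" terminates the cluster.
emr_clusters_idle_30_mins_default_action = "notify"
```

Starting with "notify" is the more cautious choice, since it reports idle clusters without changing anything.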

Pipeline

Correct EMR Clusters idle 30 mins

Query

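-- IsIdle datapoints that are at least 30 minutes old, bucketed by calendar date.
with cluster_metrics as (
  select
    id,
    maximum,
    date(timestamp) as timestamp
  from
    aws_emr_cluster_metric_is_idle
  where
    timestamp <= current_timestamp - interval '30 minutes'
),
-- Per cluster and date: the number of datapoints and their average.
-- An average of 1 means every datapoint reported the cluster as idle.
emr_cluster_isidle as (
  select
    id,
    count(maximum) as count,
    sum(maximum) / count(maximum) as avagsum
  from
    cluster_metrics
  group by
    id,
    timestamp
)
-- Clusters with at least 7 datapoints, all of them idle.
-- An inner join is used here: filtering on "u.id is null" after a left join
-- could never match, because avagsum and count would then be null as well.
select
  concat(i.id, ' [', i.region, '/', i.account_id, ']') as title,
  i.id,
  i.region,
  i.sp_connection_name as conn
from
  aws_emr_cluster as i
  join emr_cluster_isidle as u on u.id = i.id
where
  u.avagsum = 1
  and u.count >= 7;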

Schedule

15m
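
As a rough sketch of how the pieces above might fit together, a Flowpipe query trigger can take its enabled flag and schedule from the variables described in Getting Started, run the query, and hand newly returned rows to the corrective pipeline. Apart from those two variable names, everything here is an assumption for illustration: the trigger name, `var.database`, `local.emr_clusters_idle_30_mins_query` (assumed to hold the query text from the Query section), the pipeline name, and its `items` argument. The mod's actual definition may differ.

```hcl
# Hypothetical trigger definition; names other than the two documented
# variables are assumptions, and the referenced local and pipeline are
# assumed to be defined elsewhere in the mod.
trigger "query" "detect_and_correct_emr_clusters_idle_30_mins" {
  # Disabled unless the variable is set to true.
  enabled  = var.emr_clusters_idle_30_mins_trigger_enabled

  # How often the query runs, for example "15m".
  schedule = var.emr_clusters_idle_30_mins_trigger_schedule

  # Assumed: the database to query and a local holding the SQL text.
  database = var.database
  sql      = local.emr_clusters_idle_30_mins_query

  # Rows that newly appear in the query result are passed to the pipeline.
  capture "insert" {
    pipeline = pipeline.correct_emr_clusters_idle_30_mins
    args = {
      items = self.inserted_rows
    }
  }
}
```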
