Automatically starting and stopping RDS with AWS Lambda

A note to myself, because I keep forgetting.

EventBridge Scheduler could probably do this too, but this time I use plain old Lambda to start and stop specified DB instances automatically.

Note: this is not a "copy and paste this and it works" kind of post.

Creating the IAM role
Creating separate start and stop functions and events
Selecting instances by tag
Switching on the EventBridge input
Switching on the EventBridge input and tags

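Creating the IAM role

Either create a role for Lambda in advance, or let Lambda create the role and add the permissions to it afterwards. Starting and stopping a DB instance needs the following three actions.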

rds:DescribeDBInstances
rds:StartDBInstance
rds:StopDBInstance

If you use DB instance tags, the following action is also needed.

rds:ListTagsForResource

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:log-group:*:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "rds:DescribeDBInstances",
                "rds:ListTagsForResource",
                "rds:StartDBInstance",
                "rds:StopDBInstance"
            ],
            "Resource": "arn:aws:rds:*:*:db:*"
        }
    ]
}

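Creating separate start and stop functions and events

Create a start event with a start function and a stop event with a stop function, and specify the targets in the code. This is simple, but any change means editing the code, which is a hassle.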

StopDBInstance

import boto3

rds = boto3.client('rds', region_name='ap-northeast-1')

def lambda_handler(event, context):
    instances = [
        'staging',
        'development',
    ]
    for instance in instances:
        status = rds.describe_db_instances(DBInstanceIdentifier=instance)['DBInstances'][0]['DBInstanceStatus']
        if status not in ['available']:
            continue
        response = rds.stop_db_instance(DBInstanceIdentifier=instance)
        print(response)

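If the DB instances are specified through an environment variable rather than in the code, allow several of them as a comma-separated list. The value might be written as a,b,c or as a, b, c, so it is probably better to use strip(). An empty value should produce an empty list, so add a condition for that too.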

import os

# Keep only non-empty entries ("if i is str" would always be False and drop everything).
instances = [i.strip() for i in os.environ.get('DB_INSTANCE_IDENTIFIERS', '').split(',') if i.strip()]
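
The parsing rule described above can be pulled into a small function so it is easy to check on its own. This is a minimal sketch; parse_identifiers is a name made up for this memo, and DB_INSTANCE_IDENTIFIERS is the same environment variable name used above.

```python
import os

def parse_identifiers(raw):
    """Split a comma-separated string into trimmed, non-empty names."""
    # 'a, b, c' and 'a,b,c' give the same result; '' gives [].
    return [name.strip() for name in raw.split(',') if name.strip()]

# Hypothetical usage inside the Lambda module.
instances = parse_identifiers(os.environ.get('DB_INSTANCE_IDENTIFIERS', ''))
```

Filtering on name.strip() instead of name also drops entries that consist only of spaces, such as the trailing piece of "a, b, ".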

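Calling start_db_instance() on an instance that is already "available", or stop_db_instance() on one that is already "stopped", raises an error, so check the DB instance status with describe_db_instances() first. If you are fine with ignoring the error, the check can be omitted.

The schedule here is "stop every day at 20:00, start Monday to Friday at 8:00", so on Saturday and Sunday stop_db_instance() always runs against an instance that is already stopped and fails. Handle that case properly (limiting the stop to Monday to Friday would also avoid the problem).

A sample describe_db_instances() response is shown below (abridged because it is long). Each instance entry contains DBInstanceStatus. For the full response see https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/rds.html#RDS.Client.describe_db_instances.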
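
If you would rather skip the status check and ignore the error, one possible shape is below. This is only a sketch: the function name stop_ignoring_state_error is made up, the client is passed in as an argument, and it assumes the error surfaces as rds.exceptions.InvalidDBInstanceStateFault, which is how boto3 exposes that RDS error as far as I know. Confirm the actual exception in your logs before relying on it.

```python
def stop_ignoring_state_error(rds, identifier):
    """Try to stop the instance; return False if RDS rejects it because of its state."""
    try:
        rds.stop_db_instance(DBInstanceIdentifier=identifier)
    except rds.exceptions.InvalidDBInstanceStateFault:
        # Assumption: this is the error raised for an already stopped instance.
        return False
    return True
```

Any other error (missing permissions, unknown identifier) still propagates, so real problems are not silently swallowed.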

{
    'Marker': 'string',
    'DBInstances': [
        {
            'DBInstanceIdentifier': 'string',
            'DBInstanceClass': 'string',
            'Engine': 'string',
            'DBInstanceStatus': 'string'
        }
    ]
}
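
The Marker field in the response is for pagination: when there are more instances than fit in one response, the next call has to pass that Marker back. A minimal sketch of walking all pages follows. iter_db_instances is a hypothetical helper name, and the client is passed in so the function can be tried with a stub.

```python
def iter_db_instances(rds, **kwargs):
    """Yield every DB instance, following Marker until there are no more pages."""
    params = dict(kwargs)
    while True:
        page = rds.describe_db_instances(**params)
        yield from page['DBInstances']
        marker = page.get('Marker')
        if not marker:
            break
        # Ask for the next page on the following call.
        params['Marker'] = marker
```

With only a handful of instances a single call is enough, so this only matters when listing instances without specifying DBInstanceIdentifier.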

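The status takes values such as the following.

available (ready for use)
starting (starting up)
stopping (shutting down)
stopped (temporarily stopped)

After creating the function, add CloudWatch Events as a trigger.

Other options would be to check tags instead of naming the instances, or to use a single function scheduled at both 8:00 and 20:00 that changes its behavior based on the hour or AM/PM. context contains function_name, so with something like context.function_name.startswith() and a consistent naming convention, the start and stop functions could share identical script contents.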
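
The schedule expressions of CloudWatch Events rules are evaluated in UTC, so the "every day at 20:00 / Monday to Friday at 8:00" schedule has to be shifted if those times are meant as JST (an assumption here, based on the ap-northeast-1 region). A rough sketch of the conversion, with a made-up helper name:

```python
DAYS = ['SUN', 'MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT']

def jst_to_utc_cron(hour_jst, days_jst=None):
    """Build a cron() schedule expression in UTC for an on-the-hour JST time."""
    hour_utc = (hour_jst - 9) % 24
    # Before 09:00 JST the UTC date is still the previous day.
    shift = -1 if hour_jst < 9 else 0
    if days_jst is None:
        return f'cron(0 {hour_utc} * * ? *)'
    days_utc = ','.join(DAYS[(DAYS.index(d) + shift) % 7] for d in days_jst)
    return f'cron(0 {hour_utc} ? * {days_utc} *)'

stop_rule = jst_to_utc_cron(20)
start_rule = jst_to_utc_cron(8, ['MON', 'TUE', 'WED', 'THU', 'FRI'])
```

Under these assumptions the stop rule becomes cron(0 11 * * ? *) and the start rule becomes cron(0 23 ? * SUN,MON,TUE,WED,THU *), because 8:00 JST on Monday is 23:00 UTC on Sunday.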
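
A sketch of the function-name idea, assuming a hypothetical naming convention where the start function's name begins with "Start" and the stop function's name begins with "Stop":

```python
def action_from_function_name(function_name):
    """Derive 'start' or 'stop' from the Lambda function name prefix."""
    # Assumed convention: names such as StartDBInstance / StopDBInstance.
    if function_name.startswith('Start'):
        return 'start'
    if function_name.startswith('Stop'):
        return 'stop'
    raise ValueError(f'cannot tell the action from {function_name!r}')
```

Inside the handler this would be called as action_from_function_name(context.function_name). Raising on an unexpected name keeps a misnamed function from quietly doing nothing.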

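Selecting instances by tag

For when there are several DB instances and the auto start/stop targets are marked with a tag. This example assumes the tag {"Key": "AutoStartStop", "Value": "true"}. The code does not distinguish start from stop, so you end up creating two functions. Combining it with the EventBridge input described later looks like a good idea.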

import boto3

rds = boto3.client('rds')

def lambda_handler(event, context):
    instances = rds.describe_db_instances()['DBInstances']
    for instance in instances:
        tags = {tag['Key']: tag['Value'] for tag in rds.list_tags_for_resource(ResourceName=instance['DBInstanceArn'])['TagList']}
        if tags.get('AutoStartStop') == 'true' and instance['DBInstanceStatus'] in ['stopped']:
            response = rds.start_db_instance(DBInstanceIdentifier=instance['DBInstanceIdentifier'])
            print(response)

I wanted to use Filters with list_tags_for_resource(), but apparently it is not supported.

list_tags_for_resource() returns the following.

{
    'TagList': [
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
}

This shape is a bit awkward to work with, so I converted it to the form below, but looping over TagList and checking each entry would also be fine.

{
    'key_1': '1',
    'key_2': '2',
}
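
Both approaches mentioned above, side by side, as a small sketch. tags_to_dict and has_tag are names made up for this memo.

```python
def tags_to_dict(tag_list):
    """Convert [{'Key': k, 'Value': v}, ...] into {k: v, ...}."""
    return {tag['Key']: tag['Value'] for tag in tag_list}

def has_tag(tag_list, key, value):
    """Check for a tag by looping over TagList directly, without converting it."""
    return any(tag['Key'] == key and tag.get('Value') == value for tag in tag_list)
```

The dict form is handier when several tags are checked for one instance, while the loop is enough for a single tag.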

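Switching on the EventBridge input

This pattern switches between start and stop based on the event input, so a single function is enough. With many DB instances you have to list them in both the start event and the stop event, which gets a bit tedious. It may be better to set the DB instances in an environment variable on the Lambda side.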

import boto3

rds = boto3.client('rds')

def lambda_handler(event, context):
    instances = event['DBInstances']
    for instance in instances:
        status = rds.describe_db_instances(DBInstanceIdentifier=instance)['DBInstances'][0]['DBInstanceStatus']
        if   event['Action'] == 'start' and status in ['stopped']:
            response = rds.start_db_instance(DBInstanceIdentifier=instance)
            print(response)
        elif event['Action'] == 'stop'  and status in ['available']:
            response = rds.stop_db_instance(DBInstanceIdentifier=instance)
            print(response)

After adding the start and stop triggers, set JSON like the following as the input of each event.

{
    "Action": "stop",
    "DBInstances": [
        "staging",
        "development"
    ]
}
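
Since the input is typed by hand in the console, checking it at the top of the handler may save some debugging. A minimal sketch, assuming the same Action and DBInstances keys as the JSON above; parse_event is a hypothetical helper name.

```python
VALID_ACTIONS = ('start', 'stop')

def parse_event(event):
    """Return (action, instances) from the event input, or raise ValueError."""
    action = event.get('Action')
    if action not in VALID_ACTIONS:
        raise ValueError(f'Action must be one of {VALID_ACTIONS}, got {action!r}')
    instances = event.get('DBInstances', [])
    if not isinstance(instances, list) or not all(isinstance(i, str) for i in instances):
        raise ValueError('DBInstances must be a list of strings')
    return action, instances
```

A typo such as "Stop" with a capital S then fails loudly instead of silently matching neither branch.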

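Switching on the EventBridge input and tags

Editing the configuration every time a new DB instance is created is a hassle, so tags may be the easier way after all. A requirement like "I start it manually, but please stop it automatically" can also be covered with tags, by setting AutoStart: false and AutoStop: true.

With a single function that receives only the start/stop action from EventBridge and selects the targets by tag, it would look something like this.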

StartStopDBInstance

import boto3

rds = boto3.client('rds')

def lambda_handler(event, context):
    DBInstances = rds.describe_db_instances()['DBInstances']
    for DBInstance in DBInstances:
        DBInstanceIdentifier = DBInstance['DBInstanceIdentifier']
        DBInstanceArn        = DBInstance['DBInstanceArn']
        DBInstanceStatus     = DBInstance['DBInstanceStatus']
        TagList              = rds.list_tags_for_resource(ResourceName=DBInstanceArn)['TagList']
        Tags                 = {tag['Key']: tag['Value'] for tag in TagList}
        if   event['Action'] == 'start' and Tags.get('AutoStart') == 'true' and DBInstanceStatus in ['stopped']:
            response = rds.start_db_instance(DBInstanceIdentifier=DBInstanceIdentifier)
            print(response)
        elif event['Action'] == 'stop'  and Tags.get('AutoStop')  == 'true' and DBInstanceStatus in ['available']:
            response = rds.stop_db_instance(DBInstanceIdentifier=DBInstanceIdentifier)
            print(response)
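
The condition in the handler above can also be separated into a function that makes no AWS calls, which makes the combinations easy to check locally. A sketch under the same tag names (AutoStart, AutoStop); decide is a name made up here.

```python
def decide(action, tags, status):
    """Return 'start', 'stop', or None for one instance."""
    if action == 'start' and tags.get('AutoStart') == 'true' and status == 'stopped':
        return 'start'
    if action == 'stop' and tags.get('AutoStop') == 'true' and status == 'available':
        return 'stop'
    # Untagged instances and instances in any other state are left alone.
    return None
```

For the "start manually, stop automatically" case, an instance tagged AutoStart: false and AutoStop: true gets None for a start event and 'stop' for a stop event while it is available.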