Earlier this year AWS released CloudTrail Lake, a managed data lake for storing and querying CloudTrail audit logs. At first glance the price of $2.5 per GB seems expensive, but it’s actually not, since AWS CloudTrail Lake pricing includes 7 years of storage. As of today some features are still missing, most importantly encryption support with KMS, but if you can live with that, CloudTrail Lake is a good option to explore. In the sections below I’ll show you how to enable and migrate to CloudTrail Lake.
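
To put that price in perspective, here is a quick back-of-the-envelope calculation. It assumes the $2.5 per GB figure is a one-time charge that covers the full 7 years of retention, and it ignores any query charges. The function name is my own and not part of any AWS tooling.

```python
def effective_price_per_gb_month(price_per_gb: float, retention_years: int) -> float:
    """Spread a one-time per-GB price evenly over the retention period."""
    months = retention_years * 12
    return price_per_gb / months


# Figures taken from the paragraph above: $2.5 per GB, 7 years included.
monthly = effective_price_per_gb_month(2.5, 7)
print(f"Effective price: ${monthly:.4f} per GB-month")
```

Spread over 84 months, the charge works out to roughly three cents per GB-month, which is why the headline price is less scary than it looks.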

Console

Enabling CloudTrail Lake through the console is a pretty straightforward exercise.

  1. Log in to the AWS Management Console in your organization’s management account.
  2. Navigate to the CloudTrail console.
  3. From the left menu, select Lake.
  4. Click Event data stores.
  5. Click Create event data store.
  6. In the General details section, configure a name and retention period and click Next.

  7. On the next page, select the event types you want to capture. Check Enable for all accounts in my organization if you want to collect events from all of your AWS accounts, comparable to a CloudTrail organization trail. If you want to import events from an existing CloudTrail trail, check the Copy Trail events checkbox.
  8. In the Management events section, select the management events you want to capture in CloudTrail Lake. Consider excluding KMS and RDS Data API events, as these can add up to quite a few GBs over time.
  9. If you selected the Data events checkbox above, configure the services from which CloudTrail Lake should capture events. The same caveat applies here: data events can generate a very large number of events.
  10. If you selected the Copy Trail events checkbox above, configure the details of the import in the Copy existing trail events section. You can also skip this step and perform the import later by unchecking the box in the Event type section.
  11. Once you are done with the configuration, click Create event data store. Your event data store will be created, and events from your CloudTrail trail will be imported if you selected Copy Trail events.
  12. You can now stop logging events in your CloudTrail trail. Click Trails in the left menu, select your trail and click Stop logging.
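
If you prefer scripting over clicking, the console steps above can be approximated with Boto3. This is a minimal sketch rather than a drop-in tool: the helper names are hypothetical, the selector only covers management events with KMS and RDS Data API excluded, and it assumes you run it with credentials for the management account.

```python
from typing import Any


def build_event_data_store_request(name: str, retention_days: int) -> dict[str, Any]:
    """Assemble keyword arguments for CloudTrail's create_event_data_store call."""
    management_events = {
        "Name": "Management Events",
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Management"]},
            # Mirrors the console advice: skip the noisy KMS and RDS Data API events.
            {"Field": "eventSource", "NotEquals": ["kms.amazonaws.com", "rdsdata.amazonaws.com"]},
        ],
    }
    return {
        "Name": name,
        "RetentionPeriod": retention_days,  # expressed in days
        "MultiRegionEnabled": True,
        "OrganizationEnabled": True,  # "Enable for all accounts in my organization"
        "TerminationProtectionEnabled": True,
        "AdvancedEventSelectors": [management_events],
    }


def create_lake_and_stop_trail(name: str, retention_days: int, trail_name: str) -> str:
    """Create the event data store, then stop logging on the old trail."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("cloudtrail")
    response = client.create_event_data_store(**build_event_data_store_request(name, retention_days))
    client.stop_logging(Name=trail_name)
    return response["EventDataStoreArn"]
```

Keeping the request builder separate from the API call makes it easy to inspect or unit test the configuration before anything is created in your account.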

CloudFormation

You can also configure CloudTrail Lake using AWS CloudFormation. The part that confused me a bit was the configuration of the management and data event filters. To replicate the console behavior, include the following AdvancedEventSelectors in your template.

  • Enable Management Events
  • Exclude KMS Events
  • Exclude RDS Data API Events
AdvancedEventSelectors:
# Enable Management Events
- Name: Management Events
  FieldSelectors:
    - Field: eventCategory
      Equals: 
        - Management
    - Field: eventSource
      NotEquals:
        - kms.amazonaws.com # Exclude KMS Events
        - rdsdata.amazonaws.com # Exclude RDS Data API Events
    # If you want Read & Write events leave the below lines commented out
    # If you want Write events only uncomment the following lines
    # - Field: readOnly
    #   Equals: 
    #     - false
    # If you want Read events only uncomment the following lines
    # - Field: readOnly
    #   Equals: 
    #     - true                
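
The same selector can also be generated programmatically, for example when you feed it to Boto3 instead of a template. The sketch below is an illustration with a hypothetical helper name; it covers the three read/write cases from the commented-out lines above.

```python
from typing import Optional


def management_events_selector(excluded_sources: list[str], read_only: Optional[bool] = None) -> dict:
    """Build one advanced event selector for management events.

    read_only=None keeps read and write events, False keeps write events only,
    True keeps read events only.
    """
    field_selectors = [{"Field": "eventCategory", "Equals": ["Management"]}]
    if excluded_sources:
        field_selectors.append({"Field": "eventSource", "NotEquals": list(excluded_sources)})
    if read_only is not None:
        # The API takes selector values as strings, so pass "true" or "false".
        field_selectors.append({"Field": "readOnly", "Equals": [str(read_only).lower()]})
    return {"Name": "Management Events", "FieldSelectors": field_selectors}


# Write events only, without KMS and RDS Data API events.
selector = management_events_selector(["kms.amazonaws.com", "rdsdata.amazonaws.com"], read_only=False)
print(selector)
```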

To configure Data Event filters you can adapt the following examples.

  • Capture all S3 data events
AdvancedEventSelectors:
- Name: Data Events - S3
  FieldSelectors:
  - Field: eventCategory
    Equals:
      - Data
  - Field: resources.type
    Equals:
      - AWS::S3::Object
  • Capture only S3 data events for a specific S3 bucket
AdvancedEventSelectors:
- Name: Data Events - S3
  FieldSelectors:
  - Field: eventCategory
    Equals:
      - Data
  - Field: resources.type
    Equals:
      - AWS::S3::Object
  - Field: resources.ARN
    StartsWith:
      - arn:<partition>:s3:::<bucket_name>/
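
Both S3 variants differ only in the optional resources.ARN filter, which the following sketch makes explicit. The helper name and the default partition of "aws" are my own assumptions for illustration.

```python
from typing import Optional


def s3_data_events_selector(bucket_name: Optional[str] = None, partition: str = "aws") -> dict:
    """Build an advanced event selector for S3 data events, optionally for one bucket."""
    field_selectors = [
        {"Field": "eventCategory", "Equals": ["Data"]},
        {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
    ]
    if bucket_name is not None:
        # The trailing slash stops the prefix from also matching buckets
        # whose names merely start with bucket_name.
        prefix = f"arn:{partition}:s3:::{bucket_name}/"
        field_selectors.append({"Field": "resources.ARN", "StartsWith": [prefix]})
    return {"Name": "Data Events - S3", "FieldSelectors": field_selectors}


print(s3_data_events_selector())             # all S3 data events
print(s3_data_events_selector("my-bucket"))  # a single bucket only
```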

For a full overview of filter settings go to the official documentation.

For a full example refer to the template below. This template will create:

  • a role that can be used for importing CloudTrail trail events
  • an organization-wide CloudTrail Lake Event Data Store that captures S3, Lambda and DynamoDB data events as well as management events, excluding management events from KMS and the RDS Data API.
AWSTemplateFormatVersion: 2010-09-09
Description: CloudTrail Organizational Lake
Parameters:
  LakeName:
    Type: String
    Default: my-organizational-lake
  LakeRetention:
    Type: Number
    Description: Retention period in days (2557 days is roughly 7 years)
    Default: 2557
  BucketName:
    Type: String
    Default: my-organizational-trail

Resources:
  CloudTrailLakeEventDataStore:
    Type: AWS::CloudTrail::EventDataStore
    Properties: 
      Name: !Ref LakeName
      MultiRegionEnabled: true
      OrganizationEnabled: true
      RetentionPeriod: !Ref LakeRetention
      TerminationProtectionEnabled: true
      AdvancedEventSelectors:
        # Enable Management Events
        - Name: Management Events
          FieldSelectors:
            - Field: eventCategory
              Equals: 
                - Management
            # Exclude KMS & RDS Data Api Events
            - Field: eventSource
              NotEquals:
                - kms.amazonaws.com
                - rdsdata.amazonaws.com
        - Name: Data Events - S3
          FieldSelectors:
            - Field: eventCategory
              Equals:
                - Data
            - Field: resources.type
              Equals:
                - AWS::S3::Object
        - Name: Data Events - Lambda
          FieldSelectors:
            - Field: eventCategory
              Equals:
                - Data
            - Field: resources.type
              Equals:
                - AWS::Lambda::Function
        - Name: Data Events - DynamoDB
          FieldSelectors:
            - Field: eventCategory
              Equals:
                - Data
            - Field: resources.type
              Equals:
                - AWS::DynamoDB::Table

  CloudTrailImportRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: cloudtrail-lake-import
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Sid: AssumeRole
            Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: cloudtrail.amazonaws.com
            Condition: 
              StringEquals:
                "aws:SourceAccount": !Sub ${AWS::AccountId}
              ArnLike:
                "aws:SourceArn": !Sub "arn:aws:cloudtrail:*:${AWS::AccountId}:*"
      Policies:
        - PolicyName: CloudTrailLakeImport
          PolicyDocument:
            Version: 2012-10-17
            Statement:
            - Sid: AWSCloudTrailImportBucketAccess
              Effect: Allow
              Action:
              - s3:ListBucket
              - s3:GetBucketAcl
              Resource:
              - !Sub arn:aws:s3:::${BucketName}
            - Sid: AWSCloudTrailImportGetObject
              Effect: Allow
              Action:
              - s3:GetObject
              Resource:
              - !Sub arn:aws:s3:::${BucketName}
              - !Sub arn:aws:s3:::${BucketName}/*
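
One way to deploy this template is through Boto3's CloudFormation client. The sketch below assumes the template is saved locally and that you deploy from the management account; the function names and the stack name are placeholders of mine. Because the template sets an explicit RoleName, the stack needs the CAPABILITY_NAMED_IAM capability.

```python
def build_create_stack_request(stack_name: str, template_body: str, parameters: dict[str, str]) -> dict:
    """Assemble keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in sorted(parameters.items())
        ],
        # Required because the template creates an IAM role with a custom name.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }


def deploy(stack_name: str, template_path: str, parameters: dict[str, str]) -> str:
    """Read the template from disk and create the stack."""
    import boto3  # imported lazily so the builder above works without AWS access

    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()
    client = boto3.client("cloudformation")
    response = client.create_stack(**build_create_stack_request(stack_name, template_body, parameters))
    return response["StackId"]
```

A call could look like deploy("cloudtrail-lake", "lake.yaml", {"LakeName": "my-organizational-lake", "BucketName": "my-organizational-trail"}). Note that parameter values are always passed as strings, including numeric ones.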

CloudFormation does not yet support configuring the import. You can use Boto3 to start it instead.

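import boto3
from datetime import datetime

client = boto3.client('cloudtrail')
response = client.start_import(
    # ARN of the event data store that receives the imported events
    Destinations=[
        '<event-data-store-arn>',
    ],
    ImportSource={
        'S3': {
            'S3LocationUri': 's3://<bucket-name>',
            'S3BucketRegion': '<region>',
            'S3BucketAccessRoleArn': '<role-arn>'
        }
    },
    # Only events between these two timestamps are imported.
    # Replace the placeholders with integers, e.g. datetime(2022, 1, 1).
    StartEventTime=datetime(<year>, <month>, <day>),
    EndEventTime=datetime(<year>, <month>, <day>),
    # ImportId is omitted on purpose: it is only passed when retrying an import
)
# Keep the generated import ID so you can check on or retry the import later
print(response['ImportId'])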
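
An import runs asynchronously, so you will probably want to check on it afterwards. Below is a minimal polling sketch built on CloudTrail's get_import call; the helper names, the poll interval and the poll limit are arbitrary choices of mine, not recommendations from AWS.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED"}


def wait_for_import(get_status: Callable[[], str], poll_seconds: float = 30.0, max_polls: int = 120) -> str:
    """Poll get_status until the import reaches a terminal state or polls run out."""
    status = get_status()
    for _ in range(max_polls - 1):
        if status in TERMINAL_STATUSES:
            break
        time.sleep(poll_seconds)
        status = get_status()
    return status


def cloudtrail_import_status(import_id: str) -> Callable[[], str]:
    """Return a callable that reads ImportStatus through CloudTrail's get_import."""
    import boto3  # imported lazily so wait_for_import can be tested without AWS access

    client = boto3.client("cloudtrail")
    return lambda: client.get_import(ImportId=import_id)["ImportStatus"]
```

With the import ID returned by start_import, wait_for_import(cloudtrail_import_status(import_id)) returns the last status it observed, which is still a non-terminal one if the poll limit was reached first.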

Photo by Ryan Bahm on Unsplash
