# Architecture details
<a name="architecture-details"></a>

This section describes the components and AWS services that make up this solution and explains how these components work together.

## AWS services in this solution
<a name="aws-services-in-this-solution"></a>


| AWS service | Description | 
| --- | --- | 
|   [AWS WAF](https://aws.amazon.com/waf/)   |   **Core**. Deploys an AWS WAF web ACL, AWS Managed Rules rule groups, custom rules, and IP sets. Makes AWS WAF API calls to block common attacks and secure web applications.  | 
|   [Amazon Data Firehose](https://aws.amazon.com/kinesis/data-firehose/)   |   **Core**. Delivers AWS WAF logs to Amazon S3 buckets.  | 
|   [Amazon S3](https://aws.amazon.com/s3/)   |   **Core**. Stores AWS WAF, CloudFront, and ALB logs.  | 
|   [AWS Lambda](https://aws.amazon.com/lambda/)   |   **Core**. Deploys multiple Lambda functions to support custom rules.  | 
|   [Amazon EventBridge](https://aws.amazon.com/eventbridge/)   |   **Core**. Creates event rules to invoke Lambda functions.  | 
|   [Amazon Athena](https://aws.amazon.com/athena/)   |   **Supporting**. Creates Athena queries and work groups to support the Athena log parser.  | 
|   [AWS Glue](https://aws.amazon.com/glue/)   |   **Supporting**. Creates databases and tables to support the Athena log parser.  | 
|   [Amazon SNS](https://aws.amazon.com/sns/)   |   **Supporting**. Sends Amazon Simple Notification Service (Amazon SNS) email notifications to support IP retention on allowed and denied lists.  | 
|   [AWS Systems Manager](https://aws.amazon.com/systems-manager/)   |   **Supporting**. Provides application-level resource monitoring and visualization of resource operations and cost data.  | 

# Log parser options
<a name="log-parser-options"></a>

As described in the [Architecture overview](architecture-overview.md), there are three options to handle HTTP flood and scanner and probe protections. The following sections explain each of these options in more detail.

## AWS WAF rate-based rule
<a name="aws-waf-rate-based-rule"></a>

Rate-based rules are available for HTTP flood protection. By default, a rate-based rule aggregates and rate limits requests based on the request IP address. This solution allows you to specify the number of web requests allowed from a client IP address in a trailing, continuously updated five-minute period. If an IP address breaches the configured quota, AWS WAF blocks new requests from that address until the request rate falls below the configured quota.

We recommend selecting the rate-based rule option if the request quota is more than 2,000 requests per five minutes and you don’t need to implement customizations, such as excluding static resource access when counting requests.

You can further configure the rule to use various other aggregation keys and key combinations. For more information, see [Aggregation options and keys](https://docs.aws.amazon.com/waf/latest/developerguide/waf-rule-statement-type-rate-based-aggregation-options.html).
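
The trailing five-minute window described above can be pictured with a small model. The following is a minimal sketch for illustration only: the class name `SlidingWindowLimiter` and its methods are hypothetical, and it does not represent how AWS WAF implements rate-based rules internally.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300  # five-minute trailing window, as described above


class SlidingWindowLimiter:
    """Illustrative model of a per-IP trailing-window quota (hypothetical)."""

    def __init__(self, quota, window=WINDOW_SECONDS):
        self.quota = quota
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now):
        """Record a request at time `now` (seconds) and report whether it passes."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the trailing window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        hits.append(now)
        # The request passes only while the windowed count stays within the quota.
        return len(hits) <= self.quota


limiter = SlidingWindowLimiter(quota=3)
decisions = [limiter.allow("198.51.100.7", t) for t in (0, 1, 2, 3, 301)]
```

In this model, the address is blocked on its fourth request and allowed again once earlier requests age out of the window.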

## Amazon Athena log parser
<a name="amazon-athena-log-parser"></a>

Both the **HTTP Flood Protection** and **Scanner & Probe Protection** template parameters provide the Athena log parser option. When activated, CloudFormation provisions an Athena query and a scheduled Lambda function that orchestrates the Athena run, processes the result output, and updates AWS WAF. This Lambda function is invoked by a CloudWatch event configured to run every five minutes. You can change this schedule with the **Athena Query Run Time Schedule** parameter.

We recommend selecting this option when you can’t use AWS WAF rate-based rules and you have familiarity with SQL to implement customizations. For more information about how to change the default query, refer to [View Amazon Athena queries](view-amazon-athena-queries.md).

HTTP flood protection is based on AWS WAF access log processing and uses WAF log files. The WAF access log type has a lower lag time, which you can use to identify HTTP flood origins more quickly when compared to CloudFront or ALB log delivery time. However, you must select the CloudFront or ALB log type in the **Activate Scanner & Probe Protection** template parameter to receive response status codes.

**Note**  
If a bad bot bypasses the honeypot and directly interacts with ALB or CloudFront, the system detects malicious behavior through log analysis unless both HTTP Flood Protection and Scanner & Probe Protection are not using the Lambda log parser.

## AWS Lambda log parser
<a name="aws-lambda-log-parser"></a>

The **HTTP Flood Protection** and **Scanner & Probe Protection** template parameters provide the **AWS Lambda Log Parser** option. Use the Lambda log parser only when the **AWS WAF rate-based rule** and **Amazon Athena log parser** options aren’t available. A known limitation of this option is that information is processed within the context of the file being processed. For example, an IP might generate more requests or errors than the defined quota, but because this information is split into different files, each file doesn’t store enough data to exceed the quota.

**Note**  
Additionally, if a bad bot bypasses the honeypot and interacts directly with ALB or CloudFront, detection relies on the chosen log parser option to effectively identify and block malicious activity.
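
The per-file limitation can be illustrated with a short example. This is a sketch under stated assumptions: the function names and the list-of-IPs input shape are hypothetical and are not taken from the solution's `Log Parser` code.

```python
from collections import Counter


def offenders_per_file(files, quota):
    """Evaluate each log file in isolation, as the limitation describes."""
    flagged = set()
    for ips in files:
        counts = Counter(ips)
        flagged |= {ip for ip, n in counts.items() if n > quota}
    return flagged


def offenders_aggregated(files, quota):
    """Evaluate all files together, for comparison."""
    counts = Counter()
    for ips in files:
        counts.update(ips)
    return {ip for ip, n in counts.items() if n > quota}


# One IP sends 120 requests, split evenly across two log files.
files = [["198.51.100.7"] * 60, ["198.51.100.7"] * 60]
per_file = offenders_per_file(files, quota=100)
combined = offenders_aggregated(files, quota=100)
```

Here `per_file` is empty because neither file alone exceeds the quota, while `combined` contains the address.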

# Component details
<a name="component-details"></a>

As described in the [Architecture diagram](architecture-overview.md#architecture-diagram), four of this solution’s components use automations to inspect IP addresses and add them to the AWS WAF block list. The following sections explain each of these components in more detail.

## Log parser - Application
<a name="log-parser--application"></a>

The Application log parser helps protect against scanners and probes.

 **Application log parser flow.** 

![\[app log parser flow\]](http://docs.aws.amazon.com/solutions/latest/security-automations-for-aws-waf/images/app-log-parser-flow.png)


1. When CloudFront or an ALB receives requests on behalf of your web application, it sends access logs to an Amazon S3 bucket.

   1. (Optional) If you select `Yes - Amazon Athena log parser` for the template parameters **Activate HTTP Flood Protection** and **Activate Scanner & Probe Protection**, a Lambda function moves access logs from their original folder *<customer-bucket>* `/AWSLogs` to a newly partitioned folder *<customer-bucket>* `/AWSLogs-partitioned/` *<optional-prefix>* `/year=` *<YYYY>* `/month=` *<MM>* `/day=` *<DD>* `/hour=` *<HH>*/ upon their arrival in Amazon S3.

   1. (Optional) If you select `yes` for the **Keep Data in Original S3 location** template parameter, logs remain in their original location and are copied to their partitioned folder, duplicating your log storage.
**Note**  
For the Athena log parser, this solution only partitions new logs that arrive in your Amazon S3 bucket after you deploy this solution. If you have existing logs that you want to partition, you must manually upload those logs to Amazon S3 after you deploy this solution.

1. Based on your selection for the template parameters **Activate HTTP Flood Protection** and **Activate Scanner & Probe Protection**, this solution processes logs using one of the following:

   1.  **Lambda** - Each time a new access log is stored in the Amazon S3 bucket, the `Log Parser` Lambda function is initiated.

   1.  **Athena** - By default, every five minutes the **Scanner & Probe Protection** Athena query runs, and the output is pushed to AWS WAF. This process is initiated by a CloudWatch event, which starts the Lambda function responsible for running the Athena query and pushing the result into AWS WAF.

1. The solution analyzes the log data to identify IP addresses that generated more errors than the defined quota. The solution then updates an AWS WAF IP set condition to block those IP addresses for a customer-defined period of time.
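
The final step above, counting errors per IP address and selecting the addresses that exceed the quota, might look like the following sketch. The record shape `(ip, status)`, the function name, and the choice to treat 4xx status codes as errors are assumptions made for illustration.

```python
from collections import Counter


def ips_over_error_quota(records, error_quota,
                         is_error=lambda status: 400 <= status < 500):
    """Return the IPs whose error count exceeds the quota.

    `records` is an iterable of (ip, status_code) pairs already parsed from
    access logs. Treating 4xx responses as errors is an assumption.
    """
    errors = Counter(ip for ip, status in records if is_error(status))
    return sorted(ip for ip, n in errors.items() if n > error_quota)


records = (
    [("203.0.113.5", 404)] * 3
    + [("203.0.113.9", 404), ("203.0.113.9", 200)]
)
to_block = ips_over_error_quota(records, error_quota=2)
```

Passing `is_error` as a parameter keeps the error definition adjustable without changing the counting logic.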

## Log parser - AWS WAF
<a name="log-parser--aws-waf"></a>

If you select `yes - AWS Lambda log parser` or `yes - Amazon Athena log parser` for **Activate HTTP Flood Protection**, this solution provisions the following components, which parse AWS WAF logs to identify and block origins that flood the endpoint with a request rate greater than the quota you defined.

 **AWS WAF log parser flow.** 

![\[waf log parser flow\]](http://docs.aws.amazon.com/solutions/latest/security-automations-for-aws-waf/images/waf-log-parser-flow.png)


1. When AWS WAF receives access logs, it sends the logs to a Firehose endpoint. Firehose then delivers the logs to a partitioned bucket in Amazon S3 named *<customer-bucket>* `/AWSLogs/` *<optional-prefix>* `/year=` *<YYYY>* `/month=` *<MM>* `/day=` *<DD>* `/hour=` *<HH>* `/` 

1. Based on your selection for the template parameters **Activate HTTP Flood Protection** and **Activate Scanner & Probe Protection**, this solution processes logs using one of the following:

   1.  **Lambda**: Each time a new access log is stored in the Amazon S3 bucket, the `Log Parser` Lambda function is initiated.

   1.  **Athena:** By default, every five minutes the scanner and probe Athena query runs, and the output is pushed to AWS WAF. This process is initiated by an Amazon CloudWatch event, which starts the Lambda function responsible for running the Amazon Athena query and pushing the result into AWS WAF.

1. The solution analyzes the log data to identify IP addresses that sent more requests than the defined quota. The solution then updates an AWS WAF IP set condition to block those IP addresses for a customer-defined period of time.
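
The hourly partition layout named in the first step can be sketched as a key-prefix builder. The function name is hypothetical, and the zero-padded month, day, and hour values are an assumption inferred from the *<MM>*, *<DD>*, and *<HH>* placeholders.

```python
from datetime import datetime, timezone


def partition_prefix(ts, optional_prefix=""):
    """Build the hourly partition prefix for a log timestamp (illustrative)."""
    parts = ["AWSLogs"]
    if optional_prefix:
        parts.append(optional_prefix.strip("/"))
    parts += [
        f"year={ts:%Y}",
        f"month={ts:%m}",
        f"day={ts:%d}",
        f"hour={ts:%H}",
    ]
    return "/".join(parts) + "/"


ts = datetime(2024, 3, 7, 9, 30, tzinfo=timezone.utc)
prefix = partition_prefix(ts)
prefixed = partition_prefix(ts, optional_prefix="waf")
```

Partitioning by hour lets a query restrict its scan to the most recent partitions instead of reading every log object in the bucket.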

## Log parser - Bad bot
<a name="log-parser--badbot"></a>

The Bad bot log parser inspects requests to the honeypot endpoint to extract their source IP address.

 **Bad bot log parser flow.** 

![\[badbot log parser flow\]](http://docs.aws.amazon.com/solutions/latest/security-automations-for-aws-waf/images/badbot-log-parser-flow.png)


1. If `Bad Bot Protection` is activated and both the HTTP Flood Protection and Scanner & Probe Protection features are disabled, the solution uses the Lambda log parser, which logs only bad bot requests based on [WAF label filters](https://docs.aws.amazon.com/waf/latest/developerguide/waf-labels.html).

1. The Lambda function intercepts and inspects request headers to extract the IP address of the source that accessed the trap endpoint.

1. The solution analyzes the log data to identify IP addresses that sent more requests than the defined quota. The solution then updates an AWS WAF IP set condition to block those IP addresses for a customer-defined period of time.
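
Extracting a source IP address from request headers, as in the second step, could be sketched as follows. This is not the solution's actual Lambda code: the function name, the use of the `X-Forwarded-For` header, and the `fallback` argument are assumptions for illustration.

```python
import ipaddress


def source_ip_from_headers(headers, fallback=None):
    """Return the first valid IP from X-Forwarded-For, else the fallback.

    Header choice and fallback behavior are assumptions. Which entry to trust
    depends on the proxies in front of the endpoint.
    """
    normalized = {key.lower(): value for key, value in headers.items()}
    forwarded = normalized.get("x-forwarded-for", "")
    candidate = forwarded.split(",")[0].strip() or fallback
    if candidate is None:
        return None
    try:
        # Validate and normalize the address before it is added to an IP set.
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        return None


ip = source_ip_from_headers({"X-Forwarded-For": "203.0.113.10, 10.0.0.1"})
```

Validating the value first avoids writing malformed entries to the block list.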

## IP lists parser
<a name="ip-lists-parser"></a>

The `IP Lists Parser` Lambda function helps protect against known attackers identified in third-party IP reputation lists.

 **IP reputation lists parser flow.** 

![\[ip reputation lists flow\]](http://docs.aws.amazon.com/solutions/latest/security-automations-for-aws-waf/images/ip-reputation-lists-flow.png)


1. An hourly Amazon CloudWatch event invokes the `IP Lists Parser` Lambda function.

1. The Lambda function gathers and parses data from three sources:
   + Spamhaus DROP and EDROP lists
   + Proofpoint Emerging Threats IP list
   + Tor exit node list

1. The Lambda function updates the AWS WAF block list with the current IP addresses.
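
The gather-and-parse step can be sketched as a text parser that extracts CIDR ranges and merges them. The input format (one address or range per line, with `;` or `#` starting a comment) is an assumption about plain-text reputation lists, and the function names are hypothetical.

```python
import ipaddress


def parse_ip_list(text):
    """Extract networks from a plain-text list, skipping comments and junk."""
    networks = []
    for line in text.splitlines():
        # Strip trailing comments; both ';' and '#' are assumed comment markers.
        entry = line.split(";", 1)[0].split("#", 1)[0].strip()
        if not entry:
            continue
        try:
            networks.append(ipaddress.ip_network(entry, strict=False))
        except ValueError:
            continue  # ignore lines that are not addresses or CIDR ranges
    return networks


def merged_cidrs(networks):
    """Collapse adjacent or overlapping ranges, IPv4 and IPv6 separately."""
    v4 = [n for n in networks if n.version == 4]
    v6 = [n for n in networks if n.version == 6]
    merged = list(ipaddress.collapse_addresses(v4))
    merged += list(ipaddress.collapse_addresses(v6))
    return [str(n) for n in merged]


sample = """; sample list
192.0.2.0/25 ; example-id
192.0.2.128/25
# note
bogus
198.51.100.7
"""
cidrs = merged_cidrs(parse_ip_list(sample))
```

Collapsing ranges before updating the block list reduces the number of entries written to the IP set.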