
Commit b099086

Add Lambda function for the Amazon Security Lake integration (#189)
* Migrate from #147

* Update amazon-security-lake integration

  - Improved documentation.
  - Python code has been moved to `wazuh-indexer/integrations/amazon-security-lake/src`.
  - Development environment now uses OpenSearch 2.12.0.
  - The `wazuh.integration.security.lake` container now displays logs by watching Logstash's log file.
  - [**NEEDS FIX**] As a temporary solution, the `INDEXER_USERNAME` and `INDEXER_PASSWORD` values have been added as environment variables to the `wazuh.integration.security.lake` container. These values should be set at the Dockerfile level, but that isn't working, probably due to a permission denied error on invocation of the `setup.sh` script.
  - [**NEEDS FIX**] As a temporary solution, the output file of the `indexer-to-file` pipeline has been moved to `/var/log/logstash/indexer-to-file`. The previous path, `/usr/share/logstash/pipeline/indexer-to-file.json`, results in permission denied.
  - [**NEEDS FIX**] As a temporary solution, the input.opensearch.query has been replaced with `match_all`, as the previous one does not return any data, probably due to the use of time filters (`gt: now-1m`).
  - Standard output enabled for `/usr/share/logstash/pipeline/indexer-to-file.json`.
  - [**NEEDS FIX**] ECS compatibility disabled: `echo "pipeline.ecs_compatibility: disabled" >> /etc/logstash/logstash.yml` (to be included automatically).
  - Python3 environment path added to the `indexer-to-integrator` pipeline.

* Disable ECS compatibility (auto)

  - Adds `pipeline.ecs_compatibility: disabled` at the Dockerfile level.
  - Removes `INDEXER_USERNAME` and `INDEXER_PASSWORD` as environment variables on the `wazuh.integration.security.lake` container.

* Add @timestamp field to sample alerts
* Fix Logstash pipelines
* Add working indexer-to-s3 pipeline
* Add working Python script up to S3 upload
* Add latest changes
* Remove duplicated line
* Add working environment with minimal AWS Lambda function
* Mount src folder to Lambda's workdir
* Add first functional Lambda function. Tested on a local environment, using S3 Ninja and a Lambda container.
* Working state
* Add documentation
* Improve code
* Improve code
* Clean up
* Add instructions to build a deployment package
* Make zip file lighter
* Use default name for aws_region
* Add destination bucket validation
* Add env var validation and full destination S3 path
* Add AWS_ENDPOINT environment variable
* Rename AWS_DEFAULT_REGION
* Remove unused env vars
* Remove unused file and improve documentation a bit
* Makefile improvements
* Use dummy env variables

---------

Signed-off-by: Álex Ruiz <[email protected]>
1 parent 0396e88 commit b099086

24 files changed (+481 −329 lines)

.gitignore (+6)

@@ -1,5 +1,11 @@
 # build files
 artifacts/
+*.deb
+*.rpm
+*.zip
+*.tar.gz
+
+integrations/amazon-security-lake/package

 .java
 .m2

docker/dev/dev.yml (+1 −1)

@@ -5,7 +5,7 @@ services:
     image: wi-dev:${VERSION}
     container_name: wi-dev_${VERSION}
     build:
-      context: ./../..
+      context: ${REPO_PATH}
       dockerfile: ${REPO_PATH}/docker/dev/images/Dockerfile
     ports:
       # OpenSearch REST API
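With the build context now taken from `${REPO_PATH}`, that variable has to be defined (together with `VERSION`) when bringing the development environment up. A minimal sketch, assuming the repository root as the context and an illustrative version tag:

```console
export REPO_PATH=$(pwd)
export VERSION=2.12.0   # illustrative value; use the tag you are building
docker compose -f docker/dev/dev.yml up -d
```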

integrations/README.md (+58 −23)

@@ -1,58 +1,93 @@
 ## Wazuh indexer integrations

 This folder contains integrations with third-party XDR, SIEM and cybersecurity software.
 The goal is to transport Wazuh's analysis to the platform that suits your needs.

 ### Amazon Security Lake

 Amazon Security Lake automatically centralizes security data from AWS environments, SaaS providers,
 on premises, and cloud sources into a purpose-built data lake stored in your account. With Security Lake,
 you can get a more complete understanding of your security data across your entire organization. You can
 also improve the protection of your workloads, applications, and data. Security Lake has adopted the
 Open Cybersecurity Schema Framework (OCSF), an open standard. With OCSF support, the service normalizes
 and combines security data from AWS and a broad range of enterprise security data sources.

-##### Usage
+#### Development guide

 A demo of the integration can be started using the content of this folder and Docker.

 ```console
 docker compose -f ./docker/amazon-security-lake.yml up -d
 ```

-This docker compose project will bring a *wazuh-indexer* node, a *wazuh-dashboard* node,
-a *logstash* node and our event generator. On the one hand, the event generator will push events
-constantly to the indexer, on the `wazuh-alerts-4.x-sample` index by default (refer to the [events
-generator](./tools/events-generator/README.md) documentation for customization options).
-On the other hand, logstash will constantly query for new data and deliver it to the integration
-Python program, also present in that node. Finally, the integration module will prepare and send the
-data to the Amazon Security Lake's S3 bucket.
+This docker compose project will bring up a _wazuh-indexer_ node, a _wazuh-dashboard_ node,
+a _logstash_ node, our event generator and an AWS Lambda Python container. On the one hand, the event generator will push events
+constantly to the indexer, to the `wazuh-alerts-4.x-sample` index by default (refer to the [events
+generator](./tools/events-generator/README.md) documentation for customization options).
+On the other hand, logstash will constantly query for new data and deliver it to the output configured in the
+pipeline, which can be either `indexer-to-s3` or `indexer-to-file`.
+
+The `indexer-to-s3` pipeline is the method used by the integration. This pipeline delivers
+the data to an S3 bucket, from which the data is processed using a Lambda function, to finally
+be sent to the Amazon Security Lake bucket in Parquet format.
+
 <!-- TODO continue with S3 credentials setup -->

 Attach a terminal to the container and start the integration by starting Logstash, as follows:

 ```console
-/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-integrator.conf --path.settings /etc/logstash
+/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
 ```

-Unprocessed data can be sent to a file or to an S3 bucket.
-```console
-/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-file.conf --path.settings /etc/logstash
-/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
+After 5 minutes, the first batch of data will show up at http://localhost:9444/ui/wazuh-indexer-aux-bucket.
+You'll need to invoke the Lambda function manually, selecting the log file to process.
+
+```bash
+export AUX_BUCKET=wazuh-indexer-aux-bucket
+
+bash amazon-security-lake/src/invoke-lambda.sh <file>
 ```

-All three pipelines are configured to fetch the latest data from the *wazuh-indexer* every minute. In
-the case of `indexer-to-file`, the data is written at the same pace, whereas `indexer-to-s3`, data
-is uploaded every 5 minutes.
+Processed data will be uploaded to http://localhost:9444/ui/wazuh-indexer-amazon-security-lake-bucket. Click on any file to download it,
+and check its content using `parquet-tools`. Just make sure to install the virtual environment first, through [requirements.txt](./amazon-security-lake/).

-For development or debugging purposes, you may want to enable hot-reload, test or debug on these files,
+```bash
+parquet-tools show <parquet-file>
+```
+
+Bucket names can be configured by editing the [amazon-security-lake.yml](./docker/amazon-security-lake.yml) file.
+
+For development or debugging purposes, you may want to enable hot-reload, test or debug on these files,
 by using the `--config.reload.automatic`, `--config.test_and_exit` or `--debug` flags, respectively.

 For production usage, follow the instructions in our documentation page about this matter.
 (_when-its-done_)

 As a last note, we would like to point out that we also use this Docker environment for development.

+#### Deployment guide
+
+- Create one S3 bucket to store the raw events, for example: `wazuh-security-lake-integration`.
+- Create a new AWS Lambda function.
+- Create an IAM role with access to the S3 bucket created above.
+- Select Python 3.12 as the runtime.
+- Configure the runtime to have 512 MB of memory and a 30-second timeout.
+- Configure an S3 trigger so that every object created in the bucket with the `.txt` extension invokes the Lambda.
+- Run `make` to generate a zip deployment package, or create it manually as per the [AWS Lambda documentation](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-create-dependencies).
+- Upload the zip package to the bucket. Then, upload it to the Lambda function from S3 as per these instructions: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-package.html#gettingstarted-package-zip
+- Create a Custom Source within Security Lake for the Wazuh Parquet files as per the following guide: https://docs.aws.amazon.com/security-lake/latest/userguide/custom-sources.html
+- Set the **AWS account ID** for the Custom Source's **AWS account with permission to write data**.
+
+<!-- TODO Configure AWS Lambda Environment Variables /-->
+<!-- TODO Install and configure Logstash /-->
+
+The instructions in this section are based on the following AWS tutorials and documentation.
+
+- [Tutorial: Using an Amazon S3 trigger to create thumbnail images](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-tutorial.html)
+- [Tutorial: Using an Amazon S3 trigger to invoke a Lambda function](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html)
+- [Working with .zip file archives for Python Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html)
+- [Best practices for working with AWS Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)
+
 ### Other integrations

 TBD
Makefile (new file, +28)

@@ -0,0 +1,28 @@
+
+ZIP_NAME = wazuh_to_amazon_security_lake
+TARGET = package
+SRC = src
+
+# Main target
+.ONESHELL:
+$(ZIP_NAME).zip: $(TARGET) $(SRC)/lambda_function.py $(SRC)/wazuh_ocsf_converter.py
+	@cd $(TARGET)
+	@zip -r ../$(ZIP_NAME).zip .
+	@cd ../$(SRC)
+	@zip ../$@ lambda_function.py wazuh_ocsf_converter.py
+	@zip ../$@ models -r
+
+$(TARGET):
+	docker run -v `pwd`:/src -w /src \
+		python:3.12 \
+		pip install \
+		--platform manylinux2014_x86_64 \
+		--target=$(TARGET) \
+		--implementation cp \
+		--python-version 3.12 \
+		--only-binary=:all: --upgrade \
+		-r requirements.aws.txt
+
+clean:
+	@rm -rf $(TARGET)
+	@py3clean .
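The Makefile resolves the dependencies inside a `python:3.12` container (so the wheels match the `manylinux2014_x86_64` Lambda runtime) and then zips them together with the function sources. A typical invocation, assuming the Makefile sits in the integration folder and Docker plus `zip` are available locally, would look like:

```console
cd integrations/amazon-security-lake
make            # builds package/ in a python:3.12 container and produces wazuh_to_amazon_security_lake.zip
make clean      # removes package/ and compiled Python files
```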
aws-lambda.dockerfile (new file, +17)

@@ -0,0 +1,17 @@
+# docker build --platform linux/amd64 --no-cache -f aws-lambda.dockerfile -t docker-image:test .
+# docker run --platform linux/amd64 -p 9000:8080 docker-image:test
+
+# FROM public.ecr.aws/lambda/python:3.9
+FROM amazon/aws-lambda-python:3.12
+
+# Copy requirements.txt
+COPY requirements.aws.txt ${LAMBDA_TASK_ROOT}
+
+# Install the specified packages
+RUN pip install -r requirements.aws.txt
+
+# Copy function code
+COPY src ${LAMBDA_TASK_ROOT}
+
+# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
+CMD [ "lambda_function.lambda_handler" ]
integrations/amazon-security-lake/src/invoke-lambda.sh (new file, +42)

@@ -0,0 +1,42 @@
+#!/bin/bash
+
+export AUX_BUCKET=wazuh-indexer-aux-bucket
+
+curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{
+  "Records": [
+    {
+      "eventVersion": "2.0",
+      "eventSource": "aws:s3",
+      "awsRegion": "us-east-1",
+      "eventTime": "1970-01-01T00:00:00.000Z",
+      "eventName": "ObjectCreated:Put",
+      "userIdentity": {
+        "principalId": "AIDAJDPLRKLG7UEXAMPLE"
+      },
+      "requestParameters": {
+        "sourceIPAddress": "127.0.0.1"
+      },
+      "responseElements": {
+        "x-amz-request-id": "C3D13FE58DE4C810",
+        "x-amz-id-2": "FMyUVURIY8/IgAtTv8xRjskZQpcIZ9KG4V5Wp6S7S/JRWeUWerMUE5JgHvANOjpD"
+      },
+      "s3": {
+        "s3SchemaVersion": "1.0",
+        "configurationId": "testConfigRule",
+        "bucket": {
+          "name": "'"${AUX_BUCKET}"'",
+          "ownerIdentity": {
+            "principalId": "A3NL1KOZZKExample"
+          },
+          "arn": "'"arn:aws:s3:::${AUX_BUCKET}"'"
+        },
+        "object": {
+          "key": "'"${1}"'",
+          "size": 1024,
+          "eTag": "d41d8cd98f00b204e9800998ecf8427e",
+          "versionId": "096fKKXTRTtl3on89fVO.nfljtsv6qko"
+        }
+      }
+    }
+  ]
+}'
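The script posts a synthetic `ObjectCreated:Put` event to the Lambda Runtime Interface Emulator that the `amazon/aws-lambda-python` container exposes on port 9000; the first positional argument becomes the `object.key` of the event, i.e. the raw log file to process. A usage sketch, with a placeholder key:

```console
bash amazon-security-lake/src/invoke-lambda.sh "<raw-log-object-key>.txt"
```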

integrations/amazon-security-lake/logstash/pipeline/indexer-to-integrator.conf (−33)

This file was deleted.

integrations/amazon-security-lake/logstash/pipeline/indexer-to-s3.conf (+8 −8)

@@ -10,12 +10,12 @@ input {
       "query": {
         "range": {
           "@timestamp": {
-            "gt": "now-1m"
+            "gt": "now-5m"
           }
         }
       }
     }'
-    schedule => "5/* * * * *"
+    schedule => "*/5 * * * *"
   }
 }

@@ -26,15 +26,15 @@ output {
   }
   s3 {
     id => "output.s3"
-    access_key_id => "${AWS_KEY}"
-    secret_access_key => "${AWS_SECRET}"
+    access_key_id => "${AWS_ACCESS_KEY_ID}"
+    secret_access_key => "${AWS_SECRET_ACCESS_KEY}"
     region => "${AWS_REGION}"
-    endpoint => "http://s3.ninja:9000"
-    bucket => "${AWS_BUCKET}"
-    codec => "json"
+    endpoint => "${AWS_ENDPOINT}"
+    bucket => "${AUX_BUCKET}"
+    codec => "json_lines"
     retry_count => 0
     validate_credentials_on_root_bucket => false
-    prefix => "%{+YYYY}/%{+MM}/%{+dd}"
+    prefix => "%{+YYYY}%{+MM}%{+dd}"
     server_side_encryption => true
     server_side_encryption_algorithm => "AES256"
     additional_settings => {
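The pipeline now reads its credentials and destination entirely from the environment, so those variables must exist in the Logstash process before the pipeline starts (in the demo they would typically be set on the Logstash container). A sketch of the required exports, with placeholder values taken from elsewhere in this commit:

```console
export AWS_ACCESS_KEY_ID="<access-key>"
export AWS_SECRET_ACCESS_KEY="<secret-key>"
export AWS_REGION="us-east-1"
export AWS_ENDPOINT="http://s3.ninja:9000"   # local S3 Ninja endpoint; omit for real AWS
export AUX_BUCKET="wazuh-indexer-aux-bucket"
```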
integrations/amazon-security-lake/requirements.aws.txt (new file, +2)

@@ -0,0 +1,2 @@
+pyarrow>=10.0.1
+pydantic>=2.6.1
integrations/amazon-security-lake/requirements.txt (+1 −1)

@@ -1,4 +1,4 @@
 pyarrow>=10.0.1
 parquet-tools>=0.2.15
-pydantic==2.6.1
+pydantic>=2.6.1
 boto3==1.34.46
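Since the README relies on `parquet-tools` from this requirements file to inspect the Parquet output, the virtual environment it mentions can be prepared roughly as follows (paths assumed relative to the integration folder):

```console
cd integrations/amazon-security-lake
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
parquet-tools show <parquet-file>
```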
