Commit bb101b5

Initialize
1 parent 501852f commit bb101b5

42 files changed (+3136 -7 lines)

.gitattributes

+6
@@ -0,0 +1,6 @@
1+
#
2+
# https://help.github.com/articles/dealing-with-line-endings/
3+
#
4+
# These are explicitly windows files and should use crlf
5+
*.bat text eol=crlf
6+

.gitignore

+6
@@ -0,0 +1,6 @@
1+
# Ignore Gradle project-specific cache directory
2+
.gradle
3+
.idea
4+
# Ignore Gradle build output directory
5+
build
6+
out

README.md

+122-7
@@ -1,17 +1,132 @@
1-
## My Project
1+
### Macie Finding Data Reveal
2 2

3-
TODO: Fill this README out!
3+
This project contains a command line utility to help you analyze Macie findings. Macie generates
4+
sensitive data findings when it discovers sensitive data in S3 objects that you configure a
5+
sensitive data discovery job to analyze. The finding includes [locators] that point to where the
6+
specific sensitive data was observed. The operator can follow these pointers to see what Macie saw
7+
in the object. This follow-up helps the operator (usually a security engineer) decide what to do
8+
next with the specific finding. The CLI in this package automates the manual work involved there.
4 9

5-
Be sure to:
10+
[locators]: https://docs.aws.amazon.com/macie/latest/user/findings-locate-sd.html
6 11

7-
* Change the title in this README
8-
* Edit your repository description on GitHub
12+
For example, say you're looking at a finding like the following:
13+
14+
![](finding.png)
15+
16+
Often the next step, such as remediation or confirmation of security controls (like encryption or
17+
access logging), depends on determining the accuracy and severity of the finding. Macie shows
18+
the occurrences, which are often sufficient to make that decision.
19+
20+
![](occurrences.png)
21+
22+
If these pointers are not enough for a particular finding and you need to see the exact data Macie
23+
saw to generate this finding, this is the tool for you.
24+
25+
### Build and Install
26+
27+
This is a Gradle Kotlin project. To build, you need Java 11:
28+
29+
```bash
30+
> git clone https://github.com/aws-samples/amazon-macie-finding-data-reveal
31+
> cd amazon-macie-finding-data-reveal
32+
33+
> ./gradlew build
34+
```
35+
36+
The build produces an executable jar that you can run with Java 11. For convenience you may want to
37+
define an alias:
38+
39+
```bash
40+
> alias reveal="java -jar ${PWD}/reveal/build/libs/reveal-executable.jar"
41+
```
42+
43+
### Usage
44+
45+
The tool makes API calls to Macie and S3, so you need to configure [credentials] as you would for
46+
the AWS CLI.
47+
48+
[credentials]: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html
49+
50+
```bash
51+
# Locate where your finding and object are
52+
> export AWS_REGION=us-east-1
53+
54+
# Pick the Finding ID from the console and reveal it (all values below are fake)
55+
> reveal 8db5d79296b57dade4abeb2b9a5a8797
56+
┌────────┬────────────────────────────────────────────────────────┐
57+
│ Object │ s3://DOC-EXAMPLE-BUCKET/mock-data/json/50169671.json │
58+
├────────┼────────────────────────────────────────────────────────┤
59+
│ Mime │ application/json │
60+
├────────┼────────────────────────────────────────────────────────┤
61+
│ Count │ 493 │
62+
└────────┴────────────────────────────────────────────────────────┘
63+
┌──────────────┐
64+
│ PHONE_NUMBER │
65+
├──────────────┤
66+
│ 555-0100 │
67+
├──────────────┤
68+
│ 555-0100 │
69+
├──────────────┤
70+
│ 555-0100 │
71+
└──────────────┘
72+
┌────────────────────┐
73+
│ NAME │
74+
├────────────────────┤
75+
│ Alejandro Rosalez │
76+
├────────────────────┤
77+
│ Diego Ramirez │
78+
├────────────────────┤
79+
│ Martha Rivera │
80+
└────────────────────┘
81+
┌────────────────────────┐
82+
│ ADDRESS │
83+
├────────────────────────┤
84+
│ 12 Any Street Any Town │
85+
├────────────────────────┤
86+
│ 34 Any Street Any Town │
87+
├────────────────────────┤
88+
│ 11 Any Street Any Town │
89+
└────────────────────────┘
90+
```
91+
92+
### Can I Reveal all findings?
93+
94+
No. Macie scans a wide variety of objects in S3 buckets, ranging from small text files to large
95+
archives that hold hundreds of GBs of data. This tool helps you take a quick peek at the most common
96+
finding types to confirm the presence of sensitive data. Currently, the following
97+
mime-types are supported:
98+
99+
- `application/avro`
100+
- `text/csv`
101+
- `application/json`
102+
- `text/plain`
103+
- `application/parquet`
104+
- `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet` (Excel spreadsheets)
105+
106+
Please create an issue if a format you'd like to see isn't on the list. We'll try to add it, and
107+
contributions are welcome too!
108+
109+
### Permissions
110+
111+
The tool makes use of public API calls to S3 and Macie, so the usual IAM access control applies. The
112+
caller needs permission to invoke `macie2:GetFindings` on the account and `s3:GetObject`
113+
on the specific object reported in the finding.
114+
115+
### Troubleshooting
116+
117+
#### Mismatched bucket region
118+
Error:
119+
```
120+
ERROR: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint (Service: S3, Status Code: 301)
121+
```
122+
Cause:
123+
124+
The tool uses regional endpoints. Set `AWS_REGION` to match where your Macie session is.
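
As a rough sketch of the fix above, the bucket's region can be looked up with the AWS CLI before running the tool. The helper names `normalize_location` and `bucket_region` are hypothetical and not part of this project. The sketch assumes the AWS CLI is installed, that the caller has `s3:GetBucketLocation` on the bucket, and that the bucket and the Macie session are in the same region.

```shell
#!/bin/sh
# Sketch: turn an S3 LocationConstraint value into a region name for AWS_REGION.

# S3 reports no constraint for buckets in us-east-1 (the AWS CLI prints this
# as "None" or "null", depending on the output format) and the legacy value
# "EU" for buckets in eu-west-1. Any other value is already a region name.
normalize_location() {
    case "$1" in
        '' | None | null) echo "us-east-1" ;;
        EU)               echo "eu-west-1" ;;
        *)                echo "$1" ;;
    esac
}

# Hypothetical helper: ask S3 where the bucket lives.
# Needs credentials and network access, so it is only defined here, not run.
bucket_region() {
    location=$(aws s3api get-bucket-location \
        --bucket "$1" --query LocationConstraint --output text) || return
    normalize_location "$location"
}
```

With credentials configured, something like `export AWS_REGION=$(bucket_region DOC-EXAMPLE-BUCKET)` would then set the region before calling `reveal`.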
9 125

10 126
## Security
11 127

12 128
See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
13 129

14 130
## License
15 131

16-
This library is licensed under the MIT-0 License. See the LICENSE file.
17-
132+
This project is licensed under the MIT-0 License. See the LICENSE file.

finding.png

130 KB
+5
@@ -0,0 +1,5 @@
1+
distributionBase=GRADLE_USER_HOME
2+
distributionPath=wrapper/dists
3+
distributionUrl=https\://services.gradle.org/distributions/gradle-7.3-bin.zip
4+
zipStoreBase=GRADLE_USER_HOME
5+
zipStorePath=wrapper/dists

gradlew

+234
@@ -0,0 +1,234 @@
1+
#!/bin/sh
2+
3+
#
4+
# Copyright © 2015-2021 the original authors.
5+
#
6+
# Licensed under the Apache License, Version 2.0 (the "License");
7+
# you may not use this file except in compliance with the License.
8+
# You may obtain a copy of the License at
9+
#
10+
# https://www.apache.org/licenses/LICENSE-2.0
11+
#
12+
# Unless required by applicable law or agreed to in writing, software
13+
# distributed under the License is distributed on an "AS IS" BASIS,
14+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
# See the License for the specific language governing permissions and
16+
# limitations under the License.
17+
#
18+
19+
##############################################################################
20+
#
21+
# Gradle start up script for POSIX generated by Gradle.
22+
#
23+
# Important for running:
24+
#
25+
# (1) You need a POSIX-compliant shell to run this script. If your /bin/sh is
26+
# noncompliant, but you have some other compliant shell such as ksh or
27+
# bash, then to run this script, type that shell name before the whole
28+
# command line, like:
29+
#
30+
# ksh Gradle
31+
#
32+
# Busybox and similar reduced shells will NOT work, because this script
33+
# requires all of these POSIX shell features:
34+
# * functions;
35+
# * expansions «$var», «${var}», «${var:-default}», «${var+SET}»,
36+
# «${var#prefix}», «${var%suffix}», and «$( cmd )»;
37+
# * compound commands having a testable exit status, especially «case»;
38+
# * various built-in commands including «command», «set», and «ulimit».
39+
#
40+
# Important for patching:
41+
#
42+
# (2) This script targets any POSIX shell, so it avoids extensions provided
43+
# by Bash, Ksh, etc; in particular arrays are avoided.
44+
#
45+
# The "traditional" practice of packing multiple parameters into a
46+
# space-separated string is a well documented source of bugs and security
47+
# problems, so this is (mostly) avoided, by progressively accumulating
48+
# options in "$@", and eventually passing that to Java.
49+
#
50+
# Where the inherited environment variables (DEFAULT_JVM_OPTS, JAVA_OPTS,
51+
# and GRADLE_OPTS) rely on word-splitting, this is performed explicitly;
52+
# see the in-line comments for details.
53+
#
54+
# There are tweaks for specific operating systems such as AIX, CygWin,
55+
# Darwin, MinGW, and NonStop.
56+
#
57+
# (3) This script is generated from the Groovy template
58+
# https://github.com/gradle/gradle/blob/master/subprojects/plugins/src/main/resources/org/gradle/api/internal/plugins/unixStartScript.txt
59+
# within the Gradle project.
60+
#
61+
# You can find Gradle at https://github.com/gradle/gradle/.
62+
#
63+
##############################################################################
64+
65+
# Attempt to set APP_HOME
66+
67+
# Resolve links: $0 may be a link
68+
app_path=$0
69+
70+
# Need this for daisy-chained symlinks.
71+
while
72+
APP_HOME=${app_path%"${app_path##*/}"} # leaves a trailing /; empty if no leading path
73+
[ -h "$app_path" ]
74+
do
75+
ls=$( ls -ld "$app_path" )
76+
link=${ls#*' -> '}
77+
case $link in #(
78+
/*) app_path=$link ;; #(
79+
*) app_path=$APP_HOME$link ;;
80+
esac
81+
done
82+
83+
APP_HOME=$( cd "${APP_HOME:-./}" && pwd -P ) || exit
84+
85+
APP_NAME="Gradle"
86+
APP_BASE_NAME=${0##*/}
87+
88+
# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
89+
DEFAULT_JVM_OPTS='"-Xmx64m" "-Xms64m"'
90+
91+
# Use the maximum available, or set MAX_FD != -1 to use that value.
92+
MAX_FD=maximum
93+
94+
warn () {
95+
echo "$*"
96+
} >&2
97+
98+
die () {
99+
echo
100+
echo "$*"
101+
echo
102+
exit 1
103+
} >&2
104+
105+
# OS specific support (must be 'true' or 'false').
106+
cygwin=false
107+
msys=false
108+
darwin=false
109+
nonstop=false
110+
case "$( uname )" in #(
111+
CYGWIN* ) cygwin=true ;; #(
112+
Darwin* ) darwin=true ;; #(
113+
MSYS* | MINGW* ) msys=true ;; #(
114+
NONSTOP* ) nonstop=true ;;
115+
esac
116+
117+
CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar
118+
119+
120+
# Determine the Java command to use to start the JVM.
121+
if [ -n "$JAVA_HOME" ] ; then
122+
if [ -x "$JAVA_HOME/jre/sh/java" ] ; then
123+
# IBM's JDK on AIX uses strange locations for the executables
124+
JAVACMD=$JAVA_HOME/jre/sh/java
125+
else
126+
JAVACMD=$JAVA_HOME/bin/java
127+
fi
128+
if [ ! -x "$JAVACMD" ] ; then
129+
die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME
130+
131+
Please set the JAVA_HOME variable in your environment to match the
132+
location of your Java installation."
133+
fi
134+
else
135+
JAVACMD=java
136+
which java >/dev/null 2>&1 || die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
137+
138+
Please set the JAVA_HOME variable in your environment to match the
139+
location of your Java installation."
140+
fi
141+
142+
# Increase the maximum file descriptors if we can.
143+
if ! "$cygwin" && ! "$darwin" && ! "$nonstop" ; then
144+
case $MAX_FD in #(
145+
max*)
146+
MAX_FD=$( ulimit -H -n ) ||
147+
warn "Could not query maximum file descriptor limit"
148+
esac
149+
case $MAX_FD in #(
150+
'' | soft) :;; #(
151+
*)
152+
ulimit -n "$MAX_FD" ||
153+
warn "Could not set maximum file descriptor limit to $MAX_FD"
154+
esac
155+
fi
156+
157+
# Collect all arguments for the java command, stacking in reverse order:
158+
# * args from the command line
159+
# * the main class name
160+
# * -classpath
161+
# * -D...appname settings
162+
# * --module-path (only if needed)
163+
# * DEFAULT_JVM_OPTS, JAVA_OPTS, and GRADLE_OPTS environment variables.
164+
165+
# For Cygwin or MSYS, switch paths to Windows format before running java
166+
if "$cygwin" || "$msys" ; then
167+
APP_HOME=$( cygpath --path --mixed "$APP_HOME" )
168+
CLASSPATH=$( cygpath --path --mixed "$CLASSPATH" )
169+
170+
JAVACMD=$( cygpath --unix "$JAVACMD" )
171+
172+
# Now convert the arguments - kludge to limit ourselves to /bin/sh
173+
for arg do
174+
if
175+
case $arg in #(
176+
-*) false ;; # don't mess with options #(
177+
/?*) t=${arg#/} t=/${t%%/*} # looks like a POSIX filepath
178+
[ -e "$t" ] ;; #(
179+
*) false ;;
180+
esac
181+
then
182+
arg=$( cygpath --path --ignore --mixed "$arg" )
183+
fi
184+
# Roll the args list around exactly as many times as the number of
185+
# args, so each arg winds up back in the position where it started, but
186+
# possibly modified.
187+
#
188+
# NB: a `for` loop captures its iteration list before it begins, so
189+
# changing the positional parameters here affects neither the number of
190+
# iterations, nor the values presented in `arg`.
191+
shift # remove old arg
192+
set -- "$@" "$arg" # push replacement arg
193+
done
194+
fi
195+
196+
# Collect all arguments for the java command;
197+
# * $DEFAULT_JVM_OPTS, $JAVA_OPTS, and $GRADLE_OPTS can contain fragments of
198+
# shell script including quotes and variable substitutions, so put them in
199+
# double quotes to make sure that they get re-expanded; and
200+
# * put everything else in single quotes, so that it's not re-expanded.
201+
202+
set -- \
203+
"-Dorg.gradle.appname=$APP_BASE_NAME" \
204+
-classpath "$CLASSPATH" \
205+
org.gradle.wrapper.GradleWrapperMain \
206+
"$@"
207+
208+
# Use "xargs" to parse quoted args.
209+
#
210+
# With -n1 it outputs one arg per line, with the quotes and backslashes removed.
211+
#
212+
# In Bash we could simply go:
213+
#
214+
# readarray ARGS < <( xargs -n1 <<<"$var" ) &&
215+
# set -- "${ARGS[@]}" "$@"
216+
#
217+
# but POSIX shell has neither arrays nor process substitution, so instead we
218+
# post-process each arg (as a line of input to sed) to backslash-escape any
219+
# character that might be a shell metacharacter, then use eval to reverse
220+
# that process (while maintaining the separation between arguments), and wrap
221+
# the whole thing up as a single "set" statement.
222+
#
223+
# This will of course break if any of these variables contains a newline or
224+
# an unmatched quote.
225+
#
226+
227+
eval "set -- $(
228+
printf '%s\n' "$DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS" |
229+
xargs -n1 |
230+
sed ' s~[^-[:alnum:]+,./:=@_]~\\&~g; ' |
231+
tr '\n' ' '
232+
)" '"$@"'
233+
234+
exec "$JAVACMD" "$@"
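
The quoted-option parsing at the end of the script can be tried in isolation. The sketch below reuses the same `xargs`/`sed`/`eval` pipeline on a hypothetical `OPTS` string, standing in for `$DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS`, inside a made-up function `split_opts_demo`. It is an illustration only and not part of the generated wrapper.

```shell
#!/bin/sh
# Sketch of the option-splitting technique used by the wrapper script.

split_opts_demo() {
    # Hypothetical option string; one option contains a quoted space.
    OPTS='"-Xmx64m" "-Dgreeting=hello world" -Dflag'

    # Pretend these were the command-line arguments.
    set -- build --info

    # xargs -n1 strips the quotes and emits one option per line, sed
    # backslash-escapes anything that could be a shell metacharacter, and
    # eval re-parses the result so each option becomes exactly one
    # positional parameter, placed ahead of the original "$@".
    eval "set -- $(
            printf '%s\n' "$OPTS" |
            xargs -n1 |
            sed ' s~[^-[:alnum:]+,./:=@_]~\\&~g; ' |
            tr '\n' ' '
        )" '"$@"'

    # Print every resulting argument on its own line, in brackets.
    for arg do
        printf '[%s]\n' "$arg"
    done
}
```

Calling `split_opts_demo` prints five lines: `[-Xmx64m]`, `[-Dgreeting=hello world]`, `[-Dflag]`, `[build]`, and `[--info]`. The option with the embedded space stays a single argument, which is the point of the escape-then-eval round trip.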
