Elasticsearch is an open-source search and analytics engine used for use cases such as log analytics and real-time application monitoring.
Amazon Elasticsearch Service is a managed service that makes it easy to deploy, operate, and scale Elasticsearch.
Kibana is a tool for visualizing (typically, large amounts of) data indexed by Elasticsearch, using bar graphs, line graphs, scatter plots, pie charts, etc.
About the Lab
See how to use VPC Flow Logs and Amazon Elasticsearch to track the port, protocol, and IP address of traffic passing through a VPC.
Then configure Kibana to view VPC Flow Logs data and use a bash script to map Security Groups to AWS resources associated with them.
After relationships are established, use Kibana to identify unnecessary ports that are open on an instance.
In the AWS Console, click Services, then click EC2
Click Launch Instance
Click Select next to the Amazon Linux AMI option
Choose a t2.micro (or any free tier eligible instance – we don’t need anything beyond that) and click Next: Configure Instance Details
For Auto-assign Public IP choose Enable and leave the rest as-is
Click on Next: Add Storage
Enter 16 under Size column (16GB for storage) and leave the rest as-is
Click on Next: Add Tags, then click on Add Tag
Under Key enter Name and under Value enter VPC Flow Logs Test
Click Next: Configure Security Group
Choose Create a new security group
Enter flowLogsTestSG-[your-user_id] as the security group’s name and something useful in the description (if sharing accounts, adding your name/userid to description could be helpful)
Click on Add Rule and choose HTTP (to allow HTTP traffic in addition to the default SSH rule)
Click on Review and Launch and review one last time
Click on Launch to open the KeyPair modal dialog
Note: If you already have a keypair that you know works with this account and you’re sure that you can login with that keypair, then you can skip the steps for creating and downloading a new keypair. Alternatively, if you have a keypair, now’s a good time to test if you’re able to login to the instance with the keypair.
Choose Create a new keypair and give it a descriptive name
Click on Download Key Pair (this is the only time you’ll be able to download it)
Click on Launch
This will take a few minutes, so let's proceed. We'll connect to this EC2 instance later.
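For reference, a roughly equivalent launch from the command line might look like the sketch below. The AMI ID, key pair name, and user id here are placeholders, not values from this lab:
# Create the security group and open SSH and HTTP (names are placeholders)
aws ec2 create-security-group --group-name flowLogsTestSG-myuser --description "VPC Flow Logs lab SG"
aws ec2 authorize-security-group-ingress --group-name flowLogsTestSG-myuser --protocol tcp --port 22 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name flowLogsTestSG-myuser --protocol tcp --port 80 --cidr 0.0.0.0/0
# Launch a t2.micro with a 16 GB root volume and a Name tag
# (tag value written without spaces to keep the shorthand syntax simple)
aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type t2.micro --key-name MyKeyPair \
  --security-groups flowLogsTestSG-myuser \
  --block-device-mappings '[{"DeviceName":"/dev/xvda","Ebs":{"VolumeSize":16}}]' \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=VPCFlowLogsTest}]'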
If you already have IAM Credentials, you can skip this section.
You’ll need your own valid, working IAM credentials to authenticate and run AWS APIs from the command line. If you don’t have IAM credentials, you may be able to create a new set yourself (if your AWS account admin has granted you the iam:CreateAccessKey, iam:DeleteAccessKey, iam:UpdateAccessKey, and iam:ListAccessKeys permissions), or you can ask your AWS account admin to grant you those permissions or create the credentials for you.
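If you do have iam:CreateAccessKey, one way to create a new access key is from the AWS CLI (replace the user name with your own):
aws iam create-access-key --user-name [your-user-id]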
In the AWS Console, click Services, then click IAM
On the left hand pane, click Policies and click Create policy
Choose the JSON tab and enter the policy below
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:DescribeLogGroups",
"logs:DescribeLogStreams",
"logs:PutLogEvents"
],
"Resource": "*",
"Effect": "Allow"
}
]
}
Click Review policy
For the name, enter flowLogsPolicy-[your-user-id] and then click Create policy
On the left hand pane, click Roles and then click Create role
For Select type of trusted entity, choose AWS service
For Choose the service that will use this role, choose EC2
Click Next: Permissions
For Filter: enter flowLogsPolicy and choose the check box on the left to attach this policy to the role we are creating
Click on Next: Review
For Name enter flowLogsRole-[your-user-id], then click Create role
Click on the newly created role and then click on the Trust relationships tab
Click on Edit trust relationship and replace ec2.amazonaws.com with vpc-flow-logs.amazonaws.com
Click Update Trust Policy
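After the edit, the role's trust policy should look like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "vpc-flow-logs.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}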
Phew!
Next, repeat the policy and role creation steps for the Lambda function that will stream CloudWatch Logs to Elasticsearch. Create a new policy with the JSON below and attach it to a new role named lambdaElasticsearchExecutionRole (you will select this role later in the lab):
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": [
"arn:aws:logs:*:*:*"
],
"Effect": "Allow"
},
{
"Action": "es:ESHttpPost",
"Resource": "arn:aws:es:*:*:*",
"Effect": "Allow"
}
]
}
Click on the newly created role and then click on the Trust relationships tab. Click on Edit trust relationship, replace ec2.amazonaws.com with lambda.amazonaws.com, and click Update Trust Policy.
Finally, create a policy with the read-only permissions below. The sgremediate.sh script you run later in this lab uses these Describe permissions to map Security Groups to resources, so attach this policy to the IAM user whose credentials you will configure on the EC2 instance:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"config:GetResourceConfigHistory"
],
"Resource": "*",
"Effect": "Allow"
},
{
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeSecurityGroups",
"ec2:DescribeVPCs"
],
"Resource": "*",
"Effect": "Allow"
},
{
"Action": [
"rds:DescribeDBInstances"
],
"Resource": "*",
"Effect": "Allow"
},
{
"Action": [
"elasticloadbalancing:DescribeLoadBalancers"
],
"Resource": "*",
"Effect": "Allow"
},
{
"Action": [
"redshift:DescribeClusters"
],
"Resource": "*",
"Effect": "Allow"
},
{
"Action": [
"ec2:DescribeNetworkInterfaces"
],
"Resource": "*",
"Effect": "Allow"
},
{
"Action": [
"elasticache:DescribeCacheClusters"
],
"Resource": "*",
"Effect": "Allow"
},
{
"Action": [
"es:Describe*",
"es:List*"
],
"Resource": "*",
"Effect": "Allow"
}
]
}
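If you prefer the CLI, attaching a customer-managed policy to a user looks like this sketch (the policy name and account ID are placeholders, not values defined in this lab):
aws iam attach-user-policy --user-name [your-user-id] \
  --policy-arn arn:aws:iam::<account-id>:policy/sgRemediateReadOnly-[your-user-id]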
In the AWS Console, click Services then click Elasticsearch Service.
Click Get started.
For Elasticsearch domain name, type flowlogs-[your-user-id]
For Elasticsearch version, choose 2.3
Click Next.
For Instance count, type 2.
Check the box labeled Enable zone awareness.
Leave all other settings at their defaults, and click Next.
Note These settings instruct the Elasticsearch Service to launch your domain with two nodes. Because you have enabled zone awareness, these nodes will be separated into two different availability zones. If this were a production cluster, you would then use the Elasticsearch API to set up replicas for your cluster. These replicas would then be distributed across the nodes in the two Availability Zones, increasing the availability of the cluster.
Kibana, which is included in Elasticsearch Service clusters, does not directly support authentication. In a production environment, you will need an authentication proxy in front of Kibana, but for this lab you will restrict access by whitelisting your public IP address.
For Set the domain access policy to, choose Allow access to the domain from specific IP(s).
Choose the Public Access radio button
In the IP address dialog box, enter two IP addresses separated by a comma. First enter your computer’s public IP address (don’t know your IP address? Visit https://www.google.com/search?q=what+is+my+ip+address in a separate tab), then enter the public IP address of your EC2 instance. When you are done, it should look something like this:
73.159.145.254, 35.163.64.127
Your cluster’s access policy should now be visible. Click Next to move on to configuration review.
After verifying your cluster settings, click Confirm and create to create your Elasticsearch domain.
Note It can take up to 15 minutes to finish creating a domain. While your domain is being created, its Domain status will be listed as Loading. While it is creating, we will continue setting up other lab components, but you may wish to leave the Elasticsearch console open in a separate browser tab, as you will return to it later.
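You can also poll the domain from the CLI; DomainStatus.Processing stays true while the domain is still being created:
aws es describe-elasticsearch-domain --domain-name flowlogs-[your-user-id] | jq -r '.DomainStatus.Processing'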
In the AWS Management Console, on the Services menu, click VPC. On the navigation pane, click Your VPCs.
Check the box next to your Default VPC to select it.
In the VPC Details frame at the bottom of the page, click the Flow Logs tab. Delete any existing Flow Logs, then click Create Flow Log.
For Create Flow Log, click on Role and select the newly created flowLogsRole-[your-user-id].
For Destination Log Group, type FlowLog-[your-user-id]. This log group will contain entries for your VPC traffic.
Click Create Flow Log.
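For reference, an equivalent CLI call looks like this sketch (the VPC ID and account ID are placeholders):
aws ec2 create-flow-logs --resource-type VPC --resource-ids vpc-xxxxxxxx \
  --traffic-type ALL --log-group-name FlowLog-[your-user-id] \
  --deliver-logs-permission-arn arn:aws:iam::<account-id>:role/flowLogsRole-[your-user-id]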
Note VPC Flow Logs can take up to 15 minutes to create the CloudWatch Logs Group and begin delivering flow records. While you wait for the Log Group to be created, continue logging into the EC2 instance provided with the lab.
To connect to your EC2 instance you’ll need its public DNS.
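One way to fetch it is with the CLI, filtering on the Name tag you set earlier:
aws ec2 describe-instances --filters "Name=tag:Name,Values=VPC Flow Logs Test" \
  --query 'Reservations[].Instances[].PublicDnsName' --output text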
This section is for Mac/Linux users only. If you are running Windows, please skip to the next section.
$ chmod 600 ~/Downloads/<EC2-KeyPair-PEM-Format-File-Name.pem>
$ ssh -i ~/Downloads/<EC2-KeyPair-PEM-Format-File-Name.pem> ec2-user@<PublicDNS>
This section is for Windows users only. If you are running OSX/Linux, please refer to the previous section.
Download PuTTY from http://www.putty.org and launch putty.exe.
In the Host Name box, type ec2-user@<publicDNS>, pasting the public DNS value from your clipboard.
In the Category list, expand SSH.
Click Auth (don’t expand it).
Use puttygen.exe to convert the PEM file into a PPK file. To see how to do that, see this StackOverflow post: https://stackoverflow.com/questions/3190667/convert-pem-to-ppk-file-format
In the Private key file for authentication box, browse to the PPK file that you created with puttygen and double-click it.
Click Open.
Click Yes when prompted to allow a first connection to this remote SSH server. Because you are using a key pair for authentication, you will not be prompted for a password.
If PuTTY fails to connect to your Amazon EC2 instance, verify that the instance is running, that you converted the PEM file to PPK format correctly, and that the instance’s security group allows inbound SSH (port 22) from your IP address.
Once connected, install jq (used later to parse JSON from the AWS CLI) and the Apache web server (used later to host the script’s HTML output), start the web server, and make /var/www/html/ writable:
sudo yum install -y jq
sudo yum install -y httpd
sudo service httpd start
sudo chmod -R ugo+r+w /var/www/html/
Now you’ll create (or use) your own IAM credentials and configure the default AWS CLI profile on the EC2 instance to use them. To do this, go to the EC2 instance command line and run:
aws configure
and enter the values asked for by the prompt. It should look something like this
[ec2-user@ip-172-31-30-125 ~]$ aws configure
AWS Access Key ID [None]: AKIA----------------
AWS Secret Access Key [None]: <Need 2 parts vodka, 1 part...>
Default region name [None]: ca-central-1
Default output format [None]: json
[ec2-user@ip-172-31-30-125 ~]$
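To confirm that the credentials work, you can ask AWS who you are:
aws sts get-caller-identity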
Return to the Elasticsearch Service console. In the AWS Management Console, click Services, then click Elasticsearch Service. Verify the Cluster health for your cluster is Green. If your cluster is not yet green, wait a few minutes and refresh the browser page. Once Cluster health is listed as Green, continue to the next part of the lab.
On the Services menu, click CloudWatch. Then click Logs in the left CloudWatch Console pane. The log group you created, FlowLog-[your-user-id], should be listed in the Log Groups column. If you see a Getting Started page instead of your log group, wait a few minutes and refresh the page. Once your log group appears under the Log Groups column header, you can continue.
Note Do not begin these steps until you have verified that your resources have finished provisioning.
In the CloudWatch Log Groups page, check the box next to the log group you created, FlowLog-[your-user-id].
Click Actions and select Stream to Amazon Elasticsearch Service.
For Amazon ES cluster, select your cluster, flowlogs-[your-user-id].
In the Lambda IAM Execution Role field, select the role lambdaElasticsearchExecutionRole, then click Next.
In the Log Format field, select Amazon VPC Flow Logs. Leave other settings at their defaults and click Next.
Review your streaming settings, then click Next. Click Start Streaming.
You have now created a Lambda function that is triggered whenever new logs are written to CloudWatch Logs from VPC Flow Logs. Each Lambda invocation transforms the new records posted to CloudWatch and inserts them into your Elasticsearch cluster.
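Once a few Lambda invocations have run, you can sanity-check that documents are arriving by counting them with Elasticsearch’s _count API (replace the placeholder with your endpoint; run this from the whitelisted EC2 instance):
curl -XGET 'https://<ElasticsearchDomainEndpoint>/cwl-*/_count'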
A flow log record is a space-separated string in the following format:
version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
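For example, a record for accepted SSH traffic to an instance looks like this (example values taken from the AWS documentation):
2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK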
By default, Elasticsearch passes strings through an analyzer before being indexed. The analyzer interprets dashes and periods as field separators and will split fields that have them present. As our logs contain dashes this will create undesired behavior. In order to correct this, we will use the curl command from our instance to configure interface-id, srcaddr, and dstaddr to not_analyzed. This configuration change will cause Elasticsearch to skip the analyzer for these fields and leave their punctuation intact.
To find your Elasticsearch domain endpoint, run the following command on your EC2 instance:
aws es describe-elasticsearch-domain --domain-name flowlogs-[your-user-id]
Your endpoint will look something like this:
search-flowlogs-nihldocymhswk4gzonzrw4j5rq.us-west-2.es.amazonaws.com
Now run the following command, replacing the placeholder with your endpoint:
curl -XPUT "https://<ElasticsearchDomainEndpoint>/_template/template_1" -d'
{
  "template": "cwl-*",
  "mappings": {
    "FlowLogs": {
      "properties": {
        "interface_id": { "type": "string", "index": "not_analyzed" },
        "srcaddr": { "type": "string", "index": "not_analyzed" },
        "dstaddr": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}' -H 'Content-Type: application/json'
Note After you add the endpoint URL, make sure you also remove the < > brackets from the placeholder text. (I left that in…)
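You can confirm the template was stored by fetching it back:
curl -XGET 'https://<ElasticsearchDomainEndpoint>/_template/template_1?pretty'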
Delete the old, incorrectly indexed data by running the following command (replace the placeholder text with your endpoint):
curl -XDELETE 'https://<ElasticsearchDomainEndpoint>/cwl*/'
Now configure visualizations in Kibana to view the data in Elasticsearch. Kibana is a tool built into Amazon Elasticsearch used to visualize and search data. An example dashboard has been created here: https://s3.amazonaws.com/awsiammedia/public/sample/securitygroups/SGDashboard.json.
Dashboards are defined in JSON and are normally imported through curl commands against the Elasticsearch endpoint.
Here, however, we will be using a tool called elasticdump to make this process easier. On your EC2 instance, install Node.js via nvm, install elasticdump, then download and import the dashboard (replace the endpoint placeholder in the last command):
curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.32.0/install.sh | bash
. ~/.nvm/nvm.sh
nvm install 8.11.1
npm install -g elasticdump
curl -O https://s3.amazonaws.com/awsiammedia/public/sample/securitygroups/SGDashboard.json
elasticdump --input=SGDashboard.json --output=https://<ElasticsearchDomainEndpoint>/.kibana-4
Return to the Elasticsearch Console. Click your domain, flowlogs. Click on the link for Kibana.
You will be prompted to configure an index pattern. You do not need to do this. Instead, click on the existing pattern you imported in the last step: click [cwl-]YYYY.MM.DD in the left pane under Index Patterns.
Click the green star to set as default index.
Click Dashboard (in the top-level menu), then click on the folder icon to Load Saved Dashboard. Click FlowLogDash.
You should now see a visualization of traffic flowing into and out of your VPC.
As you viewed the visualization of your Flow Logs data, you may have noticed that traffic was broken down by ENI, not by Security Group. Security Group information is not provided in Flow Logs, so you will need to generate ENI-to-Security-Group mappings. You will do this with a bash script, sgremediate.sh. Download it from https://s3.amazonaws.com/awsiammedia/public/sample/securitygroups/sgremediate.sh and make it executable, like so:
curl -O https://s3.amazonaws.com/awsiammedia/public/sample/securitygroups/sgremediate.sh
chmod +x sgremediate.sh
You will need to configure two settings in the script: your VpcID and Elasticsearch Endpoint. This information is available from the AWS Console, but we’ll gather it from the AWS CLI instead.
To view your VpcID, in your EC2 instance run the following command:
aws ec2 describe-vpcs --filters Name=isDefault,Values=true | jq -r '.Vpcs[].VpcId'
This command returns JSON data about your Default VPC and parses it with jq. The result is your VpcID which will be needed when you configure the script. It should look something like this: vpc-13303b71.
To view your Elasticsearch endpoint, run the following command:
aws es describe-elasticsearch-domain --domain-name flowlogs-[your-user-id] | jq -r '.DomainStatus.Endpoint'
It returns JSON data about your Elasticsearch domain and parses it with jq. The result is your Elasticsearch endpoint, which you will need when you configure the script. Your endpoint should look something like this: search-flowlogs-nihldocymhswk4gzonzrw4j5rq.us-west-2.es.amazonaws.com
In your EC2 instance, open sgremediate.sh with a text editor such as nano or vi.
Edit sgremediate.sh and…
For vpcID= paste your lab’s VPC ID.
For ElasticsearchURL= paste your Elasticsearch endpoint. (Note that part of the URL is already filled in for you, so look out for mangled URLs.)
Save and exit the script file.
Executing the script will list all of the Security Groups in your account and iterate through every resource they could be attached to. When it finds an association, it builds a link to the Kibana dashboard for that Security Group and outputs it as HTML. Your EC2 instance is running a web server that can host the HTML output.
./sgremediate.sh > /var/www/html/index.html
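Because the instance is already running Apache (set up earlier) and its security group allows HTTP, you should then be able to view the generated links in a browser at http://<EC2-Public-DNS>/ or fetch them directly:
curl http://<EC2-Public-DNS>/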
The Kibana dashboard consists of a set of visualizations. Each visualization summarizes data available in your Elasticsearch cluster for a particular timeframe. The timeframe can be adjusted in the top right corner.
On the FlowLogDash dashboard, the left side consists of three panes.
Using this dashboard, you can visualize what traffic is actually flowing into a particular security group. If the visualization shows that no traffic has passed to a specific port that is open in the security group, you have successfully identified an overly permissive security group that can be tightened. If the visualization shows limited traffic to a specific port that is open in the security group, you can further evaluate the source IP of that traffic to determine whether it is expected or potentially abusive (e.g., port scanning). If you determine that the port was not intentionally left open, you have identified an overly permissive security group that can be tightened.
Multiple Security Groups can be associated with an ENI. VPC Flow Logs track connections by ENI, not by Security Group, so dashboards created by the sgremediate.sh script will show ALL Security Groups associated with a particular ENI.
Also note that Security Groups are stateful: if an instance initiates traffic to another location, Flow Logs will contain the return traffic.
You’ve successfully enabled VPC Flow Logs, streamed records to Elasticsearch, and visualized them with Kibana. You have also used Kibana dashboards to find overly permissive Security Groups.