Enriching ElasticSearch With Threat Data – Part 2 – Memcached and Python

In our previous post we covered MISP and some of the preparation work needed to integrate MISP and ElasticSearch. With MISP now set up and prepped, we can focus on Python and Memcached.

Part 1:- https://www.securitydistractions.com/2019/05/17/enriching-elasticsearch-with-threat-data-part-1-misp/


Background

First, a little background on why we chose to use Memcached for our ElasticSearch integration...

Threat data feeds are dynamic by nature: they are updated constantly, often multiple times a day. The updates contain both additions to the feeds and deletions. This means our enrichment engine needs to be dynamic too. To explain this better, we will use Ransomware Tracker as an example.

Let's say a new IP is published to the Ransomware Tracker feed. This is easy to manage in an enrichment engine, as we can simply add the new IP to our list. But what if an IP is removed from Ransomware Tracker? Now we have to monitor the feed to find the deletion, check our own list to see if we have this IP, and then delete it from our list. This can get complex to handle very quickly...

Another way to handle it could be to monitor the Ransomware Tracker feed for changes and, whenever a change is made, clear our list completely and pull the latest feed instead. This would solve part of the problem, but it can leave a short period where the enrichment engine is empty. It also adds complexity, as we would have to delete the list each time, which is definitely not what we wanted!

We decided to look into a way of simply assigning a TTL to each IoC on the feed, and then aging out the IoCs which are no longer present on the feed. We would set up our script to pull the feed at a given time interval, then push this into our enrichment engine store. Simple yet incredibly effective. This method also had to be supported by ElasticSearch, and luckily for us Logstash has a filter plugin for Memcached. So this is what we settled on for storing the feed data for enrichment.
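To make the aging idea concrete, here is a minimal sketch of a TTL-based store in plain Python. It is not Memcached and not part of our script; the TTLStore class, the example domains and the timings are all made up for illustration, and the clock is faked so the timeline can be stepped through by hand.

```python
import time


class TTLStore:
    """Toy in-memory key-value store whose entries expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the example can fake time
        self._data = {}       # key -> (value, expiry timestamp)

    def set(self, key, value, ttl):
        # Writing an existing key simply refreshes its expiry time.
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]   # aged out: nobody refreshed it in time
            return None
        return value


# Simulated timeline with illustrative numbers: 70 second TTL.
now = [0.0]
store = TTLStore(clock=lambda: now[0])

# First pull: both domains are on the feed.
store.set("domain-a.example", "Feed-RansomwareTracker", 70)
store.set("domain-b.example", "Feed-RansomwareTracker", 70)

# Second pull at t=30: only a.example is still on the feed.
now[0] = 30.0
store.set("domain-a.example", "Feed-RansomwareTracker", 70)

# At t=80, b.example was last written at t=0, so it has expired.
now[0] = 80.0
still_listed = store.get("domain-a.example")
dropped = store.get("domain-b.example")
```

Notice that nothing ever deletes b.example explicitly. It disappears simply because the feed stopped mentioning it, which is exactly the behaviour we wanted from the real store.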

Memcached – Preparation

Memcached meets our requirements for being simple and for handling the aging of IoCs. It is also supported by ElasticSearch/Logstash, which makes it perfect for this task. It comes with the huge additional benefit of storing the data in memory, so lookups from Logstash will be very fast.

https://memcached.org/

The Memcached application is a very simple key-value store running in memory. You can telnet into the application, which listens on port 11211 by default.

The application is made up of only a few commands. The ones we need here are the “get” and “set” commands, both of which are quite self-explanatory.

The set command will be used by our Python script, to set the data into the store.

The get command will be used by the Logstash filter plugin, to query the store for a specific IoC and return the result back to Logstash.

The only thing we need to do is decide the structure of the data within the key-value store. Since we are going to be working with multiple data types (domain names, IP addresses etc.), we will make our key a combination of the data type and the IoC. So if securitydistractions.com were on the Ransomware Tracker feed, it would be represented as: “domain-securitydistractions.com”.

Using the key as the combination of the data type and the IoC will be easier to understand later when we look at the Logstash configuration.

The value will be the feed name, so in this example “Feed-RansomwareTracker”.
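As a small sketch of that key layout, the helper below builds the key from a data type and an IoC. The make_key name is our own invention for illustration. The checks reflect Memcached's limits on keys (at most 250 bytes, no whitespace or control characters).

```python
def make_key(datatype, ioc):
    """Build a store key such as 'domain-securitydistractions.com'."""
    key = f"{datatype}-{ioc.strip()}"
    # Memcached keys are limited to 250 bytes and must not contain
    # whitespace or control characters.
    if len(key.encode("utf-8")) > 250:
        raise ValueError("key too long for Memcached")
    if any(ch.isspace() or ord(ch) < 32 or ord(ch) == 127 for ch in key):
        raise ValueError("key contains whitespace or control characters")
    return key


key = make_key("domain", "securitydistractions.com")
value = "Feed-RansomwareTracker"
```

Prefixing the data type keeps a domain and an IP address from ever colliding on the same key, even if several feeds end up in the same store.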

The TTL can be set to whatever suits your organisation. In our example we will use 70 seconds. This is because we are going to run our Python script for pulling the feed from MISP every 30 seconds, so we can miss one pull without aging out all the IoCs in the Memcached store.
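If you want to pick your own numbers, the arithmetic can be sketched like this. The tolerated_missed_pulls function is a hypothetical helper, and it assumes pulls run at exact intervals and take no time, so treat the result as a rough guide rather than a guarantee.

```python
import math


def tolerated_missed_pulls(ttl, interval):
    """How many consecutive pulls can fail before entries age out."""
    # Refreshes land at interval, 2*interval, ... after the last write.
    # Count the ones that arrive strictly before the entry expires.
    refreshes_in_time = math.ceil(ttl / interval) - 1
    # At least one of them has to succeed; the rest may be missed.
    return max(refreshes_in_time - 1, 0)


ours = tolerated_missed_pulls(70, 30)      # TTL and interval from this post
too_tight = tolerated_missed_pulls(60, 30)
```

With a 70 second TTL and a 30 second interval, refreshes arrive at 30 and 60 seconds, so one of them can fail. With a TTL of exactly 60 seconds the second refresh arrives just as the entry expires, which leaves no margin at all.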

So the set command for Memcached with our example data will be as follows: key “domain-securitydistractions.com”, value “Feed-RansomwareTracker”, TTL “70”.

It is highly recommended that you run Memcached on the same machine as Logstash, for latency purposes. In our lab we are running everything on a Debian VM. There are Debian packages available for Memcached.
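For the curious, this is roughly what those values look like on the wire in Memcached's text protocol, which is also what you would type in a telnet session. The two builder functions are hypothetical helpers written for this post; they only assemble the bytes and do not talk to a server.

```python
def build_set_command(key, value, ttl, flags=0):
    """Return the bytes of a text-protocol 'set' for one key."""
    data = value.encode("utf-8")
    # Header format: set <key> <flags> <exptime> <bytes>\r\n
    header = f"set {key} {flags} {ttl} {len(data)}\r\n".encode("utf-8")
    # The value follows on its own line, terminated by \r\n.
    return header + data + b"\r\n"


def build_get_command(key):
    """Return the bytes of a text-protocol 'get' for one key."""
    return f"get {key}\r\n".encode("utf-8")


set_cmd = build_set_command(
    "domain-securitydistractions.com", "Feed-RansomwareTracker", 70
)
get_cmd = build_get_command("domain-securitydistractions.com")
```

In practice a client library handles all of this for you, but seeing the raw commands makes it easier to poke at the store by hand when debugging.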

Python – Memcache/MISP integration

Caveat: I am not a developer, and my programming skills are limited. The script here only had to be very simple, so it suited my skill level. There will be multiple ways to improve it in the future, but this is what we are running with here, and it works!

As ever, any form of integration between tools is probably going to require some form of scripting. In our case we knew we needed a script that would handle pulling the data from our MISP platform API, and then pushing this data into Memcached. The full script can be found at the bottom of the page.

The first part is our interaction with the MISP API….

def misppull():
    headers = {
            'Authorization': 'INSERT YOUR OWN MISP API KEY',
            'Accept': 'application/json',
            'Content-type': 'application/json',
             }

    data = '{"returnFormat":"text","type":"domain","tags":"Feed-RansomwareTracker","to_ids":"yes"}'

    response = requests.post('https://*INSERTYOUROWNMISPHERE*/attributes/restSearch', headers=headers, data=data, verify=False) #Call to MISP API

    return response

Remember to change the “Authorization” section within the header to your own API key.

The data variable is used to tell the MISP API which IoCs we want to retrieve. In this example we are asking for all domain names that are tagged with “Feed-RansomwareTracker” and where the “to_ids” setting is set to yes. The result is returned as plain text, one IoC per line.

Remember also to change the URL in the requests.post call to reflect the domain name or IP address of your own MISP instance. I have disabled verification of SSL (verify=False) because this is running within my lab. It is not recommended to keep this setting if you are running in production.
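If you plan to query more than one type or tag, hand-writing the JSON string gets fiddly. One option, sketched here with a hypothetical build_query helper, is to let Python's json module produce the same body from a dictionary.

```python
import json


def build_query(ioc_type, tag):
    """Return the JSON body for a MISP restSearch query as a string."""
    return json.dumps({
        "returnFormat": "text",   # plain text, one IoC per line
        "type": ioc_type,         # e.g. "domain"
        "tags": tag,              # e.g. "Feed-RansomwareTracker"
        "to_ids": "yes",          # same filter as in the post
    })


data = build_query("domain", "Feed-RansomwareTracker")
```

The resulting string can be passed as the data argument in exactly the same way as the hand-written one, and json.dumps takes care of the quoting.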

Reliable as ever, Python has multiple libraries for interacting with Memcached. We settled on the first one we found, “pymemcache”.

https://pypi.org/project/pymemcache/

if __name__ == '__main__':
    response = misppull()
    domains = (response.text).splitlines()
    for domain in domains:
        client.set("domain-" + domain, "Feed-RansomwareTracker", 70)

Using the structure we settled on earlier in this blog post, this is how it looks with pymemcache: the client.set call pushes each IoC we retrieved via the “misppull” function into Memcached, with the key, the value and the TTL as its three arguments. The client object itself is created in the full script.
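One small hardening you might consider is skipping blank lines and stray whitespace in the MISP output before writing to the store. The sketch below is an assumption about how that could look, not what our script does. push_iocs and RecordingClient are hypothetical names, and the recording client is only a stand-in so the example runs without a Memcached server. A real pymemcache client could be passed in instead, since all the function needs is a set(key, value, expire) method.

```python
def push_iocs(client, feed_text, datatype, feed_name, ttl):
    """Write every non-empty line of feed_text into the store."""
    pushed = 0
    for line in feed_text.splitlines():
        ioc = line.strip()
        if not ioc:
            continue  # skip blank lines in the text output
        client.set(f"{datatype}-{ioc}", feed_name, ttl)
        pushed += 1
    return pushed


class RecordingClient:
    """Stand-in for a Memcached client that just records the calls."""

    def __init__(self):
        self.calls = []

    def set(self, key, value, expire):
        self.calls.append((key, value, expire))


fake = RecordingClient()
count = push_iocs(
    fake, "bad.example\n\n evil.example \n", "domain",
    "Feed-RansomwareTracker", 70,
)
```

Returning the number of IoCs pushed also gives you something to log, which helps when a pull silently comes back empty.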

Full script:-

When I get round to it, this will be uploaded to our GitHub. It is released under the MIT license.

import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
from pymemcache.client.base import Client
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

client = Client(('127.0.0.1', 11211)) #Location of Memcached application

def misppull():
    headers = {
            'Authorization': 'INSERT YOUR OWN API KEY HERE',
            'Accept': 'application/json',
            'Content-type': 'application/json',
             }

    data = '{"returnFormat":"text","type":"domain","tags":"Feed-RansomwareTracker","to_ids":"yes"}' #Setting up the data format we require from MISP

    response = requests.post('https://*INSERTYOUROWNMISPHERE*/attributes/restSearch', headers=headers, data=data, verify=False) #Call to MISP API
    return response


if __name__ == '__main__':
    response = misppull()
    domains = (response.text).splitlines()
    for domain in domains:
        client.set("domain-" + domain, "Feed-RansomwareTracker", 70)

Next in the post series, we cover the last step: integrating it all together using Logstash!

Part 3:- https://www.securitydistractions.com/2019/05/17/enriching-elasticsearch-with-threat-data-part-3-logstash/
