TheHive enrichment

Intro

A growing number of SOCs, IRT teams, and similar groups are beginning to use TheHive and ElasticSearch.

While researching these tools I saw a lot of talk about enrichment, and tying various tools together, so I wanted to provide my take on it as well.

I am by no means an expert in any of these tools, or in the IRT process, but I have had the privilege of getting to know a few people whom I would consider experts (even though they might not feel that way themselves). While watching them work, I started thinking that some of the tasks they routinely perform could be eligible for automation.

Specifically, I saw that when they were doing triage or incident response, they would often receive an alert (from their EDR tool, tier 1 SOC, IDS/IPS, etc.) that provided only an IP address and a timestamp.

Because most corporate infrastructures are configured with DHCP, they would often have to look through their ElasticSearch logs to determine which endpoint (hostname) had been assigned the given IP address at the given time.
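As a rough illustration of that manual lookup (not the code from the repo), the query could be sketched as follows. The field names `client_ip` and `@timestamp`, and the 24-hour lookback, are assumptions; they depend entirely on how the DHCP logs are indexed and how long leases last in your environment.

```python
from datetime import datetime, timedelta, timezone


def build_dhcp_query(ip_address, seen_at, lookback_hours=24):
    """Build a query body for the latest DHCP event for an IP before a time.

    "client_ip" and "@timestamp" are assumed field names; adjust them to
    match the mapping of your own DHCP log index.
    """
    start = seen_at - timedelta(hours=lookback_hours)
    return {
        # Only the most recent matching event is needed
        "size": 1,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"client_ip": ip_address}},
                    {"range": {"@timestamp": {
                        "gte": start.isoformat(),
                        "lte": seen_at.isoformat(),
                    }}},
                ]
            }
        },
    }


# Example: who held 10.0.0.5 at 08:00 UTC on 24 July 2019?
alert_time = datetime(2019, 7, 24, 8, 0, tzinfo=timezone.utc)
query = build_dhcp_query("10.0.0.5", alert_time)
```

The lookback should be at least as long as the DHCP lease time, otherwise the event that handed out the address may fall outside the window.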

While this is fairly trivial to do, it is also a well-defined, recurring task, so I wanted to see whether I could automate it.

Integrating TheHive and ElasticSearch

As you may or may not know, TheHive uses an underlying enrichment engine called Cortex.

In short, Cortex works by leveraging analyzers (used for collecting information related to an observable, for instance collecting information from VirusTotal about a checksum) and responders (used to act on information, for instance pushing an IP to a blacklist or sending out an email).

With this in mind, I figured that the way to go would be to create an analyzer that could query ElasticSearch and return the hostname that was using the given IP address at the specified time.

My plan was to create the event in TheHive and attach the given IP address as an observable, from which the analyzer could be run.

This, however, turned out to be somewhat of a dead end for me, as analyzers have the caveat of only working on observables. The only way I was able to provide a timestamp to the analyzer was to manually type it into the message field of the observable. I briefly considered this, but decided it would be way too error-prone in a production environment, as the timestamp would have to adhere to specific formatting rules.

Because of this caveat, I started looking at what would be possible if I implemented this as a responder instead (even though this is not how responders are supposed to be used).

I quickly realized that because responders can be invoked on events, alerts and observables, a responder has access to a wide range of information related to the event, even if it is implemented to only work with observables.

With this in mind, I was able to implement functional timestamps using customFields with datatype datetime.
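To make this concrete, here is a minimal sketch of how a responder might turn such a custom field into a usable timestamp. It assumes the TheHive 3 layout in which a date custom field is stored as epoch milliseconds under a `date` key; the field name `alertTimestamp` is hypothetical, and the exact path to the custom fields inside the responder's input depends on the object it is invoked on.

```python
from datetime import datetime, timezone


def read_date_custom_field(custom_fields, name):
    """Return a date custom field as an aware UTC datetime, or None.

    Assumes the layout {"<name>": {"date": <epoch milliseconds>}}.
    """
    value = (custom_fields.get(name) or {}).get("date")
    if value is None:
        return None
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


# Hypothetical custom fields as they might appear on a case
fields = {"alertTimestamp": {"date": 1563955200000, "order": 0}}
seen_at = read_date_custom_field(fields, "alertTimestamp")
print(seen_at.isoformat())  # 2019-07-24T08:00:00+00:00
```

Because the value is picked in TheHive's date widget rather than typed as free text, the formatting problem from the analyzer approach goes away.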

This meant that I was able to implement a functional responder that could query ElasticSearch (through the standard REST API) and return a report containing all relevant entries matching the query.
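The repo contains the actual implementation; the sketch below only illustrates the general shape of such a call using the Python standard library. The index name, the `hostname` field, and the helper names are assumptions made for the example.

```python
import json
import urllib.request


def build_search_request(base_url, index, query_body):
    """Prepare a POST to <base_url>/<index>/_search with a JSON query body."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_search",
        data=json.dumps(query_body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(base_url, index, query_body, timeout=10):
    """Send the query and return the decoded JSON response."""
    request = build_search_request(base_url, index, query_body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def extract_hostnames(search_response, field="hostname"):
    """Collect distinct values of `field` from the hits, in result order."""
    hostnames = []
    for hit in search_response.get("hits", {}).get("hits", []):
        value = hit.get("_source", {}).get(field)
        if value and value not in hostnames:
            hostnames.append(value)
    return hostnames
```

A responder would call `search(...)` with its configured URL and index, then pass the result to `extract_hostnames(...)` to build the report.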

I was, however, not entirely satisfied with this, as I felt it could only be considered partial automation: I would still have to read through the returned report and manually input the results as new observables.

Completing the automation

Using Cortex alone, I felt quite limited in what I could do with my results, so I started contemplating how to take my attempted automation a step further, and began looking into the REST API for TheHive.

This gave me all the possibilities I wanted. I was able to leverage another customField called autoEnrichment (with datatype boolean) to define whether I wanted the responder to automatically create new observable(s) from the ElasticSearch results.
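A hedged sketch of that last step might look like this. The endpoint (`POST /api/case/<id>/artifact`), the payload keys, and the `{"boolean": ...}` layout of the custom field reflect my understanding of the TheHive 3 API and should be checked against the documentation for your version. The `fqdn` data type, the tag, and the function names are choices made for the example, not taken from the repo.

```python
import json
import urllib.request


def auto_enrichment_enabled(custom_fields):
    """True only if the autoEnrichment custom field is explicitly true.

    Assumes the layout {"autoEnrichment": {"boolean": true}}.
    """
    return (custom_fields.get("autoEnrichment") or {}).get("boolean") is True


def build_observable_request(thehive_url, api_key, case_id, hostname,
                             data_type="fqdn"):
    """Prepare a request that adds a hostname observable to a case."""
    payload = {
        "dataType": data_type,
        "data": hostname,
        "message": "Added automatically from DHCP logs",
        "tlp": 2,
        "ioc": False,
        "tags": ["auto-enrichment"],
    }
    return urllib.request.Request(
        url=f"{thehive_url.rstrip('/')}/api/case/{case_id}/artifact",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The responder would only send the request (for example with `urllib.request.urlopen`) when `auto_enrichment_enabled(...)` returns true, and otherwise just return the report.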

The actual code

Analyzers and responders usually consist of the following:

  • A requirements file (which defines which non-standard libraries are needed for the analyzer/responder to work)
  • A JSON file (defining the prerequisites for the responder/analyzer, such as which datatype it can work with)
  • The analyzer/responder itself (the actual code that performs the required operations)
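For orientation, the JSON definition file might look roughly like the output of the snippet below. The keys follow the format used by the public Cortex analyzers and responders as far as I know them; every value (names, paths, the configuration item) is illustrative and not copied from the repo.

```python
import json

# Hypothetical responder definition; all values are illustrative.
responder_definition = {
    "name": "DHCPLookup",
    "version": "1.0",
    "author": "Your Name",
    "url": "https://example.org/your-repo",
    "license": "AGPL-V3",
    "description": "Look up the hostname that held an IP address at a given time",
    # Object types the responder accepts; this value targets observables
    "dataTypeList": ["thehive:case_artifact"],
    # Entry-point script, relative to the responders directory
    "command": "DHCPLookup/DHCPResponder.py",
    "baseConfig": "DHCPLookup",
    "configurationItems": [
        {
            "name": "elasticsearch_url",
            "description": "Base URL of the ElasticSearch cluster",
            "type": "string",
            "multi": False,
            "required": True,
        }
    ],
}

print(json.dumps(responder_definition, indent=2))
```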

I, however, chose to split the actual analyzer/responder file into 3 separate files (DHCPResponder.py, DHCPConf.py, and DHCPCallScript.py).

The idea behind this is to separate the initialization, configurable items, and functionality, in an attempt to make the responder easier to maintain and easier to build upon, in case a need should arise for a similar responder that can handle other types of logs.

In keeping with the spirit of maintainability (and best practice), I have also tried to document the code with comments explaining the functionality and the thoughts behind each code section, so most of the code should be somewhat self-explanatory.

So, without further ado, here is a link to the GitHub repo with the code:

https://github.com/securitydistractions/ElasticSearch-CortexResponder

6 thoughts on “TheHive enrichment”

  1. Hello, I’m quite new to TheHive and I would like to ask if you could tell more about the installation process.
    I’m trying to customize your responder to get proxy information.

    1. Of course, but there is really not much to it.
      I really just spun up a VM with a Linux distro (I chose Debian, but that is not important), and followed the guide from the docs: https://github.com/TheHive-Project/TheHiveDocs/blob/master/installation/install-guide.md

      I know others have done great work using the training VM, which can be downloaded from here: https://github.com/TheHive-Project/TheHiveDocs/blob/master/training-material.md

      I seem to remember some issue with responders not being enabled by default, but I am not too sure about that, since it has been a while since I set up my lab environment. 🙂

      1. I’m sorry I confused you, I actually meant the installation process of the responder, not of TheHive itself.
        I’m kinda stuck at the installation of the requirements. Using python 3.7 and 2.7 I get this error:
        “ValueError: This extension should not be used with Python 2.6 or later (already built in), and has not been tested with Python 2.3.4 or earlier.”
        The question is why would you use such an old version of Python and how do I get it working?
        Also, it would be really nice of you if you could give me somewhat step-by-step instructions for the installation of your responder.

        1. Ahh, now I understand.

          Hmm, I use Python 3.7 too, but I must confess that I manually installed all the requirements while developing, so I did not test the requirements file. I have, however, just tested with a new Ubuntu VM and edited the requirements file to list only the libraries that were not installed by default (it seems that I had included some Python libraries that are installed by default, and that might be causing errors when attempting to install via pip3). Please try again with the updated requirements file from https://github.com/securitydistractions/ElasticSearch-CortexResponder/blob/master/requirements.txt

          If that does not work, I advise you to open a Python terminal and import each requirement (from the old requirements file) manually; if any of them are not found, try to install them manually using pip3.

          If this does not work for you, please reply with information on which Python library is causing the problem (and which Linux distro you are using), and I will try to find the root cause of the problem. 🙂

          1. Thank you for the quick responses! You’ve been of great help, I’ll try and figure out the rest

          2. Happy to help. Please feel free to write again if you need any more assistance, or have any other questions 🙂
