Sunday, September 25, 2016

IBM Qradar: How to import logs from an Amazon S3 compatible log source

Many vendors nowadays use the Amazon S3 API as the method for accessing and downloading their logs. Cisco is one example: they host their Cloud Web Security (CWS) product logs at vault.scansafe.com and use the Amazon S3 API to make the logs accessible to their users (other vendors include Hitachi, EMC Vcloud, and many more).

IBM Qradar has added support for the Amazon S3 API as a log source protocol, which allows Qradar to download logs from AWS services such as CloudTrail. However, we found that this protocol only works when the logs are stored on Amazon S3 itself, so it cannot be used with products such as Cisco CWS, where the logs are hosted on the vendor's own servers.

To add Cisco CWS as a log source for IBM Qradar, we used a Python script to download the logs over the S3 API to a local directory on the Qradar console, and then configured Qradar to automatically import the logs from that local directory.


In this blog post, I will walk through the steps needed to add S3-compatible log destinations as log sources in Qradar. The overall steps are as follows:

  1. Download the script to the Qradar server (console or log collector).

  2. Install dependencies for the script and configure the script parameters.

  3. Set up a cron job to run the script on a recurring schedule.

  4. Create a new log source in Qradar to pull the downloaded log files using SFTP.

Cisco support has a Python script available to automatically pull CWS logs from their servers (vault.scansafe.com). However, the script needed to be modified and a few features added before we could run it properly on Qradar (the original script can be requested from Cisco support):

  • Added statefulness to the script. Originally, the script downloaded every available log file on each run, which could mean pulling many gigabytes of data every time. We added the ability to save the timestamp of the last file downloaded, so that subsequent runs only download newly created log files.

  • Cleanup: once the log files have been downloaded and processed by Qradar, there is no need to keep them on the system, so the script was modified to delete files after a set number of hours.

  • Python 2.6 compatibility: the script was written for a newer Python version, but Qradar ships with Python 2.6. We had to replace the ‘with .. as’ statements, which Python 2.6 does not support for gzip file objects, with explicit open() and gzip.open() calls (and corresponding close() calls), as shown in the sketch below.
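
To illustrate that last change, here is a minimal sketch of the conversion (the file name example.gz is just a placeholder, not a file used by the actual script):

import gzip

# Original style: relies on gzip file objects supporting the context-manager
# protocol, which was only added in Python 2.7, so it fails on Qradar's 2.6
# with gzip.open('example.gz', 'rb') as gz:
#     data = gz.read()

# Python 2.6 compatible replacement: open and close the handle explicitly
gz = gzip.open('example.gz', 'rb')
data = gz.read()
gz.close()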

The full script can be found at the bottom of this post.


Script setup and configuration

  1. First, copy the script to Qradar using SCP. You can choose any directory you prefer (We placed it under /opt/qradar/bin/)

  2. Install the boto Python library (boto is the Python SDK for Amazon Web Services):

    1. Download boto-2.42.0.tar.gz from https://pypi.python.org/pypi/boto (this is an older release, but it is the one the script was written against, so we did not change it).

    2. Copy the file to Qradar.

    3. Uncompress the file using tar ('tar -xzvf boto-2.42.0.tar.gz').

    4. Install boto by running ‘python setup.py install’ from the directory the files were extracted to.

  3. Edit the script using vi and set the following values (a quick connectivity check using these settings is sketched after this list):

    1. endpoint: hostname of the endpoint hosting the logs (for Cisco CWS this is vault.scansafe.com).

    2. accessKey: provided by the vendor.

    3. secretKey: provided by the vendor.

    4. bucket: provided by the vendor.

    5. localPath: local path to which the files should be downloaded (we used /store/tmp/).

    6. hours: number of hours to keep the files after they have been downloaded. The default is one hour.

  4. Add execution permissions to the file ('chmod +x cws_script.py').

  5. Configure a cron job to run the script automatically. We set it to run every 5 minutes; as a rule, it should run more frequently than the interval at which Qradar imports the files. To configure cron to run the script every 5 minutes:

    1. Edit the scheduled cron jobs by running ‘crontab -e’

    2. Add the following line at the end:

*/5 * * * * /opt/qradar/bin/cws_script.py >> /var/log/cws-script.log 2>&1
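
Before relying on the cron job, it can be worth verifying the endpoint, credentials, and bucket from the Qradar shell. Below is a minimal sketch using boto with the same placeholder values as the script; substitute the parameters you configured above:

#!/usr/bin/env python
# Quick sanity check: connect to the S3-compatible endpoint and list a few objects.
# The credential and bucket values are placeholders and must be replaced.
import boto

endpoint = 'vault.scansafe.com'
accessKey = 'access key here'
secretKey = 'secret key here'
bucket = 'bucket id here'

conn = boto.connect_s3(accessKey, secretKey, host=endpoint)
myBucket = conn.get_bucket(bucket, validate=False)

# Print the first few object names with their last-modified timestamps
count = 0
for key in myBucket.list():
    print key.name, key.last_modified
    count = count + 1
    if count >= 5:
        break

If this prints a handful of object names, the same parameters should work in the main script.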


Create a new log source in Qradar


  1. From the Qradar Console go to Admin > Log Sources, and click Add.

  2. Select ‘Universal DSM’ for the ‘Log Source Type’, and select ‘Log File’ for the protocol.

  3. Choose ‘SFTP’, enter Qradar’s own IP address as the remote host, and enter the user/password details.

  4. Set the Remote Directory to the directory on Qradar to which the script downloads the log files.

  5. For ‘File Pattern’, enter a regex that matches the downloaded files. For Cisco CWS logs, we used ‘.*txt’.

  6. Set the recurrence to specify the interval at which Qradar will import the logs. We used 15M so that the log files are processed every 15 minutes.

Once configured and saved, you can verify operation by going to the ‘Log Activity’ tab, filtering on the newly added log source, and watching the events as they are imported. You can also check the log file specified in the cron entry to verify that the script is running properly.


Script


#!/usr/bin/env python

import gzip
import boto
import boto.s3.connection
import sys, os

from boto.s3.key import Key
from datetime import datetime, timedelta

# Required parameters

accessKey = 'access key here'
secretKey = 'secret key here'
bucket = 'bucket id here'
localPath = '/store/tmp/logfiles' # local path to download files to
endpoint = 'vault.scansafe.com'
hours = 1  #number of hours to keep downloaded files


# Optional parameters
extractLogs = True   # set to True or False. If set to True, the script will also extract the log files from the downloaded .gz archives
consolidateLogs = False   # set to True or False. If True, will consolidate content of all .gz archives into a single file, <bucket-id>.log

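# Make sure the local path ends with a trailing slash so file names can be appended to it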
if not localPath.endswith('/'):
        localPath = localPath + "/"

current_datetime = datetime.now()

print "======================================================="
print "Running at",current_datetime

# Get the date/time for the last downloaded file (stored in the file 'timestamp')

if os.path.exists(localPath+"timestamp"):
 last_download_timestamp = str(open(localPath+"timestamp",'r').read()).strip() 
 print "Last timestamp",last_download_timestamp
 last_download_timestamp = datetime.strptime(last_download_timestamp, '%Y-%m-%dT%H:%M:%S.000Z')
else:
 last_download_timestamp = ""


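# Connect to the S3-compatible endpoint using the vendor-supplied credentials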
s3Conn = boto.connect_s3(accessKey, secretKey, host=endpoint)
myBucket = s3Conn.get_bucket(bucket, validate=False)

print "Connected to CWS backend infrastructure..."
print "Downloading log files to " + localPath + "\n"

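# Walk the bucket listing and process only objects newer than the saved timestamp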
for myKey in myBucket.list():
 if (last_download_timestamp == "" or last_download_timestamp < datetime.strptime(myKey.last_modified, '%Y-%m-%dT%H:%M:%S.000Z')):
  print "{name}\t{size}\t{modified}".format(
   name = myKey.name,
   size = myKey.size,
   modified = myKey.last_modified,
   )

  #save the timestamp of the last file read
  timestamp = open(localPath+"timestamp",'w')
  timestamp.write(myKey.last_modified)
  timestamp.close()


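  # Download the object only if it is not already present locally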
  fileName = os.path.basename(str(myKey.key))
  if not os.path.exists(localPath + fileName):

   myKey.get_contents_to_filename(localPath + fileName)

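   # Optionally extract the plain-text log from the downloaded .gz archive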
   if extractLogs:
    mode = 'w'
    extractedFilename = fileName[:-3]

    if consolidateLogs:
     extractedFilename = bucket + ".log"
     mode = 'a'

    extractedLog = os.path.join(localPath, extractedFilename)

    gzipFile = gzip.open(localPath + fileName, 'rb')
    print "{name} extracted to {log_file}".format(
     name = myKey.name,
     log_file = extractedFilename,
     )
    unzippedFile = open(extractedLog, mode)
    unzippedFile.write(gzipFile.read())
    gzipFile.close()
    unzippedFile.close()

# Clean up downloaded files older than the specified number of hours,
# keeping the 'timestamp' state file so subsequent runs stay incremental
dnldFiles = os.listdir(localPath)
for file in dnldFiles:
 if file == "timestamp":
  continue
 if (datetime.now() - datetime.fromtimestamp(os.path.getmtime(localPath + file))) > timedelta(hours=hours):
  print "deleting file", file
  os.remove(localPath + file)

print "\nLog files download complete.\n"