How to Ingest Data into Falcon LogScale Using Python

February 23, 2023

This post covers how to ingest data into CrowdStrike Falcon® LogScale from your MacOS platform using Python. This guide is great for setting up a one-node proof of concept (POC) so you can take advantage of LogScale’s free trial.

Before you can write your ingest client, you must prepare a good foundation. That means preparing your MacOS instance via the following steps:

  • Download Homebrew 
  • Update your default MacOS Python
  • Install Python Package Manager
  • Download LogScale’s libraries

Ready? Let’s get started.

Prepare your MacOS instance

One of the methods of ingestion is to use LogScale’s software libraries that are available in a variety of languages. Today we’ll be working with Python and MacOS.

Step 1: Install Homebrew, a package manager for MacOS. Installing a new package with Homebrew takes a single command in Terminal, similar to installing packages on Linux. Follow the instructions on the Homebrew site.

Step 2: Use Homebrew to update your default MacOS Python. As you may know, MacOS 10.15 (Catalina) ships with Python 2.7 as its default, even though newer releases are available. MacOS relies on that default for essential functions, so it must stay at 2.7. We’ll install the latest version of Python alongside it and leave version 2.7 in place.

Follow these instructions from Matthew Broberg: The right and wrong way to set Python 3 as default on your Mac.

Update for Ventura: On MacOS 13.2.1 (Ventura), Python 3.9.x is available but isn’t installed by default. To get it, install the Xcode command line tools by running the following command in Terminal:

xcode-select --install

You can find more information here: Python3 now included with Ventura

Step 3: Once the appropriate version of Python is running on your Mac, we’ll need pip, the Python package manager, so that we can install LogScale’s client library. pip usually comes packaged with Python, in which case no additional installation step is needed.

To see if pip is installed, run the following command in your Terminal:

python -m pip --version

If pip is installed, the command prints the pip version along with the location it was loaded from.

Alternatively, you can install pip manually by opening Terminal and running the following commands. The first downloads the installer script and the second runs it:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py

Then run python -m pip --version again to verify pip is installed and you have the latest version.

Step 4: Install LogScale’s Python software library. The humiolib library is a wrapper for LogScale’s web API, supporting easy interaction with LogScale directly from Python. You can find more information in our humiolib GitHub repository.

You can start the install by running the following pip command in your Terminal:

pip install humiolib

This command prints a series of messages listing the packages being installed. Once installation is complete, you’ve finished the prep work for your MacOS instance. Now we can move on to the fun stuff.
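Before moving on, it can help to confirm from Python itself that the pieces are in place. The following is a minimal sketch, using only the standard library, that reports the interpreter version and whether pip and humiolib can be found. The function name `check_environment` is a hypothetical helper introduced here; it is not part of humiolib.

```python
import importlib.util
import sys


def check_environment(packages=("pip", "humiolib")):
    """Return a report on the interpreter and on whether each package is importable."""
    report = {
        # True when the running interpreter is Python 3 or newer.
        "python3": sys.version_info >= (3,),
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "executable": sys.executable,
    }
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If `humiolib` is reported as `False`, the interpreter running the script is probably not the one pip installed into, and the `executable` line shows which interpreter was used.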

Build your ingest client

It’s time to start writing an ingest client. A complete program needs only a few pieces: an import, a client, and the data to send.
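A minimal sketch of such a program follows, assuming humiolib is installed and that you substitute your own URL and ingest token. The function names `make_client`, `build_payloads`, and `send_demo_logs` are hypothetical helpers introduced for illustration; only `HumioIngestClient`, `ingest_json_data`, and `ingest_messages` come from humiolib.

```python
from datetime import datetime, timezone


def make_client(base_url, ingest_token):
    """Create a humiolib ingest client (imported lazily so the rest runs without it)."""
    from humiolib.HumioClient import HumioIngestClient

    return HumioIngestClient(base_url=base_url, ingest_token=ingest_token)


def build_payloads(now=None):
    """Build one structured and one unstructured payload."""
    now = now or datetime.now(timezone.utc)
    structured = [
        {
            # "example-host" is a made-up tag value for illustration.
            "tags": {"host": "example-host", "error_level": "INFO"},
            "events": [
                {
                    "timestamp": now.isoformat(),
                    "attributes": {"message": "Structured message"},
                }
            ],
        }
    ]
    unstructured = ["Unstructured message", "Hello Python World"]
    return structured, unstructured


def send_demo_logs(client):
    """Send both payloads through any object exposing the two ingest methods."""
    structured, unstructured = build_payloads()
    client.ingest_json_data(structured)
    client.ingest_messages(unstructured)


# Usage (placeholders, not real values):
# client = make_client("The url where LogScale resides", "An API token from LogScale")
# send_demo_logs(client)
```

Passing the client in as an argument keeps the payload-building code independent of the network, which makes it easy to try out with a stand-in object before pointing it at a real LogScale instance.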

Let’s break down some of the pieces of the code.

At a minimum, you’ll need to import humiolib to run the code that sends logs to LogScale.

from humiolib.HumioClient import HumioIngestClient

You’ll also need to create an ingest client with attributes that tell the client where to ship your logs.

client = HumioIngestClient(
    base_url="The url where LogScale resides",
    ingest_token="An API token from LogScale"
)

You can retrieve the ingest token from your LogScale instance.
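Rather than hard-coding the URL and token, one option is to read them from environment variables. The sketch below assumes two hypothetical variable names, `LOGSCALE_URL` and `LOGSCALE_INGEST_TOKEN`. Neither name is defined by LogScale or humiolib, so pick whatever suits your setup.

```python
import os


def load_settings(env=None):
    """Read client settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    settings = {
        # Hypothetical variable names; adjust to your own conventions.
        "base_url": env.get("LOGSCALE_URL", "").strip(),
        "ingest_token": env.get("LOGSCALE_INGEST_TOKEN", "").strip(),
    }
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return settings
```

The returned dictionary uses the same keys as the client’s keyword arguments, so it can be passed along as `HumioIngestClient(**load_settings())`.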

Structured log messages

There are two types of messages you can send to LogScale: structured and unstructured.

In most of our use cases, LogScale receives structured data as a JSON object. There’s no strict format as to how the JSON object is structured, but you do need to ensure the JSON object is valid. You can check the structure of a JSON object using a tool like JSONLint.
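You can also run this check in Python before sending anything. The sketch below uses the standard `json` module to confirm that a payload serializes to valid JSON; `is_valid_json` is a hypothetical helper name.

```python
import json


def is_valid_json(payload):
    """Return True if payload can be serialized to JSON and parsed back."""
    try:
        # allow_nan=False rejects NaN and Infinity, which are not valid JSON.
        json.loads(json.dumps(payload, allow_nan=False))
    except (TypeError, ValueError):
        return False
    return True


print(is_valid_json([{"tags": {"host": "example-host"}, "events": []}]))
print(is_valid_json([{"tags": {"host": {1, 2, 3}}}]))  # a set is not JSON-serializable
```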

Additionally, with structured data, you can send a valid timestamp as part of the log entry, and LogScale will use that timestamp instead of inserting one of its own. Make sure the timestamp is less than 24 hours older than the time the entry is sent. Otherwise, LogScale will assume it’s older data and drop the log entry without an error message.
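One way to guard against that silent drop is to check each timestamp before sending. The sketch below treats the 24-hour window described above as a parameter. `is_within_window` is a hypothetical helper, and it assumes ISO 8601 timestamps that include a UTC offset.

```python
from datetime import datetime, timedelta, timezone


def is_within_window(timestamp, now=None, max_age=timedelta(hours=24)):
    """Return True if an ISO 8601 timestamp is less than max_age old."""
    event_time = datetime.fromisoformat(timestamp)
    if event_time.tzinfo is None:
        # Without an offset the age cannot be computed reliably.
        raise ValueError("timestamp must include a UTC offset")
    now = now or datetime.now(timezone.utc)
    return now - event_time < max_age
```

Entries that fail the check can be logged locally or re-stamped, depending on whether the original event time matters for your use case.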

Below is an example of structured data:

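from datetime import datetime
from pytz import timezone  # assumption: timezone() here is pytz's (pip install pytz)

structured_data = [
    {
        # Tag values below are placeholders; replace them with real strings.
        "tags": {
            "host": "str(ip)",
            "host_name": "str(host)",
            "filename": "str(caller.filename)",
            "line": "str(caller.lineno)",
            "error_level": "INFO"
        },
        "events": [
            {
                # ISO 8601 timestamp that includes the UTC offset
                "timestamp": datetime.now(timezone("EST")).isoformat(),
                "attributes": {
                    "message": "Structured message"
                }
            }
        ]
    }
]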

Once the structured data is validated, you can send it to LogScale using the following method, where the variable structured_data is the object you created above to store your JSON:

client.ingest_json_data(structured_data)
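The quoted tag values in the example payload, such as "str(ip)", appear to be placeholders. A minimal sketch of filling them with real values might look like the following, assuming the standard `socket` and `inspect` modules are an acceptable way to find the host name and the calling line. `build_structured_event` is a hypothetical helper, not a humiolib function, and it leaves out the IP address tag.

```python
import inspect
import socket
from datetime import datetime, timezone


def build_structured_event(message, level="INFO"):
    """Build a one-event payload tagged with the host and the caller's location."""
    # f_back is the frame of whoever called this function.
    caller = inspect.getframeinfo(inspect.currentframe().f_back, context=0)
    return [
        {
            "tags": {
                "host_name": socket.gethostname(),
                "filename": str(caller.filename),
                "line": str(caller.lineno),
                "error_level": level,
            },
            "events": [
                {
                    "timestamp": datetime.now(timezone.utc).isoformat(),
                    "attributes": {"message": message},
                }
            ],
        }
    ]


structured_data = build_structured_event("Structured message")
print(structured_data[0]["tags"])
```

The result has the same shape as the hand-written payload, so it can be passed to `client.ingest_json_data` in the same way.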

Support for unstructured data

Alternatively, you can send unstructured data to LogScale. Unstructured data is timestamped at ingestion because each message is just a plain string in a list. Thus, any timestamp you include in the message text has no impact on the ingestion timestamp. Below is an example of unstructured data:

unstructured_data = ["Unstructured message", "Hello Python World", datetime.now(timezone("EST")).isoformat()]

You can send it to LogScale using the following function where unstructured_data is the object that contains your message. Please note the differences in the syntax between ingesting structured and unstructured data.

client.ingest_messages(unstructured_data)
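In practice, unstructured messages often come from an existing log file. The following sketch, under that assumption, reads the non-empty lines of a file and sends them in batches. The helper names and the batch size of 100 are hypothetical choices, and `client` is any object with an `ingest_messages` method, such as the `HumioIngestClient` created earlier.

```python
def read_messages(path):
    """Return the non-empty lines of a text file, without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def batches(items, size=100):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_file(client, path, size=100):
    """Send every non-empty line of the file; return the number of lines sent."""
    messages = read_messages(path)
    for batch in batches(messages, size):
        client.ingest_messages(batch)
    return len(messages)
```

Sending in batches keeps each request to a bounded size; the right batch size depends on your log volume and is left as a parameter here.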

If you completed all the steps above, you should start seeing messages appearing in your LogScale instance. Happy logging!

 
