Linux Logging Guide: Best Practices

Arfan Sharif - March 3, 2023

In previous posts in this series, we covered Linux logging basics, such as common file locations and the syslog protocol, and advanced concepts, such as rsyslog and how it works in conjunction with journald.

At scale, logging can lead to issues such as excessive logging, inconsistent formats, and a lack of context. However, by following Linux logging best practices, you can leverage logs more effectively and avoid many common pitfalls. In this post, we’ll explore these best practices, connecting pieces we’ve covered throughout our series while paving the way for integration with a centralized logging backend.

Are you ready? Let’s dive in.

Use log levels to distinguish severity

In Linux, log levels indicate the importance of a log event and offer a powerful way to distinguish between events. Each log statement should be assigned the level appropriate for the entry. Some common log levels include:

  • Trace: Includes every detail about application behavior. Use this level only during development and never in production.
  • Debug: Includes meaningful entries that aid in debugging issues; like trace, it should not be used in production.
  • Info: Used to understand system or user behavior.
  • Warn: Indicates a potential issue that could become an error.
  • Error: Indicates errors in the system that do not prevent it from functioning.
  • Critical/Fatal: Indicates a system failure that prevents normal operation from continuing.

All levels serve a specific function and can be a useful way to group events. It’s imperative to pay close attention to logs that indicate critical failures; these usually show that your application stopped providing service to its users.
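
As a minimal sketch, here is how these levels might be applied using Python’s standard logging module (which maps them to DEBUG, INFO, WARNING, ERROR, and CRITICAL and has no separate trace level); the logger name and messages are illustrative:

    import logging

    # At INFO, debug-level output is suppressed automatically.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logger = logging.getLogger("payments")

    logger.debug("retrying request, attempt %d", 2)      # hidden at INFO level
    logger.info("user %s logged in", "alice")
    logger.warning("disk usage at %d%%", 85)
    logger.error("payment failed for order %s", "A-1042")
    logger.critical("database unreachable; cannot continue")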

Avoid creating unnecessary logs

Although it may sound counterintuitive, too many logs can be a problem. Too many log messages make it difficult for a human to troubleshoot a production issue in the middle of the night because they’ll have to search through a lot of noise to find useful information. Similarly, automated processes may waste time and resources by parsing and analyzing useless messages. In addition, storing unimportant log messages incurs unnecessary costs. Therefore, it is best to avoid creating irrelevant log messages.

At the same time, you want enough logs, with sufficient information, to accomplish your tasks. For example, security information and event management (SIEM) is most effective when it has the necessary logs to assess potential threats.

Find a balance between the number of log messages produced and their quality. Using adequate log levels helps balance quantity and quality at different stages of software development and maintenance. For example, in a test environment, verbose logging may be helpful, but in production, the logs could be pared down to include only errors or critical issues.
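
As an illustrative sketch, the snippet below reads the level from a hypothetical LOG_LEVEL environment variable so the same code can log verbosely in test and sparsely in production:

    import logging
    import os

    # LOG_LEVEL is a hypothetical per-environment setting:
    # e.g., DEBUG in a test environment, ERROR in production.
    level_name = os.environ.get("LOG_LEVEL", "INFO")
    logging.basicConfig(level=getattr(logging, level_name.upper(), logging.INFO))

    logging.getLogger("app").debug("verbose detail, visible only when LOG_LEVEL=DEBUG")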

Distinguish between types of logs

Different logs serve different purposes, so it’s important to distinguish between them. For example, user access logs help track who accessed a certain system, while debug logs provide a clear understanding of the system’s full state.

Mixing different types of logs can make extracting valuable information difficult. Instead, split log messages by purpose so that each type can be analyzed independently. In practice, the distinction is not always straightforward; keep related log messages together while including enough shared context that relationships between different types of logs can still be established.
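
One possible approach, sketched here with Python’s standard library, is to route each log type to its own file while stamping both with a shared request ID (the file names and ID field are illustrative) so related events can still be joined later:

    import logging

    def make_logger(name, path):
        # One file per log type: access events vs. debug detail.
        logger = logging.getLogger(name)
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s [%(request_id)s] %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
        return logger

    access_log = make_logger("access", "access.log")
    debug_log = make_logger("debug", "debug.log")

    # The shared request_id lets the two logs be correlated later.
    ctx = {"request_id": "req-7f3a"}
    access_log.info("GET /reports 200", extra=ctx)
    debug_log.debug("cache miss for key reports:2023", extra=ctx)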

Rate-limit your logs

Some services might generate too many logs and exhaust resources (such as disk space). However, simply disabling logs or reducing the amount of information in them may render them unhelpful.

One possible solution is to enforce rate limiting on log production, slowing the rate at which messages are produced and helping you avoid system overload. Most libraries and services have rate-limiting features.
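
Python’s standard logging module has no built-in rate limiter, but as a rough sketch, a custom Filter can drop records once a per-interval budget is exhausted:

    import logging
    import time

    class RateLimitFilter(logging.Filter):
        """Drop records once more than `limit` arrive within `interval` seconds."""
        def __init__(self, limit=100, interval=1.0):
            super().__init__()
            self.limit, self.interval = limit, interval
            self.window_start, self.count = time.monotonic(), 0

        def filter(self, record):
            now = time.monotonic()
            if now - self.window_start > self.interval:
                self.window_start, self.count = now, 0   # new window
            self.count += 1
            return self.count <= self.limit              # False drops the record

    logger = logging.getLogger("chatty-service")
    logger.addFilter(RateLimitFilter(limit=100, interval=1.0))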

Use a consistent log format

It’s important not to write logs with manual print statements, as this approach is prone to errors and inconsistencies. Logging with printf() creates logs with inconsistent formats, making them hard to read and parse. For logs to be useful, a consistent format is essential. Every programming language has logging libraries available to help you create logs with an expected format (for example, Apache Log4j 2 or zerolog).

Using libraries will allow you to create messages that are easily consumed by other services. For example, producing messages in JSON format enables log parsers to consume and index messages in a straightforward manner.
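
As a minimal sketch using only Python’s standard library (dedicated JSON logging packages exist as well), a custom Formatter can render each record as one JSON object per line:

    import json
    import logging

    class JsonFormatter(logging.Formatter):
        def format(self, record):
            # One JSON object per line keeps output machine-parseable.
            return json.dumps({
                "time": self.formatTime(record),
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            })

    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.addHandler(handler)
    root.warning("low disk space")
    # -> {"time": "...", "level": "WARNING", "logger": "root", "message": "low disk space"}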

Using a consistent format also makes it easier for humans to read log messages, especially when troubleshooting. Having a standard date and time format, using the correct levels, and including the proper context will make your logs much more human-readable.

Add context to your log messages

Without adequate context, log messages are potentially useless. A log message that simply logs “Error” doesn’t communicate enough information to either a human or machine. Questions can arise, such as: What operation does the error refer to? When did the error occur? What exactly happened?

Adding context allows log events to be understood in relation to other events. To understand how a system arrived at a certain state, events often need to be correlated with one another. A consistent logging format and proper context in log messages greatly simplify these tasks.
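
As a sketch, Python’s logging module lets you attach contextual fields through the extra parameter; the user, operation, and request_id fields below are made up for illustration:

    import logging

    # "user", "operation", and "request_id" are illustrative context fields.
    logging.basicConfig(
        format="%(asctime)s %(levelname)s %(message)s "
               "user=%(user)s operation=%(operation)s request_id=%(request_id)s")
    logger = logging.getLogger("orders")

    logger.error("payment declined", extra={
        "user": "alice",
        "operation": "checkout",
        "request_id": "req-7f3a",
    })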

Avoid logging sensitive information

It’s critical for logs not to contain sensitive information. Exposing passwords or personally identifiable information (PII) in logs is a bad practice, and these security issues can carry legal implications.

It’s important to sanitize logs to avoid logging and exposing sensitive information; most standard libraries have features to help. Log scanners can also reveal sensitive information that slipped through, so it’s important to handle flagged logs accordingly.
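
As a rough sketch of sanitization, a custom Filter can mask values that look like credentials before a record is written; the regex below is illustrative, not exhaustive:

    import logging
    import re

    class RedactFilter(logging.Filter):
        PATTERN = re.compile(r"(password|token)=\S+", re.IGNORECASE)

        def filter(self, record):
            # Rewrite the message in place, masking sensitive values.
            record.msg = self.PATTERN.sub(r"\1=[REDACTED]", record.getMessage())
            record.args = None   # args are already merged into msg
            return True

    logger = logging.getLogger("auth")
    logger.addFilter(RedactFilter())
    logger.warning("login failed for user=%s password=%s", "alice", "hunter2")
    # -> login failed for user=alice password=[REDACTED]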

Additionally, ensure log file permissions are appropriate for their contents. Logs with highly sensitive information should have tighter file permissions and be shipped to a secure location rather than kept on the host.

Rotate your logs

If not properly handled, log files can grow uncontrollably. It’s important to configure log rotation ahead of time to deal with this. Log rotation ensures files don’t get too large, potentially breaking services due to full disks or exhausted memory.
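
Beyond system-level tools such as logrotate, many logging libraries can rotate files themselves. Here is a minimal sketch with Python’s RotatingFileHandler, where the file name and limits are illustrative:

    import logging
    from logging.handlers import RotatingFileHandler

    # Keep app.log under ~5 MB, retaining 3 rotated copies
    # (app.log.1, app.log.2, app.log.3); the oldest is deleted first.
    handler = RotatingFileHandler("app.log",
                                  maxBytes=5 * 1024 * 1024,
                                  backupCount=3)
    logger = logging.getLogger("app")
    logger.addHandler(handler)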

For details on setting up log rotation, check out the log rotation section in our Advanced Concepts post.

Use a centralized logging solution

Many enterprises have complex systems with complicated infrastructure and many services. With such complexity, it’s not feasible to debug or investigate logs by connecting to every server or component for manual checks. This process would be unbearable, even with automation in the form of scripts.

Centralized logging solutions, like CrowdStrike Falcon LogScale, offer a solution to this problem. LogScale provides a single place to analyze logs and a consistent way to query them, and it can handle massive scale. In addition, centralized logging solutions can offer other features, such as alerting or APIs, allowing you to leverage your logs even more.
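
Shipping logs is usually the job of a dedicated agent or log shipper, but as a minimal illustration, Python’s SysLogHandler can forward records to a central collector (logs.example.com is a placeholder):

    import logging
    from logging.handlers import SysLogHandler

    # logs.example.com is a placeholder for your central collector;
    # 514/UDP is the traditional syslog port.
    handler = SysLogHandler(address=("logs.example.com", 514))
    logger = logging.getLogger("app")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.info("forwarded to the central logging backend")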

In the next part of this series, we’ll expand on this concept by diving into how to leverage CrowdStrike Falcon LogScale as your Linux logging backend.

Log your data with CrowdStrike Falcon Next-Gen SIEM

Elevate your cybersecurity with the CrowdStrike Falcon® platform, the premier AI-native platform for SIEM and log management. Experience security logging at petabyte scale, choosing between cloud-native and self-hosted deployment options. Log your data with a powerful, index-free architecture, without bottlenecks, allowing threat hunting with over 1 PB of data ingestion per day. Ensure real-time search capabilities to outpace adversaries, achieving sub-second latency for complex queries. Benefit from 360-degree visibility, consolidating data to break down silos and enabling security, IT, and DevOps teams to hunt threats, monitor performance, and ensure compliance seamlessly across 3 billion events in less than 1 second.

Schedule Falcon Next-Gen SIEM Demo

GET TO KNOW THE AUTHOR

Arfan Sharif is a product marketing lead for the Observability portfolio at CrowdStrike. He has over 15 years of experience driving log management, ITOps, observability, security, and CX solutions for companies such as Splunk, Genesys, and Quest Software. Arfan graduated in Computer Science from Bucks and Chilterns University and has a career spanning product marketing and sales engineering.