Quarkus logging splunk

Introduction

Splunk is a middleware solution that receives, stores, and indexes application logs, and makes them available for searching and analysis.

This Quarkus extension integrates the official Splunk client library to index log events through the HTTP Event Collector (HEC) provided by Splunk Enterprise.

  • The official client is an open-source library available here.

  • The documentation of the HTTP Event Collector can be found here.

Installation

To use this extension, add the quarkus-logging-splunk dependency to your pom.xml file:

<dependency>
    <groupId>io.quarkiverse.logging.splunk</groupId>
    <artifactId>quarkus-logging-splunk</artifactId>
    <version>{project-version}</version>
</dependency>

Features

The extension can be used transparently with any log frontend used by Quarkus (Log4j, SLF4J, …).

Log message formatting

By default, the log message format is the same as that of the Quarkus console handler:

quarkus.log.handler.splunk.format="%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p [%c{3.}] (%t) %s%e%n"

This can be adapted to avoid duplicating metadata that is already passed in a structured way.
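As a minimal sketch of such an adaptation, the fragment below assumes a structured (non-raw) serialization and moves the thread and logger names out of the formatted message into structured metadata. It only uses properties documented on this page; whether to keep %e in the format depends on how much exception detail is wanted in the message itself.

```properties
# Keep only the message and the exception in the formatted payload
quarkus.log.handler.splunk.format=%s%e
# Send thread and logger names as structured metadata instead of inlining them
quarkus.log.handler.splunk.include-thread-name=true
quarkus.log.handler.splunk.include-logger-name=true
```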

Log event metadata

The type of metadata depends on the serialization format.

If quarkus.log.handler.splunk.raw is enabled or quarkus.log.handler.splunk.serialization is raw, there is no per-event metadata. Only a few global metadata values, shared by all events of a batch, are sent via HTTP headers and query parameters.

In other cases, the extension uses structured logging, via JSON serialization. There are two supported structured formats:

  • The nested serialization is the default format of the Splunk HEC Java client and defines the names of some pre-defined metadata. Combined with quarkus.log.handler.splunk.format=%s%e, it also supports log messages that are themselves JSON.

  • The flat serialization is a simpler and more generic format, also used by the OpenTelemetry Splunk HEC exporter.

Some metadata can be indexed by Splunk; see indexed fields. The default _json source type indexes metadata passed in the fields object.

The extension supports the resolution of MDC-scoped properties, as defined in JBoss supported formatters.
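For illustration only, assuming the application puts a hypothetical MDC key named request-id, the %X{...} symbol of the JBoss formatter could inline that value into the formatted message. The key name is an assumption; the rest of the pattern is the default format shown above.

```properties
# request-id is a hypothetical MDC key set by the application
quarkus.log.handler.splunk.format=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p [%c{3.}] (%t) [%X{request-id}] %s%e%n
```

With a structured serialization, MDC properties are also sent as metadata, so inlining them in the message is optional.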

Metadata by serialization format (nested and flat)

HEC metadata (nested and flat)

time and host are always sent. source, sourcetype and index are sent if not empty.

Pre-defined metadata (nested)

Only event.severity is sent by default. Other metadata can be added:

  • event.thread via quarkus.log.handler.splunk.include-thread-name

  • event.exception via quarkus.log.handler.splunk.include-exception

  • event.logger via quarkus.log.handler.splunk.include-logger-name

Pre-defined metadata (flat)

Only fields.severity is sent by default. The metadata name can be customized via quarkus.log.handler.splunk.metadata-severity-field-name. Other metadata can be added:

  • fields.thread via quarkus.log.handler.splunk.include-thread-name

  • fields.exception via quarkus.log.handler.splunk.include-exception

  • fields.logger via quarkus.log.handler.splunk.include-logger-name

MDC properties

  • nested: passed via event.properties

  • flat: passed via fields

Static metadata (nested and flat)

Passed via fields

A structured query to Splunk HEC looks like:

curl -k -v -X POST https://localhost:8080/services/collector/event/1.0 -H "Content-type: application/json; profile=\"urn:splunk:event:1.0\"; charset=utf-8" -H "Authorization: Splunk 29fe2838-cab6-4d17-a392-37b7b8f41f75" -d@events.json

Nested serialization example:

{
  "time": "1673001538.042",
  "host": "hostname",
  "source": "mysource",
  "sourcetype": "_json",
  "index": "main",
  "event": {
    "message": "2023-01-06 ERROR The log message",
    "logger": "com.acme.MyClass",
    "severity": "ERROR",
    "exception": "java.lang.NullPointerException",
    "properties": {
      "mdc-key": "mdc-value"
    }
  },
  "fields": {
    "key": "static-value"
  }
}

Flat serialization example:

{
  "time": "1673001538.042",
  "host": "hostname",
  "source": "mysource",
  "index": "main",
  "event": "2023-01-06 ERROR The log message",
  "fields": {
    "severity": "ERROR",
    "mdc-key": "mdc-value",
    "key": "static-value"
  }
}
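
A configuration that could plausibly produce the flat payload above is sketched here. The metadata-source property name is derived from the environment variable listed in the configuration reference, and the metadata-fields.key syntax assumes the usual Quarkus convention for map-typed properties; both should be checked against the extension version in use.

```properties
# Select the flat JSON serialization
quarkus.log.handler.splunk.serialization=flat
# HEC metadata, sent only when not empty
quarkus.log.handler.splunk.metadata-source=mysource
quarkus.log.handler.splunk.metadata-index=main
# Static metadata, passed via the fields object ("key" is an illustrative name)
quarkus.log.handler.splunk.metadata-fields.key=static-value
```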

Connectivity failures

Batched events that cannot be sent to the Splunk indexer will be logged to stdout:

  • Formatted using console handler settings if the console handler is enabled

  • Formatted using splunk handler settings otherwise

In any case, the root cause of the failure is always logged to stderr.

Asynchronous handler

By default, the log handler is synchronous, and only the HTTP requests to the HEC endpoint are made asynchronously:

[Diagram: synchronous handler]

This can be an issue because the #send method of the Splunk library is synchronized, so any preprocessing of the batch HTTP request happens on the application thread of the log event that filled the batch (by reaching either quarkus.log.handler.splunk.batch-size-count or quarkus.log.handler.splunk.batch-size-bytes).

By enabling quarkus.log.handler.splunk.async=true, an intermediate event queue is used, which decouples the flushing of the batch from any application thread:

[Diagram: asynchronous handler]

By default, quarkus.log.handler.splunk.async.overflow=block, so application threads block once the queue has reached the limit set by quarkus.log.handler.splunk.async.queue-length.

There’s no link between quarkus.log.handler.splunk.async.queue-length and quarkus.log.handler.splunk.batch-size-count.
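
The following fragment sketches an asynchronous setup. The values are illustrative rather than recommendations, and the async.overflow property name is derived from the environment variable listed in the configuration reference.

```properties
# Decouple batch flushing from application threads
quarkus.log.handler.splunk.async=true
# Size of the intermediate event queue (the default is 512)
quarkus.log.handler.splunk.async.queue-length=1024
# Drop events instead of blocking application threads when the queue is full
quarkus.log.handler.splunk.async.overflow=discard
```

Choosing discard trades completeness of the logs for application latency; the default block does the opposite.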

Sequential and parallel modes

The number of events kept in memory for batching purposes is not limited. After tuning quarkus.log.handler.splunk.batch-size-count and quarkus.log.handler.splunk.batch-size-bytes, in case the HEC endpoint cannot keep up with the batch throughput, using multiple HTTP connections might help to reduce memory usage on the client.

By setting quarkus.log.handler.splunk.send-mode=parallel, multiple batches are sent over the wire in parallel, potentially increasing throughput with the HEC endpoint.
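
A possible tuning along those lines is sketched below. The batch sizes are arbitrary illustrative values, not measured recommendations.

```properties
# Larger batches: flush after 100 events or 64 KB, whichever comes first
quarkus.log.handler.splunk.batch-size-count=100
quarkus.log.handler.splunk.batch-size-bytes=65536
# Send batches over multiple HTTP connections
quarkus.log.handler.splunk.send-mode=parallel
```

As noted in the configuration reference, parallel mode means events with the same millisecond timestamp may be indexed out of order.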

Named Splunk log handlers

Named log handlers make it possible to use several Splunk configurations, each for particular log emissions. Splunk named handlers are configured in the same way as the core Quarkus handlers (console, file or syslog):

# Global configuration
quarkus.log.handler.splunk.token=12345678-1234-1234-1234-1234567890AB
quarkus.log.handler.splunk.metadata-index=mylogindex

# Splunk named handler configuration, named here MONITORING
quarkus.log.handler.splunk."MONITORING".token=12345678-0000-0000-0000-1234567890AB
quarkus.log.handler.splunk."MONITORING".metadata-index=mystatsindex

# Registration of the custom handler through Quarkus core category management, here monitoring as the logging category
quarkus.log.category."monitoring".handlers=MONITORING
quarkus.log.category."monitoring".use-parent-handlers=false

To use such a logger in application code, you can rely on either an annotation or the factory method:

  • With annotation:

@LoggerName("monitoring")
Logger monitoringLogger;

  • With factory:

static final Logger monitoringLogger = Logger.getLogger("monitoring");

Some important considerations

  • Every handler is isolated and uses a separate Splunk client and connection pool, so each additional handler has a resource cost.

  • The configuration of the root handler is not inherited by named handlers.

  • Setting quarkus.log.category."named-handler".use-parent-handlers=false is required if you do not want the root handler to also receive log events already sent to named handlers.
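
Because nothing is inherited from the root handler, settings such as the endpoint URL have to be repeated for each named handler. A minimal sketch follows, in which splunk.example.com is a hypothetical host and the url property name is derived from the environment variable listed in the configuration reference:

```properties
# Root handler
quarkus.log.handler.splunk.url=https://splunk.example.com:8088/
quarkus.log.handler.splunk.token=12345678-1234-1234-1234-1234567890AB

# Named handler: url and token must be set again, they are not inherited
quarkus.log.handler.splunk."MONITORING".url=https://splunk.example.com:8088/
quarkus.log.handler.splunk."MONITORING".token=12345678-0000-0000-0000-1234567890AB
```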

Extension Configuration Reference

This extension follows the log handler configuration domain defined by Quarkus: every configuration property of this extension belongs to the configuration root quarkus.log.handler.splunk.

When present, this extension is enabled by default, meaning the client expects a valid connection to a Splunk indexer and prints an error message for every log created by the application.

In a local environment, the log handler can therefore be disabled with the following property:

quarkus.log.handler.splunk.enabled=false
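
One way to limit this to local work, assuming the standard Quarkus configuration profiles (dev and test) are in use, is to prefix the property with a profile name so that the handler stays enabled in production:

```properties
# Disable the Splunk handler only in dev mode and during tests
%dev.quarkus.log.handler.splunk.enabled=false
%test.quarkus.log.handler.splunk.enabled=false
```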

Every configuration property of the extension is overridable at runtime.


Each entry below gives the description of a configuration property, followed by its environment variable, its type, and its default value (when one is defined).

Determine whether to enable the handler.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_ENABLED
Type: boolean
Default: true

The Splunk handler log level. By default, it is no more strict than the root handler level.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_LEVEL
Type: Level
Default: ALL

Splunk HEC endpoint base URL.

With raw events, the endpoint targeted is /services/collector/raw. With flat or nested JSON events, the endpoint targeted is /services/collector/event/1.0.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_URL
Type: string
Default: https://localhost:8088/

Disable TLS certificate validation with the HEC endpoint.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_DISABLE_CERTIFICATE_VALIDATION
Type: boolean
Default: false

The application token to authenticate with HEC. The token is mandatory if the extension is enabled. See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#HEC_token

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_TOKEN
Type: string

The strategy to send events to HEC.

In sequential mode, there is only one HTTP connection to HEC and the order of events is preserved, but performance is lower. In parallel mode, event batches are sent asynchronously over multiple HTTP connections, and events with the same timestamp (which has a 1 millisecond resolution) may be indexed out of order by Splunk.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_SEND_MODE
Type: sequential, parallel
Default: sequential

A GUID to identify an HEC client and guarantee isolation at HEC level in case of slow clients. See https://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHECIDXAck#About_channels_and_sending_data

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_CHANNEL
Type: string

Batching delay before sending a group of events. If 0, the events are sent immediately.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_BATCH_INTERVAL
Type: Duration
Default: 10S

Maximum number of events in a batch. By default 10, if 0 no batching.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_BATCH_SIZE_COUNT
Type: long
Default: 10

Maximum total size in bytes of events in a batch. By default 10KB, if 0 no batching.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_BATCH_SIZE_BYTES
Type: long
Default: 10240

Maximum number of retries in case of I/O exceptions with the HEC connection.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_MAX_RETRIES
Type: long
Default: 0

The log format, defining which metadata are inlined inside the log main payload.

Specific metadata (hostname, category, thread name, …), as well as the MDC key/value map, can also be sent in a structured way.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_FORMAT
Type: string
Default: %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p [%c{3.}] (%t) %s%e%n

Whether to send the thrown exception message as a structured metadata of the log event (as opposed to %e in a formatted message, it does not include the exception name or stacktrace). Only applicable to 'nested' serialization.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_INCLUDE_EXCEPTION
Type: boolean
Default: false

Whether to send the logger name as a structured metadata of the log event (equivalent of %c in a formatted message). Only applicable to 'nested' serialization.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_INCLUDE_LOGGER_NAME
Type: boolean
Default: false

Whether to send the thread name as a structured metadata of the log event (equivalent of %t in a formatted message). Only applicable to 'nested' serialization.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_INCLUDE_THREAD_NAME
Type: boolean
Default: false

Overrides the host name metadata value.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_METADATA_HOST
Type: string
Default: The equivalent of %h in a formatted message

The source value to assign to the event data. For example, if you’re sending data from an app you’re developing, you could set this key to the name of the app. See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_METADATA_SOURCE
Type: string

The optional format of the events, to enable some parsing on the Splunk side. See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

A given source type may have indexed fields extraction enabled, which is the case of the built-in _json used for nested serialization.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_METADATA_SOURCE_TYPE
Type: string
Default: _json for nested serialization, not set otherwise

The optional name of the index by which the event data is to be stored. If set, it must be within the list of allowed indexes of the token (if it has the indexes parameter set). See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_METADATA_INDEX
Type: string

The name of the key used to convey the severity / log level in the metadata fields. Only applicable to 'flat' serialization. With 'nested' serialization, there is already a 'severity' field.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_METADATA_SEVERITY_FIELD_NAME
Type: string
Default: severity

The format of the payload.

  • With raw serialization, the log message is sent 'as is' in the HTTP body. Metadata can only be common to a whole batch and are sent via HTTP parameters.

  • With nested serialization, the log message is sent into a 'message' field of a JSON structure which also contains dynamic metadata.

  • With flat serialization, the log message is sent into the root 'event' field. Dynamic metadata is sent via the 'fields' root object.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_SERIALIZATION
Type: raw, nested, flat
Default: nested

Indicates whether to log asynchronously.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_ASYNC
Type: boolean
Default: false

The queue length to use before flushing writing.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_ASYNC_QUEUE_LENGTH
Type: int
Default: 512

Determine whether to block the publisher (rather than drop the message) when the queue is full.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_ASYNC_OVERFLOW
Type: block, discard
Default: block

Optional static key/value pairs to populate the "fields" key of event metadata. This isn’t applicable to raw serialization. See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK_METADATA_FIELDS
Type: Map<String,String>

Named handler properties (in the environment variable names, NAMED_HANDLERS stands for the name of the handler):

Determine whether to enable the handler.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__ENABLED
Type: boolean
Default: true

The Splunk handler log level. By default, it is no more strict than the root handler level.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__LEVEL
Type: Level
Default: ALL

Splunk HEC endpoint base URL.

With raw events, the endpoint targeted is /services/collector/raw. With flat or nested JSON events, the endpoint targeted is /services/collector/event/1.0.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__URL
Type: string
Default: https://localhost:8088/

Disable TLS certificate validation with the HEC endpoint.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__DISABLE_CERTIFICATE_VALIDATION
Type: boolean
Default: false

The application token to authenticate with HEC. The token is mandatory if the extension is enabled. See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#HEC_token

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__TOKEN
Type: string

The strategy to send events to HEC.

In sequential mode, there is only one HTTP connection to HEC and the order of events is preserved, but performance is lower. In parallel mode, event batches are sent asynchronously over multiple HTTP connections, and events with the same timestamp (which has a 1 millisecond resolution) may be indexed out of order by Splunk.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__SEND_MODE
Type: sequential, parallel
Default: sequential

A GUID to identify an HEC client and guarantee isolation at HEC level in case of slow clients. See https://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHECIDXAck#About_channels_and_sending_data

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__CHANNEL
Type: string

Batching delay before sending a group of events. If 0, the events are sent immediately.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__BATCH_INTERVAL
Type: Duration
Default: 10S

Maximum number of events in a batch. By default 10, if 0 no batching.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__BATCH_SIZE_COUNT
Type: long
Default: 10

Maximum total size in bytes of events in a batch. By default 10KB, if 0 no batching.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__BATCH_SIZE_BYTES
Type: long
Default: 10240

Maximum number of retries in case of I/O exceptions with the HEC connection.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__MAX_RETRIES
Type: long
Default: 0

The log format, defining which metadata are inlined inside the log main payload.

Specific metadata (hostname, category, thread name, …), as well as the MDC key/value map, can also be sent in a structured way.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__FORMAT
Type: string
Default: %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p [%c{3.}] (%t) %s%e%n

Whether to send the thrown exception message as a structured metadata of the log event (as opposed to %e in a formatted message, it does not include the exception name or stacktrace). Only applicable to 'nested' serialization.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__INCLUDE_EXCEPTION
Type: boolean
Default: false

Whether to send the logger name as a structured metadata of the log event (equivalent of %c in a formatted message). Only applicable to 'nested' serialization.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__INCLUDE_LOGGER_NAME
Type: boolean
Default: false

Whether to send the thread name as a structured metadata of the log event (equivalent of %t in a formatted message). Only applicable to 'nested' serialization.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__INCLUDE_THREAD_NAME
Type: boolean
Default: false

Overrides the host name metadata value.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__METADATA_HOST
Type: string
Default: The equivalent of %h in a formatted message

The source value to assign to the event data. For example, if you’re sending data from an app you’re developing, you could set this key to the name of the app. See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__METADATA_SOURCE
Type: string

The optional format of the events, to enable some parsing on the Splunk side. See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

A given source type may have indexed fields extraction enabled, which is the case of the built-in _json used for nested serialization.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__METADATA_SOURCE_TYPE
Type: string
Default: _json for nested serialization, not set otherwise

The optional name of the index by which the event data is to be stored. If set, it must be within the list of allowed indexes of the token (if it has the indexes parameter set). See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__METADATA_INDEX
Type: string

Optional static key/value pairs to populate the "fields" key of event metadata. This isn’t applicable to raw serialization. See https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector#Event_metadata

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__METADATA_FIELDS
Type: Map<String,String>

The name of the key used to convey the severity / log level in the metadata fields. Only applicable to 'flat' serialization. With 'nested' serialization, there is already a 'severity' field.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__METADATA_SEVERITY_FIELD_NAME
Type: string
Default: severity

The format of the payload.

  • With raw serialization, the log message is sent 'as is' in the HTTP body. Metadata can only be common to a whole batch and are sent via HTTP parameters.

  • With nested serialization, the log message is sent into a 'message' field of a JSON structure which also contains dynamic metadata.

  • With flat serialization, the log message is sent into the root 'event' field. Dynamic metadata is sent via the 'fields' root object.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__SERIALIZATION
Type: raw, nested, flat
Default: nested

Indicates whether to log asynchronously.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__ASYNC
Type: boolean
Default: false

The queue length to use before flushing writing.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__ASYNC_QUEUE_LENGTH
Type: int
Default: 512

Determine whether to block the publisher (rather than drop the message) when the queue is full.

Environment variable: QUARKUS_LOG_HANDLER_SPLUNK__NAMED_HANDLERS__ASYNC_OVERFLOW
Type: block, discard
Default: block

About the Duration format

The format for durations uses the standard java.time.Duration format. You can learn more about it in the Duration#parse() javadoc.

You can also provide duration values starting with a number. In this case, if the value consists only of a number, the converter treats the value as seconds. Otherwise, PT is implicitly prepended to the value to obtain a standard java.time.Duration format.
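
Applied to the batching delay, the rules above could look like the following sketch. The batch-interval property name is derived from the environment variable listed in the configuration reference, and only one of the alternatives should be active at a time.

```properties
# A bare number is treated as seconds: 30 seconds
quarkus.log.handler.splunk.batch-interval=30

# Starts with a number but is not only a number: PT is prepended, giving PT10S (10 seconds)
#quarkus.log.handler.splunk.batch-interval=10S

# Already in the standard java.time.Duration format: 1 minute
#quarkus.log.handler.splunk.batch-interval=PT1M
```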