@cosmo0920
Contributor

@cosmo0920 cosmo0920 commented Jan 26, 2026

Currently, in_splunk does not record the remote address of incoming requests.
This makes it inconvenient to trace events back to the client that sent them.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Optional per-request remote-address injection into emitted events, with configurable record key.
    • Client address is extracted from X-Forwarded-For header with fallback to connection peer; per-request address is cleared after handling to avoid leakage.
  • Tests

    • Added a runtime test validating X-Forwarded-For extraction and inclusion of the remote address.
  • Chores

    • Enhanced commit-prefix validation logic and adjusted related unit test expectations.

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
@coderabbitai
coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

Adds optional per-request remote address extraction (from X-Forwarded-For or connection peer), threads that address through HTTP handlers and payload processors to append it into emitted records, exposes config options to enable/name the field, initializes per-request storage, and adds an integration test for XFF extraction.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Configuration & Core: plugins/in_splunk/splunk.c, plugins/in_splunk/splunk.h, plugins/in_splunk/splunk_config.c | Add config options add_remote_addr and remote_addr_key; add struct fields add_remote_addr, remote_addr_key, current_remote_addr, current_remote_addr_len; initialize and reference new fields. |
| Protocol & Payload Handling: plugins/in_splunk/splunk_prot.c, plugins/in_splunk/splunk_prot.h | Introduce SPLUNK_XFF_HEADER; add static helpers for header lookup and XFF/peer extraction; resolve remote_addr at request entry; thread remote_addr through handlers and payload processors; update internal function signatures to accept remote_addr/length and clear per-request storage after use. |
| Tests: tests/runtime/in_splunk.c | Add integration test flb_test_splunk_xff_extract() that posts with X-Forwarded-For and verifies the injected remote address in emitted output; add the header include the test needs. |
| CI / Commit Lint: .github/scripts/commit_prefix_check.py, .github/scripts/tests/test_commit_lint.py | Refine commit-prefix inference for tests/* paths to derive file-based prefixes and emit both component and tests: prefixes when applicable; update unit test expectations accordingly. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant H as Splunk HTTP Handler
    participant P as Header Parser
    participant Peer as Connection Peer
    participant L as Payload Processor
    participant E as Emitter

    Client->>H: POST /services/collector/event (headers + payload)
    H->>P: lookup "x-forwarded-for"
    alt XFF present
        P-->>H: XFF value (remote_addr)
    else XFF missing
        H->>Peer: get peer address
        Peer-->>H: peer address (remote_addr)
    end
    H->>H: store current_remote_addr
    H->>L: process payload (pass remote_addr)
    L->>L: append remote_addr to records
    L->>E: emit records with injected remote_addr
    H->>H: clear current_remote_addr
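
For illustration, the XFF-then-peer resolution in the diagram could be sketched as below. This is a minimal sketch and not the PR's code: first_xff_entry() and resolve_remote_addr() are hypothetical names, the peer address is passed in as a plain string, and it assumes the left-most X-Forwarded-For entry is the one treated as the client address, which the PR's parser may or may not do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: pick the first comma-separated entry of an
 * X-Forwarded-For value and trim surrounding spaces and tabs.
 * Returns 0 and sets *out / *out_len on success, -1 if no usable entry
 * exists. The returned pointer borrows from xff; nothing is allocated.
 */
static int first_xff_entry(const char *xff, size_t xff_len,
                           const char **out, size_t *out_len)
{
    size_t start = 0;
    size_t end;

    assert(out != NULL && out_len != NULL);

    if (xff == NULL || xff_len == 0) {
        return -1;
    }

    /* skip leading whitespace */
    while (start < xff_len && (xff[start] == ' ' || xff[start] == '\t')) {
        start++;
    }

    /* the first entry ends at the first comma or at the end of the value */
    end = start;
    while (end < xff_len && xff[end] != ',') {
        end++;
    }

    /* trim trailing whitespace */
    while (end > start && (xff[end - 1] == ' ' || xff[end - 1] == '\t')) {
        end--;
    }

    if (end == start) {
        return -1;
    }

    *out = xff + start;
    *out_len = end - start;
    return 0;
}

/* Hypothetical resolver: prefer XFF, fall back to the peer address string. */
static int resolve_remote_addr(const char *xff, size_t xff_len,
                               const char *peer,
                               const char **out, size_t *out_len)
{
    if (first_xff_entry(xff, xff_len, out, out_len) == 0) {
        return 0;
    }
    if (peer != NULL && peer[0] != '\0') {
        *out = peer;
        *out_len = strlen(peer);
        return 0;
    }
    return -1;
}
```

Both results are borrowed pointers with an explicit length, so the caller must not free them and must not assume the XFF result is NUL-terminated at out_len.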

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

Suggested reviewers

  • niedbalski
  • patrick-stephens
  • celalettin1286

Poem

🐇 I sniffed the headers, found a wandering trace,
I tuck X‑Forwarded‑For into each record's space,
I hop through requests and carry the clue,
Then clear my paws when the handling is through,
A rabbit's small joy to keep traces true.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and directly describes the main change: implementing remote address handling in the Splunk input plugin, which aligns with the PR's primary objective. |


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba50be0c16


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: In splunk_prot_handle_ng() the per-request fields
context->current_remote_addr and context->current_remote_addr_len are set but
never reset, allowing stale addresses to persist across requests; update the
function to ensure these fields are cleared before every return (or funnel
returns through a single cleanup label), e.g., after using
extract_remote_address(), when falling back to peer
(flb_connection_get_remote_address(parent_session->connection)), and prior to
any early exits: set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0 (or free/reset as appropriate) so the same
cleanup performed in splunk_prot_handle() is applied here.
🧹 Nitpick comments (5)
plugins/in_splunk/splunk.h (1)

76-79: Consider using const char * instead of flb_sds_t for borrowed pointers.

current_remote_addr is assigned non-owned pointers from either the XFF header value or flb_connection_get_remote_address() in splunk_prot.c. Using flb_sds_t is misleading since it implies an owned/allocated string that should be managed with flb_sds_* functions.

For clarity and to prevent accidental misuse:

♻️ Suggested change
     /* Remote address */
-    flb_sds_t current_remote_addr;
+    const char *current_remote_addr;
     size_t current_remote_addr_len;
plugins/in_splunk/splunk_prot.c (4)

265-290: Const-correctness issue in output parameter.

The function assigns const char * values (from extract_xff_value and flb_connection_get_remote_address) to *out, but out is declared as char **. This discards the const qualifier and may cause compiler warnings.

♻️ Suggested fix
 static int extract_remote_address(const char *xff_value,
                                   size_t xff_value_len,
                                   struct flb_connection *connection,
-                                  char **out,
+                                  const char **out,
                                   size_t *out_len)
 {

Also update the corresponding field type in splunk.h and call sites in splunk_prot_handle() and splunk_prot_handle_ng().


424-428: Unused parameters in function signature.

The remote_addr and remote_addr_len parameters are added to the signature but never used. The function uses ctx->current_remote_addr and ctx->current_remote_addr_len directly at lines 478-480.

Either use the passed parameters or remove them from the signature to avoid confusion:

♻️ Option 1: Remove unused parameters
 static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record,
                                    flb_sds_t tag, flb_sds_t tag_from_record,
-                                   struct flb_time tm,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct flb_time tm)
♻️ Option 2: Use the passed parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }

775-780: Unused parameters in function signature.

Similar to process_flb_log_append(), the remote_addr and remote_addr_len parameters are not used within this function. The downstream process_raw_payload_pack() accesses ctx->current_remote_addr directly.

Consider removing these unused parameters for consistency:

♻️ Suggested change
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)

1115-1118: Missing cleanup on early return paths.

The per-request remote address is cleared at the end of successful processing, but multiple early return paths (lines 861, 928, 974, 1040, 1066, 1088, 1104) skip this cleanup. While the state is re-initialized at the start of each request (lines 1003-1004), for defensive coding it would be cleaner to use a goto cleanup pattern to ensure consistent cleanup.

Alternatively, since the state is always re-initialized at the start of splunk_prot_handle(), this might be acceptable as-is. Just ensure this initialization always happens before any potential use.
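
The goto cleanup pattern mentioned above could look roughly like this. It is a minimal sketch under stated assumptions: struct demo_ctx and handle_request_demo() are hypothetical stand-ins for the plugin context and handler, and the payload checks are placeholders for the real early-exit conditions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the plugin context: per-request fields only. */
struct demo_ctx {
    const char *current_remote_addr;   /* borrowed, never freed here */
    size_t current_remote_addr_len;
};

/*
 * Hypothetical request handler. Every exit path funnels through the
 * cleanup label, so the per-request address never outlives the request.
 * Returns 0 on success, -1 when the request is rejected.
 */
static int handle_request_demo(struct demo_ctx *ctx,
                               const char *remote_addr,
                               const char *payload)
{
    int ret = -1;

    assert(ctx != NULL);

    ctx->current_remote_addr = remote_addr;
    ctx->current_remote_addr_len = (remote_addr != NULL) ? strlen(remote_addr) : 0;

    if (payload == NULL) {
        goto cleanup;   /* early exit: missing body */
    }
    if (payload[0] == '\0') {
        goto cleanup;   /* early exit: empty body */
    }

    /* ... process the payload, reading ctx->current_remote_addr ... */
    ret = 0;

cleanup:
    /* runs on success and on every early exit */
    ctx->current_remote_addr = NULL;
    ctx->current_remote_addr_len = 0;
    return ret;
}
```

With a single exit label, adding a new early return later cannot silently skip the reset, which is the main argument for this shape over clearing the fields at each return site.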

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)

424-481: Unused parameters: remote_addr and remote_addr_len are never referenced.

The function signature was updated to accept remote_addr and remote_addr_len, but the implementation uses ctx->current_remote_addr and ctx->current_remote_addr_len directly (lines 479-480). Either use the parameters or remove them from the signature.

♻️ Option 1: Remove unused parameters (simpler)
 static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record,
                                    flb_sds_t tag, flb_sds_t tag_from_record,
-                                   struct flb_time tm,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct flb_time tm)
 {

And update call sites accordingly.

♻️ Option 2: Use the parameters instead of context fields
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }

775-814: Unused parameters: remote_addr and remote_addr_len are never referenced.

Similar to process_flb_log_append, these parameters are added to the signature but never used. The underlying process_raw_payload_pack reads from ctx->current_remote_addr directly.

♻️ Suggested fix - remove unused parameters
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)
 {

Update the call site at line 1027 accordingly.

🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1455-1459: In splunk_prot_handle_ng() the cleanup lines refer to
an undefined variable ctx; replace those uses with the correct function-local
variable name context (i.e., set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0) so the per-request remote address is
cleared on the correct struct instance.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)

265-290: Const-correctness issue: discarding const qualifier.

The function accepts char **out but assigns const char * values to it (from extract_xff_value and flb_connection_get_remote_address). This silently discards the const qualifier. Consider changing the output parameter type to preserve const-correctness.

♻️ Suggested fix
 static int extract_remote_address(const char *xff_value,
                                   size_t xff_value_len,
                                   struct flb_connection *connection,
-                                  char **out,
+                                  const char **out,
                                   size_t *out_len)
 {
     const char *value = NULL;
     size_t len = 0;

     extract_xff_value(xff_value, xff_value_len, &value, &len);

     if (value == NULL && connection != NULL) {
         value = flb_connection_get_remote_address(connection);
         if (value != NULL) {
             len = strlen(value);
         }
     }

     if (value == NULL || len == 0) {
         return -1;
     }

     *out = value;
     *out_len = len;
     return 0;
 }

Also update the callers (splunk_prot_handle and splunk_prot_handle_ng) to declare hval as const char *:

-    char *hval = NULL;
+    const char *hval = NULL;

And update the context fields if they aren't already const char *.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: Add a NULL-check for request->stream->parent before
dereferencing parent_session->connection: after assigning parent_session =
(struct flb_http_server_session *) request->stream->parent, verify
parent_session != NULL and return an error (e.g., -1) or perform appropriate
error handling if it is NULL; then continue with the existing logic that uses
parent_session->connection (used by extract_remote_address and
flb_connection_get_remote_address) to avoid a crash if the parent session is
missing.
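
A rough sketch of the guarded lookup described above follows. The demo_* structs are hypothetical, simplified stand-ins for the HTTP server types, not the real Fluent Bit definitions; only the request -> stream -> parent -> connection shape is taken from the comment.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the HTTP server types. */
struct demo_connection { const char *remote_address; };
struct demo_session    { struct demo_connection *connection; };
struct demo_stream     { void *parent; };
struct demo_request    { struct demo_stream *stream; };

/*
 * Return the peer address for a request, or NULL when any link in the
 * request -> stream -> parent session -> connection chain is missing.
 */
static const char *peer_address_or_null(const struct demo_request *request)
{
    struct demo_session *parent_session;

    if (request == NULL || request->stream == NULL) {
        return NULL;
    }

    /* check the parent before touching parent_session->connection */
    parent_session = (struct demo_session *) request->stream->parent;
    if (parent_session == NULL || parent_session->connection == NULL) {
        return NULL;
    }

    return parent_session->connection->remote_address;
}
```

Whether a missing parent should be reported as an error (for example by returning -1 from the handler, as the comment suggests) or treated as "no peer address available" is a design choice left to the caller in this sketch.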
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)

424-481: Use the passed remote_addr parameters to avoid shared mutable state.
Right now process_flb_log_append() ignores its new parameters and re-reads ctx->current_remote_addr. Using the parameters makes the function’s contract explicit and reduces reliance on shared state.

♻️ Suggested change
-    if (ret == FLB_EVENT_ENCODER_SUCCESS) {
-        ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
-    }
+    if (ret == FLB_EVENT_ENCODER_SUCCESS) {
+        ret = append_remote_addr(ctx, remote_addr, remote_addr_len);
+    }

Also applies to: 775-780
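
To make the "explicit parameters instead of shared state" point concrete, here is a small sketch. The names struct field_ctx and append_remote_addr_demo() are hypothetical, and the record is modelled as a plain text buffer rather than the plugin's event encoder; the only thing it is meant to show is that the address comes from the arguments, not from a context field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical configuration holder: only the record key lives here. */
struct field_ctx {
    const char *key;
};

/*
 * Write "key=addr" into buf using only the address passed by the caller.
 * remote_addr is a borrowed, length-delimited string and need not be
 * NUL-terminated at remote_addr_len.
 * Returns the number of bytes written, 0 if there is no address to add,
 * or -1 if buf is too small.
 */
static int append_remote_addr_demo(const struct field_ctx *ctx,
                                   const char *remote_addr,
                                   size_t remote_addr_len,
                                   char *buf, size_t buf_size)
{
    int n;

    assert(ctx != NULL && buf != NULL);

    if (remote_addr == NULL || remote_addr_len == 0) {
        return 0;
    }

    n = snprintf(buf, buf_size, "%s=%.*s",
                 ctx->key, (int) remote_addr_len, remote_addr);
    if (n < 0 || (size_t) n >= buf_size) {
        return -1;
    }
    return n;
}
```

Because the function never reads a per-request field from the context, it behaves the same regardless of what another request left behind there.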

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)

424-486: Unused parameters: remote_addr and remote_addr_len are passed but ignored.

The function signature accepts remote_addr and remote_addr_len parameters (lines 427-428), but the implementation at lines 482-485 uses ctx->current_remote_addr and ctx->current_remote_addr_len directly instead. This inconsistency makes the API misleading.

🐛 Proposed fix: use the passed parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }

Alternatively, if the intent is to always use the context's current address, remove the unused parameters from the function signature.


780-819: Unused parameters in process_hec_raw_payload.

The function signature was extended to include remote_addr and remote_addr_len (lines 784-785), but these parameters are never used in the function body. The call to process_raw_payload_pack at line 816 doesn't pass them, and process_raw_payload_pack reads from ctx->current_remote_addr directly.

Either remove the unused parameters from the signature, or if they were intended for future use, add a comment explaining this.

♻️ Proposed fix: remove unused parameters
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)

Then update the call site at line 1032 accordingly.

🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)

265-290: Const-correctness issue: output parameter should be const char **.

The function assigns a const char * (from extract_xff_value and flb_connection_get_remote_address) to *out, but the parameter is declared as char **. This casts away const, which could lead to undefined behavior if callers attempt to modify the returned string.

♻️ Proposed fix
-static int extract_remote_address(const char *xff_value,
-                                  size_t xff_value_len,
-                                  struct flb_connection *connection,
-                                  char **out,
-                                  size_t *out_len)
+static int extract_remote_address(const char *xff_value,
+                                  size_t xff_value_len,
+                                  struct flb_connection *connection,
+                                  const char **out,
+                                  size_t *out_len)

This will require updating the callers to use const char * for the corresponding local variables (hval in splunk_prot_handle and splunk_prot_handle_ng).

…bject

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Contributor

@patrick-stephens patrick-stephens left a comment

I would probably break the commit linter changes out into a separate PR to allow easier merging.

@cosmo0920
Contributor Author

I would probably break the commit linter changes out into a separate PR to allow easier merging.

I sent a separate PR for that: #11407
