in_splunk: Implement handling remote addr feature #11398
base: master
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
📝 Walkthrough
Adds optional per-request remote address extraction (from X-Forwarded-For or connection peer), threads that address through HTTP handlers and payload processors to append it into emitted records, exposes config options to enable/name the field, initializes per-request storage, and adds an integration test for XFF extraction.
Sequence Diagram(s)
sequenceDiagram
participant Client as Client
participant H as Splunk HTTP Handler
participant P as Header Parser
participant Peer as Connection Peer
participant L as Payload Processor
participant E as Emitter
Client->>H: POST /services/collector/event (headers + payload)
H->>P: lookup "x-forwarded-for"
alt XFF present
P-->>H: XFF value (remote_addr)
else XFF missing
H->>Peer: get peer address
Peer-->>H: peer address (remote_addr)
end
H->>H: store current_remote_addr
H->>L: process payload (pass remote_addr)
L->>L: append remote_addr to records
L->>E: emit records with injected remote_addr
H->>H: clear current_remote_addr
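The lookup order in the diagram (X-Forwarded-For first, connection peer as fallback) can be sketched as below. This is a minimal illustration, not the plugin's code: `first_xff_entry` and `resolve_remote_addr` are hypothetical names, the peer address is passed in as a plain string rather than read from a `struct flb_connection`, and the choice of the first comma-separated XFF entry is an assumption, since the page does not say which entry the plugin picks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: locate the first entry of a comma-separated
 * X-Forwarded-For value, trimmed of spaces and tabs.
 * The result borrows from xff; nothing is allocated. */
static int first_xff_entry(const char *xff, size_t xff_len,
                           const char **out, size_t *out_len)
{
    size_t start = 0;
    size_t end;

    assert(out != NULL && out_len != NULL);

    if (xff == NULL) {
        return -1;
    }
    while (start < xff_len && (xff[start] == ' ' || xff[start] == '\t')) {
        start++;
    }
    end = start;
    while (end < xff_len && xff[end] != ',') {
        end++;
    }
    while (end > start && (xff[end - 1] == ' ' || xff[end - 1] == '\t')) {
        end--;
    }
    if (end == start) {
        return -1;              /* header missing or empty */
    }
    *out = xff + start;
    *out_len = end - start;
    return 0;
}

/* Hypothetical resolver: prefer the XFF entry, otherwise fall back to
 * the peer address (assumed to be a NUL-terminated string). */
static int resolve_remote_addr(const char *xff, size_t xff_len,
                               const char *peer,
                               const char **out, size_t *out_len)
{
    if (first_xff_entry(xff, xff_len, out, out_len) == 0) {
        return 0;
    }
    if (peer == NULL || peer[0] == '\0') {
        return -1;
    }
    *out = peer;
    *out_len = strlen(peer);
    return 0;
}
```

Because both results are borrowed pointers, the caller must not keep them past the lifetime of the request, which is why the diagram ends by clearing `current_remote_addr`.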
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~30 minutes
🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed (1 warning)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ba50be0c16
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: In splunk_prot_handle_ng() the per-request fields
context->current_remote_addr and context->current_remote_addr_len are set but
never reset, allowing stale addresses to persist across requests; update the
function to ensure these fields are cleared before every return (or funnel
returns through a single cleanup label), e.g., after using
extract_remote_address(), when falling back to peer
(flb_connection_get_remote_address(parent_session->connection)), and prior to
any early exits: set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0 (or free/reset as appropriate) so the same
cleanup performed in splunk_prot_handle() is applied here.
🧹 Nitpick comments (5)
plugins/in_splunk/splunk.h (1)
76-79: Consider using `const char *` instead of `flb_sds_t` for borrowed pointers.
`current_remote_addr` is assigned non-owned pointers from either the XFF header value or `flb_connection_get_remote_address()` in `splunk_prot.c`. Using `flb_sds_t` is misleading since it implies an owned/allocated string that should be managed with `flb_sds_*` functions.
For clarity and to prevent accidental misuse:
♻️ Suggested change
      /* Remote address */
    - flb_sds_t current_remote_addr;
    + const char *current_remote_addr;
      size_t current_remote_addr_len;
plugins/in_splunk/splunk_prot.c (4)
265-290: Const-correctness issue in output parameter.
The function assigns `const char *` values (from `extract_xff_value` and `flb_connection_get_remote_address`) to `*out`, but `out` is declared as `char **`. This discards the `const` qualifier and may cause compiler warnings.
♻️ Suggested fix
      static int extract_remote_address(const char *xff_value,
                                        size_t xff_value_len,
                                        struct flb_connection *connection,
    -                                   char **out,
    +                                   const char **out,
                                        size_t *out_len)
      {
Also update the corresponding field type in `splunk.h` and call sites in `splunk_prot_handle()` and `splunk_prot_handle_ng()`.
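The const-correct shape the reviewer describes can be sketched in isolation as follows. Everything here is a hypothetical stand-in: `struct request_ctx` only mimics the two context fields named in the review, and `pick_address` and `store_address` are illustrative names, not functions from `splunk_prot.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the plugin context: the address is a
 * borrowed pointer, so it is const and is never freed here. */
struct request_ctx {
    const char *current_remote_addr;
    size_t      current_remote_addr_len;
};

/* The output parameter is const char **, so the const qualifier of the
 * source string survives the assignment to *out. */
static int pick_address(const char *candidate,
                        const char **out, size_t *out_len)
{
    assert(out != NULL && out_len != NULL);

    if (candidate == NULL || candidate[0] == '\0') {
        return -1;
    }
    *out = candidate;
    *out_len = strlen(candidate);
    return 0;
}

/* Caller side: the local that receives the value is const char * too. */
static int store_address(struct request_ctx *ctx, const char *candidate)
{
    const char *hval = NULL;
    size_t hlen = 0;

    if (pick_address(candidate, &hval, &hlen) != 0) {
        return -1;
    }
    ctx->current_remote_addr = hval;
    ctx->current_remote_addr_len = hlen;
    return 0;
}
```

With `char **out` instead, the assignment `*out = candidate` would typically draw a discarded-qualifier diagnostic, which is the warning the review comment anticipates.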
424-428: Unused parameters in function signature.
The `remote_addr` and `remote_addr_len` parameters are added to the signature but never used. The function uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly at lines 478-480.
Either use the passed parameters or remove them from the signature to avoid confusion:
♻️ Option 1: Remove unused parameters
      static void process_flb_log_append(struct flb_splunk *ctx,
                                         msgpack_object *record,
                                         flb_sds_t tag,
                                         flb_sds_t tag_from_record,
    -                                    struct flb_time tm,
    -                                    const char *remote_addr,
    -                                    size_t remote_addr_len)
    +                                    struct flb_time tm)
♻️ Option 2: Use the passed parameters
      if (ret == FLB_EVENT_ENCODER_SUCCESS) {
          ret = append_remote_addr(ctx,
    -                              ctx->current_remote_addr,
    -                              ctx->current_remote_addr_len);
    +                              remote_addr,
    +                              remote_addr_len);
      }
775-780: Unused parameters in function signature.
Similar to `process_flb_log_append()`, the `remote_addr` and `remote_addr_len` parameters are not used within this function. The downstream `process_raw_payload_pack()` accesses `ctx->current_remote_addr` directly.
Consider removing these unused parameters for consistency:
♻️ Suggested change
      static int process_hec_raw_payload(struct flb_splunk *ctx,
                                         struct splunk_conn *conn,
                                         flb_sds_t tag,
                                         struct mk_http_session *session,
    -                                    struct mk_http_request *request,
    -                                    const char *remote_addr,
    -                                    size_t remote_addr_len)
    +                                    struct mk_http_request *request)
1115-1118: Missing cleanup on early return paths.
The per-request remote address is cleared at the end of successful processing, but multiple early return paths (lines 861, 928, 974, 1040, 1066, 1088, 1104) skip this cleanup. While the state is re-initialized at the start of each request (lines 1003-1004), for defensive coding it would be cleaner to use a `goto cleanup` pattern to ensure consistent cleanup.
Alternatively, since the state is always re-initialized at the start of `splunk_prot_handle()`, this might be acceptable as-is. Just ensure this initialization always happens before any potential use.
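A minimal sketch of the `goto cleanup` pattern mentioned above, assuming a simplified handler. `struct request_ctx` and `handle_request` are hypothetical and only mirror the two per-request fields discussed in the review; the real handler has many more exit paths and different arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-request fields on the context. */
struct request_ctx {
    const char *current_remote_addr;
    size_t      current_remote_addr_len;
};

/* Every exit, early or not, funnels through the cleanup label, so the
 * borrowed address can never leak into the next request. */
static int handle_request(struct request_ctx *ctx,
                          const char *addr, size_t addr_len,
                          const char *payload)
{
    int ret = -1;

    assert(ctx != NULL);

    ctx->current_remote_addr = addr;
    ctx->current_remote_addr_len = addr_len;

    if (payload == NULL) {
        goto cleanup;           /* early exit: no body */
    }
    if (payload[0] == '\0') {
        goto cleanup;           /* early exit: empty body */
    }

    /* ... process the payload using ctx->current_remote_addr ... */
    ret = 0;

cleanup:
    ctx->current_remote_addr = NULL;
    ctx->current_remote_addr_len = 0;
    return ret;
}
```

The trade-off the reviewer notes still applies: if the fields are always re-initialized at the top of the handler, the label is defensive rather than strictly required.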
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)
424-481: Unused parameters: `remote_addr` and `remote_addr_len` are never referenced.
The function signature was updated to accept `remote_addr` and `remote_addr_len`, but the implementation uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly (lines 479-480). Either use the parameters or remove them from the signature.
♻️ Option 1: Remove unused parameters (simpler)
      static void process_flb_log_append(struct flb_splunk *ctx,
                                         msgpack_object *record,
                                         flb_sds_t tag,
                                         flb_sds_t tag_from_record,
    -                                    struct flb_time tm,
    -                                    const char *remote_addr,
    -                                    size_t remote_addr_len)
    +                                    struct flb_time tm)
      {
And update call sites accordingly.
♻️ Option 2: Use the parameters instead of context fields
      if (ret == FLB_EVENT_ENCODER_SUCCESS) {
          ret = append_remote_addr(ctx,
    -                              ctx->current_remote_addr,
    -                              ctx->current_remote_addr_len);
    +                              remote_addr,
    +                              remote_addr_len);
      }
775-814: Unused parameters: `remote_addr` and `remote_addr_len` are never referenced.
Similar to `process_flb_log_append`, these parameters are added to the signature but never used. The underlying `process_raw_payload_pack` reads from `ctx->current_remote_addr` directly.
♻️ Suggested fix - remove unused parameters
      static int process_hec_raw_payload(struct flb_splunk *ctx,
                                         struct splunk_conn *conn,
                                         flb_sds_t tag,
                                         struct mk_http_session *session,
    -                                    struct mk_http_request *request,
    -                                    const char *remote_addr,
    -                                    size_t remote_addr_len)
    +                                    struct mk_http_request *request)
      {
Update the call site at line 1027 accordingly.
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1455-1459: In splunk_prot_handle_ng() the cleanup lines refer to
an undefined variable ctx; replace those uses with the correct function-local
variable name context (i.e., set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0) so the per-request remote address is
cleared on the correct struct instance.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
265-290: Const-correctness issue: discarding `const` qualifier.
The function accepts `char **out` but assigns `const char *` values to it (from `extract_xff_value` and `flb_connection_get_remote_address`). This silently discards the `const` qualifier. Consider changing the output parameter type to preserve const-correctness.
♻️ Suggested fix
      static int extract_remote_address(const char *xff_value,
                                        size_t xff_value_len,
                                        struct flb_connection *connection,
    -                                   char **out,
    +                                   const char **out,
                                        size_t *out_len)
      {
          const char *value = NULL;
          size_t len = 0;
          extract_xff_value(xff_value, xff_value_len, &value, &len);
          if (value == NULL && connection != NULL) {
              value = flb_connection_get_remote_address(connection);
              if (value != NULL) {
                  len = strlen(value);
              }
          }
          if (value == NULL || len == 0) {
              return -1;
          }
    -     *out = value;
    +     *out = value;
          *out_len = len;
          return 0;
      }
Also update the callers (`splunk_prot_handle` and `splunk_prot_handle_ng`) to declare `hval` as `const char *`:
    - char *hval = NULL;
    + const char *hval = NULL;
And update the context fields if they aren't already `const char *`.
7435927 to
ae2ad69
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: Add a NULL-check for request->stream->parent before
dereferencing parent_session->connection: after assigning parent_session =
(struct flb_http_server_session *) request->stream->parent, verify
parent_session != NULL and return an error (e.g., -1) or perform appropriate
error handling if it is NULL; then continue with the existing logic that uses
parent_session->connection (used by extract_remote_address and
flb_connection_get_remote_address) to avoid a crash if the parent session is
missing.
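The guard requested above can be sketched with stand-in types. The structs below are hypothetical simplifications of the request, stream, session and connection chain; they are not the Fluent Bit definitions, and `peer_address` is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the request -> stream -> session chain. */
struct connection { const char *remote_addr; };
struct session    { struct connection *connection; };
struct stream     { void *parent; };
struct request    { struct stream *stream; };

/* Check every link before dereferencing it, and report failure
 * instead of crashing when the parent session is missing. */
static int peer_address(const struct request *req, const char **out)
{
    struct session *parent_session;

    assert(out != NULL);

    if (req == NULL || req->stream == NULL) {
        return -1;
    }
    parent_session = (struct session *) req->stream->parent;
    if (parent_session == NULL || parent_session->connection == NULL) {
        return -1;
    }
    *out = parent_session->connection->remote_addr;
    return (*out != NULL) ? 0 : -1;
}
```

Whether a missing parent should fail the request or only skip the remote address field is a design decision the review leaves open; this sketch simply returns -1.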
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
424-481: Use the passed `remote_addr` parameters to avoid shared mutable state.
Right now `process_flb_log_append()` ignores its new parameters and re-reads `ctx->current_remote_addr`. Using the parameters makes the function’s contract explicit and reduces reliance on shared state.
♻️ Suggested change
    - if (ret == FLB_EVENT_ENCODER_SUCCESS) {
    -     ret = append_remote_addr(ctx,
    -                              ctx->current_remote_addr,
    -                              ctx->current_remote_addr_len);
    - }
    + if (ret == FLB_EVENT_ENCODER_SUCCESS) {
    +     ret = append_remote_addr(ctx, remote_addr, remote_addr_len);
    + }
Also applies to: 775-780
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)
424-486: Unused parameters: `remote_addr` and `remote_addr_len` are passed but ignored.
The function signature accepts `remote_addr` and `remote_addr_len` parameters (lines 427-428), but the implementation at lines 482-485 uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly instead. This inconsistency makes the API misleading.
🐛 Proposed fix: use the passed parameters
      if (ret == FLB_EVENT_ENCODER_SUCCESS) {
          ret = append_remote_addr(ctx,
    -                              ctx->current_remote_addr,
    -                              ctx->current_remote_addr_len);
    +                              remote_addr,
    +                              remote_addr_len);
      }
Alternatively, if the intent is to always use the context's current address, remove the unused parameters from the function signature.
780-819: Unused parameters in `process_hec_raw_payload`.
The function signature was extended to include `remote_addr` and `remote_addr_len` (lines 784-785), but these parameters are never used in the function body. The call to `process_raw_payload_pack` at line 816 doesn't pass them, and `process_raw_payload_pack` reads from `ctx->current_remote_addr` directly.
Either remove the unused parameters from the signature, or if they were intended for future use, add a comment explaining this.
♻️ Proposed fix: remove unused parameters
      static int process_hec_raw_payload(struct flb_splunk *ctx,
                                         struct splunk_conn *conn,
                                         flb_sds_t tag,
                                         struct mk_http_session *session,
    -                                    struct mk_http_request *request,
    -                                    const char *remote_addr,
    -                                    size_t remote_addr_len)
    +                                    struct mk_http_request *request)
Then update the call site at line 1032 accordingly.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
265-290: Const-correctness issue: output parameter should be `const char **`.
The function assigns a `const char *` (from `extract_xff_value` and `flb_connection_get_remote_address`) to `*out`, but the parameter is declared as `char **`. This casts away const, which could lead to undefined behavior if callers attempt to modify the returned string.
♻️ Proposed fix
    -static int extract_remote_address(const char *xff_value,
    -                                  size_t xff_value_len,
    -                                  struct flb_connection *connection,
    -                                  char **out,
    -                                  size_t *out_len)
    +static int extract_remote_address(const char *xff_value,
    +                                  size_t xff_value_len,
    +                                  struct flb_connection *connection,
    +                                  const char **out,
    +                                  size_t *out_len)
This will require updating the callers to use `const char *` for the corresponding local variables (`hval` in `splunk_prot_handle` and `splunk_prot_handle_ng`).
0cc7554 to
4c8b53d
patrick-stephens
left a comment
I would probably break the commit linter changes out into a separate PR to allow easier merging.
I sent a separate PR for that.
Currently, in_splunk does not handle the remote address.
This makes it inconvenient to track the remote address for traceability.
Enter `[N/A]` in the box, if an item is not applicable to your change.
Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
`ok-package-test` label to test for all targets (requires maintainer to do).
Documentation
in_splunk: Add remote_addr related parameters' descriptions fluent-bit-docs#2360
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.