-
Notifications
You must be signed in to change notification settings - Fork 514
add structured logging with context #8363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Currently a WIP. Right now it provides structured context in the manner of a single level map of strings. A convenience method provides a "group" context where you can log messages for a given "group" name. The thinking here is that we could have all the cache functions log its messages with a group name "cache" so that "group=cache" context is attached to every one of its log messages. And for, say, everything in the discovery package we could log its messages with a logger whose context is "group=discovery". And so on. |
f60e008
to
f3e1549
Compare
The commit 6c6d1af adds a logger for the kialiCache group to see what it would look like to convert to using a logger with context (all logging in the cache package uses this logger). This is what the text log looks like for this (notice "group=kialiCache") -
and when logging in JSON (see the "group" of "kialiCache" for some messages): {"level":"debug","ts":"2025-04-24T01:23:03Z","msg":"Rest perf config QPS: 175.000000 Burst: 200"}
{"level":"info","ts":"2025-04-24T01:23:03Z","msg":"Initializing Kiali Cache"}
{"level":"debug","group":"kialiCache","ts":"2025-04-24T01:23:03Z","msg":"Using 'cluster' scoped Kiali Cache"}
{"level":"debug","ts":"2025-04-24T01:23:03Z","msg":"K8s Gateway API CRDs are not installed. Required K8s Gateway API version: gateway.networking.k8s.io/v1. Gateway API will not be used."}
{"level":"debug","group":"kialiCache","ts":"2025-04-24T01:23:03Z","msg":"Starting cluster-scoped informers"}
{"level":"info","group":"kialiCache","ts":"2025-04-24T01:23:03Z","msg":"Waiting for cluster-scoped cache to sync"}
{"level":"info","group":"kialiCache","ts":"2025-04-24T01:23:04Z","msg":"Started"}
{"level":"info","group":"kialiCache","ts":"2025-04-24T01:23:04Z","msg":"Kube cache is active for cluster: [Kubernetes]"}
{"level":"debug","ts":"2025-04-24T01:23:04Z","msg":"Starting polling istiod(s) every 20 seconds for proxy status"} |
Commit db32fab introduces HTTP request logging. You get a logger in the handler context that logs information about the request for you. You just obtain the logger and log a message and the rest of the structured context is logged for you. The one example in the commit adds a debug message at the start of the Config handler. This is what it looks like in text format:
When logging with json format, it looks like this: {"level":"debug","group":"Config","user_agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36","ip":"127.0.0.1","req_id":"d057v6p3hfhs73evo2tg","ts":"2025-04-24T18:14:51Z","msg":"Kiali configuration has been requested"} So this works nicely. We can tweek what is automatically logged with http requests if we want something different - I just put this together to see it work. For example, it is logging "user_agent" and we probably don't need or want that. I would like to point out something you get for free here -- all messages logged by a specific handler will be grouped together under the name of its route (as the route name is known in the code). In the above example, the name of the route as it is known in the code is "Config" and that's what you see under the "group" context value. So any and all log messages logged by this config handler's http request logger will be grouped by that name - easy for filtering and sorting when analyzing logs ("show me all logs emitted by the config handler"). You could then further drill down by filtering by request ID (which is another one of the context values you will get for free when using these loggers). There is a convenience method in our log package to help avoid handlers from having to explicitly import our current logging library (zerolog). We just need to import our log package like we have always done; then you simply call From there, I left it up to the developer to call zerolog API directly rather than wrap all of it in our own wrapper functions. So to log a debug message, you do this: log.FromRequest(r).Debug().Msg("your message here") With your message there will also be logged all the structured context data that comes with the http request. Note that in all handlers, we should now always have one of these loggers available (it is setup in routes.go - see the changes in this PR for how that is done). So this is good - right now all handlers have the ability to log http request specific messages (none do right now, except for that one in Config that I added just to see this work). |
This is where we apply http request data to the loggers - we will want to see what we really want here: c = c.append(hlog.UserAgentHandler("user_agent"))
c = c.append(hlog.RemoteAddrHandler("ip"))
c = c.append(hlog.RequestIDHandler("req_id", "Request-Id")) See hlog API for the different fields that are supported: https://pkg.go.dev/github.com/rs/zerolog@v1.34.0/hlog#pkg-functions For a local test, I added all the ones that are available just to see what it looks like (I ended up committing this change - we can remove what we don't want, but I wanted this in so reviewers can see everything getting logged). c = c.append(hlog.NewHandler(log.WithGroup(route.Name).Z))
c = c.append(hlog.HostHandler("host", true))
c = c.append(hlog.HTTPVersionHandler("http_version"))
c = c.append(hlog.MethodHandler("method"))
c = c.append(hlog.ProtoHandler("proto"))
c = c.append(hlog.RefererHandler("referer"))
c = c.append(hlog.RemoteAddrHandler("remote_addr"))
c = c.append(hlog.RemoteIPHandler("remote_ip"))
c = c.append(hlog.RequestIDHandler("request_id", "Request-Id"))
c = c.append(hlog.URLHandler("url"))
c = c.append(hlog.UserAgentHandler("user_agent")) formatted json is: {
"level": "debug",
"group": "Config",
"host": "localhost",
"http_version": "1.1",
"method": "GET",
"proto": "HTTP/1.1",
"remote_addr": "127.0.0.1:37532",
"remote_ip": "127.0.0.1",
"request_id": "d058p54g6gcc73d88vtg",
"url": "http:///kiali/api/config",
"user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36",
"ts": "2025-04-24T19:10:12Z",
"msg": "Kiali configuration has been requested"
} |
Commit c37abcb adds a debug message when apps in the cluster are requested. I did this as a test to see what the structured content looks like for requests that have params in the URL. The code: log.FromRequest(r).Debug().Msg("Fetching all apps in single cluster") The formatted json: {
"level": "debug",
"group": "ClustersApps",
"host": "localhost",
"http_version": "1.1",
"method": "GET",
"proto": "HTTP/1.1",
"referer": "http://localhost:20001/kiali/console/applications?duration=60&refresh=10000&namespaces=istio-system",
"remote_addr": "127.0.0.1:36070",
"remote_ip": "127.0.0.1",
"request_id": "d0592sfuodus739lfhfg",
"url": "http:///kiali/api/clusters/apps?health=true&istioResources=true&rateInterval=60s&namespaces=istio-system&clusterName=Kubernetes",
"user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36",
"ts": "2025-04-24T19:30:57Z",
"msg": "Fetching all apps in single cluster"
} |
I think this PR is ready for review and merge. In followup PRs, we can start adding new log messages in handlers and packages (using groups) - but this will at least get the infrastructure in place now. |
One feature that might be nice is to have those request data configurable in the logs. So have most of them turned off by default, but if someone is debugging, be able to turn them on. Perhaps something in the spec.deployment.logger section of the config: spec:
deployment:
logger:
request_context_enabled: true If false, maybe we only log the request id (or some other set of minimal data) but when true we log everything. I recommend we do this in another PR in the future if it is something we want. |
@nrfox you had some ideas of where you wanted some http request logging to be (you seemed to want to track request IDs for some handlers??) -- what did you want to log / where did you want those logs to be? See how it would fit with this stuff. It should be fairly easy - just look at the two examples I have in here in the handlers/apps.go and handlers/config.go (basically one line in each file right now). I just want to make sure this would work (I think it should but need confirmation). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very much +1 on adding structured logging. I think it will be very helpful in troubleshooting issues in Kiali.
I'm still not convinced we should roll our own structured logger. The API is not as robust as either slog or zerolog and we will need to maintain it going forward. Parts of the app that are using the kiali log.Info
wrappers around zerolog can keep doing this and the parts of the app that adopt structured logging can just use the logger from that lib. zerolog has some additional features that slog doesn't but slog is more similar to our log wrappers in that you can call log.Info(...)
as opposed to log.Info().Msg(...)
and slog is stdlib so I expect that API to be more widely used over time. You can also use them both together by configuring zerolog has a backend handler with something like: https://github.com/samber/slog-zerolog so that we can keep the zerolog options such as sampling rate but use the slog API.
// Things to help build the logger handler chain (see https://pkg.go.dev/github.com/rs/zerolog/hlog#example-package-Handler) | ||
// fake alice to avoid dep | ||
type alice struct { | ||
m []func(http.Handler) http.Handler | ||
} | ||
|
||
func (a alice) append(m func(http.Handler) http.Handler) alice { | ||
a.m = append(a.m, m) | ||
return a | ||
} | ||
func (a alice) then(h http.Handler) http.Handler { | ||
for i := range a.m { | ||
h = a.m[len(a.m)-1-i](h) | ||
} | ||
return h | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just import the dep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the example that does this: https://pkg.go.dev/github.com/rs/zerolog/hlog#example-package-Handler
We don't need yet another dep to manage just for this - all we need are those two simple functions to do what we need here.
routing/router.go
Outdated
c = c.append(hlog.RefererHandler("referer")) | ||
c = c.append(hlog.RemoteAddrHandler("remote_addr")) | ||
c = c.append(hlog.RemoteIPHandler("remote_ip")) | ||
c = c.append(hlog.RequestIDHandler("request_id", "Request-Id")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should only add Request-Id
for info level logs. Some of them, like URL
may have sensitive info in them if it's an auth redirect.
Also -1 on snake case. Unless this is a convention for logging nothing else in the app is logged this way. Seems like it should either be RequestID
or RequestId`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed them all but request ID and renamed it.
So, I'm not sure what else we would need to adopt structured logging other than what we already have (ignoring this PR). To "introduce structured logging", couldn't we just do what you say there - just use the logger API (in our case from zerolog)? It already has structured logging. So... AFAICS you can just import "zerolog" in the .go file and use "zerolog.Logger" and .. there you go. :) You don't have to use kiali/log wrapper functions. This has been possible since the day we switched to zerolog 5 years ago. Unless I'm vastly misunderstanding what is being asked here. This PR does two things - (1) it provides a convenience logger that can add a group name to the structured context of a logger so, for example, cache code can use that logger and thus devs do not have to remember to constantly add "[Kiali Cache]" explicitly to the log message itself; and (2) it adds structured content to http requests (RequestID), and handlers can grab the logger from the request context and use it (which is what the issue ultimately was asking for: |
Next up, I'm going to experiment with slog-zerolog : https://github.com/samber/slog-zerolog (if this works like I think it should, we can opt to use slog API in addition to zerolog, depending on what we want). |
8efc7a6
to
9724998
Compare
Yes that's great! I did not realize that zerolog already supported this and that's good enough for me.
This is the part I was unsure of since you can also do the same thing with the zerolog API right? Which seems to contradict "if you want structured logging, just use the zerolog API". This isn't a rhetorical question, I'm just trying to clarify. If I'm adding a logger to a package like mesh discovery for example and I want all logs to be labeled with a group, which logger would I use? |
You would follow the pattern of the example of the kiali cache - see commit: 6c6d1af (i.e. there is a logger defined in your package that you would use - in the kiali cache case, it is defined in cache.go and is But then the followup question would be, "Great, now I have a way to log messages with a group name, but what happens if I want to add some additional structured content to that logger?" Answer: There isn't a way with this PR yet, but presumably we could just make public the Z logger (the zerolog logger) in the ContextLogger object you get back - and from there you use the zerolog API directly (e.g.
8000
But since the vast majority of our logging is simply "Debug()" or "Info()", etc... directly accessing Z wouldn't be necessary and all the log message code would like basically like it does now except |
Kiali already configures the global zerolog logger, right? Couldn't you just use that w/o any code changes? |
var ( | ||
klog = log.WithGroup("kialiCache") | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just personal preference but this could also be a variable on the cache/kubecache structs rather than a global variable. This is really nit picky but I'd rather we keep log
than klog
. You can always change the import and keep the log statements the same: klog "github.com/kiali/kiali/log"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather we keep log than klog
I agree - and I tried to do that. I couldn't figure out a good way to do it and have the code compile-able :)
Putting the logger in some struct could be a good way to do it.
Yes, you can use zerolog.Logger if you don't have any group name ( I mean, you COULD totally avoid using anything in this PR and nothing in our current kiali/log package, but then that gets us back to the same problem we had before - now all the developers have to remember to add the group name to each log message (which, in the kiali cache case, some messages were missing them, presumably because a dev forgot to add the string So, yeah, you could have this everywhere in the package code: zerolog.Logger.Debug().Str("group", "myGroupNameHere").Str("moreContext", "value").Msg("hello") or (if no additional context is wanted): zerolog.Logger.Debug().Str("group", "myGroupNameHere").Msg("hello") or (if no group name is wanted): zerolog.Logger.Debug().Msg("hello") IMO, that's noisy, lots of text, and people will forget to manually add the klog.Z.Debug().Str("moreContext", "value").Msg("hello") klog.Debug("hello") log.Debug("hello") |
No that would be bad and I agree wouldn't help much. What you do with zerolog, I think, is create a sublogger with the context added and use that: https://github.com/rs/zerolog?tab=readme-ov-file#sub-loggers-let-you-chain-loggers-with-additional-context. So something like:
I'm for adding convenience functions, for example being able to do |
Saying this another way, I'm good with this PR as is but I'd rather we expose a zerolog.Logger directly from |
That is actually what klog = log.WithGroup("kialiCache") (our log package's WithGroup() is convenience function that creates that sublogger.) The first question would be: is our standard practice putting these subloggers as package var's or put the subloggers in some struct that is created in the package somewhere and passed around via the struct? I could go either way - but we definitely want to pass subloggers around in the package (either as a global var or struct var). Since we already know we get bitten by global vars in our unit tests, we should probably consider putting them in structs. The second question would be, do we use convenience functions or just use the zerolog API. This PR has a WithContext convenience function that allows devs to create a sublogger with multiple name/value context pairs: func WithContext(fields map[string]string) ContextLogger {
ctx := log.Logger.With()
for k, v := range fields {
ctx = ctx.Str(k, v)
}
cl := ContextLogger{Z: setLogFormat(setSamplingRate(ctx.Logger()))}
return cl
} We don't have to have this but I suspect once we start using structured contexts in logging, it would be nice to have a function like this (side note: WithGroup calls this WithContext but with a single map entry whose key is "group") |
I'd prefer not having them be global variables but this may prove inconvenient if there's a lot of functions in a package where you want to log stuff with the context because then you need to pass the logger as an argument or through
The zerolog API also supports more than just strings as key/value pairs. I'm fine with merging this in as is and kicking the tires on it a bit more before we decide want to go all one way or all another. |
to the logger so when we log a message there will be context that includes the group name (in this case "kialiCache")
adding Config handler log message as an initial test to see what it looks like
…he json output. this also follows the slog convention, too.
This logs everything that hlog will know about the request. I put them all in here because it will be easier to remove what we don't want later. This at least will show us everything in the logs and we can determine what we like or don't like here.
This is just to see what the structured log looks like for a request that has params in the URL
rename RequestID
…he new group logger
9724998
to
1ec3004
Compare
I had to rebase and fix a conflict in go.sum - so the approval was made stale. you'll need to approve this PR again to merge. We can adjust on the fly as we go once we start using this and seeing if we need something different. After this gets in, I'll start another PR that starts looking at places where structured logging could be useful - from there we can see how this works in practice. |
fixes: #8346