Back to the drawing board for utrace?
The utrace tracing framework has had a tortuous path towards the mainline, but it always seemed to be headed in that direction. Over the past week or so, the prospects for mainline inclusion of utrace have gotten rather murkier. Linus Torvalds made a pronouncement that would seem to leave SystemTap without a future in the mainline (something that many had suspected for a while) but also put the future of utrace in doubt. Further discussion may have provided a way forward, but mainline inclusion of utrace, at least in its current form, seems very unlikely.
The discussion resulted from a request by
Frank Ch. Eigler to include utrace in linux-next. That led to a
discussion about whether it was ready for linux-next—because it was
likely to be merged in the next release cycle—or whether it should spend
some time in another tree. Since an earlier version of utrace
was in Andrew Morton's -mm tree, that was a potential path. Morton said
that utrace "didn't break anything", but:
Someone please sell this to us.
Morton also dredged up a response he had gotten from Oleg Nesterov the last time he asked, which listed various potential uses for utrace. In-kernel uses for utrace are important—new features are rarely merged without one—and an earlier utrace merge attempt ran into opposition because it lacked one. This time around, Nesterov and Roland McGrath included a rewrite of the ptrace() system call using utrace as part of the patch submission. It was hoped that rewriting the notoriously ugly ptrace() code using the cleaner utrace API would be the last hurdle for inclusion into the mainline.
But, replacing the guts of the ptrace() call, even though it may clean things up, is controversial. ptrace() is part of the kernel ABI that must be maintained—ugly or not—but cleaning it up is not without its risks, as Morton points out:
The risk is small, though, according to
Eigler, because "this code has been deployed in fedora
and rhel for several *years*, with millions of users. It's not some
rickety experiment." Eigler also added SystemTap to Nesterov's list
of utrace uses, since its user-space probing is based on utrace. But
SystemTap and one of the other potential uses on that list, namely
reworking seccomp to use utrace, are what set
Torvalds off:
Torvalds's complaint stems from the fact that utrace provides no user-space
interface at all. It is purely an internal kernel API that is meant to be
used by kernel code like the ptrace() rewrite, but also by kernel
modules, which is part of what worries Torvalds. It provides lots of hooks
that can be used by "random crazy out-of-tree crap", but
doesn't provide any benefit to user space at all, he said:
One of the biggest problems with ptrace() is its signal-oriented interface. Programs using ptrace() act as the parent process of the tracee and must use wait() to detect state changes. For that reason, there can only be one ptrace() active for a particular process. So an strace of a program that is being debugged with gdb will not succeed. The ptrace() implementation using utrace would change that, but not directly, as there would still need to be a kernel piece that attached another utrace engine.
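The one-tracer limitation is easy to see from user space. What follows is a minimal sketch, not anything from the patches under discussion: it assumes Linux, reaches the libc ptrace() wrapper through Python's ctypes (Python has no ptrace binding of its own), and hard-codes the request numbers from <sys/ptrace.h>. The function name second_attach_errno is made up for the illustration.

```python
import ctypes
import errno
import os
import signal

# Request numbers from <sys/ptrace.h> on Linux (assumed; not portable).
PTRACE_TRACEME = 0
PTRACE_ATTACH = 16

# Load the C library already linked into the interpreter.
libc = ctypes.CDLL(None, use_errno=True)
libc.ptrace.restype = ctypes.c_long
libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_long,
                        ctypes.c_void_p, ctypes.c_void_p]


def second_attach_errno():
    """Trace a child, then try to attach to it a second time.

    Returns the errno of the second attach attempt, or 0 if it succeeded.
    (Hypothetical helper, written only for this illustration.)
    """
    pid = os.fork()
    if pid == 0:
        try:
            # Child: ask to be traced by the parent, then stop so that
            # the parent's wait call has a state change to report.
            libc.ptrace(PTRACE_TRACEME, 0, None, None)
            os.kill(os.getpid(), signal.SIGSTOP)
        finally:
            os._exit(0)

    try:
        # The tracer learns about the stop through wait, as a parent would.
        _, status = os.waitpid(pid, os.WUNTRACED)
        assert os.WIFSTOPPED(status)

        # The child already has a tracer, so a further attach is refused.
        ctypes.set_errno(0)
        rc = libc.ptrace(PTRACE_ATTACH, pid, None, None)
        return ctypes.get_errno() if rc == -1 else 0
    finally:
        # Clean up: kill the stopped child and reap it.
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)


if __name__ == "__main__":
    err = second_attach_errno()
    print("second attach:", errno.errorcode.get(err, err))
```

On a typical Linux system the second attach should fail with EPERM, which is the same refusal that strace runs into when pointed at a process gdb is already debugging.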
An in-kernel gdb "stub" using utrace—floated as an RFC back in November—could provide that kernel piece, but was met with a fair amount of resistance when it was proposed. The limitation that ptrace() imposes is seen as something that could, perhaps should, be lifted, but adding a relatively large, kernel-only API to do that is excessive. As Torvalds puts it:
So stop the crazy "new kernel interfaces" crap. Stop the crazy "maybe we can use it for ftrace and generic user event tracing too". Stop the crazy.
The elephant in the room, of course, is SystemTap. It creates, builds, and loads kernel modules for doing its tracing, and uses utrace for the user-space tracing. That model is not popular with most kernel developers, especially for an out-of-tree solution—the APIs that it relies on are far too volatile. SystemTap must be updated when those interfaces change, and all of the previous versions must be maintained so that SystemTap can still be used with older kernels. Because of that, SystemTap may be out-of-sync with development kernels, which makes its utility for kernel hackers quite small.
The utrace proponents are pushing it as something useful in its own right, completely separate from its use in SystemTap, but one gets the sense that many of the kernel developers aren't quite buying that. Ted Ts'o tries to explain his concerns to Eigler:
He goes on to compare the situation to that of the NVIDIA graphics drivers,
which leads
Kyle Moffett to propose a variation on Godwin's
law: "As an LKML discussion grows longer, the probability of an unfavorable
comparison involving nVidia or Microsoft approaches 1." More to the
point, though, Moffett said he was uninterested in SystemTap:
Ts'o sees those features as potentially useful, but points out that they should be submitted with utrace for review. It may be that utrace in its present form does not survive that review:
Without an in-tree "killer feature" that only utrace can provide, there is going to be resistance to merging such an easily-abused API. Several suggestions were made—notably by Torvalds and Ingo Molnar—to enhance ptrace() itself to support some new features (such as multiple active calls or the ability to read/write more than a word at a time between the two processes), but that would mean scrapping much or all of the utrace work. Nesterov and McGrath, who are the ptrace() maintainers, have been largely silent throughout the discussion, but, previously, they have made it clear that they would much rather work with the utrace-based ptrace() implementation. So it is unclear when or if enhancements to the current code might happen.
Without utrace, SystemTap will have to find other ways to hook user space, but that doesn't really faze the kernel developers—particularly after Torvalds's unequivocal rejection of that approach—as there are other tracing solutions in the pipeline. Ftrace and perf events are slowly building capabilities, and are doing so in-tree. They are likely to grow the needed features to support kernel and user-space tracing a la SystemTap (and DTrace). Molnar specifically invites the SystemTap developers to collaborate:
perf record -R -f -e irq:irq_handler_entry --filter 'irq==18 || irq==19'
More could be done - a simple C-like set of function perhaps - some minimal per probe local variable state, etc. (perhaps even looping as well, with a limit on number of [predicate] executions per filter invocation.)
It is unfortunate, in many ways, that SystemTap has gotten to this point.
While it is possible that Torvalds could change his mind, he and other
kernel developers find the new tracing
features to be "a million times superior" to SystemTap. That
could leave Red Hat holding the SystemTap bag
for quite some time to come, as it will need to support it for existing,
and likely future,
RHEL versions. It is interesting to note that this alternate solution,
based on Ftrace, etc., is also largely coming out of Red Hat.
It seems possible that utrace will be a casualty here as well. By incorporating features that were needed for SystemTap, and not providing a user-space interface, it tried to both do too much and too little. There are some potential ways forward, but it's unclear whether they will be pursued. Torvalds points to the realtime tree as an example of how to get "crazy" things merged:
But on the whole, I think it's actually worked out pretty well for them. I think the mainline kernel has improved in the process, but I also suspect that _their_ RT patches have also improved thanks to having to make the work more palatable to people like me who don't care all that deeply about their particular flavor of crazy.
There are definitely lessons here, but the standard ones don't seem to
apply. SystemTap and utrace were developed in the open, as free software
from the outset, and were fairly often discussed on linux-kernel.
SystemTap in particular was regularly criticized, to seemingly no
avail. The biggest lesson—and the hardest to learn, especially after
a feature has shipped—may be that
ignoring the advice and complaints of the kernel developers is likely to
come back and bite in the end. It is not terribly surprising, really, but
that seems to be what is happening here.
Back to the drawing board for utrace?
Posted Jan 28, 2010 3:35 UTC (Thu) by fuhchee (guest, #40059)
[...] retorted to some extent. It would do an injustice to the topic to accept these excerpts at face value.

Back to the drawing board for utrace?
Posted Jan 29, 2010 16:00 UTC (Fri) by fuhchee (guest, #40059)
"By incorporating features that were needed for SystemTap, and not providing a user-space interface, it tried to both do too much and too little."
... reflect several misunderstandings.
utrace did not incorporate any features particularly for systemtap. It was designed and built independently, and *prior* to this part of systemtap getting started. As a well-designed framework, it turned out to be useful for more than just debugger control, so uprobes used it. (These dependencies are not etched in stone.)
And as for "too little", this line of thinking presumes that any new internal kernel functionality must by necessity be exposed via brand-new userspace protocol. It ignores suitable pre-existing interfaces (ptrace, gdb remote protocol) that simply work *better* with the new internals.
When an article relays so much opinion and heated debate, such editorial comments can lead astray.

Back to the drawing board for utrace?
Posted Jan 28, 2010 11:00 UTC (Thu) by nix (subscriber, #2304)
[...] including gdbstub, certainly including a reimplementation of ptrace using utrace). The response: come back without all that extra stuff on the side.
It seems the utrace developers are damned if they do and damned if they don't.

Back to the drawing board for utrace?
Posted Feb 4, 2010 17:32 UTC (Thu) by ariveira (guest, #57833)
[...] not participated in the discussion. i too would be somewhat fed up ...
Many times the right hand of the kernel community does not know what the left hand is doing so to speak

Back to the drawing board for utrace?
Posted Jan 28, 2010 13:42 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

Back to the drawing board for utrace?
Posted Jan 28, 2010 19:09 UTC (Thu) by daney (guest, #24551)
Userspace? I vaguely remember about hearing about such a thing...
But seriously, I don't think it is hatred. Perhaps benign neglect.

el trace narcotico
Posted Jan 28, 2010 19:18 UTC (Thu) by bkoz (guest, #4027)
Pointing at a still-unmerged tree for realtime and saying "that's the way to do it, see?" seems disingenuous at best. Saying the maintainers of ptrace have no comment is also questionable, when in a related article Oleg indicates long-term solutions outside of ptrace are mandatory.
From outside the kernel community, tuning in at regular intervals to observe the serialized show, it looks to me that these two groups are just passing in the same development space without any real progress.
Featuring:
THE KERNEL. We watched him grow up, gain power and prestige while trying to remain sane in a world gone mad. There's a constantly changing field of opportunity and adversary that he attempts to navigate deftly. Detractors, even gentle ones, whisper of a touch of dementia, a rather narrow and introspective view of the world. Me! Me! It's not what I say or do or even what I say I plan to do (API), no it's what I've done (ABI).
SUITOR U. Has loved and maintained key parts of THE KERNELs business for years. Fix this. Fix that. Has proposed a grand utrace building for the main kernel grounds, and when the blueprints were shown, and told to adjust and update the main house kitchen in addition (ptrace), did so. While doing this, the groundskeepers stopped by and asks the suitor about using the utrace building basement for some new scheme. Last week's episode was when SUITOR U took the latest in a long series of plans back to THE KERNEL, and due to the general inclement weather and dour moods, was told something else entirely. Oh dear!
SUITOR L. Remember me? We danced way back when. I had a real pretty dress, a special bias cut, and caught your attention for a few months. You still let me come to your balls but now I only dance with an advisor or two, and console myself by saying I have the pick of the entourage. Call me. I follow you on twitter.
SUITOR F. It's my cotillion!!! OMG. He danced with me. Was told he's a rake but in this light, doesn't care. The night is young, whooopeeee!

el trace narcotico
Posted Jan 28, 2010 19:43 UTC (Thu) by dlang (guest, #313)
this is even allowing for bugfixes over time. They (the realtime people) have explicitly stated that breaking out portions from their tree and submitting them as individual features (and dealing with the demands to justify and clean things up before they are accepted) has significantly improved the realtime tree itself.

Back to the drawing board for utrace?
Posted Jan 31, 2010 7:59 UTC (Sun) by sfink (guest, #6405)
Man, am I confused! I have a relatively straightforward problem: I am trying to track down latency and jitter problems in a realtime media stream generating application. How do I decide what tool to use? Oprofile? Ftrace? Perf events? SystemTap? LTTng?
I've tried oprofile briefly, and it seemed mostly irrelevant for my problems. I can't find any clear descriptions of what ftrace and perf events actually are, so that I could figure out whether I should bother with them. I've been fairly happy with systemtap so far -- it is at least straightforward to dive into and start generating customized traces that at least start to address what I'm interested in. Each further step seems to involve more and more groveling about in the kernel sources, but at least I can understand the path ahead and predict what's going to be possible. (I have a fairly generic problem -- I want to measure the jitter between context switches of my realtime threads and then diagnose the reasons for that jitter by reporting what the CPUs were up to when I wasn't running.)
But now I see that SystemTap is viewed as the bastard stepchild, and I wonder if my investment into learning to use SystemTap was an expensive mistake. Any guidance on choosing an appropriate tool for unfortunates like me?

Back to the drawing board for utrace?
Posted Feb 1, 2010 11:59 UTC (Mon) by fuhchee (guest, #40059)
Heh. But the passive voice gives the opinion false authority.
"and I wonder if my investment into learning to use SystemTap was an expensive mistake."
You should explore the alternatives and use whatever works for you. Listening to simple smears may well be self-defeating.

Back to the drawing board for utrace?
Posted Feb 1, 2010 13:36 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
I'm in the same WTF situation. There are like 6 "tracing" solutions for Linux. Most of them with only a few lines of documentation (SystemTap at least is the most documented of them).
That's a complete SNAFU. Why can't we have one nice user-oriented solution?

SystemTap documentation
Posted Feb 1, 2010 14:29 UTC (Mon) by mjw (subscriber, #16740)
The SystemTap documentation can be found at:
http://sourceware.org/systemtap/SystemTap_Beginners_Guide/
http://sourceware.org/systemtap/langref/
http://sourceware.org/systemtap/examples/

Back to the drawing board for utrace?
Posted Feb 5, 2010 2:09 UTC (Fri) by mfedyk (guest, #55303)
Generating code dynamically that gets compiled and linked into the kernel just seems scary and error prone.
IMO, come up with an API that can express the different things that you want to find out. And oops, in a roundabout way you now are getting a small limited scripting language in the kernel from tracing events and what do you know? Dtrace does that as well.
No, I'm not saying do it how foo does it but I can't see how dynamically generating C code to make a kernel module is seen superior to a small audited scripting language built into the kernel.
Though if anyone besides Ingo had tried getting that scripting language into the kernel, I would laugh. _No Possible Way_. Seriously.
It would be as absurd as putting X drivers in the kernel did a few short years ago.