Testing for kernel performance regressions
Posted Aug 4, 2012 22:12 UTC (Sat) by drdabbles (guest, #48755)
Parent article: Testing for kernel performance regressions
The next problem is that there is little incentive for the hardware vendors that contribute directly to the kernel (I'm looking at you, Intel) to also contribute a test suite for their contributions. Moreover, it's quite conceivable that contributing a test suite would allow people to reverse engineer the hardware in question to some extent. I don't see this as a problem, but then again I don't make billions of dollars a year selling chipsets embedded in nearly every device in existence. There may also be regulatory concerns, much like those that open-source WiFi drivers had to contend with in the US several years ago. So it's a sticky situation.
Additionally, you have the problem of how to actually execute tests. Software bugs aren't always painfully obvious. You don't always get a panic or segfault when a program or driver messes up; in fact, sometimes the program runs on, perfectly unaware that it has completely ruined data. This problem of subtle bugs shows up frequently in audio chipset drivers. Sometimes an upgrade causes audio artifacts, sometimes the ports are reversed, and sometimes the power-saving mechanisms act wonky, but only after a suspend following a warm reboot where a particular bit in memory wasn't initialized to zero. These things are extremely hard to detect with a test suite, because the suite has no idea whether the audio is scratchy or whether the port a user expects to be enabled is working properly.
Finally, if you could overcome the issues above, you still have the case where suspend/resume, reboot, or long run time causes a problem. To test this, the test suite needs complete access to a computer at a very low level. Virtualization will only get you a small portion of the way there; things like PCI passthrough are making this kind of test easier, but that in itself invalidates the test at a very basic hardware-access level. This is where the idea of hardware vendors contributing resources to a test farm becomes a great idea, and as a comment above mentioned, the Linux Foundation could create some incentives for this. A platinum list of vendors and devices would be excellent! My organization ONLY uses Linux and *BSD in the field, so having that list to purchase from would be a pretty big win for us.
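To make the suspend/resume case concrete, here is a minimal sketch (mine, not from the comment) of the kind of low-level exercise a test box would have to run: it arms the RTC wake alarm and writes to /sys/power/state in a loop. The five-cycle count and the 30-second wake delay are arbitrary assumptions, and it needs root on a machine whose state you don't mind losing.

```c
/*
 * Sketch only: cycle suspend-to-RAM a few times using the standard
 * sysfs interfaces, waking via the RTC alarm.  Cycle count and delay
 * are arbitrary; run as root on a disposable test box.
 */
#include <stdio.h>
#include <stdlib.h>

static int write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fprintf(f, "%s", val);
	return fclose(f);
}

int main(void)
{
	for (int i = 0; i < 5; i++) {
		/* Clear any stale alarm, then arm the RTC to fire in 30 seconds. */
		write_str("/sys/class/rtc/rtc0/wakealarm", "0");
		write_str("/sys/class/rtc/rtc0/wakealarm", "+30");

		/* Enter suspend-to-RAM; the write blocks until resume. */
		if (write_str("/sys/power/state", "mem") != 0) {
			fprintf(stderr, "cycle %d: suspend failed\n", i);
			return EXIT_FAILURE;
		}
		printf("cycle %d: resumed\n", i);
	}
	return EXIT_SUCCESS;
}
```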
I think the solution will have to come in several layers. A social change will need to be made, so that developers don't dread writing test suites for their software. A policy change may be needed, such as: "No further major contributions will be accepted without a test suite attached as well." And finally, the technical infrastructure to actually execute these test suites will have to be built.
The good news is that a TESTSUITE=yes boot option that kicks the kernel into a mode where only test suites are executed would be pretty easy. The box would never boot the OS; it would just sit there running tests for every built component (module or in-kernel), hammering on the hardware to make drivers fail. Passing a further option that points at a storage device could be useful for saving results to a raw piece of hardware in a format that could be uploaded and analyzed easily.
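As a rough illustration of how such a boot option could hook in, the kernel already provides the __setup() mechanism for command-line parameters. The sketch below is purely hypothetical: the testsuite= name, run_builtin_selftests(), and the result-device handling are invented for the example; only __setup() and late_initcall() are existing kernel interfaces.

```c
/*
 * Hypothetical sketch of the "TESTSUITE=yes" idea, built on the kernel's
 * real __setup() boot-parameter mechanism.
 */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/types.h>

static bool testsuite_mode;

/* Parse "testsuite=yes" from the kernel command line. */
static int __init testsuite_setup(char *str)
{
	if (str && !strcmp(str, "yes"))
		testsuite_mode = true;
	return 1;
}
__setup("testsuite=", testsuite_setup);

static int __init testsuite_run(void)
{
	if (!testsuite_mode)
		return 0;

	pr_info("testsuite: running driver self-tests instead of booting\n");
	/* run_builtin_selftests();        -- hypothetical per-driver test hook */
	/* write_results_to_raw_device();  -- hypothetical result logging */
	return 0;
}
late_initcall(testsuite_run);
```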
Posted Aug 4, 2012 23:44 UTC (Sat) by pabs (subscriber, #43278)
Posted Aug 4, 2012 23:54 UTC (Sat) by drdabbles (guest, #48755)
Having to install 2, 3, or 4 test suite packages just to run the tests means nobody will ever actually run them.
Perhaps a solution like Git, something built specifically for the Linux kernel's use case, could be helpful.
Posted Aug 5, 2012 2:56 UTC (Sun) by shemminger (subscriber, #5739)
Random testing is often better than organized testing! Organized testing works for benchmarks, but the developer in Taiwan who boots on a new box and reports that the wireless doesn't work is priceless.