Leading items
The coming robot apocalypse
Avid science fiction readers will have run across the "robot takeover" concept more than once. In short, the human race succeeds in building robots that are smart and powerful enough to take charge; the results are rarely presented as being pleasant or desirable. Science fiction has a habit of becoming reality on occasion; while it may be a bit early to proclaim the robot apocalypse, recent events make it clear that some concern is not out of place.
Ironically, one of those events involved a science fiction convention — arguably the most important one on the genre's annual calendar. As Neil Gaiman took the stage to accept a Hugo award at WorldCon, an automated copyright enforcement system at UStream disconnected the network feed, claiming that copyrights were being infringed. That turned out not to be the case; the offending Doctor Who segments had been provided by the studio for the purpose of being run before the talk, but the robot involved was not interested in such details. Remote fans who wanted to watch the live stream were denied the privilege in the name of protecting copyright.
Shortly after that, the national convention of the US Democratic party was hit by a similar episode; Mars lander footage has also been denounced by the robots in the past. Perhaps more importantly, there is a growing list of people who lack the prominence of a major author, politician, or interplanetary probe who have run into similar difficulties. Needless to say, these lower-profile publishers tend to have a harder time drawing attention to the problem or getting it fixed. Increasingly, the right to publish is subject to the will of anonymous, semi-autonomous software that is given veto power over any material it does not like. This does not seem like a viable path toward greater freedom.
The free software community, arguably, does not pay as much attention to copyright issues as it should. Copyright infringement is relatively easy to avoid while developing software, and, as the SCO Group so kindly made clear to the world, we are quite good at avoiding infringement. Related issues, like ever-lengthening copyright terms, are mostly of academic interest; whether coverage lasts for 50 or 70 years (or longer), it is hard to imagine that today's software will be of much interest when it finally makes its way into the public domain. So it is natural for us to worry about things that look like more immediate threats: software patents or new init systems, for example.
But the rise of autonomous enforcement bots raises a whole new set of threats. If there is anything that is clear about the "intellectual property" industry, it is that industry's willingness to use every tool and technique available — and to try to force others into doing the same. UStream almost certainly did not set out to be a copyright enforcer as part of its business plan; that role was forced upon it by the entertainment industry. There has been great pressure on internet service providers to do the same. It is not hard to imagine aggressive enforcement bots moving outward from the source of traffic (UStream or YouTube, say) into the transmission path and, eventually, into the endpoints where content is consumed.
It is also not hard to imagine that, as these bots spread across the network, their mission will expand as well. Why be satisfied with interfering with video distribution when there is so much more that could be done? Certainly these bots could be charged with stopping anything that looks like a "circumvention tool" — jailbreak kits or alternative images for locked-down phone handsets, for example. Software that has been deemed to infringe somebody's patents — Android, say — could be blocked. Electronic book readers already phone home with their users' reading habits; it would not be hard at all to block access to reading material that lacks a commercial paper trail or that offends whatever powers are in charge in any given area.
Much of the infrastructure for this kind of regime already exists. There is monitoring software for central nodes, and DRM-enforcement systems for the end nodes. And if your particular system lacks the appropriate hooks, there are companies like FinFisher that are happy to put monitoring and control software in place without concern for niceties like whether the user actually wants it.
In other words, there is little that is new about what is going on except, perhaps, the increasing role of autonomous software agents. These bots have little concern for issues like authorized use, much less fringe details like fair use or inappropriate copyright claims. Their use is on the increase; more stories of ridiculous bot-driven shutdowns seem certain in the future. It may not be the robot apocalypse, but it sure looks like the beginning of a large amount of robot-driven obnoxiousness at the very least.
Naturally, free software can help. We need not build interfaces for our robot overlords into our software; indeed, we have shown little interest in doing that in the past. When combined with true control over our hardware, free software can, at least, give us some assurance that our endpoints are not acting against our interests. Control over the hardware is far from guaranteed, but the situation could be far worse than it currently is. With luck and continued attention and pressure, we may be able to avoid a completely locked-down world.
Fixing the rest of the system will be harder. It is time to pay more attention to the copyright maximalist agenda and push back. Fair use rights must be asserted where they exist and created where they don't. The business concerns of the entertainment industry should not drive the design of our systems, our networks, and our international agreements. Perhaps there should be penalties for false assertions of copyright. And so on.
The free software community interacts deeply with the copyright system. We make full use of it in our software licenses; even permissive licenses have requirements that are backed up by copyright law. But the system we use to ensure the freedom of our software can also take away our freedom on other fronts if we do not pay attention. A world where our right to express ourselves is moderated by somebody else's software — usually very proprietary software — is not what we have been working for.
LinuxCon: Open hardware for open hardware
Open hardware platforms like the Arduino have turned device development into a hobbyist enterprise in recent years, but the $20 price tag of a microcontroller board seems a lot less tantalizing when one adds in the costs of testing and debugging it. At LinuxCon 2012 in San Diego, David Anders addressed this issue and offered some guidance on finding and selecting tools for open hardware development, the majority of which are open hardware themselves.
Openness and tools
"Open hardware" can mean a variety of things, from expensive commercial products with published schematics and chip designs all the way down to one-off experiments and home-brewed devices built from cheap parts like the Arduino microcontroller board. What the various definitions have in common, however, is the sharing of information, which in turn lowers the barrier to entry for participants in the community. But despite the "maker" movement's popularity of late, the tools problem that accompanies it is rarely discussed. The reality is that the hardware to build rapid-prototyping and one-off projects may be cheap and plentiful — but the tools required to test and debug that hardware are expensive, specialized, and proprietary.
For example, bench-top oscilloscopes start at $250 and can go up well into the hundreds of thousands of dollars. Logic analyzers start at around $1000. Even sticking with the low end, Anders said, buying a tool that costs ten or one hundred times the price of the device you are building takes some of the shine off the process, particularly for someone who only needs to make hardware on infrequent occasions. Furthermore, the commercial versions of tools like the oscilloscope are designed for use by people like electrical engineers, and have a steep learning curve.
On the subject of occasional use, Anders noted that although the maker and open hardware movements are often associated with non-professional settings (such as teaching young people about electronics or scratching one's own project-itch), they are proving to be a disruptive force in software circles as well. He spoke mostly about Arduinos and similar microcontroller boards, since they are the most popular flavor of amateur hardware development, but he made it clear that the same issues apply to most other projects, from sensors to embedded systems. Furthermore, he said, even if open hardware devices are not the issue at hand, most Linux and open source software developers will find themselves needing to build or debug a hardware device at some point.
Despite the high sticker prices of these devices in the commercial world, Anders said, science in general has a long history of developing and sharing open tools — dating at least as far back as Robert Bunsen's 1854 invention of the Bunsen burner, which he licensed for others to make as long as they adhered to his guidelines. Scientists have often documented their tools because their experiments require specialized equipment, he said, and sharing that part of the process is important for peer review and the ability to reproduce others' results.
In some ways, he said, open source software is a continuation of the same principle: developers build and share the tools needed to get work done. As open hardware has become more popular, a number of projects have started to develop the tools needed to analyze and debug devices — even if most of them are still under the radar.
Oscilloscopes
The first tool Anders looked at was the oscilloscope. The earliest versions of the tool drew graphs on paper tape, until CRT-based oscilloscopes took over in the 1950s — and the tool remained essentially unchanged for the next 50 years. It reads analog voltage signals, and displays the result as time-series data for analysis. Recent years have brought updates, he said, like LCD screens, better portability, and built-in storage and analysis features. But the core feature set remains the primary selling point; more expensive oscilloscopes justify their higher prices by supporting multiple inputs, higher sample rates, higher sample resolution, and larger frequency ranges.
With those factors in mind, Anders described three tiers of open hardware oscilloscopes to consider. The mid-range options are based on PIC microcontrollers. There are about ten vendors on the market, he said, with the cheapest selling kits for $60. A typical example from Sparkfun offers 8-bit sample resolution, 1MHz bandwidth, and memory to store 256 samples. The device can grab screen captures of the signals it reads, and export them as bitmap images.
An even cheaper option is Atmel AVR-based oscilloscopes, which can be had for less than $30. Some AVR-based designs are assembled from off-the-shelf components, and are intended to be plugged directly into a breadboard. At that price point, they do not contain storage memory or an attached display, but both could be added. At the high end of the spectrum is the Nano-DSO, for around $100. The Nano-DSO is a commercial project, but it has open source firmware that can be modified and re-flashed. It offers a higher sampling resolution (12-bit) than the PIC-based units, but a narrower bandwidth (200kHz). However, it also includes a color LCD display, battery power, and a built-in signal generator.
Dedicated hardware is not the only option, however. Anders showed two open source software oscilloscopes, xoscillo and OsciPrime. Xoscillo runs on Linux, Windows, and Mac OS X, and can use commercial USB oscilloscope dongles or an Arduino (which has analog input pins built-in) to read signals. OsciPrime is an Android application, and can read signals from probes connected to the Android device's microphone port.
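The Arduino-as-oscilloscope approach is simple to experiment with on the host side. The following is a minimal Python sketch, assuming (hypothetically) an Arduino whose firmware prints one raw 10-bit ADC reading per line over USB serial; it illustrates the general idea rather than xoscillo's actual protocol, and the device path is an assumption.

    # Read analog samples from an Arduino over USB serial and convert them
    # to voltages.  Assumes hypothetical firmware that prints one raw
    # 10-bit ADC reading (0-1023) per line; this is not xoscillo's protocol.
    import serial   # pyserial

    PORT = "/dev/ttyUSB0"   # adjust to match the Arduino's serial device
    VREF = 5.0              # ADC reference voltage on a stock Arduino
    SAMPLES = 256           # number of points to capture

    def capture(port=PORT, count=SAMPLES):
        """Read `count` ADC samples and return them as voltages."""
        voltages = []
        with serial.Serial(port, 115200, timeout=1) as conn:
            while len(voltages) < count:
                line = conn.readline().strip()
                if not line:
                    continue
                raw = int(line)                      # 0..1023 from the 10-bit ADC
                voltages.append(raw * VREF / 1023.0)
        return voltages

    if __name__ == "__main__":
        for i, volts in enumerate(capture()):
            print("%d\t%.3f V" % (i, volts))         # crude time-series dump

At serial-port data rates, a setup like this only captures low-frequency signals, which is one reason the dedicated hardware above still earns its price.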
Logic analyzers
The logic analyzer is a tool that evolved from the oscilloscope, Anders said. Its purpose is not to monitor the shape of signals, but to analyze their timing — particularly to capture and decode digital signals. For that usage, they typically offer far more probes than an oscilloscope, which increases the cost. Commercial models cost thousands of dollars, an amount most people cannot see investing in their home projects.
The most do-it-yourself friendly option is the Open Bench Logic Sniffer, which can read 32 simultaneous input channels at up to 70MHz. The Open Bench Logic Sniffer can be found for around $50. The same company also offers the Logic Shrimp, a more modest model with four input channels and a 20MHz maximum sampling speed. The lower specifications make the Logic Shrimp a $35 tool, Anders said, but it is more than adequate for sampling I2C, SPI, RS-232, and other low-speed connections. Both are USB peripherals designed to be compatible with the open source SUMP analyzer software.
There are also several open hardware logic analyzer boards based on microcontrollers, including FX2-based models ranging in price from $20 to $150, MSP430-based models ranging from $25 to $35, and AVR-based models ranging from $35 to $50. Some of these latter options include built-in displays, but for the most part the open hardware logic analyzer community relies on PC-based software to display the signals and to decode the protocols within them.
The major player in open source logic analysis software is Sigrok, Anders said. It supports a wide range of hardware devices and is growing by leaps and bounds, he said, so it is worth keeping on the radar even if your hardware is not supported at the moment. One of Sigrok's major selling points is that it is easily extended. It provides an API for writing protocol decoders in Python, and the project maintains a lengthy list of protocols it understands. Even those protocols for which there is no decoder can still be captured as raw signals, of course.
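To give a flavor of what a protocol decoder does, here is a self-contained Python sketch that recovers bytes from raw logic-analyzer samples of a simple UART-style signal (8N1, idle-high). It deliberately does not imitate libsigrokdecode's real decoder API, in which decoders subclass a Decoder class and report annotations back to the frontend; the sketch only illustrates the underlying idea.

    # Toy protocol decoder: recover 8N1 UART bytes from a list of 0/1 logic
    # samples.  samples_per_bit = sample_rate / baud rate.  This illustrates
    # the decoding idea only; it is not the sigrok decoder API.

    def decode_uart(samples, samples_per_bit):
        spb = samples_per_bit
        decoded = []
        i = 0
        while i + 1 + 10 * spb <= len(samples):
            # A falling edge marks the start bit (the line idles high).
            if not (samples[i] == 1 and samples[i + 1] == 0):
                i += 1
                continue
            start = i + 1
            byte = 0
            for bit in range(8):
                mid = start + (bit + 1) * spb + spb // 2  # middle of data bit
                byte |= samples[mid] << bit               # LSB arrives first
            stop = start + 9 * spb + spb // 2
            if samples[stop] == 1:                        # valid stop bit
                decoded.append(byte)
            i = start + 10 * spb                          # skip past the frame
        return decoded

    # Round-trip check: encode 0x41 at four samples per bit, then decode it.
    def encode_byte(value, spb):
        bits = [0] + [(value >> k) & 1 for k in range(8)] + [1]  # start, data, stop
        return [1] * spb + [level for level in bits for _ in range(spb)]

    assert decode_uart(encode_byte(0x41, 4), 4) == [0x41]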
In addition to Sigrok, the Logic Sniffer application is the other major open source software option. It was written to support the Open Bench Logic Sniffer (extending SUMP's feature set), but has expanded to cover additional hardware devices.
Other tools
Anders concluded the talk with a question-and-answer session that covered several other hardware tool topics. The Bus Pirate, for example, is a device designed to provide a serial port interface to a variety of chips, and even program serial ROMs. Anders said that he omitted the Bus Pirate from his main discussion because it is not primarily designed to perform oscilloscope or logic analyzer functions. It can be used to perform some of the same tasks, but it makes those tasks more difficult than do the other tools.
Similarly, when another audience member asked about JTAG tools, Anders observed that there are several open tools on the market. In addition, the Bus Pirate hardware can even be re-flashed with different firmware so that it functions as a JTAG interface.
Anders also alluded to support for automotive communication buses as a feature of Sigrok. Another member of the audience asked whether that support included Controller Area Network (CAN) bus, one of the leading automotive buses, and one for which PC interfaces are expensive. CAN bus uses differential logic, Anders explained, which requires a transceiver module to convert the signal into generic GPIO. However, the actual parts involved are not expensive, he said, so amateurs can build an interface that allows them to read CAN bus traffic with Sigrok.
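Anders's scenario feeds the transceiver's output to a logic analyzer and lets Sigrok do the decoding. A different but related route, once a kernel-supported CAN adapter is in place, is Linux's SocketCAN interface, which exposes frames to ordinary sockets. The sketch below is a hedged illustration of that approach, not something from the talk; it assumes Python 3.5 or later and an interface named can0.

    # Read raw CAN frames from a Linux SocketCAN interface ("can0" assumed).
    # Requires Python 3.5+ and a kernel driver for the CAN adapter in use.
    import socket
    import struct

    CAN_FRAME_FMT = "<IB3x8s"   # 32-bit ID, length byte, 3 pad bytes, 8 data bytes
    CAN_FRAME_SIZE = struct.calcsize(CAN_FRAME_FMT)   # 16 bytes

    def dump_frames(interface="can0", count=10):
        """Print the ID and payload of the next `count` frames on the bus."""
        sock = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW)
        sock.bind((interface,))
        try:
            for _ in range(count):
                frame = sock.recv(CAN_FRAME_SIZE)
                can_id, length, data = struct.unpack(CAN_FRAME_FMT, frame)
                print("ID 0x%03x: %s" % (can_id & socket.CAN_EFF_MASK,
                                         data[:length].hex()))
        finally:
            sock.close()

    if __name__ == "__main__":
        dump_frames()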
Ultimately, Anders only had time to survey the available options for hardware development tools, rather than provide in-depth comparisons. But it was still a valuable session; most of the attendees at an event like LinuxCon come from the software realm — but as Anders said, most of them will (at one time or another) need to build or debug a hardware device. Having an alternative to the high prices of professional equipment is nice, but having an alternative that respects the same ideals as Linux and open source software is even better.
Engine Yard transitions to PostgreSQL
PostgreSQL version 9.2 was released on September 10th, with many enhancements that web application developers—and companies who host web applications—are excited about. One of the most excited is Ruby on Rails and PHP application host Engine Yard, which recently switched its default database option from MySQL to PostgreSQL. Combined with recent data on database migrations, Engine Yard's switch signifies a sea change in the long rivalry between PostgreSQL and MySQL, as well as in Ruby on Rails development.
For coverage of the PostgreSQL 9.2 features, see the LWN article on PostgreSQL 9.2 Beta.
For information about the switch, I interviewed Engine Yard Lead Data Engineer Ines Sombra. But first, some background. Readers who are already familiar with Rails, MySQL, and PostgreSQL can skip down to the interview.
Ruby on Rails and MySQL
Ruby on Rails is a full-stack web framework, based on automated code generation. Rails is implemented in Ruby, a programming language which existed for ten years before Rails was introduced in 2004. It's "full stack" because it handles all parts of a web application above the database and operating system level, including querying the database, generating HTML pages, and responding to HTTP requests.
Rails has become one of the top five web development platforms because of its rapid, beginner-friendly approach to building web applications. The centerpiece of this is a code-generation engine, which creates the initial "scaffolding" for the web application developer, using the Model-View-Controller (MVC) design pattern, and saves the developer a lot of time by handling many of the repetitive tasks that are normally required. The simplest Rails applications perform what is known as CRUD: Creating, Reading, Updating, and Destroying records in a database.
Early Rails supported only the MySQL database system, since it was regarded as the simplest, most developer-friendly SQL database available. That allowed developers to focus on keeping all of the application logic inside Rails. While there were early attempts to introduce PostgreSQL support, none of them really caught on until Rails 3.0 was released, so the vast majority of Rails developers used only MySQL through 2010.
Rails hosting, Engine Yard, and Heroku
While Rails made creating and developing a web application much easier than before, it did nothing to reduce the difficulty of hosting a web application. If anything, the highly dynamic nature of Rails makes it one of the most resource-consumptive web frameworks, requiring skill and experience in deployment, scaling, and uptime maintenance. Recognizing this dichotomy in 2006, Tom Mornini, Lance Walley, and Ezra Zygmuntowicz founded Engine Yard, a "fully managed web host" or "cloud host" for Rails projects. Engine Yard allowed the developer to just write code, and leave installation, scaling, and uptime to others — for a monthly hosting fee.
Initially, Engine Yard supported only MySQL for data storage, and hired a large, expert MySQL team to manage thousands of MySQL databases. Engine Yard built a sophisticated "database as a service" infrastructure on MySQL to support its customers. Since Engine Yard was the number one Rails hosting option, this meant that the majority of hosted Rails applications were not just on MySQL, but on Engine Yard's MySQL.
However, a year after Engine Yard launched, another team of entrepreneurs founded a competing Rails hosting service: Heroku.com. Heroku introduced a Git-centric model of application deployment that appealed to startups practicing agile development, and by 2010 it had grown into a strong competitor to Engine Yard for Rails users. Unlike Engine Yard, Heroku used PostgreSQL as its default—originally its only—database for customers. Heroku began promoting PostgreSQL among Rails developers and its user base, culminating with the introduction of a PostgreSQL cloud hosting service this year.
Today, both Engine Yard and Heroku support additional platforms, such as PHP, Python, and Java, in addition to Rails.
Migrations from MySQL to PostgreSQL
In April of this year, Engine Yard introduced PostgreSQL 9.1 as an option for its users. In August, Engine Yard announced that, with the release of PostgreSQL 9.2, it would become the default database option for new applications.
Engine Yard's users are not alone in migrating from MySQL to PostgreSQL. 451 Research recently released a subscription-only study, called "MySQL vs. NoSQL and NewSQL: 2011-2015", that looked at tech sector MySQL users and their plans two years after the Oracle acquisition. For the one out of six MySQL users planning to migrate away from MySQL, the most popular option is PostgreSQL. In the Rails community, Planet Argon found that 60% of Rails developers now prefer PostgreSQL, up from 14% just three years ago.
Interview with Ines Sombra
To fill us in on the details of Engine Yard's move from MySQL to PostgreSQL, I interviewed Ines Sombra. Ines, who was born in Argentina, became a Rails developer and enthusiast while working at Engine Yard.
Josh: What's your role at Engine Yard?
Josh: What was the primary motivation for Engine Yard to offer PostgreSQL support?
- Feature parity with Engine Yard Managed, where PostgreSQL has always been supported.
- Access to flexible and reliable replication modes: Streaming replication and hot standby in major version 9.0 were huge. Hot standby, in particular, was one of the final points of oft-asked-for, and needed, feature parity with MySQL.
- Rails 3.1 came out with significant performance enhancements using PostgreSQL's prepared statements.
- Versatility of natively supported features like full-text search, geolocation, and foreign data wrappers, reducing the need for third-party tools in our clients' deployments.
- Outstanding third party support options for our customers, both commercially and from the PostgreSQL community.
PostgreSQL was also loved internally, since the majority of our internal applications are already running on Postgres. We truly believe that PostgreSQL is the future of open-source relational databases and we are happy to provide our customers with great support for it.
Josh: Do you expect Engine Yard customers currently using MySQL to migrate to the new PostgreSQL databases?
While we will always support customers using MySQL, we expect the number of new PostgreSQL applications to grow as the new default makes it easier than ever to get started.
Josh: What was the timeline of this change? How long did it take?
Our early PostgreSQL 9.0 release was amongst the most popular we've ever had. Customers started using it in production applications immediately, so we accelerated our engineering processes to better serve their requirements. Within four months of 9.1 becoming available on our platform, we were able to make it our new default database.
Josh: What were the biggest technical obstacles you encountered in deploying a PostgreSQL infrastructure, and how did you solve them?
Redefined Assumptions: We have traditionally been a MySQL shop and our product made assumptions based on the existence of a MySQL database. Our tests and continuous integration suite assumed that every new environment would have a MySQL database associated with it. Defaulting to PostgreSQL in our codebase allowed us to introduce the concept of a new default. We were able to break away from dependencies by refactoring tests and redefining what we expect from customer environments.
Allowed Extensions: PostgreSQL has a rich extension ecosystem and we want to encourage our customers to explore it. Engine Yard Cloud customers have dedicated instances that can be further customized by applying custom Chef recipes. We provide a way to enable over 30 available extensions on PostgreSQL environments. We curate this repository for all supported versions of PostgreSQL and continually add new extensions based on customer requests. PostGIS and hstore have been the most popular extensions installed.
Standardized Architecture: Engine Yard Cloud sits on top of AWS [Amazon Web Services] and we rely on EBS [Elastic Block Storage] volumes for database persistence. Unfortunately, snapshots taken on 32-bit [PostgreSQL] instances cannot be used on volumes mounted in 64-bit architectures. Customers with PostgreSQL databases had to dump and restore their data in order to vertically scale to bigger instance sizes. We solved this problem by rolling out 64-bit instance types for small and medium sizes and defaulting all databases to use a 64-bit architecture.
Ease of Upgrades and DR [disaster recovery]: At the moment we are working to make the process of transitioning between database versions easier and more automated. We are looking at tools (like repmgr) that would allow us to replicate across versions and environments. One of our high priority items is to roll out PostgreSQL 9.2 support based on best practices we've learned along the way and allow customers to upgrade with minimal impact.
Josh: Which 9.2 features are Engine Yard users most interested in, and why?
Over the last year, we have seen an increase in the use of document-oriented databases like MongoDB. With native JSON support, developers have access to the schema flexibility of NoSQL databases while continuing to enjoy the ACID guarantees, operational simplicity, and transactional efficiency of PostgreSQL.
JSON validation in the database helps simplify application logic and ensures that any client that connects to the database will have a consistent way to manipulate and save this type of data. We think this feature alone will be a great upgrade motivator and are looking forward to seeing it live.
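A minimal sketch of that validation in action, using psycopg2 against a hypothetical local database named appdb (the table and column names are likewise illustrative); the json type in 9.2 validates and stores the document, while richer operators for querying inside it arrived in later releases:

    # Show PostgreSQL's json type rejecting malformed documents at insert time.
    # Assumes a local database named "appdb" reachable by the current user.
    import psycopg2

    conn = psycopg2.connect("dbname=appdb")
    conn.autocommit = True
    cur = conn.cursor()

    cur.execute("CREATE TABLE IF NOT EXISTS events "
                "(id serial PRIMARY KEY, payload json)")

    # Well-formed JSON is stored as-is.
    cur.execute("INSERT INTO events (payload) VALUES (%s)",
                ('{"user": "ines", "action": "deploy"}',))

    # Malformed JSON is rejected by the server, not by application code.
    try:
        cur.execute("INSERT INTO events (payload) VALUES (%s)",
                    ('{"user": "ines", "action": ',))
    except psycopg2.DataError as err:
        print("rejected by the database:", err)

    cur.execute("SELECT payload FROM events")
    for (payload,) in cur.fetchall():
        print(payload)

    cur.close()
    conn.close()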
[Josh Berkus is a member of the PostgreSQL Core Team.]
Page editor: Jonathan Corbet