IP Software Compliance Tools -- Who Needs Them and Why?
Now there is a second such product, Palamida's IP Amplifier, and it's clear there is a market for such products. Cisco, for one, has just signed on with Palamida. Who really needs products like this, and why? And is there a difference between them?
Who Needs Software Compliance Tools?
Now that Free and Open Source software has hit the mainstream of the enterprise, businesses need to be certain that they are not taking on legal liabilities with the code. There are many licenses, and making sure a company is abiding by them all is complex. That's one reason you are hearing so many voices calling for simplifying and settling on fewer licenses. But it goes deeper than that.
"Everyone who distributes software should know what goes into it," attorney Lawrence Rosen explains. "And almost everyone who distributes software wants to comply with the relevant licenses. Most reputable software-based businesses recognize that playing fast-and-loose with copyright claims isn't worthwhile."
While most businesses today are pleased to adopt and incorporate open source products into their products and services, they want to know what licenses apply so that they can comply with the terms.
"That's what Black Duck and Palamida make possible," Rosen adds. "A distributor or user can know what open source software is in its own software and act accordingly, early in the cycle. It's now possible to evaluate license compatibility for specific component sets and plan appropriate combinations for use in products to be developed."
Unfortunately, developers sometimes use GPL code (or other licensed FOSS code) without telling management, thinking it's public domain. It isn't. And with outsourcing, developers are sometimes in other countries that may have more relaxed views on copyright, and this can cause problems. So when developers let things happen that they shouldn't (such as making unauthorized copies or derivative works), companies using these tools have an automated way to catch some of that and react appropriately before much bigger problems can develop.
Software practices are also changing. Application development today is becoming more like an assembly line, more a matter of assembling bits of code from open source projects and from outsourced firms and incorporating them into proprietary products than handcrafting 100% custom software. This isn't a bad thing, because it makes it possible to avoid having to reinvent the wheel -- one of the advantages of Open Source -- but it also means that checking on license terms and making sure you are complying with them all is vital to the process.
And there is no doubt that enforcement of the GPL is increasing, as Fortinet learned recently when a German court banned its U.K. subsidiary from further distribution of its firewall and antivirus products until the company complied with the GPL, which it promptly did.
Then there is the Sarbanes-Oxley Act, and its requirements for IT audits.
"The SEC's new rules on heightened corporate responsibility for public company reporting known as Sarbanes-Oxley require public companies to abide by internal procedures that are sufficient to provide reasonable assurance that the financial and non-financial information required to be disclosed in its periodic and current reports is accurate," says Karen Copenhaver, executive vice president and general counsel for Black Duck Software.
"Specifically, Sarbanes creates two new corporate governance requirements: assessment of internal controls over financial reporting (required by section 404 of the Act), and heightened corporate responsibility for financial reports (required by section 302 of the Act). It would be hard to overestimate the burden that compliance with these new rules has placed on public companies in the first few years since their enactment.
"Even before Sarbanes, public companies were required to address intellectual property matters in their current and periodic reports. A reporting company traditionally discloses the importance of its intellectual property assets to the company's business and any third-party intellectual property encumbrances on the company's ability to conduct its business. To the extent that a failure to identify or comply with third-party license obligations has an effect on the accuracy of any of this information, public companies will be concerned about compliance with their obligations under Sarbanes."
Obviously, Sarbanes-Oxley has upped the ante considerably. But most businesses and developers want to do the right thing anyway, apart from outside pressures. The tools don't set policy for a company, but they surely make it easier to make sure policies are observed.
What Do the Tools Offer?
Before automated software compliance tools were available, due diligence in checking software for infringing code was done by assigning the tedious task to senior software programmers in the company, who, together with lawyers, laboriously looked through the code. The problem with such a system, aside from the time it required and the drudgery, is that no one person knows all the Free and Open Source projects available by sight, let alone all the proprietary products you are not allowed to see without complex legal arrangements.
Automated systems are an obvious answer. What they provide is a Google-like collection of code. They've collected it all for you. Both tools scan for copyright infringement and can spot more than verbatim matches. But they do more than scan. Palamida says its IP Amplifier product automatically detects, manages and reports on the third-party, commercial and open source components that may exist in a customer's software code base. It consists of two key modules -- the Compliance Library and the Detector. Built with an automated collection system, the Compliance Library contains billions of source code snippets and millions of files from the most commonly used open source projects found in the market.
Palamida: "The Palamida IP Amplifier uses three different types of technologies to automate detection: source code fingerprinting, file digest matching, and, for Java files, namespace matching. This means the software is able to conduct both source code and binary code analysis. So for companies whose developers download whole libraries, compiled code, XML files, icons, text files, and include those resources into their code base, the software will still detect their usage even though their source code is not available and even if we do not have the components listed in our database."
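To make the simplest of those three techniques concrete, here is a minimal sketch of file digest matching: hash every file in a code base and look each hash up in a table of digests for known third-party files. The function names, the sample table and the choice of SHA-256 are assumptions made for illustration; the article does not say how Palamida computes or stores its digests.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def match_known_files(root: Path, known: dict[str, str]) -> dict[str, str]:
    """Map each file under root whose digest is in `known` to its origin.

    `known` is a hypothetical table: digest -> "project version" label.
    """
    hits = {}
    for p in sorted(root.rglob("*")):
        if p.is_file():
            digest = file_digest(p)
            if digest in known:
                hits[str(p.relative_to(root))] = known[digest]
    return hits
```

A digest only matches byte-identical files, which is why it works on binaries and icons but says nothing about code that has been edited after copying; that case needs something like the snippet fingerprinting Palamida describes.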
Next, there is a "layer of analysis that is beyond just code matching for reduction of false positives. We call this technology CodeRank. CodeRank looks at the code matches and evaluates the results on multiple levels, including uniqueness, coverage and clustering. How unique is that match to what is in the Palamida database? How much of a customer file matches a file in Palamida's database? How dense are the matches -- do they look like a continuous cut and paste, or does it look like two engineers coded against the same API?"
After their software evaluates the code matches, Palamida assigns a CodeRank number to the matches; the higher the CodeRank number the higher the chances of copying. In the scan results, users will see a list of all code that has matches and a list of all the third party products that they most likely came from, with the most likely on top.
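Palamida does not publish how CodeRank is calculated, so the following is only a toy score showing how the three factors named above (uniqueness, coverage and clustering) could be combined. The `Match` record, the formulas and the 0-100 scale are all assumptions invented for this sketch, not Palamida's method.

```python
from dataclasses import dataclass


@dataclass
class Match:
    line: int            # line in the customer file that matched
    db_occurrences: int  # number of database files containing the snippet (>= 1)


def longest_run(lines: list[int]) -> int:
    """Length of the longest run of consecutive line numbers."""
    best = run = 0
    prev = None
    for n in sorted(set(lines)):
        run = run + 1 if prev is not None and n == prev + 1 else 1
        best = max(best, run)
        prev = n
    return best


def toy_score(matches: list[Match], total_lines: int) -> float:
    """Hypothetical 0-100 score; higher suggests copying is more likely."""
    if not matches or total_lines <= 0:
        return 0.0
    lines = [m.line for m in matches]
    distinct = len(set(lines))
    coverage = distinct / total_lines           # how much of the file matched
    clustering = longest_run(lines) / distinct  # one block vs. scattered hits
    uniqueness = sum(1 / m.db_occurrences for m in matches) / len(matches)
    return round(100 * coverage * clustering * uniqueness, 1)
```

Under these assumptions, a contiguous ten-line block that appears in only one database file scores far higher than a handful of scattered lines that appear in hundreds of projects, which is the cut-and-paste versus same-API distinction the quote describes.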
Reports identify all components that include open source code and list their licenses, license text and related license information, in addition to the CodeRank. All of the data can be exported as XML, which lets users create custom reports, and it is also available as HTML reports.
Black Duck too offers a great deal more than just code scanning. Black Duck's Copenhaver: "We do more than just scan code. Our product provides a full suite of services covering project planning, code analysis and detection, license analysis and management, auditing and archival capabilities for the complete life cycle of software projects.
"From an open source perspective," Copenhaver adds, "we help developers manage the origins and obligations of code that they use so they can meet the expectations of the industry and community. But everything we do works for both open source and proprietary or commercial code. Users can add code prints and licenses into the system to manage their internal proprietary code along with open source.
"Our product helps people manage the introduction of licensed materials into their code bases, understand the obligations associated with that code (and combinations of components from different sources), provide an environment for controlled remediation of issues that arise and create an archivable record of the actions that were taken by the team along the way. Our products are designed to bring together developers, lawyers and business decision makers into a collaborative environment."
Black Duck offers an analysis 'engine' that processes licenses at a detailed level and alerts users to license conflicts and obligations of both software source and binary components and their combinations. The ProtexIP Knowledgebase contains detailed breakdowns of 500+ software licenses for automated comparison of license terms and notification of collective obligations, and the data is remotely updated frequently with new licenses as they come to market. Black Duck recently added what it calls Custom Code Prints, which give ProtexIP support for proprietary source code.
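The idea of collective obligations and conflicts across a set of components can be sketched in a few lines. The license names and terms below are placeholders invented for the example, not real license terms and not Black Duck's data model; a real knowledge base would encode far more detail per license.

```python
from itertools import combinations

# Hypothetical license records; names and terms are placeholders.
LICENSES = {
    "License-A": {
        "obligations": {"include license text"},
        "conflicts_with": set(),
    },
    "License-B": {
        "obligations": {"include license text", "publish source of derivative works"},
        "conflicts_with": {"License-C"},
    },
    "License-C": {
        "obligations": {"keep source confidential"},
        "conflicts_with": {"License-B"},
    },
}


def review(components: dict[str, str]) -> tuple[set[str], list[tuple[str, str]]]:
    """Given component -> license, return (all obligations, conflicting pairs)."""
    obligations: set[str] = set()
    for lic in components.values():
        obligations |= LICENSES[lic]["obligations"]
    conflicts = []
    for (c1, l1), (c2, l2) in combinations(sorted(components.items()), 2):
        if l2 in LICENSES[l1]["conflicts_with"] or l1 in LICENSES[l2]["conflicts_with"]:
            conflicts.append((c1, c2))
    return obligations, conflicts
```

Calling `review({"core": "License-C", "libfoo": "License-B"})` on this made-up table reports one conflicting pair and the union of the two licenses' obligations, which is the kind of output a human reviewer would then have to assess.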
Palamida claims a database of 40,000 of the most commonly used OSS projects and their associated licenses, monitoring more than 38 million open source files and billions of source code snippets. The Knowledge Base also contains all pertinent information regarding the open source projects: name, version number, project name, licensor, licensor information (when available), license, license text, and project URL, all using an automated collection toolset that incorporates information on all the new projects released on the major OSS repositories for real time updates.
The Palamida database takes up less than 10 GB of disk space, thanks to a compression algorithm, and it's all kept on a customer's own servers, behind their firewall. Its code is written in Java. IP Amplifier can be configured to search daily or weekly and has a set of configuration tools to integrate it into build systems.
Are There Any Differences?
The biggest differentiator is cost. IP Amplifier 3.0 is licensed on an annual subscription basis, for an unlimited number of users, at prices that begin at $50,000 and go up to $250,000 per year, depending on the customer's development environment. There is a 30-day Free Trial offer.
Black Duck now offers two options. You can pay an annual licensing fee for its multiuser ProtexIP product, at $25,000 per year, and then add additional charges based on the amount of code you have. Or, you can use their new hosted ProtexIP/OnDemand product, an online system for a single user, single project, 90-day sessions, for which you pay based on the amount of code you wish to scan. It costs $3,000 for 10 MB of code and costs scale up to $25,000 for 100 MB. A company thinking of acquiring another might wish to use the online tool, rather than purchase the more costly version.
Both products still require human analysis, naturally. There can be false matches, if two independent developers happen to write software that is very much the same, even if there has been no copying, just because there are only so many ways of writing the same instruction. Both tools not only report identical matches but also flag similarities between your source code and others' programs that are worth further investigation, and list issues for review. It's important to realize, however, that the tools scan and analyze copyright issues and licensing issues, not patent infringement. That is an entirely separate ballgame.
But for what they are designed to do, unquestionably they have simplified, organized, and improved the due diligence process.
Index entries for this article | |
---|---|
GuestArticles | Jones, Pamela |
Hope they check more than open source
Posted Jun 2, 2005 5:05 UTC (Thu) by JoeBuck (subscriber, #2330)
I know of a case where programmers for a proprietary company used code from "Numerical Recipes in C" in a product. Since they were honest people, they credited the source in the documentation, thinking that it was public domain code, free for the taking. Needless to say, it's not; the lawyers wound up sorting that one out.
Hope they check more than open source
Posted Jun 2, 2005 10:28 UTC (Thu) by rjw (guest, #10415)
I think that's quite a dodgy precedent really. I would have thought that the intended purpose of text books would imply some licence .... IANAL.
Seems like bad faith though. "Here is how to do numerical recipes in C. But don't actually *do* them after you read this!" Very odd....
Do you have a reference to this incident?
Hope they check more than open source
Posted Jun 2, 2005 17:16 UTC (Thu) by DennisJ (subscriber, #14700)
Distribution info for Numerical Recipes:
http://www.nr.com/infotop.html#distinfo
Only some of the code is public domain. For the rest, maybe it's easier to find something free on netlib (NR say this too, if you want to distribute the source).
Is a database derived from GPL code a derived product?
Posted Jun 2, 2005 12:19 UTC (Thu) by gregwilkins (guest, #515)
These products sound interesting and I'd love to see the reports they generate over OS projects made publicly available.
Which made me think... if a database and/or report is derived from the source code of a GPL'd project, then surely that is a derived product and thus covered by the GPL?
What about closed software
Posted Jun 2, 2005 17:48 UTC (Thu) by spitzak (guest, #4593)
Claiming GPL code is "dangerous" is FUD. *Far* more dangerous for companies is their employees stealing *closed* code, from former employers, from the companies their friends or family work for, from documentation they happened to acquire, and most importantly from ignoring the NDAs they signed.
I would like to see this technology applied to closed code. Maybe a company that wants to protect its code can contribute it to the database maintained by this service. Other companies can send their code to check against the database. Matching GPL code is a demonstration that their matching code works, so that it may be admissible as legal evidence without the need to reveal the code from either party.
What about closed software
Posted Jun 2, 2005 21:02 UTC (Thu) by giraffedata (guest, #1954)
I believe the point of calling GPL code dangerous is that developers mistakenly believe that it is freely copiable, since it almost is. There's little danger of that misbelief with source code that one got from a former employer's internal code library.
I don't know if it's true that developers are misled by GPL in that way, but I've heard stories. When FSF describes GPL, it usually stresses that the code is copyrighted, and is not public domain, which leads me to believe they've encountered such confusion.
Worth the money?
Posted Jun 2, 2005 21:12 UTC (Thu) by giraffedata (guest, #1954)
I wonder how much copyright you could infringe for the $50K cost of this assurance?
People act like accidentally infringing copyright will bring down your company. With GPL, some say it forces you to give away all your own source code. But all it does is make you pay the copyright owner the value of the stolen code. Where the copyright owner isn't selling the code for money, the most value the law would assign it is probably what it would cost someone to recreate it.
Worth the money?
Posted Jun 3, 2005 2:23 UTC (Fri) by brouhaha (subscriber, #1698)
"I wonder how much copyright you could infringe for the $50K cost of this assurance?"
Possibly as little as one or two copies of a single work.
Statutory damages are between $750 and $30K per count, unless the act of infringement is found to have been willful, in which case it can be up to $150K per count. Suppose you sold twenty thousand copies of a program then got sued for infringement. Assuming that it wasn't found to have been willful infringement, you could potentially be liable for $600M. As I understand it, statutory damage awards are more commonly in the $2500 per count range, but it's at the discretion of the court.
Worth the money?
Posted Jun 3, 2005 3:43 UTC (Fri) by giraffedata (guest, #1954)
Thanks. I was unaware there were statutory damages for copyright in US law. And these are particularly weird statutory damages, since the amount within that range is up to the discretion of the court. Statutory damages I know of are specifically designed to remove the discretion of the court -- the statute tells you exactly what the formula is. The idea is to save litigation expense and substitute the judgment of legislators for that of judges.
Anyway, there's one flaw in your arithmetic. There are no "counts" of infringement, since we're not talking about crimes here. The statutory damage clause says the $30K limit is for each work, not each copy. Where you see it multiply into the millions is where you have something like Napster that involves thousands of works.
Worth the money?
Posted Jun 3, 2005 4:35 UTC (Fri) by brouhaha (subscriber, #1698)
In the US, copyright infringement actually can be a criminal offense. This came about as part of the No Electronic Theft Act of 1997. The criminal charge carries a maximum penalty of five years in prison and a $250K fine, or twice the gross gain to any defendant or twice the gross loss to any victim, whichever is greater. On top of that, there is mandatory restitution.
However, the hypothetical case was presumably not a criminal case.
IP Software Compliance Tools -- Who Needs Them and Why?
Posted Nov 30, 2009 21:04 UTC (Mon) by imran238 (guest, #62237)
I'm looking for an application that will scan my code base and identify any open source code etc.
Thanks
free software to make sure you're in compliance when selling non-free software
Posted Dec 2, 2009 0:34 UTC (Wed) by xoddam (guest, #2322)
Me neither.
If you have a perceived need for such a tool, a priori it would seem that you aren't ideologically committed to extending the free-software ecosystem.
So asking for the tool to be free seems a bit rich.
Talk to the vendors mentioned in the article, and see what sort of licence they're prepared to sell you.