AI Tool Rips Off Open Source Software Without Violating Copyright (404media.co) 85
A satirical but working tool called Malus uses AI to create "clean room" clones of open-source software, aiming to reproduce the same functionality while shedding attribution and copyleft obligations. "It works," Mike Nolan, one of the two people behind Malus, who researches the political economy of open source software and currently works for the United Nations, told 404 Media. "The Stripe charge will provide you the thing, and it was important for us to do that, because we felt that if it was just satire, it would end up like every other piece of research I've done on open source, which ends up being largely dismissed by open source tech workers who felt that they were too special and too unique and too intelligent to ever be the ones on the bad side of the layoffs or the economics of the situation." 404 Media reports: Malus's legal strategy for bypassing copyright is based on a historically pivotal moment for software and copyright law dating back to 1982. Back then, IBM dominated home computing, and competitors like Columbia Data Products wanted to sell products that were compatible with software that IBM customers were already using. Reverse engineering IBM's computer would have infringed on the company's copyright, so Columbia Data Products came up with what we now know as a "clean room" design.
It tasked one team with examining IBM's BIOS and creating specifications for what a clone of that system would require. A different "clean" team, one that was never exposed to IBM's code, then created BIOS that met those specifications from scratch. The result was a system that was compatible with IBM's ecosystem but didn't violate its copyright because it did not copy IBM's technical process and counted as original work.
This clean room method, which has been validated by case law and dramatized in the first season of Halt and Catch Fire, made computing more open and competitive than it would have been otherwise. But it has taken on new meaning in the age of generative AI. It is now easier than ever to ask AI tools to produce software that is identical in function to existing open source projects, and that, some would argue, are built from scratch and are therefore original work that can bypass existing copyright licenses. Others would say that software produced by large language models is inherently derivative, because like any LLM output, it is trained on the collective output of humans scraped from the internet, including specific open source projects.
Malus (pronounced malice), uses AI to do the same thing. "Finally, liberation from open source license obligations," Malus's site says. "Our proprietary AI robots independently recreate any open source project from scratch. The result? Legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems." Copyleft is a type of copyright license that ensures reproductions or applications of the software keep it free to share and modify.
It tasked one team with examining IBM's BIOS and creating specifications for what a clone of that system would require. A different "clean" team, one that was never exposed to IBM's code, then created BIOS that met those specifications from scratch. The result was a system that was compatible with IBM's ecosystem but didn't violate its copyright because it did not copy IBM's technical process and counted as original work.
This clean room method, which has been validated by case law and dramatized in the first season of Halt and Catch Fire, made computing more open and competitive than it would have been otherwise. But it has taken on new meaning in the age of generative AI. It is now easier than ever to ask AI tools to produce software that is identical in function to existing open source projects, and that, some would argue, are built from scratch and are therefore original work that can bypass existing copyright licenses. Others would say that software produced by large language models is inherently derivative, because like any LLM output, it is trained on the collective output of humans scraped from the internet, including specific open source projects.
Malus (pronounced malice), uses AI to do the same thing. "Finally, liberation from open source license obligations," Malus's site says. "Our proprietary AI robots independently recreate any open source project from scratch. The result? Legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems." Copyleft is a type of copyright license that ensures reproductions or applications of the software keep it free to share and modify.