[Download PDF ↓](/docs/legal_knowing_machines/Knowing_Machines_-_USCO-Comment_10-30-23.pdf)

On August 30, 2023, the United States Copyright Office (USCO) requested public input on copyright law and artificial intelligence (AI), especially recent generative AI systems.

In this Comment, the Knowing Machines Project (Knowing Machines) urges USCO to rely on research-based, empirical findings to inform its regulatory agenda and any recommendations to Congress on the open issue of using copyright-protected works to train AI models. USCO should advocate for support and funding to develop data investigation tools that inform its assessment of training datasets for generative AI systems (GenAI) and their potential impact on the copyright system as a whole. We briefly discuss Knowing Machines' experience building See:Set, a tool for investigating training datasets, to demonstrate some of the ways in which data investigations may provide empirical findings to support evidence-based policymaking. We also recommend that USCO study and fund the development of best practices for dataset creation, curation, recordkeeping, and maintenance.

Our main point: "We understand the difficulties of gaining a deep understanding of these training datasets firsthand. We need new investigatory methods to uncover the hidden problems inscribed in machine learning processes. Because dataset creators and AI developers lack standardized ex ante dataset transparency and recordkeeping requirements, we now rely almost exclusively on ex post data investigations for research, often unable to identify all the necessary information we need to understand datasets, especially in a copyright context. Although it is challenging, we urge the USCO to support evidence-based research concerning the nature of training datasets and their role in GenAI outputs, minimizing the influence of conjecture in the policymaking process."

Knowing Legal Machines

Generative AI Legal Explainer

Comments of the Knowing Machines Research Project to the United States Copyright Office Notice of Inquiry on Artificial Intelligence and Copyright

Amici Brief of Science, Legal, and Technology Scholars in Renderos et al. v. Clearview AI, Inc. et al., No. RG21096898 (Superior Ct. Alameda County)

Clearview AI Is Deploying a California Law Meant to Protect Activists From Bogus Lawsuits

The First Amendment Should Protect Us from Facial Recognition Technologies – Not the Other Way Around

The Right of Publicity: A New Framework for Regulating Facial Recognition

Freedom of Information Laws and Access to Government Data in the Age of AI: Two Recent Cases

Comments of the Knowing Machines Research Project to the White House Office of Science and Technology Policy on Automated Worker Surveillance and Management

Tell the White House to Limit AI-Driven Worker Surveillance

Comments of the Knowing Machines Research Project to the Federal Trade Commission