AI Disclosures: A Regulatory Review

By Elliot Mckernon, Deric Cheng, Convergence Analysis @ 2024-03-29T11:46 (+12)

Cross-posted on LessWrong.

This article is the fourth in a series of ~10 posts comprising a 2024 State of the AI Regulatory Landscape Review, conducted by the Governance Recommendations Research Program at Convergence Analysis. Each post will cover a specific domain of AI governance (e.g. incident reporting, safety evals, model registries). We’ll provide an overview of existing regulations, focusing on the US, EU, and China as the leading governmental bodies currently developing AI legislation. Additionally, we’ll discuss the relevant context behind each domain and conduct a short analysis.

This series is intended to be a primer for policymakers, researchers, and individuals seeking to develop a high-level overview of the current AI governance space. We’ll publish individual posts on our website and release a comprehensive report at the end of this series.

What are disclosures and why do they matter?

The public and regulators have legal rights to information about goods and services. For example, food products must have clear nutritional labels; medications must disclose their side effects and contraindications; and machinery must come with safety instructions.

In the case of AI, these legally mandated disclosures can cover several topics, such as:

Labels and watermarks

Labels and watermarks vary in design: some are subtle, some conspicuous; some are easy to remove, some difficult. For example, Dall-E 2 images carry five coloured squares in their bottom right corner, a conspicuous label that’s easy to remove.

Dall-E 3, however, adds invisible watermarks to generated images, which are much harder to remove. Watermarking techniques are less visible than labels and are evaluated on criteria such as perceptibility and robustness. A technique is considered robust if the resulting watermark resists both benign and malicious modifications; semi-robust if it resists only benign modifications; and fragile if the watermark isn’t detectable after any minor transformation. Note that fragile and semi-robust techniques are still useful, for example in detecting tampering.

Imperceptible watermarking methods might embed a signal in the “noise” of the image such that it isn’t detectable to the human eye, and is difficult to fully remove, while still being clearly identifiable to a machine. This is part of steganography, the field of “representing information within another message or physical object”. 

For example, the Least Significant Bit (LSB) technique adjusts unimportant bits in images or sound files to carry messages. The number 73 represented in binary is 1001001: the leftmost “1” is the most significant bit, representing 2⁶ = 64, while the rightmost “1” represents just 1, so it can be flipped to carry part of a message without noticeably changing the value. LSB is relatively fragile, while other techniques like the Discrete Cosine Transform (DCT) use Fourier-related transforms to subtly adjust images at a more fundamental level, and are thus robust against attacks such as adding noise, compressing the image, or applying filters. Other popular techniques include DWT (discrete wavelet transform) and SVD (singular value decomposition), and there are open technical standards such as C2PA that have been adopted by organizations like OpenAI.
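To make the LSB idea concrete, here is a minimal Python sketch (the function names and the toy eight-pixel “image” are ours, not part of any standard library): it overwrites the lowest bit of each pixel value with one message bit, so no pixel changes by more than 1, yet the message can be read back exactly.

```python
# Minimal sketch of least-significant-bit (LSB) embedding, using NumPy.
# Function names and the toy 8-pixel "image" are illustrative, not a standard API.
import numpy as np

def embed_lsb(pixels: np.ndarray, bits: list[int]) -> np.ndarray:
    """Overwrite the least significant bit of the first len(bits) pixel values."""
    out = pixels.flatten().copy()
    if len(bits) > out.size:
        raise ValueError("message longer than cover image")
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return out.reshape(pixels.shape)

def extract_lsb(pixels: np.ndarray, n_bits: int) -> list[int]:
    """Read the least significant bit of the first n_bits pixel values."""
    return [int(v) & 1 for v in pixels.flatten()[:n_bits]]

# Toy example: hide the 7-bit pattern for 73 (1001001) in an 8-pixel "image".
cover = np.array([200, 201, 202, 203, 204, 205, 206, 207], dtype=np.uint8)
message = [1, 0, 0, 1, 0, 0, 1]
stego = embed_lsb(cover, message)
assert extract_lsb(stego, len(message)) == message
print(stego)  # each pixel value changes by at most 1, invisible to the eye
```

Because the message lives only in the lowest bits, any re-encoding that perturbs pixel values, such as JPEG compression, will typically destroy it, which is why LSB is classed as fragile.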

Text is harder to watermark subtly, as text carries far less redundant “noise” in which to hide a signal than an image does. Watermarking can still be applied to metadata, and there are techniques derived from steganography that add hidden messages to text, though these can easily be disrupted and aren’t under major consideration by legislators or AI labs.
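As an illustration of why such text techniques are easy to disrupt, here is a toy sketch (our own, not a technique used by any particular lab) that hides bits as invisible zero-width Unicode characters between words; simply stripping non-printing characters removes the mark entirely.

```python
# Toy sketch: hide bits in text using zero-width Unicode characters.
# ZWSP (U+200B) encodes 0, ZWNJ (U+200C) encodes 1; illustrative only.
ZERO, ONE = "\u200b", "\u200c"

def embed_bits(text: str, bits: str) -> str:
    """Append one invisible character per bit after successive words."""
    words = text.split(" ")
    if len(bits) > len(words):
        raise ValueError("not enough words to carry the message")
    padded = list(bits) + [""] * (len(words) - len(bits))
    carried = [w + (ZERO if b == "0" else ONE) if b else w
               for w, b in zip(words, padded)]
    return " ".join(carried)

def extract_bits(text: str) -> str:
    """Recover the hidden bits; returns an empty string if they were stripped."""
    return "".join("0" if c == ZERO else "1" for c in text if c in (ZERO, ONE))

marked = embed_bits("the quick brown fox jumps over the lazy dog", "1001001")
assert extract_bits(marked) == "1001001"
# A trivial "attack" that removes non-printing characters destroys the watermark:
cleaned = "".join(c for c in marked if c not in (ZERO, ONE))
assert extract_bits(cleaned) == ""
```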

Importantly, these labeling and watermarking techniques can be built into the weights of a generative AI model itself, for example in a final layer of a neural network. This makes it possible to embed robust but invisible signals in AI-generated content that, if interpreted correctly, identify which model generated a given piece of work.
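The same idea carries over to text generation, where the signal can live in how the model samples tokens rather than in a final image layer. Below is a simplified, word-level sketch in the spirit of the “green list” sampling watermarks proposed in recent research (e.g. Kirchenbauer et al., 2023); the vocabulary, key, and function names are toy choices of ours, and production systems bias model logits rather than drawing from a hand-written word list.

```python
# Sketch of a "green list" generation-time watermark, word-level for illustration.
# VOCAB, KEY, and all function names are toy choices; real systems bias model logits.
import hashlib
import random

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug",
         "quickly", "slowly", "red", "blue", "big", "small"]
KEY = "secret-watermark-key"  # hypothetical key known only to the model owner

def green_list(prev_word: str) -> set[str]:
    """Pseudorandomly pick half the vocabulary, seeded by the key and previous word."""
    seed = int(hashlib.sha256((KEY + prev_word).encode()).hexdigest(), 16)
    return set(random.Random(seed).sample(VOCAB, len(VOCAB) // 2))

def generate(n_words: int, seed: int = 0) -> list[str]:
    """Toy 'model': choose each next word uniformly, but only from the current green list."""
    rng = random.Random(seed)
    words = ["the"]
    for _ in range(n_words):
        words.append(rng.choice(sorted(green_list(words[-1]))))
    return words

def green_fraction(words: list[str]) -> float:
    """Detector: what fraction of words fall in the green list for their context?"""
    hits = sum(w in green_list(prev) for prev, w in zip(words, words[1:]))
    return hits / max(len(words) - 1, 1)

rng = random.Random(1)
watermarked = generate(30)
unwatermarked = ["the"] + [rng.choice(VOCAB) for _ in range(30)]
print(green_fraction(watermarked))    # 1.0: every sampled word is "green"
print(green_fraction(unwatermarked))  # ~0.5 in expectation for ordinary text
```

A detector that knows the key can flag text whose green fraction is far above the roughly 50% expected by chance, even though the watermark is invisible to a reader.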

Watermarking also involves trade-offs between robustness and detectability: robust watermarking techniques alter the content more fundamentally, which makes them easier to detect. Robustness can therefore also trade off against security, as more obscure, harder-to-detect watermarks are harder to extract information from, and thus more secure. For example, brain scans contain incredibly sensitive information, so researchers have developed fragile but secure watermarking techniques for fMRI data. In summary, to quote a thorough review of watermarking and steganography:

It is tough to achieve a watermarking system that is simultaneously robust and secure. 

Overall, modern digital watermarking techniques are robust and difficult (but not impossible) to remove.

Current Regulatory Policies

The US

The Executive Order on AI states that Biden’s administration will “develop effective labeling and content provenance mechanisms, so that Americans are able to determine when content is generated using AI and when it is not.” In particular:

The AI Disclosure Act was proposed in 2023, though it has not yet passed the House or Senate, having instead been referred to the Subcommittee on Innovation, Data, and Commerce. If passed, the act would require any output generated by AI to include the text: “Disclaimer: this output has been generated by artificial intelligence.”

China

China’s 2022 rules for deep synthesis, which address the online provision and use of deepfakes and similar technology, require providers to watermark and conspicuously label deepfakes. The regulation also requires the notification and consent of any individual whose biometric information is edited (e.g. whose voice or face is edited into or added to audio or visual media).

The 2023 Interim Measures for the Management of Generative AI Services, which address public-facing generative AI in mainland China, require content created by generative AI to be conspicuously labeled as such and digitally watermarked. Developers must also clearly label the data they use to train AI, and disclose the users and user groups of their services.

The EU

Article 52 of the draft EU AI Act lists the transparency obligations for AI developers. These largely relate to AI systems “intended to directly interact with natural persons”, where natural persons are individual people (as opposed to legal persons, which can include businesses). For concision, we’ll just call these “public-facing” AIs. Notably, the following requirements have exemptions for AI used to detect, prevent, investigate, or prosecute crimes (assuming other laws and rights are observed).

Convergence’s Analysis

Mandatory labeling of AI-generated content is a lightweight but imperfect method to keep users informed and reduce the spread of misinformation and similar risks from generative AI.

Mandatory watermarking is a lightweight way to improve traceability and accountability for AI developers.

Labels and watermarks can be disrupted or removed by motivated users, especially in text generation.

Unclear definitions of what constitutes an application of AI will lead to inconsistent disclosure requirements and enforcement.


SummaryBot @ 2024-03-29T12:41 (+1)

Executive summary: Current and proposed regulations require AI-generated content to be labeled and watermarked, but these lightweight methods have limitations in preventing misuse and ensuring accountability.

Key points:

  1. Labeling and watermarking AI-generated content informs users and enables tracing the source AI model.
  2. The US, China, and EU have proposed or enacted rules requiring conspicuous labeling and robust watermarking of AI content.
  3. Labeling and watermarking are lightweight methods with precedent, but compliance and effectiveness can vary.
  4. Labels and watermarks can be removed by motivated users, especially for text, so they are imperfect solutions.
  5. Unclear definitions of AI applications will lead to inconsistent disclosure requirements and enforcement.

 

 
