AI news
November 1, 2024

Google Open-Sources SynthID, A Watermarking Tool For AI-Generated Texts

Developers can now start integrating SynthID for more responsible AI applications.

by Jim Clyde Monge

The spread of misinformation from AI-generated content on the internet has become a major issue over the past few years. Many attempts have been made to detect AI-generated content, but no tool yet does so with consistently high accuracy.

Even OpenAI discontinued its AI classifier, citing its low accuracy in distinguishing AI-written text from human-written text.

We are working to incorporate feedback and are currently researching more effective provenance techniques for text, and have made a commitment to develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated. — OpenAI

Now, Google is addressing the misinformation challenge by open-sourcing SynthID, its tool for watermarking and detecting AI-generated content.

What is SynthID?

SynthID is a technology from Google DeepMind that watermarks and identifies AI-generated content by embedding digital watermarks directly into AI-generated images, audio, text, or video.

Will this affect the quality or speed of generation?

According to Google, SynthID doesn’t compromise the quality, accuracy, creativity, or speed of generated text created with Gemini.

Google has open-sourced this technology through its Responsible Generative AI Toolkit, which provides guidance and essential tools for creating safer AI applications. The code is also available on Hugging Face, making it easier for developers to embed and detect watermarks in their own AI applications responsibly.

How SynthID Works

Generative watermarking offers a solution that subtly embeds identifiable markers within text during the generation process, enabling detection without compromising quality or requiring access to the LLM itself.

In practice, here’s how it works in a model generation pipeline:

Image from Nature.com
  1. Standard Text Generation: In the top part of the architecture, standard LLM text generation produces tokens sequentially from left to right, with each token sampled from the LLM’s probability distribution based on the preceding context.
  2. Watermark Integration: In the bottom part of the architecture, a generative watermarking system is introduced, featuring three key components in blue: a random seed generator, a sampling algorithm, and a scoring function.

These elements work together to enable watermark-embedded text generation and detection using the Tournament sampling algorithm within the SynthID-Text method.
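The tournament idea can be illustrated with a toy sketch. Note that the hash-based g-function, the vocabulary, and the candidate pool below are all hypothetical stand-ins, not the real SynthID implementation: candidate tokens sampled from the model's distribution compete in rounds, and the one with the higher pseudorandom g-value advances, nudging generation toward tokens that score high under the watermark key.

```python
import hashlib
import random

def g_value(token: str, context: tuple, key: int) -> float:
    """Pseudorandom score in [0, 1) derived from the token, its context, and a key."""
    digest = hashlib.sha256(f"{key}:{context}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def tournament_sample(candidates: list, context: tuple, key: int) -> str:
    """Toy tournament: pairs of candidates compete; the higher g-value wins."""
    while len(candidates) > 1:
        winners = []
        for a, b in zip(candidates[::2], candidates[1::2]):
            winners.append(a if g_value(a, context, key) >= g_value(b, context, key) else b)
        if len(candidates) % 2 == 1:  # carry an unpaired candidate forward
            winners.append(candidates[-1])
        candidates = winners
    return candidates[0]

random.seed(0)
vocab = ["the", "a", "cat", "dog", "sat", "ran"]
# In a real pipeline, candidates would be sampled from the LLM's distribution.
candidates = [random.choice(vocab) for _ in range(8)]
winner = tournament_sample(candidates, context=("hello",), key=42)
print(winner)
```

Because the g-function is deterministic given the key and context, a detector holding the same key can later check whether a text's tokens score suspiciously high.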

For example, if you ask Gemini to make your email sound more professional, the revised version with the watermark would look something like this:

Image from Google

In this example, the detector assigns the text a 99.9% probability of being watermarked.

You can learn more about SynthID from Nature’s technical paper below.

Scalable watermarking for identifying large language model outputs - Nature
www.nature.com

Applying A Watermark

To embed watermarks, SynthID Text acts as a logits processor within your model’s generation pipeline. It modifies the model’s logits using a pseudorandom g-function to encode watermarking information, balancing both the quality of the generated text and the detectability of the watermark.

Watermarks are configured by parameterizing the g-function and controlling its application during text generation.
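To build intuition for what parameters like `ngram_len` and `keys` control, here is a toy sketch of a g-function. The hash-based construction below is hypothetical, not the actual SynthID g-function: each watermarking key maps the last `ngram_len` tokens to a pseudorandom bit, and the watermark signal is the aggregate of these bits over many keys.

```python
import hashlib

def g_function(ngram: tuple, key: int) -> int:
    """Hypothetical g-function: a pseudorandom bit derived from the
    last ngram_len tokens of context and one watermarking key."""
    digest = hashlib.sha256(f"{key}|{ngram}".encode()).digest()
    return digest[0] & 1

ngram = ("please", "review", "the", "attached", "draft")  # ngram_len = 5
keys = [654, 400, 836]  # a real configuration uses many more keys
bits = [g_function(ngram, k) for k in keys]
print(bits)
```

The same function run at detection time reproduces the bits exactly, which is what makes the watermark checkable without access to the model itself.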

Here’s a sample configuration:

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    SynthIDTextWatermarkingConfig,
)

# Initialize model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('repo/id')
model = AutoModelForCausalLM.from_pretrained('repo/id')

# Set up SynthID Text watermarking configuration
watermarking_config = SynthIDTextWatermarkingConfig(
    ngram_len=5,
    keys=[654, 400, 836, 123, 340, 443, 597, 160, 57, 29, 590, 639, 13, 715, 468, 990, 966, 226, 324, 585, 118, 504, 421, 521, 129, 669, 732, 225, 90, 960],
    sampling_table_size=65536,
    sampling_table_seed=0,
    context_history_size=1024
)

# Generate text with watermarking
tokenized_prompts = tokenizer(["your prompts here"], return_tensors="pt")
output_sequences = model.generate(
    **tokenized_prompts,
    watermarking_config=watermarking_config,
    do_sample=True,
)
watermarked_text = tokenizer.batch_decode(output_sequences, skip_special_tokens=True)

This setup enables the generation of text embedded with a SynthID watermark, ensuring detectability while maintaining quality in the output.

Detecting a Watermark

Watermarks are designed to be detectable by a trained classifier while remaining invisible to human readers. Each watermarking configuration applied to your models requires a dedicated detector trained to identify its unique mark.

The basic process for training a detector is as follows:

  1. Choose a watermarking configuration.
  2. Collect a detector training set of at least 10,000 examples, split along two axes: watermarked versus non-watermarked, and training versus testing.
  3. Generate non-watermarked outputs with your model.
  4. Generate watermarked outputs with your model.
  5. Train a watermark detection classifier using this dataset.
  6. Deploy your model with the configured watermarking system and trained detector.
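Steps 3 through 5 above can be sketched with a toy simulation. The score distributions and the midpoint-threshold classifier below are hypothetical stand-ins: real detector training uses per-text g-value statistics from actual model outputs, and Transformers' end-to-end example trains a Bayesian detector rather than a simple threshold.

```python
import random

random.seed(0)

def mean_g_scores(n: int, watermarked: bool) -> list:
    """Simulate per-text mean g-values: watermarked text scores slightly
    higher on average (hypothetical distributions, not real SynthID stats)."""
    mu = 0.55 if watermarked else 0.50
    return [random.gauss(mu, 0.02) for _ in range(n)]

# Steps 3-4: collect non-watermarked (label 0) and watermarked (label 1) examples.
train = [(s, 0) for s in mean_g_scores(5000, False)] + \
        [(s, 1) for s in mean_g_scores(5000, True)]

# Step 5: "train" the simplest possible detector -- a midpoint threshold.
mean0 = sum(s for s, y in train if y == 0) / 5000
mean1 = sum(s for s, y in train if y == 1) / 5000
threshold = (mean0 + mean1) / 2

# Evaluate on a held-out split.
test = [(s, 0) for s in mean_g_scores(1000, False)] + \
       [(s, 1) for s in mean_g_scores(1000, True)]
accuracy = sum((s > threshold) == bool(y) for s, y in test) / len(test)
print(f"{accuracy:.2f}")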

Transformers offers a Bayesian detector class, along with an end-to-end example for training a detector to recognize watermarked text for a specific configuration.

If multiple models use the same tokenizer, they can share both the watermarking configuration and detector, provided the detector’s training set includes samples from all relevant models.

How To Try SynthID

To help you better understand how the watermarking works, let's walk through some examples. There are three ways to try it:

  1. Use this Google Colab Notebook to run SynthID in your browser.
  2. Run SynthID on Hugging Face using this space.
  3. Run SynthID on your local machine. Using a virtual environment is highly recommended for any local use.

The easiest way to try SynthID is on Hugging Face, though the current SynthID space from Google doesn't always seem to work. To give you an idea, head over to the Hugging Face space and enter some text prompts.

The SynthID space on Hugging Face
Image by Jim Clyde Monge

Enter up to three prompts, then click the generate button.

Here’s an example:

Prompt 1: Write an essay about my pets, a cat named Mika and a dog named Cleo.
Prompt 2: Tell me everything you can about Portugal.
Prompt 3: What is Hugging Face?

Gemma will then generate watermarked and non-watermarked responses for each non-empty prompt you provided.

Now, let’s try this free Google Colab Notebook to test SynthID in the browser.

How To Try SynthID Google Colab Notebook
Image by Jim Clyde Monge

This notebook demonstrates how to use the SynthID Text library to apply and detect watermarks on generated text.

It is divided into three major sections and intended to be run end-to-end.

  1. Setup: Importing the SynthID Text library, choosing your model (either Gemma or GPT-2) and device (either CPU or GPU, depending on your runtime), defining the watermarking configuration, and initializing some helper functions.
  2. Applying a watermark: loading your selected model using the Hugging Face Transformers library, using that model to generate some watermarked text, and comparing the perplexity of the watermarked text to that of text generated by the base model.
  3. Detecting a watermark: training a detector to recognize text generated with a specific watermarking configuration, and then using that detector to predict whether a set of examples were generated with that configuration.

For each step, click the play button on the left side and wait until a green check mark appears, indicating that the step completed successfully.

The “Choose your model” step in the Colab notebook
Image by Jim Clyde Monge

You will need a Hugging Face access token in the “Choose your model” step so the Colab notebook can download the required models.

How To Try SynthID Google Colab Notebook
Image by Jim Clyde Monge

Make sure to edit the token’s permissions to allow read access to the required repositories.

HuggingFace token restrictions
Image by Jim Clyde Monge

For running SynthID locally, follow the steps described on the GitHub repository.

Not A Perfect Solution

SynthID text watermarks are robust to some transformations, like cropping parts of the text, altering a few words, or slight paraphrasing.

However, it is not without some limitations:

  • Model dependency: SynthID only works with Google’s Gemini models.
  • Accuracy reduction with edits: Editing watermarked content with another AI or through extensive paraphrasing reduces detection accuracy.
  • Translation sensitivity: Translations or heavy rewrites can remove the watermark’s detectability.

Final Thoughts

It’s good to see tech giants like Google finding ways to control the widespread misinformation caused by AI chatbots while also releasing powerful language models. There’s a bit of irony here, but I get that they’re doing it to stay ahead in the competitive market.

It’s worth noting that SynthID works only on Google’s language models. Try using it on texts generated by Claude or ChatGPT, and the results might not be accurate. This is an important limitation that both developers and casual users need to be aware of. SynthID doesn’t provide a perfect solution to misinformation.

I hope other tech companies like OpenAI, Anthropic, or Amazon also work on their own AI content watermarking tools. This could go a long way in cutting down the spread of AI-generated content on the internet and helping professionals in research and academia.

As a developer, I haven’t tried SynthID yet, but I’m planning to run some experiments to see how well it works on apps created with Gemini and see if there’s any effect on the performance and quality of output.
