
AI-Assisted Wayback Machine Extension Using Client-Side Prompt API

Overview

This project builds a privacy-preserving Wayback Machine browser extension that uses on-device AI (the Chrome Prompt API) to help people quickly judge an archived capture’s relevance and quality.

Contributor

Name	Sudipta Das
Organization	Internet Archive
University	Rajiv Gandhi Institute of Petroleum Technology
Timezone	UTC+5:30 (Asia/Calcutta)
Mentors	Dr. Sawood Alam; Will Howes
Proposal status	Accepted

Proposal

The full proposal is available at the link below:

GSoC Proposal - Internet Archive - Wayback Machine.pdf

Abstract

The Wayback Machine is one of the internet’s most important tools, with billions of archived web pages. But using it can be overwhelming. You open a page from 2009 and immediately wonder: Is this capture reliable? How different is it from the one before? What changed? That is the problem I want to solve. Chrome recently shipped the Prompt API and Gemini Nano: on-device AI that runs entirely in your browser. No server calls. No data leaving your machine. That felt like the right foundation to build on. So I’m building a Chrome extension for the Wayback Machine that uses this on-device AI to help you understand what you’re looking at: surfacing context, flagging differences between captures, and making archived pages feel less like a raw data dump and more like something you can reason about. Everything runs client-side: privacy-friendly by default, fast, and practical within what a browser can do.

Problem statement

Archived web pages are hard to interpret quickly. Users often need to manually:

determine relevance,
identify broken or incomplete captures,
detect soft-404 pages,