Building a privacy-preserving Wayback Machine browser extension that uses on-device AI (Chrome Prompt API) to help people quickly judge an archived capture’s relevance and quality.
| Name | Sudipta Das |
|---|---|
| Organization | Internet Archive |
| University | Rajiv Gandhi Institute of Petroleum Technology |
| Timezone | UTC+5:30 (Asia/Calcutta) |
| Mentors | Dr. Sawood Alam; Will Howes |
| Proposal status | Accepted |
The full proposal is available at the link below:
GSoC Proposal - Internet Archive - Wayback Machine.pdf
The Wayback Machine is one of the internet’s most important tools, with billions of archived web pages. But using it can be overwhelming. You open a page from 2009 and immediately wonder: Is this capture reliable? How different is it from the one before? What changed? That is the problem I want to solve.

Chrome recently shipped the Prompt API and Gemini Nano: on-device AI that runs entirely in your browser. No server calls. No data leaving your machine. That felt like the right foundation to build on.

So I’m building a Chrome extension for the Wayback Machine that uses this on-device AI to help you understand what you’re looking at: surfacing context, flagging differences between captures, and making archived pages feel less like a raw data dump and more like something you can reason about. Everything runs client-side: privacy-friendly by default, fast, and practical within what a browser can do.
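
To make the idea concrete, here is a minimal TypeScript sketch of how a capture could be described to an on-device model. It is not the extension's actual code. Every name in it (`CaptureInfo`, `parseWaybackUrl`, `buildRelevancePrompt`, `PromptSession`, `assessCapture`) is hypothetical, and the prompt wording is made up for illustration. It assumes capture URLs carry a full 14-digit timestamp, as in `https://web.archive.org/web/<timestamp>/<original-url>`. `PromptSession` is a stand-in for whatever session object Chrome's Prompt API provides. Only a `prompt()` method that resolves to a string is assumed, because the API surface has changed between Chrome releases.

```typescript
// Sketch only: all names here are hypothetical, not taken from the proposal.

interface CaptureInfo {
  timestamp: string;   // 14-digit Wayback timestamp, YYYYMMDDhhmmss
  capturedAt: Date;    // the same instant as a UTC Date
  originalUrl: string; // the URL that was archived
}

// Parse a capture URL of the form
// https://web.archive.org/web/<timestamp>[modifier]/<original-url>
// Returns null for anything that does not match (assumes a full 14-digit timestamp).
function parseWaybackUrl(url: string): CaptureInfo | null {
  const match = /^https?:\/\/web\.archive\.org\/web\/(\d{14})[a-z_]*\/(.+)$/.exec(url);
  if (!match) return null;
  const ts = match[1];
  const originalUrl = match[2];
  const capturedAt = new Date(Date.UTC(
    Number(ts.slice(0, 4)), Number(ts.slice(4, 6)) - 1, Number(ts.slice(6, 8)),
    Number(ts.slice(8, 10)), Number(ts.slice(10, 12)), Number(ts.slice(12, 14)),
  ));
  return { timestamp: ts, capturedAt, originalUrl };
}

// Build a short prompt from capture metadata plus an excerpt of the page text.
// The excerpt is whitespace-collapsed and truncated to keep the input small.
function buildRelevancePrompt(capture: CaptureInfo, pageText: string, maxChars = 4000): string {
  const excerpt = pageText.replace(/\s+/g, " ").trim().slice(0, maxChars);
  return [
    "You are helping a reader judge an archived web page.",
    `Original URL: ${capture.originalUrl}`,
    `Captured (UTC): ${capture.capturedAt.toISOString()}`,
    "In two sentences, say what the page is about and whether the capture looks complete.",
    "Page text:",
    excerpt,
  ].join("\n");
}

// Assumed minimal shape of an on-device model session: one prompt() method.
interface PromptSession {
  prompt(input: string): Promise<string>;
}

// Ask the session about a capture; returns null if the URL is not a capture URL.
async function assessCapture(
  session: PromptSession,
  url: string,
  pageText: string,
): Promise<string | null> {
  const capture = parseWaybackUrl(url);
  if (!capture) return null;
  return session.prompt(buildRelevancePrompt(capture, pageText));
}
```

Keeping the model behind a small interface means the parsing and prompt-building logic can be tested without a browser, and nothing in the sketch makes a network call.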
Archived web pages are hard to interpret quickly. Users often need to manually: