Electoral Roll Digitization

AI-powered extraction of voter data from Indian electoral roll PDFs — West Bengal SIR 2025–26
Why this exists: In 2025–26, the Election Commission of India (ECI) launched a Special Intensive Revision (SIR) of electoral rolls in West Bengal and Bihar, an exercise the ECI itself called unprecedented. In West Bengal alone, the final roll left 60 lakh (6 million) voters marked “Under Adjudication”, their right to vote uncertain. Yet the Commission publishes this data only as scanned PDF images, with no searchable database, no structured data, and no public dashboard. This project uses AI vision models to extract every voter record from these PDFs, making the SIR’s impact measurable and open to public scrutiny for the first time.

Overall

Districts: 2
Constituencies analysed: 4
Total voters extracted: 935,758
Under adjudication: 263,279
Adjudication rate: 28.1%
PDF files processed: 1,010
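
The overall figures are sums of the per-district totals listed under Districts, and the adjudication rate is the adjudicated count divided by the voters extracted. A minimal sketch of that arithmetic follows. The dictionary layout and the `adjudication_rate` helper are illustrative assumptions, not the project's actual code.

```python
# District totals as reported in this summary (layout is hypothetical).
districts = {
    "Malda": {"voters": 458_667, "adjudicated": 144_388, "pdf_files": 466},
    "Murshidabad": {"voters": 477_091, "adjudicated": 118_891, "pdf_files": 544},
}


def adjudication_rate(adjudicated: int, voters: int) -> float:
    """Share of extracted voters marked Under Adjudication, in percent."""
    return 100.0 * adjudicated / voters


# Roll the district figures up into the overall summary.
total_voters = sum(d["voters"] for d in districts.values())
total_adjudicated = sum(d["adjudicated"] for d in districts.values())
total_pdfs = sum(d["pdf_files"] for d in districts.values())
overall_rate = adjudication_rate(total_adjudicated, total_voters)

print(
    f"{total_voters:,} voters, {total_adjudicated:,} under adjudication, "
    f"{overall_rate:.1f}% across {total_pdfs:,} PDF files"
)
# prints: 935,758 voters, 263,279 under adjudication, 28.1% across 1,010 PDF files
```

The same helper gives the per-district rates: 31.5% for Malda and 24.9% for Murshidabad.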

Districts

Malda

2 constituencies · 466 PDF files
Voters: 458,667
Adjudicated: 144,388
Adjudication rate: 31.5%

Constituencies

49: Manikchak · 261 booths · 254,915 voters · 25.7% adjudication rate
52: Mothabari · 205 booths · 203,752 voters · 38.7% adjudication rate

Murshidabad

2 constituencies · 544 PDF files
Voters: 477,091
Adjudicated: 118,891
Adjudication rate: 24.9%

Constituencies

56: Samserganj · 247 booths · 235,747 voters · 45.7% adjudication rate
72: Bahrampur · 290 booths · 235,618 voters · 4.7% adjudication rate
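
Extracted figures like these can be sanity-checked by rolling the constituency rows up and comparing them with the district totals. The sketch below does that for the numbers on this page. It assumes one PDF file per booth, which this summary does not state, and the `rollup_gaps` helper and data layout are hypothetical.

```python
# Constituency rows keyed by constituency number: (booths, voters), as listed above.
constituencies = {
    "Malda": {49: (261, 254_915), 52: (205, 203_752)},
    "Murshidabad": {56: (247, 235_747), 72: (290, 235_618)},
}
# District totals: (PDF files, voters), as listed above.
district_totals = {"Malda": (466, 458_667), "Murshidabad": (544, 477_091)}


def rollup_gaps(constituencies, district_totals):
    """Return {district: (file_gap, voter_gap)}: district total minus constituency sum."""
    gaps = {}
    for district, seats in constituencies.items():
        booths = sum(b for b, _ in seats.values())
        voters = sum(v for _, v in seats.values())
        files, total = district_totals[district]
        # Assumption: one PDF per booth, so files and booths are comparable.
        gaps[district] = (files - booths, total - voters)
    return gaps


for district, (file_gap, voter_gap) in rollup_gaps(constituencies, district_totals).items():
    print(f"{district}: {file_gap} files, {voter_gap:,} voters unaccounted for")
```

On the figures shown here, Malda reconciles exactly, while the Murshidabad district totals exceed the sum of its two listed constituencies by 7 files and 5,726 voters. The summary does not explain that difference.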