Technology Blog

Tag – Poppler-utils

Scraping pdf, doc, and docx with Scrapy

In February 2017, Google announced plans to discontinue its Google Site Search product. Imaginary Landscape clients who had relied on Google to provide search on their websites looked to us for a new solution. Finding no obvious equivalent replacement, we decided to create our own website scraper and accompanying search app.
