Text Extraction / Web Page Cleaning
CrawlingIndia provides easy-to-use mechanisms to extract page text and title
information from any web page.
An HTML page cleaning facility is also provided. It normalizes and cleans HTML
content by removing ads, navigation links, and other unimportant content, so
that only the important article text is extracted.
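To illustrate the general idea of title and text extraction with basic cleaning, here is a minimal sketch using only Python's standard library. It is not CrawlingIndia's actual interface or algorithm: the `TextExtractor` class, the `extract` function, and the `SKIP_TAGS` list are hypothetical names chosen for this example, and skipping a fixed set of tags is a much cruder heuristic than a real page cleaner would use.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as non-article material. This list is an
# assumption of this sketch, not a rule taken from CrawlingIndia.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TextExtractor(HTMLParser):
    """Collects the page title and the visible text outside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._skip_depth = 0   # > 0 while inside any skipped element
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif self._skip_depth == 0:
            self.chunks.append(text)


def extract(html):
    """Return (title, cleaned_text) for an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.title, " ".join(parser.chunks)


if __name__ == "__main__":
    page = (
        "<html><head><title>Sample</title></head><body>"
        "<nav><a href='/'>Home</a></nav>"
        "<p>Article text.</p>"
        "<footer>Copyright</footer></body></html>"
    )
    title, text = extract(page)
    print(title)
    print(text)
```

The sketch drops navigation and footer blocks purely by tag name; ads and other clutter that live in ordinary `div` elements would need additional heuristics.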