
WP2TXT extracts plain text from Wikipedia dump files (XML, compressed with Bzip2), stripping all MediaWiki markup and other metadata. In addition, the app lets you specify which text elements to extract or convert (title, heading, paragraph, etc.). Character references are converted to UTF-8 characters.
The app was originally intended for researchers looking for an easy way to obtain open-source multilingual corpora, but it may be handy for other purposes.
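The extraction steps described above can be illustrated with a minimal Python sketch. This is not WP2TXT's own code, only a rough approximation under stated assumptions: the function names `strip_markup` and `iter_pages` are hypothetical, the dump is assumed to use the standard `<page>`, `<title>`, and `<text>` elements, and the regular expressions handle only simple, non-nested markup.

```python
import bz2
import html
import re
import xml.etree.ElementTree as ET


def strip_markup(wikitext: str) -> str:
    """Very rough MediaWiki cleanup (hypothetical helper, not WP2TXT's logic)."""
    text = re.sub(r"\{\{[^{}]*\}\}", "", wikitext)                   # drop non-nested templates
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]|]*)\]\]", r"\1", text)   # [[target|label]] -> label
    text = re.sub(r"'{2,}", "", text)                                # bold/italic quote runs
    text = re.sub(r"^=+\s*(.*?)\s*=+\s*$", r"\1", text, flags=re.M)  # == Heading == -> Heading
    return html.unescape(text).strip()                               # character references -> characters


def iter_pages(path: str):
    """Yield (title, plain_text) pairs from a .xml.bz2 dump, one page at a time."""
    with bz2.open(path, "rb") as stream:
        title = None
        for _event, elem in ET.iterparse(stream, events=("end",)):
            tag = elem.tag.rsplit("}", 1)[-1]  # ignore the XML namespace prefix
            if tag == "title":
                title = elem.text or ""
            elif tag == "text":
                yield title, strip_markup(elem.text or "")
            elif tag == "page":
                elem.clear()                   # release the finished page's contents
```

Iterating over `iter_pages("dump.xml.bz2")` (a placeholder file name) streams the archive without decompressing it to disk first. Real Wikipedia markup includes nested templates, tables, and references, so a corpus-quality tool needs far more thorough handling than these few patterns.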
Follow the adventure! Midnight Planets is Midnight Martian's new app for visualizing data from spacecraft exploring our solar system...
Provides access to Reddit from the browser interface.
Highbrow makes it easy to switch web browsers on the Mac, eliminating the hassle of using multiple browsers.
Allows you to run Analytics straight from your Mac as a standalone application.
Active Users for Mac puts Google Analytics in your menu bar, letting you view active users across multiple sites quickly.
Browse nearby Bonjour and Zeroconf/mDNS websites with ease.
The WAVE Firefox toolbar provides a mechanism for running WAVE reports.