
WP2TXT extracts plain text from a Wikipedia dump file (XML, compressed with Bzip2), stripping all MediaWiki markup and other metadata. In addition, the app lets you specify which text elements to extract or convert (title, heading, paragraph, etc.). Character references are converted to UTF-8 characters.
The app was originally intended for researchers looking for an easy way to obtain open-source multilingual corpora, but it may be handy for other purposes as well.
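
To make the extraction described above more concrete, here is a minimal Python sketch of the same general idea: read a Bzip2-compressed XML stream, pull out the title and text elements, strip a little markup, and decode character references. It is not WP2TXT's own code. The helper names (`local_name`, `strip_markup`, `iter_pages`) and the tiny in-memory sample are assumptions made for illustration, and the two regular expressions cover only links and bold/italic quotes, far less than a real dump requires.

```python
import bz2
import html
import io
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def strip_markup(wikitext):
    """Very rough cleanup (hypothetical helper); real markup needs many more rules."""
    # [[target|label]] -> label, [[target]] -> target
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", wikitext)
    # Remove bold/italic quote runs such as ''' or ''
    text = re.sub(r"'{2,}", "", text)
    # Decode character references such as &eacute; or &#233;
    return html.unescape(text)


def iter_pages(stream):
    """Yield (title, plain_text) pairs from a dump-like XML stream."""
    title = ""
    for _, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = html.unescape(elem.text or "")
        elif name == "text":
            yield title, strip_markup(elem.text or "")
        elif name == "page":
            elem.clear()  # release the finished page to keep memory use low


# A tiny stand-in for a real dump, compressed in memory (assumed sample data).
sample = (
    "<mediawiki><page><title>Caf&amp;eacute;</title>"
    "<revision><text>'''Caf&amp;eacute;''' is a "
    "[[coffeehouse|coffee shop]].</text></revision>"
    "</page></mediawiki>"
)
compressed = bz2.compress(sample.encode("utf-8"))

with bz2.open(io.BytesIO(compressed), "rb") as stream:
    pages = list(iter_pages(stream))

print(pages)
```

With a real dump, the in-memory buffer would be replaced by the path to the `.bz2` file passed to `bz2.open`; parsing incrementally matters there because full dumps are far too large to load at once.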