Where does the HTML come from in the dataset?

Updated: December 06, 2023 22:45

The HTML in the dataset is translated from Wikitext (the original markup of Wikimedia projects) and optimized for parsing. It is generated as part of the Wikimedia Parsoid project. More information on the data source can be found in the Parsoid DOM specs.
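
Because the HTML is produced by Parsoid, it carries semantic markers that make it easier to process than HTML scraped from rendered pages. As a rough illustration, the sketch below collects internal link targets by looking for anchors marked with rel="mw:WikiLink", which is how Parsoid tags wiki links. The sample fragment, the class name WikiLinkCollector, and the helper extract_wikilinks are hypothetical and only for demonstration; real article HTML from the dataset will be much larger and should be checked against the DOM specs.

```python
from html.parser import HTMLParser


class WikiLinkCollector(HTMLParser):
    """Collects targets of anchors that Parsoid marks with rel="mw:WikiLink"."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, so split before testing.
        rel_tokens = (attr.get("rel") or "").split()
        href = attr.get("href")
        if "mw:WikiLink" in rel_tokens and href:
            # Parsoid writes internal links as relative hrefs such as "./Earth".
            self.targets.append(href[2:] if href.startswith("./") else href)


def extract_wikilinks(html):
    """Return internal link targets found in a Parsoid-style HTML string."""
    parser = WikiLinkCollector()
    parser.feed(html)
    parser.close()
    return parser.targets


# Hypothetical fragment written for this example, not taken from the dataset.
SAMPLE_HTML = (
    '<p>The <a rel="mw:WikiLink" href="./Moon" title="Moon">Moon</a> orbits '
    '<a rel="mw:WikiLink" href="./Earth" title="Earth">Earth</a>. See '
    '<a rel="mw:ExtLink" href="https://example.org">an external site</a>.</p>'
)

print(extract_wikilinks(SAMPLE_HTML))  # prints ['Moon', 'Earth']
```

The external link is skipped because its rel value is mw:ExtLink, not mw:WikiLink. The same pattern could be extended to other Parsoid markers, but which attributes to rely on is best confirmed in the DOM specs.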