author name
Stephen Merity
Imagine you were a young child and wanted to ask about something. Being so young (and assuming you are not exceedingly precocious), how would you describe a new object, the name of which you have yet to learn? The intuitive answer: point to it!
The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia. The dataset is available under the Creative Commons Attribution-ShareAlike License.