  1. doc2vec::be_parliament_2020
    Corpus of questions asked in the Belgian Federal Parliament in 2020
  2. nametagger::europeananews
    Tagged newspaper articles from Europeana
  3. recogito::openseadragon_areas
    A dataset of annotations made using OpenSeadragon
    data.frame|3 x 9
  4. textplot::example_btm
    Example Biterm Topic Model
  5. textplot::example_embedding
    Example word embedding matrix
  6. textplot::example_embedding_clusters
    Example words emitted in an ETM text clustering model
  7. textplot::example_udpipe
    Example annotation of text using udpipe
  8. textrank::joboffer
    The text of a job offer, annotated with the package udpipe
  9. tokenizers.bpe::belgium_parliament
    Dataset from 2017 with questions asked in the Belgian Federal Parliament
  10. topicmodels.etm::ng20
    Bag of words sample of the 20 newsgroups dataset
  11. udpipe::brussels_listings
    Brussels AirBnB address locations available at www.insideairbnb.com
  12. udpipe::brussels_reviews
    Reviews of AirBnB customers on Brussels address locations available at www.insideairbnb.com
  13. udpipe::brussels_reviews_anno
    Reviews of AirBnB customers, tokenised, POS-tagged and lemmatised
  14. udpipe::brussels_reviews_w2v_embeddings_lemma_nl
    An example matrix of word embeddings
    matrix|2687 x
  15. udpipe::udpipe_annotation_params
    List with training options set by the UDPipe community when building models based on the Universal Dependencies data