-
NLP (2/n): Tokens of text
A first (basic) look at the collected text: Tokens. It’s all great: we’re able to collect all our blog entries, or rather, the text thereof. We are even able to crawl a bit faster, provided we have a multi-core setup. (Worst case scenario, we’re limited in the crawling mostly by our website’s speed).…
-
NLP (1/n): Scraping all the Blog articles (the hard way)
Intro: To use Natural Language Processing algorithms, we first need data. Last time, we saw how to scrape ONE article, and how to get to the different pages of the Blog. But for this to be usable in the future, we should be able to download all articles UNTIL there are no more (e.g. we probably…
-
Diving into NLP
Intro: So I have been thinking a bit about chatbots and other NLP-related things lately. I’m probably NOT ready to implement an NLP-based Chatbot, that’s clear, but I can start doing some other things to get practice on the subject. So this is going to be a multi-part thing, or rather a recurring subject in…