Scraping data

Scraping means retrieving information from the ordinary web. Bots are programs or scripts (small pieces of executable code) with a specific purpose. Scraping bots are usually called crawlers or even spiders (because they move over the “web”). Crawling and scraping are often used as synonyms.
[Googlebot is probably the most famous crawler in the world; it crawls and caches a huge share of the public web]

Scraping data from the internet and indexing it is part of Big Data and data analytics. For example, it is useful for social sentiment analysis, which is used in investment.

Python is a multi-purpose scripting language. I use it often. Python makes some tasks easy that are quite complex in other programming languages.

Python web crawling

Automation is music to my ears. Yesterday I was trying to learn some phrasal verbs and wanted a list of them. I found a list with thousands of phrasal verbs, each with a link to its description.

How to retrieve the whole list? Easy: if I download the webpage, I can parse the HTML and keep only the verbs. I did it. Once that was done, I went further: my program followed the thousands of links to retrieve their meanings, examples and some notes. Now I have a huge list with all the information I need in tabular form, and I can search it offline and filter by whether a verb is international, American or British English.
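The parsing step could look roughly like this. It is a minimal sketch and not the original script: it assumes each verb on the list page is a link, and the sample HTML, the "/verbs/" prefix and the names VerbLinkParser and extract_verbs are all made up for illustration. It uses only html.parser from the standard library.

```python
from html.parser import HTMLParser

# Hypothetical sample standing in for the real list page.
SAMPLE_HTML = """
<ul>
  <li><a href="/verbs/give-up">give up</a></li>
  <li><a href="/verbs/look-after">look after</a></li>
  <li><a href="/about">About this site</a></li>
</ul>
"""


class VerbLinkParser(HTMLParser):
    """Collect (text, href) pairs for links whose href starts with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the link currently being read, if any
        self._text = []      # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith(self.prefix):
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a matching link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_verbs(html, prefix="/verbs/"):
    """Return the verbs and their description links found in the HTML."""
    parser = VerbLinkParser(prefix)
    parser.feed(html)
    return parser.links


verbs = extract_verbs(SAMPLE_HTML)
```

Each pair in verbs holds a verb and the relative link to its description page, so following the links is then a loop over that list; the "About" link is skipped because it does not match the prefix.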

It only took me a few minutes to write the code, and a while to retrieve the information from the 2,300+ description pages. Now I have it forever in a little file.
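For the "little file" part, a CSV is one simple option. The sketch below assumes a CSV with three columns (verb, meaning, variety); the column names, the sample rows and the helper names are illustrative guesses, not the real data or the real format. An in-memory buffer stands in for the file so the example runs on its own.

```python
import csv
import io

FIELDS = ["verb", "meaning", "variety"]

# Made-up rows for illustration; the real file has one row per phrasal verb.
ROWS = [
    {"verb": "give up", "meaning": "stop trying", "variety": "international"},
    {"verb": "ring up", "meaning": "call by phone", "variety": "british"},
    {"verb": "wash up", "meaning": "wash hands and face", "variety": "american"},
]


def write_table(rows, handle):
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def filter_by_variety(handle, variety):
    """Read the CSV back and keep only rows of one variety of English."""
    return [row for row in csv.DictReader(handle) if row["variety"] == variety]


# io.StringIO stands in for open(path, "w", newline="") on a real file.
buffer = io.StringIO()
write_table(ROWS, buffer)
buffer.seek(0)
british = filter_by_variety(buffer, "british")
```

With a real file, the same two functions work on handles returned by open(), and the filter can be rerun offline as often as needed.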

Hello world! One more time

Welcome everybody to my n-th new blog, and my (m * n)-th new post.

First and foremost, I’ve chosen to write it in English. I want it to be a blog not only for me but also for people who want to know what is on my mind at the moment, and eventually to learn something about me. Lately I’m not active on the main social networks, although my mind keeps me going as my main engine.

You might know that Hello world! is usually your first program in any programming language. This is a Hello world! more than ever, because my posts are intended to be mind-blowing enough to keep boring people away.