How to extract a web page in Python

Hi all, today I learnt how to extract the text of a web page in Python. Here is the simple code, it's really easy. First we need to install the following:

$ sudo apt-get install python-setuptools
$ sudo easy_install stripogram

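# Python 2 code: in Python 3, urlopen lives in urllib.request instead
import urllib
from stripogram import html2text

# Fetch the raw HTML of the page
myurl = urllib.urlopen("https://tuxbalaji.wordpress.com")
html_string = myurl.read()
# Strip the HTML markup and keep only the text
text = html2text(html_string)
print(text)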

This prints the content of the page at the given URL as plain text, with the HTML markup stripped out.
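If stripogram is not available (for example on Python 3), roughly the same idea can be sketched with only the standard library. This is a minimal sketch, not a replacement for html2text: the names TextExtractor, html_to_text and fetch_text are my own, it assumes Python 3, it falls back to UTF-8 when the server does not declare a charset, and its output formatting will differ from what stripogram produces.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html_string):
    """Return the visible text of an HTML string, one text node per line."""
    parser = TextExtractor()
    parser.feed(html_string)
    parser.close()
    return "\n".join(parser.parts)


def fetch_text(url):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url) as response:
        # Assumption: fall back to UTF-8 if no charset is declared
        charset = response.headers.get_content_charset() or "utf-8"
        html_string = response.read().decode(charset, errors="replace")
    return html_to_text(html_string)
```

To try it on the same page as above, call print(fetch_text("https://tuxbalaji.wordpress.com")).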

Thanks, and enjoy coding in Python :)

Author: Balaji

Hi, my name is Balaji and I am working as a Senior Software Developer in India. I am interested in shell scripts, Python, Erlang, and the Linux kernel.
