Build your own Google Alerts substitute (Python)

I loved Google Alerts, until it stopped working. So I was forced to create my own notifier (in the fastest possible way), and I wrote a basic Python script that, for my particular case, does the job pretty well.

The idea is the following: we search Google for the keyword or phrase we are interested in, get the results back, and find the total number of results. We compare this number to the number from the previous run (saved in a file); if it is greater, we send an email and save the bigger number to the file for the next comparison. You can put this script in a crontab or, on Windows, use Z-cron, and schedule it to run invisibly, say, every hour.
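The compare-and-save step at the heart of this idea can be sketched on its own. This is only a minimal illustration, not the script from this post: the function name `has_new_results` and the `count_path` argument are made up for the example, and treating a missing or empty file as a previous count of 0 is an assumption.

```python
import os


def has_new_results(current_count, count_path):
    """Return True and store current_count if it exceeds the saved count.

    Hypothetical helper: a missing or empty file counts as a previous
    value of 0 (an assumption made for this sketch).
    """
    previous = 0
    if os.path.exists(count_path):
        with open(count_path) as f:
            saved = f.read().strip()
        if saved:
            previous = int(saved)

    if current_count <= previous:
        return False  # nothing new, leave the file untouched

    # Save the bigger number so the next run compares against it
    with open(count_path, "w") as f:
        f.write(str(current_count))
    return True
```

A caller would send the email only when this returns True.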

After the notification, I would go to Google and check the results from the last 24 hours to see what’s going on.

Below is the Python script that searches for “Nard Ndoka” and sends an email to nardndoka@gmail.com when a new page mentions this name. To use the script, change the URL, email credentials, and file paths accordingly.
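Changing the URL mostly means URL-encoding your own phrase. As a small sketch (the helper name `build_search_url` is hypothetical, and the import fallback is there only so it runs on both Python 2 and Python 3), the standard library's `quote_plus` can produce the query string for an exact-phrase search:

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2


def build_search_url(phrase):
    """Build a Google search URL for an exact phrase (hypothetical helper)."""
    # Wrapping the phrase in double quotes asks for an exact match;
    # quote_plus turns '"' into %22 and spaces into '+'
    return "https://www.google.com/search?q=" + quote_plus('"' + phrase + '"')


print(build_search_url("Nard Ndoka"))
# prints https://www.google.com/search?q=%22Nard+Ndoka%22
```

For “Nard Ndoka” this gives the same URL the script uses. The script itself follows.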


import urllib2
import re
from bs4 import BeautifulSoup
import smtplib


def remove_html_markup(s):
    tag = False
    quote = False
    out = ""

    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif (c == '"' or c == "'") and tag:
            quote = not quote
        elif not tag:
            out = out + c
    return out

url = "https://www.google.com/search?q=%22Nard+Ndoka%22" #url to search, change accordingly

opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')] #present a browser-like user agent
find_html = opener.open(url).read()
soupg = BeautifulSoup(find_html, "html.parser") #name the parser explicitly
results = str(soupg.findAll("div", { "id" : "resultStats" })) #the div holding the total number of results

text = remove_html_markup(results) #remove html tags
final = re.sub(r"\D", "", text) #remove non-digit characters


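#file holding the count from the previous run, change accordingly
#it must already exist and contain a number (e.g. 0) before the first run
count_file = r"C:\Users\user\Desktop\searched.txt"

with open(count_file) as f:
    value = f.read().strip()

if int(final)<=int(value):
    print "No new results"
else:
    with open(count_file, "w") as f: #save the bigger number for the next comparison
        f.write(final)
    print "New results. Sending mail.."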
    def send_email2():
        gmail_user = "nardndoka@gmail.com"
        gmail_pwd = "password"
        FROM = 'nardndoka@gmail.com'
        TO = ['nardndoka@gmail.com'] #must be a list
        SUBJECT = "New Alert"
        #concatenate, a comma here would build a tuple instead of a string
        TEXT = "Google Search number was changed for Nard Ndoka. Now is " + final

        # Prepare actual message
        message = "From: %s\nTo: %s\nSubject: %s\n\n%s\n" % (FROM, ", ".join(TO), SUBJECT, TEXT)
        try:
            server = smtplib.SMTP("smtp.gmail.com", 587) #STARTTLS port; port 465 would need smtplib.SMTP_SSL instead
            server.ehlo()
            server.starttls()
            server.login(gmail_user, gmail_pwd)
            server.sendmail(FROM, TO, message)
            server.quit()
            print 'Successfully sent the mail'
        except Exception as e:
            print "Failed to send mail:", e
    send_email2()
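One thing to watch: stripping every non-digit character also glues together any other digits that appear in the stats text. If that text ever looks like "About 1,230 results (0.45 seconds)" (an assumed format, what Google actually returns may differ), the script would read the count as 1230045. A more defensive sketch takes only the first number; `extract_count` is a hypothetical name, not part of the script above:

```python
import re


def extract_count(stats_text):
    """Return the first number in stats_text as an int, or None if there is none."""
    # Match the first run of digits, allowing thousands separators (, or .)
    match = re.search(r"\d+(?:[.,]\d{3})*", stats_text)
    if match is None:
        return None
    # Drop the separators before converting
    return int(re.sub(r"\D", "", match.group(0)))
```

Returning None when no number is found lets the caller skip the comparison instead of crashing on an empty string.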
