xyzio

Copy a file from a URL directly to S3 using boto3 and requests


Copy a file at inUrl directly to an S3 bucket bucketName. The ACL is set to public-read and the ContentType is carried over from the source URL.

Your AWS credentials must be pre-set.

import os

import boto3
import requests

#Set up boto3 env vars
os.environ['AWS_ACCESS_KEY_ID'] = ''
os.environ['AWS_SECRET_ACCESS_KEY'] = ''
os.environ['AWS_DEFAULT_REGION'] = 'us-west-2'

#Source URL and destination bucket
inUrl = r'file_to_read'
bucketName = 'bucket_name'

response = requests.get(inUrl)

s3 = boto3.resource('s3')
s3.Bucket(bucketName).put_object(
    Key='test/file_path.ext',
    Body=response.content,
    ACL='public-read',
    ContentType=response.headers['Content-Type'])
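
If the file is large, it may be better not to hold the whole body in memory. Here is a rough variant (my own sketch, not part of the original post) that streams the response into the bucket with boto3's upload_fileobj, reusing inUrl and bucketName from above:

import boto3
import requests

response = requests.get(inUrl, stream=True)
response.raw.decode_content = True  #let urllib3 undo any gzip/deflate encoding

s3 = boto3.resource('s3')
s3.Bucket(bucketName).upload_fileobj(
    response.raw,
    'test/file_path.ext',
    ExtraArgs={'ACL': 'public-read',
               'ContentType': response.headers['Content-Type']})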

Written by M Kapoor

February 7, 2017 at 2:09 pm

Posted in python


Wait a random delay in Python


Wait a random whole number of seconds between 0 and delay.

import random
import time

delay = 10  #upper bound in seconds; pick whatever maximum you want

#randrange returns a whole number of seconds in [0, delay)
timeDelay = random.randrange(0, delay)
time.sleep(timeDelay)
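
randrange only gives whole seconds. If a fractional delay is fine, random.uniform is a drop-in alternative (a small variation on the snippet above, reusing delay):

import random
import time

#sleep for a fractional number of seconds between 0 and delay
time.sleep(random.uniform(0, delay))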

Written by M Kapoor

February 4, 2017 at 1:33 am

Posted in python


Set User-Agent for Python requests library


Some sites don’t allow page access if the User-Agent isn’t set. headers is a dict and can hold additional header fields.

import requests

url = ''
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'}
resp = requests.get(url, headers=headers)
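
If you are making several requests to the same site, a requests.Session will carry the header (and any cookies) across calls. A small sketch reusing headers and url from above:

import requests

session = requests.Session()
session.headers.update(headers)  #the User-Agent is now sent on every request

resp = session.get(url)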

Written by M Kapoor

February 4, 2017 at 1:29 am

Posted in python


Download and save file using Python Requests


Download and save a file using the Python requests library.

Where:
* url = URL to download
* filename = Name of local file for url

import requests

url = ''
filename = ''

response = requests.get(url)
with open(filename, 'wb') as f:
    f.write(response.content)
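
For big downloads it helps to stream the body instead of reading it all at once. A sketch using the same url and filename with requests' iter_content:

import requests

response = requests.get(url, stream=True)
with open(filename, 'wb') as f:
    #write the file in 8 KB chunks instead of loading it all into memory
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)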

Written by M Kapoor

February 2, 2017 at 4:46 pm

Posted in python


Setting up a golang website to autorun on Ubuntu using systemd


Here is how to set up a website to auto-launch and restart on failure on Ubuntu using systemd. I’m using a simple site written in Go and compiled into an executable called gosite – I got sample Go code from Jeffrey Bolle’s page.

Simple golang website code

package main

import (
    "log"
    "net/http"
    "os"
)

const resp = `Simple Web App
Hello World!`

func handler(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte(resp))
}

func main() {
    http.HandleFunc("/", handler)
    err := http.ListenAndServe(":8080", nil)

    if err != nil {
        log.Println(err)
        os.Exit(1)
    }
}

Setting up systemd

Once the code is ready and tested, let’s move on to setting up systemd.

Create and open a file in the /lib/systemd/system folder:

 vim /lib/systemd/system/gosite.service

Edit the file so it matches the contents below, adjusting the paths for your setup:

[Unit]
Description=A simple go website
ConditionPathExists=/home/user/bin/gosite

[Service]
Restart=always
RestartSec=3
ExecStart=/home/user/bin/gosite

[Install]
WantedBy=multi-user.target

Enable the service using systemctl:

root@hostname:/home/user# systemctl enable gosite.service
Created symlink from /etc/systemd/system/multi-user.target.wants/gosite.service to /lib/systemd/system/gosite.service.

Start the service:

service gosite start

Observe the service is running:

root@hostname:/home/user# ps aux | grep gosite
root 27413 0.0 0.2 827776 5012 ? Ssl 15:50 0:00 /home/user/bin/gosite

Written by M Kapoor

June 14, 2016 at 4:49 pm

Posted in Programming


Hacking Planet Atwood with Python and AWS


Peter Atwood is a hobbyist who creates limited edition pocket tools and puts them up for sale on his site, Planet Pocket Tool. His tools are fairly popular and, being limited, are hard to acquire.

Peter posts the sales randomly and the tools generally sell out within a few minutes of being listed.  The best way to capture a sale is to periodically check his site and get alerted when a sale is in progress.  This write-up is about automating the check and sending an alert via SMS and email using Python.

Blogspot publishes an RSS feed for its blogs. I wrote a simple function that uses feedparser to grab the datetime of the first item in the feed and compare it against the previously saved value. If the datetime of the first item has changed, I send out an alert.

import feedparser
import time

#Get the blog entry
feed = feedparser.parse('http://atwoodknives.blogspot.com/feeds/posts/default')

#Figure out the publish time of the first entry
firstEntryPubTime = time.strftime('%Y%m%d%H%M%S', feed.entries[0].published_parsed)

#The newest post time did not match the previously saved post time
#We have a new post
if firstEntryPubTime != currentUpdateTime:

    #Save the new first blog entry time
    setCurrentUpdateTime(firstEntryPubTime)
    url = feed.entries[0].link
    subject = 'Atwood: ' + feed.entries[0].title
    body = url + feed.entries[0].summary

    #Send an alert
    send_alert(subject, body, url)

else:
    print "No Update"

Now, to send the alert we leverage Amazon Web Services and boto, the AWS Python interface. Amazon has a service called Simple Notification Service, or SNS. SNS is a push service that lets users push messages over channels like SMS and email.

Getting started is simple. First create a topic that people can subscribe to using create_topic. Then subscribe your phone number, email address, and any other endpoints using subscribe. Now you are all set.
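
Roughly, that one-time setup looks like this with boto 2. This is a sketch rather than code from the original script: the topic name and endpoints are made-up placeholders, and I’m assuming boto 2’s parsed-JSON response layout for pulling out the topic ARN.

import boto

#Connect with the AWS token and secret key
c = boto.connect_sns('token', 'secret_key')

#Create the topic (returns the existing topic if it already exists)
topic = c.create_topic('atwood-alerts')
topicarn = topic['CreateTopicResponse']['CreateTopicResult']['TopicArn']

#Subscribe an email address and a phone number to the topic
#(email subscriptions have to be confirmed from the inbox)
c.subscribe(topicarn, 'email', 'me@example.com')
c.subscribe(topicarn, 'sms', '+15551234567')

With the topic and subscriptions in place, send_alert just publishes to the topic’s ARN: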

import boto

def send_alert(message_title, message_body, message_url):
    #Connect with boto using the AWS token and secret key
    c = boto.connect_sns('token', 'secret_key')
    topicarn = "arn:aws:sns:us-east-1:ACCOUNT_ID:TopicName"
    #Publish the URL of the blog post for quick clicking
    #SNS subjects are capped at 100 characters
    publication = c.publish(topicarn, message_url, subject=message_title[:100])
    #Close connection
    c.close()

I set up this script on a DigitalOcean server and ran it periodically using cron. I was able to get to the buy link for most of Atwood’s sales this way and eventually bought a Fancy Ti Atwrench. While nice and well made, it is definitely not worth what Peter Atwood charges for it.

Written by M Kapoor

June 10, 2015 at 11:58 pm

Recursively get all file names from a directory


Recursively get all file names matching a pattern from a directory.

Where:
* path = Directory to search
* matchStr = fnmatch-style pattern to match (e.g. '*.txt')

import os
import fnmatch

def getFiles(path, matchStr):
    matches = []
    for root, dirnames, filenames in os.walk(path):
        for filename in fnmatch.filter(filenames, matchStr):
            matches.append(os.path.join(root, filename))

    return matches
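
For example, with a made-up directory and pattern:

pngFiles = getFiles('/home/user/photos', '*.png')
for f in pngFiles:
    print(f)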

Written by M Kapoor

April 12, 2015 at 2:21 pm

Posted in python