Archive for February 2017
Copy a file from a URL directly to S3 using boto3 and requests
Copy the file at inUrl directly to the S3 bucket bucketName. The ACL is set to public-read, and the ContentType is carried over from the Content-Type header of the response for the URL.
Your AWS credentials must already be set, for example through the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
import os

import boto3
import requests

# Set up boto3 env vars
os.environ['AWS_ACCESS_KEY_ID'] = ''
os.environ['AWS_SECRET_ACCESS_KEY'] = ''
os.environ['AWS_DEFAULT_REGION'] = 'us-west-2'

inUrl = r'file_to_read'
bucketName = ''  # name of the destination bucket

# Fetch the whole file into memory
response = requests.get(inUrl)

# Upload the bytes, keeping the Content-Type reported by the source server
s3 = boto3.resource('s3')
s3.Bucket(bucketName).put_object(Key='test/file_path.ext', Body=response.content, ACL='public-read', ContentType=response.headers['Content-Type'])
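The snippet reads response.headers['Content-Type'] directly, which raises a KeyError when the server does not send that header. A minimal sketch of a fallback follows, assuming it is acceptable to guess the type from the file extension in the URL. The helper name pick_content_type and the default value are hypothetical and not part of the original post.

```python
import mimetypes
from urllib.parse import urlparse


def pick_content_type(headers, url, default='application/octet-stream'):
    """Return the Content-Type header if present, else guess from the URL path.

    pick_content_type is a hypothetical helper, not part of boto3 or requests.
    """
    content_type = headers.get('Content-Type')
    if content_type:
        return content_type
    # Guess from the extension of the URL path, ignoring any query string
    guessed, _ = mimetypes.guess_type(urlparse(url).path)
    return guessed or default
```

The result could then be passed as ContentType=pick_content_type(response.headers, inUrl) in the put_object call.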
Wait a random delay in Python
Wait a random whole number of seconds from 0 up to, but not including, delay.
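import random
import time

delay = 10  # upper bound in seconds (example value)

# randrange returns an integer from 0 up to, but not including, delay
timeDelay = random.randrange(0, delay)
time.sleep(timeDelay)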
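random.randrange only yields whole seconds. If fractional delays are wanted, one possible variant uses random.uniform instead. This is a sketch under that assumption: the function name wait_random and the injectable sleep parameter are illustrative additions, not part of the original snippet.

```python
import random
import time


def wait_random(delay, sleep=time.sleep):
    """Sleep for a random float number of seconds in [0, delay].

    Returns the number of seconds chosen. The sleep parameter exists only
    so the function can be exercised without actually waiting.
    """
    if delay < 0:
        raise ValueError('delay must be non-negative')
    seconds = random.uniform(0, delay)
    sleep(seconds)
    return seconds
```

Calling wait_random(5) pauses for up to five seconds, which can be handy between requests to the same site.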
Set User-Agent for Python requests library
Some sites don’t allow page access if the User-Agent header isn’t set. headers is a dict and can hold additional header fields.
import requests

url = ''
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'}
resp = requests.get(url, headers=headers)
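Because headers is an ordinary dict, extra fields can be merged in before the request is made. The sketch below shows one way to do that. build_headers and fetch are hypothetical helper names, and the Accept-Language field in the usage note is only an example.

```python
DEFAULT_USER_AGENT = (
    'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'
)


def build_headers(user_agent=DEFAULT_USER_AGENT, extra=None):
    """Return a headers dict with User-Agent set, plus any extra fields."""
    headers = {'User-Agent': user_agent}
    if extra:
        headers.update(extra)
    return headers


def fetch(url, extra=None):
    """GET url with the headers above (hypothetical convenience wrapper)."""
    # Imported here so build_headers can be used without requests installed
    import requests
    return requests.get(url, headers=build_headers(extra=extra))
```

For example, fetch(url, extra={'Accept-Language': 'en-US'}) sends both the User-Agent and the Accept-Language fields.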
Download and save file using Python Requests
Download and save a file using the Python requests library.
Where:
url = URL to download
filename = Name of local file for url
import requests

url = ''
filename = ''

response = requests.get(url)

# Write the response body to disk as raw bytes
with open(filename, 'wb') as f:
    f.write(response.content)
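response.content holds the whole file in memory, which may be a problem for large downloads. A possible alternative is to stream the body in chunks with stream=True and iter_content. This is a sketch, not the original author's method: the helper names save_chunks and download and the 8192-byte chunk size are assumptions.

```python
def save_chunks(chunks, filename):
    """Write an iterable of byte chunks to filename and return bytes written."""
    written = 0
    with open(filename, 'wb') as f:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                written += f.write(chunk)
    return written


def download(url, filename, chunk_size=8192):
    """Stream url to filename without loading the whole body into memory."""
    # Imported here so save_chunks can be used without requests installed
    import requests
    with requests.get(url, stream=True) as response:
        # Fail early on 4xx/5xx instead of saving an error page
        response.raise_for_status()
        return save_chunks(response.iter_content(chunk_size=chunk_size), filename)
```

download(url, filename) returns the number of bytes saved, which can be compared against the Content-Length header when the server provides one.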