Archive for the ‘python’ Category
Generating Charts with Python and matplotlib as Base64 images for embedding in HTML webpages
This is the code used to create the charts for my ‘Compare Expense Ratios’ microsite. It uses matplotlib to generate a chart, which is converted to a Base64 string and embedded directly into a webpage. This way there is no need to save an intermediate image file.
# Inputs are a dictionary of lists containing the points to plot (points),
# the title, and the x- and y-axis labels.
def mathPlotLib(points, title, xlabel, ylabel):
    import matplotlib
    matplotlib.use('Agg')  # Non-interactive backend, no display needed
    from matplotlib import pyplot
    import base64
    import io
    import natsort

    # The funds are the keys of the dictionary
    fundOrder = []
    for fund in natsort.natsorted(points.keys()):
        # Plot the points and keep track of the order they are added for the legend
        pyplot.plot(points[fund])
        fundOrder.append(fund)

    # Add the legend, title, and x,y labels.
    pyplot.legend(fundOrder)
    pyplot.title(title)
    pyplot.xlabel(xlabel)
    pyplot.ylabel(ylabel)

    # Save the image as PNG into an in-memory bytes buffer
    my_stringIObytes = io.BytesIO()
    pyplot.savefig(my_stringIObytes, format='png')

    # Seek to the beginning of the buffer and encode it as Base64
    my_stringIObytes.seek(0)
    b64png = base64.b64encode(my_stringIObytes.read()).decode('ascii')

    # Add the html wrapper to embed it as a Base64 data URI
    html = '<img src="data:image/png;base64,' + b64png + '" />'

    # Clean up
    my_stringIObytes.close()
    pyplot.clf()
    pyplot.close()

    return html
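For a browser to render an embedded image, the src attribute has to be a data URI of the form data:<mime type>;base64,<payload>. Here is a minimal sketch of just that step using only the standard library. The helper name to_img_tag and the sample bytes are made up for illustration; they are not part of the microsite code.

```python
import base64

def to_img_tag(image_bytes, mime_type='image/png'):
    """Wrap raw image bytes in an <img> tag using a Base64 data URI."""
    b64 = base64.b64encode(image_bytes).decode('ascii')
    return '<img src="data:' + mime_type + ';base64,' + b64 + '" />'

# Placeholder bytes: the 8-byte PNG signature, not a complete image
sample = b'\x89PNG\r\n\x1a\n'
tag = to_img_tag(sample)
```

In real use, image_bytes would be the contents of the buffer that pyplot.savefig wrote to.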
Automate build and load of Hugo sites to Amazon S3 using Rclone with Python
This was the build and load flow for my Hugo site before I gave up on Hugo and moved back to WordPress. The site was generated using Hugo and pushed to Amazon S3 using rclone.
The code assumes the hugo and rclone executables and the Hugo root directory are all in the same folder.
In this example the Hugo directory is called codepearls.xyzio.com, which is also used as the bucket name.
import os, shutil

# s3 auth info
s3SecretAccessKey = 's3_secret_key'
s3AccessKeyId = 's3_access_key'

# Hugo folder and bucket name. sitepath is path to the hugo public folder.
path = 'codepearls.xyzio.com'
sitepath = os.path.join(path, 'public')

# Remove the files from the previous build by deleting the public folder
if os.path.exists(sitepath):
    shutil.rmtree(sitepath)

# Run hugo from root directory
cmd = 'hugo.exe -s ' + path
os.system(cmd)

# Run rclone from root directory to sync /public to s3, with the
# appropriate args to encrypt the files and enable public read
cmd = 'rclone.exe sync ' + sitepath + ' s3:' + path
cmd += ' -v --s3-secret-access-key ' + s3SecretAccessKey
cmd += ' --s3-access-key-id ' + s3AccessKeyId
cmd += ' --s3-acl public-read'
cmd += ' --s3-server-side-encryption AES256'
cmd += ' --s3-provider AWS'
cmd += ' --s3-region us-west-2'
os.system(cmd)
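Building the command by string concatenation breaks if a path or key ever contains a space. A possible alternative, sketched below, is to build the rclone arguments as a list and hand that to subprocess. The function name build_rclone_cmd is my own invention; the flags are the same ones used above.

```python
import os

def build_rclone_cmd(site_dir, bucket, access_key_id, secret_access_key,
                     region='us-west-2'):
    """Return the rclone sync command as a list of arguments."""
    return [
        'rclone.exe', 'sync', os.path.join(site_dir, 'public'), 's3:' + bucket,
        '-v',
        '--s3-secret-access-key', secret_access_key,
        '--s3-access-key-id', access_key_id,
        '--s3-acl', 'public-read',
        '--s3-server-side-encryption', 'AES256',
        '--s3-provider', 'AWS',
        '--s3-region', region,
    ]

cmd = build_rclone_cmd('codepearls.xyzio.com', 'codepearls.xyzio.com',
                       's3_access_key', 's3_secret_key')
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

With check=True, subprocess.run raises an exception if rclone exits with an error, whereas os.system only returns the exit status.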
Generate and save a static page in Django
Generate and save a static page in Django.
Where
html/static.html is a Django template
data is the data you want to pass into the page
static(request) is a function mapped to a url in urls.py
Then
Visiting the page will generate this static file
Code
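from django.http import HttpResponse
from django.template.loader import render_to_string

def static(request):
    # data is whatever you want to pass into the template
    results = render_to_string('html/static.html', {'content': data})

    # Save the rendered page as a static file
    with open(r'C:\temp\static.html', 'w') as f:
        f.write(results)

    # A Django view must return a response
    return HttpResponse(results)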
Copy a file from a URL directly to S3 using boto3 and requests
Copy a file at inUrl directly to an S3 bucket named bucketName. The ACL is set to public-read and the ContentType is carried over from the source URL.
Your AWS credentials must be pre-set.
import os

import boto3
import requests

# Set up boto3 env vars
os.environ['AWS_ACCESS_KEY_ID'] = ''
os.environ['AWS_SECRET_ACCESS_KEY'] = ''
os.environ['AWS_DEFAULT_REGION'] = 'us-west-2'

inUrl = r'file_to_read'
bucketName = 'bucket_name'  # Placeholder: name of the destination bucket

response = requests.get(inUrl)

s3 = boto3.resource('s3')
s3.Bucket(bucketName).put_object(
    Key='test/file_path.ext',
    Body=response.content,
    ACL='public-read',
    ContentType=response.headers['Content-Type'])
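One thing to watch: response.headers['Content-Type'] raises a KeyError if the server doesn't send that header. A rough sketch of a fallback, assuming you are happy to guess the type from the key's file extension, could look like this. The helper name pick_content_type and the sample keys are hypothetical.

```python
import mimetypes

def pick_content_type(headers, key, default='application/octet-stream'):
    """Prefer the server's Content-Type, else guess from the key's extension."""
    content_type = headers.get('Content-Type')
    if content_type:
        return content_type
    guessed, _encoding = mimetypes.guess_type(key)
    return guessed or default

from_header = pick_content_type({'Content-Type': 'image/png'}, 'test/file_path.ext')
from_name = pick_content_type({}, 'test/chart.png')
fallback = pick_content_type({}, 'test/file_without_extension')
```

The result can then be passed as the ContentType argument to put_object.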
Wait a random delay in Python
Wait a random whole number of seconds from 0 up to, but not including, delay.
import random
import time

# delay is the upper bound in seconds (a positive integer)
timeDelay = random.randrange(0, delay)
time.sleep(timeDelay)
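random.randrange only returns whole numbers. If fractional delays are wanted, random.uniform is one option. This is a small sketch; the function names pick_delay and wait_random are made up for the example.

```python
import random
import time

def pick_delay(max_seconds):
    """Return a random float between 0 and max_seconds."""
    return random.uniform(0, max_seconds)

def wait_random(max_seconds):
    """Sleep for a random fractional interval and return the chosen duration."""
    seconds = pick_delay(max_seconds)
    time.sleep(seconds)
    return seconds

waited = wait_random(0.05)  # Small bound so the example finishes quickly
```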
Set User-Agent for Python requests library
Some sites don’t allow page access if the User-Agent isn’t set. headers is a dictionary and can hold additional header fields.
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'}
resp = requests.get(url, headers=headers)
Download and save file using Python Requests
Download and save a file using the Python requests library.
Where:
url = URL to download
filename = Name of local file for url
import requests

url = ''
filename = ''

response = requests.get(url)
with open(filename, 'wb') as f:
    f.write(response.content)
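response.content holds the whole file in memory, which may be a problem for large downloads. A possible approach is to write the file in chunks instead. The sketch below shows only the writing half with stand-in data, so it runs without a network connection; save_chunks and the sample chunks are hypothetical.

```python
import os
import tempfile

def save_chunks(chunks, filename):
    """Write an iterable of byte chunks to filename and return the byte count."""
    total = 0
    with open(filename, 'wb') as f:
        for chunk in chunks:
            if chunk:  # Skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total

# Stand-in for a streamed response body
fake_chunks = [b'hello ', b'', b'world']
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
written = save_chunks(fake_chunks, demo_path)
```

With requests, the chunks would come from requests.get(url, stream=True).iter_content(chunk_size=8192).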
Recursively get all file names from a directory
Recursively get all file names matching a pattern from a directory.
Where:
* path = Directory to search
* matchStr = Shell-style pattern to match, such as *.txt. To match a substring, wrap it in asterisks, such as *report*
import os
import fnmatch

def getFiles(path, matchStr):
    matches = []
    for root, dirnames, filenames in os.walk(path):
        for filename in fnmatch.filter(filenames, matchStr):
            matches.append(os.path.join(root, filename))
    return matches
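fnmatch.filter matches shell-style patterns against the whole file name, so a bare string with no wildcards only matches a file with exactly that name. Here is a small self-contained demo on a throwaway directory tree. The file names are invented for the example.

```python
import fnmatch
import os
import tempfile

def getFiles(path, matchStr):
    matches = []
    for root, dirnames, filenames in os.walk(path):
        for filename in fnmatch.filter(filenames, matchStr):
            matches.append(os.path.join(root, filename))
    return matches

# Build a small throwaway tree with one subdirectory
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'sub'))
for name in ('report_2016.txt', os.path.join('sub', 'report_2017.txt'), 'notes.md'):
    with open(os.path.join(top, name), 'w') as f:
        f.write('x')

# No wildcards: only a file named exactly "report" would match
exact = getFiles(top, 'report')
# Wildcards on both sides behave like a substring match
wild = getFiles(top, '*report*')
found = sorted(os.path.relpath(p, top) for p in wild)
```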
How to use a different version of Python in PyCharm
How to use a different version of Python in PyCharm.
File -> Settings (Ctrl + Alt + S)
Click on Project Interpreter. Select your desired version of Python.
Going through an xlsx spreadsheet by column in Python
Most examples iterate by row. This one iterates by column.
import xlrd

# Note: xlrd 2.0 and later only read .xls files; reading .xlsx needs an older xlrd
workbook = xlrd.open_workbook('test_file.xlsx')
worksheet = workbook.sheet_by_index(0)

# Iterate through cols
for curr_col in range(worksheet.ncols):
    col = worksheet.col(curr_col)
    # Iterate through cells in col
    for cell in col:
        print(cell.value)
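If the data has already been read row by row, for example by one of those row-based examples, the rows can be regrouped into columns with zip. This is a library-free sketch and the sample rows are made up.

```python
# Made-up data in row order, as a row-based reader would produce it
rows = [
    ['fund', 'ratio', 'year'],
    ['AAA', 0.05, 2016],
    ['BBB', 0.10, 2017],
]

# zip(*rows) groups the first cells together, then the second cells, and so on
columns = [list(col) for col in zip(*rows)]

for col in columns:
    for value in col:
        print(value)
```

Note that zip stops at the shortest row, so rows of uneven length would lose their trailing cells.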