DataBricks and PySpark

by Mark Nielsen
Copyright August 2023

  1. Links
  2. Get DataBricks 14-day evaluation
  3. Saving passwords
  4. Expect and automation


DataBricks 14-day trial

If you have problems deleting and recreating workspaces...

To get credentials for the next part

Create a token.

Get the other credentials -- the hostname and HTTP address.
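
Once you have the token, hostname, and HTTP address, it helps to keep them out of your scripts. Below is a minimal sketch that reads them from environment variables and passes them to the databricks-sql-connector package. The environment variable names and the helper names (load_credentials, run_query) are my own assumptions for illustration, not anything DataBricks requires. The connector calls the two values server_hostname and http_path.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabricksCredentials:
    server_hostname: str
    http_path: str
    access_token: str


# Hypothetical variable names; use whatever fits your setup.
ENV_NAMES = {
    "server_hostname": "DATABRICKS_SERVER_HOSTNAME",
    "http_path": "DATABRICKS_HTTP_PATH",
    "access_token": "DATABRICKS_TOKEN",
}


def load_credentials(env=None):
    """Read the three values from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return DatabricksCredentials(
        **{field: env[name] for field, name in ENV_NAMES.items()}
    )


def run_query(creds, query="SELECT 1"):
    """Run one query; assumes `pip install databricks-sql-connector`."""
    # Imported here so the rest of the file works without the package.
    from databricks import sql

    with sql.connect(
        server_hostname=creds.server_hostname,
        http_path=creds.http_path,
        access_token=creds.access_token,
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute(query)
            return cursor.fetchall()
```

Export the three variables in your shell first, then call run_query(load_credentials()) to check that the credentials work.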

ODBC and Python

I had to do this on my AWS EC2 server because its version of Ubuntu was older and I had other things running on it, so I didn't want to upgrade.

Python driver for Ubuntu

I did this on a laptop. I installed the latest Linux Mint, which is based on the latest Ubuntu. Python worked for this.

First, set up the environment.

ODBC driver for Ubuntu
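
As a rough sketch of what the ODBC route looks like from Python, the code below builds a connection string for the Simba Spark ODBC driver and hands it to pyodbc. The driver path is an assumption (the usual default install location on Linux), so adjust it to wherever the driver landed on your server. The function names are hypothetical.

```python
# Assumed default install path of the Simba Spark ODBC driver on Linux.
DEFAULT_DRIVER = "/opt/simba/spark/lib/64/libsparkodbc_sb64.so"


def build_odbc_connection_string(host, http_path, token, driver=DEFAULT_DRIVER):
    """Build a DSN-less connection string using token authentication."""
    parts = {
        "Driver": driver,
        "Host": host,
        "Port": "443",
        "HTTPPath": http_path,
        "SSL": "1",
        "ThriftTransport": "2",  # HTTP transport
        "AuthMech": "3",         # user name and password
        "UID": "token",          # literal word "token" when using a token
        "PWD": token,            # the personal access token itself
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


def connect_odbc(host, http_path, token):
    """Open a connection; assumes `pip install pyodbc` and the driver above."""
    import pyodbc  # imported lazily so the string builder works without it

    conn_str = build_odbc_connection_string(host, http_path, token)
    return pyodbc.connect(conn_str, autocommit=True)
```

Building the string in its own function makes it easy to print and inspect (minus the token) when a connection fails.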

Installing Python and other software

I am using an Ubuntu EC2 server to connect.

Load data into your MySQL server on EC2

You could use another source, like RDS MySQL or RDS Aurora, but in this case we are using an EC2 server running MySQL.