sqlite in python

Open In Colab

sqlite in python#

An sqlite3 library ships with python. In this tutorial, we’ll discuss how to utilize this library and read sqlite tables into pandas. With this, you can generalize to other python APIs to other databases. First, let’s continue on with our work from the previous notebook. A nice little tutorial can be found here.

import sqlite3 as sq3
import pandas as pd

##Download the database using the unix command wget
!wget https://raw.githubusercontent.com/smart-stats/ds4bio_book/main/book/sql/opioid.db

## Create the connection
con = sq3.connect("opioid.db")

## cursor() creates an object that can execute functions in the sqlite cursor
sql = con.cursor()

## execute a command, loop through the output and print out
for row in sql.execute("select * from county_info limit 5;"):
    print(row)


# you have to close the connection
con.close
--2024-04-15 18:53:36--  https://raw.githubusercontent.com/smart-stats/ds4bio_book/main/book/sql/opioid.db
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.110.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7774208 (7.4M) [application/octet-stream]
Saving to: ‘opioid.db’

opioid.db           100%[===================>]   7.41M  21.5MB/s    in 0.3s    

2024-04-15 18:53:36 (21.5 MB/s) - ‘opioid.db’ saved [7774208/7774208]

('1', 'AUTAUGA', 'AL', '01001', '1', '1', 'Autauga', 'Autauga County, Alabama', 'B01003_001', '2006', '51328', 'Autauga, AL', '594.44')
('2', 'BALDWIN', 'AL', '01003', '1', '3', 'Baldwin', 'Baldwin County, Alabama', 'B01003_001', '2006', '168121', 'Baldwin, AL', '1589.78')
('3', 'BARBOUR', 'AL', '01005', '1', '5', 'Barbour', 'Barbour County, Alabama', 'B01003_001', '2006', '27861', 'Barbour, AL', '884.88')
('4', 'BIBB', 'AL', '01007', '1', '7', 'Bibb', 'Bibb County, Alabama', 'B01003_001', '2006', '22099', 'Bibb, AL', '622.58')
('5', 'BLOUNT', 'AL', '01009', '1', '9', 'Blount', 'Blount County, Alabama', 'B01003_001', '2006', '55485', 'Blount, AL', '644.78')
<function Connection.close()>

Let’s read this dataset into pandas.

con = sq3.connect("opioid.db")

county_info = pd.read_sql_query("SELECT * from county_info", con)

# you have to close the connection
con.close

county_info.head()
BUYER_COUNTY BUYER_STATE countyfips STATE COUNTY county_name NAME variable year population Areaname LND110210D
0 1 AUTAUGA AL 01001 1 1 Autauga Autauga County, Alabama B01003_001 2006 51328 Autauga, AL 594.44
1 2 BALDWIN AL 01003 1 3 Baldwin Baldwin County, Alabama B01003_001 2006 168121 Baldwin, AL 1589.78
2 3 BARBOUR AL 01005 1 5 Barbour Barbour County, Alabama B01003_001 2006 27861 Barbour, AL 884.88
3 4 BIBB AL 01007 1 7 Bibb Bibb County, Alabama B01003_001 2006 22099 Bibb, AL 622.58
4 5 BLOUNT AL 01009 1 9 Blount Blount County, Alabama B01003_001 2006 55485 Blount, AL 644.78

This should get you started on working with SQL in python. Personally, I find using python easier than directly using the SQL, so the basic approach in this section is how I would suggest that you operate.