sqlite in python

sqlite in python#

An sqlite3 library ships with python. In this tutorial, we’ll discuss how to utilize this library and read sqlite tables into pandas. With this, you can generalize to other python APIs to other databases. First, let’s continue on with our work from the previous notebook. A nice little tutorial can be found here.

import sqlite3 as sq3
import pandas as pd

##Download the database using the unix command wget
!wget https://raw.githubusercontent.com/smart-stats/ds4bio_book/main/book/sql/opioid.db

## Create the connection
con = sq3.connect("opioid.db")

## cursor() creates an object that can execute functions in the sqlite cursor
sql = con.cursor()

## execute a command, loop through the output and print out
for row in sql.execute("select * from county_info limit 5;"):
    print(row)


# you have to close the connection
con.close

--2024-04-15 18:53:36--  https://raw.githubusercontent.com/smart-stats/ds4bio_book/main/book/sql/opioid.db
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.110.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7774208 (7.4M) [application/octet-stream]
Saving to: ‘opioid.db’

opioid.db           100%[===================>]   7.41M  21.5MB/s    in 0.3s    

2024-04-15 18:53:36 (21.5 MB/s) - ‘opioid.db’ saved [7774208/7774208]

('1', 'AUTAUGA', 'AL', '01001', '1', '1', 'Autauga', 'Autauga County, Alabama', 'B01003_001', '2006', '51328', 'Autauga, AL', '594.44')
('2', 'BALDWIN', 'AL', '01003', '1', '3', 'Baldwin', 'Baldwin County, Alabama', 'B01003_001', '2006', '168121', 'Baldwin, AL', '1589.78')
('3', 'BARBOUR', 'AL', '01005', '1', '5', 'Barbour', 'Barbour County, Alabama', 'B01003_001', '2006', '27861', 'Barbour, AL', '884.88')
('4', 'BIBB', 'AL', '01007', '1', '7', 'Bibb', 'Bibb County, Alabama', 'B01003_001', '2006', '22099', 'Bibb, AL', '622.58')
('5', 'BLOUNT', 'AL', '01009', '1', '9', 'Blount', 'Blount County, Alabama', 'B01003_001', '2006', '55485', 'Blount, AL', '644.78')

<function Connection.close()>

Let’s read this dataset into pandas.

con = sq3.connect("opioid.db")

county_info = pd.read_sql_query("SELECT * from county_info", con)

# you have to close the connection
con.close

county_info.head()

		BUYER_COUNTY	BUYER_STATE	countyfips	STATE	COUNTY	county_name	NAME	variable	year	population	Areaname	LND110210D
0	1	AUTAUGA	AL	01001	1	1	Autauga	Autauga County, Alabama	B01003_001	2006	51328	Autauga, AL	594.44
1	2	BALDWIN	AL	01003	1	3	Baldwin	Baldwin County, Alabama	B01003_001	2006	168121	Baldwin, AL	1589.78
2	3	BARBOUR	AL	01005	1	5	Barbour	Barbour County, Alabama	B01003_001	2006	27861	Barbour, AL	884.88
3	4	BIBB	AL	01007	1	7	Bibb	Bibb County, Alabama	B01003_001	2006	22099	Bibb, AL	622.58
4	5	BLOUNT	AL	01009	1	9	Blount	Blount County, Alabama	B01003_001	2006	55485	Blount, AL	644.78

This should get you started on working with SQL in python. Personally, I find using python easier than directly using the SQL, so the basic approach in this section is how I would suggest that you operate.