Chapter 2: Relational Model

Structure of Relational DB

  • Attibutes $A_1, A_2, \cdots$
    • Domain of the attribute
    • Attribute values are atomic: i.e., indivisible
    • null value
  • Relation schema $R = (A_1, A_2, \cdots)$
  • Tuple: element of relation $r(R)$ (row in the table)

Keys

- Superkey $K \subset R$: can identify unique tuple in $r$
- *candidate key*: minimal superkey
- *primary key*
- Foreign key constraint: foreign key of $r_1$ is primary key of $r_2$
┌───────────────────────────────────────────────────┐
│    instructor(**ID**, name, dept_name, salary)    │ : referencing relation (r1)
└──────────────────────────────────┬────────────────┘
                          ┌────────┘
                          │
                          ▼
┌───────────────────────────────────────────────────┐
│    department(**dept_name**, building, budget)    │ : reference relation (r2)
└───────────────────────────────────────────────────┘

DB Schema

Schema diagram on the book

Schema Diagrams

tbd

Relational Query Languages

  • Procedural vs Declarative (non=procedural)
  • Pure languages: equivalent in computing power
    • relational algebra, tuple relational calculus, domain relational calculus

Relational Algebra

  • select $\sigma_{\text{predicate}}(r)$
    • comparison: $=, \neq, >, \geq, <, \leq$
    • logical connectives: $\vee, \wedge$
  • project $\Pi_{\text{set of attributes}}(r)$
  • cartesian product $r_1 \times r_2$
  • union $r_1 \cup r_2$
    • $r_1$ and $r_2$ must have same arity (number of attributes)
    • attribute domains must be compatible
  • set difference $r_1 - r_2$
    • tuples that are in one relation but not in another
    • $r_1$ and $r_2$ must have same arity (number of attributes)
  • assignment $\text{'NAME'} \leftarrow r$
  • rename $\rho$
    • rename can change the name of attributes, while assignment cannot.
  • Query equivalence

Chapter 3: Basic SQL

Overview

  • SQL Parts
    • DML: insert, delete, modify DB
    • Integrity of DDL: enforces specific integrity constraints of the tuples
    • View definition of DDL
    • Transaction control
    • Embedded SQL / Dynamic SQL
    • Authorization

SQL Data definition

  • What in DDL?
    • Schema for each relation
    • Data type & values of each attribute
    • Integrity constraints
    • Indices
    • Security & Authorization information
    • Physical storage structure
  • Data types
    • char(n): fixed length $n$ char
    • varchar(n): maximum length $n$ char
    • int: just int (4 bytes, machine dependent)
    • smallint: small int (2 bytes, machine dependent)
    • numeric(p, d): fixed point real, $p$ digit precision & $d$ digits right to decimal point
    • real: FP
    • float(n): FP with precision at least $n$ digits

Basic Query Structure

  • Create Table
-- table DEPARTMENT, SECTION exists
create table INSTRUCTOR (
    ID          char(5),
    name        varchar(20),
    dept_name   varchar(20),
    salary      numeric(8,2),
    primary key (ID),
    foreign key (dept_name) references DEPARTMENT,
    foreign key (course_id, sec_id, semester) references SECTION
)
  • Insert to table
insert into INSTRUCTOR values ('10211', 'Smith', 'Biology', 66000)
  • Delete from table
delete from INSTRUCTOR
  • Drop table
drop table INSTRUCTOR
  • Alter table: change the attribute
alter table INSTRUCTOR add attribute Domain
-- existing tuples are assigned `null`
alter table INSTRUCTOR drop attribute Domain
  • Select
-- default is select all
select name
from INSTRUCTOR where `predicate...`

select distinct name as "NAAAAME"
 from INSTRUCTOR where `predicate...`

-- sqlite supports select without from

Additional Basic Operations

  • String operations (like matching)

    • %: any substring
    • _: any single character
    • ||: concatenation
  • order by

select distinct name
from INSTRUCTOR
order by name
-- default is asc!
  • where clause predicate
-- between
select name 
from INSTRUCTOR
where salary between 90000 and 100000
-- tuple match
select name 
from INSTRUCTOR
where (ID, dept_name) = (_something, _something)

Set Operations

(select ...) union (select ...)      -- A or B
(select ...) intersect (select ...)  -- A and B
(select ...) except (select ...)     -- A, not B
-- discard duplicates for default: use `union all`

Null Values

  • arithmetic: 5 + null == null
  • where _attribute is not null clause
  • null comparison: makes unknown value
    • null < 5 == unknown
    • unknown arithmetic
      • true and unknown == unknown
      • false and unknown == false
      • unknown and unknown == unknown
      • true or unknown == true
      • false or unknown == unknown
      • unknown or unknown == unknown

Aggregate Functions

avg, min, max, sum, count

  • group by
select dept_name, avg(salary) as avg_salary
from INSTRUCTOR
group by dept_name
  • having
select dept_name, avg(salary) as avg_salary
from INSTRUCTOR
group by dept_name
having avg(salary) > 42000
  • having vs where
    • where checks the predicate before grouping
    • having checks the predicate after grouping

Nested Subqueries

select in select

select A1, A2
from r1, r2
where P
-- select: A could be replaced by (select ...)
-- from: r could be replaced by (select ...)
-- where: P could be `_attribute in (select ...)`
  • with clause
with MAX_BUDGET(value) as (select max(budget) from DEPARTMENT) 
select department.dept_name
from DEPARTMENT, MAX_BUDGET
where DEPARTMENT.budget=MAX_BUDGET.value;

-- for complicated with clause,
with W1 as (select ...), W2 as (select ...)
_do_something

Set membership

in, not in

Set comparison

  • some
select distinct T.name
from INSTRUCTOR as T, instructor as S
where T.salary>S.salary and S.dept_name='Biology'

-- could be written in
select name
from INSTRUCTOR
where salary > some (
    select salary from instructor where dept_name='Biology'
)
  • all: not supported in sqlite
  • exists
-- does the tuple exists?
select course_id
from SECTION as S
where exists (
    select * from section as T
    where semester='Spring' and year=2018 and S.course_id=T.course_id
)
  • unique: not supported in sqlite
-- is it unique tuple?
select T.course_id
from COURSE as T
where unique (
    select * from section as T
    where semester='Spring' and year=2018 and S.course_id=T.course_id
)

Modification of the DB

  • Deletion
delete from INSTRUCTOR where ...
  • Insertion
insert into COURSE(attr1, attr2, ...) values(something, something, ...)
-- or just
insert into COURSE values(something, something, ...)
-- insert from select: subquery should not loop!
insert into INSTRUCTOR (select attr1, ... from STUDENT where _predicate)
  • Update
update INSTRUCTOR set salary=salary*1.05 where salary<70000
-- conditional update
update INSTRUCTOR set salary = case
    when salary<=100000 then salary*1.05
    else 100
end

When performing multiple updates, the order matters!


Chapter 4: Intermediate SQL

Join Expression

  • Natural join: matches tuples with same values for all common attributes
select name, course_id from 
STUDENT, TAKES
where STUDENT.ID=TAKES.ID
-- equals
select name, course_id
from STUDENT
natural join TAKES
  • using clause: CAUTION! use only if the column naes are the same
select l.title, r.name
from ALBUMS
inner join ARTISTS on
r.ArtistID = l.ArtistId;

-- equals 

select title, name
from albums
inner join ARTISTS
using (ArtistId)
  • on condition

Equi join: special type of inner join using equality operator

  • Inner Join: Returns rows that exists in both tables
  • Outer join
    • left outer join: containing all rows of the left table
    • right outer join: containing all rows of the right table
    • full outer join: containin all rows of both tables

Views

virtual relation mechanism to hide certain data from the view of certain users

  • Create a view
create view v as (select ...)
  • materialized veiw: some database system physically stores the view

Transactions

  • Sequence of query and/or update statements as a unit of work
    • Commit: update the transactions performed
    • Rollback: revert the transactions

Integrity Constraints

  • Single relation constraits
    • not null
    • primary key
    • unique
    • check(PREDICATE)
create table SECTION (
    course_id       varchar(8) not null,
    sec_id          varchar(8) not null,
    semester        varchar(8) not null,
    primary key (course_id, sec_id)
    check (semester in ('Fall', 'Spring'))
)
  • Referential Integrity
    • foreign key
    • c.f. cascading actions in violating the integrity
foreign key (dept_name) references DEPARTMENT
    on delete cascade
    on update cascade
-- deleting dept_name from DEPARTMENT deletes all tuples referencing that value

SQL Data Types and Schemas

  • date: '2005-07-27'

  • time: '09:00:30'

  • timestamp: '2005-07-27 09:00:30.75'

  • interval: subtracting two time values

  • Large-object types

    • blob: binary large object
    • clob: character large object
    • query retuns a pointer rather than the object itself.
  • User defind types

create type Dollars as numeric(12, 2) final;
  • Domains: not supported in sqlite
create domain person_name char(20) not null;
create domaindegree_level varchar(10) constraint degree_level_test 
check (value in ('something', 'something2'))

Index definition

create table STUDENT (
    ID              varchar(5),
    name            varchar(20) not null,
    dept_name       varchar(20),
    tot_cred        numeric(3, 0) default 0,
    primary key (ID)
)
create index studentID_index on STUDENT(id)

Authorization

  • authoriing on select, insert, update, delete, all privileges
  • Grant
grant select on DEPARTMENT to Amit, Satoshi
  • Revoke
revoke select on DEPARTMENT from Amit, Satoshi
  • Roles
create role instructor
grant instructor to Amit
-- we can grant, revoke to roles, too!
  • grant reference
grant reference (dept_name) on DEPARTMENT ot Amit
  • Transferring privileges
grant select on DEPARTMNET to Amit with grant option;
revoke select on DEPARTMENT from Amit, Satoshi cascade;
revoke select on DEPARTMENT from Amit, Satoshi restrict;

Chapter 5: Advanced SQL

Accessing SQL from GP-PL

  • Dynamic SQL: SQL statements constructed at runtime (SQL query running dynamically)
  • Embedded SQL: SQL statements do not change at runtime (hard-coded)

Triggers

  • Trigger Event: insert, delete, update
create trigger setnull_trigger before update of TAKES
referencing old row as orow
referencing new row as nrow
for each row
    when (nrow.grade='')
    begin atomic
        set nrow.grade=null;
    end

Recursive Queries: with recursive

with recursive rec_prereq(course_id, prereq_id) as (
    select course_id, prereq_id from PREREQ

    union

    select rec_prereq.course_id, PREREQ.prereq_id from rec_prereq, PREREQ
    where rec_prereq.prereq_id=PREREQ.course_id
)
select course_id from rec_prereq;

Advanced Aggregate function

Rank

select ID, rank() over (order by GPA desc nulls last) as s_rank 
from student_grades; 
  • ntile(n): partition into n buckets and return the index value
select ID, ntile(4) over (order by GPA desc) as quartile 
from student_grades;

Windowing

select date, sum(value) over (order by date rows between 1 preceding and 1 following) 
from sales;
  • window specification
rows between unbounded preceding and current row
rows unbounded preceding
range between 10 preceding and current row
range interval 10 day preceding
  • window specification within partitions
select account_number, date_time, 
sum (value) over 
    (partition by account_number
    order by date_time
    rows unbounded preceding) as balance
from TRANSACTION
order by account_number, date_time;

Chapter 6: DB Design using E-R Model

NOTICE: using UML

  • Reducing redundancy + prevent incompleteness