CS360 Midterm
Chapter 2: Relational Model
Structure of Relational DB
- Attibutes $A_1, A_2, \cdots$
- Domain of the attribute
- Attribute values are atomic: i.e., indivisible
null
value
- Relation schema $R = (A_1, A_2, \cdots)$
- Tuple: element of relation $r(R)$ (row in the table)
Keys
- Superkey $K \subset R$: can identify unique tuple in $r$
- *candidate key*: minimal superkey
- *primary key*
- Foreign key constraint: foreign key of $r_1$ is primary key of $r_2$
┌───────────────────────────────────────────────────┐
│ instructor(**ID**, name, dept_name, salary) │ : referencing relation (r1)
└──────────────────────────────────┬────────────────┘
┌────────┘
│
▼
┌───────────────────────────────────────────────────┐
│ department(**dept_name**, building, budget) │ : reference relation (r2)
└───────────────────────────────────────────────────┘
DB Schema
Schema diagram on the book
Schema Diagrams
tbd
Relational Query Languages
- Procedural vs Declarative (non=procedural)
- Pure languages: equivalent in computing power
- relational algebra, tuple relational calculus, domain relational calculus
Relational Algebra
- select $\sigma_{\text{predicate}}(r)$
- comparison: $=, \neq, >, \geq, <, \leq$
- logical connectives: $\vee, \wedge$
- project $\Pi_{\text{set of attributes}}(r)$
- cartesian product $r_1 \times r_2$
- union $r_1 \cup r_2$
- $r_1$ and $r_2$ must have same arity (number of attributes)
- attribute domains must be compatible
- set difference $r_1 - r_2$
- tuples that are in one relation but not in another
- $r_1$ and $r_2$ must have same arity (number of attributes)
- assignment $\text{'NAME'} \leftarrow r$
- rename $\rho$
- rename can change the name of attributes, while assignment cannot.
- Query equivalence
Chapter 3: Basic SQL
Overview
- SQL Parts
- DML: insert, delete, modify DB
- Integrity of DDL: enforces specific integrity constraints of the tuples
- View definition of DDL
- Transaction control
- Embedded SQL / Dynamic SQL
- Authorization
SQL Data definition
- What in DDL?
- Schema for each relation
- Data type & values of each attribute
- Integrity constraints
- Indices
- Security & Authorization information
- Physical storage structure
- Data types
char(n)
: fixed length $n$ charvarchar(n)
: maximum length $n$ charint
: just int (4 bytes, machine dependent)smallint
: small int (2 bytes, machine dependent)numeric(p, d)
: fixed point real, $p$ digit precision & $d$ digits right to decimal pointreal
: FPfloat(n)
: FP with precision at least $n$ digits
Basic Query Structure
- Create Table
-- table DEPARTMENT, SECTION exists
create table INSTRUCTOR (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2),
primary key (ID),
foreign key (dept_name) references DEPARTMENT,
foreign key (course_id, sec_id, semester) references SECTION
)
- Insert to table
insert into INSTRUCTOR values ('10211', 'Smith', 'Biology', 66000)
- Delete from table
delete from INSTRUCTOR
- Drop table
drop table INSTRUCTOR
- Alter table: change the attribute
alter table INSTRUCTOR add attribute Domain
-- existing tuples are assigned `null`
alter table INSTRUCTOR drop attribute Domain
- Select
-- default is select all
select name
from INSTRUCTOR where `predicate...`
select distinct name as "NAAAAME"
from INSTRUCTOR where `predicate...`
-- sqlite supports select without from
Additional Basic Operations
-
String operations (
like
matching)%
: any substring_
: any single character||
: concatenation
-
order by
select distinct name
from INSTRUCTOR
order by name
-- default is asc!
where
clause predicate
-- between
select name
from INSTRUCTOR
where salary between 90000 and 100000
-- tuple match
select name
from INSTRUCTOR
where (ID, dept_name) = (_something, _something)
Set Operations
(select ...) union (select ...) -- A or B
(select ...) intersect (select ...) -- A and B
(select ...) except (select ...) -- A, not B
-- discard duplicates for default: use `union all`
Null Values
- arithmetic:
5 + null == null
where _attribute is not null
clausenull
comparison: makesunknown
valuenull < 5 == unknown
unknown
arithmetictrue and unknown == unknown
false and unknown == false
unknown and unknown == unknown
true or unknown == true
false or unknown == unknown
unknown or unknown == unknown
Aggregate Functions
avg, min, max, sum, count
group by
select dept_name, avg(salary) as avg_salary
from INSTRUCTOR
group by dept_name
having
select dept_name, avg(salary) as avg_salary
from INSTRUCTOR
group by dept_name
having avg(salary) > 42000
having
vswhere
where
checks the predicate before groupinghaving
checks the predicate after grouping
Nested Subqueries
select
inselect
select A1, A2
from r1, r2
where P
-- select: A could be replaced by (select ...)
-- from: r could be replaced by (select ...)
-- where: P could be `_attribute in (select ...)`
with
clause
with MAX_BUDGET(value) as (select max(budget) from DEPARTMENT)
select department.dept_name
from DEPARTMENT, MAX_BUDGET
where DEPARTMENT.budget=MAX_BUDGET.value;
-- for complicated with clause,
with W1 as (select ...), W2 as (select ...)
_do_something
Set membership
in
, not in
Set comparison
some
select distinct T.name
from INSTRUCTOR as T, instructor as S
where T.salary>S.salary and S.dept_name='Biology'
-- could be written in
select name
from INSTRUCTOR
where salary > some (
select salary from instructor where dept_name='Biology'
)
all
: not supported in sqliteexists
-- does the tuple exists?
select course_id
from SECTION as S
where exists (
select * from section as T
where semester='Spring' and year=2018 and S.course_id=T.course_id
)
unique
: not supported in sqlite
-- is it unique tuple?
select T.course_id
from COURSE as T
where unique (
select * from section as T
where semester='Spring' and year=2018 and S.course_id=T.course_id
)
Modification of the DB
- Deletion
delete from INSTRUCTOR where ...
- Insertion
insert into COURSE(attr1, attr2, ...) values(something, something, ...)
-- or just
insert into COURSE values(something, something, ...)
-- insert from select: subquery should not loop!
insert into INSTRUCTOR (select attr1, ... from STUDENT where _predicate)
- Update
update INSTRUCTOR set salary=salary*1.05 where salary<70000
-- conditional update
update INSTRUCTOR set salary = case
when salary<=100000 then salary*1.05
else 100
end
When performing multiple updates, the order matters!
Chapter 4: Intermediate SQL
Join Expression
- Natural join: matches tuples with same values for all common attributes
select name, course_id from
STUDENT, TAKES
where STUDENT.ID=TAKES.ID
-- equals
select name, course_id
from STUDENT
natural join TAKES
using
clause: CAUTION! use only if the column naes are the same
select l.title, r.name
from ALBUMS
inner join ARTISTS on
r.ArtistID = l.ArtistId;
-- equals
select title, name
from albums
inner join ARTISTS
using (ArtistId)
on
condition
Equi join: special type of inner join using equality operator
- Inner Join: Returns rows that exists in both tables
- Outer join
- left outer join: containing all rows of the left table
- right outer join: containing all rows of the right table
- full outer join: containin all rows of both tables
Views
virtual relation mechanism to hide certain data from the view of certain users
- Create a view
create view v as (select ...)
- materialized veiw: some database system physically stores the view
Transactions
- Sequence of query and/or update statements as a unit of work
- Commit: update the transactions performed
- Rollback: revert the transactions
Integrity Constraints
- Single relation constraits
not null
primary key
unique
check(PREDICATE)
create table SECTION (
course_id varchar(8) not null,
sec_id varchar(8) not null,
semester varchar(8) not null,
primary key (course_id, sec_id)
check (semester in ('Fall', 'Spring'))
)
- Referential Integrity
foreign key
- c.f. cascading actions in violating the integrity
foreign key (dept_name) references DEPARTMENT
on delete cascade
on update cascade
-- deleting dept_name from DEPARTMENT deletes all tuples referencing that value
SQL Data Types and Schemas
-
date
:'2005-07-27'
-
time
:'09:00:30'
-
timestamp
:'2005-07-27 09:00:30.75'
-
interval
: subtracting two time values -
Large-object types
- blob: binary large object
- clob: character large object
- query retuns a pointer rather than the object itself.
-
User defind types
create type Dollars as numeric(12, 2) final;
- Domains: not supported in sqlite
create domain person_name char(20) not null;
create domaindegree_level varchar(10) constraint degree_level_test
check (value in ('something', 'something2'))
Index definition
create table STUDENT (
ID varchar(5),
name varchar(20) not null,
dept_name varchar(20),
tot_cred numeric(3, 0) default 0,
primary key (ID)
)
create index studentID_index on STUDENT(id)
Authorization
- authoriing on
select, insert, update, delete, all privileges
- Grant
grant select on DEPARTMENT to Amit, Satoshi
- Revoke
revoke select on DEPARTMENT from Amit, Satoshi
- Roles
create role instructor
grant instructor to Amit
-- we can grant, revoke to roles, too!
- grant reference
grant reference (dept_name) on DEPARTMENT ot Amit
- Transferring privileges
grant select on DEPARTMNET to Amit with grant option;
revoke select on DEPARTMENT from Amit, Satoshi cascade;
revoke select on DEPARTMENT from Amit, Satoshi restrict;
Chapter 5: Advanced SQL
Accessing SQL from GP-PL
- Dynamic SQL: SQL statements constructed at runtime (SQL query running dynamically)
- Embedded SQL: SQL statements do not change at runtime (hard-coded)
Triggers
- Trigger Event:
insert, delete, update
create trigger setnull_trigger before update of TAKES
referencing old row as orow
referencing new row as nrow
for each row
when (nrow.grade='')
begin atomic
set nrow.grade=null;
end
Recursive Queries: with recursive
with recursive rec_prereq(course_id, prereq_id) as (
select course_id, prereq_id from PREREQ
union
select rec_prereq.course_id, PREREQ.prereq_id from rec_prereq, PREREQ
where rec_prereq.prereq_id=PREREQ.course_id
)
select course_id from rec_prereq;
Advanced Aggregate function
Rank
select ID, rank() over (order by GPA desc nulls last) as s_rank
from student_grades;
ntile(n)
: partition inton
buckets and return the index value
select ID, ntile(4) over (order by GPA desc) as quartile
from student_grades;
Windowing
select date, sum(value) over (order by date rows between 1 preceding and 1 following)
from sales;
- window specification
rows between unbounded preceding and current row
rows unbounded preceding
range between 10 preceding and current row
range interval 10 day preceding
- window specification within partitions
select account_number, date_time,
sum (value) over
(partition by account_number
order by date_time
rows unbounded preceding) as balance
from TRANSACTION
order by account_number, date_time;
Chapter 6: DB Design using E-R Model
NOTICE: using UML
- Reducing redundancy + prevent incompleteness