The questions below require you to write SQL code. The questions refer to the Books Database which is fully described fully on this page. However for your convenience, the info from that page was also copied below.
The rows of data shown for the various tables is the data as it appears in the datatabase today. However, with any database, you must consider that the data might change in the future. For example, new rows might be added, prices might change, data may be removed, etc.
The code in the answer for each question MUST WORK even if data were to be different than the current data in the database (i.e. do NOT write the answer such that it only works for the existing data - the code should work no matter what data is in the tables as long as the tables continue to have the same structure as they do now.
Click here to hide/show this section
40.1 The “Books Database”
The “books database” is a collection of several tables that contains data relating to books (i.e. “titles”), authors, publishers, etc.
You can download the data for the database in the following ways:
Click here to download all the data for the books datbase as a single “SQLite” database file.
To use this file you can do one of the following:
Install SQLite software on your computer. There are many such programs avaiable. I recommend “DB Viewer for SQLite” which is a free open source program.
This is a website that lets you upload a sqlite3 file and run SQL Select statements against the data. I created this website myself (with the help of some AI coding assistants).
PROs: you don’t need to install anything
CONs: currently only works with SQL SELECT statment but not any other types of sql statements.
# Read in the data for the books database - see the titles =read_csv("titles.csv", na="NULL", show_col_types=FALSE)authors =read_csv("authors.csv", na="NULL", show_col_types=FALSE)publishers =read_csv("publishers.csv", na="NULL", show_col_types=FALSE)title_authors =read_csv("title_authors.csv", na="NULL", show_col_types=FALSE)royalties =read_csv("royalties.csv", na="NULL", show_col_types=FALSE)
The following is an “Entity Relationship Diagram (ERD)” that describes the relationships between the tables. See the Word document linked above for more info:
title_id title_name type pub_id pages price sales pubdate
1 T01 1977! history P01 107 21.99 566 8/1/2000 0:00
2 T02 200 Years of German Humor history P03 14 19.95 9566 4/1/1998 0:00
3 T03 Ask Yor System Administrator computer P02 1226 39.95 25667 9/1/2000 0:00
4 T04 But I Did It Unconciously psychology P01 510 12.99 13001 5/31/1999 0:00
5 T05 Exchange of Platitudes psychology P01 201 6.95 201440 1/1/2001 0:00
6 T06 How About Never? biography P01 473 19.95 11320 7/31/2000 0:00
7 T07 I Blame My Mother biography P03 333 23.95 1500200 10/1/1999 0:00
8 T08 Just Wait Until After School children P01 86 10.00 4095 6/1/2001 0:00
9 T09 Kiss My Boo Boo children P01 22 13.95 5000 5/31/2002 0:00
10 T10 Not Without My Fabrerge Egg biography P05 NA NA NA <NA>
11 T11 Perhaps It's a Glandular Problem psychology P05 826 7.99 94123 11/30/2000 0:00
12 T12 Spontaneous, Not Annoying biography P05 507 12.99 100001 8/31/2000 0:00
13 T13 What Are The Civilian Applications? history P03 802 29.99 10467 5/31/1999 0:00
Click here - publishers table
sqldf("select * from publishers")
pub_id pub_name city state country
1 P01 Abatis Publishers New York NY USA
2 P02 Core Dump Books San Francisco CA USA
3 P03 Schandenfreude Press Hamburg <NA> Germany
4 P04 Tneterhooks Press Berkeley CA USA
5 P05 AAA Publishing Berkeley CA USA
au_id au_fname au_lname phone address city state zip
1 A01 Sarah Buchman 718-496-7223 75 West 205 St Bronx NY 10468
2 A02 Wendy Heydemark 303-986-7020 2922 Baseline Rd Boulder CO 80303
3 A03 Hallie Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco CA 94123
4 A04 Klee Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco CA 94123
5 A05 Christian Kells 212-771-4680 114 Horatio St New York NY 10014
6 A06 Harvey Kellsey 650-836-7128 390 Serra Mall Palo Alto CA 94305
7 A07 Paddy O'Furniture 941-925-0752 1442 Main St Sarasota FL 34236
List all books (title and number sold) that sold more than 100 copies. List the books with the most sales at the top.
Using the data shown above, the output should be:
title_name sales
1 I Blame My Mother 1500200
2 Exchange of Platitudes 201440
3 Spontaneous, Not Annoying 100001
4 Perhaps It's a Glandular Problem 94123
5 Ask Yor System Administrator 25667
6 But I Did It Unconciously 13001
7 How About Never? 11320
8 What Are The Civilian Applications? 10467
9 200 Years of German Humor 9566
10 Kiss My Boo Boo 5000
11 Just Wait Until After School 4095
Click for answer
sqldf(" select title_name, sales from titles where sales > 1000 order by sales desc;")
Question 2
List all authors who are live either in NY or CA and whose last name begins with a “K”.
Using the data shown above, the output should be:
au_fname au_lname
1 Christian Kells
2 Harvey Kellsey
Click for answer
sqldf(" -- NOTE: -- parentheses are required in the WHERE -- since AND is done before OR by default) select au_fname, au_lname from authors where (state = 'NY' or state='CA') and substr(au_lname,1,1) = 'K'; ")
Question 3
List the first and last names of all authors whose last name starts with a letter from A through J
(HINT: In the where clause, make sure that the first letter of the last name is both >=“A” and also <“K”) (HINT: another possible solution is to use the LIKE several times - once for A, once for B, once for C, etc up to J. Each LIKE should be separated from the others by OR’s)
sqldf(" select au_fname, au_lname from authors where substr(au_lname,1,1) >= 'A' and substr(au_lname,1,1) < 'K';")
ANOTHER ANSWER:
sqldf(" select au_fname, au_lname from authors where au_lname like 'A%' or au_lname like 'B%' or au_lname like 'C%' or au_lname like 'D%' or au_lname like 'E%' or au_lname like 'F%' or au_lname like 'G%' or au_lname like 'H%' or au_lname like 'I%' or au_lname like 'J%'")
Question 4
For each author, show their first initial, followed by a period and a space, followed by their last name. In the 2nd column show the author’s state. Show the column heading for the first column as ‘AuthorName’. Order the results in alphabetical order based on the full name of the person.
Using the data shown above, the output should be:
AuthorName state
1 S. Buchman NY
2 W. Heydemark CO
3 H. Hull CA
4 K. Hull CA
5 C. Kells NY
6 H. Kellsey CA
7 P. O'Furniture FL
Click for answer
sqldf("select substr(au_fname,1,1) || '. ' || au_lname as 'AuthorName', state from authors order by au_lname, au_fname;")
Question 5
Show the titles of all books. Also show the length of the title, i.e. how many characters, including spaces and punctuation. Display the 2nd column with the name ‘TitleLength’. Sort the output so that the shortest titles are listed first. If two titles are the same length then sort those titles alphabetically.
Using the data shown above, the output should be:
title_name TitleLength
1 1977! 5
2 Kiss My Boo Boo 15
3 How About Never? 16
4 I Blame My Mother 17
5 Exchange of Platitudes 22
6 200 Years of German Humor 25
7 But I Did It Unconciously 25
8 Spontaneous, Not Annoying 25
9 Not Without My Fabrerge Egg 27
10 Ask Yor System Administrator 28
11 Just Wait Until After School 28
12 Perhaps It's a Glandular Problem 32
13 What Are The Civilian Applications? 35
Click for answer
sqldf(" select title_name, length(title_name) as 'TitleLength' from titles order by length(title_name), title_name;")
40.1.6 Single table queries with aggregate functions but no “group by” (i.e. these will return EXACTLY one row for each query).
Question 6
List the average price of all books.
Using the data shown above, the output should be:
avg(price)
1 18.3875
Click for answer
sqldf(" select avg(price) from titles;")
Question 7
List the average price of history books.
Using the data shown above, the output should be:
avg(price)
1 23.97667
Click for answer
sqldf(" select avg(price) from titles where type = 'history';")
Question 8
List the number of pages in the longest and shortest books (don’t list the actual title of the book).
Using the data shown above, the output should be:
min(pages) max(pages)
1 14 1226
Click for answer
sqldf(" select min(pages) , max(pages) from titles")
40.1.7 Single table queries with calculated values
Question 9
List the title_name and total revenue for each book. (Revenue for a book is the number sold times the price of the book.)
Using the data shown above, the output should be:
title_name Total Revenue
1 1977! 12446.34
2 200 Years of German Humor 190841.70
3 Ask Yor System Administrator 1025396.65
4 But I Did It Unconciously 168882.99
5 Exchange of Platitudes 1400008.00
6 How About Never? 225834.00
7 I Blame My Mother 35929790.00
8 Just Wait Until After School 40950.00
9 Kiss My Boo Boo 69750.00
10 Not Without My Fabrerge Egg NA
11 Perhaps It's a Glandular Problem 752042.77
12 Spontaneous, Not Annoying 1299012.99
13 What Are The Civilian Applications? 313905.33
Click for answer
sqldf(" select title_name, sales * price as 'Total Revenue' from titles;")
Question 10
List title of each book and the sale price for the book where the sale price is 10% off of the original price.
Using the data shown above, the output should be:
title_name Sale Price
1 1977! 19.791
2 200 Years of German Humor 17.955
3 Ask Yor System Administrator 35.955
4 But I Did It Unconciously 11.691
5 Exchange of Platitudes 6.255
6 How About Never? 17.955
7 I Blame My Mother 21.555
8 Just Wait Until After School 9.000
9 Kiss My Boo Boo 12.555
10 Not Without My Fabrerge Egg NA
11 Perhaps It's a Glandular Problem 7.191
12 Spontaneous, Not Annoying 11.691
13 What Are The Civilian Applications? 26.991
Click for answer
sqldf(" select title_name, price * 0.90 as 'Sale Price' from titles;")
40.2 Single table queries with “group by”.
These can use aggregate functions but will return at most one row in the output for each “group” of rows as defined in the “group by”
Question 11
For each “type” of book (e.g. biography, children, etc) list the number of pages in the shortest book of that type and the number of pages in the longest book of that type. Sort the results alphabetically by the type of book.
Using the data shown above, the output should be:
type Length of Shortest Length of Longest
1 biography 333 507
2 children 22 86
3 computer 1226 1226
4 history 14 802
5 psychology 201 826
Click for answer
sqldf(" select type, min(pages) AS 'Length of Shortest' , max(pages) as 'Length of Longest' from titles group by type order by type;")
Question 12
List the types of books (e.g. history, biography, etc) and the average price of those books for which the average price of books in that category is at least $12.00.
Using the data shown above, the output should be:
type avg(price)
1 biography 18.96333
2 computer 39.95000
3 history 23.97667
Click for answer
sqldf(" select type, avg(price) from titles group by type having avg(price) >= 12;")
Question 13
For each type of book (e.g. biography, children, etc), show the number of those books that are 450 pages or longer. The column heading should be ‘# of long books’. If there are no titles of a particular type that are so long, then don’t show that type at all.
Using the data shown above, the output should be:
type # of long books
1 biography 2
2 computer 1
3 history 1
4 psychology 2
Click for answer
sqldf(" select type, count(*) as '# of long books' from titles where pages >= 450 group by type;")
Question 14
Modify the answer to the previous question so that only those types for which there are at least 2 long books are displayed.
Using the data shown above, the output should be:
type # of long books
1 biography 2
2 psychology 2
Click for answer
NOTE: the only change to the above answer is the addition of the HAVING clause at the end.
sqldf(" select type, count(*) as '# of long books' from titles where pages >= 450 group by type having count(*) >= 2 -- this is the only change.")
40.3 Multiple table queries with group by
Question 15
For each publisher, list the name of the publisher and total number of pages that the publisher has published in all their books (ie. add up all the pages in all books for each publisher). Show the publishers who have published the most pages at the top of the result list.
sqldf(" select pub_name, sum(pages) from publishers join titles on publishers.pub_id = titles.pub_id group by pub_name having sum(pages) >= 1250 -- THIS LINE IS THE ONLY CHANGE order by sum(pages) desc;")
Question 17
List each publisher’s name and the numbers of pages in their longest and shortest books. Sort the results by the publisher’s name.
sqldf(" select pub_name, min(pages) AS 'Length of Shortest' , max(pages) as 'Length of Longest' from titles join publishers on titles.pub_id = publishers.pub_id group by pub_name order by pub_name;")
40.4 Multiple table queries
Question 18
List the authors first and last names, the author’s state, the title name, the publisher name and the publishers state for all books.
Using the data shown above, the output should be:
au_fname au_lname state title_name pub_name state
1 Sarah Buchman NY 1977! Abatis Publishers NY
2 Sarah Buchman NY 200 Years of German Humor Schandenfreude Press <NA>
3 Sarah Buchman NY What Are The Civilian Applications? Schandenfreude Press <NA>
4 Wendy Heydemark CO How About Never? Abatis Publishers NY
5 Wendy Heydemark CO I Blame My Mother Schandenfreude Press <NA>
6 Wendy Heydemark CO Not Without My Fabrerge Egg AAA Publishing CA
7 Wendy Heydemark CO Spontaneous, Not Annoying AAA Publishing CA
8 Hallie Hull CA But I Did It Unconciously Abatis Publishers NY
9 Hallie Hull CA Perhaps It's a Glandular Problem AAA Publishing CA
10 Klee Hull CA But I Did It Unconciously Abatis Publishers NY
11 Klee Hull CA Exchange of Platitudes Abatis Publishers NY
12 Klee Hull CA I Blame My Mother Schandenfreude Press <NA>
13 Klee Hull CA Perhaps It's a Glandular Problem AAA Publishing CA
14 Christian Kells NY Ask Yor System Administrator Core Dump Books CA
15 Harvey Kellsey CA Just Wait Until After School Abatis Publishers NY
16 Harvey Kellsey CA Kiss My Boo Boo Abatis Publishers NY
17 Harvey Kellsey CA Perhaps It's a Glandular Problem AAA Publishing CA
Click for answer
sqldf(" select au_fname, au_lname, authors.state, title_name, pub_name, publishers.state from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on publishers.pub_id = titles.pub_id;")
Question 19
List the title, number of pages, authors first and last names and author’s state for all books that are longer than 150 pages and whose author lives in NY or CA.
Using the data shown above, the output should be:
Warning: Ignoring remaining part of query: -- (parentheses ARE necessary since by default AND is done before OR)
title_name pages au_fname au_lname state
1 What Are The Civilian Applications? 802 Sarah Buchman NY
2 But I Did It Unconciously 510 Hallie Hull CA
3 Perhaps It's a Glandular Problem 826 Hallie Hull CA
4 But I Did It Unconciously 510 Klee Hull CA
5 Exchange of Platitudes 201 Klee Hull CA
6 I Blame My Mother 333 Klee Hull CA
7 Perhaps It's a Glandular Problem 826 Klee Hull CA
8 Ask Yor System Administrator 1226 Christian Kells NY
9 Perhaps It's a Glandular Problem 826 Harvey Kellsey CA
Click for answer
sqldf(" select title_name, pages, au_fname, au_lname, authors.state from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id where pages > 150 and (state = 'NY' or state = 'CA') ; -- (parentheses ARE necessary since by default AND is done before OR)")
Question 20
List the authors first and last names, the author’s state, the title name, the publisher name and the publishers state only for those books where the author and publisher are from the same state. (HINT: specify authors.state [note: there is a period between author and state] in the criteria for the publisher’s state).
Using the data shown above, the output should be:
au_fname au_lname state title_name pub_name state
1 Sarah Buchman NY 1977! Abatis Publishers NY
2 Hallie Hull CA Perhaps It's a Glandular Problem AAA Publishing CA
3 Klee Hull CA Perhaps It's a Glandular Problem AAA Publishing CA
4 Harvey Kellsey CA Perhaps It's a Glandular Problem AAA Publishing CA
Click for answer
sqldf(" select au_fname, au_lname, authors.state, title_name, pub_name, publishers.state from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on publishers.pub_id = titles.pub_id where authors.state = publishers.state;")
Question 21
List the title and authors of all books published by “Abatis Publishers” sort the results in alphabetical order by title and then by the au_order column. Do NOT show that au_order column in the output.
Using the data shown above, the output should be:
au_fname au_lname title_name pub_name
1 Sarah Buchman 1977! Abatis Publishers
2 Hallie Hull But I Did It Unconciously Abatis Publishers
3 Klee Hull But I Did It Unconciously Abatis Publishers
4 Klee Hull Exchange of Platitudes Abatis Publishers
5 Wendy Heydemark How About Never? Abatis Publishers
6 Harvey Kellsey Just Wait Until After School Abatis Publishers
7 Harvey Kellsey Kiss My Boo Boo Abatis Publishers
Click for answer
sqldf(" select au_fname, au_lname, title_name, pub_name from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on publishers.pub_id = titles.pub_id where pub_name = 'Abatis Publishers' order by title_name, au_order;")
Question 22
List the publisher name, author name, and title name for all books that have at least 100 pages and that sold at least 200 copies. Sort the results in alphabetical order first by publisher and then by author.
Using the data shown above, the output should be:
pub_name au_fname au_lname title_name
1 AAA Publishing Wendy Heydemark Spontaneous, Not Annoying
2 AAA Publishing Hallie Hull Perhaps It's a Glandular Problem
3 AAA Publishing Klee Hull Perhaps It's a Glandular Problem
4 AAA Publishing Harvey Kellsey Perhaps It's a Glandular Problem
5 Abatis Publishers Sarah Buchman 1977!
6 Abatis Publishers Wendy Heydemark How About Never?
7 Abatis Publishers Hallie Hull But I Did It Unconciously
8 Abatis Publishers Klee Hull But I Did It Unconciously
9 Abatis Publishers Klee Hull Exchange of Platitudes
10 Core Dump Books Christian Kells Ask Yor System Administrator
11 Schandenfreude Press Sarah Buchman What Are The Civilian Applications?
12 Schandenfreude Press Wendy Heydemark I Blame My Mother
13 Schandenfreude Press Klee Hull I Blame My Mother
Click for answer
sqldf(" select pub_name, au_fname, au_lname, title_name from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on publishers.pub_id = titles.pub_id where pages >= 100 and sales >= 200 order by pub_name, au_lname, au_fname;")
Question 23
List authors first and last names and the titles they wrote for publishers whose name (ie. the publisher’s name) begins with “A”, “B” or “C”. Sort the rows in alphabetical order by the publisher. For rows for the same publisher, sort the rows alphabetically by the names of the authors.
Using the data shown above, the output should be:
au_fname au_lname title_name pub_name
1 Wendy Heydemark Not Without My Fabrerge Egg AAA Publishing
2 Wendy Heydemark Spontaneous, Not Annoying AAA Publishing
3 Hallie Hull Perhaps It's a Glandular Problem AAA Publishing
4 Klee Hull Perhaps It's a Glandular Problem AAA Publishing
5 Harvey Kellsey Perhaps It's a Glandular Problem AAA Publishing
6 Sarah Buchman 1977! Abatis Publishers
7 Wendy Heydemark How About Never? Abatis Publishers
8 Hallie Hull But I Did It Unconciously Abatis Publishers
9 Klee Hull But I Did It Unconciously Abatis Publishers
10 Klee Hull Exchange of Platitudes Abatis Publishers
11 Harvey Kellsey Just Wait Until After School Abatis Publishers
12 Harvey Kellsey Kiss My Boo Boo Abatis Publishers
13 Christian Kells Ask Yor System Administrator Core Dump Books
Click for answer
Three different answers.
The only differences between the answers are the WHERE clauses.
Answer 1
select au_fname, au_lname, title_name, pub_name
from authors join title_authors on authors.au_id = title_authors.au_id
join titles on title_authors.title_id = titles.title_id
join publishers on publishers.pub_id = titles.pub_id where substr(pub_name, 1,1) IN ('A','B','C') order by pub_name, au_lname, au_fname
au_fname au_lname title_name pub_name
1 Wendy Heydemark Not Without My Fabrerge Egg AAA Publishing
2 Wendy Heydemark Spontaneous, Not Annoying AAA Publishing
3 Hallie Hull Perhaps It's a Glandular Problem AAA Publishing
4 Klee Hull Perhaps It's a Glandular Problem AAA Publishing
5 Harvey Kellsey Perhaps It's a Glandular Problem AAA Publishing
6 Sarah Buchman 1977! Abatis Publishers
7 Wendy Heydemark How About Never? Abatis Publishers
8 Hallie Hull But I Did It Unconciously Abatis Publishers
9 Klee Hull But I Did It Unconciously Abatis Publishers
10 Klee Hull Exchange of Platitudes Abatis Publishers
11 Harvey Kellsey Just Wait Until After School Abatis Publishers
12 Harvey Kellsey Kiss My Boo Boo Abatis Publishers
13 Christian Kells Ask Yor System Administrator Core Dump Books
Answer 2
select au_fname, au_lname, title_name, pub_name
from authors join title_authors on authors.au_id = title_authors.au_id
join titles on title_authors.title_id = titles.title_id
join publishers on publishers.pub_id = titles.pub_id where substr(pub_name, 1,1) = 'A' or
substr(pub_name, 1,1) = 'B' or
substr(pub_name, 1,1) = 'C' order by pub_name, au_lname, au_fname
au_fname au_lname title_name pub_name
1 Wendy Heydemark Not Without My Fabrerge Egg AAA Publishing
2 Wendy Heydemark Spontaneous, Not Annoying AAA Publishing
3 Hallie Hull Perhaps It's a Glandular Problem AAA Publishing
4 Klee Hull Perhaps It's a Glandular Problem AAA Publishing
5 Harvey Kellsey Perhaps It's a Glandular Problem AAA Publishing
6 Sarah Buchman 1977! Abatis Publishers
7 Wendy Heydemark How About Never? Abatis Publishers
8 Hallie Hull But I Did It Unconciously Abatis Publishers
9 Klee Hull But I Did It Unconciously Abatis Publishers
10 Klee Hull Exchange of Platitudes Abatis Publishers
11 Harvey Kellsey Just Wait Until After School Abatis Publishers
12 Harvey Kellsey Kiss My Boo Boo Abatis Publishers
13 Christian Kells Ask Yor System Administrator Core Dump Books
Answer 3
select au_fname, au_lname, title_name, pub_name
from authors join title_authors on authors.au_id = title_authors.au_id
join titles on title_authors.title_id = titles.title_id
join publishers on publishers.pub_id = titles.pub_id where pub_name like 'A%' or pub_name like 'B%' or pub_name like 'C%' order by pub_name, au_lname, au_fname
au_fname au_lname title_name pub_name
1 Wendy Heydemark Not Without My Fabrerge Egg AAA Publishing
2 Wendy Heydemark Spontaneous, Not Annoying AAA Publishing
3 Hallie Hull Perhaps It's a Glandular Problem AAA Publishing
4 Klee Hull Perhaps It's a Glandular Problem AAA Publishing
5 Harvey Kellsey Perhaps It's a Glandular Problem AAA Publishing
6 Sarah Buchman 1977! Abatis Publishers
7 Wendy Heydemark How About Never? Abatis Publishers
8 Hallie Hull But I Did It Unconciously Abatis Publishers
9 Klee Hull But I Did It Unconciously Abatis Publishers
10 Klee Hull Exchange of Platitudes Abatis Publishers
11 Harvey Kellsey Just Wait Until After School Abatis Publishers
12 Harvey Kellsey Kiss My Boo Boo Abatis Publishers
13 Christian Kells Ask Yor System Administrator Core Dump Books
Question 24
List the publisher name, author’s name and phone number for all authors that have written books for the publisher’s whose name begins with a letter from A through J. (HINT: specify the publisher’s name twice but only show one of them. In the criteria for one of them list >=“A” and in the criteria for the other list <“K”).
sqldf(" select pub_name, au_fname, au_lname, phone from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on publishers.pub_id = titles.pub_id where substr(pub_name,1,1) >= 'A' and substr(pub_name,1,1) < 'K';")
Question 25
List the titles of all books and the number of authors for the book (NOT the actual author names). Sort the results so that the books with the most authors appear on top. All books that have the same number of authors should be listed in alphabetical order.
Using the data shown above, the output should be:
title_name Number of Authors
1 Perhaps It's a Glandular Problem 3
2 But I Did It Unconciously 2
3 I Blame My Mother 2
4 1977! 1
5 200 Years of German Humor 1
6 Ask Yor System Administrator 1
7 Exchange of Platitudes 1
8 How About Never? 1
9 Just Wait Until After School 1
10 Kiss My Boo Boo 1
11 Not Without My Fabrerge Egg 1
12 Spontaneous, Not Annoying 1
13 What Are The Civilian Applications? 1
Click for answer
sqldf(" select title_name, count(*) as 'Number of Authors' from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id group by titles.title_id, title_name order by count(*) desc, title_name;")
Question 26
List the titles of all books that have 2 or more authors and that are longer than 50 pages.
Using the data shown above, the output should be:
title_name Number of Authors
1 Perhaps It's a Glandular Problem 3
2 But I Did It Unconciously 2
3 I Blame My Mother 2
Click for answer
sqldf(" select title_name, count(*) as 'Number of Authors' from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id where pages >= 50 group by titles.title_id, title_name having count(*) >= 2 order by count(*) desc, title_name;")
Question 27
List all publishers and the number of books that the publisher has published (not the actual titles, just the number of books). Sort the list so that the publisher with the most books appears on top.
sqldf(" select pub_name, count(*) as '# of Books' from titles join publishers on publishers.pub_id = titles.pub_id group by publishers.pub_id, pub_name order by count(*) desc;")
Question 28
List the publisher name and total revenue for each for each publisher. (Total revenue for a publisher is the total revenue for all books published by that publisher. Revenue for a book is the number sold times the price of the book.)
sqldf(" select pub_name, sum(sales * price) as 'Total Revenue' from titles join publishers on publishers.pub_id = titles.pub_id group by publishers.pub_id, pub_name ;")
Question 29
Modify the previous query to only list those publishers who have at least ten thousand dollars in total revenue.
Using the data shown above, the output should be:
Warning: Ignoring remaining part of query: having sum(sales * price) >= 10000;
sqldf(" select pub_name, sum(sales * price) as 'Total Revenue' from titles join publishers on publishers.pub_id = titles.pub_id group by publishers.pub_id, pub_name ; having sum(sales * price) >= 10000;")
Question 30
List the title and author of each book. For each author for a particular book, list the amount of the “advance” that (s)he got for that book.
Using the data shown above, the output should be:
title_name au_fname au_lname advance
1 1977! Sarah Buchman 10000
2 200 Years of German Humor Sarah Buchman 1000
3 What Are The Civilian Applications? Sarah Buchman 20000
4 How About Never? Wendy Heydemark 20000
5 I Blame My Mother Wendy Heydemark 1000000
6 Not Without My Fabrerge Egg Wendy Heydemark NA
7 Spontaneous, Not Annoying Wendy Heydemark 50000
8 But I Did It Unconciously Hallie Hull 20000
9 Perhaps It's a Glandular Problem Hallie Hull 100000
10 But I Did It Unconciously Klee Hull 20000
11 Exchange of Platitudes Klee Hull 100000
12 I Blame My Mother Klee Hull 1000000
13 Perhaps It's a Glandular Problem Klee Hull 100000
14 Ask Yor System Administrator Christian Kells 15000
15 Just Wait Until After School Harvey Kellsey 0
16 Kiss My Boo Boo Harvey Kellsey 0
17 Perhaps It's a Glandular Problem Harvey Kellsey 100000
Click for answer
sqldf(" select title_name, au_fname, au_lname, advance from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join royalties on royalties.title_id = titles.title_id;")
Question 31
List the title and author of each book. For each author for a particular book, list the total amount in royalties that (s)he got for that book. Hint: the royalty dollars for a particular book is the “price” times the “sales” times the “royalty_rate” times the “royalty_rate”. Sort the results so that the authors who made the most royalty dollars are listed at the top. (NOTE: make sure to read the database description at the top of this document to understand the fields ROYALTIES.ROYALTY_RATE and the field TITLE_AUTHORS.royalty_rate).
Using the data shown above, the output should be:
title_name au_fname au_lname Total Royalties for Author
1 1977! Sarah Buchman 31.11585
2 200 Years of German Humor Sarah Buchman 687.03012
3 What Are The Civilian Applications? Sarah Buchman 1130.05919
4 How About Never? Wendy Heydemark 1445.33760
5 I Blame My Mother Wendy Heydemark 434750.45900
6 Not Without My Fabrerge Egg Wendy Heydemark NA
7 Spontaneous, Not Annoying Wendy Heydemark 10522.00522
8 But I Did It Unconciously Hallie Hull 1080.85114
9 Perhaps It's a Glandular Problem Hallie Hull 3685.00957
10 But I Did It Unconciously Klee Hull 1080.85114
11 Exchange of Platitudes Klee Hull 11340.06480
12 I Blame My Mother Klee Hull 434750.45900
13 Perhaps It's a Glandular Problem Klee Hull 3685.00957
14 Ask Yor System Administrator Christian Kells 5024.44359
15 Just Wait Until After School Harvey Kellsey 65.52000
16 Kiss My Boo Boo Harvey Kellsey 174.37500
17 Perhaps It's a Glandular Problem Harvey Kellsey 3685.00957
Click for answer
sqldf(" select title_name, au_fname, au_lname, royalty_rate * royalty_rate * sales * price as 'Total Royalties for Author' from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join royalties on royalties.title_id = titles.title_id; ")
Question 32
List the name of each author and the total amount of royalty dollars that they received for all of their books (name this column “RoyaltyDollars”). Hint: the royalty dollars for a particular book is the “price” times the “sales” times the “royalty_rate” times the “royalty_rate”. Sort the results so that the authors who made the most royalty dollars are listed at the top.
Using the data shown above, the output should be:
au_fname au_lname Total Royalties for Author
1 Klee Hull 450856.385
2 Wendy Heydemark 446717.802
3 Christian Kells 5024.444
4 Hallie Hull 4765.861
5 Harvey Kellsey 3924.905
6 Sarah Buchman 1848.205
Click for answer
sqldf(" select au_fname, au_lname, sum(royalty_rate * royalty_rate * sales * price) as 'Total Royalties for Author' from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join royalties on royalties.title_id = titles.title_id group by authors.au_id, au_fname, au_lname order by sum(royalty_rate * royalty_rate * sales * price) desc;")
Question 33
Modify the query from the previous question so that the list only includes authors who made more than $50,000.00 in royalties.
Using the data shown above, the output should be:
au_fname au_lname Total Royalties for Author
1 Klee Hull 450856.4
2 Wendy Heydemark 446717.8
Click for answer
sqldf(" select au_fname, au_lname, sum(royalty_rate * royalty_rate * sales * price) as 'Total Royalties for Author' from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join royalties on royalties.title_id = titles.title_id group by authors.au_id, au_fname, au_lname having sum(royalty_rate * royalty_rate * sales * price) > 50000 order by sum(royalty_rate * royalty_rate * sales * price) desc;")
40.5 Subqueries, outer joins, unions and other stuff …
Question 34
Show the title names and number of pages for those books that are longer than (ie. more pages) the average length of all books.
Using the data shown above, the output should be:
title_name pages
1 Ask Yor System Administrator 1226
2 But I Did It Unconciously 510
3 How About Never? 473
4 Perhaps It's a Glandular Problem 826
5 Spontaneous, Not Annoying 507
6 What Are The Civilian Applications? 802
Click for answer
sqldf(" select title_name , pages from titles where pages >= (select avg(pages) from titles);")
Question 35
Show the publisher’s name and number of authors who have written books for that publisher:
Using the data shown above, the output should be:
Error: ambiguous column name: pub_id
Click for answer
sqldf(" select pub_name, count(DISTINCT authors.au_id) as '# of authors' from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on publishers.pub_id = titles.pub_id group by pub_id, pub_name;")
Error: ambiguous column name: pub_id
Question 36
Show the pub_name and the number of titles published by that publisher. Only include publishers for whom at least 5 authors have worked.
Using the data shown above, the output should be:
pub_name Number of Titles
1 Abatis Publishers 6
Click for answer
sqldf(" select pub_name, count(*) as 'Number of Titles' from publishers as pub1 join titles on pub1.pub_id = titles.pub_id group by pub1.pub_id, pub_name having 5 <= ( select count(DISTINCT authors.au_id) from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers as pub2 on pub2.pub_id = titles.pub_id where pub2.pub_id = pub1.pub_id ) ;")
Question 37
Show the names of publishers who did not publish any books (I guess they are just getting started )
Using the data shown above, the output should be:
pub_name
1 Tneterhooks Press
Click for answer
ANSWER – with a subquery
sqldf(" select pub_name from publishers as pub1 where not exists ( select * from publishers as pub2 join titles on pub2.pub_id = titles.pub_id where pub2.pub_id = pub1.pub_id )")
pub_name
1 Tneterhooks Press
Click for answer
ANSWER – with a left join
sqldf(" select pub_name from publishers left join titles on publishers.pub_id = titles.pub_id where title_id is NULL")
pub_name
1 Tneterhooks Press
Question 38
List the name of each publisher and the total number of books that each publisher has published. Sort the results so that the publishers who published the most books appear at the top. If two or more publishers published the same number of books then they should be listed in alphabetical order.
The answer should only include publishers who have published some books.
sqldf(" select pub_name, count(*) from titles join publishers on publishers.pub_id = titles.pub_id group by pub_name order by count(*) desc, pub_name;")
Same as previous question but this time make sure to also include publishers who have published zero books
sqldf(" select pub_name, count(titles.title_id) from publishers left join titles on publishers.pub_id = titles.pub_id group by pub_name order by count(*) desc, pub_name;")
sqldf(" select pub_name , 0 as NumTitles from publishers left join titles on publishers.pub_id = titles.pub_id where title_id is NULL UNION select pub_name, count(*) as NumTitles from titles join publishers on publishers.pub_id = titles.pub_id group by pub_name;")
Show the names of the authors and the publishers who published their books. Only show those authors/publishers where the author lives in the same state as the publisher who published the book. Do not show any author/pbulisher names twice.
Using the data shown above, the output should be:
au_fname au_lname pub_name state state
1 Sarah Buchman Abatis Publishers NY NY
2 Hallie Hull AAA Publishing CA CA
3 Klee Hull AAA Publishing CA CA
4 Harvey Kellsey AAA Publishing CA CA
Click for answer
sqldf("select distinct au_fname, au_lname, pub_name, authors.state, publishers.statefrom authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_idwhere authors.state = publishers.state ")
Question 40
The exchange rate for nis/usd is 3.3.
Write a query that shows the titles and prices for biographies that cost between 10 and 20 dollars. Show two columns for the prices. One column for USD and one column for NIS. Name the columns “price in USD” and “price in NIS”. Sort the results so that the most expensive books are displayed first.
Using the data shown above, the output should be:
title_name price in NIS price in USD type
1 How About Never? 65.835 19.95 biography
2 Spontaneous, Not Annoying 42.867 12.99 biography
Click for answer
sqldf("select title_name, price * 3.3 as 'price in NIS', price as 'price in USD', typefrom titleswhere type = 'biography' and price >= 10 and price <= 20order by price desc ")
Question 41
Show those titles for which the price of the book is greater than the average price of all books and the number of pages of the book is less than the average number of pages for all books.
Using the data shown above, the output should be:
title_name price pages
1 1977! 21.99 107
2 200 Years of German Humor 19.95 14
3 I Blame My Mother 23.95 333
Click for answer
sqldf("select title_name, price, pagesfrom titleswhere price > (select avg(price) from titles) and pages < (select avg(pages) from titles) ")
The answer above uses the following two “non-correlated” subqueries. It might help your understanding of the answer above to see the output of the subqueries when they are run by themselves.
# The average price of all bookssqldf("select avg(price) from titles")
avg(price)
1 18.3875
# The average number of pages of all bookssqldf("select avg(pages) from titles")
avg(pages)
1 425.5833
Question 42
Show the authors who have not written any books
Using the data shown above, the output should be:
au_fname au_lname
1 Paddy O'Furniture
Click for answer
sqldf("select au_fname, au_lnamefrom authors left join title_authors on authors.au_id = title_authors.au_idwhere title_authors.au_id is null")
Question 43
In each row show an author’s name and the number of books that author wrote.
Only show those authors who wrote at least two books.
Sort the results alphabetically (last name, first name).
sqldf("select au_fname, au_lname, count(*)from authors join title_authors on authors.au_id = title_authors.au_idgroup by au_fname, au_lnamehaving count(*) >= 2 ")
Question 44
Same as previous question. However, this time, only show those authors who wrote biographies, the count of the number of biographies they wrote and only show those authors who wrote at least 2 biographies.
Using the data shown above, the output should be:
au_fname au_lname count(*)
1 Wendy Heydemark 4
Click for answer
sqldf("select au_fname, au_lname, count(*)from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_idwhere type = 'biography'group by au_fname, au_lnamehaving count(*) >= 2")
Question 45 - a
Write a query to show the books in the titles table for which both the author and the publisher have a higher than average price for books they have worked on. Sort the rows alphabetically by publisher name. Sort the rows that have the same publisher name by the author’s name.
Using the data shown above, the output should be:
pub_name au_fname au_lname title_name
1 Core Dump Books Christian Kells Ask Yor System Administrator
2 Schandenfreude Press Sarah Buchman 200 Years of German Humor
3 Schandenfreude Press Sarah Buchman What Are The Civilian Applications?
4 Schandenfreude Press Wendy Heydemark I Blame My Mother
Click for answer - using subqueries in WHERE
sqldf("select pub_name, au_fname, au_lname, title_namefrom authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_idwhere authors.au_id in ( select authors.au_id from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id group by authors.au_id having avg(price) > (select avg(price) from titles) ) and publishers.pub_id in ( select publishers.pub_id from titles join publishers on titles.pub_id = publishers.pub_id group by publishers.pub_id having avg(price) > (select avg(price) from titles) )order by pub_name, au_lname, au_fname")
pub_name au_fname au_lname title_name
1 Core Dump Books Christian Kells Ask Yor System Administrator
2 Schandenfreude Press Sarah Buchman 200 Years of German Humor
3 Schandenfreude Press Sarah Buchman What Are The Civilian Applications?
4 Schandenfreude Press Wendy Heydemark I Blame My Mother
Click for answer - using subqueries in FROM
sqldf("select pub_name, au_fname, au_lname, title_namefrom ( select authors.* from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id group by authors.au_id having avg(price) > (select avg(price) from titles) ) as highEndAuthors join title_authors on highEndAuthors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join ( select publishers.* from titles join publishers on titles.pub_id = publishers.pub_id group by publishers.pub_id having avg(price) > (select avg(price) from titles) ) as highEndPublishers on titles.pub_id = highEndPublishers.pub_idorder by pub_name, au_lname, au_fname")
pub_name au_fname au_lname title_name
1 Core Dump Books Christian Kells Ask Yor System Administrator
2 Schandenfreude Press Sarah Buchman 200 Years of German Humor
3 Schandenfreude Press Sarah Buchman What Are The Civilian Applications?
4 Schandenfreude Press Wendy Heydemark I Blame My Mother
Click for answer - using Common Table Expressions (CTE)
You can modify either of the previous queries to use CTE instead of subqueries. The following modifies the previous query.
sqldf("with highEndAuthors as ( select authors.* from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id group by authors.au_id having avg(price) > (select avg(price) from titles) ), highEndPublishers as ( select publishers.* from titles join publishers on titles.pub_id = publishers.pub_id group by publishers.pub_id having avg(price) > (select avg(price) from titles) )select pub_name, au_fname, au_lname, title_namefrom highEndAuthors join title_authors on highEndAuthors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join highEndPublishers on titles.pub_id = highEndPublishers.pub_idorder by pub_name, au_lname, au_fname")
pub_name au_fname au_lname title_name
1 Core Dump Books Christian Kells Ask Yor System Administrator
2 Schandenfreude Press Sarah Buchman 200 Years of German Humor
3 Schandenfreude Press Sarah Buchman What Are The Civilian Applications?
4 Schandenfreude Press Wendy Heydemark I Blame My Mother
Question 45 - b
A literary agent is trying to connect authors with publishers who have not worked together before. Write a SQL query to help the agent identify these potential matches. Each row of your output should show the name of a publisher and the name of an author who have NOT worked together yet.
Sort the rows alphabetically by publisher name. For rows that have the same publisher name, sort them by the author’s name.
Click for answer - Method 1: Using EXCEPT
sqldf("-- All existing and non-existing publisher/author relationshipsSELECT publishers.pub_name, authors.au_fname, authors.au_lname FROM authors CROSS JOIN publishersEXCEPT-- Only the existing publisher/author relationshipsSELECT publishers.pub_name, authors.au_fname, authors.au_lname FROM authors JOIN title_authors on authors.au_id = title_authors.au_id JOIN titles on title_authors.title_id = titles.title_id JOIN publishers on titles.pub_id = publishers.pub_idORDER BY pub_name, au_lname, au_fname")
sqldf("SELECT publishers.pub_id, pub_name, authors.au_id, au_fname, au_lnameFROM authors CROSS JOIN publishers WHERE NOT EXISTS ( -- NOTE 1 - This is a correlated subquery. -- -- NOTE 2 - (unrelated to Note 1) With an EXISTS or NOT EXISTS subquery, -- it doesn't make a difference what data is returned by the subquery. -- As long as ANYTHING is returned the query is considered to 'EXIST' -- Threfore it's more efficient to return as little info as possilble. -- SELECT 1 just returns the number 1 for each row that matches as -- opposed to returning actual data from the tables. The query would -- work in the exact same way if we said SELECT * (or anything else) -- it's just for efficiency that we say SELECT 1. SELECT 1 FROM titles JOIN title_authors ON titles.title_id = title_authors.title_id WHERE title_authors.au_id = authors.au_id AND titles.pub_id = publishers.pub_id)ORDER BY pub_name, au_lname, au_fname;")
(continued from the previous question) The literary agent became more selective. He now wants to focus on introducing only those authors who command a higher than average price for their books with publishers who command a higher than average price for their books.
Write a SQL query to help the agent identify these potential matches. Each row of your output should show the name of a publisher and the name of an author who have NOT worked together yet. The publishers should be those whose average book price is above the average price of all books. Similarly, the authors should be those whose average book price is above the average price of all books.
Sort the rows alphabetically by publisher name. For rows that have the same publisher name, sort them by the author’s name.
Using the data shown above, the output should be:
pub_name au_fname au_lname
1 Core Dump Books Sarah Buchman
2 Core Dump Books Wendy Heydemark
3 Schandenfreude Press Christian Kells
This is a relatively complex query. It might help to build up your answer step by step.
Click here for some hints
In order to check your work, it might be helpful to know the following:
The average selling price of all books is:
sqldf("select avg(price) from titles")
avg(price)
1 18.3875
The average selling price of books by different publishers are: (The rows are sorted in decreasing order by average price.)
sqldf("select round(avg(price),2) as 'avg price of publishers books', pub_namefrom publishers join titles on publishers.pub_id = titles.pub_idgroup by publishers.pub_id, pub_nameorder by avg(price) desc, pub_name")
Therefore the authors who are potential introduction candidates are those whose avg book price is higher than average, ie.
sqldf("select au_fname, au_lnamefrom authors join title_authors on authors.au_id = title_authors. au_id join titles on title_authors.title_id = titles.title_idgroup by authors.au_id, au_fname, au_lnamehaving avg(price) > (select avg(price) from titles)order by avg(price) desc, au_lname, au_fname")
au_fname au_lname
1 Christian Kells
2 Sarah Buchman
3 Wendy Heydemark
Click for answer - using EXCEPT with subqueries
Answer using EXCEPT with subqueries
sqldf("-- These are the potential matchups for highEnd authors/publishersselect pub_name, au_fname, au_lnamefrom ( select authors.au_id, au_fname, au_lname from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id group by authors.au_id, au_fname, au_lname having avg(price) > (select avg(price) from titles) ) as highEndAuthors cross join ( select publishers.pub_id, pub_name from publishers join titles on publishers.pub_id = titles.pub_id group by publishers.pub_id, pub_name having avg(price) > (select avg(price) from titles) ) as highEndPublishersEXCEPT-- These are the existing publisher/author relationshipsselect pub_name , au_fname, au_lnamefrom authors join title_authors on authors.au_id = title_authors. au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_idorder by pub_name, au_lname, au_fname")
pub_name au_fname au_lname
1 Core Dump Books Sarah Buchman
2 Core Dump Books Wendy Heydemark
3 Schandenfreude Press Christian Kells
Click for answer - using EXCEPT with Common Table Expressions (CTE)
pub_name au_fname au_lname
1 Core Dump Books Sarah Buchman
2 Core Dump Books Wendy Heydemark
3 Schandenfreude Press Christian Kells
Click for answer - using NOT EXISTS with subqueries
sqldf("select pub_name, au_fname, au_lnamefrom ( select authors.au_id, au_fname, au_lname from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id group by authors.au_id, au_fname, au_lname having avg(price) > (select avg(price) from titles) ) as highEndAuthors cross join ( select publishers.pub_id, pub_name from publishers join titles on publishers.pub_id = titles.pub_id group by publishers.pub_id, pub_name having avg(price) > (select avg(price) from titles) ) as highEndPublisherswhere not exists ( -- correlated subquery select 1 from authors join title_authors on authors.au_id = title_authors. au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_id where publishers.pub_id = highEndPublishers.pub_id and authors.au_id = highEndAuthors.au_id )order by pub_name, au_lname, au_fname")
pub_name au_fname au_lname
1 Core Dump Books Sarah Buchman
2 Core Dump Books Wendy Heydemark
3 Schandenfreude Press Christian Kells
Click for answer - using NOT EXISTS with Common Table Expressions (CTE)**
sqldf("with highEndAuthors as ( select authors.au_id, au_fname, au_lname from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id group by authors.au_id, au_fname, au_lname having avg(price) > (select avg(price) from titles) ), highEndPublishers as ( select publishers.pub_id, pub_name from publishers join titles on publishers.pub_id = titles.pub_id group by publishers.pub_id, pub_name having avg(price) > (select avg(price) from titles) ), existingRelationships as ( -- correlated subquery select 1 from authors join title_authors on authors.au_id = title_authors. au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_id where publishers.pub_id = highEndPublishers.pub_id and authors.au_id = highEndAuthors.au_id )select pub_name, au_fname, au_lnamefrom highEndAuthors cross join highEndPublisherswhere not exists ( select 1 from existingRelationships )order by pub_name, au_lname, au_fname")
pub_name au_fname au_lname
1 Core Dump Books Sarah Buchman
2 Core Dump Books Wendy Heydemark
3 Schandenfreude Press Christian Kells
Question 46
Show the names of the most prolific authors, i.e. the authors who have written the most books. Note that there could be several different authors who are “tied” for the most number of books.
Sort the names in alphabetical order (last name, first name) Also show the au_id and the number of titles that the author wrote.
This is a rather complex answer. This approach uses a subquery within a subquery. There may be other approaches too. There may be simpler answers too.
To explain we will show the subqueries and what they produce. Then we will show the full query that includes the subquery.
The following query is NOT the answer. It is simply a query that shows the number of books that were written by each author. This will be used as a subquery in the answer shown below.
sqldf(" select authors.au_id, count(*) as numTitles from authors join title_authors on authors.au_id = title_authors.au_id group by authors.au_id, au_fname, au_lname")
We’d like to get the max number of titles that an author wrote. This amounts to getting the highest number from the numTitles column in the previous query. We can do that by using the above query as a subquery in the from clause.
sqldf(" select max(numTitles2) from (select authors.au_id, count(*) as numTitles2 from authors join title_authors on authors.au_id = title_authors.au_id group by authors.au_id, au_fname, au_lname) ")
max(numTitles2)
1 4
Finally we can put this all together to answer our question. The following is the FULL ANSWER to the original question. It uses the above query as a subquery in the having clause (see the code below). Note that in the following code there is a subquery inside of a subquery.
FINAL ANSWER:
sqldf("select authors.au_id, au_fname, au_lname, count(*) as numTitlesfrom authors join title_authors on authors.au_id = title_authors.au_idgroup by authors.au_id, au_fname, au_lnamehaving numTitles = (select max(numTitles2) from (select authors.au_id, count(*) as numTitles2 from authors join title_authors on authors.au_id = title_authors.au_id group by authors.au_id, au_fname, au_lname))order by au_lname, au_fname")
Show all topics and the authors who wrote books on those topics. Sort the ouput in alphabetical order by topic and within each topic in alphabetical order by the author. The first name and last name of each author should be combined into a single column called ‘author’.
Using the data shown above, the output should be:
type author
1 biography Wendy Heydemark
2 biography Klee Hull
3 children Harvey Kellsey
4 computer Christian Kells
5 history Sarah Buchman
6 psychology Hallie Hull
7 psychology Klee Hull
8 psychology Harvey Kellsey
Click for answer
sqldf(" select type, au_fname || ' ' || au_lname as author from titles join title_authors on titles.title_id = title_authors.title_id join authors on title_authors.au_id = authors.au_id group by type, authors.au_id, au_fname, au_lname order by type, au_lname, au_fname;")
Question 48
Answer all parts of this question
Show all authors who wrote at least one biography or a history book for Abatis Publishers. Do NOT show an author’s name more than once even if they wrote several books.
Using the data shown above, the output should be:
au_fname au_lname
1 Sarah Buchman
2 Wendy Heydemark
Click for answer
sqldf(" select distinct au_fname, au_lname from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_id where publishers.pub_name = 'Abatis Publishers' and (titles.type = 'biography' or titles.type = 'history') order by au_lname, au_fname")
Show all authors who wrote at least one psychology book for Abatis Publishers. Do NOT show an author’s name more than once even if they wrote several books.
Using the data shown above, the output should be:
au_fname au_lname
1 Hallie Hull
2 Klee Hull
Click for answer
sqldf(" select distinct au_fname, au_lname from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_id where publishers.pub_name = 'Abatis Publishers' and titles.type = 'psychology' order by au_lname, au_fname")
Abatis Publishers believes publishing a book on the history of psychology during Sigmund Freud’s era would be a worthwhile endeavor. They think it would be best to get two authors to collaborate on the book - one author who has a psychology background and one author who has a history or biography background.
In order to brainstorm which authors might work together well, Abatis Publishers would like to see a list of each author who wrote a history or biography book with them matched up with each author who wrote a pschology book with them. The publisher will then look through the list and think about which combination of authors might make the most sense - then they’ll try to contact the chosen authors and persuade them to take on the project.
Write a query that returns two columns.
The first column should be named ‘history/biography author’ and contain names of authors who wrote a history or biography book with Abatis (combine the authors first and last names into a single column).
The second column should be named ‘psychology author’ and contain names of authors who wrote a psychology book with Abatis.
The output should show every combination of a history/biogrphy author who wrote for Abatis with a psychology author who wrote for Abatis.
sqldf(" select hist_bio_authors.au_fname || ' ' || hist_bio_authors.au_lname as 'history/biography author', psych_authors.au_fname || ' ' || psych_authors.au_lname as 'psychology author' from ( -- this is the query from part (a) select distinct au_fname, au_lname from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_id where publishers.pub_name = 'Abatis Publishers' and (titles.type = 'biography' or titles.type = 'history') order by au_lname, au_fname ) as hist_bio_authors cross join ( -- this is the query from part (b) select distinct au_fname, au_lname from authors join title_authors on authors.au_id = title_authors.au_id join titles on title_authors.title_id = titles.title_id join publishers on titles.pub_id = publishers.pub_id where publishers.pub_name = 'Abatis Publishers' and titles.type = 'psychology' order by au_lname, au_fname ) as psych_authors")