40  Database Questions (and Answers)

The questions below require you to write SQL code. The questions refer to the Books Database which is fully described fully on this page. However for your convenience, the info from that page was also copied below.

The rows of data shown for the various tables is the data as it appears in the datatabase today. However, with any database, you must consider that the data might change in the future. For example, new rows might be added, prices might change, data may be removed, etc.

The code in the answer for each question MUST WORK even if data were to be different than the current data in the database (i.e. do NOT write the answer such that it only works for the existing data - the code should work no matter what data is in the tables as long as the tables continue to have the same structure as they do now.

Click here to hide/show this section

40.1 The “Books Database”

The “books database” is a collection of several tables that contains data relating to books (i.e. “titles”), authors, publishers, etc.

You can download the data for the database in the following ways:

  • as a set of several CSV files or (see below)

  • as a single sqlite3 database file (see below)

See below for details.

40.1.1 SQLite file for “Books Database”

Click here to download all the data for the books datbase as a single “SQLite” database file.

To use this file you can do one of the following:

  1. Install SQLite software on your computer. There are many such programs avaiable. I recommend “DB Viewer for SQLite” which is a free open source program.

    Website (download from here): https://sqlitebrowser.org/

    Github repository: https://github.com/sqlitebrowser/sqlitebrowser

    You can search online for other similar programs for working with sqlite3 files.

  2. Free web based tool: https://y-rosenthal.github.io/yrSqliteViewer/yrSqliteViewer.html

    This is a website that lets you upload a sqlite3 file and run SQL Select statements against the data. I created this website myself (with the help of some AI coding assistants).

    PROs: you don’t need to install anything

    CONs: currently only works with SQL SELECT statment but not any other types of sql statements.

  3. See this chapter (in this book) https://y-rosenthal.github.io/DataManagementUsingR/sql0950-workingWithDatabasesInR-v010.html for instructions on how to access this database file directly via R commands.

40.1.2 CSV files for “Books Database”

Download the CSV files for the “books database”. These CSV files together comprise the data for the “books database”.

Download a tutorial on using Relational databases.

This word document contains:

  1. a complete description of the “books database”
  2. a tutorial on how “relational databases” are structured
  3. a tutorial on Entity Relationship Diagrams (ERDs)
  4. a set of questions and answers

Once you’ve downloaded the various CSV files you can run the following commands to import the data into R.

40.1.3 Import the data (books database)

if(!require(readr)){install.packages("readr");require(readr);}
Loading required package: readr
# Read in the data for the books database - see the 
titles = read_csv("titles.csv", na="NULL", show_col_types=FALSE)
authors = read_csv("authors.csv", na="NULL", show_col_types=FALSE)
publishers = read_csv("publishers.csv", na="NULL", show_col_types=FALSE)
title_authors = read_csv("title_authors.csv", na="NULL", show_col_types=FALSE)
royalties = read_csv("royalties.csv", na="NULL", show_col_types=FALSE)

40.1.4 Books Database - Entity Relationship Diagram (ERD)

The following is an “Entity Relationship Diagram (ERD)” that describes the relationships between the tables. See the Word document linked above for more info:

%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#ffffff', 'primaryTextColor': '#0000ff', 'primaryBorderColor': '#000000', 'lineColor': '#000000', 'secondaryColor': '#ffffff', 'tertiaryColor': '#ffffff', 'mainBkg': '#ffffff', 'background': '#ffffff', 'edgeLabelBackground': '#ffffff' }}}%%
erDiagram
    authors ||--o{ title_authors : ""
    title_authors }|--|| titles : ""
    titles ||--o| royalties : ""
    publishers |o--o{ titles : ""
    
    authors {
        string au_id PK
        string au_fname
        string au_lname
        string phone
        string address
        string city
        string state
        string zip
    }
    
    title_authors {
        string title_id FK1
        string au_id FK2
        int au_order
        float royalty_share
    }
    
    titles {
        string title_id PK
        string title_name
        string type
        string pub_id FK
        int pages
        float price
        int sales
        date pubdate
    }
    
    publishers {
        string pub_id PK
        string pub_name
        string city
        string state
        string country
    }
    
    royalties {
        string title_id PK
        float advance
        float royalty_rate
    }

Below are the contents of these tables:

sqldf("select * from titles")
   title_id                          title_name       type pub_id pages price   sales         pubdate
1       T01                               1977!    history    P01   107 21.99     566   8/1/2000 0:00
2       T02           200 Years of German Humor    history    P03    14 19.95    9566   4/1/1998 0:00
3       T03        Ask Yor System Administrator   computer    P02  1226 39.95   25667   9/1/2000 0:00
4       T04           But I Did It Unconciously psychology    P01   510 12.99   13001  5/31/1999 0:00
5       T05              Exchange of Platitudes psychology    P01   201  6.95  201440   1/1/2001 0:00
6       T06                    How About Never?  biography    P01   473 19.95   11320  7/31/2000 0:00
7       T07                   I Blame My Mother  biography    P03   333 23.95 1500200  10/1/1999 0:00
8       T08        Just Wait Until After School   children    P01    86 10.00    4095   6/1/2001 0:00
9       T09                     Kiss My Boo Boo   children    P01    22 13.95    5000  5/31/2002 0:00
10      T10         Not Without My Fabrerge Egg  biography    P05    NA    NA      NA            <NA>
11      T11    Perhaps It's a Glandular Problem psychology    P05   826  7.99   94123 11/30/2000 0:00
12      T12           Spontaneous, Not Annoying  biography    P05   507 12.99  100001  8/31/2000 0:00
13      T13 What Are The Civilian Applications?    history    P03   802 29.99   10467  5/31/1999 0:00
sqldf("select * from publishers")
  pub_id             pub_name          city state country
1    P01    Abatis Publishers      New York    NY     USA
2    P02      Core Dump Books San Francisco    CA     USA
3    P03 Schandenfreude Press       Hamburg  <NA> Germany
4    P04    Tneterhooks Press      Berkeley    CA     USA
5    P05       AAA Publishing      Berkeley    CA     USA
sqldf("select * from title_authors")
   title_id au_id au_order royalty_shares
1       T01   A01        1            1.0
2       T02   A01        1            1.0
3       T03   A05        1            1.0
4       T04   A03        1            0.6
5       T04   A04        2            0.4
6       T05   A04        1            1.0
7       T06   A02        1            1.0
8       T07   A02        1            0.5
9       T07   A04        2            0.5
10      T08   A06        1            1.0
11      T09   A06        1            1.0
12      T10   A02        1            1.0
13      T11   A03        2            0.3
14      T11   A04        3            0.3
15      T11   A06        1            0.4
16      T12   A02        1            1.0
17      T13   A01        1            1.0
sqldf("select * from authors")
  au_id  au_fname    au_lname        phone              address          city state   zip
1   A01     Sarah     Buchman 718-496-7223       75 West 205 St         Bronx    NY 10468
2   A02     Wendy   Heydemark 303-986-7020     2922 Baseline Rd       Boulder    CO 80303
3   A03    Hallie        Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123
4   A04      Klee        Hull 415-549-4278 3800 Waldo Ave, #14F San Francisco    CA 94123
5   A05 Christian       Kells 212-771-4680       114 Horatio St      New York    NY 10014
6   A06    Harvey     Kellsey 650-836-7128       390 Serra Mall     Palo Alto    CA 94305
7   A07     Paddy O'Furniture 941-925-0752         1442 Main St      Sarasota    FL 34236
sqldf("select * from royalties")
   title_id advance royalty_rate
1       T01   10000         0.05
2       T02    1000         0.06
3       T03   15000         0.07
4       T04   20000         0.08
5       T05  100000         0.09
6       T06   20000         0.08
7       T07 1000000         0.11
8       T08       0         0.04
9       T09       0         0.05
10      T10      NA           NA
11      T11  100000         0.07
12      T12   50000         0.09
13      T13   20000         0.06

40.1.5 Single Table queries

Question 1

List all books (title and number sold) that sold more than 100 copies. List the books with the most sales at the top.

Using the data shown above, the output should be:

                            title_name   sales
1                    I Blame My Mother 1500200
2               Exchange of Platitudes  201440
3            Spontaneous, Not Annoying  100001
4     Perhaps It's a Glandular Problem   94123
5         Ask Yor System Administrator   25667
6            But I Did It Unconciously   13001
7                     How About Never?   11320
8  What Are The Civilian Applications?   10467
9            200 Years of German Humor    9566
10                     Kiss My Boo Boo    5000
11        Just Wait Until After School    4095
sqldf("
        select title_name, sales
        from titles
        where sales > 1000
        order by sales desc;
")

Question 2

List all authors who are live either in NY or CA and whose last name begins with a “K”.

Using the data shown above, the output should be:

   au_fname au_lname
1 Christian    Kells
2    Harvey  Kellsey
sqldf("
  -- NOTE: 
  --  parentheses are required in the WHERE 
  --  since AND is done before OR by default)
    select au_fname, au_lname
    from authors
    where (state = 'NY' or state='CA')  and  substr(au_lname,1,1) = 'K';   
")

Question 3

List the first and last names of all authors whose last name starts with a letter from A through J
(HINT: In the where clause, make sure that the first letter of the last name is both >=“A” and also <“K”) (HINT: another possible solution is to use the LIKE several times - once for A, once for B, once for C, etc up to J. Each LIKE should be separated from the others by OR’s)

Using the data shown above, the output should be:

  au_fname  au_lname
1    Sarah   Buchman
2    Wendy Heydemark
3   Hallie      Hull
4     Klee      Hull

ONE ANSWER:

sqldf("
        select au_fname, au_lname
        from authors
        where substr(au_lname,1,1) >= 'A' and substr(au_lname,1,1) < 'K';
")

ANOTHER ANSWER:

sqldf("
    select au_fname, au_lname
    from authors
    where   au_lname like 'A%' or au_lname like 'B%' or au_lname like 'C%' or 
                          au_lname like 'D%' or au_lname like 'E%' or au_lname like 'F%' or 
                          au_lname like 'G%' or au_lname like 'H%' or au_lname like 'I%' or 
                          au_lname like 'J%'
")

Question 4

For each author, show their first initial, followed by a period and a space, followed by their last name. In the 2nd column show the author’s state. Show the column heading for the first column as ‘AuthorName’. Order the results in alphabetical order based on the full name of the person.

Using the data shown above, the output should be:

      AuthorName state
1     S. Buchman    NY
2   W. Heydemark    CO
3        H. Hull    CA
4        K. Hull    CA
5       C. Kells    NY
6     H. Kellsey    CA
7 P. O'Furniture    FL
sqldf("select substr(au_fname,1,1) || '. ' || au_lname  as   'AuthorName',  state
         from authors
         order by au_lname, au_fname;
")

Question 5

Show the titles of all books. Also show the length of the title, i.e. how many characters, including spaces and punctuation. Display the 2nd column with the name ‘TitleLength’. Sort the output so that the shortest titles are listed first. If two titles are the same length then sort those titles alphabetically.

Using the data shown above, the output should be:
                            title_name TitleLength
1                                1977!           5
2                      Kiss My Boo Boo          15
3                     How About Never?          16
4                    I Blame My Mother          17
5               Exchange of Platitudes          22
6            200 Years of German Humor          25
7            But I Did It Unconciously          25
8            Spontaneous, Not Annoying          25
9          Not Without My Fabrerge Egg          27
10        Ask Yor System Administrator          28
11        Just Wait Until After School          28
12    Perhaps It's a Glandular Problem          32
13 What Are The Civilian Applications?          35
sqldf("
        select title_name, length(title_name) as 'TitleLength'
        from titles
        order by length(title_name), title_name;
")

40.1.6 Single table queries with aggregate functions but no “group by” (i.e. these will return EXACTLY one row for each query).

Question 6

List the average price of all books.

Using the data shown above, the output should be:

  avg(price)
1    18.3875
sqldf("
        select avg(price) 
        from titles;
")

Question 7

List the average price of history books.

Using the data shown above, the output should be:

  avg(price)
1   23.97667
sqldf("
        select avg(price) 
        from titles
        where type = 'history';
")

Question 8

List the number of pages in the longest and shortest books (don’t list the actual title of the book).

Using the data shown above, the output should be:

  min(pages) max(pages)
1         14       1226
sqldf("
        select min(pages) , max(pages) from titles
")

40.1.7 Single table queries with calculated values

Question 9

List the title_name and total revenue for each book. (Revenue for a book is the number sold times the price of the book.)

Using the data shown above, the output should be:

                            title_name Total  Revenue
1                                1977!       12446.34
2            200 Years of German Humor      190841.70
3         Ask Yor System Administrator     1025396.65
4            But I Did It Unconciously      168882.99
5               Exchange of Platitudes     1400008.00
6                     How About Never?      225834.00
7                    I Blame My Mother    35929790.00
8         Just Wait Until After School       40950.00
9                      Kiss My Boo Boo       69750.00
10         Not Without My Fabrerge Egg             NA
11    Perhaps It's a Glandular Problem      752042.77
12           Spontaneous, Not Annoying     1299012.99
13 What Are The Civilian Applications?      313905.33
sqldf("
        select title_name,   sales * price   as   'Total  Revenue'
        from titles;
")

Question 10

List title of each book and the sale price for the book where the sale price is 10% off of the original price.

Using the data shown above, the output should be:

                            title_name Sale Price
1                                1977!     19.791
2            200 Years of German Humor     17.955
3         Ask Yor System Administrator     35.955
4            But I Did It Unconciously     11.691
5               Exchange of Platitudes      6.255
6                     How About Never?     17.955
7                    I Blame My Mother     21.555
8         Just Wait Until After School      9.000
9                      Kiss My Boo Boo     12.555
10         Not Without My Fabrerge Egg         NA
11    Perhaps It's a Glandular Problem      7.191
12           Spontaneous, Not Annoying     11.691
13 What Are The Civilian Applications?     26.991
sqldf("
        select title_name, price * 0.90 as 'Sale Price'
        from titles;
")

40.2 Single table queries with “group by”.

These can use aggregate functions but will return at most one row in the output for each “group” of rows as defined in the “group by”

Question 11

For each “type” of book (e.g. biography, children, etc) list the number of pages in the shortest book of that type and the number of pages in the longest book of that type. Sort the results alphabetically by the type of book.

Using the data shown above, the output should be:

        type Length of Shortest Length of Longest
1  biography                333               507
2   children                 22                86
3   computer               1226              1226
4    history                 14               802
5 psychology                201               826
sqldf("
    select type, min(pages) AS 'Length of Shortest' ,  max(pages)  as 'Length of Longest'
        from titles
        group by type
        order by type;
")

Question 12

List the types of books (e.g. history, biography, etc) and the average price of those books for which the average price of books in that category is at least $12.00.

Using the data shown above, the output should be:

       type avg(price)
1 biography   18.96333
2  computer   39.95000
3   history   23.97667
sqldf("
        select type, avg(price)
        from titles
        group by type
        having avg(price) >= 12;
")

Question 13

For each type of book (e.g. biography, children, etc), show the number of those books that are 450 pages or longer. The column heading should be ‘# of long books’. If there are no titles of a particular type that are so long, then don’t show that type at all.

Using the data shown above, the output should be:

        type # of long books
1  biography               2
2   computer               1
3    history               1
4 psychology               2
sqldf("
        select type, count(*) as '# of long books'
        from titles
        where pages >= 450
        group by type;
")

Question 14

Modify the answer to the previous question so that only those types for which there are at least 2 long books are displayed.

Using the data shown above, the output should be:

        type # of long books
1  biography               2
2 psychology               2

NOTE: the only change to the above answer is the addition of the HAVING clause at the end.

sqldf("
        select type, count(*) as '# of long books'
        from titles
        where pages >= 450
        group by type
        having count(*) >= 2        -- this is the only change.
")

40.3 Multiple table queries with group by

Question 15

For each publisher, list the name of the publisher and total number of pages that the publisher has published in all their books (ie. add up all the pages in all books for each publisher). Show the publishers who have published the most pages at the top of the result list.

Using the data shown above, the output should be:

              pub_name sum(pages)
1    Abatis Publishers       1399
2       AAA Publishing       1333
3      Core Dump Books       1226
4 Schandenfreude Press       1149
sqldf("
        select pub_name, sum(pages)
        from publishers join titles on publishers.pub_id = titles.pub_id
        group by pub_name
        order by sum(pages) desc;
")

Question 16

Modify the previous query so that only publishers who have published at least 1250 pages will show up.

Using the data shown above, the output should be:

           pub_name sum(pages)
1 Abatis Publishers       1399
2    AAA Publishing       1333
sqldf("
        select pub_name, sum(pages)
        from publishers join titles on publishers.pub_id = titles.pub_id
        group by pub_name
        having  sum(pages)  >= 1250      -- THIS LINE IS THE ONLY CHANGE 
        order by sum(pages) desc;
")

Question 17

List each publisher’s name and the numbers of pages in their longest and shortest books. Sort the results by the publisher’s name.

Using the data shown above, the output should be:

              pub_name Length of Shortest Length of Longest
1       AAA Publishing                507               826
2    Abatis Publishers                 22               510
3      Core Dump Books               1226              1226
4 Schandenfreude Press                 14               802
sqldf("
        select pub_name, min(pages) AS 'Length of Shortest' ,  max(pages)  as 'Length of Longest'
        from titles join publishers on titles.pub_id = publishers.pub_id
        group by pub_name
        order by pub_name;
")

40.4 Multiple table queries

Question 18

List the authors first and last names, the author’s state, the title name, the publisher name and the publishers state for all books.

Using the data shown above, the output should be:

    au_fname  au_lname state                          title_name             pub_name state
1      Sarah   Buchman    NY                               1977!    Abatis Publishers    NY
2      Sarah   Buchman    NY           200 Years of German Humor Schandenfreude Press  <NA>
3      Sarah   Buchman    NY What Are The Civilian Applications? Schandenfreude Press  <NA>
4      Wendy Heydemark    CO                    How About Never?    Abatis Publishers    NY
5      Wendy Heydemark    CO                   I Blame My Mother Schandenfreude Press  <NA>
6      Wendy Heydemark    CO         Not Without My Fabrerge Egg       AAA Publishing    CA
7      Wendy Heydemark    CO           Spontaneous, Not Annoying       AAA Publishing    CA
8     Hallie      Hull    CA           But I Did It Unconciously    Abatis Publishers    NY
9     Hallie      Hull    CA    Perhaps It's a Glandular Problem       AAA Publishing    CA
10      Klee      Hull    CA           But I Did It Unconciously    Abatis Publishers    NY
11      Klee      Hull    CA              Exchange of Platitudes    Abatis Publishers    NY
12      Klee      Hull    CA                   I Blame My Mother Schandenfreude Press  <NA>
13      Klee      Hull    CA    Perhaps It's a Glandular Problem       AAA Publishing    CA
14 Christian     Kells    NY        Ask Yor System Administrator      Core Dump Books    CA
15    Harvey   Kellsey    CA        Just Wait Until After School    Abatis Publishers    NY
16    Harvey   Kellsey    CA                     Kiss My Boo Boo    Abatis Publishers    NY
17    Harvey   Kellsey    CA    Perhaps It's a Glandular Problem       AAA Publishing    CA
sqldf("
    select au_fname, au_lname, authors.state, title_name, pub_name, publishers.state
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id;
")

Question 19

List the title, number of pages, authors first and last names and author’s state for all books that are longer than 150 pages and whose author lives in NY or CA.

Using the data shown above, the output should be:

Warning: Ignoring remaining part of query: -- (parentheses ARE necessary since by default AND is done before OR)
                           title_name pages  au_fname au_lname state
1 What Are The Civilian Applications?   802     Sarah  Buchman    NY
2           But I Did It Unconciously   510    Hallie     Hull    CA
3    Perhaps It's a Glandular Problem   826    Hallie     Hull    CA
4           But I Did It Unconciously   510      Klee     Hull    CA
5              Exchange of Platitudes   201      Klee     Hull    CA
6                   I Blame My Mother   333      Klee     Hull    CA
7    Perhaps It's a Glandular Problem   826      Klee     Hull    CA
8        Ask Yor System Administrator  1226 Christian    Kells    NY
9    Perhaps It's a Glandular Problem   826    Harvey  Kellsey    CA
sqldf("
    select title_name, pages, au_fname, au_lname, authors.state 
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
    where pages > 150 and   (state = 'NY'  or   state = 'CA') ;  -- (parentheses ARE necessary since by default AND is done before OR)
")

Question 20

List the authors first and last names, the author’s state, the title name, the publisher name and the publishers state only for those books where the author and publisher are from the same state. (HINT: specify authors.state [note: there is a period between author and state] in the criteria for the publisher’s state).

Using the data shown above, the output should be:

  au_fname au_lname state                       title_name          pub_name state
1    Sarah  Buchman    NY                            1977! Abatis Publishers    NY
2   Hallie     Hull    CA Perhaps It's a Glandular Problem    AAA Publishing    CA
3     Klee     Hull    CA Perhaps It's a Glandular Problem    AAA Publishing    CA
4   Harvey  Kellsey    CA Perhaps It's a Glandular Problem    AAA Publishing    CA
sqldf("
    select au_fname, au_lname, authors.state, title_name, pub_name, publishers.state
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id
                where authors.state = publishers.state;
")

Question 21

List the title and authors of all books published by “Abatis Publishers” sort the results in alphabetical order by title and then by the au_order column. Do NOT show that au_order column in the output.

Using the data shown above, the output should be:

  au_fname  au_lname                   title_name          pub_name
1    Sarah   Buchman                        1977! Abatis Publishers
2   Hallie      Hull    But I Did It Unconciously Abatis Publishers
3     Klee      Hull    But I Did It Unconciously Abatis Publishers
4     Klee      Hull       Exchange of Platitudes Abatis Publishers
5    Wendy Heydemark             How About Never? Abatis Publishers
6   Harvey   Kellsey Just Wait Until After School Abatis Publishers
7   Harvey   Kellsey              Kiss My Boo Boo Abatis Publishers
sqldf("
    select au_fname, au_lname, title_name, pub_name
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id
    where pub_name = 'Abatis Publishers'
    order by title_name, au_order;
")

Question 22

List the publisher name, author name, and title name for all books that have at least 100 pages and that sold at least 200 copies. Sort the results in alphabetical order first by publisher and then by author.

Using the data shown above, the output should be:

               pub_name  au_fname  au_lname                          title_name
1        AAA Publishing     Wendy Heydemark           Spontaneous, Not Annoying
2        AAA Publishing    Hallie      Hull    Perhaps It's a Glandular Problem
3        AAA Publishing      Klee      Hull    Perhaps It's a Glandular Problem
4        AAA Publishing    Harvey   Kellsey    Perhaps It's a Glandular Problem
5     Abatis Publishers     Sarah   Buchman                               1977!
6     Abatis Publishers     Wendy Heydemark                    How About Never?
7     Abatis Publishers    Hallie      Hull           But I Did It Unconciously
8     Abatis Publishers      Klee      Hull           But I Did It Unconciously
9     Abatis Publishers      Klee      Hull              Exchange of Platitudes
10      Core Dump Books Christian     Kells        Ask Yor System Administrator
11 Schandenfreude Press     Sarah   Buchman What Are The Civilian Applications?
12 Schandenfreude Press     Wendy Heydemark                   I Blame My Mother
13 Schandenfreude Press      Klee      Hull                   I Blame My Mother
sqldf("
  select pub_name, au_fname, au_lname, title_name
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id
                where   pages >= 100 and sales >= 200
    order by  pub_name,   au_lname,  au_fname;
")

Question 23

List authors first and last names and the titles they wrote for publishers whose name (ie. the publisher’s name) begins with “A”, “B” or “C”. Sort the rows in alphabetical order by the publisher. For rows for the same publisher, sort the rows alphabetically by the names of the authors.

Using the data shown above, the output should be:

    au_fname  au_lname                       title_name          pub_name
1      Wendy Heydemark      Not Without My Fabrerge Egg    AAA Publishing
2      Wendy Heydemark        Spontaneous, Not Annoying    AAA Publishing
3     Hallie      Hull Perhaps It's a Glandular Problem    AAA Publishing
4       Klee      Hull Perhaps It's a Glandular Problem    AAA Publishing
5     Harvey   Kellsey Perhaps It's a Glandular Problem    AAA Publishing
6      Sarah   Buchman                            1977! Abatis Publishers
7      Wendy Heydemark                 How About Never? Abatis Publishers
8     Hallie      Hull        But I Did It Unconciously Abatis Publishers
9       Klee      Hull        But I Did It Unconciously Abatis Publishers
10      Klee      Hull           Exchange of Platitudes Abatis Publishers
11    Harvey   Kellsey     Just Wait Until After School Abatis Publishers
12    Harvey   Kellsey                  Kiss My Boo Boo Abatis Publishers
13 Christian     Kells     Ask Yor System Administrator   Core Dump Books

Three different answers.
The only differences between the answers are the WHERE clauses.

Answer 1
    select au_fname, au_lname, title_name, pub_name
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id  where substr(pub_name, 1,1) IN ('A','B','C')  order by pub_name, au_lname, au_fname
    au_fname  au_lname                       title_name          pub_name
1      Wendy Heydemark      Not Without My Fabrerge Egg    AAA Publishing
2      Wendy Heydemark        Spontaneous, Not Annoying    AAA Publishing
3     Hallie      Hull Perhaps It's a Glandular Problem    AAA Publishing
4       Klee      Hull Perhaps It's a Glandular Problem    AAA Publishing
5     Harvey   Kellsey Perhaps It's a Glandular Problem    AAA Publishing
6      Sarah   Buchman                            1977! Abatis Publishers
7      Wendy Heydemark                 How About Never? Abatis Publishers
8     Hallie      Hull        But I Did It Unconciously Abatis Publishers
9       Klee      Hull        But I Did It Unconciously Abatis Publishers
10      Klee      Hull           Exchange of Platitudes Abatis Publishers
11    Harvey   Kellsey     Just Wait Until After School Abatis Publishers
12    Harvey   Kellsey                  Kiss My Boo Boo Abatis Publishers
13 Christian     Kells     Ask Yor System Administrator   Core Dump Books
Answer 2
    select au_fname, au_lname, title_name, pub_name
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id  where substr(pub_name, 1,1) = 'A' or 
        substr(pub_name, 1,1) = 'B' or
               substr(pub_name, 1,1) = 'C'  order by pub_name, au_lname, au_fname
    au_fname  au_lname                       title_name          pub_name
1      Wendy Heydemark      Not Without My Fabrerge Egg    AAA Publishing
2      Wendy Heydemark        Spontaneous, Not Annoying    AAA Publishing
3     Hallie      Hull Perhaps It's a Glandular Problem    AAA Publishing
4       Klee      Hull Perhaps It's a Glandular Problem    AAA Publishing
5     Harvey   Kellsey Perhaps It's a Glandular Problem    AAA Publishing
6      Sarah   Buchman                            1977! Abatis Publishers
7      Wendy Heydemark                 How About Never? Abatis Publishers
8     Hallie      Hull        But I Did It Unconciously Abatis Publishers
9       Klee      Hull        But I Did It Unconciously Abatis Publishers
10      Klee      Hull           Exchange of Platitudes Abatis Publishers
11    Harvey   Kellsey     Just Wait Until After School Abatis Publishers
12    Harvey   Kellsey                  Kiss My Boo Boo Abatis Publishers
13 Christian     Kells     Ask Yor System Administrator   Core Dump Books
Answer 3
    select au_fname, au_lname, title_name, pub_name
    from authors join title_authors on authors.au_id = title_authors.au_id 
                          join titles on title_authors.title_id = titles.title_id 
                          join publishers on publishers.pub_id = titles.pub_id  where pub_name like 'A%'  or  pub_name like 'B%'  or pub_name like 'C%'  order by pub_name, au_lname, au_fname
    au_fname  au_lname                       title_name          pub_name
1      Wendy Heydemark      Not Without My Fabrerge Egg    AAA Publishing
2      Wendy Heydemark        Spontaneous, Not Annoying    AAA Publishing
3     Hallie      Hull Perhaps It's a Glandular Problem    AAA Publishing
4       Klee      Hull Perhaps It's a Glandular Problem    AAA Publishing
5     Harvey   Kellsey Perhaps It's a Glandular Problem    AAA Publishing
6      Sarah   Buchman                            1977! Abatis Publishers
7      Wendy Heydemark                 How About Never? Abatis Publishers
8     Hallie      Hull        But I Did It Unconciously Abatis Publishers
9       Klee      Hull        But I Did It Unconciously Abatis Publishers
10      Klee      Hull           Exchange of Platitudes Abatis Publishers
11    Harvey   Kellsey     Just Wait Until After School Abatis Publishers
12    Harvey   Kellsey                  Kiss My Boo Boo Abatis Publishers
13 Christian     Kells     Ask Yor System Administrator   Core Dump Books

Question 24

List the publisher name, author’s name and phone number for all authors that have written books for the publisher’s whose name begins with a letter from A through J. (HINT: specify the publisher’s name twice but only show one of them. In the criteria for one of them list >=“A” and in the criteria for the other list <“K”).

Using the data shown above, the output should be:

            pub_name  au_fname  au_lname        phone
1  Abatis Publishers     Sarah   Buchman 718-496-7223
2  Abatis Publishers    Hallie      Hull 415-549-4278
3  Abatis Publishers      Klee      Hull 415-549-4278
4  Abatis Publishers      Klee      Hull 415-549-4278
5  Abatis Publishers     Wendy Heydemark 303-986-7020
6  Abatis Publishers    Harvey   Kellsey 650-836-7128
7  Abatis Publishers    Harvey   Kellsey 650-836-7128
8    Core Dump Books Christian     Kells 212-771-4680
9     AAA Publishing     Wendy Heydemark 303-986-7020
10    AAA Publishing    Hallie      Hull 415-549-4278
11    AAA Publishing      Klee      Hull 415-549-4278
12    AAA Publishing    Harvey   Kellsey 650-836-7128
13    AAA Publishing     Wendy Heydemark 303-986-7020
sqldf("
        select pub_name, au_fname, au_lname, phone
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
                              join publishers on publishers.pub_id = titles.pub_id
        where substr(pub_name,1,1) >= 'A' and substr(pub_name,1,1) < 'K';
")

Question 25

List the titles of all books and the number of authors for the book (NOT the actual author names). Sort the results so that the books with the most authors appear on top. All books that have the same number of authors should be listed in alphabetical order.

Using the data shown above, the output should be:

                            title_name Number of Authors
1     Perhaps It's a Glandular Problem                 3
2            But I Did It Unconciously                 2
3                    I Blame My Mother                 2
4                                1977!                 1
5            200 Years of German Humor                 1
6         Ask Yor System Administrator                 1
7               Exchange of Platitudes                 1
8                     How About Never?                 1
9         Just Wait Until After School                 1
10                     Kiss My Boo Boo                 1
11         Not Without My Fabrerge Egg                 1
12           Spontaneous, Not Annoying                 1
13 What Are The Civilian Applications?                 1
sqldf("
        select title_name, count(*) as 'Number of Authors'
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
        group by titles.title_id, title_name
        order by count(*) desc, title_name;
")

Question 26

List the titles of all books that have 2 or more authors and that are longer than 50 pages.

Using the data shown above, the output should be:

                        title_name Number of Authors
1 Perhaps It's a Glandular Problem                 3
2        But I Did It Unconciously                 2
3                I Blame My Mother                 2
sqldf("
        select title_name, count(*) as 'Number of Authors'
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
        where pages >= 50
        group by titles.title_id, title_name
        having count(*) >= 2
        order by count(*) desc, title_name;
")

Question 27

List all publishers and the number of books that the publisher has published (not the actual titles, just the number of books). Sort the list so that the publisher with the most books appears on top.

Using the data shown above, the output should be:

              pub_name # of Books
1    Abatis Publishers          6
2 Schandenfreude Press          3
3       AAA Publishing          3
4      Core Dump Books          1
sqldf("
    select pub_name, count(*) as '# of Books'
        from titles join publishers on publishers.pub_id = titles.pub_id
        group by publishers.pub_id, pub_name
        order by count(*) desc;
")

Question 28

List the publisher name and total revenue for each for each publisher. (Total revenue for a publisher is the total revenue for all books published by that publisher. Revenue for a book is the number sold times the price of the book.)

Using the data shown above, the output should be:

              pub_name Total Revenue
1    Abatis Publishers       1917871
2      Core Dump Books       1025397
3 Schandenfreude Press      36434537
4       AAA Publishing       2051056
sqldf("
        select pub_name, sum(sales * price) as 'Total Revenue'
        from titles join publishers on publishers.pub_id = titles.pub_id
        group by publishers.pub_id, pub_name    ;
")

Question 29

Modify the previous query to only list those publishers who have at least ten thousand dollars in total revenue.

Using the data shown above, the output should be:

Warning: Ignoring remaining part of query: having sum(sales * price) >= 10000;
              pub_name Total Revenue
1    Abatis Publishers       1917871
2      Core Dump Books       1025397
3 Schandenfreude Press      36434537
4       AAA Publishing       2051056
sqldf("
        select pub_name, sum(sales * price) as 'Total Revenue'
        from titles join publishers on publishers.pub_id = titles.pub_id
        group by publishers.pub_id, pub_name    ;
        having sum(sales * price) >= 10000;
")

Question 30

List the title and author of each book. For each author for a particular book, list the amount of the “advance” that (s)he got for that book.

Using the data shown above, the output should be:

                            title_name  au_fname  au_lname advance
1                                1977!     Sarah   Buchman   10000
2            200 Years of German Humor     Sarah   Buchman    1000
3  What Are The Civilian Applications?     Sarah   Buchman   20000
4                     How About Never?     Wendy Heydemark   20000
5                    I Blame My Mother     Wendy Heydemark 1000000
6          Not Without My Fabrerge Egg     Wendy Heydemark      NA
7            Spontaneous, Not Annoying     Wendy Heydemark   50000
8            But I Did It Unconciously    Hallie      Hull   20000
9     Perhaps It's a Glandular Problem    Hallie      Hull  100000
10           But I Did It Unconciously      Klee      Hull   20000
11              Exchange of Platitudes      Klee      Hull  100000
12                   I Blame My Mother      Klee      Hull 1000000
13    Perhaps It's a Glandular Problem      Klee      Hull  100000
14        Ask Yor System Administrator Christian     Kells   15000
15        Just Wait Until After School    Harvey   Kellsey       0
16                     Kiss My Boo Boo    Harvey   Kellsey       0
17    Perhaps It's a Glandular Problem    Harvey   Kellsey  100000
sqldf("
        select title_name, au_fname, au_lname, advance
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
                              join royalties on royalties.title_id = titles.title_id;
")

Question 31

List the title and author of each book. For each author for a particular book, list the total amount in royalties that (s)he got for that book. Hint: the royalty dollars for a particular book is the “price” times the “sales” times the “royalty_rate” times the “royalty_rate”. Sort the results so that the authors who made the most royalty dollars are listed at the top. (NOTE: make sure to read the database description at the top of this document to understand the fields ROYALTIES.ROYALTY_RATE and the field TITLE_AUTHORS.royalty_rate).

Using the data shown above, the output should be:

                            title_name  au_fname  au_lname Total Royalties for Author
1                                1977!     Sarah   Buchman                   31.11585
2            200 Years of German Humor     Sarah   Buchman                  687.03012
3  What Are The Civilian Applications?     Sarah   Buchman                 1130.05919
4                     How About Never?     Wendy Heydemark                 1445.33760
5                    I Blame My Mother     Wendy Heydemark               434750.45900
6          Not Without My Fabrerge Egg     Wendy Heydemark                         NA
7            Spontaneous, Not Annoying     Wendy Heydemark                10522.00522
8            But I Did It Unconciously    Hallie      Hull                 1080.85114
9     Perhaps It's a Glandular Problem    Hallie      Hull                 3685.00957
10           But I Did It Unconciously      Klee      Hull                 1080.85114
11              Exchange of Platitudes      Klee      Hull                11340.06480
12                   I Blame My Mother      Klee      Hull               434750.45900
13    Perhaps It's a Glandular Problem      Klee      Hull                 3685.00957
14        Ask Yor System Administrator Christian     Kells                 5024.44359
15        Just Wait Until After School    Harvey   Kellsey                   65.52000
16                     Kiss My Boo Boo    Harvey   Kellsey                  174.37500
17    Perhaps It's a Glandular Problem    Harvey   Kellsey                 3685.00957
sqldf("
        select title_name, au_fname, au_lname, royalty_rate * royalty_rate * sales * price as  'Total Royalties for Author'
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
                              join royalties on royalties.title_id = titles.title_id;       ")

Question 32

List the name of each author and the total amount of royalty dollars that they received for all of their books (name this column “RoyaltyDollars”). Hint: the royalty dollars for a particular book is the “price” times the “sales” times the “royalty_rate” times the “royalty_rate”. Sort the results so that the authors who made the most royalty dollars are listed at the top.

Using the data shown above, the output should be:

   au_fname  au_lname Total Royalties for Author
1      Klee      Hull                 450856.385
2     Wendy Heydemark                 446717.802
3 Christian     Kells                   5024.444
4    Hallie      Hull                   4765.861
5    Harvey   Kellsey                   3924.905
6     Sarah   Buchman                   1848.205
sqldf("
        select au_fname, au_lname, sum(royalty_rate * royalty_rate * sales * price)  as  'Total Royalties for Author'
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
                              join royalties on royalties.title_id = titles.title_id
        group by authors.au_id, au_fname, au_lname
        order by sum(royalty_rate * royalty_rate * sales * price)  desc;
")

Question 33

Modify the query from the previous question so that the list only includes authors who made more than $50,000.00 in royalties.

Using the data shown above, the output should be:

  au_fname  au_lname Total Royalties for Author
1     Klee      Hull                   450856.4
2    Wendy Heydemark                   446717.8
sqldf("
        select au_fname, au_lname, sum(royalty_rate * royalty_rate * sales * price)  as  'Total Royalties for Author'
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
                              join royalties on royalties.title_id = titles.title_id
        group by authors.au_id, au_fname, au_lname
        having sum(royalty_rate * royalty_rate * sales * price) > 50000
        order by sum(royalty_rate * royalty_rate * sales * price)  desc;
")

40.5 Subqueries, outer joins, unions and other stuff …

Question 34

Show the title names and number of pages for those books that are longer than (ie. more pages) the average length of all books.

Using the data shown above, the output should be:

                           title_name pages
1        Ask Yor System Administrator  1226
2           But I Did It Unconciously   510
3                    How About Never?   473
4    Perhaps It's a Glandular Problem   826
5           Spontaneous, Not Annoying   507
6 What Are The Civilian Applications?   802
sqldf("
    select title_name , pages
    from titles
    where pages >= (select avg(pages) from titles);
")

Question 35

Show the publisher’s name and number of authors who have written books for that publisher:

Using the data shown above, the output should be:

Error: ambiguous column name: pub_id
sqldf("
        select pub_name, count(DISTINCT authors.au_id) as '# of authors'
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
                              join publishers on publishers.pub_id = titles.pub_id
        group by pub_id, pub_name;
")
Error: ambiguous column name: pub_id

Question 36

Show the pub_name and the number of titles published by that publisher. Only include publishers for whom at least 5 authors have worked.

Using the data shown above, the output should be:

           pub_name Number of Titles
1 Abatis Publishers                6
sqldf("
    select pub_name, count(*) as 'Number of Titles'
    from publishers as pub1 join titles on pub1.pub_id = titles.pub_id
    group by pub1.pub_id, pub_name
    having   5 <= ( select count(DISTINCT authors.au_id)
        from authors join title_authors on authors.au_id = title_authors.au_id 
                              join titles on title_authors.title_id = titles.title_id 
                              join publishers as pub2 on pub2.pub_id = titles.pub_id
        where pub2.pub_id = pub1.pub_id ) ;
")

Question 37

Show the names of publishers who did not publish any books (I guess they are just getting started  )

Using the data shown above, the output should be:

           pub_name
1 Tneterhooks Press

ANSWER – with a subquery

sqldf("
    select pub_name 
    from publishers  as   pub1
    where not exists
        ( select * 
        from publishers  as pub2  join titles on pub2.pub_id = titles.pub_id
        where pub2.pub_id = pub1.pub_id )
")
           pub_name
1 Tneterhooks Press
Click for answer

ANSWER – with a left join

sqldf("
    select pub_name from publishers left join titles on publishers.pub_id = titles.pub_id
    where title_id is NULL
")
           pub_name
1 Tneterhooks Press

Question 38

List the name of each publisher and the total number of books that each publisher has published. Sort the results so that the publishers who published the most books appear at the top. If two or more publishers published the same number of books then they should be listed in alphabetical order.

  1. The answer should only include publishers who have published some books.

    Using the data shown above, the output should be:

              pub_name count(*)
1    Abatis Publishers        6
2       AAA Publishing        3
3 Schandenfreude Press        3
4      Core Dump Books        1
sqldf("
        select pub_name, count(*)
        from titles join publishers on publishers.pub_id = titles.pub_id
        group by pub_name
        order by count(*) desc, pub_name;
")
  1. Same as previous question but this time make sure to also include publishers who have published zero books

    Using the data shown above, the output should be:

              pub_name count(titles.title_id)
1    Abatis Publishers                      6
2       AAA Publishing                      3
3 Schandenfreude Press                      3
4      Core Dump Books                      1
5    Tneterhooks Press                      0

ANSWER - with a left join

sqldf("
        select pub_name, count(titles.title_id)
        from publishers left join titles on publishers.pub_id = titles.pub_id
        group by pub_name
        order by count(*) desc, pub_name;
")
              pub_name count(titles.title_id)
1    Abatis Publishers                      6
2       AAA Publishing                      3
3 Schandenfreude Press                      3
4      Core Dump Books                      1
5    Tneterhooks Press                      0

ANSWER – with a UNION

sqldf("
    select pub_name , 0  as  NumTitles
    from publishers left join titles on publishers.pub_id = titles.pub_id
    where title_id is NULL 

UNION

    select pub_name, count(*) as NumTitles
    from titles join publishers on publishers.pub_id = titles.pub_id
    group by pub_name;
")
              pub_name NumTitles
1       AAA Publishing         3
2    Abatis Publishers         6
3      Core Dump Books         1
4 Schandenfreude Press         3
5    Tneterhooks Press         0

40.6 Additional questions

Question 39

Show the names of the authors and the publishers who published their books. Only show those authors/publishers where the author lives in the same state as the publisher who published the book. Do not show any author/pbulisher names twice.

Using the data shown above, the output should be:

  au_fname au_lname          pub_name state state
1    Sarah  Buchman Abatis Publishers    NY    NY
2   Hallie     Hull    AAA Publishing    CA    CA
3     Klee     Hull    AAA Publishing    CA    CA
4   Harvey  Kellsey    AAA Publishing    CA    CA
sqldf("
select distinct au_fname, au_lname, pub_name, authors.state, publishers.state
from authors join title_authors on authors.au_id = title_authors.au_id
             join titles on title_authors.title_id = titles.title_id
             join publishers on titles.pub_id = publishers.pub_id
where authors.state = publishers.state
      ")

Question 40

The exchange rate for nis/usd is 3.3.
Write a query that shows the titles and prices for biographies that cost between 10 and 20 dollars. Show two columns for the prices. One column for USD and one column for NIS. Name the columns “price in USD” and “price in NIS”. Sort the results so that the most expensive books are displayed first.

Using the data shown above, the output should be:

                 title_name price in NIS price in USD      type
1          How About Never?       65.835        19.95 biography
2 Spontaneous, Not Annoying       42.867        12.99 biography
sqldf("
select title_name, price * 3.3 as 'price in NIS', price as 'price in USD', type
from titles
where type = 'biography' and
      price >= 10 and price <= 20
order by price desc
      ")

Question 41

Show those titles for which the price of the book is greater than the average price of all books and the number of pages of the book is less than the average number of pages for all books.

Using the data shown above, the output should be:

                 title_name price pages
1                     1977! 21.99   107
2 200 Years of German Humor 19.95    14
3         I Blame My Mother 23.95   333
sqldf("
select title_name, price, pages
from titles
where price > (select avg(price) from titles) and
      pages < (select avg(pages) from titles)
      ")

The answer above uses the following two “non-correlated” subqueries. It might help your understanding of the answer above to see the output of the subqueries when they are run by themselves.

# The average price of all books
sqldf("select avg(price) from titles")
  avg(price)
1    18.3875
# The average number of pages of all books
sqldf("select avg(pages) from titles")
  avg(pages)
1   425.5833

Question 42

Show the authors who have not written any books

Using the data shown above, the output should be:

  au_fname    au_lname
1    Paddy O'Furniture
sqldf("
select au_fname, au_lname
from authors left join title_authors on authors.au_id = title_authors.au_id
where title_authors.au_id is null
")

Question 43

In each row show an author’s name and the number of books that author wrote.
Only show those authors who wrote at least two books.
Sort the results alphabetically (last name, first name).

Using the data shown above, the output should be:

  au_fname  au_lname count(*)
1   Hallie      Hull        2
2   Harvey   Kellsey        3
3     Klee      Hull        4
4    Sarah   Buchman        3
5    Wendy Heydemark        4
sqldf("
select au_fname, au_lname, count(*)
from authors join title_authors on authors.au_id = title_authors.au_id
group by au_fname, au_lname
having count(*) >= 2
      ")

Question 44

Same as previous question. However, this time, only show those authors who wrote biographies, the count of the number of biographies they wrote and only show those authors who wrote at least 2 biographies.

Using the data shown above, the output should be:

  au_fname  au_lname count(*)
1    Wendy Heydemark        4
sqldf("
select au_fname, au_lname, count(*)
from authors join title_authors on authors.au_id = title_authors.au_id
             join titles on title_authors.title_id = titles.title_id
where type = 'biography'
group by au_fname, au_lname
having count(*) >= 2
")

Question 45 - a

Write a query to show the books in the titles table for which both the author and the publisher have a higher than average price for books they have worked on. Sort the rows alphabetically by publisher name. Sort the rows that have the same publisher name by the author’s name.

Using the data shown above, the output should be:

              pub_name  au_fname  au_lname                          title_name
1      Core Dump Books Christian     Kells        Ask Yor System Administrator
2 Schandenfreude Press     Sarah   Buchman           200 Years of German Humor
3 Schandenfreude Press     Sarah   Buchman What Are The Civilian Applications?
4 Schandenfreude Press     Wendy Heydemark                   I Blame My Mother
sqldf("
select pub_name, au_fname, au_lname, title_name
from authors join title_authors on authors.au_id = title_authors.au_id
             join titles on title_authors.title_id = titles.title_id
             join publishers on titles.pub_id = publishers.pub_id
where authors.au_id in 
         (
           select authors.au_id 
           from authors join title_authors on authors.au_id = title_authors.au_id
                        join titles on title_authors.title_id = titles.title_id
           group by authors.au_id 
           having avg(price) > (select avg(price) from titles)
         )
      and publishers.pub_id in
         (
           select publishers.pub_id 
           from titles join publishers on titles.pub_id = publishers.pub_id
           group by publishers.pub_id
           having avg(price) > (select avg(price) from titles)
         )
order by pub_name, au_lname, au_fname
")
              pub_name  au_fname  au_lname                          title_name
1      Core Dump Books Christian     Kells        Ask Yor System Administrator
2 Schandenfreude Press     Sarah   Buchman           200 Years of German Humor
3 Schandenfreude Press     Sarah   Buchman What Are The Civilian Applications?
4 Schandenfreude Press     Wendy Heydemark                   I Blame My Mother
sqldf("
select pub_name, au_fname, au_lname, title_name
from    (
          select authors.*
          from authors join title_authors on authors.au_id = title_authors.au_id
                       join titles on title_authors.title_id = titles.title_id
          group by authors.au_id 
          having avg(price) > (select avg(price) from titles)
        ) as highEndAuthors 

      join title_authors on highEndAuthors.au_id = title_authors.au_id

      join titles on title_authors.title_id = titles.title_id

      join 
        (
          select publishers.*
          from titles join publishers on titles.pub_id = publishers.pub_id
          group by publishers.pub_id
          having avg(price) > (select avg(price) from titles)
        )
      as highEndPublishers on titles.pub_id = highEndPublishers.pub_id

order by pub_name, au_lname, au_fname
")
              pub_name  au_fname  au_lname                          title_name
1      Core Dump Books Christian     Kells        Ask Yor System Administrator
2 Schandenfreude Press     Sarah   Buchman           200 Years of German Humor
3 Schandenfreude Press     Sarah   Buchman What Are The Civilian Applications?
4 Schandenfreude Press     Wendy Heydemark                   I Blame My Mother

You can modify either of the previous queries to use CTE instead of subqueries. The following modifies the previous query.

sqldf("
with 
  highEndAuthors as 
  (
    select authors.*
    from authors join title_authors on authors.au_id = title_authors.au_id
                 join titles on title_authors.title_id = titles.title_id
    group by authors.au_id 
    having avg(price) > (select avg(price) from titles)
  ),

  highEndPublishers as 
  (
    select publishers.*
    from titles join publishers on titles.pub_id = publishers.pub_id
    group by publishers.pub_id
    having avg(price) > (select avg(price) from titles)
  )

select pub_name, au_fname, au_lname, title_name
from highEndAuthors join title_authors on highEndAuthors.au_id = title_authors.au_id
             join titles on title_authors.title_id = titles.title_id
             join highEndPublishers on titles.pub_id = highEndPublishers.pub_id
order by pub_name, au_lname, au_fname
")
              pub_name  au_fname  au_lname                          title_name
1      Core Dump Books Christian     Kells        Ask Yor System Administrator
2 Schandenfreude Press     Sarah   Buchman           200 Years of German Humor
3 Schandenfreude Press     Sarah   Buchman What Are The Civilian Applications?
4 Schandenfreude Press     Wendy Heydemark                   I Blame My Mother

Question 45 - b

A literary agent is trying to connect authors with publishers who have not worked together before. Write a SQL query to help the agent identify these potential matches. Each row of your output should show the name of a publisher and the name of an author who have NOT worked together yet.

Sort the rows alphabetically by publisher name. For rows that have the same publisher name, sort them by the author’s name.

sqldf("
-- All existing and non-existing publisher/author relationships
SELECT publishers.pub_name, authors.au_fname, authors.au_lname 
FROM authors CROSS JOIN publishers

EXCEPT

-- Only the existing publisher/author relationships
SELECT publishers.pub_name, authors.au_fname, authors.au_lname 
FROM authors JOIN title_authors on authors.au_id = title_authors.au_id
             JOIN titles on title_authors.title_id = titles.title_id
             JOIN publishers on titles.pub_id = publishers.pub_id
 
ORDER BY pub_name, au_lname, au_fname
")
               pub_name  au_fname    au_lname
1        AAA Publishing     Sarah     Buchman
2        AAA Publishing Christian       Kells
3        AAA Publishing     Paddy O'Furniture
4     Abatis Publishers Christian       Kells
5     Abatis Publishers     Paddy O'Furniture
6       Core Dump Books     Sarah     Buchman
7       Core Dump Books     Wendy   Heydemark
8       Core Dump Books    Hallie        Hull
9       Core Dump Books      Klee        Hull
10      Core Dump Books    Harvey     Kellsey
11      Core Dump Books     Paddy O'Furniture
12 Schandenfreude Press    Hallie        Hull
13 Schandenfreude Press Christian       Kells
14 Schandenfreude Press    Harvey     Kellsey
15 Schandenfreude Press     Paddy O'Furniture
16    Tneterhooks Press     Sarah     Buchman
17    Tneterhooks Press     Wendy   Heydemark
18    Tneterhooks Press    Hallie        Hull
19    Tneterhooks Press      Klee        Hull
20    Tneterhooks Press Christian       Kells
21    Tneterhooks Press    Harvey     Kellsey
22    Tneterhooks Press     Paddy O'Furniture
sqldf("
SELECT publishers.pub_id, pub_name, authors.au_id, au_fname, au_lname
FROM authors CROSS JOIN publishers
             LEFT JOIN (
                SELECT DISTINCT title_authors.au_id, titles.pub_id
                FROM title_authors JOIN titles 
                                   ON title_authors.title_id = titles.title_id
             ) AS author_publisher 
               ON authors.au_id = author_publisher.au_id AND 
                  publishers.pub_id = author_publisher.pub_id
WHERE author_publisher.au_id IS NULL
ORDER BY pub_name, au_lname, au_fname;
")
   pub_id             pub_name au_id  au_fname    au_lname
1     P05       AAA Publishing   A01     Sarah     Buchman
2     P05       AAA Publishing   A05 Christian       Kells
3     P05       AAA Publishing   A07     Paddy O'Furniture
4     P01    Abatis Publishers   A05 Christian       Kells
5     P01    Abatis Publishers   A07     Paddy O'Furniture
6     P02      Core Dump Books   A01     Sarah     Buchman
7     P02      Core Dump Books   A02     Wendy   Heydemark
8     P02      Core Dump Books   A03    Hallie        Hull
9     P02      Core Dump Books   A04      Klee        Hull
10    P02      Core Dump Books   A06    Harvey     Kellsey
11    P02      Core Dump Books   A07     Paddy O'Furniture
12    P03 Schandenfreude Press   A03    Hallie        Hull
13    P03 Schandenfreude Press   A05 Christian       Kells
14    P03 Schandenfreude Press   A06    Harvey     Kellsey
15    P03 Schandenfreude Press   A07     Paddy O'Furniture
16    P04    Tneterhooks Press   A01     Sarah     Buchman
17    P04    Tneterhooks Press   A02     Wendy   Heydemark
18    P04    Tneterhooks Press   A03    Hallie        Hull
19    P04    Tneterhooks Press   A04      Klee        Hull
20    P04    Tneterhooks Press   A05 Christian       Kells
21    P04    Tneterhooks Press   A06    Harvey     Kellsey
22    P04    Tneterhooks Press   A07     Paddy O'Furniture
sqldf("
SELECT publishers.pub_id, pub_name, authors.au_id, au_fname, au_lname
FROM authors CROSS JOIN publishers 
WHERE NOT EXISTS (
    -- NOTE 1 - This is a correlated subquery.
    --
    -- NOTE 2 - (unrelated to Note 1) With an EXISTS or NOT EXISTS subquery,
    -- it doesn't make a difference what data is returned by the subquery. 
    -- As long as ANYTHING is returned the query is considered to 'EXIST'
    -- Threfore it's more efficient to return as little info as possilble.
    -- SELECT 1 just returns the number 1 for each row that matches as 
    -- opposed to returning actual data from the tables. The query would
    -- work in the exact same way if we said SELECT * (or anything else) 
    -- it's just for efficiency that we say SELECT 1.
    SELECT 1 
    FROM titles JOIN title_authors ON titles.title_id = title_authors.title_id
    WHERE title_authors.au_id = authors.au_id
    AND titles.pub_id = publishers.pub_id
)
ORDER BY pub_name, au_lname, au_fname;
")
   pub_id             pub_name au_id  au_fname    au_lname
1     P05       AAA Publishing   A01     Sarah     Buchman
2     P05       AAA Publishing   A05 Christian       Kells
3     P05       AAA Publishing   A07     Paddy O'Furniture
4     P01    Abatis Publishers   A05 Christian       Kells
5     P01    Abatis Publishers   A07     Paddy O'Furniture
6     P02      Core Dump Books   A01     Sarah     Buchman
7     P02      Core Dump Books   A02     Wendy   Heydemark
8     P02      Core Dump Books   A03    Hallie        Hull
9     P02      Core Dump Books   A04      Klee        Hull
10    P02      Core Dump Books   A06    Harvey     Kellsey
11    P02      Core Dump Books   A07     Paddy O'Furniture
12    P03 Schandenfreude Press   A03    Hallie        Hull
13    P03 Schandenfreude Press   A05 Christian       Kells
14    P03 Schandenfreude Press   A06    Harvey     Kellsey
15    P03 Schandenfreude Press   A07     Paddy O'Furniture
16    P04    Tneterhooks Press   A01     Sarah     Buchman
17    P04    Tneterhooks Press   A02     Wendy   Heydemark
18    P04    Tneterhooks Press   A03    Hallie        Hull
19    P04    Tneterhooks Press   A04      Klee        Hull
20    P04    Tneterhooks Press   A05 Christian       Kells
21    P04    Tneterhooks Press   A06    Harvey     Kellsey
22    P04    Tneterhooks Press   A07     Paddy O'Furniture
sqldf("
SELECT publishers.pub_id, pub_name, authors.au_id, au_fname, au_lname
FROM authors CROSS JOIN publishers
WHERE (authors.au_id, publishers.pub_id) NOT IN (
    SELECT title_authors.au_id, titles.pub_id
    FROM title_authors
    JOIN titles ON title_authors.title_id = titles.title_id
)
ORDER BY pub_name, au_lname, au_fname;
")
   pub_id             pub_name au_id  au_fname    au_lname
1     P05       AAA Publishing   A01     Sarah     Buchman
2     P05       AAA Publishing   A05 Christian       Kells
3     P05       AAA Publishing   A07     Paddy O'Furniture
4     P01    Abatis Publishers   A05 Christian       Kells
5     P01    Abatis Publishers   A07     Paddy O'Furniture
6     P02      Core Dump Books   A01     Sarah     Buchman
7     P02      Core Dump Books   A02     Wendy   Heydemark
8     P02      Core Dump Books   A03    Hallie        Hull
9     P02      Core Dump Books   A04      Klee        Hull
10    P02      Core Dump Books   A06    Harvey     Kellsey
11    P02      Core Dump Books   A07     Paddy O'Furniture
12    P03 Schandenfreude Press   A03    Hallie        Hull
13    P03 Schandenfreude Press   A05 Christian       Kells
14    P03 Schandenfreude Press   A06    Harvey     Kellsey
15    P03 Schandenfreude Press   A07     Paddy O'Furniture
16    P04    Tneterhooks Press   A01     Sarah     Buchman
17    P04    Tneterhooks Press   A02     Wendy   Heydemark
18    P04    Tneterhooks Press   A03    Hallie        Hull
19    P04    Tneterhooks Press   A04      Klee        Hull
20    P04    Tneterhooks Press   A05 Christian       Kells
21    P04    Tneterhooks Press   A06    Harvey     Kellsey
22    P04    Tneterhooks Press   A07     Paddy O'Furniture

Any of the previous queries could be reorganized by using Common Table Expressions (CTE). For example:

sqldf("
WITH 

all_pairs AS (
    SELECT authors.au_id, au_fname, au_lname, publishers.pub_id, pub_name
    FROM authors CROSS JOIN publishers
),

existing_pairs AS (
    SELECT DISTINCT title_authors.au_id, titles.pub_id
    FROM title_authors JOIN titles ON title_authors.title_id = titles.title_id
)

SELECT all_pairs.pub_id,   all_pairs.pub_name, all_pairs.au_id, 
       all_pairs.au_fname, all_pairs.au_lname
FROM all_pairs
WHERE NOT EXISTS (
    SELECT 1 FROM existing_pairs
    WHERE existing_pairs.au_id = all_pairs.au_id 
    AND existing_pairs.pub_id = all_pairs.pub_id
)
ORDER BY pub_name, au_lname, au_fname;
")
   pub_id             pub_name au_id  au_fname    au_lname
1     P05       AAA Publishing   A01     Sarah     Buchman
2     P05       AAA Publishing   A05 Christian       Kells
3     P05       AAA Publishing   A07     Paddy O'Furniture
4     P01    Abatis Publishers   A05 Christian       Kells
5     P01    Abatis Publishers   A07     Paddy O'Furniture
6     P02      Core Dump Books   A01     Sarah     Buchman
7     P02      Core Dump Books   A02     Wendy   Heydemark
8     P02      Core Dump Books   A03    Hallie        Hull
9     P02      Core Dump Books   A04      Klee        Hull
10    P02      Core Dump Books   A06    Harvey     Kellsey
11    P02      Core Dump Books   A07     Paddy O'Furniture
12    P03 Schandenfreude Press   A03    Hallie        Hull
13    P03 Schandenfreude Press   A05 Christian       Kells
14    P03 Schandenfreude Press   A06    Harvey     Kellsey
15    P03 Schandenfreude Press   A07     Paddy O'Furniture
16    P04    Tneterhooks Press   A01     Sarah     Buchman
17    P04    Tneterhooks Press   A02     Wendy   Heydemark
18    P04    Tneterhooks Press   A03    Hallie        Hull
19    P04    Tneterhooks Press   A04      Klee        Hull
20    P04    Tneterhooks Press   A05 Christian       Kells
21    P04    Tneterhooks Press   A06    Harvey     Kellsey
22    P04    Tneterhooks Press   A07     Paddy O'Furniture

Question 45 - c

(continued from the previous question) The literary agent became more selective. He now wants to focus on introducing only those authors who command a higher than average price for their books with publishers who command a higher than average price for their books.

Write a SQL query to help the agent identify these potential matches. Each row of your output should show the name of a publisher and the name of an author who have NOT worked together yet. The publishers should be those whose average book price is above the average price of all books. Similarly, the authors should be those whose average book price is above the average price of all books.

Sort the rows alphabetically by publisher name. For rows that have the same publisher name, sort them by the author’s name.

Using the data shown above, the output should be:

              pub_name  au_fname  au_lname
1      Core Dump Books     Sarah   Buchman
2      Core Dump Books     Wendy Heydemark
3 Schandenfreude Press Christian     Kells

This is a relatively complex query. It might help to build up your answer step by step.

In order to check your work, it might be helpful to know the following:

  • The average selling price of all books is:

    sqldf("select avg(price) from titles")
      avg(price)
    1    18.3875
  • The average selling price of books by different publishers are: (The rows are sorted in decreasing order by average price.)

    sqldf("
    select round(avg(price),2) as 'avg price of publishers books', pub_name
    from publishers join titles on publishers.pub_id = titles.pub_id
    group by publishers.pub_id, pub_name
    order by avg(price) desc, pub_name
    ")
      avg price of publishers books             pub_name
    1                         39.95      Core Dump Books
    2                         24.63 Schandenfreude Press
    3                         14.30    Abatis Publishers
    4                         10.49       AAA Publishing
  • Therefore the publishers who are potential introduction candidates are those whose avg book price is higher than average, ie.

    sqldf("
    select pub_name
    from publishers join titles on publishers.pub_id = titles.pub_id
    group by publishers.pub_id, pub_name
    having avg(price) > (select avg(price) from titles)
    order by avg(price) desc, pub_name
    ")
                  pub_name
    1      Core Dump Books
    2 Schandenfreude Press
  • These are the average selling prices of the books for the various authors (The rows are sorted in decreasing order by average price.)

    sqldf("
    select round(avg(price),2) as 'avg price of authors books', au_fname,   au_lname
    from authors join title_authors on authors.au_id = title_authors. au_id
                 join titles on title_authors.title_id = titles.title_id
    group by authors.au_id, au_fname, au_lname
    order by avg(price) desc, au_lname, au_fname
    ")
      avg price of authors books  au_fname  au_lname
    1                      39.95 Christian     Kells
    2                      23.98     Sarah   Buchman
    3                      18.96     Wendy Heydemark
    4                      12.97      Klee      Hull
    5                      10.65    Harvey   Kellsey
    6                      10.49    Hallie      Hull
  • Therefore the authors who are potential introduction candidates are those whose avg book price is higher than average, ie.

    sqldf("
    select au_fname,   au_lname
    from authors join title_authors on authors.au_id = title_authors. au_id
                 join titles on title_authors.title_id = titles.title_id
    group by authors.au_id, au_fname, au_lname
    having avg(price) > (select avg(price) from titles)
    order by avg(price) desc, au_lname, au_fname
    ")
       au_fname  au_lname
    1 Christian     Kells
    2     Sarah   Buchman
    3     Wendy Heydemark

Answer using EXCEPT with subqueries

sqldf("
-- These are the potential matchups for highEnd authors/publishers
select pub_name, au_fname, au_lname
from (
        select authors.au_id, au_fname, au_lname
        from authors join title_authors on authors.au_id = title_authors.au_id
                     join titles on title_authors.title_id = titles.title_id
        group by authors.au_id, au_fname, au_lname
        having avg(price) > 
                 (select avg(price) from titles)
     ) as highEndAuthors
     
     cross join

     (
        select publishers.pub_id, pub_name
        from publishers join titles on publishers.pub_id = titles.pub_id
        group by publishers.pub_id, pub_name
        having avg(price) > 
                  (select avg(price) from titles)
     ) as highEndPublishers

EXCEPT

-- These are the existing publisher/author relationships
select pub_name , au_fname, au_lname
from authors join title_authors on authors.au_id = title_authors. au_id
             join titles on title_authors.title_id = titles.title_id
             join publishers on titles.pub_id = publishers.pub_id

order by pub_name, au_lname, au_fname
")
              pub_name  au_fname  au_lname
1      Core Dump Books     Sarah   Buchman
2      Core Dump Books     Wendy Heydemark
3 Schandenfreude Press Christian     Kells
              pub_name  au_fname  au_lname
1      Core Dump Books     Sarah   Buchman
2      Core Dump Books     Wendy Heydemark
3 Schandenfreude Press Christian     Kells
sqldf("
select pub_name, au_fname, au_lname
from (
        select authors.au_id, au_fname, au_lname
        from authors join title_authors on authors.au_id = title_authors.au_id
                     join titles on title_authors.title_id = titles.title_id
        group by authors.au_id, au_fname, au_lname
        having avg(price) > 
                 (select avg(price) from titles)
     ) as highEndAuthors
     
     cross join

     (
        select publishers.pub_id, pub_name
        from publishers join titles on publishers.pub_id = titles.pub_id
        group by publishers.pub_id, pub_name
        having avg(price) > 
                  (select avg(price) from titles)
     ) as highEndPublishers

where not exists
     (  -- correlated subquery
        select 1
        from authors join title_authors on authors.au_id = title_authors. au_id
                     join titles on title_authors.title_id = titles.title_id
                     join publishers on titles.pub_id = publishers.pub_id
        where publishers.pub_id = highEndPublishers.pub_id and
              authors.au_id = highEndAuthors.au_id
     )

order by pub_name, au_lname, au_fname
")
              pub_name  au_fname  au_lname
1      Core Dump Books     Sarah   Buchman
2      Core Dump Books     Wendy Heydemark
3 Schandenfreude Press Christian     Kells
sqldf("
with
  highEndAuthors as 
  (
        select authors.au_id, au_fname, au_lname
        from authors join title_authors on authors.au_id = title_authors.au_id
                     join titles on title_authors.title_id = titles.title_id
        group by authors.au_id, au_fname, au_lname
        having avg(price) > 
                 (select avg(price) from titles)
  ),
  
  highEndPublishers as
  (
        select publishers.pub_id, pub_name
        from publishers join titles on publishers.pub_id = titles.pub_id
        group by publishers.pub_id, pub_name
        having avg(price) > 
                  (select avg(price) from titles)
  ),

  existingRelationships as 
     (  -- correlated subquery
        select 1
        from authors join title_authors on authors.au_id = title_authors. au_id
                     join titles on title_authors.title_id = titles.title_id
                     join publishers on titles.pub_id = publishers.pub_id
        where publishers.pub_id = highEndPublishers.pub_id and
              authors.au_id = highEndAuthors.au_id
     )
  
select pub_name, au_fname, au_lname
from highEndAuthors cross join highEndPublishers
where not exists
     ( select 1 from existingRelationships )

order by pub_name, au_lname, au_fname
")
              pub_name  au_fname  au_lname
1      Core Dump Books     Sarah   Buchman
2      Core Dump Books     Wendy Heydemark
3 Schandenfreude Press Christian     Kells

Question 46

Show the names of the most prolific authors, i.e. the authors who have written the most books. Note that there could be several different authors who are “tied” for the most number of books.
Sort the names in alphabetical order (last name, first name) Also show the au_id and the number of titles that the author wrote.

Using the data shown above, the output should be:

  au_id au_fname  au_lname numTitles
1   A02    Wendy Heydemark         4
2   A04     Klee      Hull         4

This is a rather complex answer. This approach uses a subquery within a subquery. There may be other approaches too. There may be simpler answers too.

To explain we will show the subqueries and what they produce. Then we will show the full query that includes the subquery.

The following query is NOT the answer. It is simply a query that shows the number of books that were written by each author. This will be used as a subquery in the answer shown below.

sqldf("
          select authors.au_id, count(*) as numTitles
          from authors join title_authors on authors.au_id = title_authors.au_id
          group by authors.au_id, au_fname, au_lname
")
  au_id numTitles
1   A01         3
2   A02         4
3   A03         2
4   A04         4
5   A05         1
6   A06         3

We’d like to get the max number of titles that an author wrote. This amounts to getting the highest number from the numTitles column in the previous query. We can do that by using the above query as a subquery in the from clause.

sqldf("
          select max(numTitles2) 
          from 
               (select authors.au_id, count(*) as numTitles2
                from authors join title_authors on authors.au_id = title_authors.au_id
                group by authors.au_id, au_fname, au_lname)
      ")
  max(numTitles2)
1               4

Finally we can put this all together to answer our question. The following is the FULL ANSWER to the original question. It uses the above query as a subquery in the having clause (see the code below). Note that in the following code there is a subquery inside of a subquery.

FINAL ANSWER:

sqldf("
select authors.au_id, au_fname, au_lname, count(*) as numTitles
from authors join title_authors on authors.au_id = title_authors.au_id
group by authors.au_id, au_fname, au_lname
having numTitles = 
          (select max(numTitles2) 
           from 
               (select authors.au_id, count(*) as numTitles2
                from authors join title_authors on authors.au_id = title_authors.au_id
                group by authors.au_id, au_fname, au_lname))
order by au_lname, au_fname
")
  au_id au_fname  au_lname numTitles
1   A02    Wendy Heydemark         4
2   A04     Klee      Hull         4

Question 47

Show all topics and the authors who wrote books on those topics. Sort the ouput in alphabetical order by topic and within each topic in alphabetical order by the author. The first name and last name of each author should be combined into a single column called ‘author’.

Using the data shown above, the output should be:

        type          author
1  biography Wendy Heydemark
2  biography       Klee Hull
3   children  Harvey Kellsey
4   computer Christian Kells
5    history   Sarah Buchman
6 psychology     Hallie Hull
7 psychology       Klee Hull
8 psychology  Harvey Kellsey
sqldf("
    select type, au_fname || ' ' || au_lname as author
        from titles join title_authors on titles.title_id = title_authors.title_id
                join authors on title_authors.au_id = authors.au_id
        group by type, authors.au_id, au_fname, au_lname
        order by type, au_lname, au_fname;
")

Question 48

Answer all parts of this question

  1. Show all authors who wrote at least one biography or a history book for Abatis Publishers. Do NOT show an author’s name more than once even if they wrote several books.

    Using the data shown above, the output should be:

      au_fname  au_lname
    1    Sarah   Buchman
    2    Wendy Heydemark
    sqldf("
      select distinct au_fname, au_lname
      from authors join title_authors on authors.au_id = title_authors.au_id
                   join titles on title_authors.title_id = titles.title_id
                   join publishers on titles.pub_id = publishers.pub_id
      where publishers.pub_name = 'Abatis Publishers' and
            (titles.type = 'biography' or titles.type = 'history')
      order by au_lname, au_fname
    ")
  2. Show all authors who wrote at least one psychology book for Abatis Publishers. Do NOT show an author’s name more than once even if they wrote several books.

    Using the data shown above, the output should be:

      au_fname au_lname
    1   Hallie     Hull
    2     Klee     Hull
    sqldf("
      select distinct au_fname, au_lname
      from authors join title_authors on authors.au_id = title_authors.au_id
                   join titles on title_authors.title_id = titles.title_id
                   join publishers on titles.pub_id = publishers.pub_id
      where publishers.pub_name = 'Abatis Publishers' and
            titles.type = 'psychology'
      order by au_lname, au_fname
    ")
  3. Abatis Publishers believes publishing a book on the history of psychology during Sigmund Freud’s era would be a worthwhile endeavor. They think it would be best to get two authors to collaborate on the book - one author who has a psychology background and one author who has a history or biography background.

    In order to brainstorm which authors might work together well, Abatis Publishers would like to see a list of each author who wrote a history or biography book with them matched up with each author who wrote a pschology book with them. The publisher will then look through the list and think about which combination of authors might make the most sense - then they’ll try to contact the chosen authors and persuade them to take on the project.

    Write a query that returns two columns.

    The first column should be named ‘history/biography author’ and contain names of authors who wrote a history or biography book with Abatis (combine the authors first and last names into a single column).

    The second column should be named ‘psychology author’ and contain names of authors who wrote a psychology book with Abatis.

    The output should show every combination of a history/biogrphy author who wrote for Abatis with a psychology author who wrote for Abatis.

    Using the data shown above, the output should be:

      history/biography author psychology author
    1            Sarah Buchman       Hallie Hull
    2            Sarah Buchman         Klee Hull
    3          Wendy Heydemark       Hallie Hull
    4          Wendy Heydemark         Klee Hull
    sqldf("
      select hist_bio_authors.au_fname || ' ' || hist_bio_authors.au_lname 
                 as 'history/biography author', 
             psych_authors.au_fname || ' ' || psych_authors.au_lname
                 as 'psychology author' 
      from 
          ( -- this is the query from part (a)
            select distinct au_fname, au_lname
              from authors join title_authors on authors.au_id = title_authors.au_id
                           join titles on title_authors.title_id = titles.title_id
                           join publishers on titles.pub_id = publishers.pub_id
              where publishers.pub_name = 'Abatis Publishers' and
                    (titles.type = 'biography' or titles.type = 'history')
              order by au_lname, au_fname
            ) as hist_bio_authors
    
          cross join
    
          ( -- this is the query from part (b)
              select distinct au_fname, au_lname
              from authors join title_authors on authors.au_id = title_authors.au_id
                           join titles on title_authors.title_id = titles.title_id
                         join publishers on titles.pub_id = publishers.pub_id
              where publishers.pub_name = 'Abatis Publishers' and
                  titles.type = 'psychology'
            order by au_lname, au_fname
          ) as psych_authors
    ")