20 Common & Difficult SQL Interview Questions for Programmers 2019Last Updated: January 1, 2019
Preparation for a database programming job interview must involve concentrated focus on fundamental programming skills, as it’s likely that you will be given a number of problems to solve using SQL. These 20 of the most common and difficult SQL interview questions will give you an idea of what to expect on interview day, with answers that you can cross-check against your own solutions.
Proficiency in data and database programme languages is probably the number one skillset in programming now and for the foreseeable future. Most experts agree that recruiters in database programming are looking for a killer blend of regular interview skills, as well as the technical knowhow in programming skills such as SQL, which means you should research and anticipate the SQL interview questions that are typically asked of database programming candidates.
So, firstly ensure you are neatly turned out on the day, and behave professionally in every respect, including the manner in which you carry tech devices and any documentation you need to bring. Look for a decent briefcase, a cool, retro-chic messenger bag / satchel or check out what’s available in sleek, tidy, professional backpacks. Please do not undo all the good work you did in securing an interview by carrying your items in a plastic carrier bag; it looks terrible.
Don’t forget all your standard interview preparation techniques. This is your dream ticket: a chance to pitch your skills and personality to the requirements of the organisation that’s looking to hire a new team member.
But the skills are a hugely important metric in deciding whether you will be hired. Fortunately help is at hand. Online, there is a veritable fount of information on the common SQL interview questions you can anticipate in addition to some of the most difficult SQL interview questions that have been asked of candidates. There are also a number of great guides to SQL interview questions that will be of invaluable assistance, including SQL Server Interview Questions and Answers by Pinal Dave and Vinod Kumar; William Barden’s MS SQL, T-SQL, and SQL Server Interview Questions: Hundreds of Questions, Responses, and Explanations; and the various volumes of SQL Server Interview Questions You’ll Most Likely Be Asked compiled by Vibrant Publishers.
20 Common & Difficult SQL Interview Questions
We checked out a range of SQL interview questions posted online by SQL experts, and by those who have sat on both sides of the interview table. We’ve narrowed down these typical and most tricky SQL interview questions to a list of 20, with answers that will help you prepare for your interview.
It‘s clear to see that there is no way on earth that you will be able to ‘wing’ your responses to these SQL interview questions. But if you devote sufficient time to researching them, and also devise your own responses to them, you will be well prepared for any eventuality on the day of your interview. Good luck, SQL AGENTs!
1. What is cardinality?
“Thinking mathematically, it is the number of elements in a set,” writes IT pro John Zukowski, one of the most reliable sources on SQL interview questions. “Thinking in the database world, cardinality has to do with the counts in a relationship, one-to-one, one-to-many, or many-to-many.”
2. What does UNION do? What is the difference between UNION and UNION ALL?
According to US freelance software engineer agency, Toptal, UNION merges the contents of two structurally-compatible tables into a single combined table. The difference is that UNION omits duplicate records, but UNION ALL includes them.
The site notes that UNION ALL has superior performance, as it does not need to undertake the additional work of removing duplicates. For that reason, UNION ALL is recommended in situations were there are no duplicates, or if duplicates are not an issue.
3. When might someone denormalize their data?
Another of the typical, yet quite tricky, SQL interview questions, John Zukowski writes that this is “typically done for performance reasons, to reduce the number of table joins”. He adds that is not a good idea in a transactional environment, because of “inherent data integrity risks or performance risks due to excessive locking”.
Zukowski notes also that questions related to the Unified Modeling Language (UML) or Entity-Relationship Diagrams (ERDs) may also arise out of a question about denormalizing data.
4. Queries & Results: 1
Among the numerous SLQ interview questions explored on the Toptal site, this is a typical poser, with full answer below. Firstly, look at the following query.
SELECT count (*) AS total FROM orders; +-------+ + total + +-------+ + 100 + +-------+ SELECT COUNT(*) AS cust_123_total FROM orders WHERE customer_id= '123'; +----------------+ | cust_123_total | +----------------+ | 15 | +----------------+
Given these query results, what will be the result of the query below?
SELECT count(*) AS cust_not_123_total FROM orders WHERE customer_id <> '123'
According to Toptal the answer is 85 (i.e, 100 – 15). However, this may not be entirely correct, as any records with a customer_id of NULL will not be included in either count (i.e., they won’t be included in cust_123_total, nor will they be included in cust_not_123_total).
For example, if exactly one of the 100 customers has a NULL customer_id, the result of the last query will be:
+--------- ----------+ | cust_not_123_total | +--------------------+ | 84 | +--------------------+
5. In which files does SQL Server actually store data?
The answer to this and similar SQL interview questions, according to DotNetSpider.com, is that SQL server has 2 associated data files, as follows:
- MDF (Master Database File): actual data file storage, and
- LDF (Log Data File): which stores the transaction log
6. Describe the differences in the first through fifth normalization forms.
John Zukowski contends that database candidates should be familiar with most if not all of these. He shared a list of definitions for each from Wikipedia, and urged candidates to familiarise themselves with the distinctions.
- First: The domain of each attribute contains only atomic values, and the value of each attribute contains only a single value from that domain.
- Second: No non-prime attribute in the table is functionally dependent on a proper subset of any candidate key.
- Third: Every non-prime attribute is non-transitively dependent on every candidate key in the table. The attributes that do not contribute to the description of the primary key are removed from the table. In other words, no transitive dependency is allowed.
- Fourth: Every non-trivial multivalued dependency in the table is a dependency on a superkey.
- Fifth: Every non-trivial join dependency in the table is implied by the superkeys of the table.
7. What is the sixth normal form in SQL Server?
This is something of a trick question, according to DotNetSpider.com, who caution: “Friends, do not try to impress the interviewer: just skip this question.” The sixth normal form does exist, but
Sixth normal form does exist. it is primarily used in some data warehouses where benefits exceed its drawbacks, but currently, SQL does not directly support ‘6NF’.
8. Queries & Results: 2
More brainteasing SQL interview questions from Toptal, in the form of yet another Query & Results problem. Given the following tables:
sql> SELECT * FROM runners; +----+--------------+ | id | name | +----+--------------+ | 1 | John Doe | | 2 | Jane Doe | | 3 | Alice Jones | | 4 | Bobby Louis | | 5 | Lisa Romero | +----+--------------+ sql> SELECT * FROM races; +----+----------------+-----------+ | id | event | winner_id | +----+----------------+-----------+ | 1 | 100 meter dash | 2 | | 2 | 500 meter dash | 3 | | 3 | cross-country | 2 | | 4 | triathalon | NULL | +----+----------------+-----------+
What will be the result of the query below?
SELECT * FROM runners WHERE id NOT IN (SELECT winner_id FROM races)
Explain your answer, and provide an alternative version Query version that avoids the issue it exposes.
Answer: The result is an empty set. This is because, if the set being evaluated by the SQL NOT IN condition contains any values that are null, then the outer query here will return an empty set, even if there are many runner ids that match winner_ids in the races table.
Here is an alternative version of the query that avoids the issue.
SELECT * FROM runners WHERE id NOT IN (SELECT winner_id FROM races WHERE winner_id IS NOT null)
9. Write an SQL Query to find the second highest salary of a group of employees
Of the numerous to find second highest salary of Employee in SQL, Java67.com provides a simple solution. It is possible to use either SQL Join or Subquery to solve the problem. Java67’s example uses Subquery, as follows:
select MAX(Salary) from Employee WHERE Salary NOT IN (select MAX(Salary) from Employee );
10. How many locks exist in the SQL Server ?
Here is a response to this question from DotNetSpider.com. There are 6 locks, as follows:
- Bulk Update
11. Task: How to increase the speed of a stored procedure
A question from David Lozinksi: “You have been tasked with increasing the speed of a stored procedure that runs once a month, deleting approximately 25 million records of stale data from a table called “StaleWorkOrders”. Your sole job is to increase the speed at which it runs: you don’t care about any sort of logging and there’s zero transaction blocks that need to be rolled back.
You’ve made an important change. One of the SQL statements below was the original code; the other is your new code:
a) Which SQL statement was originally there? And which one did you change it to?
b) Why did you make the change?
a) DELETE FROM was the original statement which you replaced with the TRUNCATE statement.
b) TRUNCATE TABLE quickly deletes all records in a table by deallocating the data pages used by the table. It completely bypasses the transaction log. This reduces the resource overhead of logging the deletions, as well as the number of locks acquired, thus increasing performance. In addition, the TRUNCATE statement also resets any Identity Seeds that may be in the table.
12.What is SQL Profiler?
This is a tool that allows administrators to monitor different events and transaction of the SQL server. For instance, you can capture and save data about each event to a file or SQL Server table, for analysis later. This is not available with the EXPRESS edition
13. Write an SQL Query to find maximum salary in each department of an organisation.
According to Java67, it’s possible to do this by grouping all records by DeptId and then using the MAX() function to calculate maximum salary in each group or each department, as follows.
SELECT DeptID, MAX(Salary) FROM Employee GROUP BY DeptID.
Java67 advises anticipating an interviewer making the question more interesting by asking you to print the department name instead of department id. In that case, you must join Employee table with Department using foreign key DeptID. Use LEFT or RIGHT OUTER JOIN to include departments that have no employees. Here is the query
SELECT DeptName, MAX(Salary) FROM Employee e RIGHT JOIN Department d ON e.DeptId = d.DeptID GROUP BY DeptName;
In this query, RIGHT OUTER JOIN is used to retrieve the department name from the table that‘s on the right side of the JOIN clause, even if there’s no reference of dept_id on the Employee table.
14. What is the difference between an inner and outer join?
An inner join involves joining two tables where a common id/key exists in both. An outer join is the joining of two tables, but where there is no match in the second (or first).
15. What is wrong with this SQL query? Correct it so it executes properly.
SELECT Id, YEAR(BillingDate) AS BillingYear FROM Invoices WHERE BillingYear >= 2010;
The expression BillingYear in the WHERE clause is invalid. Even though it is defined as an alias in the SELECT phrase, which appears before the WHERE phrase, the logical processing order of the phrases of the statement is different from the written order. Most programmers are accustomed to code statements being processed generally top-to-bottom or left-to-right, but T-SQL processes phrases in a different order.
The correct query should be:
SELECT Id, YEAR(BillingDate) AS BillingYear FROM Invoices WHERE YEAR(BillingDate) >= 2010;
16. Write an SQL Query to check whether date passed to Query is in the specified format.
Java67 writes that the ISDATE() function checks whether the passed value is in the specified format, and will return 1(true) or 0(false). However, ISDATE() is an MSSQL function that may not work on Oracle, MySQL or any other database.
SELECT ISDATE('1/08/13') AS "MM/DD/YY";
This will return 0, because the passed date is not in correct format.
17. How do you maintain database integrity where deletions from one table will automatically cause deletions in another table?
You create a trigger that will automatically delete elements in the second table when elements from the first table are removed.
18. Data Schema questions
Here’s Toptal with another of the typically tricky SQL interview questions: Assume a schema of Emp ( Id, Name, DeptId ) , Dept ( Id, Name). If there are 10 records in the Emp table and 5 records in the Dept table, how many rows will be displayed in the result of the following SQL query, and please explain your answer:
Select * From Emp, Dept
Answer: The query will result in 50 rows as a “cartesian product” or “cross join”, which is the default whenever the ‘where’ clause is omitted.
19. Write an SQL Query to find number of employees, according to gender, with a date of birth between 01/01/1960 and 31/12/1975.
According to Java67, the answer to this question is:
SELECT COUNT(*), sex from Employees WHERE DOB BETWEEN '01/01/1960' AND '31/12/1975' GROUP BY sex;
20. List and explain the different types of JOIN clauses supported in ANSI-standard SQL.
According to Toptal, there are five JOIN clauses supported in ANSI-standard SQL. Here is the list of five clauses, with explanations of each:
- INNER JOIN (a.k.a. “simple join”): Returns all rows for which there is at least one match in BOTH tables. This is the default type of join if no specific JOIN type is specified.
- LEFT JOIN (or LEFT OUTER JOIN): Returns all rows from the left table, and the matched rows from the right table; i.e., the results will contain all records from the left table, even if the JOIN condition doesn’t find any matching records in the right table. This means that if the ON clause doesn’t match any records in the right table, the JOIN will still return a row in the result for that record in the left table, but with NULL in each column from the right table.
- RIGHT JOIN (or RIGHT OUTER JOIN): Returns all rows from the right table, and the matched rows from the left table. This is the exact opposite of a LEFT JOIN; i.e., the results will contain all records from the right table, even if the JOIN condition doesn’t find any matching records in the left table. This means that if the ON clause doesn’t match any records in the left table, the JOIN will still return a row in the result for that record in the right table, but with NULL in each column from the left table.
- FULL JOIN (or FULL OUTER JOIN): Returns all rows for which there is a match in EITHER of the tables. Conceptually, a FULL JOIN combines the effect of applying both a LEFT JOIN and a RIGHT JOIN; i.e., its result set is equivalent to performing a UNION of the results of left and right outer queries.
- CROSS JOIN: Returns all records where each row from the first table is combined with each row from the second table (i.e., returns the Cartesian product of the sets of rows from the joined tables). Note that a CROSS JOIN can either be specified using the CROSS JOIN syntax (“explicit join notation”) or (b) listing the tables in the FROM clause separated by commas without using a WHERE clause to supply join criteria (“implicit join notation”).