Hive SQL (and T-SQL): How To Write A Left Anti Join and WhysteemCreated with Sketch.

in #sql4 years ago

Previously, we've look at the join functionality of LEFT JOIN. If we recall from that lesson, we learned that a LEFT JOIN will join records that match and don't match on the condition specified. Since we're using the same example over and over again for easily remembering the result, we'll remember that two of our records in Table1 matched with Table2, but two results didn't match, so we saw nulls as the result. When there isn't a match (because the records don't exist), we get a null as the return. What if we only want to get the records in Table1 that don't match in Table2? In the video, SQL Basics: How To Use A LEFT ANTI JOIN and Why, we look at accomplishing this challenge using the LEFT ANTI JOIN. Since not all SQL languages support LEFT ANTI JOIN, I show this syntax two different ways, one of which is generally supported.

Some questions that are answered in the video:

  • Note the tables that we're using and what values are identical based on the column names and what values differ based on the column names. We'll be using these tables throughout these videos. As a note on aliasing tables - the first table always gets a "t1" while the second table always gets a "t2" and so on and so forth (the 5th table would get a "t5"). Finally, consider that some SQL languages may not support the JOINs that we discuss.
  • As a quick reminder, not all SQL languages may support the exact syntax shown.
  • In first our example, we use a LEFT ANTI JOIN without directly calling it (like we could do in some SQL languages), but by using a null filter on the LEFT JOIN condition. What is the purpose of specifying nulls in this example?
  • To accomplish the same results in the first example, how can we LEFT ANTI JOIN Table1 with Table2? How does this save us time in development?
  • What would happen if we used a LEFT ANTI JOIN from Table1 on Table2 for the column Letter? What would our results be? Why?
As we see, a big part of what we're doing here is filtering out records when we have two data sets with similarities and differences. If you've ever completed an assignment that involved "compare" and "contrast" then you've done something similar in functionality as you had to identify the similarities and the differences, which is what we're doing here with data.

Automating ETL
For mastering data transformation from one form to another form, check out the highest-rated Automating ETL course on Udemy. For a coupon to the course, check out the trailer video on the channel SQL In Six Minutes.

As for a real world example with LEFT ANTI JOINs, consider a health example where we have populations with different traits, but all that have the same disease. We could use a LEFT ANTI JOIN to filter out these differences. This would possibly allow us to further our analysis (may or may not be useful). The key with any data set where we use a LEFT ANTI JOIN is that must have columns that will be used in our join condition where we do the filtering. In other words, we would filter on trait or multiple traits depending on how we broke down our data in the example. The same applies if we were to use LEFT ANTI JOINs with sales data, housing data, etc. We must have columns where we'll apply our filtering condition.

For more SQL lessons, you can continue to watch the SQL Basics series.

Are you looking for tech consultants that can assist with design and development? From building custom applications to working with existing infrastructure that's causing you trouble, we can get you connected to consultants who can assist. You can contact for assistance.

SQL In Six Minutes | Automating ETL | T-SQL In 2 Hours | Consumer Guide To Digital Security