Custom Search
My Zimbio

Tuesday, April 7, 2009

JOINS
Everyone writes articles on the definitions of joins and examples of each.

Here are mine.

Join (simple join)
A simple join is when data in table1 and data in table2 match on the column being joined to.
Syntax Example:
Select
*
from
Employee a
Join
Employee_Salary b
on
a.EmployeeId = b.EmployeeId

Left Join
Get all of the data in table1 that has a like value in table2.
Select
*
from
Employee a
Left Join
Employee_Salary b
on
a.EmployeeId = b.EmployeeId


Left Outer Join
Get all of the data in table1 regardless if it has a match in table2.
Select
*
from
Employee a
Left outer Join
Employee_Salary b
on
a.EmployeeId = b.EmployeeId

Right Join
Get all of the data in table2 that has a like value in table1
Select
*
from
Employee a
Right Join
Employee_Salary b
on
a.EmployeeId = b.EmployeeId

Right Outer Join
Get all of the data in table2 regardless if it has a match in table1.
Select
*
from
Employee a
Right Outer Join
Employee_Salary b
on
a.EmployeeId = b.EmployeeId

Full outer join
Get all of the data in table1 and table2 regardless of matching data.
Select
*
from
Employee a
Full Outer Join
Employee_Salary b
on
a.EmployeeId = b.EmployeeId

Joins can make or break your query, one wrong join and you could end up with invalid data.

Monday, April 6, 2009

SQL DATES

Dates to me are one of the most annoying items within SQL. Getting dates, performing date validation checks, return sets based on dates, converting dates, transforming dates. I find this can be a real pain at times.

Every SQL dba knows the
Select getdate() - will return the current system date.

Select getdate()-1 will return the current date - 1 day

Select getdate()+1 will return the current date + 1 day
these are pretty simple.

for the most part dates are simple to work with, but they are just annoying when you need to convert them. My biggest peeve is working in a database that has columns that are datetime columns and other columns that are a different datatype but are date columns like Int for example. Or someone has created a date column as a char(8) or something of that nature. I find it frustrating when the date columns take on multiple data types. Even in circumstances where they could of been consistent.

select getdate() AS CurrentDate,
'Q'+DATENAME(quarter, convert(char, getdate())) As Quarter,
DATENAME(year, convert(char, getdate())) As Year,
DATENAME(month, convert(char, getdate())) as Month,
DATENAME(Dayofyear, convert(char, getdate())) as DayofYear,
DATENAME(day, convert(char, getdate())) as Day,
DATENAME(week, convert(char, getdate())) as Week,
DATENAME(WEEK, GETDATE()) AS CurrentWeek,
DATENAME(MONTH, GETDATE()) AS CurrentMonth

SELECT convert(varchar,DATEADD(mm, DATEDIFF(mm,0,getdate())-25, 0),112) as [First Day of Previous Month 24 months ago]

more to come....

queries for reporting

When I create sql based reports, I usually start with a mock report and along with the end-users I review the criteria, the output and the sorting options. It is important to understand in the very beginning how much data will be generated and the frequency of the report. It is also good to determine the amount of storage may be required for the report generation and archiving. I then will set to work on designing the initial query. Usually I will create an all encompassing query. I try to think futuristically, what will they possibly want to see a few weeks, months, years from now. If it is not too much on query performance I will try to add these extra fields in my initial code and return set. Then I will start to create the report based on the criteria, parameters and business rules that were documented. Once I finish the initial query and have a sample return set, I will sit down with the end-users or business analysts again to determine the delivery mechnanism.

query optimization

I truly believe that no query should be longer than 32 seconds to return data. If it takes longer it should be checked for performance optimization. Of course there are some exceptions to this but for the most part, most people do not like to wait longer than 30 seconds to retrieve data. They will move on to another site or try to get their information a different way if they are forced to wait extended period of time for data elements to return.

I first try to limit my query to the simplest return set if possible. I try not to do too many sorts, calculations, aggregations, etc in my initial return set. I try to stay away from too many where clauses. For example if I am trying to retrieve data within a date range and I want to do more with the data after I get the initial subset, I will try to break it into two queries. I also try to break my procedures down whenever possible so that they can be ran from start and stop points. What I mean here is that if I am doing a lot of data manipulation for a report. I will first grab a subset of data, then I will do another subset, then another, until i end up with a final output set of data. This way if there is a problem, I don't necessarily have to run everything from the beginning in order to try and diagnose an issue. I try to split the data in like items. Like select information for a particular department within a date range. (I put this return set into a temp table) then from there, I may create another subset that joins to another large table but only has to search through the smaller subset to get the data back. Then I will put the calculations and transforming into another temp table and then I will finish all the prettiness into a final subset of data. I find this works for me and it is really fast if you need to find data issues quickly. You don't have the headache of reading through excessive amounts of code to find a small tiny data issue, or code issue. I sometimes will right the big procedure as one huge select then I will actually go back and break it up especially if I think it will increase query and return set peformance. I usually always leave the end user parameters for sorting to the final return set. I am not sure if all end users really know what they are looking for.

Sunday, April 5, 2009

Database Basics

Simple every day select statements

Select count(*) from table
Select getdate()
Select "titlevalue"

Select * from table
Select top 1 * from table
Select max(column) from table
Select min(column) from table