General discussion of SQL Topics; aimed for the novice-intermediate level Microsoft SQL Server User. Currently focuses on using SQL Server 2005.

Tuesday, March 24, 2009

What are Tables in SQL Server?

Overview

The following is a general overview of tables within a SQL Server database; it is by no means a thorough discussion of this subject. This is intended to be a primer for future discussions and covers only the basics, in no particular order.

A table is a database object containing one or more column definitions, in which data is stored row by row. You can think of it as very similar to a spreadsheet, with a little extra power.

A table is generally designed to handle relational data in an organized manner; however, SQL Server has evolved to also allow handling of data that is less structured.

A table must include the following (at a minimum):

  • A name for the table
  • A column to store the data

A table may include many other attributes/properties, such as:

  • A description of the table
  • A schema the table belongs to
  • An identity column
  • A filegroup the table belongs to
  • Check Constraints, to ensure data integrity
  • Multiple columns
  • Relationships to other tables and/or objects within the database
  • Indexes
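To give a feel for how a few of these optional attributes come together, here is a hypothetical example (the schema, table, constraint, and filegroup names are made up for illustration):

```sql
CREATE TABLE Sales.OrderItems                 -- belongs to the "Sales" schema (assumes it exists)
(
   OrderItemID INT IDENTITY(1,1) NOT NULL,    -- identity column
   Quantity INT NOT NULL,
   CONSTRAINT CK_OrderItems_Quantity
      CHECK (Quantity > 0)                    -- check constraint to ensure data integrity
) ON [PRIMARY];                               -- filegroup the table belongs to
```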

CREATE TABLE syntax

The following is the bare minimum T-SQL syntax for creating a table:

CREATE TABLE 
    [ database_name . [ schema_name ] . | schema_name . ] table_name 
    ( { <column_definition> | <computed_column_definition> 
        | <column_set_definition> }
        [ <table_constraint> ] [ ,...n ] ) 
    [ ON { partition_scheme_name ( partition_column_name ) | filegroup 
        | "default" } ] 
    [ TEXTIMAGE_ON { filegroup | "default" } ] 
    [ FILESTREAM_ON { partition_scheme_name | filegroup 
        | "default" } ]
    [ WITH ( <table_option> [ ,...n ] ) ]
[ ; ]

See the "References" section at the end of this blog for the complete syntax. If you don't fully understand the text/symbols used in this syntax please reference my February 2008 posting Understanding MSDN Syntax Conventions.

Here is a CREATE TABLE command, at its simplest, issued in T-SQL:

CREATE TABLE myTable
(myColumn TINYINT);

This is a very simple T-SQL command that will create a table called "myTable". The table will contain one column called "myColumn"; this column's data type is "TINYINT" (short for tiny integer), thus only allowing whole numbers to be entered. The numbers are limited to a range between 0 and 255.
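To see that range limit in action, a quick hypothetical test against the table above might look like this:

```sql
INSERT INTO myTable (myColumn) VALUES (255);  -- succeeds; 255 is the maximum TINYINT value
INSERT INTO myTable (myColumn) VALUES (256);  -- fails with an arithmetic overflow error
```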

Columns

A column definition will include the following (at a minimum):

  • The name of the column
  • A data type, and any attributes required to define the data type
  • Whether it allows NULL values
  • Whether it has an identity specification

A column definition may include other attributes/properties, such as:

  • A default value (or binding)
  • A description
  • A computed column specification
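For example, a column definition combining a few of these attributes, an identity specification and a default value, might look like this (the table and column names are made up for illustration):

```sql
CREATE TABLE Orders
(
   OrderID INT IDENTITY(1,1) NOT NULL,            -- identity: auto-increments starting at 1
   OrderDate DATETIME NOT NULL DEFAULT GETDATE()  -- default: current date/time when no value is supplied
);
```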

Example

As mentioned earlier, a table will include, at minimum, a column definition. In most cases you will find multiple column definitions within a table. An example customer table for a small business might include columns for CustomerID, CustomerName, BusinessName, Address, and PhoneNumber. The table might be called "Customers". To create this example table, the T-SQL command might look like this:

CREATE TABLE Customers
(
   CustomerID INT PRIMARY KEY NOT NULL,
   CustomerName VARCHAR(100) NOT NULL,
   BusinessName VARCHAR(100) NULL,
   [Address] VARCHAR(500) NULL,
   PhoneNumber VARCHAR(10) NULL
);

Notice that the column named "Address" is enclosed in square brackets ("[]"); this is because "Address" is a reserved keyword in Transact-SQL and can only be used as an object identifier when wrapped in one of the "Delimited Identifiers". There are other acceptable delimited identifiers that may be used; you may have to set the SQL Server quoted_identifier configuration, or follow other methods, to use them. Refer to Books Online (BOL) under Delimited Identifiers (see references at the end of this posting) for further information.
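As a quick sketch of the two common delimiter styles (double quotes require the quoted_identifier setting to be ON, which it is by default for most connections):

```sql
SET QUOTED_IDENTIFIER ON;

SELECT [Address] FROM Customers;  -- bracket-delimited identifier (always available)
SELECT "Address" FROM Customers;  -- quote-delimited identifier (needs QUOTED_IDENTIFIER ON)
```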

Also, you may have noticed that this table construction isn't the best there is. Logically, the "Address" column currently would contain a complete address in the field (such as: 123 Anywhere St, Anyplace, State, 12345). If we wanted to search this field for all customers in the state of California or Hawaii, we would have to search using a LIKE operator (this will be discussed further later); this can take quite some time in SQL depending on how large the table is. Also, if we want to further list our customers by city alphabetically and group them by state, then we introduce a complex T-SQL query to obtain this seemingly simple information.
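For example, finding the California and Hawaii customers out of that single Address column might require something like the following (a sketch; it assumes the state was typed out in full when the address was entered):

```sql
SELECT CustomerName, [Address]
FROM Customers
WHERE [Address] LIKE '%California%'
   OR [Address] LIKE '%Hawaii%';  -- forces SQL Server to scan the text of every row
```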

To make this better we would perform what is known as "Normalization" on our tables. Normalization will be discussed in further detail in a future blog. Right now, the grasp of what a table is, what it does, and how the data is stored is the important information to know.
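As a small preview, a hypothetical normalized version of the table might break the address into its own columns, so a state search becomes a simple equality test instead of a LIKE scan:

```sql
CREATE TABLE CustomersNormalized  -- hypothetical name, for illustration only
(
   CustomerID INT PRIMARY KEY NOT NULL,
   CustomerName VARCHAR(100) NOT NULL,
   Street VARCHAR(200) NULL,
   City VARCHAR(100) NULL,
   StateCode CHAR(2) NULL,   -- now "WHERE StateCode = 'CA'" replaces the LIKE search
   ZipCode VARCHAR(10) NULL
);
```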

Conclusion

As you can see, a table is very much a cornerstone of any database; without a table there is nowhere to store your data. A table can be very simple to set up, while it can also be very complex to maintain with a great amount of integrity and Business Intelligence rules. This is just a primer discussion; if you want to delve deeper into tables you will want to start your exploration with BOL's Understanding Tables section and follow the appropriate links to further your knowledge. You will find that tables can be very powerful if you set up the structure properly and utilize the appropriate techniques to ensure data integrity. I've already discussed some attributes of tables in earlier postings, such as my postings about Indexes. There is a wealth of knowledge out there on this subject, and it may seem like you already know most of it, but you will find there are many aspects of tables that you may not have known about. I encourage you to take a few moments and review the BOL sections mentioned here, at a minimum. You may also want to look into Temporary Tables, and for you SQL 2005+ users, look into the table data type.

References

CREATE TABLE (Transact-SQL): http://technet.microsoft.com/en-us/library/ms174979.aspx
Understanding MSDN Syntax Conventions: http://sqln.blogspot.com/2008/02/understanding-msdn-syntax-conventions.html
TechNet - Tables: http://technet.microsoft.com/en-us/library/ms189104.aspx
Understanding Tables: http://technet.microsoft.com/en-us/library/ms189905.aspx
Creating and Modifying Table Basics: http://msdn.microsoft.com/en-us/library/ms177399.aspx
table (Transact-SQL): http://msdn.microsoft.com/en-us/library/ms175010.aspx

Tuesday, January 6, 2009

New Article: Transaction Log Growth, do you need it?

Yesterday, SQL Server Central published another article of mine: "Transaction Log Growth, do you need it?". It got some good reviews, and just over 8,000 views before 9 AM; I've been told anything over 4,000 views (total) is pretty good.

Be sure to look it over when you have a chance, it may prove helpful in the future or just might provide you a little more insight about Transaction Logs and how they work with SQL Server and your databases.

Until next time, Happy Coding!!!

 

References:

Article: http://www.sqlservercentral.com/articles/Transaction+Log/65255/
Discussion: http://www.sqlservercentral.com/Forums/Topic629390-1157-1.aspx

Wednesday, December 31, 2008

Updated: Search Stored Procedures for Dependencies / References

In a discussion on the MSDN Forums I had posted my original code to search a Stored Procedure (search any object within SQL really) for specific text (http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/847b421d-332d-40f8-9649-b26fc2306920). You can view my original blog post describing this code block at: http://sqln.blogspot.com/2008/10/searching-text-of-programmable-objects.html.

Adam Haines posted a wonderful enhancement to my T-SQL code. He returned the matching text result as XML instead of text, which allows the user to click on the returned text and have the resulting text appear in a new window.

The new code, with the XML returned results, is:

DECLARE @StringToSearch NVARCHAR(MAX)
SET @StringToSearch = 'ENTER_SEARCH_TEXT_HERE'

SELECT
   [name],
   (
      SELECT OBJECT_DEFINITION(obj2.object_id) AS [text()]
      FROM sys.all_objects obj2
      WHERE obj2.object_id = obj1.object_id
      FOR XML PATH(''), TYPE
   ) AS Obj_text,
   [type] AS ObjType
FROM sys.all_objects obj1
WHERE OBJECT_DEFINITION(obj1.object_id) LIKE '%' + @StringToSearch + '%'
--Optional to limit search results....
--AND [is_ms_shipped] = 0 --Only search user & post SQL release objects
--Change above to 1 if you want to search only Microsoft-provided objects released with SQL Server
--AND [type] = 'P' --Search only Stored Procedures
--AND [type] <> 'V' --Exclude Views
--See referenced article for listing of additional object types
--http://www.sqlservercentral.com/articles/T-SQL/63471/

Thanks again for this enhancement Adam!

Until next time, Happy Coding!!!

Friday, December 12, 2008

Books Online could be better...and we, the users, already can make it better!

Yesterday, Kalen Delaney wrote a small article discussing if MS should or shouldn't build in more warning messages or taking better actions in warning users about potential problems from performing tasks/commands within SQL Server. You can read her article at: http://www.sqlmag.com/Articles/ArticleID/101033/101033.html

I agree that, to a certain extent, the DBA is responsible for the database and should know how to use the software thoroughly...so ideally the database would never be at an unknown risk of data loss. Unfortunately, SQL Server has jumped from the realm of an application to the realm of a data platform! This changes things quite a bit; there is no realistic expectation that a single DBA (or even a small team of DBAs) can possibly know the database software and all of its commands and internal workings 100%. So there is some portion of relief that Microsoft must provide...whether it is a method of warning the DBA through messages, errors, or warnings, I don't know and won't speculate on a foreseeable method. I do agree that Books Online (BOL) is considered the SQL Server documentation, and it is Microsoft's responsibility to ensure that the documentation provides all of the information (including inner workings where necessary) about commands and features that have potentially unsafe effects.

The question is: where does this responsibility end? Also, where does it fall onto the DBA's (or end-user's) shoulders to piece together all of this information and understand how they are utilizing the data platform, and what data-loss potential is created by this usage?

The first thing that comes to my mind is that Microsoft has taken a step in adding a way for the users of BOL to add their own content that can point this type of stuff out. Why is it that we DBAs aren't doing this? I hope after reading this maybe our community will take a moment or two to place a useful comment (especially warnings about potential data loss areas) in BOL for the other less knowledgeable DBAs.

Until next time, Happy Coding!

Thursday, October 23, 2008

TSQL Code: Compare Tables of Databases on Separate Servers

NOTE: This script was originally developed to compare table row counts; but, with the modifications mentioned below, it could be helpful if you have a database that is shipped to another server and often needs to be loaded (ETL) into the other database, and you are worried there may be records that aren't getting transferred properly (or at all).

The other day I came across a question on how to compare the tables in one database to the tables in another database; this concerned a migration project. This person had a database that existed on an older server running SQL 2000, and they chose to migrate their database to a new server running SQL Server 2005. They wanted to be able to visually display a comparison of the two databases to prove the conversion was successful, and also to display the comparisons of the tables. Results meeting the above requirements are shown in Figure 1.

AW_Compare_All 
Figure 1 (results of running sp_CompareTables_All)

In response to this I started to develop a TSQL script that creates a stored procedure allowing you to run it from the current database and specify the location of the original database. The script collects the tables and counts the records within each table for both databases. It displays the table names on the outside (left and right) of the results, with the record counts next to each other on the inside (next to their table names); this allows for a very simple and pleasing visual comparison. I wanted to, and did, avoid using cursors or other techniques that can potentially bog down system resources (such as the stored procs 'sp_MSForEachDB' and 'sp_MSForEachTable').

Now, this is a rough draft that I had thrown together and tested over a lunch period; so, there are some issues that can still be cleaned up on the script and it lacks the ability to detect if a table exists on one database, but not the other.

At the end of the script I'll also provide a quick comment that will allow you to change the results from displaying all tables and their row counts to display only the tables with mismatched row counts (which may be useful if you want to use this script as a method to troubleshoot databases that tend to not migrate all records).

The first problem faced is how to best access the server with the original database; while there are many options I chose to use the built-in stored procedure "sp_addlinkedserver". This procedure seems to be simple to implement and allows for seamless integration into TSQL code. Testing for this script was performed on the AdventureWorks database (developed for SQL 2005) on both test systems. The 'originating' database was on a SQL 2005 instance (although testing was briefly performed on SQL 2000 and SQL 2008 to validate compatibility). The 'originating' server is called TestServer35, the database is on an instance called Dev05; the database for both instances is called AdventureWorks2005. This information becomes important when using the sp_addlinkedserver command. I used the following TSQL code:

EXEC sp_addlinkedserver
   @server='TestServer35-5',
   @srvproduct='',
   @catalog='AdventureWorks2005',
   @provider='SQLNCLI',
   @datasrc='Server35\Dev05'

As you can see, the linked server is referenced as TestServer35-5. We will use this reference, in a four-part identifier (Server.Catalog.Schema.Table). The next obstacle is to obtain a listing of tables and their row counts. I used a script I had modified last year to perform this since this script will run both on SQL 2005 and SQL 2000 (you can view my script on SQLServerCentral.com's Script section at: http://www.sqlservercentral.com/scripts/Administration/61766/). I then take the results of this and store them into a temporary table; I also do this for the new database (on local server where this stored proc is running at).
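To illustrate, once the linked server exists, any remote table can be queried with that four-part name; for example (assuming the AdventureWorks Person.Contact table):

```sql
SELECT TOP 10 *
FROM [TestServer35-5].[AdventureWorks2005].[Person].[Contact];
```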

Then comes the simple part of joining the two temp tables into a final temp table. I chose this route because I wanted to have the two databases in separate temp tables in the event I want to work with that data; I will be working with it in my update to determine whether a table is missing from one of the databases.

Here is the TSQL code I used (remember if you want to use this you will need to change the linked server information to the correct information; as well as to create this stored proc in the appropriate database):

--Change the database name to the appropriate database
USE [AdventureWorks2005];
GO

CREATE PROCEDURE sp_CompareTables_All
AS

CREATE TABLE #tblNew
( tblName varchar(50), CountRows int )

INSERT INTO #tblNew
( tblName, CountRows )
SELECT o.name AS "Table Name", i.rowcnt AS "Row Count"
FROM sysobjects o, sysindexes i
WHERE i.id = o.id
AND indid IN (0,1)
AND xtype = 'u'
AND o.name <> 'sysdiagrams'
ORDER BY o.name

CREATE TABLE #tblOLD
( tblName varchar(50), CountRows int )

INSERT INTO #tblOLD
( tblName, CountRows )
SELECT lo.name AS "Table Name", li.rowcnt AS "Row Count"
--********
--Replace TestServer35-5 and AdventureWorks2005 below with your appropriate values
--********
FROM [TestServer35-5].[AdventureWorks2005].[dbo].[sysobjects] lo,
   [TestServer35-5].[AdventureWorks2005].[dbo].[sysindexes] li
WHERE li.id = lo.id
AND indid IN (0,1)
AND xtype = 'u'
AND lo.name <> 'sysdiagrams'
ORDER BY lo.name

CREATE TABLE #tblDiff
( OldTable varchar(50), OldRowCount int, NewRowCount int, NewTableName varchar(50) )

INSERT INTO #tblDiff
( OldTable, OldRowCount, NewRowCount, NewTableName )
SELECT ol.tblName, ol.CountRows, nw.CountRows, nw.tblName
FROM #tblNew nw
JOIN #tblOLD ol
ON (ol.tblName = nw.tblName AND ol.CountRows = nw.CountRows)

SELECT * FROM #tblDiff

DROP TABLE #tblNew
DROP TABLE #tblOLD
DROP TABLE #tblDiff

 

You simply execute the code with the following TSQL:

EXECUTE sp_CompareTables_All
 

The results of this script are shown in Figure 1 (above).

Now, this is great if you want to have a list that you can go through yourself to verify each table matches in row counts. But, what if that database has 1000 or more tables? What if you are just, simply put, lazy? Why not utilize SQL Server to process this information for you?

Well, I sure enough did just that. With a very small modification to this script you can easily have it only display any tables that don't match up in record counts.

All you have to do is change the INSERT INTO #tblDiff block's "ON" statement to join if the CountRows are NOT equal. The following is the modified block of code; the remaining stored procedure remains the same:

ON (ol.tblName = nw.tblName AND ol.CountRows <> nw.CountRows)
 

I also renamed the stored procedure from "sp_CompareTables_All" to "sp_CompareTables_Diff"; this is optional, but it clarifies which stored proc is being used.

To get some results I made a few modifications to the AdventureWorks2005 database: I added a couple of rows to one table, and removed some rows from two others. The results of the stored proc showing only the differing tables are shown in Figure 2.

AW_Compare_Diff
Figure 2 (results of running sp_CompareTables_Diff)

As you can see the ability to change this script to show all tables or only different tables is very simple. Even setting up this script is simple, where the hardest part of the whole thing is adding a linked server (which is fairly simple also).

In a future post I'll revisit this script and include the ability to display tables that exist on one database, but not in the other. Be sure to check back for this update.
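As a rough sketch of where that update is headed (not the final version), the matching INSERT could use a FULL OUTER JOIN so a table missing from either database still shows up, with NULLs on the missing side:

```sql
SELECT ol.tblName, ol.CountRows, nw.CountRows, nw.tblName
FROM #tblNew nw
FULL OUTER JOIN #tblOLD ol
   ON ol.tblName = nw.tblName
WHERE ol.tblName IS NULL   -- table exists only in the new database
   OR nw.tblName IS NULL;  -- table exists only in the old database
```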

Until next time...Happy Coding!!

Friday, October 17, 2008

Searching text of programmable objects...

This is a follow-up posting to my article "Finding and Listing ALL Procedures, Functions, and Views in SQL Server 2005" I wrote for SQLServerCentral.com (http://www.sqlservercentral.com/articles/T-SQL/63471/). This will work with both SQL Server 2005 and SQL Server 2008.

I've found a few instances where I would've liked to be able to search through a Stored Procedure or a View for specific text in the definition; specifically, to check whether there was a reference to another Stored Procedure or View. Originally, to solve this problem I had just queried the syscomments legacy view for the search string, joining with sysobjects; while this was effective, it has its limitations...for instance, it only searches objects created after the release of SQL Server. So, what happens if you want to search for a specific string in ALL of the programmable objects? Well, then we fall back to our trusty view sys.all_objects! We can join it with sys.all_sql_modules and get almost all of what we need...almost.

Now, we can do a clever query that will join sys.all_objects and sys.all_sql_modules. Something like:

DECLARE @SearchString NVARCHAR(MAX)
SET @SearchString = 'ENTER_SEARCH_TEXT_HERE'
SELECT [ao].[name], [asm].[definition]
FROM sys.all_objects ao
JOIN sys.all_sql_modules asm
ON [ao].[object_id] = [asm].[object_id]
WHERE [asm].[definition] LIKE '%' + @SearchString + '%'
--AND [ao].[is_ms_shipped] = 0 --Only search user & post SQL release objects, or
--Change the above value to 1 to only include objects provided by Microsoft with the release of SQL Server
--Optional  to limit search results....
--AND [ao].[type] = 'P' --Search only Stored Procedures
--AND [ao].[type] <> 'V' --Exclude Views
--See referenced article for listing of additional object types

This would do the job for the most part; the only missing items would be constraints and rules. And how many times do you really need to search a constraint for a specific string of text?

Now, the execution plan shows that this can be a bit of a resource hog. Here's the estimated execution plan I get for this:

Plan_For_Defintion_Search_by_Joins

This Execution Plan is OK; there are a lot of Clustered Index Scans and Compute Scalars. You might be able to improve upon this with some ingenious indexing, hints, and such...but, again...why bother? You'd have to maintain and monitor yet another query plan. So, I'm not too keen on this plan myself, and I don't like that this query still doesn't do everything...I want everything or nothing (start my pouting and tantrum).

As I said, I want it all. The problem for me here is that there is that 1 in 1,000,000 chance I'll need to search a constraint, and then I'll have to either modify my code or discover a new method...why bother doing it again later if we can just make it do what we want now? Plus, and this may be more important to me, I want something that is easier to read and understand at a later time and uses fewer resources. Time and resources are the name of the game we play as DBAs!

How do we solve this? Can it be done? It sure can! Enter the sacred "OBJECT_DEFINITION" function! This little function helps us out a lot by eliminating the need to manually join tables, and the returned result is exactly the column (the definition column) that we want! Let's take a look at the same search using this function instead of joins:

DECLARE @SearchString NVARCHAR(MAX)
SET @SearchString = 'ENTER_SEARCH_TEXT_HERE'
SELECT [name], OBJECT_DEFINITION([object_id])
FROM sys.all_objects
WHERE OBJECT_DEFINITION([object_id]) LIKE '%' + @SearchString + '%'
--Optional to limit search results....
--AND [is_ms_shipped] = 0 --Only search user & post SQL release objects
--Change the above value to 1 to only include objects provided by Microsoft with the release of SQL Server
--AND [type] = 'P' --Search only Stored Procedures
--AND [type] <> 'V' --Exclude Views
--See referenced article for listing of additional object types

As you can see, this is a lot cleaner to read. We can easily tell what is being queried and how the query results will be returned to us. Not much to look at here; we query the sys.all_objects view to find the proper object and then tell SQL that we want to view the object's definition...simple enough.

Now, let's look at the execution plan for this query. We know the query is straightforward; I'm guessing the execution plan will be also! Let's see:

Plan_For_Defintion_Search_by_Object_Definition

Well, it definitely is a smaller execution plan; almost half the plan of the original query! Not bad at all. I'd venture a guess that this will be much less resource-intensive on our machine, and that means less time! I like less time!

Now we can see that using SQL Server's "OBJECT_DEFINITION" function improves our ability to search the definitions. I'm sure you can see from the above that, when possible, you should use a SQL built-in function; these are here to help us. As they say...'why reinvent the wheel?' I'd say for sure that this OBJECT_DEFINITION function is already optimized to run in SQL, and it's a fairly safe bet that we DBAs wouldn't be able to optimize our JOIN statement to this level...and if you happen to come even close to the same optimization, I'd be willing to wager that you spent more time than it was worth to get there. With that said, it brings me back to my original statement...why bother?

Keep looking for those built-in functions, you'll find there are many of them inside SQL Server that will not only make your DBA life easier and quicker, but also more fun!

Until next time, Happy Coding!!!

Friday, August 22, 2008

VS/SQL 2008 Developers need more space!

I’m not just talking about needing a larger office (although, that couldn’t hurt). I’m talking about hard drive space!

With the release of SQL Server 2008; I’ve officially moved over my trusty development machine to the latest and greatest MS technology, Visual Studio 2008 Professional Edition and SQL Server 2008 Developer Edition.

Before you do the same, make sure you have plenty of hard drive space[1]. If you already have Visual Studio 2008 installed and then try to install SQL 2008, you'll get a friendly little message stating that you MUST have VS 2008 Service Pack 1 installed. I'm not quite sure why you MUST have VS 2008 SP1 installed to run SQL Server 2008 and VS 2008 on the same machine; but if I do find a reason (other than guesses about enabling some sort of feature or the ability for the two applications to see each other), I'll post an update. So, now you know the approximate hard drive space it will take to install these applications.

Here is a simple break down of the hard drive space requirement for each product I installed:

  • Visual Studio 2008 (Professional Edition): 2.2 GB
    Requirements: http://msdn.microsoft.com/en-us/vs2008/products/bb894726.aspx
  • Visual Studio 2008 Service Pack 1: 3.8 GB (plus 600 MB on the hard drive where VS 2008 is installed)
    Requirements: http://download.microsoft.com/download/A/2/8/A2807F78-C861-4B66-9B31-9205C3F22252/VS2008SP1Readme.htm
  • SQL Server 2008 (Developer Edition): 2.0 GB (required even if installing components to other hard drives)
    Requirements: http://msdn.microsoft.com/en-us/library/ms143506.aspx

As you can see, VS 2008 SP 1 is no small update. Also note that these hard drive space requirements are based on installing all features/options of each application and the final installation space used will typically be smaller than the requirements specifications. Requirements typically include space required for temporary files during installation of the product.

This comes to a total of around 8 GB of space during installation, and around 6.5 GB upon completion of installation. This, of course, does not include any project files, database files or add-ons.

What if hard drive space is a premium and you just want to test out these new applications? How do you use less space? The simplest way is to use the Express editions of each product and only install the languages/features you absolutely need or want to use/learn. A language such as C++ being installed for a developer who only knows and uses VB would seem pointless, unless that developer is intending to learn and develop with C++.

Is there a way to bypass the requirement to install VS 2008 SP 1? Well, I haven’t tried it myself; but, conceptually, if you install SQL 2008 first, then install VS 2008 you won’t get any messages or requirements during either product installation to have SP 1 installed. Now, that doesn’t mean that VS and SQL will play nice with each other. It also doesn’t mean that you won’t get some sort of message or error at a later time stating that you need to install SP 1. As mentioned earlier, I don’t know why SQL 2008 installation requires VS 2008 SP 1 to be installed with a machine that has VS 2008; I’d imagine there is a good reason though. Try at your own risk.

Until next time, Happy Coding!


[1] The installation hard drive size requirement is based on my personal experience and may vary per installation.