General discussion of SQL Topics; aimed for the novice-intermediate level Microsoft SQL Server User. Currently focuses on using SQL Server 2005.

Showing posts with label SSMSE. Show all posts
Showing posts with label SSMSE. Show all posts

Tuesday, May 6, 2008

CREATE INDEX (Transact-SQL)

Summary:

This covers the syntax and some examples on how to create an Index for a table. I’ll finish this blog entry with an alternative method for creating an index using SQL Server Management Studio (SSMS).

Syntax:

The following is the main syntax for CREATE INDEX from Books Online (BOL). You can view the entire syntax by visiting the referenced link.
Reference: http://msdn.microsoft.com/en-us/library/ms188783.aspx
NOTE: If you do not understand how to read this syntax please review my blog entry “Understanding MSDN Syntax Conventions

CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name

    ON <object> ( column [ ASC | DESC ] [ ,...n ] )

    [ INCLUDE ( column_name [ ,...n ] ) ]

    [ WITH ( <relational_index_option> [ ,...n ] ) ]

    [ ON { partition_scheme_name ( column_name )

         | filegroup_name

         | default

         }

    ]

[ ; ]

UNIQUE – The index key used may not be duplicated in another row.

CLUSTERED | NONCLUSTERED – When using CLUSTERED, the logical order of the key values determines the physical order of the rows in the table. You may only have a single clustered index per table. CLUSTERED indexes should be created before any NONCLUSTERED indexes. In cases where the CLUSTERED index is created after the NONCLUSTERED index, the NONCLUSTERED indexes will be rebuilt.
NOTE: CLUSTERED is the default, if you OMIT the [CLUSTERED | NONCLUSTERED] argument then SQL will first look for a CLUSTERED index, if not found it will create the index as CLUSTERED; if it CLUSTERED already exists then it will create the index as NONCLUSTERED.

index_name – Gives your index a name. A common practice is to prefix the name with “idx_” or “ix”. An example of an index for Last Names of a customer table might be called “idx_Customers_Last_Names” or “ix_Customers_Last_Names”.

<object> - Name of the table the index is being created for. This can be up to a four part name, such as Servername.DatabaseName.Schema.TableName, as with other commands you do not always have to fully qualify (type all four parts) the <object>. You only need to be able to determine the table, without causing confusion. Example if you have two tables called Customers, then you’d at minimum need to qualify the table using the SCHEMA_NAME; such as Colorado.Cusotmers and California.Customers.

column_name [ASC | DESC] – Specify the column, or columns, to make the index for. There is no minimum or limit to the number of columns you can specify. Typically it is suggested, for CLUSTERED indexes, to use only the columns that can logically be organized; such as the First_Name and Last_Name columns. In NONCLUSTERED indexes you usually want to add ‘helper’ columns; such as the Street_Address and Phone_Number. See my blog entry “Introduction to Indexes” for additional details in choosing the appropriate columns for indexing.

<relational_index_option> - You can specify additional options with this argument, this goes beyond the scope of this blog entry. I might cover this in a later entry. For now, if you want specific details as to what options you can use with this argument and how to use the argument then review the BOL syntax command at the referenced location above.

partition_scheme_name (column_name) | filegroup_name | default - You can specify the partition scheme and columns to include in your index. This goes beyond the scope of this blog entry. I might cover this in a later entry. For now, if you want specific details as to what options you can use with this argument and how to use the argument then review the BOL syntax command at the referenced location above.

Simple Terminology:

As you can see with just a little bit of the Syntax I’ve posted, this can be quite a complicated T-SQL command. Yet, it will be among the most commonly used throughout the creation and life-span of your tables. You’ll constantly find yourself tweaking your indexes as your needs change and the data defined within the database changes. You can think of indexes as a child to your table, as with all children they will grow in complication and evolve as their experiences grow. Indexes among a table can have the same phases of ‘life’ occur also.

As with the Customer’s table example you might originally only be storing the customer’s first name, last name, street address, city, state, zip. So you may have a CLUSTERED index on the last name, then first name columns. Maybe even create a NONCLUSTERED index for the street address.

Now, let’s say a couple of years later you find that you now want to store the customer’s phone number, fax number, maybe mailing lists columns with an opt-in designator for your mailing lists. Then you decide it would be nice to just look up customer’s by their phone numbers, or find the customer’s who’ve ‘opted-in’ to certain mailing lists. You might then create additional NONCLUSTERED indexes to make these searches more efficient. Especially the Mailing Lists opt-in columns (assuming you have 100s of mailing lists…following Normalization rules would mean this should be in a separate table; but for this example it is in the Customer’s table).

Now, let’s say a year later you decide to Normalize your Customer’s table and separate the Mailing Lists columns into a “Mailing_Lists” table. Obviously the indexes for the Mailing Lists won’t be needed in the Customer’s Table, thus you’d drop those indexes; and most likely you would’ve created the appropriate indexes in the “Mailing_Lists” table when you created the table.

As you can see, the indexes can have different reasons to be tweaked. I find most commonly I will look into tweaking the indexes when I have large queries running that are taking up resources. I can usually find an index that could be added or modified that can help to improve the efficiency of the search results being returned. There are many different methods and ways to determine when to use an index and how to optimize your indexes; I’d suggest trial and error (with test systems only) as a first option. I’d also suggest reading up on optimizing queries and/or SQL Server performance (in that order). Queries are what drive your data, what gives you your results.

It’s usually a good idea to be in the habit of obtaining performance information, especially in large databases and periodically review the usage of your indexes and adjust them as appropriate. There is no perfect formula, but there are many good methods and discussions on how to achieve the best performance. Always be willing to read and try to understand your options; and when possible spend time testing to see how things are affected by your changes. What might look good today, could end up causing problems you won’t see until a few days have passed…this is why I must stress…TEST, TEST, TEST!

Example Syntax:

The following will create a UNIQUE CLUSTERED index on the Customer’s table using the Last Name and First Name columns (notice the order of names is Last then First because of logical searches will typically be performed on the last name, and then the results would be sorted by the first name):

USE myDemoTable;

GO

CREATE UNIQUE CLUSTERED INDEX idx_Customers_Names

ON Customers (Last_Name, First_Name);

GO

The following example will create a NONCLUSTERED index on the Customers table using the Street Address column:

USE myDemoTable;

GO

CREATE NONCLUSTERED INDEX ix_Customers_Addresses

ON Customers (Street_Address);

GO

The following example creates a UNIQUE NONCLUSTERED index on the Customers table using the customer’s phone number column. This will ensure that none of our customers have a duplicate phone number as an existing customer already has:

USE myDemoTable;

GO

CREATE UNIQUE NONCLUSTERED INDEX idx_Customers_Phone_Numbers

ON Customers (Phone_Number);

GO

If you were to attempt to enter a new customer and use a phone number that already exists with another customer you will get a “Msg 2601, Level 14” error code that states you cannot insert a duplicate key.

Also, note that in the second example I used the prefix “ix_” and the other examples I used the prefix “idx_”. First, I used the “idx_” prefix because for my personal uses this means it is a UNIQUE index; thus anytime I see “idx_SOMETHING” I know it is a UNIQUE index and will not allow multiple keys. I use “ix_” to mean that it is NOT unique and is NONCLUSTERED.

Remember that CLUSTERED index is the default index type; however, I strongly recommend stating the type of index in every syntax command for two reasons. First, distinction can be easily made when reviewing the syntax at a later time. Second, just because CLUSTERED is the default right now does not guarantee it will be in future SQL Server releases. The less you leave to be interpreted the more compatible you can make your code for future releases (and for backwards compatibility in many cases).

Using SSMS to create your indexes:

You can create indexes within SSMS in several places. The more common areas to create indexes are: Database Engine Tuning Advisor, the Table Designer, and in Database Diagrams, as well as in Object Explorer.

The easiest method, in my opinion, is to create a new index using Object Explorer. In object explorer you will want to navigate to the table you want to create your index on. Expand the table by clicking on the plus sign to the immediate left of the table icon to show the folders containing the objects for that table (Columns, Keys, Constraints, etc). You will right-click on the folder labeled “indexes” and select “New Index…”, this will bring up a new window called “New Index”. Here you can name your index, add the columns for the index and choose many options to go with the creation of your index.

If the “New Index…” is grayed out when you right-click the “indexes” folder then this means you have the table opened in “Design” mode. If this is the case, you can either close the “Table Designer” window and then access the “New Index…” or you can right-click anywhere in the “Table Designer” and select “Indexes/Keys…”. This will bring up a slightly different window, but just as easy to follow to create your indexes.

To modify or delete an index you have two simple methods. If in Object Explorer you can select the Index name under the table’s “Index” folder and right-click the index name and select “Delete”. Otherwise, when in “Table Designer” you can right-click anywhere and select “Indexes/Keys…”, in the resulting window you will highlight the index name you wish to delete. You will then click on “Delete”. CAUTION: There is NO confirmation or UNDO in this window; once that button has been clicked you have removed your index. You cannot CANCEL out of the window either. So make sure this is what you want to do before clicking that button!

Conclusion:

Indexes are helpful, simple to create and very powerful in making your queries and database operate at a very efficient level. Anyone can quickly learn to create indexes, modify indexes, and to drop (delete) indexes. Most people will spend a fair amount of time reading about indexes when first learning about them, this is because they are so versatile in usage and can provide such a powerful result when leveraged correctly.

I suggest at minimum to try to understand how indexes are determined and how to optimize them. These are key aspects to indexes that can make the most difference. I also suggest for you to schedule in a regular review period for indexes on your most heavily accessed tables and queries. This doesn’t need to be daily, weekly, or even monthly; but, it should be done over some periodic time because your data needs and accessing will evolve as your database evolves.

Until next time, happy coding!

Thursday, February 21, 2008

Getting Started – SQL Server Overview

Overview:

Now that we have covered the basics of T-SQL and how to get the information you will need to make your queries, I'm going to focus on getting familiar with SQL. This blog will briefly cover an overview of SQL Server 2005 and some of the various tools/utilities you can use with it.

SQL Server 2005 – The Platform:

Believe it or not, SQL Server 2005 is NOT an application (like Visual Studio, Microsoft Office, etc); it's an actual platform. What would make this a platform vs. being an application? This is because SQL Server 2005 is a layer (a data layer in particular) within the Microsoft Operating System. When we are done with this blog you'll understand this concept much better; for now, you may want to look at the MSDN website: SQL Server Overview. This will show you that SQL Server offers different components that make up the entire SQL Server Platform; those components are: Integration Services, Database Engine, Reporting Services, Analysis Services, Service Broker, Full-Text Search, Replication, and Notification Services.

SQL Server Components:

Each of these components have different functions it is intended to perform; the basic idea behind having these services is that you can leverage data in almost any fashion you see best fit. Examples of leveraging could be:

  • To import/export data to/from other databases (including, Access, Oracle, MySQL, etc)
  • To generate live-time reports
  • To achieve live-time notifications of events or data milestones
  • To hold data that is considered unorganized and get useful information from within this data

This is only some of the uses you can make of data and SQL Server. You'll find that nearly any way of storing, retrieving and/or manipulating the data may be performed. It all boils down the knowing what component you can get the most leverage using and how to present that data to the user and/or to the other components in a meaningful way.

How data is gathered, retrieved, and manipulated:

This is only the tip of the iceberg. With the advances made today you'll find that data is an evolving volatile market that is important to almost everyone; if you think about it data runs most things. Your bank account uses data to know how much money you have, you've used, and where you got it from and spent it at. When you shop the grocery stores keep data to know what is being sold, sitting on the shelves, and they will use this data to know what items to keep in stock that target your local area. Ever notice that some items you find in the store near your home aren't carried in the same store that's located a couple of states away? This is usually because they have collected data that says your area likes that item and the other area may not have as high of sales for it.

How data is accessed and stored:

So, now that you have a general understanding that SQL Server is way more than just an application and you can do immense of data storage and extract almost any information at any given time; the question then comes to how can you access the data and/or get more leverage.

This answer can be quite complex; it generally comes down to what you want to do with the data and how you want to do it. The common methods are to use an application to access the methods, or to create an application that can present the data, and capture the data, in a unique method suited best for whatever audience is targeted.

A common used application to create databases, store data, view data, generate reports, and/or manipulate data is by using built-in tools designed by Microsoft. The most common tool used is SQL Server Management Studio (SSMS). This tool provides a seamlessly integrated solution to access any databases stored within SQL Server, and to create any database to be stored in SQL Server. You can also write direct T-SQL statements, stored procedures, and perform a large number of tasks (such as backing up the database, restoring the database, monitoring the activity on an attached database).

Another commonly used application is Visual Studio (VS). This is a development software application used to develop custom software solutions that can interact with the SQL Server or any attached database within the SQL Server platform. This can be done with only using the wizards and controls within the VS environment to create a simple access solution to the data; however, you'll find that ONLY using the built-in wizards and controls will lack many features you would want (such as the ability to search all the records for specific words, the ability to mass delete records, backup the database, etc). If you wanted to have these features, then you would use the VS environment with custom coding to access the data and to present the data to the end-user. There are many books dedicated to doing these types of things, and go beyond the basis of this blog.

How SQL Server is configured and/or monitored:

SQL Server comes with utilities and tools that can help to configure it and to monitor it. A common tool used for configuring SQL Server is the SQL Server Surface Area Configuration tool, sometimes in conjunction with the SSMS (see above) tool; but, it doesn't stop there. Another useful tool included with SQL Server is the SQL Server Configuration Manager. These types of tools allow you to tell SQL Server where it is accessed from, how the database is stored, and many other options.

When it comes to monitoring the database platform you will find that there are many tools/utilities available for this too. SQL Server comes pre-bundled with: SQL Server Profiler and Database Engine Tuning Advisor. Also if you go to MSDN's SQL Server 2005 Tools and Utilities website you will find other tools/utilities such as: RML Utilities and SQL Server 2005 Best Practices Analyzer.

Conclusion:

As you probably know by now, SQL Server is far beyond any application and includes so many components, tools, and utilities that you could spend years just learning how to use them and to capture the information that you want. You probably have even started to realize that SQL Server has tools for its components!

It's well worth to mention that not all tools/utilities will work on all editions of SQL Server; however, Microsoft has gone through great lengths to ensure that SQL Server can be used by just about anyone with a data need, from the single user to the multi-world enterprises. Microsoft has even gone to the extent of making it's commonly used tools available for use with its "free" (at time of this blog) express edition. Sometimes the tools may have slightly modified names (such as, SQL Server Management Studio Express) and there are some options that aren't available for the lower end SQL Server Platforms (such as 64-bit support, and remote login <internet> access), but in most cases you will find that if you are not concerned about those areas or any other affected components (such as reporting services, analysis services) you can get by with using the Express edition.

Even if the express edition won't work for your needs, it's usually worth the time to download it and test out some basic uses to see if its bigger brothers can address those needs. Microsoft can even offer evaluation and development editions that will provide great means of determine the usefulness of the product.

Most of all, don't forget, there are tons and tons of information available for free to help you along the way; when those aren't quite cutting it for you any longer or you don't have access to the free content information you will find tons and tons of books that cover just about all topics you can think of and more!

Additional Resources:

SQL Server 2005 Tools and Utilities on MSDN: http://msdn2.microsoft.com/en-us/sqlserver/bb671105.aspx
SQL Server Overview: http://msdn2.microsoft.com/en-us/library/ms166352.aspx