General discussion of SQL Topics; aimed for the novice-intermediate level Microsoft SQL Server User. Currently focuses on using SQL Server 2005.

Showing posts with label Recovery. Show all posts
Showing posts with label Recovery. Show all posts

Tuesday, July 1, 2008

Concept: Finding the Right Order...

The other day I was helping out a friend of mine with a space issue on their server, and one of the topics I discussed was the importance of find the best order to perform the operations in to utilize the least amount of extra hard drive space. While discussing this it came to mind that often times while plugging away with SQL we can get so intent on getting the job done (often times in the first way that works, while this is not necessarily the best way), that we might take for granted all of that hard drive space and server power available to us!

With just a little bit of forethought, planning, and understanding what we are trying to accomplish, we can actually use less disk space...and we could also potentially improve the performance we get out of SQL Server! I don't know about you, but getting better performance always ranks high on my list of desired accomplishments. I think this can be best demonstrated with an example....

For simplicity reasons I will use a database with a single table that each record being inserted will equal 1 MB of hard drive space. Thus, 1,000,000 records in our "simple database" will result in a database size of 1,000,000 MB. Now, let's say that on average this database grows by about 100,000 records per year. This would mean that with 1,000,000 records this database has been used for about 10 years. Now, assume there is a decent DR in place and the transaction log is kept at a fairly decent size of maybe 1,000 MB to accommodate for a months worth of data before the FULL backup operation. Simple enough, right?

So, 10 years has passed and the boss walks in and says they just merged with another company that has an identical database setup and stats with only different data and the two databases will be merged (since this is hypothetical, we will pretend that a merge of the data can be done without any issues and it would simply double our database size). Also, the boss says that they are archiving 5 years worth of data from both databases to match the business rules for hard copy storage on site and archiving. And lastly, the boss says "...because you don't look busy..." that they only want to have enough space in the database to allow for 2 years of data to be added and that they need the extra hard drive space available for other needs.

From a first glance you might say not a problem. We'll just merge the two databases, take out 5 years worth of data, perform a FULL backup, shrink the transaction log (leaving enough free space for a month worth of data), and then shrink the database down (leaving enough room for 2 years worth of data to be added later). I'm going to ignore the transaction log size in this analysis because there will typically be one out come of the size regardless of the steps taken; and this will differ based on the database recovery mode and method taken to perform these steps; and to analyze each possibility is beyond the scope and purpose of this blog. Now, let's look at how this might look on our system after each process...

First step in the above scenario would be to merge the two databases (remember that since the databases are originally from different companies, they are physically stored on separate hard drives).

HD Space Used: 2,000,000 MB
SQL Record Count: 2,000,000 records
SQL Space Used: 2,000,000 MB

Second step would be to take out 5 years worth of data.

HD Space Used: 2,000,000 MB
SQL Record Count: 1,000,000 records
SQL Space Used: 1,000,000 MB

Third step would be to perform a FULL backup (this is based on a FULL recovery model).

HD Space Used: 2,000,000 MB
SQL Record Count: 1,000,000 records
SQL Space Used: 1,000,000 MB

Fourth and Fifth steps to shrink the log and database.

HD Space Used: 1,200,000 MB
SQL Record Count: 1,000,000 records
SQL Space Used: 1,200,000 MB

 

From the above we can see that the maximum HD space being used would be 2,000,000 MB. Which depending on our available HD space this wouldn't be too much of a problem. The real issue would be the amount of resources needed; since this is hypothetical we can only imagine from a step-by-step that the resources being used would be fairly moderate. There are too many variables to be accurate when it comes to actual performance measuring, so I won't even attempt to measure it.

Now, what if we put a little thought towards these requirements from the "boss", and especially what if the HD space available to use was 1,800,000 MB. Now, we have a problem that can easily be resolved with just finding the right order. In this case, just making a simple change of placing Step # 1 at the very end and performing an extra (final) FULL backup and a (final) transaction log shrink will allow us to remain under 1,800,000 MB and also can save our server a lot of processing time, because it will have to process less records. We can also utilize the secondary server (the server holding the data from the company being merged with) processing power to cut total time being required to archive and thus decreasing the time required to merge because of the lower number of records being merged. Here is how the steps would look....

Step # 1: Remove data from each database at it's original location

HD Space Used (per Server): 1,000,000 MB
SQL Record Count (per Server): 500,000 records
SQL Space Used (per Server): 1,000,000 MB

Step # 2: Perform FULL backups for each database

HD Space Used (per Server): 1,000,000 MB
SQL Record Count (per Server): 500,000 records
SQL Space Used (per Server): 1,000,000 MB

Step # 3: Shrink the transaction log and databases

HD Space Used (per Server): 500,000 MB
SQL Record Count (per Server): 500,000 records
SQL Space Used (per Server): 500,000 MB

Step # 4: Merge the two databases

HD Space Used: 1,000,000 MB
SQL Record Count: 1,000,000 records
SQL Space Used: 1,000,000 MB

Step # 5: Perform FULL backups for database and shrink transaction log

HD Space Used: 1,200,000 MB
SQL Record Count: 1,000,000 records
SQL Space Used: 1,200,000 MB

 

Conclusion:

Now, you can easily see in the above scenario that we only use a maximum of 1,200,000 MB of space and that is because we are reserving that free space. Logically, by performing the data archiving before merging the two databases we save our server processing power because it will process ONLY that of which will remain in the database. Sometimes the order of operations will already be pre-determined from outside factors, sometimes it may be required to perform certain tasks in a particular sequence.

I hope this illustrates that it's not always good enough to just find a way to make things work; and that with little planning in the beginning you can save yourself time, lower the requirements to perform the tasks at hand, and also improve the performance of your SQL Server!

Monday, April 21, 2008

A brief introduction to Transaction Logs

Note: This entire article has been published by SQLServerCentral.com. Please find the link to the article at the end of this blog in the "Update" section.

Summary:

This is a very brief introduction to transaction logs. I am currently working on an article (titled "Introduction to the Transaction Log") that will go into a lot of detail, show the ‘ins’ and ‘outs’, and also discuss some ‘good practices’ (and how truncating works). This blog will just give you a basic introduction that discusses what a transaction log is, how it basically works, and what the primary use of the transaction log is (these can be more than just used for backup/restore procedures; that will be discussed in my upcoming article).

What a transaction log is:

Every single database MUST have at least one transaction log. A database can have more than one transaction log. A transaction log can be used over multiple physical files.

A transaction log is a log file (physical and virtual) that records each and every transaction that occurs. This includes data modifications, database modifications, and rollback (restoring) modifications. This file records each transaction in sequence using a Log Sequence Number (LSN). Each transaction is appended to the end of the physical log file and always uses a LSN that is higher in value than the last written LSN. These are best thought of as a journal entry for every action that occurs within your database.

Some of the operations that are recorded within the transaction log are data modifications, database modifications, rollback modifications, the start/end of a transaction, and even creating/dropping of tables and indexes.

Logical Architecture (concept):

Transaction logs operate as if they are being stored with a string of log records. Each record is identified by an LSN. The steps used to recover data can differentiate on how the data is logged within the transaction log (this is covered further in my article). Each transaction reserves enough space to support a successful rollback, either by an explicit action requesting the rollback or from an error occurring within the database. This amount of space can vary; however, it typically mirrors the amount of space that is used to store the logged operation. Transaction logs are loaded into a virtual log; this cannot be controlled or set by the administrator (my article will include a ‘good practice’ to help prevent system performance decreases from Virtual Logs).

Physical Architecture (concept):

Transaction logs can span multiple physical files. You can truncate the transaction log to free internal log space (this will not shrink the log file; I’ll cover shrinking the log file in my article). The basic concept of the physical file is it will append to the end of the log file, once the physical end of the log is reached the transactions will wrap around to the beginning of the log file (assuming there is free space).

If the log file does not contain any free space (internally); there are two possible outcomes:

1) If FILEGROWTH is enabled and there is free disk space on the hard drive, the file will automatically increase in accordance to the settings and then will re-organize the transaction log and append the new record to the end of the log, OR

2) If FILEGROWTH is NOT enabled, or there is NOT enough free disk space on the hard drive, then SQL Server will return a 9002 error code.

Primary Use of a Transaction Log:

A transaction log is typically used to allow recovery of a database to a certain point in time; or to allow the recovery of a database to the last successful transaction committed prior to the database failure. The usage of the transaction log will greatly depend on how the database recovery model is setup and the backup plan that is used.

Additional Notes:

Transaction logs should be periodically backed up. The size of the log is typically recommended to be about 1 ½ the size of your database. Example: If your database is 10 MB in size, then the transaction log would be about 15MB in size. This should be treated as a very general recommendation; there are many instances where this formula is not appropriate and should be modified to the appropriate amount based on your particular situation.

You want to ensure the transaction log is the proper size (the expected ending size) and has a relatively large growth increment value. This will provide you with the best performance and can ensure the proper use of the transaction log (much less likely to have the log expand without your knowledge of the event occurring).

Conclusion:

Transaction logs are a very important aspect of maintaining and recovering your database. If set properly, and maintained properly, they can give you the additional backup support you need without impacting system performance. They can help to give you peace of mind during daily operations. They can be an integral file to recovering a database to a point in time.

Additional Resources:

MSDN: Introduction to Transaction Logs (http://msdn2.microsoft.com/en-us/library/ms190925.aspx)
MSDN: Transaction Log Logical Architecture (http://msdn2.microsoft.com/en-us/library/ms180892.aspx)
MSDN: Transaction Log Physical Architecture (http://msdn2.microsoft.com/en-us/library/ms179355.aspx)

Update:

Posted on 7/2/08 by James:

SQLServerCentral.com has published the article in full. You can read the article at: http://www.sqlservercentral.com/articles/Design+and+Theory/63350/. You can join the discussion of this article at: http://www.sqlservercentral.com/Forums/FindPost527448.aspx. I hope you enjoy the article and find it helpful!

Wednesday, March 26, 2008

Types of backups

Overview:

There are different objects you can backup using SQL Server; in most cases when DBAs refer to backups they are referring to backing up databases. You can also backup the physical files and folders, but that will go beyond the scope of this blog posting. First I'll cover the three "Recovery Models" that you can choose for you database (Full, Simple, and Bulk-logged, respectively), then I'll cover the three different types of database backups: Full, Differential, and Transactions. Then I'll conclude with a quick overview of how to create backups for your database.

The most important thing to know is that you should always have a current backup in place that matches the needs of your company, you should have set procedures that detail how to backup your database and how to restore your database, and most importantly you should practice the database restoring plan you use (so you can verify it actually works, the data being backed up is actually being backed up, and that when the time comes for that once-in-a-lifetime emergency restore you will remain calm and collected so you can ensure the data is restored in a very prompt manner).

What is a backup? When do I make them?

Backups are copies of your database that is stored on your hard drive, on a server, on a removable media source, or any other type of location (physical and/or virtual). The basic idea behind a backup is to have a copy (or copies) of your data in a secure and easily obtainable location, and for the business critical information to have another copy stored somewhere offsite that is secure and obtained within a reasonable timeframe. Backups can contain all the information that is stored within your database, these are called "Full" backups. Most common backups contain entire copies of the database; there are also times when you will only retain copies that contain only the data that has been added since your last complete database backup. These are called "differential" and "transaction log" backups. Both accomplish the same thing in different ways and have different positives and negatives to them. I'll cover those later in this posting.

What you are aiming to accomplish is the ability to quickly recover from an incident that causes your current database to become corrupt or inoperable. Examples would be: Your database was hacked into the previous night, the database has somehow become corrupt and cannot be started, the data was accidentally deleted from a careless system maintenance procedure, or worse yet the building has been burned down or otherwise destroyed with all the computers and backups in it!

In most cases, you will access a backup copy that is stored in a secure location within your business location for the unforeseen events that can occur at any moment. In the extreme cases you will obtain your backup from your offsite location for the catastrophic events that require the business to be temporarily or permanently relocated without any prior notices.

So, now that we know what the backup is and where it is stored we now need to know when and how do we make them. The great thing about SQL, regardless of the version you get, you can make a backup at anytime. SQL uses online backup technology, so this means that your users and database can be accessing and modifying your database at anytime, even during your backup cycle! Do be aware that to create a backup takes up some resources, such as CPU, Memory, and I/O. It is most common to make backups during the business' "down cycle"; usually in the middle of the night for traditional 9-5 companies...and the 24x7 companies that are always accessing data will require a little more planning by determining the data usage trends and adjusting the backup procedures to occur during the lowest peaks or generalized time frame that matches the expected lower usage peaks.

Recovery Models:

There are three recovery models that can be chosen at the time you create your database: Full, Simple, and Bulk-logged. Each database can have it's own recovery model; and this recovery model may be changed from within the database in a future time.

Full: This model allows you to recover your database to a specific point in time, or to the point of the failure. This model will record every transaction that occurs within your database, as well as the stored data, structure, and every other object; this includes the bulk operations and bulk loading of data. For this model you will typically want to use Full and Differential backup types at a minimum; to maintain a more complete backup solution you would also use the Transaction Log backup type.

Simple: This model will allow you to recover to the point of your last backup creation; it's very important to understand that this model does NOT allow the use of Transaction Log backups. This model uses the 'TRUNCATE LOG ON CHECKPOINT' option; which in effect deletes all old transactions when the database gets to a checkpoint. This is more ideal for system databases because the transaction log is deleted at each checkpoint.

Bulk-logged: Bulk operations and bulk loads are logged at the most minimal levels. This means that during a restore you will need to repeat the bulk operations and bulk loading of data, should the database fail prior to a Full backup or a Differential backup. The ideal strategy will implement the Full and Differential backup types at a minimum, and include Transaction Log backups on a regular re-occurrence. 

Types of Backups:

I'll clarify now that I am discussing using the SQL Server Management Studio (SSMS) backup options; and that there are many different methods to create backups, but this discussion is limited to performing file backups using the BACKUP syntax & wizards. Some other options that are commonly used, but are beyond the scope of this posting, are: Log Shipping, High Availability (i.e. Redundancy hard drive configurations, standby servers, etc), and many others using a combination of hardware/software solutions.

The three types of backups that can be accessed from within SSMS are: Full, Differential, and Transaction Log. Here is a brief breakdown of each type:

Full
Creates a backup that contains all objects, system tables and data from the database, it also includes portions of the transaction log that required because of database being used at the time the backup was created. A Full backup is intended to bring you right back to complete working order up to the point in time the backup was created.

Differential
This type of backup is designed to create a backup of only the objects, tables and data (including transaction logs) that have been changed or not recorded on the last Full backup. It is very important to understand that you can NOT create a differential backup until after you have created at least 1 full backup.

On a personal observation note: you'll notice that a Differential is quite a bit faster than the Full backup because it requires less information to be created and is recording much less information, however, as with everything else there is a trade off. If you choose to create only 1 Full backup and then from there on use Differential backups to save time and resources, you will have to use all of these Differential files to restore to the last point of your Differential backup. What I'm saying here is that assume you make a daily Differential every single day for the next 2 years; on the first day of the 3rd year you have a database failure and must restore the database. You will have to first restore the Full backup file you originally made and then you will have restore all 730 Differential files.

You can choose to create a Full backup at anytime and it will automatically replace your original Full backup and ALL differential backups made between the original Full backup and the current Full backup you are making.

Transaction Log
Transaction Logs are a special type that compliment the Full and Differential backup types. These are serialized records of all database modifications made since the last transaction log backup. They are used during the recovery process to roll-back or commit transactions. Unlike the Full and Differential backups, Transaction Logs backups are recording the state of the transaction log at the START of the backup operation. Transaction Logs are best to think of a way you can have backed up checkpoints between the time you have made your last Full or Differential backup. A most common method of use is in OLTP databases that have daily Differential backups and weekly Full backups; you would use the Transaction Log backup on an hourly basis to help minimize the lost of transactions in case of a failure within the middle of the day. It is very common to be used with live websites that have many transactions occur per hour; instead of losing a full days worth of transactions you can minimize it down to an hour, or in highly volatile situations could even get down to the seconds (however, it is very uncommon to have a transaction log backup on a per second basis).

How do I make a backup?

In SSMS you can easily create a backup of your database by right-clicking on your database and selecting "Tasks" then "Back Up...". This will open the "Back Up Database" dialog box that allows you to easily fill in the appropriate information and then either have SSMS perform the backup, or create a script using the scripting wizard from within the "Back Up Database" dialog box.

From the "Back Up Database" dialog box you can choose the backup type, the name of the backup file, the location to place the file (including using devices that contain the file location and other information). By click on "Options" in the left pane, you will be brought to another screen that will allow you to give even more specific options for your backups. Some options include creating a new set of backups, appending to an existing set, verifying the backup reliability, and for transaction logs you can specify how to handle the log after being backed up.

Conclusion:

As you can tell, the backup options and wizard are each powerful in their own rights. I can't stress enough the importance of having a backup plan and a recovery plan in place now, it can save you time and also provide you with the confidence that you know when a disaster occurs you are ready and have reliable backups that can be restored in an instance.

You have now seen that backups can cover most any situation that you can come across, and depending on the need can back up a database down to the second in the highest volatile uses.

This is just the tipping of the iceberg, there are many vendors out there that specialize in create backup/recovery solutions that are based on software, hardware, and hardware/software combinations.

With some careful planning, execution and understanding you can handle any situation that will cross your path and know that in the worse of conditions you are prepared to get your business back up and running faster than they can get a grasp on what just happened to their business!

Until next time....Happy Coding!