By Susan Sales Harkins
Data integrity and performance are the driving force behind almost every decision you make during the design and development process. Defining appropriate data types is one of the easiest ways to let SQL Server help you help yourself.
Always use the smallest data size that will accommodate the largest possible value. If a column is going to store values between 1 and 5, use tinyint instead of int. This rule also applies to character columns. The smaller the data size, the less there is to read, so performance, over all, benefits. In addition, smaller size reduces network traffic. With newer technology, this tip seems less relevant, but don’t dismiss it out of hand. You’ll won’t regret being efficient from the get-go.
Bad primary keys
Don’t use float, real, or datetime for primary keys. They add overhead that you just don’t need, and given the nature of primary keys, you will probably feel the pinch.
Usurp SQL Server assumptions
When converting a value to a variable length data type using varchar, always specify the length. Otherwise, SQL Server assumes a default size of 30. Specify the smallest size possible (see #1).
To speed up frequent sorts, use an int (or an integer-based) data type if possible. SQL Server sorts integer data faster than character data.
The text data type accommodates a lot of data but at a cost. Unfortunately, I have seen developers use it by default. For those large columns, use varchar instead; it accommodates up to 8,000 characters and requires less overhead. Consequently, varchar performs better.
The varchar instead of char trade off
It’s best to limit a text column, but knowing just how much can be difficult. If the data varies in length, it can be more efficient to use varchar than char. A fixed-length data type will waste space on smaller entries. In addition, sorts against a varchar column are usually faster. That’s because SQL Server sorts the entire width of a char column.
Don’t store NULL in fixed-length columns
Try not to allow NULL values in a fixed-length column. NULL consumes the same space an input value would. Those NULL values will add up quickly in a large column. If you must accommodate NULL values, use a variable-length column. They use less space for NULL.
SQL Server’s bigint uses 8 bytes of memory. In comparison, int uses just 4. Don’t use bigint unless the data forces you to.
Avoid using SQL Server’s sql_variant data type. It’s a memory hog and comes with limits that make it difficult to work with:
• Variants can’t be part of a primary or foreign key.
• Variants can’t be part of a computed column.
• Variants don’t work with LIKE in a WHERE clause.
• OLE DB and ODBC providers automatically convert variants to nvarchar(4000) a huge waste almost 100% of the time!
When numbers are really text
It’s common to store numeric values as text. For instance, you won’t (mathematically) evaluate a ZIP Code or a phone number, so you might store them as text. However, numeric data types generally consume less overhead to store the same value as a character data type. You’ll probably notice a difference between the two data types in a WHERE clause, a sort, or a join.
Susan Sales Harkins is an independent consultant and the author of several articles and books on database technologies. Her most recent book is Mastering Microsoft SQL Server 2005 Express, with Mike Gunderloy, published by Sybex. Other collaborations with Gunderloy are Automating Microsoft Access 2003 with VBA, Upgrader’s Guide to Microsoft Office System 2003, ICDL Exam Cram 2, and Absolute Beginner’s Guide to Microsoft Access 2003, all published by Que. Currently, Susan volunteers as the Publications Director for Database Advisors. You can reach her at firstname.lastname@example.org.
Filed under: Computer / IT