Friday, March 19, 2010

TEXT to VARCAHR(MAX)

                                    TEXT to VARCAHR(MAX)



You may have migrated from SQL Server 2000 to SQL Server 2005/2008. There are several tables in your database having text data type columns. Text data type and image columns are deprecated, and replaced by varchar (max) and varbinary(max) . Apart from this, if the text data is stored outside the row, when this column is referenced in select, insert, update statements, it will consume extra reads slowing down query performance

You can check, whether text data is stored inside the row or on a separate data page by using the following query

select objectproperty(object_id('tablename'),'TableTextInRowLimit')

If the output is 0 then text in row limit is not set. i.e. The text column(s) are stored out side the data page and pointers in the data row, references the text data stored in text pages.

In SQL 2005 and above, by default lob data stored inside the data row. If we want to store lob data outside the row

EXEC sp_tableoption 'tablename', 'large value types out of row', 1;

Step by Step process to convert text to varchar(max)

We will take an example of an employee table and go through the step by step process to convert text data types to varchar (max) .

Script 1 :
use tempdb
go
if exists (select 1 from sys.tables where name = 'employee')
begin
drop table dbo.employee
end

create table dbo.employee
(
employee_id int identity(1,1) primary key ,
employee_name varchar(25) not null,
salary numeric(10,2) not null,
employee_info text
)
go

insert into employee ( employee_name,salary,employee_info)
values ('Rama',10000,'Good Character' )
insert into employee ( employee_name,salary,employee_info)
values ('Mohammad',40000,'Nice' )
insert into employee ( employee_name,salary,employee_info)
values ('James',50000,'Compassion' )
insert into employee ( employee_name,salary,employee_info)
values ('Saraswathi',50000,'Has high Knowledge' )
insert into employee ( employee_name,salary,employee_info)
values ('Krishna',20000,'Very Good' )
insert into employee ( employee_name,salary,employee_info)
values ('Fathima',32000.40,'Caring' )
insert into employee ( employee_name,salary,employee_info)
values ('Teresa',44000.50,'Professional' )
insert into employee ( employee_name,salary,employee_info)
values ('Buddha',90000.00,'Calm' )
go

Script 2 :

select objectproperty(object_id('employee'),'TableTextInRowLimit')
–- Returns 0 , which means text column rows are stored outside the data page
go

Now, let us create a function which will return page number in readable format
Script 3 :
create function dbo.ufn_convertpage_nums(@page_num binary (6))
returns varchar(20)
as
begin
return
( convert(varchar(2),(convert(int,substring(@page_num,6,1)) * power(2,8)) +
convert(int,substring(@page_num,5,1)))) + ':' +
convert( varchar(11),
( convert(int, substring(@page_num,4,1 )) * power(2,24)) +
( convert(int, substring(@page_num,3,1 )) * power(2,16)) +
( convert(int, substring(@page_num,2,1 )) * power(2,8)) +
( convert(int, substring(@page_num,1,1 )))

)
end

go



Script 4 : CAUTION: Don’t run this on production server. Run only on test server

use tempdb
go

DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS
DBCC FREEPROCCACHE WITH NO_INFOMSGS
DBCC FREESYSTEMCACHE ('All')

set statistics io on
go

set nocount on
select employee_id,employee_name, employee_info from dbo.employee
go

set statistics io off
go

Output is

-- Table 'employee'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 16, lob physical reads 0, lob read-ahead reads 0.
There are 16 lobs (large object) logical reads. Since there are 8 rows in the table
Script 5 :
select convert(varchar(25),object_name(p.object_id)) as tablename,
a.type_desc as page_type_desc,
total_pages as totpages,
used_pages,
data_pages ,
partition_number as pnum,rows ,
dbo.ufn_convertpage_nums(first_page) as first_page,
dbo.ufn_convertpage_nums(root_page) as root_page,
dbo.ufn_convertpage_nums(first_iam_page) as first_iam_page,
partition_id,
allocation_unit_id
from sys.partitions p
inner join sys.system_internals_allocation_units a
on p.partition_id = a.container_id
where p.object_id = object_id('dbo.employee')
go



tablename page_type_desc totpages used_pages data_pages pnum rows first_page root_page first_iam_page partition_id allocation_unit_id

employee IN_ROW_DATA 2 2 1 1 8 1:155 1:155 1:158 72057594038648800 72057594043236300

employee LOB_DATA 2 2 0 1 8 1:153 1:153 1:154 72057594038648800 72057594043301800



There are 2 LOB_DATA pages



After migration for previous versions of SQL server to SQL Server 2005/2008, text data type should be altered to varchar (max)

Script 6 :
alter table employee
alter column employee_info varchar(max)
go



Again, Run Script 4

Table 'employee'. Scan count 1, logical reads 3, physical reads 1, read-ahead reads 0, lob logical reads 8, lob physical reads 1, lob read-ahead reads 0.
Lob reads reduced from 16 to 8


Running Script 5
tablename page_type_desc totpages used_pages data_pages pnum rows first_page root_page first_iam_page partition_id allocation_unit_id

employee IN_ROW_DATA 2 2 1 1 8 1:155 1:155 1:158 72057594039238600 72057594044809200

employee LOB_DATA 2 2 0 1 8 1:153 1:153 1:154 72057594039238600 72057594044874700

employee ROW_OVERFLOW_DATA 0 0 0 1 8 0:0 0:0 0:0 72057594039238600 72057594044940200

You can see that a new page type description ROW_OVERFLOW_DATA is created. This page type is used when a row,does not have enough space to hold its maximum capacity of 8060 bytes

To make it more efficient, and reduce lob reads we need to update the table. We are updating the same column to push out of row data to inside the data row.

Script 7 :

update employee set
employee_info = employee_info
go



Note: If there are large number of rows , update in batches to reduce logging and performance improvement.

Again, Run Script 4

Table 'employee'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0,

lob physical reads 0, lob read-ahead reads 0.

Number of Lob reads are reduced to 0. This means data is moved from outside text pages to inside the data row

As an alterantive we can create the table with varchar(max) instead of text data type

Script 8 :

Note: In SQL Server 2005 and later, by default varchar(max) data is stored in the data row,
if the length of row does not exceed 8060. If length of row reaches this threshold,
data is pushed out to large object page (text page)

if exists (select 1 from sys.tables where name = 'employee')
begin
drop table dbo.employee
end

create table dbo.employee
(
employee_id int identity(1,1) primary key ,
employee_name varchar(25) not null,
salary numeric(10,2) not null,
employee_info varchar(max)
)
go



Run the insert script (Script 1 )

Run Script 4

Table 'employee'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Run Script 5

tablename page_type_desc totpages used_pages data_pages pnum rows first_page root_page first_iam_page partition_id allocation_unit_id

employee IN_ROW_DATA 2 2 1 1 8 1:153 1:153 1:154 72057594039304100 72057594045005800

employee LOB_DATA 0 0 0 1 8 0:0 0:0 0:0 72057594039304100 72057594045071300

After migration or if there are text data type columns , it is better to create a new table replacing text data types with varchar(max) data type and populating data from the old table

Conclusion

Recommended Method After Migration:


1. Create a new table (employee_new) replacing text data type with varchar(max)
2. Populate from old table (employee) to employee_new .
Note: If number of rows are very high , insert in batches to avoid bloating of log file and to improve performance
3. Drop old table (employee) . You may be required to drop constraints if any before dropping the table
4. Rename new table (employee_new) to old table (employee) . Put back the dropped constraints if any

Alternative Method

1. Alter table data type from text to varchar(max)
2. Update the same column
Note: If number of rows are very high ,update in batches to avoid bloating of log file and to improve performance

1 comment:

Anonymous said...

You are not right. Let's discuss it. Write to me in PM.