Tuesday, February 14, 2012

Calculating Row Size

A couple of questions regading table size:
1. Why is the size of a table reported by Enterprise Manager in Table Info
so different than the size that is returned when using sp_spaceused?
2. I've found the calculations but was wondering if there exists a system
stored procedure that would return the size of a row in a table? For example
,
I'm looking to capture the max, min, and avg size of a row in a particular
table.
Any input is greatly appreciated!
Thanks!1. I don't know. Sorry!
2. No, but you could do this to get the information you want
CREATE TABLE ColumnSizes (table_name varchar(50), column_name varchar(50),
min_size int, max_size int)
GO
DECLARE @.table_name varchar(100)
DECLARE @.column_name varchar(100)
DECLARE @.querystring nvarchar(1000)
SET @.querystring = ''
DECLARE columns_cursor CURSOR STATIC FORWARD_ONLY
FOR
SELECT t1.table_name, t2.column_name
FROM INFORMATION_SCHEMA.TABLES t1
INNER JOIN INFORMATION_SCHEMA.COLUMNS t2
ON t1.table_name = t2.table_name
WHERE t1.table_type = 'BASE TABLE' AND t1.table_name <> 'columnsizes'
OPEN columns_cursor
FETCH NEXT FROM columns_cursor INTO @.table_name, @.column_name
WHILE (@.@.FETCH_STATUS = 0)
BEGIN
SET @.querystring = 'INSERT INTO ColumnSizes SELECT ''' + @.table_name +
''', ''' +
@.column_name + ''', MIN(DATALENGTH([' + @.column_name + '])),
MAX(DATALENGTH([' +
@.column_name + '])) FROM [' + @.table_name + ']'
--PRINT @.querystring
EXEC sp_executesql @.querystring
FETCH NEXT FROM columns_cursor INTO @.table_name, @.column_name
END
CLOSE columns_cursor
DEALLOCATE columns_cursor
select * from columnsizes
"A. Robinson" wrote:

> A couple of questions regading table size:
> 1. Why is the size of a table reported by Enterprise Manager in Table Info
> so different than the size that is returned when using sp_spaceused?
> 2. I've found the calculations but was wondering if there exists a system
> stored procedure that would return the size of a row in a table? For examp
le,
> I'm looking to capture the max, min, and avg size of a row in a particular
> table.
> Any input is greatly appreciated!
> Thanks!|||Thanks!!
One quick question:
The database I'm running this against has 4500+ attributes/columns. I
started running the procedure last night and left for the evening. When I go
t
back in, it was still running - 18+ hours...
Is that normal?!?
Thanks!!
"Mark Williams" wrote:
> 1. I don't know. Sorry!
> 2. No, but you could do this to get the information you want
> CREATE TABLE ColumnSizes (table_name varchar(50), column_name varchar(50),
> min_size int, max_size int)
> GO
> DECLARE @.table_name varchar(100)
> DECLARE @.column_name varchar(100)
> DECLARE @.querystring nvarchar(1000)
> SET @.querystring = ''
> DECLARE columns_cursor CURSOR STATIC FORWARD_ONLY
> FOR
> SELECT t1.table_name, t2.column_name
> FROM INFORMATION_SCHEMA.TABLES t1
> INNER JOIN INFORMATION_SCHEMA.COLUMNS t2
> ON t1.table_name = t2.table_name
> WHERE t1.table_type = 'BASE TABLE' AND t1.table_name <> 'columnsizes'
> OPEN columns_cursor
> FETCH NEXT FROM columns_cursor INTO @.table_name, @.column_name
> WHILE (@.@.FETCH_STATUS = 0)
> BEGIN
> SET @.querystring = 'INSERT INTO ColumnSizes SELECT ''' + @.table_name +
> ''', ''' +
> @.column_name + ''', MIN(DATALENGTH([' + @.column_name + '])),
> MAX(DATALENGTH([' +
> @.column_name + '])) FROM [' + @.table_name + ']'
> --PRINT @.querystring
> EXEC sp_executesql @.querystring
> FETCH NEXT FROM columns_cursor INTO @.table_name, @.column_name
> END
> CLOSE columns_cursor
> DEALLOCATE columns_cursor
> select * from columnsizes
>
> --
>
> "A. Robinson" wrote:
>|||On Fri, 27 Jan 2006 09:00:03 -0800, A. Robinson wrote:

>Thanks!!
>One quick question:
>The database I'm running this against has 4500+ attributes/columns. I
>started running the procedure last night and left for the evening. When I g
ot
>back in, it was still running - 18+ hours...
>Is that normal?!?
Hi A.,
I just checked the script Mark posted, and I'm not surprised.
For each column, a query is executed to find MIN(DATALENGTH(column)) and
MAX(DATALENGTH(column)). That requires a full table scan for each
column. So if you have a 100 million row table with 40 columns, the 100
million rows will be scanned 40 times.
Hugo Kornelis, SQL Server MVP|||I didn't consider a database that large. (I live in my own little world with
little databases).
Other than adding WITH (NOLOCK) to the select statement, I can't think of
much else to modify the script as is. In reality, my script really didn't ge
t
what you wanted, which was the minimum and maximum size of a row in a table.
I gave you the minimum and maximum sizes of columns in the tables. Since
physical storage sizes for most of the datatypes are fixed (ie int, datetime
,
smalldatetime, ..) it would probably be better to only run the SELECT
MAX(DATELENGTH()), MIN(DATALENGTH()) statement on columns that had variable
physical size data capabilities (ie varchar,nvarchar,text,ntext, and so on).
Unfortunately, I can't whip that one up right now, but I will work on it and
post it here.|||an obvious optimization would be to issue one select per table, not one
per column. naturally that would require more complex programming.
makes sense?|||an obvious optimization would be to issue one select per table, not one
per column. naturally that would require more complex programming.
makes sense?|||Give this a try. The whole thing takes about 3 seconds for Northwind
It runs through a set of nested cursors for only the variable length
datatypes, then joins the output table with information_schema.columns to ge
t
the totals with the fixed length datatypes.
drop table variablerowsizes
go
CREATE TABLE VariableRowSizes (table_name varchar(50), min_size int,
max_size int, avgsize int)
GO
DECLARE @.table_name nvarchar(100)
DECLARE @.column_name nvarchar(100)
DECLARE @.column_list nvarchar(4000)
DECLARE @.querystring nvarchar(4000)
SET @.querystring = ''
set @.column_list = ''
DECLARE tables_cursor CURSOR STATIC FORWARD_ONLY
FOR
select distinct t1.table_name
from information_schema.tables t1
inner join information_schema.columns t2
on t1.table_name = t2.table_name
where t1.table_type = 'BASE TABLE'
and t2.data_type in
('varchar','nvarchar','text','ntext','va
rbinary','image','sql_variant')
order by t1.table_name
OPEN tables_cursor
FETCH NEXT FROM tables_cursor INTO @.table_name
WHILE (@.@.FETCH_STATUS = 0)
BEGIN
set @.querystring = ''
set @.column_list = ''
declare columns_cursor cursor static forward_only
for
select distinct column_name
from information_schema.columns
where table_name = @.table_name
and data_type in
('varchar','nvarchar','text','ntext','va
rbinary','image','sql_variant')
open columns_cursor
fetch next from columns_cursor into @.column_name
WHILE (@.@.FETCH_STATUS = 0)
begin
set @.column_list = @.column_list + N'DATALENGTH([' + @.column_name + N'])
+ '
fetch next from columns_cursor into @.column_name
end
close columns_cursor
deallocate columns_cursor
set @.column_list = SUBSTRING(@.column_list,1,LEN(@.column_lis
t) - 2)
set @.querystring = N'insert into variablerowsizes select '''
+ @.table_name + N''', min(' + @.column_list + N'), max(' + @.column_list
+ N'), avg(' + @.column_list
+ N') from ' + @.table_name
--print @.querystring
EXEC sp_executesql @.querystring
FETCH NEXT FROM tables_cursor INTO @.table_name
END
CLOSE tables_cursor
DEALLOCATE tables_cursor
select t3.table_name, t3.[fixed column sizes] + isnull(t4.[bit sizes],0) +
isnull(t5.min_size,0) AS "minimum row size",
t3.[fixed column sizes] + isnull(t4.[bit sizes],0) + isnull(t5.max_size,0)
AS "maximum row size",
t3.[fixed column sizes] + isnull(t4.[bit sizes],0) + isnull(t5.avgsize,0) AS
"average row size"
from (
select t1.table_name,
sum(
case
when t2.data_type = 'uniqueidentifier' then 16
when t2.data_type in ('char','nchar') then t2.character_octet_length
when t2.data_type in ('bigint','money','datetime','timestamp'
) then 8
when t2.data_type in ('int','smallmoney','smalldatetime','rea
l') then 4
when t2.data_type = 'smallint' then 2
when t2.data_type = 'tinyint' then 1
when t2.data_type in ('decimal','numeric') then
case
when t2.numeric_precision between 1 and 9 then 5
when t2.numeric_precision between 10 and 19 then 9
when t2.numeric_precision between 20 and 28 then 13
else 17
end
when t2.data_type = 'float' then
case
when t2.numeric_precision between 1 and 24 then 4
else 8
end
when t2.data_type = 'binary' then t2.character_octet_length + 4
else 0
end) as "fixed column sizes"
from information_schema.tables t1 left join information_schema.columns t2
on t1.table_name = t2.table_name
where t1.table_type = 'BASE TABLE'
group by t1.table_name
) t3 left join
(
select t1.table_name, ((count(t2.data_type) - 1) / 8) + 1 as "bit sizes"
from information_schema.tables t1
left join information_schema.columns t2
on t1.table_name = t2.table_name
where t1.table_type = 'BASE TABLE'
and t2.data_type = 'bit'
group by t1.table_name
) t4 on t3.table_name = t4.table_name
left join variablerowsizes t5
on t3.table_name = t5.table_name
"A. Robinson" wrote:

> A couple of questions regading table size:
> 1. Why is the size of a table reported by Enterprise Manager in Table Info
> so different than the size that is returned when using sp_spaceused?
> 2. I've found the calculations but was wondering if there exists a system
> stored procedure that would return the size of a row in a table? For examp
le,
> I'm looking to capture the max, min, and avg size of a row in a particular
> table.
> Any input is greatly appreciated!
> Thanks!

No comments:

Post a Comment