A table with ten million rows and sequential IDs: how do you quickly read a specific 1,000 rows?

Author: 邀月 | Source: 博客园 | Published: 2010-08-29 19:56

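Abstract: A table holds more than ten million rows and has a very simple structure with an auto-increment ID. How do you quickly read a specific 1,000 rows from it, for example IDs 1,000,000 through 1,001,000? Read on.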


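Here is the requirement: a table holds more than ten million rows and has a very simple structure with an auto-increment ID. How do you quickly read a specific 1,000 rows from it, for example IDs 1,000,000 through 1,001,000? The requirement is actually simple. Because the ID is auto-incrementing, there are two possible cases: the table has a clustered index, or it is a heap. In the latter case I would build a nonclustered index on the ID and the insertion time, and I expect the performance to be about the same. So I tried it, and the process is as follows: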
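The heap case mentioned above could look roughly like the sketch below. The table name bigTableHeap and the index name are hypothetical and exist only for illustration. The columns mirror the test table used in the rest of this article, and the key order (PID, then AddTime) is an assumption based on the description.

```sql
-- Hypothetical heap variant: no PRIMARY KEY, so SQL Server creates no clustered index
CREATE TABLE dbo.bigTableHeap
(
    PID     int IDENTITY(1,1) NOT NULL,
    PName   nvarchar(100) NULL,
    AddTime datetime NULL,
    PGuid   nvarchar(40) NULL
);
GO

-- Nonclustered index on the ID and the insertion time (assumed key order)
CREATE NONCLUSTERED INDEX IX_bigTableHeap_PID_AddTime
    ON dbo.bigTableHeap (PID, AddTime);
GO
```

With this layout a range predicate on PID can seek the nonclustered index, but fetching the remaining columns needs a lookup into the heap for each row, so the cost is not necessarily identical to a clustered index seek.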

1. Prepare the test data

Basic test environment:
[Screenshot: basic test environment]

Insert 10 million rows of test data:


/***************Create the ten-million-row test database***********
****************downmoon 3w@live.cn ***************/

Create database HugeData_10Millons
go
use HugeData_10Millons
go

/***************Create the test table*********************
****************downmoon  3w@live.cn ***************/

IF NOT OBJECT_ID('[bigTable]') IS NULL
    DROP TABLE [bigTable]
GO
Create table bigTable
(PID int identity(1,1) primary key not null
,PName nvarchar(100) null
,AddTime dateTime null
,PGuid Nvarchar(40)
)
go

truncate table [bigTable]

/***************Insert the first 250,000 test rows*********************
****************downmoon  3w@live.cn ***************/

declare @d datetime 
set @d=getdate() 

declare @i int
set @i=1
while @i<=250000
begin
    insert into [bigTable]
    select cast(datepart(ms,getdate()) as nvarchar(3))+Replicate('A',datepart(ss,getdate()))
    ,getdate()
    ,NewID()
    set @i=@i+1
end

select [Elapsed time (ms)]=datediff(ms,@d,getdate()) 

/*
Elapsed time (ms)
94750
*/
go  -- end the batch so @d and @i can be declared again below

/***************Insert the second 250,000 test rows*********************
****************downmoon  3w@live.cn ***************/

declare @d datetime 
set @d=getdate() 

declare @i int
set @i=1
while @i<=250000
begin
    insert into [bigTable]
    select cast(datepart(ms,getdate()) as nvarchar(3))+Replicate(Substring(cast(NEWID() as nvarchar(40)),1,6),3)
    ,getdate()
    ,NewID()
    set @i=@i+1
end

select [Elapsed time (ms)]=datediff(ms,@d,getdate()) 

/*
Elapsed time (ms)
115640
*/
go  -- end the batch so @d and @i can be declared again below

/***************Insert 9,000,000 test rows*********************
****************downmoon  3w@live.cn ***************/

declare @d datetime 
set @d=getdate() 

declare @i int
set @i=1
while @i<=9000000
begin
    insert into [bigTable]
    select replicate('X',ROUND((RAND()* 60),0) )+cast(datepart(ms,getdate()) as nvarchar(3))
    ,getdate()
    ,NewID()
    set @i=@i+1
end

select [Elapsed time (ms)]=datediff(ms,@d,getdate()) 
/*
Elapsed time (ms)
3813686
*/
go  -- end the batch so @d and @i can be declared again below

/***************Insert the final 500,000 test rows*********************
****************downmoon  3w@live.cn ***************/

declare @d datetime 
set @d=getdate() 

declare @i int
set @i=1
while @i<=500000
begin
    insert into [bigTable]
    select replicate('X',ROUND((RAND()* 60),0) )+cast(NewID() as nvarchar(40))
    ,getdate()
    ,NewID()
    set @i=@i+1
end

select [Elapsed time (ms)]=datediff(ms,@d,getdate()) 
/*
Elapsed time (ms)
207436
*/

/*
Check the row count
select count(1) from dbo.bigTable
----------10000000
Clear the log (note: the WITH NO_LOG option is no longer supported from SQL Server 2008 onward)
DUMP TRANSACTION HugeData_10Millons WITH NO_LOG
BACKUP LOG HugeData_10Millons WITH NO_LOG
DBCC SHRINKDATABASE(HugeData_10Millons)

*/

When this finishes, the data file sizes are as follows:

[Screenshot: data file sizes]

2. Create a stored procedure for testing


/***************Fetch 1,000 consecutive rows from a range in the middle*********************
****************downmoon  3w@live.cn ***************/
Create procedure GetTop1000RecordsByRange
(@begin int
,@end int
)
as 
select top 1000 * from [bigTable]
where PID between @begin and @end
go

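Note from 邀月: in fact, adding or omitting TOP makes no difference to this query, and the later tests confirm it. After removing TOP 1000 and clearing the procedure plan cache, the same execution plan is produced.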
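One way to repeat this comparison is sketched below. It is not necessarily the exact procedure the author followed: it clears the plan cache and then asks SQL Server for the estimated plan of the query with and without TOP, using literal values in place of the procedure parameters.

```sql
-- Clears ALL cached plans on the instance, so use it on a test server only
DBCC FREEPROCCACHE;
GO

-- While SHOWPLAN_TEXT is ON, statements are not executed; only their plans are returned.
-- SET SHOWPLAN_TEXT must be the only statement in its batch, hence the GO separators.
SET SHOWPLAN_TEXT ON;
GO

SELECT TOP 1000 * FROM dbo.bigTable WHERE PID BETWEEN 1000000 AND 1001000;
SELECT * FROM dbo.bigTable WHERE PID BETWEEN 1000000 AND 1001000;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Both statements should show a Clustered Index Seek on the primary key, and the TOP version may show an extra Top operator above it. Note that BETWEEN is inclusive at both ends, so this range matches 1,001 PID values and TOP 1000 trims one row.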

Test statement:

declare @d datetime 
set @d=getdate() 

exec GetTop1000RecordsByRange 1000000,1001000

select [Elapsed time (ms)]=datediff(ms,@d,getdate())

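At this point, because SQL Server creates a clustered index on the primary key PID by default, the query is quite fast: the measured time falls between 0 and 16 ms and is usually closer to 0.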
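Timing with datediff over getdate() is coarse, and readings of exactly 0 or 16 ms are likely an artifact of the clock resolution behind getdate(). As a complementary measurement (not part of the original test), the built-in statistics options report CPU time, elapsed time and page reads per statement:

```sql
-- Report logical/physical reads and CPU/elapsed time in the Messages output
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

EXEC GetTop1000RecordsByRange 1000000, 1001000;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The logical reads figure is usually the more stable number to compare across index layouts, because it does not depend on timer granularity or on whether the pages were already in memory.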

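The query plan is also what I expected:
[Screenshot: query plan with PID as the clustered index key]

If PGuid is used as the clustered index key instead, the query plan is as follows:

If AddTime is used as the clustered index key, the query plan is:
[Screenshot: query plan with AddTime as the clustered index key]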
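The article does not show how the clustered index was moved to another column. A minimal sketch for the PGuid case follows, assuming the primary key was created with an auto-generated constraint name. The names PK__bigTable__XXXX, PK_bigTable_PID and CIX_bigTable_PGuid are placeholders, not names from the original test.

```sql
-- Look up the auto-generated primary key constraint name
SELECT name
FROM sys.key_constraints
WHERE parent_object_id = OBJECT_ID('dbo.bigTable')
  AND type = 'PK';

-- Replace PK__bigTable__XXXX with the name returned above (placeholder)
ALTER TABLE dbo.bigTable DROP CONSTRAINT [PK__bigTable__XXXX];

-- Keep PID as the primary key, but as a nonclustered index this time
ALTER TABLE dbo.bigTable
    ADD CONSTRAINT PK_bigTable_PID PRIMARY KEY NONCLUSTERED (PID);

-- Cluster the table on PGuid; use (AddTime) here for the AddTime variant
CREATE CLUSTERED INDEX CIX_bigTable_PGuid ON dbo.bigTable (PGuid);
```

On a table with ten million rows each of these steps rewrites or re-sorts a large amount of data, so expect them to take a while and to grow the transaction log.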
