Home > DeveloperSection > Forums > Using TSQL to select a percentage of the total sum of records at random
Mark Devid
Mark Devid

Total Post:102

Points:714
Posted on    May-06-2013 2:05 AM

 MSSQL Server MSSQL Server 
Ratings:


 1 Reply(s)
 732  View(s)
Rate this:
Hi!

I have a table that contains Road reference numbers and road length, with columns RoadID (int) and RoadLength (int).

There are around 3000 rows. Using T-SQL I need to extract a random selection of road references and their length where the sum of the length adds up to 5% of the

total length of all the roads in the table. This is for an annual road survey where roads are selected at random.

I'm using T-SQL against a SQL Server 2008 database. Tried a few variations on triangular queries from this article http://www.sqlservercentral.com/Forums/Topic793008

-149-1.aspx but struggling with selecting random rows. I tried using order by newID() but my results don't look correct.

Any help with the most efficient way to do this would be appreciated. Thanks

Thanks in advance!


AVADHESH PATEL

Total Post:604

Points:4228
Posted on    May-06-2013 9:03 AM

Hi Mark!

Messy, but it seems to work

--Create a temp table and add a random number column
CREATE TABLE #Roads(ROW_NUM int, RoadID int, RoadLength int)

--Populate from zt_Roads table and add a random number field
INSERT #Roads (ROW_NUM , RoadID , RoadLength )
                    (SELECT ROW_NUMBER() OVER (ORDER BY NEWID()),
                        RoadID,
                        RoadLength
                         from zt_Roads)
go

--Calcualte 5% of the TOTAL length of ALL roads
declare @FivePercent int
SELECT  @FivePercent =  ROUND(Sum(IsNULL((RoadLength ),0))*.01,0) from zt_Roads
print 'One Percent of total length = ' 
Print @FivePercent

--Select a random sample from temp table so that the total sample length 
--is no more than 5% of all roads in table
; with RandomSample as 
(SELECT top 100 percent 
    ROW_NUM, 
    RoadID, 
    RoadLength, 
    RoadLength+
        COALESCE((Select Sum(RoadLength) from #Roads b 
        WHERE b.ROW_NUM < a.ROW_NUM),0) as RunningTotal

        From #Roads  a
        ORDER BY ROW_NUM)


Select * from RandomSample WHERE RunningTotal <@FivePercent 
Drop table #Roads

Don't want to miss updates? Please click the below button!

Follow MindStick