Using TSQL to select a percentage of the total sum of records at random

Total Post:102

Points:714
 1133  View(s)
Ratings:
Rate this:
Hi!

I have a table that contains Road reference numbers and road length, with columns RoadID (int) and RoadLength (int).

There are around 3000 rows. Using T-SQL I need to extract a random selection of road references and their length where the sum of the length adds up to 5% of the

total length of all the roads in the table. This is for an annual road survey where roads are selected at random.

I'm using T-SQL against a SQL Server 2008 database. Tried a few variations on triangular queries from this article http://www.sqlservercentral.com/Forums/Topic793008

-149-1.aspx but struggling with selecting random rows. I tried using order by newID() but my results don't look correct.

Any help with the most efficient way to do this would be appreciated. Thanks

Thanks in advance!
  1. Post:604

    Points:4228
    Re: Using TSQL to select a percentage of the total sum of records at random

    Hi Mark!


    Messy, but it seems to work

    --Create a temp table and add a random number column
    CREATE TABLE #Roads(ROW_NUM int, RoadID int, RoadLength int)

    --Populate from zt_Roads table and add a random number field
    INSERT #Roads (ROW_NUM , RoadID , RoadLength )
                        (SELECT ROW_NUMBER() OVER (ORDER BY NEWID()),
                            RoadID,
                            RoadLength
                             from zt_Roads)
    go

    --Calcualte 5% of the TOTAL length of ALL roads
    declare @FivePercent int
    SELECT  @FivePercent =  ROUND(Sum(IsNULL((RoadLength ),0))*.01,0) from zt_Roads
    print 'One Percent of total length = ' 
    Print @FivePercent

    --Select a random sample from temp table so that the total sample length 
    --is no more than 5% of all roads in table
    ; with RandomSample as 
    (SELECT top 100 percent 
        ROW_NUM, 
        RoadID, 
        RoadLength, 
        RoadLength+
            COALESCE((Select Sum(RoadLength) from #Roads b 
            WHERE b.ROW_NUM < a.ROW_NUM),0) as RunningTotal

            From #Roads  a
            ORDER BY ROW_NUM)


    Select * from RandomSample WHERE RunningTotal <@FivePercent 
    Drop table #Roads

Answer

NEWSLETTER

Enter your email address here always to be updated. We promise not to spam!