Advertise with us

Using TSQL to select a percentage of the total sum of records at random

Posted by  Mark Devid
 1441  View(s)
Rate this:

I have a table that contains Road reference numbers and road length, with columns RoadID (int) and RoadLength (int).

There are around 3000 rows. Using T-SQL I need to extract a random selection of road references and their length where the sum of the length adds up to 5% of the

total length of all the roads in the table. This is for an annual road survey where roads are selected at random.

I'm using T-SQL against a SQL Server 2008 database. Tried a few variations on triangular queries from this article

-149-1.aspx but struggling with selecting random rows. I tried using order by newID() but my results don't look correct.

Any help with the most efficient way to do this would be appreciated. Thanks

Thanks in advance!
  1. Re: Using TSQL to select a percentage of the total sum of records at random

    Hi Mark!

    Messy, but it seems to work

    --Create a temp table and add a random number column
    CREATE TABLE #Roads(ROW_NUM int, RoadID int, RoadLength int)

    --Populate from zt_Roads table and add a random number field
    INSERT #Roads (ROW_NUM , RoadID , RoadLength )
                        (SELECT ROW_NUMBER() OVER (ORDER BY NEWID()),
                             from zt_Roads)

    --Calcualte 5% of the TOTAL length of ALL roads
    declare @FivePercent int
    SELECT  @FivePercent =  ROUND(Sum(IsNULL((RoadLength ),0))*.01,0) from zt_Roads
    print 'One Percent of total length = ' 
    Print @FivePercent

    --Select a random sample from temp table so that the total sample length 
    --is no more than 5% of all roads in table
    ; with RandomSample as 
    (SELECT top 100 percent 
            COALESCE((Select Sum(RoadLength) from #Roads b 
            WHERE b.ROW_NUM < a.ROW_NUM),0) as RunningTotal

            From #Roads  a
            ORDER BY ROW_NUM)

    Select * from RandomSample WHERE RunningTotal <@FivePercent 
    Drop table #Roads