The standard PL/SQL package dbms_random only allows for drawing uniformly distributed (all numbers equally likely) random numbers. As far as I know, other discrete probability distributions have to be simulated. This is easily accomplished using a little elementary probability theory.

The following code allows for drawing a sample of a size to be specified at runtime from a discrete probability distribution on an arbitrary (though unfortunately necessarily finite) set of positive integers. It is just meant for illustrating a standard math trick and is neither robust nor optimized (nor wellformed :-( . It’s just a quick hack. Perhaps I will look into this.....

The sample space used and its discrete probability distribution are presented as the table prob_tab (num number, weight number), where the first column represents the set numbers to be drawn from and the second the probability of each element presented as a weighing factor. Hence the law is also relatively arbitrary. The sample drawn is inserted into a table sample_tab (num number,freq number), where freq counts the occurences of num. I assume both tables to be present at runtime.

CREATE OR REPLACEPROCEDURE draw_sample (p_size IN NUMBER) AS
    -- define datatypes to store the probability distribution  type  prob_rec is record (weight number, distrib number);  type  prob_tab is table of prob_rec index by pls_integer;    prec        prob_rec;  prob        prob_tab;
    total_weight  number;
  -- set max.precision for the uniform distribution on 0<=x<=1  -- dbms_random.random produces -2^31 <= integer <= 2^31  uni_prec      constant number := power(2,31);  uni           number;    -- load probability data   cursor c_prob is     select * from prob_dist order by num asc;  cursor c_sample (p_num number) is     select 'x' from sample_tab where num=p_num;        i             number;  cumul_weight  number;  BEGIN  delete sample_tab;  select sum(weight) into total_weight from prob_dist;
  for r_prob in c_prob loop    -- initialize sample_tab table     insert into sample_tab values (r_prob.num,0);      -- we need the prob.distribution function    select sum(weight) into cumul_weight      from prob_dist        where num <= r_prob.num;        prec.weight:=r_prob.weight;    prec.distrib:=cumul_weight/total_weight;      prob(r_prob.num) := prec;
    end loop;
    -- draw the sample  for t in 1..p_size loop    -- draw a unif.distributed random number    uni := mod(abs(dbms_random.random),uni_prec)/uni_prec;        -- find the corresp.subinterval of [0,1]    i := prob.first;     loop      prec:=prob(i);       exit when uni <= prec.distrib;      i := prob.next(i);      exit when (i is null);    end loop;        -- store the sample    if (i is not null) then      update sample_tab set freq = freq + 1 where num = i;    end if;  end loop;  commit;END;

The trick consists of partitioning the unit interval into subintervals of length corresponding to the weight of the numbers in the sample space and determining in which subinterval a dbms_random gererated uniformly distributed number falls. The index of the subinterval yields the number drawn.

Put a little more mathematically, what is essentially does is apply the generalized or Penrose inverse of the desired probabilty distribution function to a (continuous) uniformly distributed (on 0<=x<=1) random variable. This yields a random variable with the correct distribution.

The code can of course be generalized eg.to allow for sampling from a set of strings but my point should be clear.