Virtual Private Database... 548e3b6c322e606acca4c29c49b32bd9

Virtual Private Database…

Some time ago I was asked to assist in fixing or at least finding the cause of a performance problem. The application ran fine until the Virtual Private Database (VPD) policy was applied. Oracle claims there should be near zero impact on your application when you implement VPD, then how is this possible?

First of all, the policy applied was a rather complex one. A complex query should be executed to determine if the current user has access to the record. Let’s say this query takes up a second, then I would expect my query to run about a second slower, maybe two. But the query took several minutes to complete when the VPD policy was applied. This didn’t make sense to me, so I decided to find out what was really happening.
To do this, I opened up my sandbox database to try and recreate this situation.
First I need to create two new users

create user vpd1 identified by vpd1
/
grant connect, resource to vpd1
/
create user vpd2 identified by vpd2
/
grant connect, resource to vpd2
/

Then I created a simple table to hold the data that should be protected by the VPD policy:

drop table emp purge
/
create table emp
       (empno number(4) not null, 
        ename varchar2(10), 
        job varchar2(9), 
        mgr number(4), 
        hiredate date, 
        sal number(7, 2), 
        comm number(7, 2), 
        deptno number(2))
;
insert into emp values (7369, 'SMITH',  'CLERK',     7902, to_date('17-12-1980', 'DD-MM-YYYY'),  800, null, 20);
insert into emp values (7499, 'ALLEN',  'SALESMAN',  7698, to_date('20-02-1981', 'DD-MM-YYYY'), 1600,  300, 30);
insert into emp values (7521, 'WARD',   'SALESMAN',  7698, to_date('22-02-1981', 'DD-MM-YYYY'), 1250,  500, 30);
insert into emp values (7566, 'JONES',  'MANAGER',   7839, to_date('02-04-1981', 'DD-MM-YYYY'),  2975, null, 20);
insert into emp values (7654, 'MARTIN', 'SALESMAN',  7698, to_date('28-09-1981', 'DD-MM-YYYY'), 1250, 1400, 30);
insert into emp values (7698, 'BLAKE',  'MANAGER',   7839, to_date('01-05-1981', 'DD-MM-YYYY'),  2850, null, 30);
insert into emp values (7782, 'CLARK',  'MANAGER',   7839, to_date('09-06-1981', 'DD-MM-YYYY'),  2450, null, 10);
insert into emp values (7788, 'SCOTT',  'ANALYST',   7566, to_date('09-12-1982', 'DD-MM-YYYY'), 3000, null, 20);
insert into emp values (7839, 'KING',   'PRESIDENT', null, to_date('17-11-1981', 'DD-MM-YYYY'), 5000, null, 10);
insert into emp values (7844, 'TURNER', 'SALESMAN',  7698, to_date('08-09-1981', 'DD-MM-YYYY'),  1500,    0, 30);
insert into emp values (7876, 'ADAMS',  'CLERK',     7788, to_date('12-01-1983', 'DD-MM-YYYY'), 1100, null, 20);
insert into emp values (7900, 'JAMES',  'CLERK',     7698, to_date('03-12-1981', 'DD-MM-YYYY'),   950, null, 30);
insert into emp values (7902, 'FORD',   'ANALYST',   7566, to_date('03-12-1981', 'DD-MM-YYYY'),  3000, null, 20);
insert into emp values (7934, 'MILLER', 'CLERK',     7782, to_date('23-01-1982', 'DD-MM-YYYY'), 1300, null, 10);
commit
/
drop table emp_vpd purge
/
create table emp_vpd as select * from emp
/
commit
/

And of course I need to grant access to this table to the newly created users:

grant all on emp_vpd to vpd1
/
grant all on emp_vpd to vpd2
/

On the table I need to create a policy function so I create a package (which mimics the customers package, just simpler) to do this:

create or replace package emp_vpd_policy as
  function first_policy(owner_in   in varchar2
                       ,objname_in in varchar2) return varchar2;
  function allowed(empno_in  in number
                  ,deptno_in in number) return number;
end emp_vpd_policy;
/
sho err
create or replace package body emp_vpd_policy as
  function first_policy(owner_in   in varchar2
                       ,objname_in in varchar2) return varchar2 is
  begin
    dbms_output.put_line('first policy');
    if (user = 'VPD1') then
      return 'emp_vpd_policy.allowed(emp_vpd.empno, emp_vpd.deptno)=10';
    elsif user = 'VPD2' then
      return 'emp_vpd_policy.allowed(emp_vpd.empno, emp_vpd.deptno)=20';
    else
      return '1=1';
    end if;
  end first_policy;
  function allowed(empno_in  in number
                  ,deptno_in in number) return number is
  begin
    dbms_output.put_line('emp_vpd_policy.allowed(' || empno_in || ',' || deptno_in || ')');
    return deptno_in;
  end allowed;
end emp_vpd_policy;
/
sho err

and then protect the EMP_VPD table using a policy:

begin
  sys.dbms_rls.add_policy(object_schema => 'DEMO'
                         ,object_name => 'EMP_VPD'
                         ,policy_name => 'EMP_VPD_SEL'
                         ,function_schema => '&myuser'
                         ,policy_function => 'EMP_VPD_POLICY.FIRST_POLICY'
                         ,statement_types => 'SELECT');
end;
/

The package will show what will happen when I perform a select on the table:

conn vpd1/vpd1
set serveroutput on size unlimited
select * from demo.emp_vpd
/
EMPNO ENAME      JOB         MGR HIREDATE          SAL      COMM DEPTNO
----- ---------- --------- ----- ----------- --------- --------- ------
 7782 CLARK      MANAGER    7839 6/9/1981      2450.00               10
 7839 KING       PRESIDENT       11/17/1981    5000.00               10
 7934 MILLER     CLERK      7782 1/23/1982     1300.00               10
first policy
first policy
emp_vpd_policy.allowed(7369,20)
emp_vpd_policy.allowed(7499,30)
emp_vpd_policy.allowed(7521,30)
emp_vpd_policy.allowed(7566,20)
emp_vpd_policy.allowed(7654,30)
emp_vpd_policy.allowed(7698,30)
emp_vpd_policy.allowed(7782,10)
emp_vpd_policy.allowed(7788,20)
emp_vpd_policy.allowed(7839,10)
emp_vpd_policy.allowed(7844,30)
emp_vpd_policy.allowed(7876,20)
emp_vpd_policy.allowed(7900,30)
emp_vpd_policy.allowed(7902,20)
emp_vpd_policy.allowed(7934,10)

In my case this is done rather quickly, there’s almost no difference in timing for the query with or without the policy applied. But as you can see, the policy is executed for each and every record that is being checked. Well, not really the policy itself, but the function that is defined in the policy. So if this function takes a lot of time and your table has a lot of records then the query will run for a very long time. There has got to be a better way to do this.
Let’s analyze what happens, the actual policy is executed twice. What if we use this architecture to our benefit. In the first pass we can setup some in memory data structure to hold whatever we need, this might take some time and then in the second pass we can use this data to be used in the actual check.
First we drop the policy so we can create a new one:

begin
  sys.dbms_rls.drop_policy(object_schema => '&myuser'
                          ,object_name => 'EMP_VPD'
                          ,policy_name => 'EMP_VPD_SEL');
end;
/

For our implementation we need a Nested Table type to be created in the database:

create or replace type empnos_tt is table of number(4) 
/

Then we create a new package to hold the policy function.

create or replace package emp_vpd_pp as
  function sel( owner_in   in varchar2
              , objname_in in varchar2
              ) return varchar2;
  function read_g_empnos return empnos_tt;
end emp_vpd_pp;
/ 
sho err

The function SEL will be used in the policy. The function READ_G_EMPNOS is needed to retrieve the data in the package variable. Then the actual implementation of the package:

create or replace package body emp_vpd_pp as
  g_empnos empnos_tt;
  beenhere boolean := false;
  function sel( owner_in   in varchar2
              , objname_in in varchar2
              ) return varchar2 is
  begin
    if not(beenhere) then
      if user = 'VPD1' then
        begin
          select emp.empno
            bulk collect into g_empnos
            from emp
           where emp.deptno = 10;
         exception
           when others then
           dbms_output.put_line(sqlerrm);
         end;
      elsif user = 'VPD2' then
        begin
          select emp.empno
            bulk collect into g_empnos
            from emp
           where emp.deptno = 20;
         exception
           when others then
           dbms_output.put_line(sqlerrm);
         end;
      end if;
    end if;
    beenhere := not(beenhere);
    if ((user = 'VPD1') or (user = 'VPD2')) then
      return 'emp_vpd.empno in (select column_value
                                  from table(emp_vpd_pp.read_g_empnos))';
    else
      return '1=1';
    end if;
  end sel;
  function read_g_empnos return empnos_tt
    is
    begin
      return (g_empnos);
    end;
begin
  beenhere := false; 
end emp_vpd_pp;
/
sho err

In the initialization section of the package we initialize the Boolean variable. Then, when the policy function is executed for the first time (per query) we select the column values we need and save that into the package variable. The second time we execute the policy function we use the values saved in the predicate that is being added.

begin
  sys.dbms_rls.add_policy(object_schema => 'DEMO'
                         ,object_name => 'EMP_VPD'
                         ,policy_name => 'EMP_VPD_SEL'
                         ,function_schema => 'DEMO'
                         ,policy_function => 'EMP_VPD_PP.SEL'
                         ,statement_types => 'SELECT');
end;
/

Notice the predicate with the use of the Nested Table is executed always, but the Nested Table is only filled up in the first execution of the policy function. Using this technique the database only has to execute the expensive query once and its result can be used multiple times at almost no cost.
Using this policy function has exactly the same result, but the execution improved dramatically. Using this technique the database only has to execute the expensive query once per query instead of for every row.

One Response

  1. Marcus August 2, 2016