My First Steps with RAC One Node 11gR2 (11.2.0.2)

6

I’ve been using RAC (10gR2) for years now and I was wondering how RAC One Node differs from RAC. That’s why I used a few ‘old’ RAC servers that we had ‘laying’ around to install a RAC One Node Cluster. I also used that cluster to get some hands on experience with ASM but that is not the topic for this post, so I will not go into that further.

I assume that installing the software is a breeze for everyone who has some experience with RAC and who has read the manual so I will not go into that as well. Once you have fulfilled the prerequisites it’s a matter of next, next, finish.

The first thing I noticed is that the instance name used is the database name with the use of an _1 or _2 (as I have two nodes) instead of just appending the instance number to the database name.

So let’s see what happens when I pull the plug of server1 (aka give it a reboot) on which orcl11g_1 is running?

That’s unexpected: orcl11g_1 is being started on server2!

I expected that orcl11g_2 was going to be started on server2.

Starting an Online Database Relocation is starting orcl11g_2 on server1 which was up and running again before I started the relocation. That is expected as the relocation can only be “Online” if there are two instances alive for at least a short time.

Can we start an instance on an other node?

srvctl start instance -d orcl11g -n server1 -i orcl11g_1
PRKO-2136 : 'srvctl start/stop/enable/disable/modify/status/setenv/getenv/unsetenv instance' commands are not supported with RAC One Node databases

To get orcl11g_1 on server1, or orcl11g_2 on server2 I needed to do a reboot of the server running the active instance. So if you run scripts on your servers you need to be aware that the instance running on that server can have a different name then you expect.

So let’s see how “Online” the relocation is out of the box…
I have instance_2 running on server2 and start an Online Database Relocation from DBConsole while I have an open sqlplus session to a service that is currently running on orcl11g_2.

The output of the Online Database Relocation job in DBConsole is:

RAC One node: Online Database Relocation to other node.
Configuration updated to two instances
Instance orcl11g_1 started
Services relocated
Waiting for 30 minutes for instance orcl11g_2 to stop.....
Instance orcl11g_2 stopped
Configuration updated to one instance

Trying:

select instance_name, host_name from v$instance;
select instance_name, host_name from v$instance
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

Hummm that’s not supposed to happen!

What is the current definition of the service?

select name, failover_method, failover_type, failover_retries, failover_delay, goal,enabled, aq_ha_notifications, clb_goal
from dba_services where name='service_one';
NAME FAILOVER_METHOD FAILOVER_TYPE FAILOVER_RETRIES FAILOVER_DELAY GOAL ENA AQ_ CLB_G
------------- --------------- ------------- ---------------- -------------- ------------ --- --- -----
service_one NONE NONE 0 0 NONE NO NO LONG

So there is no TAF configured for this service and I’m not sure I didn’t configure that or if I wasn’t asked by the installer.

In 10G I could configure TAF like this:

execute dbms_service.modify_service ( service_name => 'service_one', aq_ha_notifications => TRUE , failover_method => dbms_service.failover_method_basic, failover_type => dbms_service.failover_type_select, failover_retries => 5, failover_delay => 5, clb_goal => dbms_service.clb_goal_long);

It seems to have worked:

SQL> select name, failover_method,failover_type,failover_retries,failover_delay,goal,enabled,aq_ha_notifications,clb_goal from dba_services
2 where name='service_one';

NAME FAILOVER_METHOD FAILOVER_TYPE FAILOVER_RETRIES FAILOVER_DELAY GOAL ENA AQ_ CLB_G
------------- --------------- ------------- ---------------- -------------- ------------ --- --- -----
service_one BASIC SELECT 5 5 NONE NO YES LONG

Let’s do an Online Database Relocation and query in a sqlplus session.

SQL> select instance_name, host_name from v$instance;
select instance_name, host_name from v$instance
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

Log on again, check the service settings:

SQL> select name, failover_method,failover_type,failover_retries,failover_delay,goal,enabled,aq_ha_notifications,clb_goal from dba_services
2 where name='service_one';

NAME FAILOVER_METHOD FAILOVER_TYPE FAILOVER_RETRIES FAILOVER_DELAY GOAL ENA AQ_ CLB_G
------------- --------------- ------------- ---------------- -------------- ------------ --- --- -----
service_one NONE NONE 0 0 NONE NO NO LONG

It looks like the TAF settings of the service are lost…

Let’s set them again on this instance and see if it helps.

SQL> select name, failover_method,failover_type,failover_retries,failover_delay,goal,enabled,aq_ha_notifications,clb_goal from dba_services
2 where name='service_one';

NAME FAILOVER_METHOD FAILOVER_TYPE FAILOVER_RETRIES FAILOVER_DELAY GOAL ENA AQ_ CLB_G
------------- --------------- ------------- ---------------- -------------- ------------ --- --- -----
service_one BASIC SELECT 5 5 NONE NO YES LONG

Do an other Online Database Relocation and query again in the sqlplus session.

SQL> select instance_name, host_name from v$instance;
select instance_name, host_name from v$instance
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

No it didn’t work. I had a look at the manual and it seems that in 11gR2 you need to set TAF on a service via srvctl.

Now what are the current settings?

srvctl config service -d orcl11g -s service_one -v

Service name: service_one
Service is enabled
Server pool: orcl11g
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: orcl11g_1
Available instances:

Set TAF via srvctl:

srvctl modify service -d orcl11g -s service_one -q TRUE -P BASIC -j LONG -e SELECT -m BASIC -z 5 -w 5

srvctl config service -d orcl11g -s service_one -v

Service name: service_one
Service is enabled
Server pool: orcl11g
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: true
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 5
TAF failover delay: 5
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: orcl11g_1
Available instances:

Now an Online Database Relocation works without an ORA-03113 in the client session:

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME HOST_NAME
---------------- --------------------------------------
orcl11g_1 server1

Pulling the plug on that server results in a True Application Failover as well:

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME HOST_NAME
---------------- --------------------------------------
orcl11g_1 server2

And that is very nice indeed :-)

You might want to get a feeling for how long it will take for your servers to start an instance and set the values for TAF failover retries and TAF failover delay accordingly.
Or you will still get an: ORA-03113: end-of-file on communication channel

If I use the values of 5 retries and a 5 second delay and start to query when I rebooted a server I got the ORA-03113. But with a TAF failover delay of 60 seconds the query returned with results after about 122 seconds.

One more thing I noticed: The OEM agent cannot be started via “emctl start agent” anymore. But it will be started with: “emctl start dbconsole”
When the DBConsole is already running on the other server this command will only start the agent on this server.

Share.

About Author

6 Comments

  1. @mr_love
    Question nr1: You have a failover therefor you see the other servername. See my article above
    Question nr2: The delay depends on how fast your machine can start the instance on the other node. With RAC One node only one node is active at a time.

  2. sorry. I read your post is usefull. Thanks u so muck.
    but i have 2 questions, pls ansewr :
    Question1 :
    – i have 2 node. and 1 client to connect.
    when 2 node running. i check in client : select from v$instance . i allway see : only node1 or node 2 running.
    i think: after some minutes,  the host_name must change from node1 to node 2 ( or node 2 to node 1)

    1
    SQL> select instance_name, host_name from v$instance;

    2

    3
    INSTANCE_NAME    HOST_NAME

    4
    ---------------- --------------------------------------

    5
    orcl11g_1        server1

    1
    SQL> select instance_name, host_name from v$instance;

    2

    3
    INSTANCE_NAME    HOST_NAME

    4
    ---------------- --------------------------------------

    5
    orcl11g_1        server2

    Question 2 : 2 node is running , client is running node 2. when i shutdown node 2, after about 1 minute, the client is running node 1. why the delay is too slow. i declare delay 5 s in client.
    my tnsname in client:
    ABC =
    (DESCRIPTION =
    (ENABLE=BROKEN)
    (LOAD_BALANCE=ON)
    (ADDRESS=(PROTOCOL=TCP)(HOST=1xxx.117)(PORT=1521))
    (ADDRESS=(PROTOCOL=TCP)(HOST=1xxx.118)(PORT=1521))
    (CONNECT_DATA=
    (SERVICE_NAME=ABC)
    (FAILOVER_MODE =
    (TYPE=session)
    (METHOD=basic)
    (RETRIES=10)
    (DELAY=5)
    )
    )
    )
     
     
    thanks u.

  3. Hi
    This was my exam question last week and still not sure about the answer.
    Any help is much appreciated.

    Which three statements are true about services and transparent application failover (TAF)
    A-)TAF has been configured for a service, sessions using that service fail over to a surviving instance when an outage occurs
    B-)The TAF setting on a service can be none, basic, preconnect or postconnect and overrides and TAF setting in the client connection definition.
    C-)TAF can restart a query after failover has completed but for other statements such as insert,update or delete, the application must resubmit the transaction
    D-)The Taf setting for a client connection overrides any TAF setting in the service definition
    E-)Services simplify the deployment of TAF because definining a TAF policy for a service, all connections using this service will automatically have TAF enabled

  4. Nice post.  I am looking into RAC One as well.  Do you think it is a good replacement for Veritas VCS solution?