Friday, June 17, 2016

Example: Redefinition from Enterprise Manager

--   Target database: USCS90PR
--   Script generated on: 17-JUN-2016   11:51
CREATE TABLE "SYSADM"."LC_MNT_CLASS_CA$REORG" (
   "INSTITUTION" VARCHAR2(5) NOT NULL,
   "CAMPUS" VARCHAR2(5) NOT NULL,
   "ACAD_CAREER" VARCHAR2(4) NOT NULL,
   "STRM" VARCHAR2(4) NOT NULL,
   "CRSE_ID" VARCHAR2(6) NOT NULL,
   "CRSE_OFFER_NBR" NUMBER NOT NULL,
   "CLASS_NBR" NUMBER NOT NULL,
   "SESSION_CODE" VARCHAR2(3) NOT NULL,
   "CLASS_SECTION" VARCHAR2(4) NOT NULL,
   "LC_EST_MIGRACION" VARCHAR2(1) NOT NULL,
   "DESCRLONG" CLOB)
   TABLESPACE "PSDEFAULT" PCTFREE 10 INITRANS 1 MAXTRANS 255
   STORAGE (INITIAL 64K BUFFER_POOL DEFAULT) LOGGING NOCOMPRESS
   LOB ("DESCRLONG") STORE AS BASICFILE (
      TABLESPACE "PSDEFAULT" CHUNK 8192
      STORAGE (INITIAL 64K BUFFER_POOL DEFAULT)
      RETENTION NOCACHE LOGGING)
COMMIT
BEGIN DBMS_REDEFINITION.START_REDEF_TABLE('"SYSADM"', '"LC_MNT_CLASS_CA"', '"LC_MNT_CLASS_CA$REORG"', NULL, DBMS_REDEFINITION.CONS_USE_ROWID); END;
{SYSADM} GRANT SELECT ON "SYSADM"."LC_MNT_CLASS_CA$REORG" TO "CES"
{SYSADM} GRANT SELECT ON "SYSADM"."LC_MNT_CLASS_CA$REORG" TO "CES_ITIS"
{SYSADM} GRANT SELECT ON "SYSADM"."LC_MNT_CLASS_CA$REORG" TO "CES_USBBO"
{SYSADM} GRANT SELECT ON "SYSADM"."LC_MNT_CLASS_CA$REORG" TO "CES_USBCA"
{SYSADM} GRANT SELECT ON "SYSADM"."LC_MNT_CLASS_CA$REORG" TO "CES_USBCT"
{SYSADM} GRANT SELECT ON "SYSADM"."LC_MNT_CLASS_CA$REORG" TO "CES_USBME"
BEGIN DBMS_STATS.GATHER_TABLE_STATS('"SYSADM"', '"LC_MNT_CLASS_CA$REORG"', estimate_percent=>NULL, cascade=>TRUE); END;
BEGIN DBMS_REDEFINITION.FINISH_REDEF_TABLE('"SYSADM"', '"LC_MNT_CLASS_CA"', '"LC_MNT_CLASS_CA$REORG"'); END;
DROP TABLE "SYSADM"."LC_MNT_CLASS_CA$REORG" CASCADE CONSTRAINTS PURGE
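
For reference, the generated script creates the interim table, starts the redefinition, re-creates the grants, gathers statistics, finishes the redefinition, and drops the interim table (which by then holds the old data). On a busy table you can optionally resynchronize the interim table shortly before the finish step to shorten the final locking window; a minimal sketch, not part of the EM output:
BEGIN DBMS_REDEFINITION.SYNC_INTERIM_TABLE('"SYSADM"', '"LC_MNT_CLASS_CA"', '"LC_MNT_CLASS_CA$REORG"'); END;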

Thursday, June 9, 2016

Step by Step Guide on Creating Physical Standby Using RMAN DUPLICATE...FROM ACTIVE DATABASE (Doc ID 1075908.1)

Applies to:

Oracle Database - Enterprise Edition - Version 9.2.0.8 and later
Information in this document applies to any platform.
Checked for relevance on 21-AUG-2015


Goal


Step by step guide on how to create a physical standby database using the RMAN DUPLICATE FROM ACTIVE DATABASE command, without shutting down the primary and using the primary's active database files (no need to take a backup).
This feature is available from Oracle 11g onwards.
For more on this topic, you can also access the Backup & Recovery Community of customers and Oracle specialists directly via:
https://community.oracle.com/community/support/oracle_database/database_backup_and_recovery

Database Name :- chicago
Primary db_unique_name :- chicago
standby db_unique_name :- boston

For duplicating a non-standby database, see:
   Note 452868.1 RMAN 'Duplicate From Active Database' Feature in 11G

Solution

1. Make the necessary changes to the primary database:
    a. Enable force logging.
    b. Create the password file if one does not exist.
    c. Create standby redo logs.
    d. Modify the parameter file for Data Guard.
2. Ensure that SQL*Net connectivity is working.
3. Create the standby database over the network using the active (primary) database files:
    a. Create the password file.
    b. Create the initialization parameter file for the standby database (auxiliary database).
    c. Create the necessary mount points or folders for the database files.
    d. Run the standby creation ON STANDBY by connecting to the primary as the target database.

DUPLICATE TARGET DATABASE 
FOR STANDBY
FROM ACTIVE DATABASE
SPFILE
   PARAMETER_VALUE_CONVERT '<primary_string>', '<standby_string>'
   SET DB_FILE_NAME_CONVERT '<primary_datafile_path>', '<standby_datafile_path>'
   SET LOG_FILE_NAME_CONVERT '<primary_log_path>', '<standby_log_path>'
   SET SGA_MAX_SIZE 200M
   SET SGA_TARGET 125M;
4. Check the log shipping and apply.

PROCEDURE:

While creating the standby database we use the active (primary) database files; i.e., this command creates the physical standby database from the active database files over the network.

1. Prepare the production database to be the primary database

a. Ensure that the database is in ARCHIVELOG mode.
SQL> select log_mode from v$database;

LOG_MODE
------------
ARCHIVELOG

b. Enable force logging
SQL> ALTER DATABASE FORCE LOGGING;

c. Create standby redo logs
SQL> alter database add standby logfile '<file_name>' size <size>;
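For example (file name and size are illustrative; standby redo logs should be the same size as the online redo logs):
SQL> alter database add standby logfile '/u01/app/oracle/databases/chicago/redo/srl01.log' size 50m;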

d. Modify the primary initialization parameters for Data Guard on the primary:
SQL> alter system set LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)';
System altered.

SQL> alter system set LOG_ARCHIVE_DEST_1='LOCATION=/u01/app/oracle/databases/chicago/redo/ VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=chicago';
System altered.

SQL> alter system set LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=boston';
System altered.

SQL> alter system set LOG_ARCHIVE_DEST_STATE_1=ENABLE;
System altered.

SQL> alter system set FAL_SERVER=boston;
System altered.

SQL> alter system set FAL_CLIENT=chicago;
System altered.

SQL> alter system set DB_FILE_NAME_CONVERT='/u01/app/oracle/databases/boston/data/','/u01/app/oracle/databases/chicago/data' scope=spfile;
System altered.

SQL> alter system set LOG_FILE_NAME_CONVERT='/u01/app/oracle/databases/boston/redo/','/u01/app/oracle/databases/chicago/redo' scope=spfile;
System altered.

2. Ensure that SQL*Net connectivity is working.

Insert a static entry for Boston in the listener.ora file of the standby system.

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
     (GLOBAL_DBNAME = boston.us.oracle.com)
     (ORACLE_HOME = /u01/app/oracle/product/OraHome111)
     (SID_NAME = boston)
    )
   )

LISTENER =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = <standby_host>)(PORT = 1521))
  )

TNSNAMES.ORA for the Primary and Standby should have BOTH entries

CHICAGO =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = <primary_host>)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = chicago.us.oracle.com))
  )

BOSTON =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = <standby_host>)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = boston.us.oracle.com))
  )
Check the SQL*Net configuration using the following commands on BOTH the primary and the standby:
% tnsping chicago
% tnsping boston

3. Create the standby database

a. Copy the password file from the primary $ORACLE_HOME/dbs and rename it to the standby database name.
The username is required to be SYS and the password needs to be the same on the primary and the standby.
The best practice is to copy the password file as suggested.
The password file name must match the instance name/SID used at the standby site, not the DB_NAME.
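For example (host name is illustrative):
$ scp $ORACLE_HOME/dbs/orapwchicago oracle@boston-host:$ORACLE_HOME/dbs/orapwboston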
b. Create an initialization parameter file for the standby database (auxiliary instance); only a few parameters are needed:
DB_NAME=chicago
DB_UNIQUE_NAME=boston
DB_BLOCK_SIZE=<same value as on the primary>
c. Create the necessary mount points or folders in the standby location for the database files and trace files ($ADR_HOME).
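For example, directories consistent with the locations used elsewhere in this note (paths are illustrative):
$ mkdir -p /u01/app/oracle/databases/boston/data
$ mkdir -p /u01/app/oracle/databases/boston/redo
$ mkdir -p /u02/app/oracle/admin/boston/adump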

d. Set the environment variable ORACLE_SID to the standby SID and start the standby instance.
% export ORACLE_SID=boston
% sqlplus "/ as sysdba"
SQL> startup nomount pfile=$ORACLE_HOME/dbs/initcore1.ora
NOTE: Use either a PFILE or an SPFILE.

# Additional comment:
# If DUPLICATE without TARGET connection is used, you cannot use the SPFILE clause;
# otherwise you get:

RMAN-05537: DUPLICATE without TARGET connection when auxiliary instance is started with spfile cannot use SPFILE clause

e. Verify that the connection 'AS SYSDBA' is working:
% sqlplus /nolog
SQL> connect sys/<password>@chicago AS SYSDBA

f. On the primary system, invoke the RMAN executable and connect to the primary and the auxiliary database (i.e., the standby):
$ rman target sys/sys@chicago auxiliary sys/sys@boston

connected to target database: CHICAGO (DBID=761464750)
connected to auxiliary database: CHICAGO (not mounted)

RMAN> run {
allocate channel prmy1 type disk;
allocate channel prmy2 type disk;
allocate channel prmy3 type disk;
allocate channel prmy4 type disk;
allocate auxiliary channel stby type disk;

duplicate target database for standby from active database
spfile
  parameter_value_convert 'chicago','boston'
  set db_unique_name='boston'
  set db_file_name_convert='/chicago/','/boston/'
  set log_file_name_convert='/chicago/','/boston/'
  set control_files='/u01/app/oracle/oradata/control01.ctl'
  set log_archive_max_processes='5'
  set fal_client='boston'
  set fal_server='chicago'
  set standby_file_management='AUTO'
  set log_archive_config='dg_config=(chicago,boston)'
  set log_archive_dest_2='service=chicago ASYNC valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=chicago'
;
}

using target database control file instead of recovery catalog
allocated channel: prmy1
channel prmy1: SID=147 device type=DISK

allocated channel: prmy2
channel prmy2: SID=130 device type=DISK

allocated channel: prmy3
channel prmy3: SID=137 device type=DISK

allocated channel: prmy4
channel prmy4: SID=170 device type=DISK

allocated channel: stby
channel stby: SID=98 device type=DISK

Starting Duplicate Db at 19-MAY-08

contents of Memory Script:
{
backup as copy reuse
file '/u02/app/oracle/product/11.1.0/db_1/dbs/orapwcore' auxiliary format'/u02/app/oracle/product/11.1.0/db_1/dbs/orapwcore1'
file'/u02/app/oracle/product/11.1.0/db_1/dbs/spfilecore.ora' auxiliary format'/u02/app/oracle/product/11.1.0/db_1/dbs/spfilecore1.ora' ;
sql clone "alter system set spfile= ''/u02/app/oracle/product/11.1.0/db_1/dbs/spfilecore1.ora''";
}
executing Memory Script

Starting backup at 19-MAY-08
Finished backup at 19-MAY-08

sql statement: alter system set spfile= ''/u02/app/oracle/product/11.1.0/db_1/dbs/spfilecore1.ora''

contents of Memory Script:
{
sql clone "alter system set audit_file_dest =''/u02/app/oracle/admin/boston/adump'' comment='''' scope=spfile";
sql clone "alter system set dispatchers =''(PROTOCOL=TCP) (SERVICE=core1XDB)'' comment='''' scope=spfile";
sql clone "alter system set log_archive_dest_2 =''service=core11 arch async VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) db_unique_name=boston'' comment='''' scope=spfile";
sql clone "alter system set db_unique_name =''boston'' comment='''' scope=spfile";
sql clone "alter system set db_file_name_convert =''/chicago/'', ''/boston/'' comment='''' scope=spfile";
sql clone "alter system set log_file_name_convert =''/chicago/'', ''/boston/'' comment='''' scope=spfile";
sql clone "alter system set control_files =''/u01/app/oracle/oradata/control01.ctl'' comment='''' scope=spfile";
sql clone "alter system set log_archive_max_processes =5 comment='''' scope=spfile";
sql clone "alter system set fal_client =''boston'' comment='''' scope=spfile";
sql clone "alter system set fal_server =''chicago'' comment='''' scope=spfile";
sql clone "alter system set standby_file_management =''AUTO'' comment='''' scope=spfile";
sql clone "alter system set log_archive_config =''dg_config=(chicago,boston)'' comment='''' scope=spfile";
sql clone "alter system set log_archive_dest_2 =''service=chicago ASYNC valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=chicago'' comment='''' scope=spfile";
shutdown clone immediate;
startup clone nomount ;
}
executing Memory Script

sql statement: alter system set audit_file_dest = ''/u02/app/oracle/admin/boston/adump'' comment= '''' scope=spfile
sql statement: alter system set dispatchers = ''(PROTOCOL=TCP) (SERVICE=core1XDB)'' comment= '''' scope=spfile
sql statement: alter system set log_archive_dest_2 = ''service=core11 arch async VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) db_unique_name=boston'' comment= '''' scope=spfile
sql statement: alter system set db_unique_name = ''boston'' comment= '''' scope=spfile
sql statement: alter system set db_file_name_convert = ''/chicago/'', ''/boston/'' comment= '''' scope=spfile
sql statement: alter system set log_file_name_convert = ''/chicago/'', ''/boston/'' comment= '''' scope=spfile
sql statement: alter system set control_files = ''/u01/app/oracle/oradata/control01.ctl'' comment= '''' scope=spfile
sql statement: alter system set log_archive_max_processes = 5 comment= '''' scope=spfile
sql statement: alter system set fal_client = ''boston'' comment= '''' scope=spfile
sql statement: alter system set fal_server = ''chicago'' comment= '''' scope=spfile
sql statement: alter system set standby_file_management = ''AUTO'' comment= '''' scope=spfile
sql statement: alter system set log_archive_config = ''dg_config=(chicago,boston)'' comment= '''' scope=spfile
sql statement: alter system set log_archive_dest_2 = ''service=chicago ASYNC valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=chicago'' comment= '''' scope=spfile

Oracle instance shut down

connected to auxiliary database (not started)
Oracle instance started

Total System Global Area 845348864 bytes

Fixed Size 1303188 bytes
Variable Size 482348396 bytes
Database Buffers 356515840 bytes
Redo Buffers 5181440 bytes

contents of Memory Script:
{
backup as copy current controlfile for standby auxiliary format '/u01/app/oracle/oradata/control01.ctl';
sql clone 'alter database mount standby database';
}
executing Memory Script

Starting backup at 19-MAY-08
channel prmy1: starting datafile copy
copying standby control file
output file name=/u02/app/oracle/product/11.1.0/db_1/dbs/snapcf_chicago.f tag=TAG20080519T173406 RECID=2 STAMP=655148053
channel prmy1: datafile copy complete, elapsed time: 00:00:03
Finished backup at 19-MAY-08

sql statement: alter database mount standby database

contents of Memory Script:
{
set newname for tempfile 1 to"/u02/app/oracle/oradata/boston/temp01.dbf";
switch clone tempfile all;
set newname for datafile 1 to "/u02/app/oracle/oradata/boston/system01.dbf";
set newname for datafile 2 to "/u02/app/oracle/oradata/boston/sysaux01.dbf";
set newname for datafile 3 to "/u02/app/oracle/oradata/boston/undotbs01.dbf";
set newname for datafile 4 to "/u02/app/oracle/oradata/boston/users01.dbf";
backup as copy reuse
datafile 1 auxiliary format "/u02/app/oracle/oradata/boston/system01.dbf"
datafile 2 auxiliary format "/u02/app/oracle/oradata/boston/sysaux01.dbf"
datafile 3 auxiliary format "/u02/app/oracle/oradata/boston/undotbs01.dbf"
datafile 4 auxiliary format "/u02/app/oracle/oradata/boston/users01.dbf" ;
sql 'alter system archive log current';
}
executing Memory Script

executing command: SET NEWNAME

renamed tempfile 1 to /u02/app/oracle/oradata/boston/temp01.dbf in control file

executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME

Starting backup at 19-MAY-08
channel prmy1: starting datafile copy
input datafile file number=00001 name=/u02/app/oracle/oradata/chicago/system01.dbf
channel prmy2: starting datafile copy
input datafile file number=00002 name=/u02/app/oracle/oradata/chicago/sysaux01.dbf
channel prmy3: starting datafile copy
input datafile file number=00003 name=/u02/app/oracle/oradata/chicago/undotbs01.dbf
channel prmy4: starting datafile copy
input datafile file number=00004 name=/u02/app/oracle/oradata/chicago/users01.dbf
output file name=/u02/app/oracle/oradata/boston/undotbs01.dbf tag=TAG20080519T173421 RECID=0 STAMP=0
channel prmy3: datafile copy complete, elapsed time: 00:00:24
output file name=/u02/app/oracle/oradata/boston/users01.dbf tag=TAG20080519T173421 RECID=0 STAMP=0
channel prmy4: datafile copy complete, elapsed time: 00:00:16
output file name=/u02/app/oracle/oradata/boston/system01.dbf tag=TAG20080519T173421 RECID=0 STAMP=0
channel prmy1: datafile copy complete, elapsed time: 00:02:32
output file name=/u02/app/oracle/oradata/boston/sysaux01.dbf tag=TAG20080519T173421 RECID=0 STAMP=0
channel prmy2: datafile copy complete, elapsed time: 00:02:32
Finished backup at 19-MAY-08

sql statement: alter system archive log current

contents of Memory Script:
{
switch clone datafile all;
}
executing Memory Script

datafile 1 switched to datafile copy
input datafile copy RECID=2 STAMP=655148231 file name=/u02/app/oracle/oradata/boston/system01.dbf
datafile 2 switched to datafile copy
input datafile copy RECID=3 STAMP=655148231 file name=/u02/app/oracle/oradata/boston/sysaux01.dbf
datafile 3 switched to datafile copy
input datafile copy RECID=4 STAMP=655148231 file name=/u02/app/oracle/oradata/boston/undotbs01.dbf
datafile 4 switched to datafile copy
input datafile copy RECID=5 STAMP=655148231 file name=/u02/app/oracle/oradata/boston/users01.dbf
Finished Duplicate Db at 19-MAY-08
released channel: prmy1
released channel: prmy2
released channel: prmy3
released channel: prmy4

4. Start managed recovery

Connect to the standby using SQL*Plus and start the MRP (Managed Recovery Process). Then compare the primary's last archived log sequence with the sequence the MRP is applying.
Example :
SQL> alter database recover managed standby database disconnect from session;
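
To compare the sequences, queries along these lines can be used (standard v$ views).
On the primary:
SQL> select max(sequence#) from v$archived_log;
On the standby:
SQL> select process, status, sequence# from v$managed_standby where process like 'MRP%';
SQL> select max(sequence#) from v$archived_log where applied = 'YES';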

5. Open the standby database in read-only mode (Active Data Guard)

If you are licensed to use Active Data Guard (ADG), then open the standby database READ ONLY and restart the recovery:
SQL> alter database recover managed standby database cancel;
SQL> alter database open;
SQL> alter database recover managed standby database disconnect;

Creating a physical standby from ASM primary (Doc ID 787793.1)



Applies to:

Oracle Database - Enterprise Edition - Version 10.2.0.1 to 12.1.0.2 [Release 10.2 to 12.1]
Information in this document applies to any platform.
** checked for relevance '7-Jul-2015' **


Goal

This document offers a step-by-step procedure to create a physical standby from an ASM primary.
Database Name :- chicago
Primary db_unique_name :- chicago
Standby db_unique_name :- boston

Primary Hostname :- raca.idc.oracle.com
Standby Hostname :- core1.idc.oracle.com

Solution

1. Enable force logging

SQL> ALTER DATABASE FORCE LOGGING;

Database altered. 


2. On the primary node, create a staging directory. For example:
[oracle@raca chicago]$ mkdir -p /home/oracle/stage

3. Create the same exact path on the standby host:
[oracle@core1 ~]$ mkdir -p /home/oracle/stage


4. On the primary node, connect to the primary database and create a PFILE from the SPFILE in the staging directory. For example:
create pfile='/home/oracle/stage/boston.ora' from spfile;


5. On the primary node, perform an RMAN backup of the primary database that places the backup pieces into the staging directory. For example:
RMAN> run{
2> backup device type disk format '/home/oracle/stage/%U' database;
3> backup device type disk format '/home/oracle/stage/%U' current controlfile for standby;
4> }

Starting backup at 24-FEB-09
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=30 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=+DATA/chicago/datafile/system.271.679675991
input datafile fno=00003 name=+DATA/chicago/datafile/sysaux.273.679676023
input datafile fno=00002 name=+DATA/chicago/datafile/undo1.272.679676015
channel ORA_DISK_1: starting piece 1 at 24-FEB-09
channel ORA_DISK_1: finished piece 1 at 24-FEB-09
piece handle=/home/oracle/stage/01k8650r_1_1 tag=TAG20090224T154323 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:35
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 24-FEB-09
channel ORA_DISK_1: finished piece 1 at 24-FEB-09
piece handle=/home/oracle/stage/02k8651v_1_1 tag=TAG20090224T154323 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 24-FEB-09

Starting backup at 24-FEB-09
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including standby control file in backupset
channel ORA_DISK_1: starting piece 1 at 24-FEB-09
channel ORA_DISK_1: finished piece 1 at 24-FEB-09
piece handle=/home/oracle/stage/03k86525_1_1 tag=TAG20090224T154405 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 24-FEB-09

RMAN>

6. Copy the contents of the staging directory from primary to standby's staging location. For example
[oracle@raca stage]$ scp /home/oracle/stage/* oracle@core1:/home/oracle/stage/
oracle@core1's password:
01k8650r_1_1 100% 236MB 10.7MB/s 00:22
02k8651v_1_1 100% 6016KB 5.9MB/s 00:00
03k86525_1_1 100% 5984KB 5.8MB/s 00:01
boston.ora 100% 458 0.5KB/s 00:00
[oracle@raca stage]$


7. Prepare the initialization file for standby instance
[oracle@core1 ~]$ mkdir -p /u01/app/oracle/product/10.2/oradata
[oracle@core1 ~]$ cd /u01/app/oracle/product/10.2/oradata
[oracle@core1 oradata]$ mkdir redo data ctrl bdump udump srl arc1
[oracle@core1 oradata]$
Set the following parameters in the standby init.ora:
*.db_name='chicago'
*.db_create_file_dest='/u01/app/oracle/product/10.2/oradata/boston/data/'
*.db_create_online_log_dest_1='/u01/app/oracle/product/10.2/oradata/boston/redo/'
*.db_file_name_convert='+DATA/chicago/datafile/','/u01/app/oracle/product/10.2/oradata/boston/data/','+DATA/chicago/tempfile/','/u01/app/oracle/product/10.2/oradata/boston/data/'
*.log_file_name_convert='+DATA/chicago/onlinelog/','/u01/app/oracle/product/10.2/oradata/boston/redo/'
*.db_unique_name=boston

NOTE: For a RAC primary to a non-RAC standby, include the line below in the standby init.ora:
cluster_database=false

8. Create password file for standby database
[oracle@core1 boston]$ export ORACLE_SID=boston
[oracle@core1 dbs]$ orapwd file=orapwboston password=oracle
[oracle@core1 dbs]$ ls -ltr *boston*
-rw-r----- 1 oracle oinstall 1536 Mar 3 15:55 orapwboston

9. Create tnsnames entries (connect strings) on the standby server

chicago =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST =raca.idc.oracle.com )(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = chicago)
)
)

boston =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST =core1.idc.oracle.com )(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = boston)
)
)
Note:-

chicago connects to the primary, while boston connects to the standby itself.
10. Create standby database
[oracle@core1 boston]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.1.0 - Production on Tue Mar 3 16:55:19 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to an idle instance.

SQL> create spfile from pfile='/u01/app/oracle/product/10.2/oradata/boston/initboston.ora';

File created.

SQL> startup nomount
ORACLE instance started.

Total System Global Area 327155712 bytes
Fixed Size 1218844 bytes
Variable Size 150996708 bytes
Database Buffers 104857600 bytes
Redo Buffers 70082560 bytes
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

[oracle@core1 boston]$ /u01/app/oracle/product/10.2/bin/rman target sys/oracle@chicago auxiliary /

Recovery Manager: Release 10.2.0.1.0 - Production on Tue Mar 3 16:56:09 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: CHICAGO (DBID=1289394690)
connected to auxiliary database: CHICAGO (not mounted)

RMAN> duplicate target database for standby;

Starting Duplicate Db at 03-MAR-09
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: sid=36 devtype=DISK

contents of Memory Script:
{
restore clone standby controlfile;
sql clone 'alter database mount standby database';
}
executing Memory Script

Starting restore at 03-MAR-09
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backupset restore
channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: reading from backup piece /home/oracle/stage/03k86525_1_1
channel ORA_AUX_DISK_1: restored backup piece 1
piece handle=/home/oracle/stage/03k86525_1_1 tag=TAG20090224T154405
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01
output filename=/u01/app/oracle/product/10.2/oradata/boston/ctrl/ctrl01.ctl
Finished restore at 03-MAR-09

sql statement: alter database mount standby database
released channel: ORA_AUX_DISK_1

contents of Memory Script:
{
set newname for tempfile 1 to
"/u01/app/oracle/product/10.2/oradata/boston/data/tmp1.274.679676043";
switch clone tempfile all;
set newname for datafile 1 to
"/u01/app/oracle/product/10.2/oradata/boston/data/system.271.679675991";
set newname for datafile 2 to
"/u01/app/oracle/product/10.2/oradata/boston/data/undo1.272.679676015";
set newname for datafile 3 to
"/u01/app/oracle/product/10.2/oradata/boston/data/sysaux.273.679676023";
restore
check readonly
clone database
;
}
executing Memory Script

executing command: SET NEWNAME

renamed temporary file 1 to /u01/app/oracle/product/10.2/oradata/boston/data/tmp1.274.679676043 in control file

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

Starting restore at 03-MAR-09
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: sid=36 devtype=DISK

channel ORA_AUX_DISK_1: starting datafile backupset restore
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to /u01/app/oracle/product/10.2/oradata/boston/data/system.271.679675991
restoring datafile 00002 to /u01/app/oracle/product/10.2/oradata/boston/data/undo1.272.679676015
restoring datafile 00003 to /u01/app/oracle/product/10.2/oradata/boston/data/sysaux.273.679676023
channel ORA_AUX_DISK_1: reading from backup piece /home/oracle/stage/01k8650r_1_1
channel ORA_AUX_DISK_1: restored backup piece 1
piece handle=/home/oracle/stage/01k8650r_1_1 tag=TAG20090224T154323
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:25
Finished restore at 03-MAR-09

contents of Memory Script:
{
switch clone datafile all;
}
executing Memory Script

datafile 1 switched to datafile copy
input datafile copy recid=4 stamp=680547428 filename=/u01/app/oracle/product/10.2/oradata/boston/data/system.271.679675991
datafile 2 switched to datafile copy
input datafile copy recid=5 stamp=680547428 filename=/u01/app/oracle/product/10.2/oradata/boston/data/undo1.272.679676015
datafile 3 switched to datafile copy
input datafile copy recid=6 stamp=680547428 filename=/u01/app/oracle/product/10.2/oradata/boston/data/sysaux.273.679676023
Finished Duplicate Db at 03-MAR-09

RMAN>
11. Add standby redo logs to the standby database
SQL> alter database add standby logfile group 3 '/u01/app/oracle/product/10.2/oradata/boston/srl/srl3a.log' size 150m;

Database altered.

SQL> alter database add standby logfile group 4 '/u01/app/oracle/product/10.2/oradata/boston/srl/srl4a.log' size 150m;

Database altered.

SQL> alter database add standby logfile group 5 '/u01/app/oracle/product/10.2/oradata/boston/srl/srl5a.log' size 150m;

Database altered.

SQL> alter database recover managed standby database using current logfile disconnect;

Database altered.

SQL>
12. Establish communication to the primary from the standby
SQL> select name,database_role from v$database;

NAME DATABASE_ROLE
--------- ----------------
CHICAGO PHYSICAL STANDBY

SQL> alter system set standby_archive_dest='/u01/app/oracle/product/10.2/oradata/boston/arc1/';

System altered.

SQL> alter system set log_archive_dest_1='location=/u01/app/oracle/product/10.2/oradata/boston/arc1/ valid_for=(all_logfiles,all_roles) db_unique_name=boston';


SQL> alter system set log_archive_config='dg_config=(chicago,boston)';

System altered.

SQL> alter system set log_archive_dest_2='service=chicago lgwr async noaffirm valid_for=(online_logfiles,primary_role) db_unique_name=chicago';

System altered.

SQL> alter system set fal_client=boston;

System altered.

SQL> alter system set fal_server=chicago;

System altered.

SQL> alter system set standby_file_management=auto;

System altered.

13. Create tnsnames entries (connect strings) on the primary server

chicago =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST =raca.idc.oracle.com )(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = chicago)
)
)

boston =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST =core1.idc.oracle.com )(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = boston)
)
)
Note:-

boston connects to the standby, while chicago connects to the primary itself.

From 11g onwards we can instead use RMAN DUPLICATE ... FROM ACTIVE DATABASE:
DUPLICATE TARGET DATABASE 
FOR STANDBY
FROM ACTIVE DATABASE
SPFILE
   PARAMETER_VALUE_CONVERT '<primary_string>', '<standby_string>'
   SET DB_FILE_NAME_CONVERT '<primary_datafile_path>', '<standby_datafile_path>'
   SET LOG_FILE_NAME_CONVERT '<primary_log_path>', '<standby_log_path>'
   SET SGA_MAX_SIZE 200M
   SET SGA_TARGET 125M;
For detailed steps on DUPLICATE from ACTIVE DATABASE (step 10), refer to:
Note 1075908.1 Step by Step Guide on Creating Physical Standby Using RMAN DUPLICATE...FROM ACTIVE DATABASE

14. Establish communication to the standby from the primary
SQL> alter system set log_archive_config='dg_config=(chicago,boston)';

System altered.

SQL> alter system set log_archive_dest_2='service=boston lgwr async noaffirm valid_for=(online_logfiles,primary_role) db_unique_name=boston';

System altered.

SQL> alter system set fal_client=chicago;

System altered.

SQL> alter system set fal_server=boston;

System altered.

SQL> select name,database_role from v$database;

NAME DATABASE_ROLE
--------- ----------------
CHICAGO PRIMARY

SQL>

15. Set role transition specific parameters for current primary
SQL> alter system set db_file_name_convert='/u01/app/oracle/product/10.2/oradata/boston/data/','+DATA/chicago/datafile/','/u01/app/oracle/product/10.2/oradata/boston/data/','+DATA/chicago/tempfile/' scope=spfile;

System altered.

SQL> alter system set log_file_name_convert='/u01/app/oracle/product/10.2/oradata/boston/redo/','+DATA/chicago/onlinelog/' scope=spfile;

System altered.

SQL> alter system set standby_file_management=auto;

System altered.
16. Configure a standby redo log
SQL> alter database add standby logfile group 3 size 150m;

Database altered.

SQL> alter database add standby logfile group 4 size 150m;

Database altered.

SQL> alter database add standby logfile group 5 size 150m;

Database altered.

Creating a Physical Standby Database on a File System (My Oracle Support)



Applies to:

Oracle Database - Enterprise Edition - Version 10.1.0.2 to 12.1.0.2 [Release 10.1 to 12.1]
Information in this document applies to any platform.

Purpose

** checked for relevance '7-Jul-2015' **
This Document shows the various Possibilities to create a Physical Standby Database on Oracle Database 11.2.0.x.
It should be seen as an Addition and Clarification to the Documentation. You may also review
Oracle® Data Guard, Concepts and Administration, 11g Release 2 (11.2), E25608-0x
Chapter 3: Creating a Physical Standby Database
Appendix E: Creating a Standby Database with Recovery Manager
which is the Base for this Document. We take an Example Database and show the various Methods to create the Standby Database. Note that only the basic Concepts are shown; further, more complex Scenarios are of course possible as well. Take a Look into the corresponding Documentation to get deeper Insight.
 
Would you like to explore this Topic further with other Oracle Customers, Oracle Employees and Industry Experts?
You can discuss this Note, show your Experiences or ask Questions about it directly at the Bottom of this Note in the Discussion Thread about this Document.
If you want to discover Discussions about other Articles and Subjects or even post new Discussions you can access the My Oracle Support Community Page for High Availability Data Guard

Scope

This Document is intended for System Administrators planning to set up a Standby Database. Depending on the Needs and Prerequisites of the existing Environment you can choose any of the mentioned Ways to create the Standby Database and Data Guard Environment.
Note that this Note does not cover setting up Log Transport and Log Apply Services in Detail - we take the basic default Setup here. The same is true for the RMAN Environment, where different Backup Types and Scenarios are possible. We just try to show the basic Concept here.

Details


Prerequisites and Assumptions to get started


-          There already exists a Primary Database you want to create your Standby Database for
-          Primary Database is in ARCHIVELOG-Mode and local Archiving is using Fast Recovery Area
-          FORCE LOGGING is enabled on the Primary Database
-          The Primary Database is using a SPFILE
-          On the Standby Site ORACLE_HOME is installed using the same Oracle Release and Patchlevel
-          The Platform of the Standby Database is the same as the Primary Site or a supported Combination as per Note 413484.1
-          There is Network Connectivity between the Primary and Standby System
-          Listener is configured and running on the Primary and Standby Site
-          We are creating a Standby Database using default Log Transport Services in Maximum Performance Mode


Environment Information


For all mentioned Possibilities we are using this Environment

OS:                                                                       Linux x86-64
Hostname (Primary):                                    TSTPRIM
Hostname (Standby):                                   TSTSTBY
Database Version:                                         11.2.0.x
DB_NAME (Primary and Standby):         prim_db
SID/DB_UNIQUE_NAME (Primary):       prim_db
SID/DB_UNIQUE_NAME (Standby):      stby_db
Listener Port (Primary and Standby):     1521
Primary Database Files Location:             /oracle/oradata/prim_db
Standby Database Files Location:            /oracle/oradata/stby_db


Preparing the Environment and Standby Instance


-          Create TNS-Aliases to resolve the Primary and Standby Database later. These must be present in both the Primary and the Standby TNSNAMES.ORA. Here it looks like this:
 
PRIM_DB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = TSTPRIM)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = prim_db.world)
    )
  )

STBY_DB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = TSTSTBY)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = stby.world)
    )
  )
 
-          Create Folders in the ADMIN- and ORADATA-Folders to host the Database Files and Dumpfiles (you can use $ORACLE_BASE/admin of the Primary as a Reference for which Folders to create - typically those are the 'adump'- and 'dpdump'-Folders)

$ cd $ORACLE_BASE/admin
$ mkdir stby_db
$ cd stby_db
$ mkdir adump
$ mkdir dpdump

-          Create a PFILE from the Primary SPFILE in the Format 'init<SID>.ora'

SQL> create pfile='/tmp/initstby_db.ora' from spfile;

-          Set the following Initialization Parameters on the Primary Database:
  • log_archive_dest_1 = 'location=USE_DB_RECOVERY_FILE_DEST'
  • log_archive_dest_2 = 'service=stby_db async valid_for=(ONLINE_LOGFILES,PRIMARY_ROLE) db_unique_name=stby_db'
  • log_archive_dest_state_2 = 'defer'
  • log_archive_config = 'dg_config=(prim_db,stby_db)'
  • log_archive_max_processes = 8

-> Those Parameters prepare the Primary Database to support a Standby Database 'stby_db'. log_archive_dest_2 is responsible for transferring Redo to the Standby Site later - currently it is deferred; we will enable it later once the Standby Database exists. log_archive_config records all db_unique_names participating in this Configuration.

-          Modify the following Initialization Parameters in the PFILE (initstby_db.ora) for the Standby Database we created before:
  • log_archive_dest_1 = 'location=USE_DB_RECOVERY_FILE_DEST'
  • log_archive_config = 'dg_config=(prim_db,stby_db)'
  • log_archive_max_processes = 8
  • fal_server = 'prim_db'
  • log_file_name_convert = '<primary_log_path>','<standby_log_path>'
  • db_file_name_convert = '<primary_datafile_path>','<standby_datafile_path>'
  • db_unique_name = 'stby_db'

-> Those Parameters are required to be adjusted for the Standby Database. log_archive_dest_1 here defines the local Archive Destination where we will put ArchiveLogs arriving from the Primary Database later. log_archive_config and db_unique_name are required for the Data Guard Configuration (like on the Primary) and fal_server is required for Gap Resolution to be able to automatically fetch Archive Logs from the Primary again in Case there is a Gap. To automate the Substitution of Database File Locations we set db_file_name_convert and log_file_name_convert. Those Pairs of Values are a simple String Substitution. See Note 1367014.1 for further Details.
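For example (illustrative file name), with db_file_name_convert = '/oracle/oradata/prim_db/','/oracle/oradata/stby_db/' the Primary Datafile /oracle/oradata/prim_db/system01.dbf is recorded as /oracle/oradata/stby_db/system01.dbf on the Standby.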
-> You may also adjust audit_file_dest and control_files to match with your Environment and Locations.

-          Copy the prepared 'initstby_db.ora' together with the Passwordfile of the Primary Database (orapwprim_db in $ORACLE_HOME/dbs) to the Standby Site and place both into the '$ORACLE_HOME/dbs' Folder. Then rename the Passwordfile to match the Standby SID:

$ mv orapwprim_db orapwstby_db
 

Create the Physical Standby Database


In the previous Section we prepared the Environments and the Primary Database to support a Standby Database. Now we can proceed and create the Physical Standby Database itself. There are 3 common Ways to perform this which will be shown in the next Section. You can choose any Way most suitable for you:
  1. Creating manually via User-Managed Backups
  2. Using RMAN Backup-based Duplication
  3. Creating a Standby Database from the active Primary Database without a Backup using RMAN Duplicate

Creating a Standby Database via User Managed Backups

We can use a Hot or Cold Database Backup to create the Standby Database. It is only important to create the Standby Controlfile from the Primary after the Backup has completed.
For a Cold Backup we simply perform a clean shutdown of the Primary Database and copy all Datafiles to a temporary Location, then startup the Primary Database again:

SQL> shutdown immediate;
$ cp /oracle/oradata/prim_db/*.dbf /backup/
SQL> startup
 
-          If you want to use a Hot Backup, we have to put the Primary Database into Backup Mode and copy the Datafiles:

SQL> alter database begin backup;
$ cp /oracle/oradata/prim_db/*.dbf /backup/
SQL> alter database end backup;
 
-          We can create the Standby Controlfile since the Backup is complete (either Hot or Cold Backup)

SQL> alter database create standby controlfile as '/backup/control01.ctl';
 
-          Now the required Files for the Standby Database are complete and we can transfer those to the Standby Database and place them into the Standby Database File-Folder (in our Example to ‘/oracle/oradata/stby_db/’)
-          Once the Datafiles are there we can set the Environment and first mount the Standby Database

$ export ORACLE_SID=stby_db
SQL> connect / as sysdba
SQL> startup mount
 
-          When the Database is mounted proceed with the Post-Creation Steps below


Creating a Standby Database using RMAN (Backup based)

Instead of creating a User-Managed Backup we can also use a RMAN Backup to create the Standby Database.
-          In this Case we have to create a RMAN Backup of the Primary Database first:

RMAN> connect target /
RMAN> backup database plus archivelog format '/backup/STBY_%U';

-          Since the Backup already includes the Controlfile, there is no Need to backup or create a Standby Controlfile separately
-          Transfer all Backuppieces created by the RMAN Backup to the exact same Folder (in our Case ‘/backup/’) on the Standby Database
-          Startup nomount the Standby Database

$ export ORACLE_SID=stby_db
SQL> startup nomount

-          At this Point we can now start the RMAN Duplicate to create the Standby Database

$ export ORACLE_SID=stby_db
RMAN> connect target sys/<password>@prim_db
RMAN> connect auxiliary /
RMAN> duplicate target database for standby nofilenamecheck;
 
-> The RMAN Duplicate now extracts the Controlfile and Datafiles from the Backup to build the Standby Database. Once done it mounts the Standby Database.  So we can now proceed with the Post-Creation Steps below.

Creating a Standby Database using RMAN without Backup (from active Database)

It is now possible to create a Physical Standby Database from the active Primary Database, i.e. it is not necessary to create a Backup first. The Blocks are transferred to the Standby Database via the Network during the RMAN Duplicate. Before you decide to use this Option to create your Standby Database you should ensure you have sufficient Bandwidth available to transfer all Database Blocks from the Primary to the Standby. Depending on the Size of your Primary Database and the available Bandwidth it might take a long Time to complete this Task. If you have a large Database or a slow Network Connection, you may consider using another Option to create your Standby Database.
The Steps to create the Standby Database this Way are similar to a Backup-based RMAN Duplicate, but we don't have to take a Backup; we can directly start the RMAN Duplicate from the Standby Site:

$ export ORACLE_SID=stby_db
SQL> startup nomount
RMAN> connect target sys/<password>@prim_db
RMAN> connect auxiliary /
RMAN> duplicate target database for standby from active database nofilenamecheck;

-> The RMAN Duplicate first copies the Controlfile from the Primary Database as a Standby Controlfile, then mounts the Standby Database with this Controlfile and creates the Datafiles/copies the Database Blocks from the Primary. Once done, the Duplicate finishes and leaves the Database in Mount Status. So we can now proceed with the Post-Creation Steps below.
NOTE: If the Primary and Standby share the same Storage, do not use the NOFILENAMECHECK Clause unless you are sure the Directories are different.
ref,
<NOTE 789370.1> -  Creating Physical Standby using RMAN Duplicate Without Shutting down The Primary

Post-Creation Steps to make the Data Guard Environment operational


In the previous Chapters we prepared and created the Physical Standby Database. Now we can start the Log Transport and Log Apply Services to have the Data Guard Environment completely operational.
-          First of all we should now add Standby RedoLogs to the newly created Standby Database to collect the current Redo arriving from the Primary Database. We can add those using

SQL> alter database add standby logfile ('<file_name>') size <size>;

You should add at least one more Standby RedoLog Group than you have Online RedoLog Groups. See Note 219344.1 for further Details and Examples.
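For example, you can first check the Size of the Online RedoLogs on the Primary and then add matching Standby RedoLog Groups (Group Number and Size below are illustrative):

SQL> select group#, thread#, bytes/1024/1024 as mb from v$log;
SQL> alter database add standby logfile group 5 size 50m;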

-          Next we can enable the Log Transport Services on the Primary Database which were prepared initially - performing a Logfile Switch afterwards will finally enable it (all Changes to log_archive_dest_n and log_archive_dest_state_n become active with the next Log Switch once set).

SQL> alter system set log_archive_dest_state_2 = 'enable' scope=both;
SQL> alter system switch logfile;
 
To verify that Standby RedoLogs get allocated on the Standby Database, and that Log Transport Services are therefore running, query v$standby_log on the Standby Database:

SQL> select * from v$standby_log;

-          Assuming that Log Transport Services are running now, we can finally start Log Apply Services (Managed Recovery):

SQL> alter database recover managed standby database using current logfile disconnect;

Query v$managed_standby to monitor the Progress of the Managed Recovery:

SQL> select * from v$managed_standby where process = 'MRP0';

Most likely there will now be a FAL Gap Request first, to ship and apply all ArchiveLogs created since we took the Backup. Depending on the Time passed and the Amount of Redo generated on the Primary meanwhile, this may take a while.
Note that starting Managed Recovery will also attempt to clear/create the Online RedoLogs on the Standby Database proactively to reduce Switchover/Failover Time.

Optional additional Steps


Basically we now have an active Data Guard Configuration. Depending on your Requirements you can proceed with these Steps:
-          Create an SPFILE from the current PFILE on the Standby Database
-          Enable Flashback to be able to flash back this Database (e.g. instead of the Primary Database to get historical Data, or to avoid having to recreate the Standby Database in Case of a RESETLOGS-Operation on the Primary Database)
-          Setup and enable a Data Guard Broker Configuration - the Data Guard Broker is an Addition to Data Guard that makes Monitoring and Administering a Data Guard Configuration much easier (a minimal Sketch follows this List). Note that using a Data Guard Configuration with Cloud Control (formerly Grid Control) requires the Data Guard Broker
-          Raise the Protection Mode to Maximum Availability or Maximum Protection
-          Proactively set up Log Transport Services from the Standby Database to the Primary and create Standby RedoLogs on the Primary for Switchover Purposes
-          Open the Standby Database READ ONLY and restart Managed Recovery (Active Data Guard). Note that this requires Licensing of the 'Active Data Guard'-Option. However you can open the Standby Database READ ONLY without Apply Services; this does not require this Option to be licensed
-          Setup Fast-Start Failover (requires the Data Guard Broker, too)
-          Add additional Standby Database(s)
-          Turn the just created Physical Standby Database into a Logical Standby Database
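
For the Broker Option above, a minimal Sketch (assumes dg_broker_start=true on both Databases; the Configuration Name is illustrative):

SQL> alter system set dg_broker_start=true scope=both;
$ dgmgrl sys/<password>@prim_db
DGMGRL> create configuration 'dg_conf' as primary database is 'prim_db' connect identifier is prim_db;
DGMGRL> add database 'stby_db' as connect identifier is stby_db maintained as physical;
DGMGRL> enable configuration;
DGMGRL> show configuration;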

Wednesday, June 1, 2016

Availability

In Search of Five 9s: Calculating the Availability of Complex Systems

This article is a translation of In search of five 9s: Calculating Availability of Complex Systems, published by someone called Bill on October 29, 2007. Since the original article is very text-heavy, I had to adjust the layout and add headings.
It may look a bit long, but it is genuinely interesting. Although it contains many formulas, it is easy to follow thanks to the examples.
I chose this article because, with the cloud being so fashionable these days, it has become simple and relatively cheap to add redundant machines.
Besides, this is the kind of article that really makes me feel like an engineer.

Availability and SLAs

I have spent the last few days trying to develop a simple mathematical model to predict the expected availability of complex systems. In IT we are often asked to develop and meet service level agreements (SLAs). If the failure points of a system are not analyzed, and the system availability then calculated, the SLA is flawed from the start. To complicate things further, different people have different definitions of availability. For example: does planned maintenance downtime count in your system availability calculations?
Common definitions of availability:
  1. Availability = MTBF / (MTTR + MTBF). This is a classic definition of availability, frequently used by hardware vendors when publishing an availability metric for a given server.
  2. Availability = (Uptime + Scheduled Maintenance) / (Unscheduled Downtime + Uptime + Scheduled Maintenance). This is an IT-centric metric for businesses that can afford hours of downtime. This model works for certain systems, such as a file server that is not needed overnight, but it does not work well for web services, even though many companies still use it for their SLAs.
  3. Availability = Uptime / (Uptime + Downtime). This is the metric that best fits services needed 24x7, such as e-commerce sites.
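For example (illustrative numbers), under definition 1 a server with an MTBF of 2,000 hours and an MTTR of 8 hours has an expected availability of 2,000 / (8 + 2,000) ≈ 99.6%.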
Availability is normally expressed as a percentage. People sometimes talk about "four nines" (99.99%) or "five nines" (99.999%). To keep things simple, the following table shows the minutes of downtime per year allowed for a given level:
Availability    Min Downtime/Year    Hours Downtime/Year
95.000%         26,298               438
98.000%         10,519               175
98.500%         7,889                131
99.000%         5,260                88
99.500%         2,630                44
99.900%         526                  8.8
99.990%         52.6                 .88
99.999%         5.26                 .088
Based on the table above, you can see there is a big difference between an SLA specifying 99% availability (88 hours of downtime per year) and one specifying 99.9% (8.8 hours of downtime per year). But how can you be sure what the expected downtime of a system is? In its most simplistic form, the expected availability of a system is the expected availability of each of its parts multiplied together. So if the system has two servers, and each server has an expected availability of 99%, the system availability would be 99% × 99% = 98.01%. Note: I used the expression "expected availability". We are calculating the expected future behavior of a system over a period of time, not its historical availability. For the rest of this article I will drop the word "expected" for brevity, but it is always implied.

Composite systems

The simplistic model above is useful to show that downtime is cumulative. In other words, if each component is expected to be offline 88 hours/year, and a failure of either component is a system failure, then the system has an expected downtime of 174 hours. Why not 176 hours? Well, occasionally both components will be down at the same time.
Real-world systems are never that simple. Typically a system is made up of multiple components, with redundancy, each with a different per-component availability level. Modeling this requires somewhat more complicated formulas, but once the concepts are clear, the new calculations can be done quickly in an Excel sheet. Before continuing we need some basic notation to simplify our formulas:
  • Availability of component 1: Ac1
  • Availability of component 2: Ac2
  • Availability of component 3: Ac3
  • Availability of component N: AcN
  • System availability: As
With that said, we are ready for our first formula. When a system is made up of N components that are single points of failure, the system availability can be calculated as:
Equation #1: As = Ac1 × Ac2 × Ac3 × ... × AcN

Example

Consider a 24x7 e-commerce site with many single points of failure. The system could be modeled with the following 8 components:
Component       Availability
Web             85%
Application     90%
Database        99.9%
DNS             98%
Firewall        85%
Switch          99%
Data Center     99.99%
ISP             95%
If any of these components fails, the system goes down. The expected system availability would be 85% × 90% × 99.9% × 98% × 85% × 99% × 99.99% × 95% = 59.87%. Note that each component is modeled as a whole, rather than looking at each of its parts. The web service could be split into the software (Apache), the code (our site) and the hardware (motherboard, hard drives, etc.). For our purposes, that complexity does not necessarily improve the model, so we will work with the service as a whole. Also, for this discussion we will use the third definition of availability above. To our users, it does not matter whether the system is down for maintenance or because of a broken hard drive.

Improving availability

If we want to keep our jobs, we need to find a way to improve this availability. The obvious targets for improving the site's stability are the web service and the firewall. The question is what effect adding another server would have on the web service's availability. This brings us to the second equation. When a system is made up of two redundant components, the system availability can be calculated as:
Equation #2: As = Ac1 + ((1 − Ac1) × Ac2)

Example

Using our example of the web server with 85% availability, adding a second server would increase availability to 85% + ((1 − 85%) × 85%) = 97.75%. The logic behind this is that while the first server is down (15% of the time), the second server will be up 85% of that time. This may or may not translate into real-world availability. For example, if the web service is down so often because we constantly need to deploy our code, adding a second server should translate into an availability increase, since code can be deployed to the offline server while the other server keeps running. In that case, the increase in our real availability should be greater than 12.75%. On the other hand, if the service goes down because of code errors, adding a second server could actually make availability worse, since the same faulty code will hit both.
The idea is that, in general, if you have rigorously calculated the component availability, the equation will work. Note that the equation works just as well when the components have different availability estimates. Suppose the web server has an availability problem because the hardware is undersized. Now suppose the second server we bought has twice the capacity and we determine that on its own it would provide 90% availability; the equation then changes to 85% + ((1 − 85%) × 90%) = 98.5%.
Let's go back to the earlier system calculation and add this in. Assume a second web server and a second firewall have been added, increasing the availability of those system components to 97.75%. Our new system availability would now be 97.75% × 90% × 99.9% × 98% × 97.75% × 99% × 99.99% × 95% = 79.10%. Better, but still not good. It is hard to achieve any level of high availability when you have single points of failure. So let's assume we add redundant components for all our servers and network equipment. Let's also assume a second ISP to diversify transport, but we are still in a single data center. Our equation would now be: 97.75% × 99% × 99.9999% × 99.96% × 97.75% × 99.99% × 99.99% × 99.75% = 94.3%. Improving. Removing the single points of failure improved the system availability from 60% (3,506 hours of downtime/year) to 94.3% (500 hours of downtime/year).

Generalizing the formula

Equation #2 above modeled adding a single component. In some cases you need to add more than one redundant component. For example, you might have more than two web servers. In that case you need to iterate Equation #2 several times to aggregate the effect of the additional components, which leads to a third equation. When trying to calculate the availability of a service with n redundant components, you calculate:
Equation #3: As = Acn−1 + ((1 − Acn−1) × Acn), where Acn−1 is the combined availability of the first n−1 components.

Example

In the case of our web server, adding a third server changes the availability to 97.75% + ((1 − 97.75%) × 85%) = 99.6625%. Adding a fourth server would increase availability to 99.6625% + ((1 − 99.6625%) × 85%) = 99.949%. Note the diminishing returns. Adding the second server increased availability by 12.75%. Adding the third only gained 1.9125%. The fourth server gives us an insignificant 0.2865%. And even with 3 extra servers to spread our load, we still have not achieved the elusive four nines of availability. Designing a highly available system requires both that each individual component be highly available AND that redundancy be added across components. If the individual web server in our system had 90% availability instead of 85%, then two servers would give 99% availability and three servers would give 99.9%.
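As an aside, for n redundant components with identical availability Ac, iterating Equation #3 collapses to the closed form As = 1 − (1 − Ac)^n. With Ac = 85% and n = 3 this gives 1 − 0.15³ = 99.6625%, matching the iteration above.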
Equations #2 and #3 have a flaw: they assume that each independent component can handle the load, and that the load is constant. What happens if under normal operation one web server can handle the load, but at peak you need three servers? In that case the availability of the three servers under normal load would be 99.6625%, while at peak it would drop to 85%. During a load peak, the failure of any one of the servers could mean a loss of service, so it degrades to the availability of a single box. And what if the peak load requires 2 servers? In that case the availability at peak would be 97.75%. If the peak requires two servers and we have three, the availability is equivalent to having two servers. The important concept here is that there is an inverse relationship between load and availability.

Making the data center redundant

It should be obvious by now that achieving truly high levels of availability (99.9% - 99.999%) is very difficult and very expensive. One of the most expensive single points of failure to remove is the data center itself. In most cases it will double the cost of the infrastructure, and it could be even more than double, since you will need to invest in technologies to synchronize both data centers.
Still, let's consider adding a second data center. In the example above, the availability of our system with redundant servers and ISPs was 94.3%. Adding a second data center with the technology needed to let both work active-active (both centers receive traffic at the same time) would increase our availability to 94.3% + ((1 − 94.3%) × 94.3%) = 99.675%. Adding a second data center can save up to 471 hours of downtime per year!
This example assumes that each data center is an independent system, so the failure of one system in a data center would be a failure of the entire system at that center. That is not always the case. For example, if designed correctly, a web server could connect to the database in the other data center. In that case availability would be higher than 99.675%. If you could design the site so that every service operated independently of the rest, the availability in the example would increase from 99.675% to 99.888% (each service would have 3 redundant components, except the data center, which would have only one).

Excel

These formulas are much easier to handle in Excel. Paste the following table into a spreadsheet, starting at cell A1:
Avail %          1 Component                2 Components               3 Components               4 Components
Web              85%                        =B2+((1-B2)*$B2)           =C2+((1-C2)*$B2)           =D2+((1-D2)*$B2)
Application      90%                        =B3+((1-B3)*$B3)           =C3+((1-C3)*$B3)           =D3+((1-D3)*$B3)
Database         99.9%                      =B4+((1-B4)*$B4)           =C4+((1-C4)*$B4)           =D4+((1-D4)*$B4)
DNS              98%                        =B5+((1-B5)*$B5)           =C5+((1-C5)*$B5)           =D5+((1-D5)*$B5)
Firewall         85%                        =B6+((1-B6)*$B6)           =C6+((1-C6)*$B6)           =D6+((1-D6)*$B6)
Switch           99%                        =B7+((1-B7)*$B7)           =C7+((1-C7)*$B7)           =D7+((1-D7)*$B7)
Data Center      99.99%                     ?                          =B8+((1-B8)*$B8)           ?
ISP              95%                        =B9+((1-B9)*$B9)           =C9+((1-C9)*$B9)           =D9+((1-D9)*$B9)
System Avail %   =b2*b3*b4*b5*b6*b7*b8*b9   =c2*c3*c4*c5*c6*c7*b8*c9   =d2*d3*d4*d5*d6*d7*d8*d9   =e2*e3*e4*e5*e6*e7*d8*e9
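Equivalently (an alternative not in the original sheet), the redundancy columns can use the closed form directly: cell C2 could be written as =1-(1-$B2)^2 and D2 as =1-(1-$B2)^3, which produce the same 97.75% and 99.6625% values.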
Now that you have the concepts clear and the beginnings of a spreadsheet to recalculate when the assumptions change, you can focus on applying these theories to your particular situation. Start by decomposing your system, whether it is a simple website, an accounting system or a file system, into independent component services. For each service, determine the minimum number of units needed to operate, and the required availability of each unit.
Estimating availability can be a challenge. One way is to look at historical data. If you do not have access to good data, you can build an estimate based on your standard operating parameters. For example, if you deploy new web service code twice a month and each deployment causes 2 hours of downtime, that translates into 48 hours of outage per year. If you expect to perform operating system maintenance once per quarter, with an estimated downtime of 2 hours per quarter, that is another 8 hours per year. If you also anticipate one hardware failure per year, with next-business-day warranty service, that would translate into 41 hours of downtime per year (Friday failures would be repaired on Monday, Saturday and Sunday failures on Tuesday). Adding these numbers up gives 48 + 8 + 41 = 97 hours of downtime per year, or an estimated availability of (8760 − 97) / 8760 ≈ 98.89%.
With a little effort, you can estimate a realistic availability level for your system. This is the foundation for creating realistic, achievable SLAs. These formulas can help IT negotiate SLAs with the business, and can help compare the ROI of different solutions. For example, say you are trying to choose a web server solution, and you have two choices:
  • Option 1 consists of 4 servers using cheap hardware with no internal redundancy. Each server costs €3,000. We estimate the availability of each server at 75%.
  • Option 2 consists of 2 servers using expensive hardware with redundant hard drives and power supplies. Each server costs €20,000. We estimate the availability of each server at 99%.
We estimate the cost of downtime at €500/hour, and we expect these servers to handle the system load for the next 3 years, after which they will be replaced. Using the numbers above, solution #1 has an expected availability of 99.6% at a cost of €12,000. Solution #2 has an expected availability of 99.99% at a cost of €40,000. Solution #1 would experience 34 hours/year, or 102 hours over the three years, more downtime than solution #2. Over three years, this extra downtime would cost €51,000. So the €28,000 extra up-front spend provides a 3-year ROI of 182%. Note that the model is only as good as the estimates. If the servers in solution #2 had only 95% availability, the combined availability would be 99.75%, which only provides 13 fewer hours of downtime per year. In that case you would only save about €20,000 over the 3 years for the €28,000 investment, so option #1 might be better.

Conclusion

Designing and operating highly available systems is complicated work, but with a few simple formulas it is possible to understand and predict their general behavior. This will let you make better decisions when choosing among multiple options, and make more realistic predictions when negotiating SLAs.