Monday, September 21, 2015

Exadata Part 1

Introduction to Exadata

Let's begin with a whirlwind tour of the Oracle Exadata Database Machine. It comes in a rack with the components that make up a database infrastructure: disks, servers, networking gear, and so on. Three configurations are available: full rack, half rack, and quarter rack. The architecture is identical across all three; only the number of components differs.


Now let's dive into each of these components and the role they play. The following list applies to a full rack; you can also view them contextually via a really neat 3D demo.
·         Database nodes – The Exadata Database Machine runs Oracle Database 11g Real Application Clusters (RAC). The cluster and the database run on servers known as database nodes or compute nodes (or simply “nodes”). A full rack has 8 nodes running Oracle Linux or Oracle Solaris.
·         Storage cells – The disks are not attached to the database compute nodes, as is normally the case with direct-attached storage, but to a different server known as the storage cell (or just “cell”; there are 14 of them in a full rack). The Oracle Exadata Storage Server Software runs in these cells on top of the OS.
·         Disks – Each cell has 12 disks. Depending on the configuration, these disks are either 600GB high performance or 2TB high capacity (GB here means 1 billion bytes, not 1,073,741,824 bytes). You choose the disk type at purchase time.
·         Flash disks – Each cell also has 384GB of flash disks. These can be presented to the compute nodes as storage (to be used by the database) or used as a secondary cache for the database cluster (the smart cache).
·         InfiniBand circuitry – The cells and nodes are connected through InfiniBand for high speed and low latency. There are 3 InfiniBand switches for redundancy and throughput. Note: there are no fiber switches since there is no fiber component.
·         Ethernet switches – The outside world can communicate via InfiniBand or via Ethernet. There is a set of Ethernet switches with ports open to the outside. Clients may connect to the nodes using Ethernet; DBAs and others connect to the nodes and cells using Ethernet as well. Backups are preferably run over InfiniBand, but they can be done over the Ethernet network as well.
·         KVM switch – A keyboard, video, and mouse switch provides direct physical access to the nodes and cells. It is used during initial setup and when the network to the system is not available. In normal operation you will not need to go near the rack and use this KVM, not even for powering the cells and nodes on and off. Why not? You’ll learn why in the next installment. (Not all models have a KVM switch.)

The nodes run Oracle Clusterware, the ASM instances, and the database instances. You may decide to create just one cluster or multiple clusters. Similarly, you may create a single database on the cluster or multiple databases. If you were to create three databases – dev, int, and QA – you would have two choices:
·         One cluster – create one cluster and create the three databases in it
·         Three clusters – create three different clusters with one database in each of them

The first option allows you to add and remove instances of a database easily. For instance, with 8 nodes in a full rack, you may assign 2 nodes to dev, 2 to int, and 4 to QA. Suppose a full-fledged production stress test is planned and temporarily needs all 8 nodes in QA to match the 8 nodes in production. In this configuration, all you have to do is shut down the dev and int instances and start the other four QA instances on those nodes. Once the stress test is complete, you shut down those 4 QA instances and restart the dev and int instances on those nodes.
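As a minimal sketch of that reshuffle (all names here are hypothetical – it assumes admin-managed databases called DEV, INT, and QA, with additional QA instances QA5 through QA8 already defined on the borrowed nodes), the whole exercise is just a series of srvctl calls:

srvctl stop instance -d DEV -i DEV1
srvctl stop instance -d DEV -i DEV2
srvctl stop instance -d INT -i INT1
srvctl stop instance -d INT -i INT2
srvctl start instance -d QA -i QA5
srvctl start instance -d QA -i QA6
srvctl start instance -d QA -i QA7
srvctl start instance -d QA -i QA8

After the stress test, run the same commands in reverse: stop QA5–QA8 and start the DEV and INT instances again.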

If you run multiple production databases on a single Exadata rack, you can still take advantage of this technique. If a specific database temporarily needs additional computing power to ride out a seasonal peak in demand, just shut down one instance of a different database and start an instance of the more demanding database on that node. After the demand has waned, you can reverse the change. You can also run two instances on the same node, but they will compete for resources – something you may not want. At the I/O level, you can control the resource usage of the databases using the I/O Resource Manager (IORM), as sketched below.
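As a rough, hedged illustration only (the database names PROD and REPT and the percentages are made up; check the exact directive syntax against the Exadata Storage Server documentation), an inter-database IORM plan is set on each storage cell with CellCLI:

CellCLI> ALTER IORMPLAN dbPlan=((name=PROD, level=1, allocation=75), (name=REPT, level=2, allocation=100))
CellCLI> LIST IORMPLAN DETAIL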

On the other hand, with this option you still have just one cluster. When you upgrade the cluster, all the databases have to be upgraded with it. The second option obviates that: there is an individual cluster for each database – a complete separation. You can upgrade or otherwise manipulate each one any way you want without affecting the others. However, when you need additional computational power for a database, you can’t just start up another instance. You need to remove a node from one cluster and add it to the cluster where it is needed – a more complex activity than simply shutting down and starting up instances.

Since the cells have the disks, how do the database compute nodes access them – or more specifically, how do the ASM instances running on the compute nodes access the disks? The disks are presented to the cells only, not to the compute nodes; the compute nodes see the disks through the cells. For lack of a better analogy, this is akin to network-attached storage. (Please note, the cell disks are not presented as NAS; this is just an analogy.)

The flash disks are presented to the cell as storage devices as well, just like the normal disks. As a result they can either be added to the pool of ASM disks used by the database for ultra-fast access, or be used to create the smart flash cache layer, a secondary cache between the database buffer cache and the storage. This layer caches the most heavily used objects, but it does not follow the same algorithm as the database buffer cache, where everything is cached before being sent to the end user. The smart flash cache caches only those data items which are accessed frequently – hence the “smart” in the name. Requests for data not found in the smart flash cache are routed to the disks automatically. You can also influence the caching of specific objects, as shown below.
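As a small example (the table name SALES is hypothetical), a segment can be pinned to, excluded from, or returned to default handling in the smart flash cache with the CELL_FLASH_CACHE storage clause:

SQL> alter table sales storage (cell_flash_cache keep);
SQL> alter table sales storage (cell_flash_cache none);
SQL> alter table sales storage (cell_flash_cache default);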
The Secret Sauce: Exadata Storage Server

So, you may be wondering: what’s the “secret sauce” behind the Exadata Database Machine’s amazing performance? A suite of software known as Exadata Storage Server, which runs on the storage cells, is the primary reason. In this section we will go over the components of the storage server very briefly (this is not a substitute for the documentation!).
Cell Offloading
The storage in the Exadata Database Machine is not just dumb storage. The storage cells are intelligent enough to process some of the workload themselves, saving the database nodes from that work. This is referred to as cell offloading. The exact nature of the offloaded activity is discussed in the following sections.
Smart Scan
In a traditional Oracle database, when a user selects a row or even a single column in a row, the entire block containing that row is fetched from the disk to the buffer cache, and the selected row (or column, as the case may be) is then extracted from the block and presented to the user’s session. In the Exadata Database Machine, this process holds true for most types of access, except a very important few. Direct path accesses – for instance, full table scans and full index scans – are done differently. The Exadata Database Machine can pull the specific rows (or columns) from the disks directly and send them to the database nodes. This functionality is known as Smart Scan. It results in huge savings in I/O.

For instance, your query might need only 1,000 rows out of 1 billion, but a full table scan in a traditional database retrieves all the blocks and then filters the rows from them. Smart Scan, on the other hand, extracts only those 1,000 rows (or even just the requested columns of those rows) – potentially cutting the I/O by a factor of a million. Cell offloading is what enables the cells to accomplish this.

Not all queries can take advantage of Smart Scan – only those that use direct path reads can. A typical example is a full table scan. A regular index scan looks into the index blocks first and then the table blocks, so Smart Scan is not used. You can check whether a statement was eligible for offload, as shown below.
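A minimal check, as a sketch (supply the SQL_ID of the statement you are interested in for the substitution variable), is to look at the offload-related columns in V$SQL and the smart scan statistics in V$SYSSTAT:

SQL> select sql_id, io_cell_offload_eligible_bytes, io_interconnect_bytes
2 from v$sql where sql_id = '&sql_id';

SQL> select name, value from v$sysstat
2 where name in ('cell physical IO bytes eligible for predicate offload',
3 'cell physical IO interconnect bytes returned by smart scan');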
iDB
How do the storage cells know which columns and rows to filter from the data? This is done by another component built into the storage software. The communication between nodes and cells employs a specially developed protocol called iDB (short for Intelligent Database). This protocol not only requests the blocks (as happens in an I/O call in a traditional database) but can optionally send other relevant information. In those cases where Smart Scan is possible, iDB sends the names of the table and columns, the predicates, and other relevant information about the query. This allows the cell to learn a lot more about the query than just the addresses of the blocks to retrieve. Similarly, the cells can use iDB to send back row and column data instead of traditional Oracle blocks.
Storage Indexes
How does Smart Scan manage to send only the relevant rows and columns instead of blocks? A special data structure built on the pattern of the data within the storage cells enables this. For each region of the disk belonging to a segment (usually 1MB in size), it stores the minimum and maximum values of the columns of that segment and whether nulls are present. This data structure is called a storage index. When a cell gets a Smart Scan-enabled query from the database node via iDB, it checks which regions of the storage cannot contain the data. For instance, if the query predicate states where rating = 3, a region of the disk where the minimum and maximum values of the column RATING are 4 and 10 respectively definitely cannot have any row matching the predicate, so the cell skips reading that portion of the disk. By checking the storage index, the cell excludes many regions that cannot contain the value and therefore saves a lot of I/O.

Although it has the word “index” in its name, a storage index is nothing like a normal index. Normal indexes are used to zero in on the locations where the rows are most likely to be found; storage indexes are used for just the opposite reason – to rule out where the rows will definitely not be found. Also, unlike other segments, storage indexes are not stored on the disks; they reside in the cell’s memory. Their benefit shows up in a specific statistic, as below.
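A simple way to see whether storage indexes are helping (system-wide, using the statistic name as exposed in 11.2) is:

SQL> select name, value from v$sysstat where name = 'cell physical IO bytes saved by storage index';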
Smart Cache
The database buffer cache is where data blocks land before being shipped to the end user. If the data is found there, a trip to the storage is saved. If it is not found – often the case with large databases – the I/O is inevitable. In the Exadata Database Machine, a secondary cache, called the Smart Cache, can sit between the database buffer cache and the storage. The Smart Cache holds frequently accessed data and may satisfy the request from the database node from this cache instead of going to the disks – improving performance.
Infiniband Network
This is the network inside the Exadata Database Machine – the nervous system of the machine, through which the different components such as the database nodes and storage cells communicate. InfiniBand is a hardware medium running a protocol called RDS (Reliable Datagram Sockets), which has high bandwidth and low latency – making the transfer of data extremely fast.
Disk Layout
The disk layout needs some additional explanation because that’s where most of the activity occurs. As I mentioned previously, the disks are attached to the storage cells and presented as logical units (LUNs), on which physical volumes are built.

Each cell has 12 physical disks. In a high capacity configuration they are about 2TB each, and in a high performance configuration about 600GB each. The disks are used for database storage. Two of the 12 disks also hold the home directory and other Linux operating system files; these two disks are divided into different partitions, as described below.

The physical disks are divided into multiple partitions, and each partition is presented as a LUN to the cell. Some LUNs are used to create a filesystem for the OS; the others are presented as storage to the cell and are called cell disks. The cell disks are further divided into grid disks, ostensibly named after the grid infrastructure they are used in. The grid disks are used as ASM disks to build ASM diskgroups. An ASM diskgroup is made up of several ASM disks from multiple storage cells, and if the diskgroup is built with normal or high redundancy (the usual case), the failure groups are placed in different cells. As a result, if one cell fails, the data is still available on the other cells. Finally, the database is built on these diskgroups.
These diskgroups are created with the following attributes by default:
·         _._DIRVERSION (value: 11.2.0.2.0) – The minimum allowed version for directories.
·         COMPATIBLE.ASM (value: 11.2.0.2.0) – The maximum ASM version whose features can use this diskgroup. For instance, ASM Volume Management is available in 11.2 only; if this parameter is set to 11.1, the diskgroup can’t be used for an ASM volume.
·         IDP.TYPE (value: dynamic) – Intelligent Data Placement, a feature of ASM that places data such that more frequently accessed data sits close to the periphery of the disk, where access is faster.
·         CELL.SMART_SCAN_CAPABLE (value: TRUE) – Whether this diskgroup is enabled for the Exadata Storage Server’s Smart Scan capability.
·         COMPATIBLE (value: 11.2.0.2) – The minimum version of the database that can be created on this diskgroup. The further back you go in version number, the more message passing occurs between the RDBMS and ASM instances, causing performance issues. So, unless you plan to create a pre-11.2 database here (which you most likely do not), leave it as it is.
·         AU size – The size of the Allocation Unit on this diskgroup; the AU is the smallest addressable unit in the diskgroup.

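For illustration only (a sketch – the diskgroup name, grid disk name pattern, and attribute values below are assumptions and will differ per installation), an Exadata ASM diskgroup is created from grid disks roughly like this:

SQL> create diskgroup DATA normal redundancy
2 disk 'o/*/DATA_CD_*'
3 attribute 'compatible.asm' = '11.2.0.2',
4 'compatible.rdbms' = '11.2.0.2',
5 'cell.smart_scan_capable' = 'TRUE',
6 'au_size' = '4M';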
Two of the 12 disks hold the operating system, the Oracle Exadata Storage Server software, and other OS-related filesystems such as /home. These occupy about 29GB on each disk. For protection, this area is mirrored as RAID-1 between the two disks, and the filesystems are mounted on that RAID-1 volume.

However, this leaves two cell disks with less space than the other ten. If we created an ASM diskgroup across all 12 disks, it would be imbalanced on those two disks. Therefore, you (or whoever is doing the installation) should create another diskgroup from a 29GB slice of each of the other 10 cell disks. This keeps the ASM disks used for the remaining diskgroups the same size. This “compensatory” diskgroup is usually named DBFS_DG. Since it is built on the inner tracks of the disks, its performance is lower than that of the outer tracks. Therefore, instead of placing database files here, you may want to use it for some other purpose such as staging ETL files. ETL files need a filesystem, and you can create a database-based filesystem (DBFS) on this diskgroup – hence the name DBFS_DG. Of course, you can use it for anything you want, even for database files, especially for less frequently accessed objects.
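A hedged sketch of that DBFS idea (the tablespace, user, and filesystem names are made up, and the dbfs_create_filesystem.sql arguments should be verified against your release’s documentation): create a tablespace on DBFS_DG and build a DBFS filesystem in it:

SQL> create bigfile tablespace dbfs_ts datafile '+DBFS_DG' size 200g;
SQL> grant dbfs_role to etl_user;
SQL> connect etl_user
SQL> @?/rdbms/admin/dbfs_create_filesystem.sql dbfs_ts etl_fs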

Now that you know the components, look at the next section to get a detailed description of these components.
Detailed Specifications
As of this writing, the current (third) generation of the Exadata Database Machine comes in two models (X2-2 and X2-8), various sizes (full rack, half rack, and quarter rack), and three classes of storage (high performance, high capacity SAS, and high capacity SATA). For detailed specifications, please see the configuration specs on the Oracle website for the X2-2, the X2-8, and the X2-2 Storage Server.

Tuesday, June 11, 2013


Files stored in the Flash Recovery Area (FRA)

Redo log files and control files are permanent files that reside in the FRA; the remaining files are deleted from the FRA once they
become obsolete or have already been backed up to media.

(1) Flashback logs:
If the Flashback Database option is enabled for the database, before-images of changed blocks belonging to the datafiles
are recorded in the flash recovery area in files called flashback logs.

(2) Control file:
A multiplexed copy of the current control file can be stored in the FRA.

(3) Control file autobackups:
With the RMAN setting "CONFIGURE CONTROLFILE AUTOBACKUP ON", RMAN automatically backs up the control file and the current server parameter file:
(a) when a successful backup must be recorded in the RMAN repository
(b) when a structural change to the database affects the contents of the control file
(c) whenever you add or drop a datafile or tablespace, change its status (online/offline/read only), or rename or resize it
(d) when adding a new online redo log group or a new redo thread

(4) Online redo log files:
Multiplexed copies of the online redo log files may be stored in the FRA.

(5) Archived redo log files:
The FRA can store your archived redo log files; if the FRA runs short of space, RMAN automatically deletes those files once they are obsolete or already backed up.
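A minimal configuration sketch (the size, diskgroup name, and retention value below are placeholders) for setting up the FRA and enabling Flashback Database:

SQL> alter system set db_recovery_file_dest_size = 50g scope=both;
SQL> alter system set db_recovery_file_dest = '+RECO' scope=both;
SQL> alter system set db_flashback_retention_target = 1440 scope=both;
SQL> shutdown immediate
SQL> startup mount
SQL> alter database flashback on;
SQL> alter database open;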

Flashback Database feature of 11g

SQL> desc v$flashback_database_log;
 Name                                      Null?    Type
 ----------------------------------------- -------- --------
 OLDEST_FLASHBACK_SCN                               NUMBER   (minimum SCN to which you can flash back the database)
 OLDEST_FLASHBACK_TIME                              DATE     (earliest time to which you can flash back the database)
 RETENTION_TARGET                                   NUMBER   (how long flashback logs are retained, in minutes)
 FLASHBACK_SIZE                                     NUMBER   (current size of the flashback logs)
 ESTIMATED_FLASHBACK_SIZE                           NUMBER   (estimated total size of flashback logs needed to satisfy the retention target)

***To determine (using the logs available in the flash recovery area) how far back the database can be rewound to a previous state:

SQL> select * from v$flashback_database_log;

OLDEST_FLASHBACK_SCN OLDEST_FL RETENTION_TARGET FLASHBACK_SIZE
-------------------- --------- ---------------- --------------
ESTIMATED_FLASHBACK_SIZE
------------------------
             1040830 21-SEP-12             1440       15941632
                       0
The value of the column OLDEST_FLASHBACK_SCN is 1040830, which indicates you can
flash back the database to that SCN at the earliest, but not before it.

***(OR)

SQL> select to_char(oldest_flashback_time,'mm/dd/yy hh24:mi:ss') from v$flashback_database_log;

TO_CHAR(OLDEST_FL
-----------------
09/21/12 16:57:23

***To calculate the total amount of logs generated in the flashback area at various timestamps
SQL> select * from v$flashback_database_stat order by begin_time;

BEGIN_TIM END_TIME  FLASHBACK_DATA    DB_DATA  REDO_DATA ESTIMATED_FLASHBACK_SIZE
--------- --------- -------------- ---------- ---------- ------------------------
21-SEP-12 21-SEP-12        2998272    2785280    1826816                        0

***(OR) -- To get the exact BEGIN_TIME and END_TIME of the log generation

SQL> alter session set nls_date_format = 'mm/dd/yy hh24:mi:ss';
Session altered.

SQL> select * from v$flashback_database_stat order by begin_time;

BEGIN_TIME        END_TIME          FLASHBACK_DATA    DB_DATA  REDO_DATA ESTIMATED_FLASHBACK_SIZE
----------------- ----------------- -------------- ---------- ---------- ------------------------
09/21/12 16:57:23 09/21/12 17:20:49        2998272    2785280    1876992                        0


***To get the size of logs generated in the flashback area in each period

SQL> select end_time, estimated_flashback_size from v$flashback_database_stat order by 1;


END_TIME          ESTIMATED_FLASHBACK_SIZE
----------------- ------------------------
09/21/12 17:22:04                154224128
09/25/12 20:37:57                137567258
10/25/12 22:28:20                194224128
              

***To calculate the total space used by logs in the flash recovery area

SQL> select * from v$flashback_database_log;

OLDEST_FLASHBACK_SCN OLDEST_FLASHBACK_ RETENTION_TARGET FLASHBACK_SIZE ESTIMATED_FLASHBACK_SIZE
-------------------- ----------------- ---------------- -------------- ------------------------
             1040830 09/21/12 16:57:23             1440       15941632                169648128


***Flashing Back a Database from "RMAN Prompt" (or) "SQL prompt"

***Presteps for FRA using "RMAN Prompt" (or) "SQL prompt"
1.To calculate (using logs available in flash recovery area) how far to rewind the database to the previous states.
SQL> select * from v$flashback_database_log;
2. rman target=/
3. RMAN> shutdown immediate (or) SQL> shutdown immediate
4. RMAN> startup mount      (or) SQL> startup mount

***Poststeps of FRA using "RMAN Prompt" (or) "SQL prompt"
1. RMAN> alter database open read only (or) SQL> alter database open read only
NOTE: Open read only to verify that the desired data has been recovered; if it has not, follow the presteps of FRA again and flash back further, THEN FOLLOW STEP 2
2.RMAN> alter database open resetlogs;  (or) SQL> alter database open resetlogs;
Note: always open the database with RESETLOGS to create a new incarnation of the database

***Flashing Back a Database using specific point-in-time, specified by date and time
1.presteps of FRA
2.RMAN (or) SQL> flashback database to timestamp to_date('5/18/2007 00:00:00','mm/dd/yyyy hh24:mi:ss');
(OR)
RMAN> flashback database to time 'sysdate-2/60/24';
3.poststep of FRA


***Flashing Back a Database using specific SCN number
SQL> select current_scn from v$database;

1.prestep of FRA
2.RMAN (or) SQL> flashback database to scn 4587962;
3.poststep of FRA

***Flashing Back a Database using last Resetlogs operation

SQL> select resetlogs_change# from v$database;
RESETLOGS_CHANGE#
-----------------
1548632

1.prestep of FRA
2.RMAN> flashback database to before resetlogs;
3.poststep of FRA

***Flashing Back a Database using a named Restore Point
1.prestep of FRA
2.RMAN (or) SQL> flashback database to restore point restore_point1;
3.poststep of FRA

Note: restore points can be created, created with guarantee, listed, and dropped
SQL> create restore point rp1;
SQL> create restore point rp2 guarantee flashback database;-- create a guaranteed restore point
SQL> select * from v$restore_point order by 2,1; --- list the restore points
SQL> drop restore point rp2;


Recovering an object from the recycle bin
SQL> show recyclebin
SQL> select * from user_recyclebin;
SQL> alter session set recyclebin = on; ---The recycle bin is enabled by default.
SQL> alter session set recyclebin = off;
SQL> alter system set recyclebin = off; ---To disable for the entire database (not recommended):
SQL> alter system set recyclebin = on;

SQL> purge recyclebin;-- remove the current user's dropped objects from the recycle bin (use PURGE DBA_RECYCLEBIN for the whole database)
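The dropped table itself is restored with FLASHBACK TABLE ... TO BEFORE DROP (a minimal sketch, assuming a dropped table named SAMPLE):

SQL> flashback table sample to before drop;

The dependent objects come back with their system-generated BIN$ names, which is why the renaming steps below are needed.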


***Recovering table-related objects (constraints, indexes, triggers)
***Display the constraints of the table:

SQL> select constraint_type, constraint_name from user_constraints where table_name = 'SAMPLE';
C CONSTRAINT_NAME
- -----------------------------
P BIN$sdlkolirkfjHDFsdfskjks==$0 ---- P(PRIMARY)
C BIN$584ASRDJGThTINjOISVM3Q==$0 ---- C(CHECK )

SQL> alter table sample rename constraint "BIN$sdlkolirkfjHDFsdfskjks==$0" to pk_SAMPLE;
Table altered.

SQL> alter table sample rename constraint "BIN$584ASRDJGThTINjOISVM3Q==$0" to ck_SAMPLE_01;
Table altered.


***Display the Index of the table:

SQL> select index_name from user_indexes where table_name = 'SAMPLE';
INDEX_NAME
-----------------------------
BIN$sdlkolirkfjHDFsdfskjks==$0
BIN$584ASRDJGThTINjOISVM3Q==$0

SQL> alter index "BIN$sdlkolirkfjHDFsdfskjks==$0" rename to IN_SAMPLE_01;
Index altered.

SQL> alter index "BIN$584ASRDJGThTINjOISVM3Q==$0" rename to IN_SAMPLE_02;
Index altered.



***To check the index restored properly

Finally, make sure the indexes are in place and have correct names:
SQL> select index_name from user_indexes where table_name = 'SAMPLE';

INDEX_NAME
-----------------------------
IN_SAMPLE_01
IN_SAMPLE_02

***Display the trigger of the table:

SQL> select trigger_name from user_triggers;
TRIGGER_NAME
-----------------------------
"BIN$sdlkolirkfjHDFsdfskjks==$0"


***Rename triggers to their original names:

SQL> alter trigger "BIN$sdlkolirkfjHDFsdfskjks==$0" rename to tr_sample_01;
Trigger altered.

***Check the triggers now to make sure they are named as they were originally:
SQL> select trigger_name from user_triggers;
TRIGGER_NAME
-----------------------------
TR_SAMPLE_01

Identify and fix table fragmentation in Oracle 10g - how?

Table fragmentation – when?

If a table is only ever subject to inserts, there will not be any fragmentation.
Fragmentation appears when we update or delete data in the table.
The space that gets freed up during non-insert DML operations is not immediately re-used (and sometimes may never be reused at all). This leaves behind holes in the table, which results in table fragmentation.

To understand this more clearly, we need to be clear on how Oracle manages space for tables.

The “high water mark” (HWM) of a table defines the border line between space that has ever been used and space that has never been used. While performing a full table scan, Oracle will always read the data up to the HWM. If there is a lot of free space within the HWM, that is read too, degrading the performance of the FTS.


Now let's see how to identify the HWM, the unused (never used) space, and the free space (used but deleted/updated), and then decide whether the table is a candidate for reorganization or not.

SQL> create table test as select * from dba_tables; -- Create a table

Table created.

SQL> analyze table test compute statistics; -- Analyze it

Table analyzed.

SQL> select blocks "Ever Used", empty_blocks "Never Used", num_rows "Total rows"
2 from user_tables where table_name='TEST'; -- The number of blocks used/free

Ever Used Never Used Total rows
---------- ---------- ----------
49 6 1680

SQL> delete from test where owner='SYS'; --- Deleting almost half the rows.

764 rows deleted.

SQL> commit;

Commit complete.

SQL> analyze table test compute statistics; -- Analyze it again

Table analyzed.

SQL> select blocks "Ever Used", empty_blocks "Never Used", num_rows "Total rows"

2 from user_tables where table_name='TEST'; -- No difference in blocks usage

Ever Used Never Used Total rows
---------- ---------- ----------
49 6 916


Even though you deleted almost half the rows, the output above shows that the table HWM is still at 49 blocks, and to perform any FTS, Oracle will read up to those 49 blocks to search the data. If your application is written such that there are many full table scans on this table, you may consider reorganizing it.


Reasons for reorganization

a) Slower response time (from that table).
b) High number of chained (actually migrated) rows.
c) The table has grown manyfold and the old space is not getting reused.

Note: Index-based queries may not benefit from a reorg as much as queries that perform full table scans.

How to reorganize?

Before Oracle 10g, there were mainly two ways to reorganize a table:
a) Export, drop, and import.
b) Alter table move (to another tablespace, or the same tablespace), followed by rebuilding the indexes – see the sketch after this list.
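A hedged sketch of the conventional move-based reorg, using the TEST table from above (note that the move marks the table's indexes UNUSABLE, so they must be rebuilt; TEST_IDX1 is a made-up index name):

SQL> alter table test move;
SQL> select index_name, status from user_indexes where table_name = 'TEST';
SQL> alter index test_idx1 rebuild;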

Oracle 10g provides us a new way of reorganizing the data.

Shrink command: This command is only applicable to tables that reside in tablespaces using automatic segment space management (ASSM).

Before using this command, you must have row movement enabled on the table.

SQL> alter table test enable row movement;
Table altered.

There are 2 ways of using this command.

1. Do it in two parts: first rearrange the rows, and then reset the HWM.

Part 1: Rearrange the rows (all DML can continue during this time)

SQL> alter table test shrink space compact;
Table altered.

Part 2: Reset the HWM (no DML can happen during this step, but it is fairly quick – in fact it usually goes unnoticed.)

SQL> alter table test shrink space;
Table altered.

2. Do it in one go:

SQL> alter table test shrink space; (both the rearranging of rows and the resetting of the HWM happen in one statement)
Table altered.

A few advantages over the conventional methods:
1. Unlike "alter table move ..", the indexes are not left in an UNUSABLE state; after the shrink command, the indexes are updated as well.

2. It is an online operation, so you do not need downtime to do this reorg.

3. It does not require any extra space for the process to complete.


Conclusion
Shrink is a new 10g feature for reorganizing tables (almost) online, and it can be used with automatic segment space management.


Tuesday, November 27, 2012

Comparison of GoldenGate vs Streams

The purpose of this note is to present a high-level comparison of the operation of Oracle Streams version 11.2.0.3 and Oracle GoldenGate from version 11.1.1 onwards.

The note is only applicable to operation against an Oracle database. No new Streams features will be added after 11.2.0.3; this is outlined in Oracle's statement of direction for GoldenGate. Oracle Streams will continue to be supported and maintained in future releases, but the strategic direction for logical replication within Oracle is now Oracle GoldenGate. The intention is to integrate the 'best' features of Streams into Oracle GoldenGate.

Streams will continue to support exactly what it supports today in 11.2 in future releases. New replication features in relation to the capture of additional data types, etc. will not be supported by Streams as Streams will not be enhanced. Customers who wish to take advantage of new replication related features will need to license GoldenGate.

Oracle GoldenGate is a separate product from Oracle Streams. Oracle GoldenGate Extract runs externally to the Oracle database and is licensed separately from it, whereas Oracle Streams Capture runs as a process within the Oracle instance. Both products effectively offer the same level of functionality by different means. Hereafter, Streams refers to Oracle Streams and GoldenGate refers to Oracle GoldenGate (OGG).
Each product deals with activity which appears in the Oracle redo logs. There are certain objects which neither product is able to capture. Large and long running transactions (LRTs) need special consideration in both products. DDL operations will also introduce new tables, change existing tables, and may result in others being dropped – this also presents a challenge in determining exactly what the details in the Oracle redo logs relate to.

This is not an exhaustive reference on the subject; always consult the Oracle product documentation for more details on any questions which may arise. Most of this information has been in the public domain for a long time. Considering the details, both products work towards the same goal, albeit in slightly different ways.
Staging
Staging is the idea of temporarily storing only the changes a process is interested in so they can be processed by the next activity. Where relevant, staging can be employed at both the Source and the Remote databases. Streams and GoldenGate do things differently in this area as a result of different design considerations. In this document, the focus is only on the Capture / Extract process staging changes from the Oracle redo logs.
A significant difference between the two products is that Streams stages all committed and uncommitted transaction changes. GoldenGate only stages committed changes. This has a significant implication in the way the products work.

Whilst it is not possible to view the content of an Oracle buffered queue, as it is an in memory structure, changes which GoldenGate Extract has staged can be viewed with the Logdump utility.
Streams: Streams may store or stage changes in a database buffered queue. The benefit is that changes are stored in memory, which has a fast access time. Whilst a queue is created, it is not managed by Streams AQ. In 11.2, with Streams operating in CCA mode (the default mode, where possible), Capture interacts directly with Apply; as such, there is no real staging involved.
GoldenGate: When processing a one-time activity (an initial load or a batch run), GoldenGate stages extracted changes in an extract file. During normal operation, GoldenGate stages changes in an extract trail. The difference is that no checkpoint information is maintained for an extract file, whereas it is for a trail. The OGG trail is comprised of multiple, unstructured files characterized by filenames with a common two-character prefix and an ascending six-digit suffix. Thus, we refer to trail 'ab', which is composed of trail files 'ab000000', 'ab000001', 'ab000002', etc.
How Changes are Staged
Streams: Streams creates a queue but does not now use Streams AQ. The queue only serves as a means to record subscriber information – who has an interest in changes. Enqueue relates to storing a change in memory whilst dequeue is the corresponding action of referencing that change – AQ is not used. In CCA mode, if propagation is configured, the Capture process communicates directly with a propagation receiver on the remote database. If CCA mode is not used, a separate propagation activity dequeues the message and sends it to the destination.
The propagation receiver enqueues the message onto the destination queue. An Apply process reader will dequeue messages connected with committed transactions from this queue.
CCA mode means that the Capture process fulfils both activities: capturing changes and, where relevant, propagating them. There is no separate background propagation activity involved, as propagation is carried out by the Capture process itself. The CCA optimisation is not available for all Streams configurations, but it can be seen that this simplifies some of the complexity at the local side.
CCA mode also means that users should monitor the gv$propagation_sender and gv$propagation_receiver views to understand what the Capture process is doing, and not refer to the dba_queue_schedules view. There is no queue schedule propagation job – the work is being done by Capture itself.
CCA (the default operation) is more efficient than non-CCA operation. It removes the possibility of queue spill, which was possible in versions below 11g. In view of this, this document does not expand upon queue spill, as it is anticipated that users adopt the relevant conditions to ensure CCA operation.

GoldenGate: GoldenGate stages changes outside the database in its own cache within process memory. Once transactions are committed, they are written to the trail file from the cache.
Memory Allocation
Both Steams and GoldenGate work more effectively if transactions are small and short lived. 
Streams: Streams requires memory to be allocated on the source and destination databases to cache changes. Staging these changes involves caching transaction information in the Streams pool, which is located within the SGA. Users therefore need to be aware of the Streams pool related parameter STREAMS_POOL_SIZE. As a starting measure, it is useful to preallocate at least 200MB; this is recommended for both the Capture and Apply side databases. The Streams pool, as well as holding staged changes in the related buffered queue, also accommodates Logminer's work area for a Capture process. If required, changes are only written to a database table at the Apply side.
GoldenGate: Extract caches transaction information in OS memory. For the Oracle database, caching is controlled by the CACHEMGR parameter, which is relevant to the source Extract process.
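A minimal sketch of the Streams-side sizing mentioned above (200M is just the starting measure suggested in the text):

SQL> alter system set streams_pool_size = 200m scope=both;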
Transaction Capture / Extract
Streams: Streams Capture stages all changes connected with an object, whether committed or uncommitted. It is the responsibility of the Apply side to manage what to do with these. Once the Apply reader sees a commit, it can pass the transaction to the Apply coordinator process, which can then assign an Apply server to do the work of processing the transaction. If a rollback is observed, the transaction is discarded by the Apply reader. Large or long running transactions which go uncommitted may need to be written to a database table.
GoldenGate: GoldenGate only stages committed transactions. Whilst the Extract process is reading redo information , it caches change data in process memory. Once a commit is observed , the transaction can be written to the Extract trail or file. If a rollback is observed, the transaction can be discarded by Extract. Very large transactions which may have an impact on memory usage may need to be flushed out to disk to make space for other transaction details; long running transactions can also be written out to disk. GoldenGate can also use Bounded Recovery for large and long running transactions to influence Extract restart activity.
Large Transactions
Streams: Large transactions are also referenced under 'Transaction Spill' below. These are managed at the Apply side and written to a partitioned table. Large transactions are written out to the Apply spill table once the LCR count reaches the Apply parameter TXN_LCR_SPILL_THRESHOLD (see the example after this comparison). Streams Capture also reports the transaction id of a large transaction in the alert log file as it is being mined.
GoldenGate: Large transactions may be flushed from the Extract cache to disk to avoid affecting smaller transactions in the cache. The size of the cache is controlled by the CACHESIZE option of the CACHEMGR parameter. Unlike Streams, there is no transaction size (number of rows affected) at which GoldenGate writes large transactions out to disk.
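A hedged example of setting the Streams apply parameter mentioned above (APPLY$_EX is a hypothetical apply process name; 10000 is shown purely as an illustration):

SQL> exec dbms_apply_adm.set_parameter('APPLY$_EX', 'txn_lcr_spill_threshold', '10000');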
Transaction Spill
This is also referred to as Apply spill in a Streams environment . Apply spill, where possible, is written out to a partitioned table.
Streams: It is the responsibility of the Apply side to handle the spill of transactions from memory to a partitioned spill table under the following conditions:

- transactions are large; or
- transactions are long running.

Additionally:
- error transactions may also be written out to the Apply spill table if they had already been spilled prior to a later error taking place.

An effect of Apply spill is to allow checkpoint activity to advance past the point these transactions have currently reached. In short, it means that Apply-spilled long running transaction data will not have an effect on where Capture restarts from.
GoldenGate: GoldenGate only stages committed transactions. Whilst the Extract process is reading redo information, it caches uncommitted information. In view of the fact that only committed transactions are staged, the Extract process also needs to take large and long running activity into consideration. Again, large transactions may affect the cache of transactions, with the same consequences that Streams would experience, and therefore Extract may flush out large transactions to disk under dirtmp. Long running transactions can be envisaged as causing a problem on restart, and GoldenGate addresses this by performing bounded recovery. This is discussed further under the section 'Bounded Recovery'.
Long Running transactions (LRTs)
Streams: Streams captures the committed and uncommitted changes it is interested in. All transaction changes – committed or uncommitted – are sent to the Apply process. The Apply process will then write to the Apply spill table any transaction which has been open for a duration exceeding the Apply parameter TXN_AGE_SPILL_THRESHOLD (in seconds). Streams Capture also reports the transaction id of a long running transaction in the alert log file as it is being mined.
GoldenGate: Transaction activity is cached. Until Extract processes a commit or rollback, the transaction is considered open and its information continues to be collected in memory. The management of these open long running transactions is handled within the bounded recovery mechanism where, as with Streams, long running transactions will be written to disk. GoldenGate writes transactions to the trail in transaction commit order. This means that transactions may appear 'out of order' when viewed from the perspective of when they were started.
GoldenGate Extract will also warn users that long transactions are taking place if there is an attempt to shut down Extract; Streams doesn't do this. If there are long running transactions, Extract will also provide checkpoint information showing how far back in the Oracle redo it will have to go if it is forced to shut down at this point – as a bounded recovery checkpoint may not have taken place recently. The user is therefore able to make an informed choice as to whether it is reasonable to perform a forced shutdown of the Extract process. With Bounded Recovery taking place, even with long running transactions present, Extract should not need to go too far back in the redo logs on restart. Refer to 'Bounded Recovery' below.
Log Mining
Logminer might be used in a replication environment to verify the original transaction changes in the Oracle redo logs in the event of any uncertainty. In this situation, the relevant redo logs containing the transaction would be added to an ad-hoc Logminer session and v$logmnr_contents inspected to determine the details (a sketch follows this section).
Streams: Streams uses logminer to retrieve changes from Oracle redo logs. The Streams capture process is a client of logminer.
GoldenGate: GoldenGate needs to interpret the content of the redo log files directly. With GoldenGate version 11.2.1, however, there is also the option to use integrated Capture, which uses Logminer. Integrated Capture (as opposed to classic Capture – refer to the Oracle Installation and Setup Guide for 11.2.1) supports more data types and storage types as well as making it easier to work with RAC and ASM; other benefits are also provided with integrated Capture.
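A minimal ad-hoc Logminer sketch of the kind referred to above (the archived log path is a placeholder):

SQL> exec dbms_logmnr.add_logfile('/u01/arch/1_123_456789.arc', dbms_logmnr.new);
SQL> exec dbms_logmnr.start_logmnr(options => dbms_logmnr.dict_from_online_catalog);
SQL> select scn, operation, seg_owner, seg_name, sql_redo from v$logmnr_contents where seg_name = 'EMP';
SQL> exec dbms_logmnr.end_logmnr;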
Supplemental Logging
Both Streams and GoldenGate require supplemental changes to be logged in Oracle's redo stream. Refer to the product documentation for more details. This is an important consideration, as it can affect the way changes are applied and therefore the integrity of the data.
The purpose of supplemental logging relates to the following:
- detail needs to be written to the redo stream for a key column (primary key, unique key, or substitute key), which will be passed across to the remote database to allow the equivalent row on the destination table to be accessed by index access rather than a full table scan (FTS);
- detail also needs to be logged for those columns which will be used in conflict resolution; these may be columns which were not specified in the operation.
The supplemental logging requirements are different between Streams and GoldenGate, and the command appropriate to each product should be used.
Streams: Supplemental logging is added at the point the prepare command is issued. If additional supplemental logging is required, this needs to be added explicitly with the relevant 'ADD SUPPLEMENTAL LOG' command.
GoldenGate: GoldenGate supplemental logging is added with the ADD TRANDATA command. Where DDL is handled, DDLOPTIONS ADDTRANDATA would also need to be used as an Extract parameter. Columns specified will be unconditionally logged in the redo stream.
For more detail and examples of the way supplemental logging can affect data integrity, refer to 'Data Convergence and Conflict Resolution'.
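As an illustrative, hedged sketch only (SCOTT.EMP and the log group name are made up; the ADD TRANDATA command is issued from GGSCI as described above):

SQL> alter database add supplemental log data (primary key, unique) columns;
SQL> alter table scott.emp add supplemental log group emp_cr (deptno, sal) always;

GGSCI> add trandata scott.emp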
Extract / Database Persistence
Streams: Streams processes reside within the Oracle instance. Streams has its own internal state information, which is maintained within Oracle database tables. All related processes are part of the Oracle instance, so Streams is protected by Oracle's recovery mechanism. Capturing changes in memory requires periodic checkpointing, as memory is not permanent. Checkpoint information is written to the database.
GoldenGate: The Extract process resides apart from the Oracle database and writes no state information into the Oracle database. Again, checkpointing is required because GoldenGate, like Streams, attempts to process as much as possible in memory. Checkpoint information is written outside the database.

Refer to 'Checkpoint Operation' below.
Transaction Ordering
Streams: Transaction SCN commit time ordering is managed implicitly. Furthermore , Apply parallelism takes into consideration transaction SCN commit ordering.
GoldenGate: GoldenGate enforces commit-time ordering but does not offer a parallel Replicat. Separate Extract/Replicat processes can be used. Again, like Streams, each Extract/Replicat pair enforces transaction ordering based on commit sequence.

Where multiple Capture/Extract processes are defined for a source database, the table groups should be logically partitioned or separated, i.e. there should not be referential integrity across tables associated with different processes. Furthermore, any application which uses this data should be aware that the X tables connected with process X may not be at the same point in time as the Y tables connected with the different process Y, since the processes may be processing at different rates.
Handling DDL Activity
Streams: The Multi-Version Data Dictionary (MVDD) is used by Logminer to mine redo. This allows Logminer to go back to a point in the past, provided the redo logs exist, and mine forward. The MVDD records maintain changes to the object over time so that exact object information is available at the SCN being mined. Some object information is also sent to the Apply database for replicated objects, again so that the object detail is known at any point in time.

If there is DDL activity (such as tables being dropped or truncated) taking place against the database, these operations will be recorded in the MVDD. Also, if columns are being dropped or added, the MVDD will hold this information to give precise detail of what the object is at a point in time.
GoldenGate: GoldenGate does not have an MVDD as Streams does. However, GoldenGate still faces the same challenge of how to translate object ids, column ids, partition numbers, etc. back into high-level meaningful information. If tables are different on source and destination, DEFGEN may need to be run to allow these differences to be reconciled.

If an Extract process is stopped, transaction changes occur to an object, and the object is then dropped and recreated, GoldenGate is still able to capture those transaction changes after it is restarted. In order to do this, DDL support needs to be installed via the ddl_setup.sql script; the tables it creates are used by Extract to determine what the related object, etc. was. Oracle's data dictionary only reflects the current state of the database and does not show the history of DDL activity that has taken place; at the time a statement is processed by Extract, the object may actually have changed or have been dropped altogether. This, however, should not affect the ability to process all activity on an object up until the point it was removed on the source database.
In order to maintain the above information, GoldenGate creates a 'before ddl on database' trigger which records details at the time the operation starts.
Instantiating Objects
Instantiation in a replication environment is basically the same process for Streams and GoldenGate. The aim is to have an exact copy of the source tables at the destination database before the Capture or Extract process is started.

A brief comparison of the activity is detailed as follows. Details have been expanded so that it is evident what needs to be considered in the process. Allowance is made in the details for an active/busy database.
Streams: The Streams Administration Guide is the definitive location to reference this activity, as there are significantly more details presented there.

- perform a build and/or create the Capture process to create the MVDD;
- prepare all the relevant objects involved in replication, which will result in the MVDD entry on the Apply side database being populated correctly. This step has to come after the creation of the Capture process. The prepare call also adds supplemental logging for the object;
- if there are transactions active against the object being prepared, the operation will hang until the transaction has completed;
- once completed, the current system SCN or flashback SCN can be determined and, with this, a copy of the objects can be moved to the Apply database, as follows:
- select dbms_flashback.get_system_change_number from dual;
- perform an export specifying the flashback SCN and import into the destination database; or
- use the flashback SCN in a restore operation and subsequent incomplete recovery at the destination;
- specify the flashback SCN as the instantiation SCN for the Apply side tables and then start the replication processes.
GoldenGate: The equivalent to the above would be as follows:

- there is no MVDD, therefore there is no build operation;
- prepare is not relevant as there is no MVDD;
- add supplemental logging for all tables on the source using ADD TRANDATA; adding supplemental logging requires that there are no active transactions running against the table, so if this hangs for an extended duration it means that there is an active transaction; these need to be killed off if this is the case – as would also be the case with Streams;
- record the active transactions just before the Extract is created (gv$transaction) – just as a precaution (as mentioned in Note:1276058.1);
- create the Extract process.

Note: As is the case with Streams Capture, we want to make sure that there are no active transactions running on the system at the time when we create the Extract process and then determine the flashback SCN. Once we have the flashback SCN, we can commence with the instantiation process. It would be impossible for the Extract process to capture complete changes in the trail file for active transactions which occurred before it existed!

- Once transactions have been completed (or killed off), the data can be instantiated:
- get the current system change number (SCN):
- select dbms_flashback.get_system_change_number from dual;
- export specifying the flashback SCN and import into the destination database; or
- use the flashback SCN in an incomplete-recovery restore at the destination;
- the Replicat process can then be started using AFTERCSN, which maps to the flashback SCN.

More details are outlined in Note:1276058.1.
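A hedged sketch of the Streams side of this (hypothetical names and SCN throughout; the export/import step would use the corresponding FLASHBACK_SCN option of the export utility):

SQL> select dbms_flashback.get_system_change_number from dual;
SQL> begin
2 dbms_apply_adm.set_table_instantiation_scn(
3 source_object_name   => 'SCOTT.EMP',
4 source_database_name => 'SRCDB.EXAMPLE.COM',
5 instantiation_scn    => 1040830);
6 end;
7 /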
Checkpoint Operation
Both Streams and GoldenGate perform checkpoint operations. The objective of checkpointing is to avoid the Capture or Extract process having to go back to an old logfile and remine perhaps several days' worth of redo; worse still, if this does happen, a redo logfile may no longer be available.

Streams and GoldenGate do things slightly differently in this area but the intention is the same: to record the state of activity at periodic intervals.
Streams: As discussed in Note:418755.1, Streams uses Logminer checkpointing. Checkpointing can be controlled as discussed in the Oracle product documentation as well as in Note:335516.1 – Master Note for Streams Performance Recommendations. Checkpointing is done by the Capture process and is not done on a time basis but is driven by the amount of redo processed; therefore, if the system is relatively inactive, checkpoints are performed relatively infrequently.

The redo log which the Capture process will restart from is indicated by the required_checkpoint_scn column in the dba_capture view. This value is also printed in the Streams Healthcheck report – refer to Note:273674.1.
GoldenGate: GoldenGate does not use Logminer. It maintains a number of separate checkpoint files which record where a process has reached. For more information refer to the GoldenGate administration guide.

The different OGG checkpoint files are located under the dirchk/ location, e.g.

- extract related: *.cpe;
- extract pump (datapump) related: *.cpe;
- bounded recovery related: *.cpb – refer to 'Bounded Recovery';
- replicat related: *.cpr

Note: the above are binary files – they should not be modified; they are detailed merely to help understand what is going on in this area of discussion.

Looking at the above in relation to the flow of transactions through GoldenGate, first of all we have the Extract checkpoint file. Extract checkpointing is extended by Bounded Recovery – refer to 'Bounded Recovery', discussed separately below.

In the Extract checkpoint file, the read position in the Oracle redo stream is recorded together with the write position in the trail. The read position relates to the start of the oldest active transaction in the Oracle redo logs which the Extract process is interested in. There are always going to be active transactions but if these are also long running, this would affect the read position and it would remain unchanged.
The Extract pump checkpoint file will record the read position in the trail file as well as the write position in the remote trail.

The Replicat checkpoint file will record the read position in the trail.
Bounded Recovery
There is detailed information in the GoldenGate reference guide relating to Bounded Recovery.
Streams: The idea of bounded recovery is to write out long running transactions to disk. In Streams this is done at the Apply side database; therefore, Streams effectively does the same as GoldenGate.
GoldenGate: Bounded Recovery adds to the standard Extract checkpointing mechanism in Goldengate.
Its purpose is to write out long running transactions from the Extract cache so that the bounded recovery read checkpoint can advance.

Bounded recovery writes out to disk transactions which are older than one bounded recovery interval. When a bounded recovery checkpoint takes place, as well as writing long running transactions to disk, the mechanism also remembers the RBA in the Oracle redo of the oldest or earliest active transaction in the last bounded recovery interval. It is this which Extract will use as a restart point in the Oracle redo.

In the Bounded Recovery checkpoint file, we need to keep a record of the long running transactions which have been written out to disk. We also need to record the read position in the Oracle redo of the oldest active transaction in the last bounded recovery period, which will be the point Extract needs to restart from.
Capture / Extract Restart
A consequence of the checkpoint mechanisms used by both Streams and GoldenGate is that redo will inevitably have to be reprocessed. Both Streams Capture and GoldenGate Extract can reprocess transaction data more than once; these transactions will not be reapplied, however. Refer to 'Apply / Replicat Commit'.
Streams: As Streams uses Logminer checkpointing, on restart it will go back to the redo log file associated with the required_checkpoint_scn. This value basically means that committed and uncommitted transaction data in the redo logs below this SCN value does not need to be reprocessed; this also takes account of the fact that open transactions may have been written to the Apply spill table.

The required_checkpoint_scn is associated with Logminer. Streams will likely restart capturing changes at a later SCN and will filter out changes which it has already processed. This has to take into account open and long running Apply-spilled transactions as well as transactions which have been applied. Where changes are recaptured, they will be discarded by the Apply process, as it maintains a record of what has been applied.
GoldenGate: When Extract restarts, it will need to go back to the read position in the Oracle redo logs found in the Bounded Recovery checkpoint file or to the position in the Extract checkpoint file.

When Extract starts, it will first of all look at the Bounded Recovery checkpoint details to see if they can be used. If details are present, this will avoid having to go further back in the redo to the start of the logfile containing the oldest active long running transaction. The long running transaction will already be partially collected and can be added to as Extract advances – Streams will also do much the same and add to Apply-spilled transaction data if additional changes are found for the transaction.
Apply / Replicat Commit
Both Streams and GoldenGate know what transactions have been applied to the database.
Streams: Streams Apply records the transaction ids of committed transactions on the Apply database as they occur, so that in the event of a transaction being retransmitted after a restart or recovery, the transaction does not need to be reapplied – it can be discarded. The committed transaction ids are recorded within the database.
Streams Apply can run in parallel; therefore, there may be numerous transactions with no dependencies executing at the same time across the Apply servers. When these commit, the transaction ids are recorded for recovery purposes.
GoldenGate: GoldenGate Replicat executes serially. The checkpoint table in the database is updated once the transaction has been applied to the database, prior to the commit to the database. This guarantees that the transaction and the pointer to the Replicat's read location in the trail (pointing to the next transaction) are updated at the time of the commit. The use of a checkpoint table ensures that Replicat will not attempt to reapply a transaction which has already been applied.
Purging Activity
Streams and GoldenGate both purge details in the background periodically. The main activities are summarised below. As well as the details outlined, redo logs can be removed (after backup) once they have been accessed and are no longer required by Streams or GoldenGate.
Streams: Due to the checkpoint mechanism, Streams has to periodically purge out checkpoint data. This activity is performed by the Capture process.
GoldenGate: GoldenGate needs to purge trail files (PURGEOLDEXTRACTS). This activity is normally managed by the Manager but can be done by extracts and replicats (a practice that is discouraged). Checkpointing in GoldenGate amounts to ensuring that pointers to relevant locations are up to date. GoldenGate may also need to purge discard files if there are errors.
Data Convergence and Conflict Resolution
Even with the best-planned strategy for avoiding errors, conflict errors may eventually occur. If there are multiple locations where data is being changed, one site should own the ability to make changes at any one time. If multiple sites can change the same rows at the same time, conflict resolution should be adopted.
It is important to recognise exactly what happens when changes are made to tables configured in replication and what can be expected. Streams and GoldenGate behave slightly differently in this area, as will be shown below. The examples which follow are based on a simple table, with replication taking place from source to destination. The following highlights how data can diverge and the relevance of supplemental logging in the case of an update operation.