11g引入了大量compress相关的特性,其中之一便是dbms_compression包;GET_COMPRESSION_RATIO函数可以帮助我们了解压缩某个表后各种可能的影响。换而言之,这个函数可以让我们在具体实施表压缩技术或者测试前,对于压缩后的效果能有一个基本的印象。该包在11gr2中被首次引入,故而使用之前版本的包括11gr1都无缘得用。其次除OLTP压缩模式之外的柱形混合压缩只能在基于Exdata存储的表空间上实现。使用DBMS_COMPRESSION包获取的相关压缩信息是十分准确的,因为在评估过程中Oracle通过实际采样并建立模型表以尽可能还原逼真的数据。 我们可以通过trace来分析其评估过程中的具体操作,可以分成2步:
1. 建立原表的样本表,其采样值基于原表的大小:
SQL> create table samp_dss_nation tablespace SCRATCH as select * from dss_nation sample block (50); Table created.
2. 基于采用表建立对应压缩类型的模型表:
SQL> create table model_dss_nation tablespace SCRATCH compress for query high as select * from samp_dss_nation; create table model_dss_nation tablespace SCRATCH compress for query high as select * from samp_dss_nation * ERROR at line 1: ORA-64307: hybrid columnar compression is only supported in tablespaces residing on Exadata storage
可以看到在实际建立过程中Oracle将拒绝在非Exdata存储的表空间上建立该类柱形混合压缩(包括:COMP_FOR_QUERY_HIGH,COMP_FOR_QUERY_LOW,COMP_FOR_ARCHIVE_HIGH,CO
MP_FOR_ARCHIVE_LOW)。但DBMS_COMPRESSION在进行评估时可以绕过Oracle对于该类操作的LOCK.
要在没有Exdata存储设备的情况下使用dbms_compression包评测OLTP压缩模式外的柱状混合压缩模式时
(hybrid columnar compression is only supported in tablespaces residing on Exadata storage),首先需要打上patch 8896202:
[oracle@rh2 admin]$ /s01/dbhome_1/OPatch/opatch lsinventory Invoking OPatch 11.1.0.6.6 Oracle Interim Patch Installer version 11.1.0.6.6 Copyright (c) 2009, Oracle Corporation. All rights reserved. Oracle Home : /s01/dbhome_1 Central Inventory : /s01/oraInventory from : /etc/oraInst.loc OPatch version : 11.1.0.6.6 OUI version : 11.2.0.1.0 OUI location : /s01/dbhome_1/oui Log file location : /s01/dbhome_1/cfgtoollogs/opatch/opatch2010-06-02_23-08-33PM.log Patch history file: /s01/dbhome_1/cfgtoollogs/opatch/opatch_history.txt Lsinventory Output file location : /s01/dbhome_1/cfgtoollogs/opatch/lsinv/lsinventory2010-06-02_23-08-33PM.txt -------------------------------------------------------------------------------- Installed Top-level Products (1): Oracle Database 11g 11.2.0.1.0 There are 1 products installed in this Oracle Home. Interim patches (1) : Patch 8896202 : applied on Wed Jun 02 21:55:44 CST 2010 Unique Patch ID: 11909460 Created on 29 Oct 2009, 15:21:45 hrs US/Pacific Bugs fixed: 8896202 该patch用以:ENABLE COMPRESSION ADVISOR TO ESTIMATE EXADATA HCC COMPRESSION RATIOS
接着我们还需要运行被修改后的DBMSCOMP包创建SQL,具体操作为:
SQL> @?/rdbms/admin/prvtcmpr.plb Package created. Grant succeeded. Package body created. No errors. Package body created. No errors. Type body created. No errors. SQL> @?/rdbms/admin/dbmscomp.sql Package created. Synonym created. Grant succeeded. No errors. DBMS_COMPRESSION包在对表压缩进行评估时,默认表最少数据为1000000行,可能在你的测试库中没有这么多数据,我们可以修改这个下限; 通过将COMP_RATIO_MINROWS常数修改为1后,就可以分析最小为1行的表了: SQL>create or replace package sys.dbms_compression authid current_user is COMP_NOCOMPRESS CONSTANT NUMBER := 1; COMP_FOR_OLTP CONSTANT NUMBER := 2; COMP_FOR_QUERY_HIGH CONSTANT NUMBER := 4; COMP_FOR_QUERY_LOW CONSTANT NUMBER := 8; COMP_FOR_ARCHIVE_HIGH CONSTANT NUMBER := 16; COMP_FOR_ARCHIVE_LOW CONSTANT NUMBER := 32; COMP_RATIO_MINROWS CONSTANT NUMBER := 10; COMP_RATIO_ALLROWS CONSTANT NUMBER := -1; PROCEDURE get_compression_ratio(scratchtbsname IN varchar2, ownname IN varchar2, tabname IN varchar2, partname IN varchar2, comptype IN number, blkcnt_cmp OUT PLS_INTEGER, blkcnt_uncmp OUT PLS_INTEGER, row_cmp OUT PLS_INTEGER, row_uncmp OUT PLS_INTEGER, cmp_ratio OUT NUMBER, comptype_str OUT varchar2, subset_numrows IN number DEFAULT COMP_RATIO_MINROWS); function get_compression_type(ownname IN varchar2, tabname IN varchar2, row_id IN rowid) return number; PROCEDURE incremental_compress(ownname IN dba_objects.owner%type, tabname IN dba_objects.object_name%type, partname IN dba_objects.subobject_name%type, colname IN varchar2, dump_on IN number default 0, autocompress_on IN number default 0, where_clause IN varchar2 default ''); end dbms_compression; Package created. SQL> alter package dbms_compression compile body; Package body altered. 接下来我们通过建立一个基于TPC-D的测试的Schema,保证各表上有较多的数据,并且数据有一定的拟真度: SQL> select table_name,num_rows,blocks from user_tables ; TABLE_NAME NUM_ROWS BLOCKS ------------------------------ ---------- ---------- DSS_SUPPLIER 20000 496 DSS_PART 400000 7552 DSS_REGION 5 5 DSS_PARTSUPP 1600000 29349 DSS_LINEITEM 12000000 221376 DSS_ORDER 3000000 48601 DSS_CUSTOMER 300000 6922 DSS_NATION 25 5 现在可以进行压缩评估了,我们针对测试模型Schema编辑以下匿名块并运行 SQL> set serveroutput on; SQL> declare cmp_blk_cnt binary_integer; uncmp_blk_cnt binary_integer; cmp_rows binary_integer; uncmp_rows binary_integer; cmp_ratio number; cmp_typ varchar2(100); BEGIN for i in (SELECT TABLE_NAME from dba_tables where compression = 'DISABLED' and owner = 'MACLEAN' and num_rows>1000000) loop for j in 1 .. 5 loop dbms_compression.get_compression_ratio(scratchtbsname => 'SCRATCH', ownname => 'MACLEAN', tabname => i.table_name, partname => NULL, comptype => power(2, j), blkcnt_cmp => cmp_blk_cnt, blkcnt_uncmp => uncmp_blk_cnt, row_cmp => cmp_rows, row_uncmp => uncmp_rows, cmp_ratio => cmp_ratio, comptype_str => cmp_typ); dbms_output.put_line(i.table_name || '--' || 'compress_type is ' || cmp_typ || ' ratio :' || to_char(cmp_ratio, '99.9') || '%'); end loop; end loop; end; / DSS_ORDER--compress_type is "Compress For OLTP" ratio : 1.1% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_ORDER--compress_type is "Compress For Query High" ratio : 2.7% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_ORDER--compress_type is "Compress For Query Low" ratio : 1.7% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_ORDER--compress_type is "Compress For Archive High" ratio : 2.9% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_ORDER--compress_type is "Compress For Archive Low" ratio : 2.7% DSS_PARTSUPP--compress_type is "Compress For OLTP" ratio : .9% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_PARTSUPP--compress_type is "Compress For Query High" ratio : 1.8% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_PARTSUPP--compress_type is "Compress For Query Low" ratio : 1.2% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_PARTSUPP--compress_type is "Compress For Archive High" ratio : 1.9% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_PARTSUPP--compress_type is "Compress For Archive Low" ratio : 1.8% DSS_LINEITEM--compress_type is "Compress For OLTP" ratio : 1.4% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_LINEITEM--compress_type is "Compress For Query High" ratio : 3.5% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_LINEITEM--compress_type is "Compress For Query Low" ratio : 2.3% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_LINEITEM--compress_type is "Compress For Archive High" ratio : 4.3% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows DSS_LINEITEM--compress_type is "Compress For Archive Low" ratio : 3.7% PL/SQL procedure successfully completed.
可以从上述测试看到,”Compress For Archive High”压缩率最高,该类型最适合于数据归档存储,但其算法复杂度高于”Compress For Archive Low”,压缩耗时亦随之上升。
总体压缩率都较低,这同TPC-D测试的数据建模有一定关联,我们再使用一组TPC-H的测试数据来模拟压缩:
SQL> conn liu/liu; Connected. SQL> select num_rows,blocks,table_name from user_tables; NUM_ROWS BLOCKS TABLE_NAME ---------- ---------- ------------------------------ 3000000 46817 H_ORDER 300000 6040 H_CUSTOMER 12000000 221376 H_LINEITEM 25 5 H_NATION 400000 7552 H_PART 5 5 H_REGION 1600000 17491 H_PARTSUPP 20000 496 H_SUPPLIER 8 rows selected. SQL> set serveroutput on; SQL> declare cmp_blk_cnt binary_integer; uncmp_blk_cnt binary_integer; cmp_rows binary_integer; uncmp_rows binary_integer; cmp_ratio number; cmp_typ varchar2(100); BEGIN for i in (SELECT TABLE_NAME from dba_tables where compression = 'DISABLED' and owner = 'LIU' and num_rows>1000000) loop for j in 1 .. 5 loop dbms_compression.get_compression_ratio(scratchtbsname => 'SCRATCH', ownname => 'LIU', tabname => i.table_name, partname => NULL, comptype => power(2, j), blkcnt_cmp => cmp_blk_cnt, blkcnt_uncmp => uncmp_blk_cnt, row_cmp => cmp_rows, row_uncmp => uncmp_rows, cmp_ratio => cmp_ratio, comptype_str => cmp_typ); dbms_output.put_line(i.table_name || '--' || 'compress_type is ' || cmp_typ || ' ratio :' || to_char(cmp_ratio, '99.9') || '%'); end loop; end loop; end; / H_ORDER--compress_type is "Compress For OLTP" ratio : 1.1% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_ORDER--compress_type is "Compress For Query High" ratio : 5.2% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_ORDER--compress_type is "Compress For Query Low" ratio : 2.9% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_ORDER--compress_type is "Compress For Archive High" ratio : 7.2% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_ORDER--compress_type is "Compress For Archive Low" ratio : 5.5% H_PARTSUPP--compress_type is "Compress For OLTP" ratio : .9% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_PARTSUPP--compress_type is "Compress For Query High" ratio : 5.1% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_PARTSUPP--compress_type is "Compress For Query Low" ratio : 2.7% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_PARTSUPP--compress_type is "Compress For Archive High" ratio : 7.2% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_PARTSUPP--compress_type is "Compress For Archive Low" ratio : 5.3% H_LINEITEM--compress_type is "Compress For OLTP" ratio : 1.4% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_LINEITEM--compress_type is "Compress For Query High" ratio : 5.2% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_LINEITEM--compress_type is "Compress For Query Low" ratio : 3.0% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_LINEITEM--compress_type is "Compress For Archive High" ratio : 7.4% Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows H_LINEITEM--compress_type is "Compress For Archive Low" ratio : 5.6% PL/SQL procedure successfully completed.
可以看到相比TPC-D的测试用数据,TPC-H建立的数据更具可压缩性。
PS:
TPC-D represents a broad range of decision support (DS) applications that require complex, long running queries against large complex data structures. Real-world business questions were written against this model, resulting in 17 complex queries.
The TPC Benchmark™H (TPC-H) is a decision support benchmark. It consists of a suite of business oriented ad-hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions.
The performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@Size), and reflects multiple aspects of the capability of the system to process queries. These aspects include the selected database size against which the queries are executed, the query processing power when queries are submitted by a single stream, and the query throughput when queries are submitted by multiple concurrent users. The TPC-H Price/Performance metric is expressed as $/QphH@Size.
Leave a Reply