Different settings are available depending on the type of analyzer selected.
Multiple analysis settings are separated with a semicolon (;).
Analysis setting | Description | Supported databases |
---|---|---|
sample | Defines the sampling algorithm used to select data for the sensitive data discovery process. The following values are available: - top - The first N records are selected from each table. Excellent performance. Use sample=top. - sample - Records are selected using probability-based sampling. Good performance. Use sample=sample. - random - Random records are selected from each table. Low performance, good quality of data. Use sample=random. | DB2 - top, sample, random (default). Informix - top (default). SQL Server - top, sample, random (default). Oracle - top, sample, random (default). Sybase - top (default). PostgreSQL - top (default), random. |
schemas | Only the listed schemas are included in the import. Provide the list of schemas separated by commas (example: schemas=SCOTT,DAVID,JACK). | Oracle |
includeSystemSchemas | Defines whether the import of the data source includes system schemas and tables. The following values are available: - true - System schemas and tables are imported. Use includeSystemSchemas=true. - false - System schemas and tables are not imported. Use includeSystemSchemas=false. This is the default setting. | DB2, Oracle, PostgreSQL |
countRecords | Defines whether the import of the data source retrieves the number of records in each table. The following values are available: - (blank) - The record count is not retrieved. Omit the countRecords declaration. This is the default setting. - stat - Reads the database information schema statistics to estimate the number of records. Excellent performance, but the saved number of records may differ from the real number of records. Use countRecords=stat. - count - Reads the exact count of records in the database. Database-intensive operation. Use countRecords=count. | DB2, Informix, SQL Server, Oracle, Sybase, PostgreSQL |
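As an illustration of how the settings in the table combine, the following Python sketch builds and parses a settings string. It assumes that each setting is written as name=value and that settings are joined with ; without surrounding spaces. The helper names `build_analysis_settings` and `parse_analysis_settings` are hypothetical and are not part of BizDataX. The sketch only checks the values listed in the table; it does not check which values a particular database supports.

```python
# Allowed values per setting, transcribed from the table above.
# An empty tuple means the value is free-form (for example, a schema list).
ALLOWED_VALUES = {
    "sample": ("top", "sample", "random"),
    "schemas": (),
    "includeSystemSchemas": ("true", "false"),
    "countRecords": ("stat", "count"),
}


def build_analysis_settings(settings):
    """Join name=value pairs with ';' after checking them against the table."""
    parts = []
    for name, value in settings.items():
        if name not in ALLOWED_VALUES:
            raise ValueError(f"unknown analysis setting: {name}")
        allowed = ALLOWED_VALUES[name]
        if allowed and value not in allowed:
            raise ValueError(f"invalid value for {name}: {value}")
        parts.append(f"{name}={value}")
    return ";".join(parts)


def parse_analysis_settings(text):
    """Split a ';'-separated settings string back into a dict."""
    result = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing or doubled separator
        name, _, value = part.partition("=")
        result[name.strip()] = value.strip()
    return result


# Hypothetical usage: an Oracle import limited to three schemas.
settings_string = build_analysis_settings({
    "sample": "top",
    "schemas": "SCOTT,DAVID,JACK",
    "countRecords": "stat",
})
print(settings_string)
# prints: sample=top;schemas=SCOTT,DAVID,JACK;countRecords=stat
```

To leave countRecords at its default (blank), omit the key from the dictionary, which mirrors omitting the declaration from the settings string.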
Some additional specifics are explained below:
countRecords=stat
Before using this setting, it is recommended to update the database statistics, because the record counts are read from those statistics. Instructions for updating the statistics of each database can be found at the following links:
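For orientation only, the Python sketch below collects the statements commonly used to refresh table statistics on each supported database. These statements come from the respective database vendors, not from the BizDataX documentation, and the exact options appropriate for a given environment are an assumption; the vendor documentation remains the authority. The helper `statistics_statement` and the schema and table names are hypothetical.

```python
# Commonly used statistics-refresh statements per database.
# Assumption: these are typical forms, not BizDataX recommendations.
# {schema} and {table} are placeholders filled in by statistics_statement().
STATISTICS_STATEMENTS = {
    # RUNSTATS is a DB2 command, usually run from the command line processor.
    "DB2": "RUNSTATS ON TABLE {schema}.{table} WITH DISTRIBUTION AND DETAILED INDEXES ALL",
    "Informix": "UPDATE STATISTICS MEDIUM FOR TABLE {table}",
    "SQL Server": "UPDATE STATISTICS {schema}.{table}",
    "Oracle": "BEGIN DBMS_STATS.GATHER_TABLE_STATS('{schema}', '{table}'); END;",
    "Sybase": "update statistics {table}",
    "PostgreSQL": "ANALYZE {schema}.{table}",
}


def statistics_statement(database, schema, table):
    """Return the statistics-refresh statement for one table."""
    template = STATISTICS_STATEMENTS[database]
    return template.format(schema=schema, table=table)


# Hypothetical usage: refresh statistics before importing with countRecords=stat.
print(statistics_statement("PostgreSQL", "public", "customers"))
# prints: ANALYZE public.customers
```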
sample
Besides the sample setting, discovery rules also have a sample size, which is set when a discovery rule is created. The sample size also affects the performance and the quality of the data. The default options for both settings are the recommended ones and should work well with most database tables.
BizDataX Documentation © Built by Ekobit. All rights reserved.