Informatica Data Quality for Siebel 9.1.0 HotFix 2

Informatica Data Quality for Siebel (Version 9.1.0 HotFix 2)
User Guide
Informatica Data Quality for Siebel User Guide
Version 9.1.0 HotFix 2
August 2011
Copyright (c) 1998-2011 Informatica. All rights reserved.
This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and
disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form,
by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international
Patents and other Patents Pending.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in
DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013 (c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as
applicable.
The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in
writing.
Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange,
PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange Informatica On
Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and Informatica
Master Data Management are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company
and product names may be trade names or trademarks of their respective owners.
Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights
reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights
reserved. Copyright © Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright Isomorphic Software. All rights reserved. Copyright © Meta
Integration Technology, Inc. All rights reserved. Copyright © Intalio. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems Incorporated. All
rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All rights reserved.
Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights reserved. Copyright ©
Glyph & Cog, LLC. All rights reserved. Copyright © Thinkmap, Inc. All rights reserved. Copyright © Clearpace Software Limited. All rights reserved. Copyright © Information
Builders, Inc. All rights reserved. Copyright © OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved. Copyright Cleo Communications, Inc. All rights
reserved. Copyright © International Organization for Standardization 1986. All rights reserved. Copyright © ej-technologies GmbH . All rights reserved. Copyright © Jaspersoft
Corporation. All rights reserved.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and other software which is licensed under the Apache License,
Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under the License.
This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software copyright ©
1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under the GNU Lesser General Public License Agreement, which may be found at
http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind, either express or implied, including but not
limited to the implied warranties of merchantability and fitness for a particular purpose.
The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California, Irvine,
and Vanderbilt University, Copyright © 1993-2006, all rights reserved.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and redistribution of
this software is subject to terms available at http://www.openssl.org and http://www.openssl.org/source/license.html.
This product includes Curl software which is Copyright 1996-2007, Daniel Stenberg, <[email protected]>. All Rights Reserved. Permissions and limitations regarding this
software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or without
fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available
at http://www.dom4j.org/license.html.
The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://dojotoolkit.org/license.
This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this
software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.
This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at
http://www.gnu.org/software/kawa/Software-License.html.
This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project Copyright © 2002 Cable & Wireless
Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.
This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are subject
to terms available at http://www.boost.org/LICENSE_1_0.txt.
This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at
http://www.pcre.org/license.txt.
This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.eclipse.org/org/documents/epl-v10.php.
This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License,
http://www.stlport.org/doc/license.html, http://www.asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html,
http://httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html,
http://www.openldap.org/software/release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html,
http://fusesource.com/downloads/license-agreements/fuse-message-broker-v-5-3-licenseagreement; http://antlr.org/license.html; http://aopalliance.sourceforge.net/;
http://www.bouncycastle.org/licence.html; http://www.jgraph.com/jgraphdownload.html; http://www.jcraft.com/jsch/LICENSE.txt; http://jotm.objectweb.org/bsd_license.html;
http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231; http://www.slf4j.org/license.html;
http://developer.apple.com/library/mac/#samplecode/HelpHook/Listings/HelpHook_java.html; http://www.jcraft.com/jsch/LICENSE.txt;
http://nanoxml.sourceforge.net/orig/copyright.html; http://www.json.org/license.html; http://forge.ow2.org/projects/javaservice/; http://www.postgresql.org/about/license.html;
http://www.sqlite.org/copyright.html; http://www.tcl.tk/software/tcltk/license.html; http://www.jaxen.org/faq.html; http://www.jdom.org/docs/faq.html; and
http://www.slf4j.org/license.html.
This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution
License (http://www.opensource.org/licenses/cddl1.php), the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary Code License
Agreement Supplemental License Terms, the BSD License (http://www.opensource.org/licenses/bsd-license.php), the MIT License (http://www.opensource.org/licenses/mit-license.php), and the Artistic License (http://www.opensource.org/licenses/artistic-license-1.0).
This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this software
are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab. For further
information please visit http://www.extreme.indiana.edu/.
This product contains runtime modules of IBM DB2 Driver for JDBC and SQLJ (c) Copyright IBM Corporation 2006 All rights reserved.
This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775;
6,640,226; 6,789,096; 6,820,077; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,254,590; 7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422; 7,720,842;
7,721,270; and 7,774,791, international Patents and other Patents Pending.
DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied
warranties of noninfringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The
information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is
subject to change at any time without notice.
NOTICES
This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software
Corporation ("DataDirect") which are subject to the following terms and conditions:
1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF
THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH
OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.
Part Number: IDQ-SEI-91000-HF2-0001
Table of Contents
Preface
  Informatica Resources
    Informatica Customer Portal
    Informatica Documentation
    Informatica Web Site
    Informatica How-To Library
    Informatica Knowledge Base
    Informatica Multimedia Knowledge Base
    Informatica Global Customer Support
Chapter 1: Introduction to Data Quality for Siebel
  Overview
  Understanding Data Cleansing, Deduplication, and Address Validation
  Logical Architecture
    About Workflows, Mappings, and Mapplets
  Cleansing and Deduplication Process Flows
    Real-Time and Batch Cleansing
    Real-Time Deduplication
    Batch Deduplication
    Using Staging Tables in Batch Deduplication
  Address Validation Process Flows
    Realtime Address Validation
    Batch Address Validation
  Physical Architecture and Installable Components
    Siebel Components
    Informatica Components
    Data Quality for Siebel Components
    Bill of Materials
    System Requirements
Chapter 2: Installing and Configuring Informatica Components
  Overview
    Install Sequence
  Install PowerCenter Client and Server Components
    Post-Installation Requirements
  Create a PowerCenter Web Services Hub
  Create Staging Tables for Batch Duplicate Analysis
    Pre-Installation Requirements
    Character Encoding Requirements
  Install Informatica Data Quality Reference Tables
  Importing PowerCenter Workflows
    Pre-Import Requirements
    Steps to Import
    Post-Import Requirements
  Configuring the Siebel Parameter File
    Verify the Index Database Type
    Update the Batch Deduplication Settings
  Installing Reference Data
Chapter 3: Configuring Siebel
  Overview
  Adding Library Files
  Adding the JAR File
  Adding Configuration Files
  Editing Configuration Files
Chapter 4: Configuring Siebel for Cleansing and Deduplication
  Overview
  Enable Data Quality Functionality for Siebel
    Register the INFADQSiebel Library
    Enable Siebel Data Quality
  Configure Siebel Vendor Parameters
    Set Siebel Vendor Parameters
  Set Siebel Data Quality Parameters
  Configure Business Components for Data Quality Operations
    Select Informatica as the Data Quality Vendor
    Define Business Components and Data Quality Operations
  Generate Siebel Match Keys
  Enable Cleansing and Deduplication for a Thick Client
Chapter 5: Configuring Siebel for Address Validation
  Overview
  SIF File Import
  Adding Picks
    Modifying the Address Business Component
    Adding Picks for Address Applets
  System Preferences Setup
  INFADQSiebel JAR Setup
    JAR Setup for the Siebel Thick Client
    JAR Setup for the Siebel Thin Client
  Batch Setup
  Verifying Your Address Validation Implementation
    Verifying a Realtime Address Validation Implementation
    Verifying a Batch Address Validation Implementation
Index
Preface
The Informatica Data Quality for Siebel Guide is written for Siebel system administrators and other users who
install and set up Data Quality for Siebel and who configure Siebel applications to communicate with Data Quality
for Siebel. This guide assumes that you have an understanding of data cleansing, deduplication, and address
validation capabilities in real-time and batch scenarios.
Informatica Resources
Informatica Customer Portal
As an Informatica customer, you can access the Informatica Customer Portal site at
http://mysupport.informatica.com. The site contains product information, user group information, newsletters,
access to the Informatica customer support case management system (ATLAS), the Informatica How-To Library,
the Informatica Knowledge Base, the Informatica Multimedia Knowledge Base, Informatica Product
Documentation, and access to the Informatica user community.
Informatica Documentation
The Informatica Documentation team takes every effort to create accurate, usable documentation. If you have
questions, comments, or ideas about this documentation, contact the Informatica Documentation team through
email at [email protected]. We will use your feedback to improve our documentation. Let us
know if we can contact you regarding your comments.
The Documentation team updates documentation as needed. To get the latest documentation for your product,
navigate to Product Documentation from http://mysupport.informatica.com.
Informatica Web Site
You can access the Informatica corporate web site at http://www.informatica.com. The site contains information
about Informatica, its background, upcoming events, and sales offices. You will also find product and partner
information. The services area of the site includes important information about technical support, training and
education, and implementation services.
Informatica How-To Library
As an Informatica customer, you can access the Informatica How-To Library at http://mysupport.informatica.com.
The How-To Library is a collection of resources to help you learn more about Informatica products and features. It
includes articles and interactive demonstrations that provide solutions to common problems, compare features and
behaviors, and guide you through performing specific real-world tasks.
Informatica Knowledge Base
As an Informatica customer, you can access the Informatica Knowledge Base at http://mysupport.informatica.com.
Use the Knowledge Base to search for documented solutions to known technical issues about Informatica
products. You can also find answers to frequently asked questions, technical white papers, and technical tips. If
you have questions, comments, or ideas about the Knowledge Base, contact the Informatica Knowledge Base
team through email at [email protected].
Informatica Multimedia Knowledge Base
As an Informatica customer, you can access the Informatica Multimedia Knowledge Base at
http://mysupport.informatica.com. The Multimedia Knowledge Base is a collection of instructional multimedia files
that help you learn about common concepts and guide you through performing specific tasks. If you have
questions, comments, or ideas about the Multimedia Knowledge Base, contact the Informatica Knowledge Base
team through email at [email protected].
Informatica Global Customer Support
You can contact a Customer Support Center by telephone or through the Online Support. Online Support requires
a user name and password. You can request a user name and password at http://mysupport.informatica.com.
Use the following telephone numbers to contact Informatica Global Customer Support:
North America / South America
  Toll Free
    Brazil: 0800 891 0202
    Mexico: 001 888 209 8853
    North America: +1 877 463 2435
Europe / Middle East / Africa
  Toll Free
    France: 0805 804632
    Germany: 0800 5891281
    Italy: 800 915 985
    Netherlands: 0800 2300001
    Portugal: 800 208 360
    Spain: 900 813 166
    Switzerland: 0800 463 200
    United Kingdom: 0800 023 4632
  Standard Rate
    Belgium: +31 30 6022 797
    France: +33 1 4138 9226
    Germany: +49 1805 702 702
    Netherlands: +31 306 022 797
    United Kingdom: +44 1628 511445
Asia / Australia
  Toll Free
    Australia: 1 800 151 830
    New Zealand: 09 9 128 901
  Standard Rate
    India: +91 80 4112 5738
CHAPTER 1
Introduction to Data Quality for Siebel
This chapter includes the following topics:
• Overview
• Understanding Data Cleansing, Deduplication, and Address Validation
• Logical Architecture
• Cleansing and Deduplication Process Flows
• Address Validation Process Flows
• Physical Architecture and Installable Components
Overview
Siebel acts as a unified source of account, customer, and prospect data across an enterprise. Data Quality for
Siebel enhances your ability to create and maintain reliable and duplicate-free data in the Siebel system.
Data Quality for Siebel applies the data quality management capabilities of Informatica applications to new record
data entering the Siebel system and to data stored in the system. It integrates with the Siebel Data Quality (SDQ)
Universal Connector through its application programming interface to deliver enhanced data cleansing,
deduplication, and address validation capabilities in realtime and in batch mode.
On These Record Types...      You Can Perform These Data Quality Operations
Account, Customer,            Real-Time Cleansing, Batch Cleansing,
List Management Prospect      Real-Time Deduplication, Batch Deduplication
CUT/Business Address          Real-Time and Batch Cleansing
CUT/Business Address          Real-Time and Batch Address Validation
The data cleansing processes in Data Quality for Siebel correct and standardize record data values. The data
deduplication processes identify duplicate records in the system and return these records for evaluation. The
address validation processes validate records against valid addresses in a reference dataset. You can customize
the data cleansing, deduplication, and address validation rules to suit the requirements of your organization.
Note: Siebel provides data consolidation and record merging rules as part of its configuration and setup. Data
Quality for Siebel provides match scores to Siebel to facilitate this functionality.
Understanding Data Cleansing, Deduplication, and Address Validation
The data cleansing, deduplication, and address validation functionality in Data Quality for Siebel focuses on name
and address data.
Data cleansing performs standardization operations on the input record so that it meets user standards and
requirements. Standardization harmonizes variations in customer terms, such as Doctor/Dr and Street/St, and
removes extraneous punctuation.
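As a rough illustration of the standardization described above, the following Python sketch harmonizes two term variants and strips punctuation. The abbreviation table and the function name are assumptions made for this example; in the product, these rules are defined in Informatica mapplets, not in Python.

```python
import re

# Hypothetical abbreviation table for illustration only; the actual
# standardization rules are defined in the Data Quality for Siebel mapplets.
ABBREVIATIONS = {
    "doctor": "Dr",
    "street": "St",
}

def standardize(value: str) -> str:
    """Harmonize term variants and strip extraneous punctuation."""
    # Drop every character that is not a word character, whitespace, or hyphen.
    cleaned = re.sub(r"[^\w\s-]", "", value)
    words = []
    for word in cleaned.split():
        # Replace a known variant with its standard form; keep other words as-is.
        words.append(ABBREVIATIONS.get(word.lower(), word))
    return " ".join(words)

print(standardize("Doctor John Smith, 12 Main Street."))
```

With this table, "Doctor" and "Dr." both reduce to "Dr", so downstream comparison sees a single form.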
Data Quality for Siebel performs data cleansing in real time and batch modes. In each case, the Siebel Universal
Connector sends data to Data Quality for Siebel one record at a time.
Data deduplication identifies potential duplicates in record data provided by Siebel to Informatica applications.
Data Quality for Siebel performs data deduplication in real time and batch modes.
When processing a new record in real time, Data Quality for Siebel performs data cleansing before data
deduplication. The deduplication process begins when the cleansing process completes.
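The ordering described above (cleanse first, then look for duplicates) can be sketched as follows. This is a minimal sketch, not the Informatica matching logic: the cleansing step, the use of Python's difflib for scoring, the 0-100 score scale, and the threshold of 90 are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

def cleanse(record: dict) -> dict:
    """Stand-in cleansing step: collapse whitespace and uppercase each value."""
    return {key: " ".join(value.split()).upper() for key, value in record.items()}

def match_score(a: dict, b: dict) -> int:
    """Return a 0-100 similarity score over the concatenated field values."""
    left = " ".join(a[key] for key in sorted(a))
    right = " ".join(b[key] for key in sorted(b))
    return round(SequenceMatcher(None, left, right).ratio() * 100)

def process_new_record(new: dict, existing: list, threshold: int = 90):
    """Cleanse the new record, then score it against each existing record."""
    cleansed = cleanse(new)  # cleansing completes before deduplication begins
    candidates = [(match_score(cleansed, cleanse(rec)), rec) for rec in existing]
    # Keep only candidates at or above the (hypothetical) threshold.
    return cleansed, [(score, rec) for score, rec in candidates if score >= threshold]

existing = [
    {"name": "ACME CORP", "city": "dublin"},
    {"name": "Zenith Ltd", "city": "Paris"},
]
cleansed, matches = process_new_record({"name": " acme  corp ", "city": "Dublin"}, existing)
print(cleansed, matches)
```

Scoring the cleansed form rather than the raw input is the point of the ordering: differences in case and spacing no longer lower the score.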
Address validation enhances the address cleansing and verification capabilities of Data Quality for Siebel by
adding Informatica address validation components to the installed solution. Full address validation requires
software and reference datasets sourced by Informatica from third-party address reference specialists.
In realtime mode, users can interactively verify addresses against an address validation reference dataset in order
to choose from valid address matches. In batch mode, Data Quality for Siebel non-interactively compares input
record addresses to the reference dataset and then returns a validated or corrected address with status or error
code information appended.
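The batch behavior described above, in which each input address comes back validated or corrected with a status appended, might look like the following sketch. The reference addresses, the status names (VALID, CORRECTED, ERROR_NO_MATCH), and the similarity cutoff are hypothetical; the real reference datasets and codes come from the Informatica address validation components.

```python
from difflib import get_close_matches

# Hypothetical reference dataset; real deployments use licensed reference data.
REFERENCE = ["12 MAIN ST, SPRINGFIELD", "40 OAK AVE, SPRINGFIELD"]

def validate(address: str, reference=REFERENCE) -> dict:
    """Return the address with a status field appended (status names are illustrative)."""
    candidate = " ".join(address.upper().split())
    if candidate in reference:
        return {"address": candidate, "status": "VALID"}
    # Fall back to the closest reference entry above an assumed similarity cutoff.
    close = get_close_matches(candidate, reference, n=1, cutoff=0.8)
    if close:
        return {"address": close[0], "status": "CORRECTED"}
    return {"address": candidate, "status": "ERROR_NO_MATCH"}

# Batch mode: process every input record without user interaction.
for raw in ["12 main st, springfield", "12 MAIN STR, SPRINGFIELD", "ZZZ 999"]:
    print(validate(raw))
```

In real-time mode the user would instead be shown the candidate matches and asked to choose; the sketch covers only the non-interactive case.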
Logical Architecture
Data Quality for Siebel communicates with the Siebel system through a library file called INFADQSiebel that
resides locally to the Universal Connector component of Siebel Data Quality.
The library file “wraps” the record data and instructions it receives from the Universal Connector in a SOAP
(Simple Object Access Protocol) envelope and sends them as XML documents to PowerCenter. PowerCenter Web
Services Hub receives the data and sends it to a PowerCenter workflow associated with the required data
process. Depending on the data process involved, PowerCenter may call one or several workflows to process the
incoming data.
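To make the wrapping step concrete, the sketch below builds a SOAP envelope around one record using Python's standard library. It assumes the SOAP 1.1 envelope namespace; the operation name (CleanseAccount) and the field names are hypothetical and do not reflect the actual message schema that INFADQSiebel sends to the Web Services Hub.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (assumed; the guide does not state the SOAP version).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def wrap_record(operation: str, record: dict) -> str:
    """Wrap one record and its operation name in a SOAP envelope, returned as XML text."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation element tells the receiver which data process is requested.
    request = ET.SubElement(body, operation)
    for field_name, value in record.items():
        ET.SubElement(request, field_name).text = value
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical operation and field names, for illustration only.
print(wrap_record("CleanseAccount", {"Name": "Acme Corp", "City": "Dublin"}))
```

The receiving side can route on the operation element, which mirrors how the Web Services Hub passes incoming data to the workflow associated with the requested process.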
Each workflow may contain one or several mappings that include mapplets implementing data quality processes.
These mapplets define the low-level cleansing, deduplication, and address validation operations performed on the
Siebel data.
The Data Quality for Siebel installation fileset includes Data Quality for Siebel mappings. These mappings do not
represent the only types of cleansing, deduplication, and address validation you can perform with Informatica
applications, nor do they represent the complete extent of Informatica’s data quality capabilities. The mappings are
provided as a useful sample of these capabilities. To edit the mappings, or create mappings or mapplets specific
to your business processes, use Informatica Data Quality. For more information, see “Data Quality for Siebel
Mappings and Mapplets”.
The following graphic illustrates the architecture on the Siebel side:
Figure 1. Data Quality for Siebel Architecture, Siebel Orientation
The following graphic illustrates the architecture on the Informatica side:
Figure 2. Data Quality for Siebel Architecture, Informatica Orientation
About Workflows, Mappings, and Mapplets
Three types of Informatica processes are involved in running data cleansing, deduplication, and address validation
tasks on data received from Siebel: workflows, mappings, and mapplets.
Workflow. A set of instructions informing PowerCenter to run one or more tasks. It can be triggered by a
PowerCenter user, by the arrival of data in real time, or by a scheduler. Workflows are composed of tasks. Tasks
that contain mappings are called sessions.
Data Quality for Siebel provides several workflows for import to the PowerCenter repository. These workflows
contain one or more session tasks that specify the mappings that run on the input data. Data Quality for Siebel can
run one or several mappings in sequence to perform cleansing, deduplication, and address validation on input
data.
Workflows are created in PowerCenter Workflow Manager and saved in the PowerCenter repository. Some
workflows run continuously to maximize speed of data throughput between Siebel and Informatica applications.
Mapping. A set of data input (source) parameters, data movement or transformation instructions, and data output
(target) parameters that can be applied to a dataset in PowerCenter. Mappings are created in PowerCenter
Designer and saved in the PowerCenter repository.
Mapplet. A set of input (source) parameters, data analysis, enhancement, or deduplication instructions, and
output (target) parameters. Mapplets are stored in the PowerCenter repository for use in mappings. Data Quality
for Siebel includes pre-built mapplets containing data quality processes.
Data Quality for Siebel Mappings and Mapplets
Use Informatica Data Quality to generate Data Quality for Siebel mappings and mapplets. The Data Quality for
Siebel installation fileset includes Data Quality for Siebel mappings. You can customize the mappings, or you can
create mappings and mapplets specific to your business processes. Save Data Quality for Siebel mappings and
mapplets into the PowerCenter repository.
Cleansing and Deduplication Process Flows
When you enter a new record to the Siebel system, Data Quality for Siebel cleanses the record and returns the
cleansed record to Siebel. When Siebel receives the cleansed record, the data deduplication processes begin.
Note: You cannot create a record with the same name as an existing record. If Siebel finds a record that matches
the new record name, it displays a message stating that new record names must be unique.
Real-Time and Batch Cleansing
The data cleansing process flow is identical for real-time and batch scenarios. In each case, records are passed
one at a time to Informatica. Real-time data cleansing enhances the usability of new records as they enter the
Siebel system. Batch cleansing enhances the usability of records present in the system.
The data flow is as follows:
1. Real time. You enter or edit a record in the Siebel system.
   -or-
   Batch. Siebel begins a batch cleansing job.
2. Siebel calls its Data Quality Manager, which instructs Siebel Data Quality to initiate the data cleansing
   operation.
3. The Siebel Universal Connector sends the record data to the INFADQSiebel library file, which passes the
   record information to the PowerCenter Web Services Hub in a SOAP envelope.
4. PowerCenter Web Services Hub routes the input data to a task in a PowerCenter workflow.
5. The data is routed to a PowerCenter mapping containing a data cleansing mapplet. When the data reaches
   the mapplet instructions, PowerCenter cleanses the data according to these instructions.
6. The same workflow task returns the cleansed data to the PowerCenter Web Services Hub.
7. The PowerCenter Web Services Hub returns the cleansed data to the Siebel system through the
   INFADQSiebel library file.
Real-Time Deduplication
This process begins when Siebel receives the cleansed record from the INFADQSiebel library file. The process
flow is as follows:
1. Siebel calls its Data Quality Manager, which instructs Siebel Data Quality to initiate the deduplication
   operation.
   Siebel Data Quality performs initial duplicate analysis on its data to identify possible matches for the record.
   These possible matches are called candidate records. The input record is called the driver record.
2. The Siebel Universal Connector sends the driver and candidate records to the INFADQSiebel library file,
   which passes them to the PowerCenter Web Services Hub in a SOAP envelope.
3. PowerCenter Web Services Hub routes the input data to a task in a PowerCenter workflow.
4. The data is routed to a PowerCenter mapping containing a data deduplication mapplet. When the data
   reaches the mapplet instructions, PowerCenter performs duplicate analysis according to the instructions.
   Informatica Data Quality operations calculate a set of match scores indicating the level of similarity between
   each driver-candidate pair. Informatica Data Quality returns the match scores and a unique ID for each pair. It
   does not return the original record data.
5. A workflow task returns the data results to the PowerCenter Web Services Hub, which returns the data to the
   Siebel system.
6. Siebel displays onscreen the candidate records that meet or exceed the match threshold score set in the
   Siebel system. You can configure the columns presented on this screen within the Siebel application using
   Siebel Tools. You can edit the match threshold value in Siebel.
7. You review the results of the process and decide how to proceed. You can select the original record for
   addition to the database or a candidate record to be merged with the information entered on the new record.
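The filtering in step 6 can be sketched in a few lines of Python. This is a minimal illustration, not Siebel's implementation: the MatchResult structure, its field names, and the sample scores are hypothetical, and sorting the best matches first is an assumption made for readability.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    pair_id: str        # unique ID returned for a driver-candidate pair
    candidate_id: str   # hypothetical Siebel-side identifier of the candidate
    score: int          # match score calculated for the pair


def candidates_to_display(results, threshold):
    """Keep pairs that meet or exceed the threshold, best matches first."""
    kept = [r for r in results if r.score >= threshold]
    return sorted(kept, key=lambda r: r.score, reverse=True)


results = [
    MatchResult("p1", "cand-A", 92),
    MatchResult("p2", "cand-B", 58),
    MatchResult("p3", "cand-C", 75),
]
shown = candidates_to_display(results, threshold=75)
print([r.candidate_id for r in shown])  # ['cand-A', 'cand-C']
```

Because only scores and pair IDs come back from Informatica, the record data shown to the user comes from Siebel's own copy of the candidate records.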
Batch Deduplication
In batch duplicate analysis, the process flow is as follows:
1. Siebel begins a batch cleansing job and calls its Data Quality Manager, which instructs Siebel Data Quality to
   initiate the deduplication operation.
2. The Siebel Universal Connector sends records to the INFADQSiebel library file in one or more batches. The
   default upper limit of records per batch is 200. This value is set in Siebel. The INFADQSiebel library file
   passes these records to the PowerCenter Web Services Hub in a SOAP envelope.
3. PowerCenter Web Services Hub routes the input data to a task in a PowerCenter workflow.
4. The data is routed to a PowerCenter mapping that writes the data to staging tables. Informatica Data Quality
   performs duplicate analysis on the data in these tables.
   Informatica Data Quality analyzes the degrees of similarity between records that share a common Siebel
   Dedup Token. Informatica Data Quality compares each record against each other record with the same
   Dedup Token and calculates a match score for each record pair. PowerCenter writes the match scores and a
   unique ID for each record pair to the staging database.
5. A workflow task returns the data results to the PowerCenter Web Services Hub, which returns the scores and
   IDs to Siebel. It does not return the original record data.
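The batching in step 2 and the token-based pairing in step 4 can be sketched in Python as follows. The sketch makes several assumptions: records are simple (record_id, dedup_token, name) tuples, and difflib.SequenceMatcher is used as a stand-in similarity measure. It is not the Informatica matching algorithm, and the scores it produces are not comparable to real match scores.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def chunk(records, size=200):
    """Split records into batches of at most `size` (200 is the documented default)."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def score_pairs(records):
    """Score every pair of records that share a Dedup Token.

    Each record is a (record_id, dedup_token, name) tuple. SequenceMatcher is
    a stand-in similarity measure, not the Informatica matching algorithm.
    """
    by_token = defaultdict(list)
    for record_id, token, name in records:
        by_token[token].append((record_id, name))

    scores = {}
    for group in by_token.values():
        # Compare each record against each other record with the same token.
        for (id_a, name_a), (id_b, name_b) in combinations(group, 2):
            ratio = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
            scores[(id_a, id_b)] = round(ratio * 100)
    return scores


records = [
    ("1", "ACM", "Acme Corp"),
    ("2", "ACM", "Acme Corporation"),
    ("3", "GLO", "Globex"),
]
pair_scores = score_pairs(records)
print(pair_scores)
```

Only records 1 and 2 are compared, because record 3 has a different token. Grouping by token keeps the number of comparisons far below the all-pairs count on large batches.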
Using Staging Tables in Batch Deduplication
In data cleansing and real-time deduplication PowerCenter operates on data in XML. In batch deduplication,
PowerCenter writes the data to a staging table before Informatica Data Quality performs duplicate analysis.
PowerCenter then returns the results of the analysis to Siebel as XML.
You must create the staging tables that will hold this data. For more information, see “Create Staging Tables for
Batch Duplicate Analysis” on page 11.
Address Validation Process Flows
Data Quality for Siebel uses AddressDoctor to perform address validation in both realtime and batch mode.
Previously, Data Quality for Siebel used QAS Pro Web to perform all realtime address validation.
Data Quality for Siebel performs realtime and batch address validation for the countries indicated by the following
ISO codes:
¨ AUS
¨ CAN
¨ DEU
¨ FRA
¨ GBR
¨ NLD
¨ USA
Realtime Address Validation
In realtime address validation, the process flow is as follows:
1. A Siebel user selects the pick icon in the Address field.
2. If the country is supported by and configured for address validation, Siebel passes the Address data to
   the INFADQSiebel JAR via a Java Business Service.
3. The INFADQSiebel JAR passes the data to the INFADQSiebel library, which passes the record to the
   PowerCenter Web Services Hub in a SOAP envelope.
4. The PowerCenter Web Services Hub routes the input data to a task in the PowerCenter workflow.
5. The data is routed to a PowerCenter mapping that uses an Address Validator transformation to analyze the
   data.
6. The response from the PowerCenter mapping is returned to the Web Services Hub, which returns a list of
   matching addresses and their scores to Siebel.
7. The matching addresses and scores are presented to the user as a list in Siebel.
Batch Address Validation
In batch address validation, the process flow is as follows:
1. The Informatica Address Validation workflow is executed using the Siebel 'Workflow Process Manager' job.
2. The Informatica Address Validation workflow searches for addresses matched by the 'Informatica Batch Org'
   or 'Informatica Batch Per' searches defined in System Preferences. The workflow sends matches to the
   INFADQSiebel JAR in one or more batches. The upper limit of records per batch is configurable.
3. The INFADQSiebel JAR passes the data to the INFADQSiebel library, which passes the record to the
   PowerCenter Web Services Hub in a SOAP envelope.
4. The PowerCenter Web Services Hub routes the input data to a task in the PowerCenter workflow.
5. The data is routed to a PowerCenter mapping, which sends the data to an appropriate address validation
   engine based on the country that the address is from.
6. The address data is compared to reference data, and if possible, corrected by the address validation engine.
   The address validation engine generates a validation status and returns the updated address to the Web
   Services Hub.
7. The Web Services Hub returns the address and validation status to Siebel.
8. Siebel updates the address data only if the validation status is specified in the 'Informatica Batch Status' entry
   in System Preferences.
9. The MatchStatus column of the address is updated with the validation status value returned.
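Steps 8 and 9 amount to a conditional update, sketched below in Python. The dictionary layout, the field names other than MatchStatus, and the status codes "V" and "C" are hypothetical. The sketch also assumes that MatchStatus is written whether or not the address fields are updated, which is one reading of the steps above.

```python
def apply_validation_result(address, result, accepted_statuses):
    """Return the address row as it would look after steps 8 and 9.

    `address` and `result` are plain dicts with hypothetical key names.
    `accepted_statuses` stands in for the 'Informatica Batch Status' entry
    in System Preferences.
    """
    updated = dict(address)
    if result["status"] in accepted_statuses:
        # Step 8: overwrite address fields only for accepted statuses.
        updated.update(result["address"])
    # Step 9 (assumed unconditional): record the returned validation status.
    updated["MatchStatus"] = result["status"]
    return updated


original = {"Street": "1 main st", "City": "Boston", "MatchStatus": None}
result = {"status": "C", "address": {"Street": "1 Main St", "City": "Boston"}}

accepted = apply_validation_result(original, result, accepted_statuses={"V", "C"})
rejected = apply_validation_result(original, result, accepted_statuses={"V"})
print(accepted)
print(rejected)
```

In the second call the status is not in the accepted set, so the street is left as entered and only MatchStatus changes.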
Physical Architecture and Installable Components
The system architecture that encompasses Data Quality for Siebel comprises applications and files from Siebel
and Informatica.
Siebel Components
Data Quality for Siebel uses the following Siebel components:
¨ Siebel system. Server-side Siebel instance, including the database.
¨ Data hubs. Used for managing data in the database. Supported hubs are Siebel Universal Customer Master
and Siebel Customer Relationship Management.
¨ Data entry client. A client data entry tool for Siebel. Thick client users must load library and configuration files
to their machines to use Data Quality for Siebel. Thin client users do not need to install these files.
¨ Data Quality Business Services. Data hubs use business services to call data quality processes on input
data.
¨ Siebel Data Quality. Application that manages data cleansing and deduplication operations and communicates
with third-party applications such as Data Quality for Siebel.
Informatica Components
Data Quality for Siebel uses server-side and client-side components from Informatica PowerCenter.
Data Quality for Siebel uses the following server-side components:
¨ Repository Service
¨ Integration Service
¨ Web Services Hub
The Repository and Integration Services manage and run data processes in PowerCenter. The Web Services Hub
is a gateway that makes PowerCenter functionality available to external client applications through web services.
The Web Services Hub and the web services it hosts comprise the Web Services Provider.
Data Quality for Siebel also uses the client-side Repository Manager component. Use the Repository Manager to
import the required workflows to the PowerCenter repository.
Ensure that your license includes the following PowerCenter options:
¨ Data Cleansing Option
¨ Real-Time Option
Data Quality for Siebel Components
Data Quality for Siebel provides the following components, which you install separately from the Informatica
components above. Locate these components in the Data Quality for Siebel install folders.
¨ Informatica Data Quality Library and Configuration Files. Data Quality for Siebel uses library and
configuration files to enable communication between the Universal Connector and the PowerCenter Web
Services Hub.
Locate the library files in the bin folder of the installable fileset. Locate the configuration files in the
SDQConnector folder of the installable fileset.
¨ PowerCenter Workflows. The data quality mapplets that perform data cleansing, deduplication, and address
validation operations are embedded in PowerCenter mappings. PowerCenter runs these mappings through
workflow tasks. The workflows and their constituent tasks, mappings, and mapplets are saved as a single XML
file for import to the PowerCenter repository.
Locate the workflow XML file in the Workflows folder of the installable fileset.
¨ Informatica Data Quality Reference Tables. Data quality mapplets use reference tables in cleansing
operations.
Locate the reference tables in the Dictionaries folder of the installable fileset.
¨ Informatica Data Quality Mappings. Backup copies of the PowerCenter mappings. You do not need to install
these files unless you want to edit the mappings in Data Quality and generate new mapplets for your solution.
¨ Staging Database Table Schema. PowerCenter uses staging tables when performing duplicate analysis in
batch mode. You must configure PowerCenter to use tables with the required schema. The schema is
contained in a file named <DatabaseType>-ddl.txt, where DatabaseType refers to the installed database.
Locate this file in the Database folder of the installable fileset.
Bill of Materials
You receive your Data Quality for Siebel components in a compressed folder. The following table describes the
contents of this folder:
Sub-folder Name    File Type
[root folder]      Readme.txt
bin                Library files
Database           Database schema files
Dictionaries       XML file containing reference tables
Java               Informatica JAR files
Resources          XML file containing data quality mappings
SDQConnector       Configuration and property files
Sif                Siebel installation file
Workflows          PowerCenter workflow XML
Xsd                Web service data description files
System Requirements
The following general requirements apply to Data Quality for Siebel:
¨ Platform requirements. Data Quality for Siebel installs on 32-bit Windows, Linux, and AIX platforms.
¨ Application requirements. Data Quality for Siebel comprises applications and files associated with Siebel,
Informatica PowerCenter, and Informatica Data Quality. The system requirements for each of these
applications also apply to the components installed on their respective machines. Consult the user
documentation for these applications for full information.
¨ Character encoding requirements. The staging database used by Informatica for batch duplicate analysis
must be configured to accept data encoded in the UTF-8 encoding of Unicode.
¨ Database requirements. Data Quality for Siebel requires one of the following databases to stage data during
batch deduplication: Oracle, DB2, or Microsoft SQL Server.
¨ Address validation requirements. To perform address validation, a 32-bit Java VM (version 1.5 or higher)
must be installed on the Siebel machine.
Additional configuration requirements apply to installations of PowerCenter and Informatica Data Quality. For more
information on PowerCenter requirements and Informatica Data Quality requirements, see Chapter 2, “Installing
and Configuring Informatica Components” on page 10.
CHAPTER 2
Installing and Configuring Informatica Components
This chapter includes the following topics:
¨ Overview, 10
¨ Install PowerCenter Client and Server Components, 11
¨ Create a PowerCenter Web Services Hub, 11
¨ Create Staging Tables for Batch Duplicate Analysis, 11
¨ Install Informatica Data Quality Reference Tables, 12
¨ Importing PowerCenter Workflows, 12
¨ Configuring the Siebel Parameter File, 13
¨ Installing Reference Data, 14
Overview
The steps to install and deploy Informatica software for Data Quality for Siebel include installing applications and
loading files for use in the Data Quality for Siebel environment.
You must consult other Informatica documents for complete install instructions. This chapter identifies the
documents you need.
Informatica product documentation is available from the Informatica Customer Portal site at
http://mysupport.informatica.com.
Install Sequence
Install and configure Informatica components in this order:
1. Install PowerCenter Client and Server.
2. Create a PowerCenter Web Services Hub.
3. Create staging tables for PowerCenter batch deduplication operations.
4. Install Informatica Data Quality reference tables.
5. Import PowerCenter workflows and mappings.
6. If you want to customize the provided Data Quality mappings, install Informatica Data Quality.
Install all Informatica software on the same machine. You need not install Informatica software on the same
machine as Siebel software. Informatica and Siebel components communicate by Simple Object Access Protocol
(SOAP).
Note: Informatica recommends installing Siebel and PowerCenter in the same physical location or data center.
Install PowerCenter Client and Server Components
Install PowerCenter server-side and client-side components. PowerCenter Client components import workflows to
the PowerCenter repository. PowerCenter Server components define the PowerCenter repository and execute the
workflow tasks.
Refer to the PowerCenter documentation for instructions on installing the server-side and client-side PowerCenter
components.
¨ The product install document is the PowerCenter Installation Guide.
Post-Installation Requirements
Use the PowerCenter Admin Console to review the Maximum Processes setting.
The Maximum Processes setting determines the maximum number of concurrent processes permitted on the
PowerCenter node. The value must be adequate for the number of processes that Data Quality for Siebel may use.
The Maximum Processes setting must be no lower than 20. Informatica recommends a Maximum Processes
setting of 50 for Data Quality for Siebel.
Create a PowerCenter Web Services Hub
Create a Web Services Hub on the PowerCenter Integration Service machine. Refer to the PowerCenter
documentation for information on the Web Services Hub and Web Services Provider.
¨ The install document is the PowerCenter Administration Guide.
¨ For more information on the Web Services Hub, consult the PowerCenter Web Services Provider Guide.
Create Staging Tables for Batch Duplicate Analysis
Data Quality for Siebel writes data to staging tables for batch duplicate analysis. You must create these tables.
1. Copy the schema for these tables from the <DatabaseType>-ddl.txt file included in your install fileset for Data
   Quality for Siebel.
Pre-Installation Requirements
Ensure that the PowerCenter Integration Service machine has a client installed for your staging database type.
The following staging database types are supported: Oracle, DB2, and Microsoft SQL Server.
Character Encoding Requirements
If any characters outside the Latin-1 character set will be passed to Data Quality for Siebel for batch duplicate
analysis, you must verify that the database can store Unicode characters.
Contact the database administrator to configure the encoding for the staging database.
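To judge whether your data contains characters outside Latin-1, a quick check along the following lines may help. It is a sketch that relies only on the fact that Latin-1 (ISO 8859-1) covers code points 0 through 255. The function name and the sample values are illustrative.

```python
def outside_latin1(text):
    """Return the characters in `text` that Latin-1 (ISO 8859-1) cannot encode."""
    return [ch for ch in text if ord(ch) > 0xFF]


samples = ["Müller GmbH", "Łódź Sp. z o.o.", "東京商事"]
for sample in samples:
    print(sample, outside_latin1(sample))
```

A non-empty result for any record value indicates that the staging database needs to store Unicode for that data.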
Install Informatica Data Quality Reference Tables
Informatica Data Quality uses reference tables in data cleansing.
1. Locate the Dictionaries folder in your Data Quality for Siebel install fileset. This folder contains a sub-folder
   named DQforSiebel.
2. Copy the DQforSiebel folder into the services folder of your Informatica installation directory.
Importing PowerCenter Workflows
Data Quality for Siebel uses pre-defined PowerCenter workflows, which you must import to PowerCenter. The
workflows are saved in the INFADQSiebel.xml file located in the Workflows folder of your install fileset. Import this
file to your PowerCenter repository using the PowerCenter Repository Manager.
Pre-Import Requirements
Before you import the workflows, complete the following tasks:
¨ Create a relational connection in the PowerCenter repository and configure it to use the staging database
schema specified in “Create Staging Tables for Batch Duplicate Analysis” on page 11. Ensure that the code
page for this connection is set to “UTF-8 encoding of Unicode”.
¨ Add the All.param parameter file to the PowerCenter system in the $PMRootDir of the integration service. In
default installations, this directory is named infa_shared. Within this directory, create a new directory called
Param and place All.param there. See “Configuring the Siebel Parameter File” on page 13 for instructions on
editing this file to match your system settings.
Steps to Import
To import a mapping or workflow to a PowerCenter repository:
1. Start the PowerCenter Repository Manager and connect to a repository.
2. From the Repository menu, select Import Objects.
3. The Import Wizard opens. Click Browse and select the INFADQSiebel.xml file. Click Next.
4. On the Select objects to import screen, select the option to Add All. Click Next.
5. On the Match Folders page, click the Browse button in the Destination Folder field to open the Folder
   Selection dialog box. Select the destination folder for the objects you will import. Create a new repository
   folder if necessary. Click Next.
6. You are prompted to specify rules for conflict resolution. Conflicts will occur if any object you import has the
   same name as an object in the destination repository. Create a rule that replaces (overwrites) any such
   objects. Click Next.
7. Resolve any conflicts identified between the import file and the contents of the destination folder. If the status
   is Resolved, click Import.
8. The import wizard describes the import progress on its Output tab. Click Done when the import is complete.
Post-Import Requirements
After you import the workflows, use PowerCenter Workflow Manager to associate the imported workflows with your
PowerCenter Integration Service.
To assign a workflow to a PowerCenter Integration Service:
1. From the Workflow Manager menu, select Service > Assign Integration Service.
2. The Assign Integration Service dialog box opens. In this dialog box, use the menu options to select the folder
   containing your imported Data Quality for Siebel workflows and to select an Integration Service.
3. Select the workflows to assign to the Integration Service. To select all workflows in the folder, choose Select
   all displayed workflows.
4. Click Assign.
5. Restart the PowerCenter Web Services Hub.
Configuring the Siebel Parameter File
The Data Quality for Siebel install files contain a parameter file named All.param that you must add to your
PowerCenter installation and edit to suit your system. The file consists of several settings that have blank values
or values that are deselected by a prefixed # “comment” character.
Review these settings and edit them as necessary for your system. The types of setting include:
¨ The database type.
¨ Batch deduplication settings.
Verify the Index Database Type
The parameter file contains $$DATABASE_TYPE entries for Oracle, Microsoft SQL Server, and DB2 as shown
below. By default, Oracle is the selected database type, indicated by the absence of a # symbol at the start of the
line entry.
$$DATABASE_TYPE=Oracle
#$$DATABASE_TYPE=Microsoft SQL Server
#$$DATABASE_TYPE=DB2
To change the database type:
Remove the # character from the $$DATABASE_TYPE entry for your database type, and add a # character to the
start of the currently selected entry.
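If you maintain the parameter file with a script, the same toggle can be expressed as below. This is a sketch under assumptions: it treats the file as a list of text lines, touches only lines that contain a $$DATABASE_TYPE entry, and the function name is illustrative. Back up All.param before rewriting it this way.

```python
def select_database_type(lines, wanted):
    """Uncomment the $$DATABASE_TYPE entry for `wanted` and comment out the others."""
    prefix = "$$DATABASE_TYPE="
    result = []
    for line in lines:
        bare = line.strip().lstrip("#").strip()
        if bare.startswith(prefix):
            selected = bare[len(prefix):] == wanted
            line = bare if selected else "#" + bare
        result.append(line)
    if prefix + wanted not in result:
        raise ValueError(f"no $$DATABASE_TYPE entry found for {wanted!r}")
    return result


param_lines = [
    "$$DATABASE_TYPE=Oracle",
    "#$$DATABASE_TYPE=Microsoft SQL Server",
    "#$$DATABASE_TYPE=DB2",
]
new_lines = select_database_type(param_lines, "DB2")
print("\n".join(new_lines))
```

Lines that are not $$DATABASE_TYPE entries pass through unchanged, so other settings in the file are preserved.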
Update the Batch Deduplication Settings
Relational Connection Settings
The $DBConnectionDQBatch setting specifies the relational connection for batch deduplication operations.
$DBConnectionDQBatch=SDQ_Batch_Oracle
Update the value of this setting to match the relational connection configured for batch deduplication in the
Workflow Manager.
Debug Settings
Set $$DEBUG_ON to 1 to store batch deduplication source rows after deduplication completes. Source rows are
stored in the following locations:
¨ Account - sdq_account_bulk_match_bak
¨ Contact - sdq_contact_bulk_match_bak
¨ Prospect - sdq_prospect_bulk_match_bak
Session ID Cache Settings
$$SESSION_ID_CACHE specifies the number of session IDs that the Informatica library caches between calls to the
SDQ_GetSessionID web service. The default value is 1000. This value must be set to 2 or higher. If the value is
lower than 2, the default value of 1000 is used.
Match Score Threshold
The $$MATCHSCORE_LIMIT setting determines the batch deduplication records to be stored in the staging database
and returned to Siebel. This value should be less than or equal to the matchscore threshold configured in Siebel.
The default value for this setting is as follows:
MPLT_SDQ_BatchDedup.$$MATCHSCORE_LIMIT=60
Record Removal Settings
Use the $$RETAINED_DAYS setting to determine the number of days that records are stored in the batch
deduplication staging database. Records older than the number of days specified for this setting are deleted on a
nightly basis.
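The rules stated for the session ID cache, match score threshold, and record removal settings can be summarized in a small Python sketch. The function names are illustrative, and the exact boundary used by the nightly removal job (whether a record exactly $$RETAINED_DAYS old is kept) is an assumption.

```python
from datetime import date, timedelta


def effective_session_id_cache(configured, default=1000):
    """Values below 2 fall back to the default of 1000, as described above."""
    return configured if configured >= 2 else default


def matchscore_limit_ok(matchscore_limit, siebel_threshold):
    """The limit should be less than or equal to the Siebel match threshold."""
    return matchscore_limit <= siebel_threshold


def removal_cutoff(today, retained_days):
    """Records dated before this cutoff are older than the retention period (assumed boundary)."""
    return today - timedelta(days=retained_days)


print(effective_session_id_cache(1))               # 1000
print(effective_session_id_cache(500))             # 500
print(matchscore_limit_ok(60, siebel_threshold=70))  # True
print(removal_cutoff(date(2011, 8, 31), 30))       # 2011-08-01
```

Keeping $$MATCHSCORE_LIMIT at or below the Siebel threshold ensures that every pair Siebel would display is actually stored and returned.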
Installing Reference Data
Informatica provides the batch and realtime reference datasets for Data Quality for Siebel. Informatica also
provides the Data Quality Content Installer.
Use the Data Quality Content Installer to install address reference datasets after you install all applications. For
instructions on running the Content Installer, see the Informatica Data Quality Content Installation Guide.
CHAPTER 3
Configuring Siebel
This chapter includes the following topics:
¨ Overview, 15
¨ Adding Library Files, 15
¨ Adding the JAR File, 16
¨ Adding Configuration Files, 16
¨ Editing Configuration Files, 16
Overview
This chapter describes the setup and configuration procedures that enable Siebel to communicate with Data
Quality for Siebel. The communication is handled by Informatica library files that reside in the Siebel Data Quality
environment.
These procedures require Administrator privileges in the Siebel environment.
This chapter does not provide instructions on installing Siebel products. Consult your Siebel documentation for
these instructions. You should have access to a copy of Siebel Bookshelf when performing the steps in this
chapter.
Adding Library Files
Data Quality for Siebel provides Informatica-Siebel connectivity through the INFADQSiebel library file. This file
manages the data exchange interactions between Informatica and Siebel software. Data exchange is by XML.
Informatica provides standard library files in addition to INFADQSiebel.
The library files are thread-safe and support multiple sessions by using unique session IDs. They support UTF-16
(UCS2) as the default Unicode Encoding.
To load the library files to Siebel:
1. Locate the library files in the bin folder of your installable fileset.
2. Copy these files to <Siebel Server Root Directory>\bin for Windows systems, and
   <Siebel Server Root Directory>/lib for Linux and Unix-based systems.
3. Windows only. If using a Siebel thick client, copy the files to its bin directory.
Adding the JAR File
The INFADQSiebel JAR file provides communication between Siebel scripts and the Informatica library.
To load the JAR file to Siebel:
1. Locate INFADQSiebel.jar in the Java folder of your installable fileset.
2. Copy this file to a folder on the Siebel server machine, e.g., \Informatica\DQforSiebel\Java.
   Note this location, as you will need to add it to the Siebel class path variable.
Adding Configuration Files
Informatica provides configuration and property files that provide information on the Web Services host machine
and on logging.
To load the configuration files to Siebel:
1. Locate the SDQConnector folder in your installable fileset.
2. Copy the files in this folder to <Siebel Server Root Directory>\SDQConnector. Create the SDQConnector
   folder if it does not exist.
3. Windows only. If using a Siebel thick client, copy the files to its SDQConnector folder. Create the
   SDQConnector folder if it does not exist.
Editing Configuration Files
You must edit the INFADQSiebel.cfg file that you copy to the Siebel SDQConnector folder so that the file identifies
the PowerCenter Web Services Hub that Data Quality for Siebel will use.
If you will connect to Data Quality for Siebel with the Siebel thick client, you must also edit the
<SiebelApplication>.cfg file in its bin/[install language] folder. For more information, see “Enable Cleansing and
Deduplication for a Thick Client” on page 25.
To edit the INFADQSiebel.cfg file:
1. Locate the INFADQSiebel.cfg file in your Siebel installation folder.
2. Open the file.
3. Append the name of your PowerCenter Web Services Hub host machine to the WebServiceHost parameter.
4. At the LogDirectory parameter, type the path to the folder where Data Quality for Siebel will create its log file.
5. Review the WebServicePort and LogLevel settings.
   The WebServicePort setting identifies a port on the web services host machine for PowerCenter use. The
   default number is 7333.
   The LogLevel setting specifies the quantity of logging information that Data Quality for Siebel will create. The
   default setting is Warning.
6. Restart the Siebel Server service.
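After editing the file, you may want to confirm from the Siebel machine that the host and port you entered accept connections. The following Python sketch only tests whether a TCP connection can be opened. It does not read INFADQSiebel.cfg, whose syntax is not reproduced here, and it does not verify that a Web Services Hub is what answers on the port. The function name is illustrative.

```python
import socket


def hub_reachable(host, port=7333, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Replace "localhost" with the value you set for WebServiceHost.
print(hub_reachable("localhost"))
```

A False result usually points to a wrong host name, a wrong port, a firewall rule, or a Web Services Hub that is not running.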
CHAPTER 4
Configuring Siebel for Cleansing and Deduplication
This chapter includes the following topics:
¨ Overview, 17
¨ Enable Data Quality Functionality for Siebel, 17
¨ Configure Siebel Vendor Parameters, 18
¨ Set Siebel Data Quality Parameters, 19
¨ Configure Business Components for Data Quality Operations, 20
¨ Generate Siebel Match Keys, 25
¨ Enable Cleansing and Deduplication for a Thick Client, 25
Overview
This chapter describes the setup and configuration procedures for cleansing and deduplication in Data Quality for
Siebel. The steps outlined in this chapter assume a standard Siebel installation.
Enable Data Quality Functionality for Siebel
To enable data quality functionality and Data Quality for Siebel on the Siebel system, you must perform the
following actions:
¨ Enable Siebel Data Quality at the Enterprise Level and the Object Manager Level.
¨ Register the Informatica library file with Siebel.
Register the INFADQSiebel Library
To register the INFADQSiebel library, follow the steps below.
1. Log in to Siebel with Administrator privileges.
2. Navigate to the Administration - Data Quality screen.
3. Select the Third Party Administration view.
4. In the Vendor List, create a new record with these values:
   Name        Informatica
   DLL Name    INFADQSiebel
Enable Siebel Data Quality
To enable Siebel Data Quality at the Enterprise Level and the Object Manager Level, follow the steps in Chapter 5
of the Siebel Data Quality Administration Guide in your Siebel 8 bookshelf.
When prompted for a vendor name (for example Vendor1), type Informatica.
Configure Siebel Vendor Parameters
You must configure a series of Siebel parameters that determine the types of data that Siebel passes to
Informatica applications.
To access the Vendor Parameters tab:
1.
Log in to Siebel with Administrator privileges.
2.
Navigate to the Administration - Data Quality screen.
3.
Select the Third Party Administration view.
4.
In the Vendor List, select Informatica.
5.
Below the Vendor List, select the Vendor Parameters tab.
Set Siebel Vendor Parameters
To set the vendor parameters:
Define the parameters using the names and values in this table. Names and values are case-sensitive.
Note: Refresh the match keys in Siebel after editing Token or Query Expressions.
Name | Value
Account DataCleanse Record Type | Account
Account DataCleansing Conflict Id Field | S_ORG_EXT.Conflict Id
Account DeDup Record Type | Account
Account Query Expression | Left([Name], 3)
Account Token Expression | Left([Name], 3)
Batch Max Num of Records | 200
UCM only: CUT Address DataCleanse Record Type; CRM only: Business Address DataCleanse Record Type | Business Address
Contact DataCleanse Record Type | Contact
Contact DataCleansing Conflict Id Field | S_CONTACT.Conflict Id
Contact DeDup Record Type | Contact
Contact Query Expression | Left([Last Name], 3)
Contact Token Expression | Left([Last Name], 3)
List Mgmt Prospective Contact DataCleanse Record Type | List Mgmt Prospective Contact
List Mgmt Prospective Contact Conflict Id Field | S_PRSP_CONTACT.Conflict Id
List Mgmt Prospective Contact DeDup Record Type | List Mgmt Prospective Contact
List Mgmt Prospective Contact Query Expression | Left([Last Name], 3)
List Mgmt Prospective Contact Token Expression | Left([Last Name], 3)
Max Search Spec Length | 1000
Realtime Max Num of Records | 200
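The Token and Query Expressions in the table take the first three characters of a name field. As a rough illustration only, the following Python sketch mimics that behavior, assuming that Left([Field], 3) returns the leftmost three characters of the field value. Siebel evaluates the real expressions internally; the function names and sample records here are hypothetical.

```python
def left(value, count):
    """Return the leftmost `count` characters, mirroring Left([Field], count)."""
    return value[:count]


def match_token(record, field, count=3):
    """Build a token from one field of a record (assumption: empty field gives an empty token)."""
    return left(record.get(field) or "", count)


# Hypothetical account records, for illustration only.
accounts = [
    {"Name": "Informatica Corp"},
    {"Name": "Information Ltd"},
    {"Name": "Siebel Systems"},
]

tokens = [match_token(account, "Name") for account in accounts]
print(tokens)
```

Records that share a token, such as the first two accounts here, would be grouped as candidates for comparison, which is why the match keys need to be refreshed after an expression changes.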
Set Siebel Data Quality Parameters
The following procedure describes how to set the Siebel Data Quality parameters. Ensure that the Siebel Data
Quality parameters match the settings listed in Table 1.
To set Siebel Data Quality parameters:
1.
Log in to Siebel with Administrator privileges.
2.
Navigate to the Administration - Data Quality screen.
3.
Select the Data Quality Settings view.
4.
Verify or apply the following settings for each value in this view:
Table 1. Siebel Data Quality Settings
Parameter | Description | Value
Enable DataCleansing | Determines whether real-time data cleansing is enabled for the Siebel Server the administrator is currently logged into. | Yes
Enable DeDuplication | Determines whether real-time data matching is enabled for the Siebel Server the administrator is currently logged into. | Yes
Force User Dedupe - Account | Determines whether duplicate records are displayed in a pop-up window when a user saves a new account record. The user can then merge duplicates. If the value is set to No, duplicates are not displayed in a pop-up window, but the user can merge duplicates in the Duplicate Accounts view. | Yes
Force User DeDupe - Contact | Determines whether duplicate records are displayed in a pop-up window when a user saves a new contact record. The user can then merge duplicates. If the value is set to No, duplicates are not displayed in a pop-up window, but the user can merge duplicates in the Duplicate Contacts view. | Yes
Force User DeDupe - List Mgmt | Determines whether duplicate records are displayed in a pop-up window when a user saves a new prospect record. The user can then merge duplicates. If the value is set to No, duplicates are not displayed in a pop-up window, but the user can merge duplicates in the Duplicate Prospects view. | Yes
Fuzzy Query Enabled | Determines whether fuzzy querying is enabled for the Siebel Server the administrator is currently logged into. | No
Fuzzy Query - Max Returned | Specifies the maximum number of records returned when a fuzzy query is performed. | 500
Match Threshold | Specifies the minimum score required for Siebel Data Quality to treat a pair of records as a likely match. | 70
If a parameter is not set in this view, the system will use the default value.
Note: The following parameters may appear in the Data Quality Settings View: Key Type, Search Type.
These parameters apply to the Oracle Data Quality Matching Server (SSA) and are not relevant to Data
Quality for Siebel.
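To make the Match Threshold setting concrete, the following Python sketch filters a list of scored candidates against a threshold of 70. It is an illustration under stated assumptions: the scores are produced by the matching process, not by this code, and the record identifiers, function name, and sort order are hypothetical.

```python
MATCH_THRESHOLD = 70  # the Match Threshold value listed in Table 1


def likely_matches(scored_candidates, threshold=MATCH_THRESHOLD):
    """Keep candidates whose score meets or exceeds the threshold, best score first."""
    kept = [(record_id, score) for record_id, score in scored_candidates if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical (record id, score) pairs, for illustration only.
candidates = [("1-B7", 70), ("1-C3", 69), ("1-A1", 92)]
result = likely_matches(candidates)
print(result)
```

Because the setting is described as a minimum score, a candidate that scores exactly 70 is kept in this sketch, while one that scores 69 is dropped.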
Configure Business Components for Data Quality Operations
Data Quality for Siebel operates on records from four business components: Account, Contact, List Management
Prospect, and CUT/Business Address. You must identify in Siebel the fields from each business component that
Informatica applications will cleanse or deduplicate. Only the fields you identify are passed to Informatica.
This process has three stages:
1.
Select Informatica as a data quality operations application vendor.
2.
Define the business components that use Informatica for selected data quality operations.
3.
Specify the business component data fields that Siebel sends to Informatica.
Select Informatica as the Data Quality Vendor
To select Informatica as the data quality vendor:
1.
Log in to Siebel with Administrator privileges.
2.
Navigate to the Administration - Data Quality screen.
3.
Select the Third Party Administration view.
4.
In the Vendor List, select Informatica.
Define Business Components and Data Quality Operations
To define the business components and their associated data quality operations:
1.
Ensure Informatica is selected as a data quality vendor.
2.
Below the Vendor List, click the BC Vendor Field Mapping tab.
3.
Under BC Operation, add a business component name and add one or more data quality operations. The
following table details the combinations of components and operations that you can use.
Table 2. Business Components and Data Quality Operations
Business Component Name | Operation
Account | Data Cleansing
Account | DeDuplication
CUT/Business Address | Data Cleansing
Contact | Data Cleansing
Contact | DeDuplication
List Mgmt Prospective Contact | Data Cleansing
List Mgmt Prospective Contact | DeDuplication
Note: The CUT/Business Address component should be disabled if address validation is enabled for Informatica
Data Quality for Siebel.
Mapping Business Component Data Fields to Informatica Data Fields
The following tables list the default field mappings for the Informatica processes in the Data Quality for Siebel
solution. The Business Component Field column contains the field names used by Siebel. The Mapped Field
column contains the field label used by Informatica.
Note: The mapped field names in these business screens must match the field names in the corresponding
PowerCenter mappings. If you edit the field names in your mappings, you must edit the business component field
names on these screens. The field values are case-sensitive.
Table 3. Account Deduplication Field Mappings
Business Component Field | Mapped Field
Dedup Token | Account.DedupToken
Id | Account.Id
Location | Account.Location
Name | Account.Name
Primary Account City | Account.City
Primary Account Country | Account.Country
Primary Account Postal Code | Account.PostalCode
Primary Account State | Account.State
Primary Account Street Address | Account.StreetAddress
Table 4. CUT/Business Address Cleansing Fields
Business Component Field | Mapped Field
City | BusAddr.City
Country | BusAddr.Country
Id | BusAddr.Id
Postal Code | BusAddr.PostalCode
State | BusAddr.State
Street Address | BusAddr.StreetAddress
Street Address 2 | BusAddr.StreetAddress2
Table 5. Account Cleansing Field Mappings
Business Component Field | Mapped Field
Dedup Token | Account.DedupToken
Location | Account.Location
Name | Account.Name
Region | Account.Region
Note: The CUT/Business address component should be disabled if address validation is enabled for Informatica
Data Quality for Siebel.
Table 6. Contact Deduplication Field Mappings
Business Component Field | Mapped Field
Account Location | Contact.AccountLocation
Dedup Token | Contact.DedupToken
First Name | Contact.FirstName
Id | Contact.Id
Last Name | Contact.LastName
Middle Name | Contact.MiddleName
Primary Account Name | Contact.AccountName
Primary City | Contact.City
Primary Country | Contact.Country
Primary Postal Code | Contact.PostalCode
Table 7. List Mgmt Prospect Cleansing Field Mappings
Business Component Field | Mapped Field
Account | ListMgtProspectContact.Account
City | ListMgtProspectContact.City
Country | ListMgtProspectContact.Country
First Name | ListMgtProspectContact.FirstName
Job Title | ListMgtProspectContact.JobTitle
Last Name | ListMgtProspectContact.LastName
Middle Name | ListMgtProspectContact.MiddleName
Postal Code | ListMgtProspectContact.PostalCode
Primary Account Location | ListMgtProspectContact.AccountLocation
State | ListMgtProspectContact.State
Street Address | ListMgtProspectContact.StreetAddress
Street Address 2 | ListMgtProspectContact.StreetAddress2
Table 8. List Mgmt Prospect Deduplication Field Mappings
Business Component Field | Mapped Field
Account | ListMgtProspectContact.Account
City | ListMgtProspectContact.City
Country | ListMgtProspectContact.Country
Dedup Token | ListMgtProspectContact.DedupToken
First Name | ListMgtProspectContact.FirstName
Id | ListMgtProspectContact.Id
Last Name | ListMgtProspectContact.LastName
Middle Name | ListMgtProspectContact.MiddleName
Postal Code | ListMgtProspectContact.PostalCode
Primary Account Location | ListMgtProspectContact.AccountLocation
State | ListMgtProspectContact.State
Street Address | ListMgtProspectContact.StreetAddress
Table 9. Contact Cleansing Field Mappings
Business Component Field | Mapped Field
First Name | Contact.FirstName
Job Title | Contact.JobTitle
Last Name | Contact.LastName
Middle Name | Contact.MiddleName
Generate Siebel Match Keys
As the first step in duplicate analysis, Siebel Data Quality searches the Siebel database for possible matches with
the driver record. To do so, SDQ does not use the full driver record but instead uses match key values.
Match keys are subsets of data selected from meaningful fields in the input or driver record. For example, a
person’s surname provides a meaningful match key for prospects and contacts, and an account name provides a
meaningful match key for account records.
Siebel Data Quality maintains an index of match keys for use in deduplication. Before you perform deduplication
with Data Quality for Siebel, you must generate a set of match keys or refresh the match keys so that they are
current.
For instructions on generating and refreshing match keys, refer to Chapter 6 of the Siebel Data Quality
Administration Guide.
Enable Cleansing and Deduplication for a Thick Client
Enable cleansing and deduplication for a Siebel thick client by editing the <SiebelApplication>.cfg file.
To edit the <SiebelApplication>.cfg file:
1.
Locate the <SiebelApplication>.cfg file. This file resides in the bin\[install language] folder of your Siebel thick
client installation, for example bin\ENU. If you have multiple language folders in your bin folder, edit the
<SiebelApplication>.cfg file (for example, UCM.cfg) in each such folder.
2.
Set the following configuration parameters in this file.
[DataCleansing]
Enable = TRUE
Type = Informatica
[Deduplication]
Enable = TRUE
Type = Informatica
3.
Restart the Siebel thick client.
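After editing the file, it can help to confirm that both sections carry the expected values. The following Python sketch checks the text of a .cfg file for the settings shown in step 2. It is a minimal illustration, not an Informatica or Siebel utility: the line-based parser, the treatment of lines starting with ";" as comments, and the function names are all assumptions.

```python
# Settings taken from step 2 of the procedure above.
REQUIRED = {
    "DataCleansing": {"Enable": "TRUE", "Type": "Informatica"},
    "Deduplication": {"Enable": "TRUE", "Type": "Informatica"},
}


def parse_sections(text):
    """Parse INI-style text into {section: {key: value}} (assumption: ';' starts a comment)."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return sections


def missing_settings(text):
    """Return (section, key, actual value) for every required setting that is absent or different."""
    found = parse_sections(text)
    problems = []
    for section, settings in REQUIRED.items():
        for key, expected in settings.items():
            actual = found.get(section, {}).get(key)
            if actual != expected:
                problems.append((section, key, actual))
    return problems


# Hypothetical file content with one incorrect value, for illustration only.
sample = """[DataCleansing]
Enable = TRUE
Type = Informatica
[Deduplication]
Enable = FALSE
Type = Informatica
"""
print(missing_settings(sample))
```

To check a real file, read its contents into a string and pass the string to missing_settings; an empty list means both sections match step 2.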
CHAPTER 5
Configuring Siebel for Address Validation
This chapter includes the following topics:
• Overview
• SIF File Import
• Adding Picks
• System Preferences Setup
• INFADQSiebel JAR Setup
• Batch Setup
• Verifying Your Address Validation Implementation
Overview
This chapter provides address validation configuration procedures for Data Quality for Siebel. Data Quality for
Siebel implements address validation through Siebel pick items on address screens. To configure the pick items,
you must have administrator access to Siebel and access to Siebel Tools. The steps outlined in this chapter
assume a standard Siebel installation.
Note: Before performing the address validation configuration steps in this chapter, ensure that you have
completed the configuration steps specified in these chapters:
• Chapter 2, “Installing and Configuring Informatica Components”
• Chapter 3, “Configuring Siebel”
SIF File Import
This section describes the process of importing the Informatica SIF file. This file provides customized settings
necessary for setting up Informatica address validation for Siebel.
To import the Informatica SIF file:
1.
In Siebel Tools, select Tools>Import from Archive.
2.
Browse to the INFADQSiebel.sif file in your installation fileset. Select the SIF file and click Import.
3.
Click Next.
4.
The dialog box displays a list of the objects to import. Click Next.
5.
The dialog box displays the objects to be inserted, modified, and deleted. Click Yes to continue.
6.
The dialog box lists the imported objects as the objects are installed. After all objects are installed, click Finish.
7.
Navigate to 'Tools' > 'Compile Projects'.
8.
Choose the 'Informatica' project from the list.
9.
Choose a Siebel Repository file to write to (e.g., siebel.srf or siebel_sia.srf) using the 'Browse' button.
10.
Click 'Compile'.
Adding Picks
This section describes the process of adding picks to Siebel fields. The example business components, applets,
and field names provided are relevant for clean Siebel installations.
Tip: To identify the name of the applet to set the pick for, right-click on the applet window and choose the option
that displays the source HTML. Within the HTML, search for the string ‘applet’ and identify the actual name of the
applet.
Modifying the Address Business Component
Prior to setting up picks for applets, you must add new fields and user properties to the main address business
component for your application. Perform the steps in the sections below to make these additions.
Adding New Fields
1.
In Siebel Tools, locate and check out the address business component associated with the applet.
2.
Navigate to 'Business Component' > 'Field'.
3.
Add a new Field 'ValidationSearch'.
4.
Set the 'Calculated' column to ‘True’ (checked).
5.
In the Calculated Value field, enter concatenated address fields using pipes to separate them:
[Street Address] + "|" + [Street Address 2] + "|" + [City] + "|" + [State] + "|" + [Postal Code] + "|" + [Country]
6.
Add a new Field 'MatchStatus'.
7.
Update the 'Column' column by choosing an unused field from the Address table, for example, ‘COMMENTS’.
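The ValidationSearch calculated value joins six address fields with pipe characters. The following Python sketch shows the string that results for a sample address. It is an illustration only: Siebel evaluates the calculated field itself, the sample address is hypothetical, and treating an empty field as an empty string is an assumption about how the concatenation behaves.

```python
# Field order taken from the calculated value in step 5.
ADDRESS_FIELDS = [
    "Street Address",
    "Street Address 2",
    "City",
    "State",
    "Postal Code",
    "Country",
]


def validation_search(address):
    """Join the address fields with '|' (assumption: a missing field contributes an empty string)."""
    return "|".join(address.get(field) or "" for field in ADDRESS_FIELDS)


# Hypothetical address record, for illustration only.
example = {
    "Street Address": "100 Main St",
    "City": "Redwood City",
    "State": "CA",
    "Postal Code": "94063",
    "Country": "USA",
}
search_value = validation_search(example)
print(search_value)
```

The empty segment between the two adjacent pipes marks the unused Street Address 2 field, so each field keeps a fixed position in the string.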
Adding New User Properties
1.
From the menu, choose 'View' - 'Options…', and then select the 'Object Explorer' tab.
2.
Make sure that the box beside 'Business Component' > 'Business Component User Prop' is checked.
3.
Choose the address business component you are working with (e.g., 'Business Address', 'CUT Address', or a
custom address BC).
4.
Navigate the tree to 'Business Component' > 'Business Component User Prop'.
5.
Determine if there are any existing 'On Field Update Set n' rows, where ‘n’ equals an integer.
• If 'On Field Update Set n' rows are present, take note of the highest value of 'n' for use in Step 6.
• If 'On Field Update Set n' rows are not present, use a value of '0' for 'n' in Step 6.
6.
Create a new record for each address field entered in “Adding New Fields”. Edit the columns to
match the settings described in the following table, ensuring that Name Column settings reflect the value of 'n'
from the previous step.
Field | Name Column (replace "n" with the number from the previous step) | Value Column
Street Address | On Field Update Set n+1 | "Street Address", "MatchStatus", "Unchecked"
Street Address 2 | On Field Update Set n+2 | "Street Address 2", "MatchStatus", "Unchecked"
City | On Field Update Set n+3 | "City", "MatchStatus", "Unchecked"
State | On Field Update Set n+4 | "State", "MatchStatus", "Unchecked"
Postal Code | On Field Update Set n+5 | "Postal Code", "MatchStatus", "Unchecked"
Country | On Field Update Set n+6 | "Country", "MatchStatus", "Unchecked"
7.
Compile the address business component into the SRF file.
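The numbering rule for the new user properties can be worked through with a short Python sketch. Given the highest existing value of n, it lists the six property names and values to enter in Siebel Tools. This is an illustration only; the function name is hypothetical and the records themselves are still created by hand.

```python
# Address fields in the order used by the user property table above.
ADDRESS_FIELDS = [
    "Street Address",
    "Street Address 2",
    "City",
    "State",
    "Postal Code",
    "Country",
]


def on_field_update_props(highest_n):
    """Return (name, value) pairs numbered n+1 through n+6 for the address fields."""
    props = []
    for offset, field in enumerate(ADDRESS_FIELDS, start=1):
        name = f"On Field Update Set {highest_n + offset}"
        value = f'"{field}", "MatchStatus", "Unchecked"'
        props.append((name, value))
    return props


# Example: the business component already has 'On Field Update Set 1' and 'On Field Update Set 2'.
props = on_field_update_props(2)
for name, value in props:
    print(name, "|", value)
```

With a highest existing value of 2, the new rows run from 'On Field Update Set 3' to 'On Field Update Set 8'. If no such rows exist, passing 0 yields 1 through 6.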
Adding Picks for Address Applets
This section describes the procedures for adding picks to address applets, including list applets and multi-value
group applets.
Update the Address Screen Applet
1.
Find the applet to change (e.g., 'Account Address Mvg Applet'). To search for the applet, use <CTRL>-Q on
the list of applets and search in the 'Name' column.
2.
Select the applet row. Ensure that you have the applet checked out.
3.
Navigate the tree to 'Applet' > 'List' > 'List Column'.
4.
Choose the 'Street Address' column in the right pane.
5.
Set the column 'Runtime flag' to 'True' (checked).
6.
Set the column 'Pick Applet' to 'Informatica Address Validation Pick' using the dropdown in the cell.
7.
Compile the applet into the SRF file.
Update the Parent Business Component
1.
Identify the parent business component of the applet by selecting ‘Help’ > ‘About View’ on the page the applet
launches from. The name of the parent business component (e.g., 'Account') appears after the “BusComp”
string.
2.
In Siebel Tools, locate and check out the parent business component.
3.
Navigate to 'Business Component' > 'Field'. If you are updating the parent BC for a list applet, skip to
Step 6. If you are updating the parent BC for a multi-value group applet, continue to Step 4.
4.
Multi-Value Group Applets only. Add a new record to the lower right pane with the following column values:
Column Name | Value
Name | MatchStatus
Multivalued | True (Checked)
Multivalue Link | Name of child BC (e.g., 'Business Address' or 'CUT Address')
Dest Field | MatchStatus
5.
Multi-Value Group Applets only. Add another record to the lower right pane with the following column values:
Column Name | Value
Name | ValidationSearch
Multivalued | True (Checked)
Multivalue Link | Name of child BC (e.g., 'Business Address' or 'CUT Address')
Dest Field | ValidationSearch
6.
Select the row named 'Street Address'.
7.
Set the 'Picklist' column to 'Informatica Address Picklist'.
8.
Navigate to the 'Field' > 'Pick Map' and add the following fields:
Name | Picklist Field | Sequence | Constrain
Street Address | Street Address | 1 | False
Street Address 2 | Street Address 2 | 2 | False
City | City | 3 | False
State | State | 4 | False
Postal Code | ZipCode | 5 | False
Country | Country | 6 | False
ValidationSearch | Search-spec | 7 | True (Checked)
MatchStatus | MatchStatus | 8 | False
9.
Compile the parent business component (e.g., Account) into the destination SRF file.
System Preferences Setup
This section describes the procedure for setting system preferences for Data Quality for Siebel.
1.
In the sitemap, navigate to 'Administration - Application' > 'System Preferences'.
2.
Ensure that the 'New' Button is enabled. If it is not enabled, then you will need to perform the following steps
to change the System Preferences object to allow inserts:
• Open Siebel Tools.
• Navigate to 'Business Component'.
• Find 'System Preferences'.
• Set the column 'No Insert' to False (Unchecked).
• Compile the System Preferences object into the SRF.
3.
If it is not possible to edit the Name or value field then you will need to perform the following steps to change
the System Preferences Applet:
• Open Siebel Tools.
• Navigate to 'Applet'.
• Find 'System Preferences'.
• Navigate to 'List' > 'List Column'.
• Set the 'Read Only' column to False (Unchecked) for all fields you want to edit.
• Compile the applet into the destination SRF file.
4.
Add the following rows:
Rows | Values
Informatica Batch MatchStatus | 'VALIDATED - BT'
Informatica Batch Org | Set to '0' (Disabled) or to '1' (Enabled)
Informatica Batch Org Fields | A comma-separated list of additional fields to be sent for validation; if no additional fields are required, set to "0"
Informatica Batch Org Search | Search used to identify addresses to be sent to the batch address validation process. For example, [MatchStatus] = "UNMATCHED - BT" AND [Country] = "United Kingdom" AND [City] = "London" OR [City] = "Guildford"
Informatica Batch Per | Set to '0' (Disabled) or to '1' (Enabled)
Informatica Batch Per Fields | A comma-separated list of additional fields to be sent for validation; if no additional fields are required, set to "0"
Informatica Batch Per Search | Search used to identify addresses to be sent to the batch address validation process. For example, [MatchStatus] = "UNMATCHED - BT" AND [Country] = "United Kingdom" AND [City] = "London" OR [City] = "Guildford"
Informatica Batch Size | Number of records to be sent in each batch set, e.g., '500'
Informatica Countries | A comma-separated list of the countries for which Informatica PowerCenter is configured to provide validation, for both realtime and batch processing. For example, 'USA,Germany,United Kingdom,France,Australia,Canada,Netherlands'
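Two of these preferences are easy to reason about with a small example: the comma-separated 'Informatica Countries' value and the 'Informatica Batch Size' value. The Python sketch below shows one way to read such settings. It is an illustration only; the guide does not describe how the workflow parses or batches records internally, so the whitespace trimming, the function names, and the chunking logic are assumptions.

```python
def parse_countries(value):
    """Split a comma-separated country list (assumption: surrounding spaces are ignored)."""
    return [name.strip() for name in value.split(",") if name.strip()]


def is_supported(country, countries_value):
    """Return True if the country appears in the configured list."""
    return country in parse_countries(countries_value)


def batches(records, batch_size):
    """Yield consecutive slices of at most `batch_size` records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Example value from the 'Informatica Countries' row above.
countries_value = "USA,Germany,United Kingdom,France,Australia,Canada,Netherlands"
print(is_supported("United Kingdom", countries_value))

# 1200 placeholder records split with a batch size of 500.
sizes = [len(batch) for batch in batches(list(range(1200)), 500)]
print(sizes)
```

A country that is missing from the list, or spelled differently from the value stored on the address record, would not be treated as supported in this sketch.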
INFADQSiebel JAR Setup
This section describes the procedures for setting up the INFADQSiebel JAR for Siebel thick clients and thin clients.
A 32-bit Java VM (version 1.5 or higher) must be installed on the Siebel machine to set up the INFADQSiebel JAR.
Locate and copy the system path of the JVM library file, as this must be added to the INFADQSiebel configuration
information during JAR setup.
Note: When following the instructions in the subsections below, replace the filenames in brackets with the full
paths to the relevant files. Ensure that these paths include the filename and file extension.
JAR Setup for the Siebel Thick Client
1.
Copy the following text:
[JAVA]
DLL=[Path To JVM.DLL]
CLASSPATH=[Path To Siebel.jar];[Path To SiebelJI_enu.jar];[Path To INFADQSiebel.jar]
VMOPTIONS= -Xrs -Djava.compiler=NONE
2.
Paste the text into a text editor. Replace the bracketed text with the file paths corresponding to your system
configuration.
3.
Copy the edited text and paste it into the <Siebel application>.cfg file (e.g., UCM.cfg).
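The bracketed placeholders in the [JAVA] section are replaced with full paths, and the CLASSPATH entries are separated by semicolons. The following Python sketch assembles the section text from a set of paths so that the separators and key names stay consistent. It is an illustration only: the function name is hypothetical and every path shown is a made-up example, not a default install location.

```python
def java_section(jvm_dll, jar_paths, vm_options="-Xrs -Djava.compiler=NONE"):
    """Build the [JAVA] section text in the layout shown in step 1."""
    lines = [
        "[JAVA]",
        f"DLL={jvm_dll}",
        "CLASSPATH=" + ";".join(jar_paths),
        f"VMOPTIONS= {vm_options}",
    ]
    return "\n".join(lines)


# Hypothetical paths, for illustration only; substitute the paths on your system.
section = java_section(
    r"C:\Java\jre\bin\client\jvm.dll",
    [
        r"C:\siebel\classes\Siebel.jar",
        r"C:\siebel\classes\SiebelJI_enu.jar",
        r"C:\siebel\classes\INFADQSiebel.jar",
    ],
)
print(section)
```

Each path includes the filename and extension, as the note at the start of this section requires.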
JAR Setup for the Siebel Thin Client
1.
Copy the following two commands:
[Directory to SiebSrv Bin]\srvrmgr /E [EnterpriseName] /g [GatewayName] /s [ServerName] /u SADMIN /p [SADMINPASS]
Create named subsystem JAVA for subsystem JVMSubSys with DLL=[Path To JVM.Dll], CLASSPATH=.;[Path To Siebel.jar];[Path To SiebelJI_enu.jar];[Path To INFADQSiebel.jar], VMOPTIONS=" -Xrs -Djava.compiler=NONE"
2.
Paste the text into a text editor. Replace the bracketed text with the file paths corresponding to your system
configuration.
3.
Individually copy the edited commands and paste them into a command shell on the thin client server.
Batch Setup
To set up batch address validation, you must publish the workflow. You must also ensure that the field used for
MatchStatus on the PER and ORG business components is the same field chosen for MatchStatus when setting up
realtime validation.
To set up batch address validation:
1.
In Siebel Tools, navigate the tree to 'Business Component'.
2.
Locate and check out the Informatica Business Component that you are using for Batch ('Informatica Address
Org' or 'Informatica Address Per').
3.
Navigate the tree to 'Field'.
4.
In the lower right pane, select the row named 'MatchStatus'.
5.
In the 'Column' column select the name of the database column in which MatchStatus is to be stored. This
should be the same column selected when modifying the Address business component.
6.
Check in the Business Component and Compile to the destination SRF file.
7.
In Siebel Tools, navigate the tree to ‘Projects’.
8.
Search for the Informatica project.
9.
Ensure that the locked column is set to true (checked).
10.
Navigate to ‘Tools > Check In ...’
11.
In the Check In dialog, select the Informatica project and click the ‘Check In’ button.
12.
In Siebel Site Map, navigate to Administration - Business Process > Workflow Deployment.
13.
Query for Informatica*.
14.
Choose the Informatica Address Validation Workflow and click the Activate button.
Verifying Your Address Validation Implementation
This section contains instructions on verifying your address validation implementation for both realtime and batch
scenarios.
Verifying a Realtime Address Validation Implementation
Before verifying your address validation implementation for realtime scenarios, you must complete all of the
configuration instructions specified in this chapter, with the exception of the instructions specified in “Batch
Setup” on page 32. Batch setup procedures are not a prerequisite for realtime address validation.
The following instructions assume that address search functionality is enabled for an ‘Account’ address applet.
To verify realtime address validation:
1.
Navigate to the ‘Account’ screen.
2.
Select an existing account.
3.
Open the Address applet and select an existing address.
4.
Select the pick icon in the 'Street Address' field.
After you select this icon, the applet displays a list of matching addresses.
5.
If an address has a check mark in the 'Refine' column, select the record and click the 'Refine' button to
identify a full address. It may be necessary to refine the record more than once to locate a full address.
6.
Select the full address you require and click the 'Pick' button to replace the existing address.
7.
If the applet does not present a suitable address, click the 'Cancel' button to return to the original address.
Verifying a Batch Address Validation Implementation
Before verifying your address validation implementation for batch scenarios, you must complete all of the
configuration instructions specified in this chapter, with the exception of the instructions specified in “Adding
Picks”. Adding a pick is not a prerequisite for batch address validation.
To verify a batch address validation implementation, perform the procedures described below.
Setting Up a Batch Address Validation Workflow
To set up a batch address validation workflow:
1.
Navigate to Site Map > 'Administration - Application' > 'System Preferences'.
2.
Depending on which underlying Siebel tables store addresses in your Siebel system, place a '1' in the
value field of the entry named either 'Informatica Batch Org' or 'Informatica Batch Per'.
3.
Update the related search specification to identify the address records you wish to process. For example, set
'Informatica Batch Org Search' to '[Country] = "USA" AND [Region] = "CA"' to select all records from the Org
table that are from California, USA.
4.
In the entry named 'Informatica Batch Status', set the status values that will be updated with corrected data
during the address validation process. For example, to update records that are identified as valid or as
corrected, set this entry to 'VALIDATED - BT,CORRECTED - BT'.
Executing the Address Validation Workflow
Execute the address validation workflow using one of the following methods:
1.
Execute the workflow through the srvrmgr console. Run the following command from the srvrmgr console:
run task for component WfProcMgr with processName='Informatica Address Validation'
2.
Execute the workflow through a browser. To execute the batch workflow through a browser, perform the
following actions sequentially:
• Navigate to Administration - Server Management > Jobs.
• Create a new Workflow Process Manager job with the following Job Parameter settings:
Name: Workflow Process Name
Value: Informatica Address Validation
• Submit the job.