Informatica Data Quality for Siebel 9.1.0 HotFix 2
Informatica Data Quality for Siebel User Guide
Version 9.1.0 HotFix 2
August 2011

Copyright (c) 1998-2011 Informatica. All rights reserved.

This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international Patents and other Patents Pending.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.

The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in writing.

Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange, Informatica On Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and Informatica Master Data Management are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world.
All other company and product names may be trade names or trademarks of their respective owners. Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights reserved.Copyright © Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright Isomorphic Software. All rights reserved. Copyright © Meta Integration Technology, Inc. All rights reserved. Copyright © Intalio. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems Incorporated. All rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All rights reserved. Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights reserved. Copyright © Glyph & Cog, LLC. All rights reserved. Copyright © Thinkmap, Inc. All rights reserved. Copyright © Clearpace Software Limited. All rights reserved. Copyright © Information Builders, Inc. All rights reserved. Copyright © OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved. Copyright Cleo Communications, Inc. All rights reserved. Copyright © International Organization for Standardization 1986. All rights reserved. Copyright © ej-technologies GmbH . All rights reserved. Copyright © Jaspersoft Corporation. All rights reserved. This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and other software which is licensed under the Apache License, Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. 
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under the GNU Lesser General Public License Agreement, which may be found at http:// www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California, Irvine, and Vanderbilt University, Copyright © 1993-2006, all rights reserved. This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and redistribution of this software is subject to terms available at http://www.openssl.org and http://www.openssl.org/source/license.html. This product includes Curl software which is Copyright 1996-2007, Daniel Stenberg, <[email protected]>. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies. The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. 
Permissions and limitations regarding this software are subject to terms available at http://www.dom4j.org/ license.html. The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://dojotoolkit.org/license. This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html. This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at http:// www.gnu.org/software/ kawa/Software-License.html. This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project Copyright © 2002 Cable & Wireless Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php. This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are subject to terms available at http:/ /www.boost.org/LICENSE_1_0.txt. This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at http:// www.pcre.org/license.txt. This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.eclipse.org/org/documents/epl-v10.php. 
This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://www.stlport.org/doc/ license.html, http://www.asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://httpunit.sourceforge.net/doc/ license.html, http://jung.sourceforge.net/license.txt , http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/license-agreements/fuse-message-broker-v-5-3-licenseagreement; http://antlr.org/license.html; http://aopalliance.sourceforge.net/; http://www.bouncycastle.org/licence.html; http://www.jgraph.com/jgraphdownload.html ; http:// www.jcraft.com/jsch/LICENSE.txt. http://jotm.objectweb.org/bsd_license.html; http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231; http://www.slf4j.org/ license.html; http://developer.apple.com/library/mac/#samplecode/HelpHook/Listings/HelpHook_java.html; http://www.jcraft.com/jsch/LICENSE.txt; http:// nanoxml.sourceforge.net/orig/copyright.html; http://www.json.org/license.html; http://forge.ow2.org/projects/javaservice/; http://www.postgresql.org/about/license.html; http:// www.sqlite.org/copyright.html; http://www.tcl.tk/software/tcltk/license.html; http://www.jaxen.org/faq.html; http://www.jdom.org/docs/faq.html; and http://www.slf4j.org/ license.html. 
This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution License (http://www.opensource.org/licenses/cddl1.php), the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary Code License Agreement Supplemental License Terms, the BSD License (http://www.opensource.org/licenses/bsd-license.php), the MIT License (http://www.opensource.org/licenses/mit-license.php) and the Artistic License (http://www.opensource.org/licenses/artistic-license-1.0).

This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab. For further information please visit http://www.extreme.indiana.edu/.

This product contains runtime modules of IBM DB2 Driver for JDBC and SQLJ (c) Copyright IBM Corporation 2006. All rights reserved.

This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775; 6,640,226; 6,789,096; 6,820,077; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,254,590; 7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422; 7,720,842; 7,721,270; and 7,774,791, international Patents and other Patents Pending.

DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of noninfringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors.
The information in this software and documentation is subject to change at any time without notice.

NOTICES

This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software Corporation ("DataDirect") which are subject to the following terms and conditions:

1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.

Part Number: IDQ-SEI-91000-HF2-0001

Table of Contents

Preface
  Informatica Resources
    Informatica Customer Portal
    Informatica Documentation
    Informatica Web Site
    Informatica How-To Library
    Informatica Knowledge Base
    Informatica Multimedia Knowledge Base
    Informatica Global Customer Support
Chapter 1: Introduction to Data Quality for Siebel
  Overview
  Understanding Data Cleansing, Deduplication, and Address Validation
  Logical Architecture
    About Workflows, Mappings, and Mapplets
  Cleansing and Deduplication Process Flows
    Real-Time and Batch Cleansing
    Real-Time Deduplication
    Batch Deduplication
    Using Staging Tables in Batch Deduplication
  Address Validation Process Flows
    Realtime Address Validation
    Batch Address Validation
  Physical Architecture and Installable Components
    Siebel Components
    Informatica Components
    Data Quality for Siebel Components
    Bill of Materials
    System Requirements
Chapter 2: Installing and Configuring Informatica Components
  Overview
    Install Sequence
  Install PowerCenter Client and Server Components
    Post-Installation Requirements
  Create a PowerCenter Web Services Hub
  Create Staging Tables for Batch Duplicate Analysis
    Pre-Installation Requirements
    Character Encoding Requirements
  Install Informatica Data Quality Reference Tables
  Importing PowerCenter Workflows
    Pre-Import Requirements
    Steps to Import
    Post-Import Requirements
  Configuring the Siebel Parameter File
    Verify the Index Database Type
    Update the Batch Deduplication Settings
  Installing Reference Data
Chapter 3: Configuring Siebel
  Overview
  Adding Library Files
  Adding the JAR File
  Adding Configuration Files
  Editing Configuration Files
Chapter 4: Configuring Siebel for Cleansing and Deduplication
  Overview
  Enable Data Quality Functionality for Siebel
    Register the INFADQSiebel Library
    Enable Siebel Data Quality
  Configure Siebel Vendor Parameters
    Set Siebel Vendor Parameters
  Set Siebel Data Quality Parameters
  Configure Business Components for Data Quality Operations
    Select Informatica as the Data Quality Vendor
    Define Business Components and Data Quality Operations
  Generate Siebel Match Keys
  Enable Cleansing and Deduplication for a Thick Client
Chapter 5: Configuring Siebel for Address Validation
  Overview
  SIF File Import
  Adding Picks
    Modifying the Address Business Component
    Adding Picks for Address Applets
  System Preferences Setup
  INFADQSiebel JAR Setup
    JAR Setup for the Siebel Thick Client
    JAR Setup for the Siebel Thin Client
  Batch Setup
  Verifying Your Address Validation Implementation
    Verifying a Realtime Address Validation Implementation
    Verifying a Batch Address Validation Implementation
Index

Preface

The Informatica Data Quality for Siebel Guide is written for Siebel system administrators and other users who install and set up Data Quality for Siebel and who configure Siebel applications to communicate with Data Quality for Siebel. This guide assumes that you have an understanding of data cleansing, deduplication, and address validation capabilities in real-time and batch scenarios.

Informatica Resources

Informatica Customer Portal

As an Informatica customer, you can access the Informatica Customer Portal site at http://mysupport.informatica.com.
The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica How-To Library, the Informatica Knowledge Base, the Informatica Multimedia Knowledge Base, Informatica Product Documentation, and access to the Informatica user community.

Informatica Documentation

The Informatica Documentation team makes every effort to create accurate, usable documentation. If you have questions, comments, or ideas about this documentation, contact the Informatica Documentation team through email at [email protected]. We will use your feedback to improve our documentation. Let us know if we can contact you regarding your comments.

The Documentation team updates documentation as needed. To get the latest documentation for your product, navigate to Product Documentation from http://mysupport.informatica.com.

Informatica Web Site

You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.

Informatica How-To Library

As an Informatica customer, you can access the Informatica How-To Library at http://mysupport.informatica.com. The How-To Library is a collection of resources to help you learn more about Informatica products and features. It includes articles and interactive demonstrations that provide solutions to common problems, compare features and behaviors, and guide you through performing specific real-world tasks.

Informatica Knowledge Base

As an Informatica customer, you can access the Informatica Knowledge Base at http://mysupport.informatica.com. Use the Knowledge Base to search for documented solutions to known technical issues about Informatica products.
You can also find answers to frequently asked questions, technical white papers, and technical tips. If you have questions, comments, or ideas about the Knowledge Base, contact the Informatica Knowledge Base team through email at [email protected].

Informatica Multimedia Knowledge Base

As an Informatica customer, you can access the Informatica Multimedia Knowledge Base at http://mysupport.informatica.com. The Multimedia Knowledge Base is a collection of instructional multimedia files that help you learn about common concepts and guide you through performing specific tasks. If you have questions, comments, or ideas about the Multimedia Knowledge Base, contact the Informatica Knowledge Base team through email at [email protected].

Informatica Global Customer Support

You can contact a Customer Support Center by telephone or through the Online Support. Online Support requires a user name and password. You can request a user name and password at http://mysupport.informatica.com.

Use the following telephone numbers to contact Informatica Global Customer Support:

North America / South America
  Toll Free
    Brazil: 0800 891 0202
    Mexico: 001 888 209 8853
    North America: +1 877 463 2435

Europe / Middle East / Africa
  Toll Free
    France: 0805 804632
    Germany: 0800 5891281
    Italy: 800 915 985
    Netherlands: 0800 2300001
    Portugal: 800 208 360
    Spain: 900 813 166
    Switzerland: 0800 463 200
    United Kingdom: 0800 023 4632
  Standard Rate
    Belgium: +31 30 6022 797
    France: +33 1 4138 9226
    Germany: +49 1805 702 702
    Netherlands: +31 306 022 797
    United Kingdom: +44 1628 511445

Asia / Australia
  Toll Free
    Australia: 1 800 151 830
    New Zealand: 09 9 128 901
  Standard Rate
    India: +91 80 4112 5738

CHAPTER 1

Introduction to Data Quality for Siebel

This chapter includes the following topics:
- Overview
- Understanding Data Cleansing, Deduplication, and Address Validation
- Logical Architecture
- Cleansing and Deduplication Process Flows
- Address Validation Process Flows
- Physical Architecture and Installable Components

Overview

Siebel acts as a unified source of account, customer, and prospect data across an enterprise. Data Quality for Siebel enhances your ability to create and maintain reliable and duplicate-free data in the Siebel system.

Data Quality for Siebel applies the data quality management capabilities of Informatica applications to new record data entering the Siebel system and to data stored in the system. It integrates with the Siebel Data Quality (SDQ) Universal Connector through its application programming interface to deliver enhanced data cleansing, deduplication, and address validation capabilities in real time and in batch mode.

On These Record Types... | You Can Perform These Data Quality Operations
Account Customer List Management Prospect | Real-Time Cleansing, Batch Cleansing, Real-Time Deduplication, Batch Deduplication
CUT/Business Address | Real-Time and Batch Cleansing
CUT/Business Address | Real-Time and Batch Address Validation

The data cleansing processes in Data Quality for Siebel correct and standardize record data values. The data deduplication processes identify duplicate records in the system and return these records for evaluation. The address validation processes validate records against valid addresses in a reference dataset. You can customize the data cleansing, deduplication, and address validation rules to suit the requirements of your organization.

Note: Siebel provides data consolidation and record merging rules as part of its configuration and setup. Data Quality for Siebel provides match scores to Siebel to facilitate this functionality.

Understanding Data Cleansing, Deduplication, and Address Validation

The data cleansing, deduplication, and address validation functionality in Data Quality for Siebel focuses on name and address data.

Data cleansing performs standardization operations on the input record so that it meets user standards and requirements.
Standardization harmonizes variations in customer terms, such as Doctor/Dr and Street/St, and removes extraneous punctuation. Data Quality for Siebel performs data cleansing in real time and batch modes. In each case, the Siebel Universal Connector sends data to Data Quality for Siebel one record at a time.

Data deduplication identifies potential duplicates in record data provided by Siebel to Informatica applications. Data Quality for Siebel performs data deduplication in real time and batch modes. When processing a new record in real time, Data Quality for Siebel performs data cleansing before data deduplication. The deduplication process begins when the cleansing process completes.

Address validation enhances the address cleansing and verification capabilities of Data Quality for Siebel by adding Informatica address validation components to the installed solution. Full address validation requires software and reference datasets sourced by Informatica from third-party address reference specialists. In real-time mode, users can interactively verify addresses against an address validation reference dataset in order to choose from valid address matches. In batch mode, Data Quality for Siebel non-interactively compares input record addresses to the reference dataset and then returns a validated or corrected address with status or error code information appended.

Logical Architecture

Data Quality for Siebel communicates with the Siebel system through a library file called INFADQSiebel that resides locally to the Universal Connector component of Siebel Data Quality. The library file “wraps” the record data and instructions it receives from the Universal Connector in a SOAP (Simple Object Access Protocol) envelope and sends them as XML documents to PowerCenter. PowerCenter Web Services Hub receives the data and sends it to a PowerCenter workflow associated with the required data process.
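The wrapping step described above can be pictured with a short Python sketch. This is only an illustration of what placing record data inside a SOAP 1.1 envelope looks like; it is not the INFADQSiebel library's implementation, and the operation name `CleanseAccount`, the field names, and the helper `wrap_record` are hypothetical and not taken from the product.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def wrap_record(operation, record):
    """Wrap one record and an operation name in a SOAP 1.1 envelope.

    `operation` and the keys of `record` are hypothetical names used for
    illustration; they must be valid XML element names.
    """
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    request = ET.SubElement(body, operation)
    for field, value in record.items():
        ET.SubElement(request, field).text = value
    return ET.tostring(envelope, encoding="unicode")


# One record at a time, as the Universal Connector sends it.
xml_doc = wrap_record("CleanseAccount", {"Name": "Acme Corp.", "City": "Dublin"})
print(xml_doc)
```

In the real integration, the element names and structure of the request would be dictated by the web service definition that the PowerCenter Web Services Hub exposes for each workflow, not chosen freely as they are here.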
Depending on the data process involved, PowerCenter may call one or several workflows to process the incoming data. Each workflow may contain one or several mappings that include mapplets implementing data quality processes. These mapplets define the low-level cleansing, deduplication, and address validation operations performed on the Siebel data.

The Data Quality for Siebel installation fileset includes Data Quality for Siebel mappings. These mappings do not represent the only types of cleansing, deduplication, and address validation you can perform with Informatica applications, nor do they represent the complete extent of Informatica’s data quality capabilities. The mappings are provided as a useful sample of these capabilities. To edit the mappings, or create mappings or mapplets specific to your business processes, use Informatica Data Quality. For more information, see “Data Quality for Siebel Mappings and Mapplets”.

The following graphic illustrates the architecture on the Siebel side:

Figure 1. Data Quality for Siebel Architecture, Siebel Orientation

The following graphic illustrates the architecture on the Informatica side:

Figure 2. Data Quality for Siebel Architecture, Informatica Orientation

About Workflows, Mappings, and Mapplets

Three types of Informatica processes are involved in running data cleansing, deduplication, and address validation tasks on data received from Siebel: workflows, mappings, and mapplets.

Workflow. A set of instructions informing PowerCenter to run one or more tasks. It can be triggered by a PowerCenter user, by the arrival of data in real time, or by a scheduler. Workflows are composed of tasks. Tasks that contain mappings are called sessions. Data Quality for Siebel provides several workflows for import to the PowerCenter repository. These workflows contain one or more session tasks that specify the mappings that run on the input data.
Data Quality for Siebel can run one or several mappings in sequence to perform cleansing, deduplication, and address validation on input data. Workflows are created in PowerCenter Workflow Manager and saved in the PowerCenter repository. Some workflows run continuously to maximize speed of data throughput between Siebel and Informatica applications.

Mapping. A set of data input (source) parameters, data movement or transformation instructions, and data output (target) parameters that can be applied to a dataset in PowerCenter. Mappings are created in PowerCenter Designer and saved in the PowerCenter repository.

Mapplet. A mapplet contains a set of input (source) parameters, data analysis, enhancement, or deduplication instructions, and output (target) parameters. Mapplets are stored in the PowerCenter repository for use in mappings. Data Quality for Siebel includes pre-built mapplets containing data quality processes.

Data Quality for Siebel Mappings and Mapplets

Use Informatica Data Quality to generate Data Quality for Siebel mappings and mapplets. The Data Quality for Siebel installation fileset includes Data Quality for Siebel mappings. You can customize the mappings, or you can create mappings and mapplets specific to your business processes. Save Data Quality for Siebel mappings and mapplets into the PowerCenter repository.

Cleansing and Deduplication Process Flows

When you enter a new record to the Siebel system, Data Quality for Siebel cleanses the record and returns the cleansed record to Siebel. When Siebel receives the cleansed record, the data deduplication processes begin.

Note: You cannot create a record with the same name as an existing record. If Siebel finds a record that matches the new record name, it displays a message stating that new record names must be unique.

Real-Time and Batch Cleansing

The data cleansing process flow is identical for real-time and batch scenarios.
In each case, records are passed one at a time to Informatica. Real-time data cleansing enhances the usability of new records as they enter the Siebel system. Batch cleansing enhances the usability of records present in the system.

The data flow is as follows:
1. Real time. You enter or edit a record in the Siebel system.
   -or-
   Batch. Siebel begins a batch cleansing job.
2. Siebel calls its Data Quality Manager, which instructs Siebel Data Quality to initiate the data cleansing operation.
3. The Siebel Universal Connector sends the record data to the INFADQSiebel library file, which passes the record information to the PowerCenter Web Services Hub in a SOAP envelope.
4. PowerCenter Web Services Hub routes the input data to a task in a PowerCenter workflow.
5. The data is routed to a PowerCenter mapping containing a data cleansing mapplet. When the data reaches the mapplet instructions, PowerCenter cleanses the data according to these instructions.
6. The same workflow task returns the cleansed data to the PowerCenter Web Services Hub.
7. The PowerCenter Web Services Hub returns the cleansed data to the Siebel system through the INFADQSiebel library file.

Real-Time Deduplication

This process begins when Siebel receives the cleansed record from the INFADQSiebel library file. The process flow is as follows:
1. Siebel calls its Data Quality Manager, which instructs Siebel Data Quality to initiate the deduplication operation. Siebel Data Quality performs initial duplicate analysis on its data to identify possible matches for the record. These possible matches are called candidate records. The input record is called the driver record.
2. The Siebel Universal Connector sends the driver and candidate records to the INFADQSiebel library file, which passes them to the PowerCenter Web Services Hub in a SOAP envelope.
3. PowerCenter Web Services Hub routes the input data to a task in a PowerCenter workflow.
4. The data is routed to a PowerCenter mapping containing a data deduplication mapplet. When the data reaches the mapplet instructions, PowerCenter performs duplicate analysis according to the instructions. Informatica Data Quality operations calculate a set of match scores indicating the level of similarity between each driver-candidate pair. Informatica Data Quality returns the match scores and a unique ID for each pair. It does not return the original record data.
5. A workflow task returns the data results to the PowerCenter Web Services Hub, which returns the data to the Siebel system.
6. Siebel displays onscreen the candidate records that meet or exceed the match threshold score set in the Siebel system. You can configure the columns presented on this screen within the Siebel application using Siebel Tools. You can edit the match threshold value in Siebel.
7. You review the results of the process and decide how to proceed. You can select the original record for addition to the database or a candidate record to be merged with the information entered on the new record.

Batch Deduplication

In batch duplicate analysis, the process flow is as follows:
1. Siebel begins a batch cleansing job and calls its Data Quality Manager, which instructs Siebel Data Quality to initiate the deduplication operation.
2. The Siebel Universal Connector sends records to the INFADQSiebel library file in one or more batches. The default upper limit of records per batch is 200. This value is set in Siebel. The INFADQSiebel library file passes these records to the PowerCenter Web Services Hub in a SOAP envelope.
3. PowerCenter Web Services Hub routes the input data to a task in a PowerCenter workflow.
4. The data is routed to a PowerCenter mapping that writes the data to staging tables. Informatica Data Quality performs duplicate analysis on the data in these tables. Informatica Data Quality analyzes the degrees of similarity between records that share a common Siebel Dedup Token.
Informatica Data Quality compares each record against each other record with the same Dedup Token and calculates a match score for each record pair. PowerCenter writes the match scores and a unique ID for each record pair to the staging database.
5. A workflow task returns the data results to the PowerCenter Web Services Hub, which returns the scores and IDs to Siebel. It does not return the original record data.

Using Staging Tables in Batch Deduplication

In data cleansing and real-time deduplication, PowerCenter operates on data in XML. In batch deduplication, PowerCenter writes the data to a staging table before Informatica Data Quality performs duplicate analysis. PowerCenter then returns the results of the analysis to Siebel as XML. You must create the staging tables that will hold this data. For more information, see “Create Staging Tables for Batch Duplicate Analysis”.

Address Validation Process Flows

Data Quality for Siebel uses AddressDoctor to perform address validation in both realtime and batch mode. Previously, Data Quality for Siebel used QAS Pro Web to perform all realtime address validation.

Data Quality for Siebel performs realtime and batch address validation for the countries indicated by the following ISO codes:
- AUS
- CAN
- DEU
- FRA
- GBR
- NLD
- USA

Realtime Address Validation

In realtime address validation, the process flow is as follows:
1. A Siebel user selects the pick icon in the Address field.
2. If the country is supported by and configured for address validation, Siebel passes the address data to the INFADQSiebel JAR through a Java Business Service.
3. The INFADQSiebel JAR passes the data to the INFADQSiebel library, which passes the record to the PowerCenter Web Services Hub in a SOAP envelope.
4. The PowerCenter Web Services Hub routes the input data to a task in the PowerCenter workflow.
5. The data is routed to a PowerCenter mapping that uses an Address Validator transformation to analyze the data.
6. The response from the PowerCenter mapping is returned to the Web Services Hub, which returns a list of matching addresses and their scores to Siebel.
7. The matching addresses and scores are presented to the user as a list in Siebel.

Batch Address Validation

In batch address validation, the process flow is as follows:
1. The Informatica Address Validation workflow is executed using the Siebel 'Workflow Process Manager' job.
2. The Informatica Address Validation workflow searches for addresses matched by the 'Informatica Batch Org' or 'Informatica Batch Per' searches defined in System Preferences. The workflow sends matches to the INFADQSiebel JAR in one or more batches. The upper limit of records per batch is configurable.
3. The INFADQSiebel JAR passes the data to the INFADQSiebel library, which passes the record to the PowerCenter Web Services Hub in a SOAP envelope.
4. The PowerCenter Web Services Hub routes the input data to a task in the PowerCenter workflow.
5. The data is routed to a PowerCenter mapping, which sends the data to an appropriate address validation engine based on the country that the address is from.
6. The address data is compared to reference data and, if possible, corrected by the address validation engine. The address validation engine generates a validation status and returns the updated address to the Web Services Hub.
7. The Web Services Hub returns the address and validation status to Siebel.
8. Siebel updates the address data only if the validation status is specified in the 'Informatica Batch Status' entry in System Preferences.
9. The MatchStatus column of the address is updated with the validation status value returned.
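The last two steps of the batch flow amount to a conditional update. The Python sketch below illustrates that logic under stated assumptions: the status codes "V", "C", and "N" are hypothetical placeholders, the dictionary layout is invented for the example, and the sketch assumes that MatchStatus is recorded whether or not the address fields are updated, which the steps above do not state explicitly.

```python
def apply_validation_result(address: dict, result: dict, update_statuses: set) -> dict:
    """Return the address row as it would look after one batch validation result."""
    updated = dict(address)
    # Step 8: overwrite address fields only when the returned status is one of
    # the statuses listed in the 'Informatica Batch Status' preference.
    if result["status"] in update_statuses:
        updated.update(result["address"])
    # Step 9: record the returned status (assumed here to happen in every case).
    updated["MatchStatus"] = result["status"]
    return updated


# Hypothetical status codes and sample data.
allowed = {"V", "C"}
original = {"Street": "1 Main Stret", "City": "Springfield", "MatchStatus": None}
corrected = {"status": "C", "address": {"Street": "1 Main Street"}}
rejected = {"status": "N", "address": {"Street": "???"}}

row_after_correction = apply_validation_result(original, corrected, allowed)
row_after_rejection = apply_validation_result(original, rejected, allowed)
```

In this sketch a rejected result leaves the street unchanged but still stamps the status, so an administrator could later query for addresses that failed validation.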
Physical Architecture and Installable Components

The system architecture that encompasses Data Quality for Siebel comprises applications and files from Siebel and Informatica.

Siebel Components

Data Quality for Siebel uses the following Siebel components:
- Siebel system. Server-side Siebel instance, including the database.
- Data hubs. Used for managing data in the database. Supported hubs are Siebel Universal Customer Master and Siebel Customer Relationship Management.
- Data entry client. A client data entry tool for Siebel. Thick client users must load library and configuration files to their machines to use Data Quality for Siebel. Thin client users do not need to install these files.
- Data Quality Business Services. Data hubs use business services to call data quality processes on input data.
- Siebel Data Quality. Application that manages data cleansing and deduplication operations and communicates with third-party applications such as Data Quality for Siebel.

Informatica Components

Data Quality for Siebel uses server-side and client-side components from Informatica PowerCenter. Data Quality for Siebel uses the following server-side components:
- Repository Service
- Integration Service
- Web Services Hub

The Repository and Integration Services manage and run data processes in PowerCenter. The Web Services Hub is a gateway that makes PowerCenter functionality available to external client applications through web services. The Web Services Hub and the web services it hosts comprise the Web Services Provider.

Data Quality for Siebel also uses the client-side Repository Manager component. Use the Repository Manager to import the required workflows to the PowerCenter repository.
Ensure that your license includes the following PowerCenter options:
- Data Cleansing Option
- Real-Time Option

Data Quality for Siebel Components

Data Quality for Siebel provides the following components, which you install separately from the Informatica components above. Locate these components in the Data Quality for Siebel install folders.
- Informatica Data Quality Library and Configuration Files. Data Quality for Siebel uses library and configuration files to enable communication between the Universal Connector and the PowerCenter Web Services Hub. Locate the library files in the bin folder of the installable fileset. Locate the configuration files in the SDQConnector folder of the installable fileset.
- PowerCenter Workflows. The data quality mapplets that perform data cleansing, deduplication, and address validation operations are embedded in PowerCenter mappings. PowerCenter runs these mappings through workflow tasks. The workflows and their constituent tasks, mappings, and mapplets are saved as a single XML file for import to the PowerCenter repository. Locate the workflow XML file in the Workflows folder of the installable fileset.
- Informatica Data Quality Reference Tables. Data quality mapplets use reference tables in cleansing operations. Locate the reference tables in the Dictionaries folder of the installable fileset.
- Informatica Data Quality Mappings. Backup copies of the PowerCenter mappings. You do not need to install these files unless you want to edit the mappings in Data Quality and generate new mapplets for your solution.
- Staging Database Table Schema. PowerCenter uses staging tables when performing duplicate analysis in batch mode. You must configure PowerCenter to use tables with the required schema. The schema is contained in a file named <DatabaseType>-ddl.txt, where DatabaseType refers to the installed database. Locate this file in the Database folder of the installable fileset.
Bill of Materials

You receive your Data Quality for Siebel components in a compressed folder. The following table describes the contents of this folder:

Sub-folder Name     File Type
[root folder]       Readme.txt
bin                 Library files
Database            Database schema files
Dictionaries        XML file containing reference tables
Java                Informatica JAR files
Resources           XML file containing data quality mappings
SDQConnector        Configuration and property files
Sif                 Siebel installation file
Workflows           PowerCenter workflow XML
Xsd                 Web service data description files

System Requirements

The following general requirements apply to Data Quality for Siebel:
- Platform requirements. Data Quality for Siebel installs on 32-bit Windows, Linux, and AIX platforms.
- Application requirements. Data Quality for Siebel comprises applications and files associated with Siebel, Informatica PowerCenter, and Informatica Data Quality. The system requirements for each of these applications also apply to the components installed on their respective machines. Consult the user documentation for these applications for full information.
- Character encoding requirements. The staging database used by Informatica for batch duplicate analysis must be configured to accept data encoded in the UTF-8 encoding of Unicode.
- Database requirements. Data Quality for Siebel requires one of the following databases to stage data during batch deduplication: Oracle, DB2, or Microsoft SQL Server.
- Address validation requirements. To perform address validation, a 32-bit Java VM (version 1.5 or higher) must be installed on the Siebel machine.

Additional configuration requirements apply to installations of PowerCenter and Informatica Data Quality. For more information on PowerCenter requirements and Informatica Data Quality requirements, see Chapter 2, “Installing and Configuring Informatica Components”.
CHAPTER 2
Installing and Configuring Informatica Components

This chapter includes the following topics:
- Overview
- Install PowerCenter Client and Server Components
- Create a PowerCenter Web Services Hub
- Create Staging Tables for Batch Duplicate Analysis
- Install Informatica Data Quality Reference Tables
- Importing PowerCenter Workflows
- Configuring the Siebel Parameter File
- Installing Reference Data

Overview

The steps to install and deploy Informatica software for Data Quality for Siebel include installing applications and loading files for use in the Data Quality for Siebel environment. You must consult other Informatica documents for complete install instructions. This chapter identifies the documents you need. Informatica product documentation is available from the Informatica Customer Portal site at http://mysupport.informatica.com.

Install Sequence

Install and configure Informatica components in this order:
1. Install PowerCenter Client and Server.
2. Create a PowerCenter Web Services Hub.
3. Create staging tables for PowerCenter batch deduplication operations.
4. Install Informatica Data Quality reference tables.
5. Import PowerCenter workflows and mappings.
6. If you want to customize the provided Data Quality mappings, install Informatica Data Quality.

Install all Informatica software on the same machine. You need not install Informatica software on the same machine as Siebel software. Informatica and Siebel components communicate by Simple Object Access Protocol (SOAP).

Note: Informatica recommends installing Siebel and PowerCenter in the same physical location or data center.

Install PowerCenter Client and Server Components

Install PowerCenter server-side and client-side components. PowerCenter Client components import workflows to the PowerCenter repository. PowerCenter Server components define the PowerCenter repository and execute the workflow tasks.
Refer to the PowerCenter documentation for instructions on installing the server-side and client-side PowerCenter components.
- The product install document is the PowerCenter Installation Guide.

Post-Installation Requirements

Use the PowerCenter Admin Console to review the Maximum Processes setting. The Maximum Processes setting determines the maximum number of concurrent processes permitted on the PowerCenter node. It must be adequate for the number of processes that Data Quality for Siebel may use. The Maximum Processes setting must be no lower than 20. Informatica recommends a Maximum Processes setting of 50 for Data Quality for Siebel.

Create a PowerCenter Web Services Hub

Create a Web Services Hub on the PowerCenter Integration Service machine. Refer to the PowerCenter documentation for information on the Web Services Hub and Web Services Provider.
- The install document is the PowerCenter Administration Guide.
- For more information on the Web Services Hub, consult the PowerCenter Web Services Provider Guide.

Create Staging Tables for Batch Duplicate Analysis

Data Quality for Siebel writes data to staging tables for batch duplicate analysis. You must create these tables.
1. Copy the schema for these tables from the <DatabaseType>-ddl.txt file included in your install fileset for Data Quality for Siebel.

Pre-Installation Requirements

Ensure that the PowerCenter Integration Service machine has a client installed for your staging database type. The following staging database types are supported: Oracle, DB2, and Microsoft SQL Server.

Character Encoding Requirements

If any characters outside the Latin1 character set will be passed to Data Quality for Siebel for batch duplicate analysis, you must verify that the database can store Unicode characters. Contact the database administrator to configure the encoding for the staging database.
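To judge whether the character encoding requirement applies to your data, you can test a sample of field values for characters outside Latin1. The Python sketch below is one hedged way to do that outside of Siebel and PowerCenter; the function name and the sample values are invented for the example.

```python
def needs_unicode_storage(values) -> bool:
    """Return True if any value contains a character outside Latin1 (ISO 8859-1)."""
    for value in values:
        try:
            value.encode("latin-1")
        except UnicodeEncodeError:
            # At least one character cannot be represented in Latin1.
            return True
    return False


# Hypothetical sample values taken from name and city fields.
western_sample = ["Müller", "São Paulo", "Smith"]
mixed_sample = ["Müller", "Łódź", "東京"]

western_needs_unicode = needs_unicode_storage(western_sample)
mixed_needs_unicode = needs_unicode_storage(mixed_sample)
```

A result of False for a sample does not prove that later records will stay within Latin1, so a Unicode-capable staging database remains the safer configuration.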
Install Informatica Data Quality Reference Tables

Informatica Data Quality uses reference tables in data cleansing.
1. Locate the Dictionaries folder in your Data Quality for Siebel install fileset. This folder contains a sub-folder named DQforSiebel.
2. Copy the DQforSiebel folder into the services folder of your Informatica installation directory.

Importing PowerCenter Workflows

Data Quality for Siebel uses pre-defined PowerCenter workflows, which you must import to PowerCenter. The workflows are saved in the INFADQSiebel.xml file located in the Workflows folder of your install fileset. Import this file to your PowerCenter repository using the PowerCenter Repository Manager.

Pre-Import Requirements

Before you import the workflows, complete the following tasks:
- Create a relational connection in the PowerCenter repository and configure it to use the staging database schema specified in “Create Staging Tables for Batch Duplicate Analysis”. Ensure that the code page for this connection is set to “UTF-8 encoding of Unicode”.
- Add the All.param parameter file to the PowerCenter system in the $PMRootDir of the Integration Service. In default installations, this directory is named infa_shared. Within this directory, create a new directory called Param and place All.param there. See “Configuring the Siebel Parameter File” for instructions on editing this file to match your system settings.

Steps to Import

To import a mapping or workflow to a PowerCenter repository:
1. Start the PowerCenter Repository Manager and connect to a repository.
2. From the Repository menu, select Import Objects.
3. The Import Wizard opens. Click Browse and select the INFADQSiebel.xml file. Click Next.
4. On the Select objects to import screen, select the option to Add All. Click Next.
5. On the Match Folders page, click the Browse button in the Destination Folder field to open the Folder Selection dialog box. Select the destination folder for the objects you will import.
Create a new repository folder if necessary. Click Next.
6. You are prompted to specify rules for conflict resolution. Conflicts will occur if any object you import has the same name as an object in the destination repository. Create a rule that replaces (overwrites) any such objects. Click Next.
7. Resolve any conflicts identified between the import file and the contents of the destination folder. If the status is Resolved, click Import.
8. The import wizard describes the import progress on its Output tab. Click Done when the import is complete.

Post-Import Requirements

After you import the workflows, use PowerCenter Workflow Manager to associate the imported workflows with your PowerCenter Integration Service.

To assign a workflow to a PowerCenter Integration Service:
1. From the Workflow Manager menu, select Service > Assign Integration Service.
2. The Assign Integration Service dialog box opens. In this dialog box, use the menu options to select the folder containing your imported Data Quality for Siebel workflows and to select an Integration Service.
3. Select the workflows to assign to the Integration Service. To select all workflows in the folder, choose Select all displayed workflows.
4. Click Assign.
5. Restart the PowerCenter Web Services Hub.

Configuring the Siebel Parameter File

The Data Quality for Siebel install files contain a parameter file named All.param that you must add to your PowerCenter installation and edit to suit your system. The file consists of several settings that have blank values or values that are deselected by a prefixed # “comment” character. Review these settings and edit them as necessary for your system. The types of setting include:
- The database type.
- Batch deduplication settings.

Verify the Index Database Type

The parameter file contains $$DATABASE_TYPE entries for Oracle, Microsoft SQL Server, and DB2 as shown below.
By default, Oracle is the selected database type, indicated by the absence of a # symbol at the start of the line entry.

    $$DATABASE_TYPE=Oracle
    #$$DATABASE_TYPE=Microsoft SQL Server
    #$$DATABASE_TYPE=DB2

To change the database type:
- Remove the # character from the $$DATABASE_TYPE entry for your database type, and add a # character to the start of the currently selected entry.

Update the Batch Deduplication Settings

Relational Connection Settings

The $DBConnectionDQBatch setting specifies the relational connection for batch deduplication operations.

    $DBConnectionDQBatch=SDQ_Batch_Oracle

Update the value of this setting to match the relational connection configured for batch deduplication in the Workflow Manager.

Debug Settings

Set $$DEBUG_ON to 1 to store batch deduplication source rows after deduplication completes. Source rows are stored in the following locations:
- Account - sdq_account_bulk_match_bak
- Contact - sdq_contact_bulk_match_bak
- Prospect - sdq_prospect_bulk_match_bak

Session ID Cache Settings

$$SESSION_ID_CACHE specifies the number of session IDs that the Informatica library caches between calls to the SDQ_GetSessionID web service. The default value is 1000. This value must be set to 2 or higher. If the value is set lower than 2, the default value of 1000 is used.

Match Score Threshold

The $$MATCHSCORE_LIMIT setting determines the batch deduplication records to be stored in the staging database and returned to Siebel. This value should be less than or equal to the match score threshold configured in Siebel. The default value for this setting is as follows:

    MPLT_SDQ_BatchDedup.$$MATCHSCORE_LIMIT=60

Record Removal Settings

Use the $$RETAINED_DAYS setting to determine the number of days that records are stored in the batch deduplication staging database. Records older than the number of days specified for this setting are deleted on a nightly basis.
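The comment convention and the session ID cache fallback described in this section can be illustrated with a short Python sketch. It is a reading aid under stated assumptions, not a PowerCenter tool: the helper names are invented, the sample text reuses the entries shown in this section, and the $$SESSION_ID_CACHE value of 1 is a made-up value chosen to show the fallback.

```python
DEFAULT_SESSION_ID_CACHE = 1000


def active_settings(text: str) -> dict:
    """Collect name=value entries, skipping blank lines and lines commented out with #."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def effective_session_id_cache(settings: dict) -> int:
    """Apply the documented rule: values lower than 2 fall back to the default of 1000."""
    raw = settings.get("$$SESSION_ID_CACHE", "")
    value = int(raw) if raw.isdigit() else DEFAULT_SESSION_ID_CACHE
    return value if value >= 2 else DEFAULT_SESSION_ID_CACHE


# Sample text modelled on the entries above; the cache value is hypothetical.
sample = """\
$$DATABASE_TYPE=Oracle
#$$DATABASE_TYPE=Microsoft SQL Server
#$$DATABASE_TYPE=DB2
$DBConnectionDQBatch=SDQ_Batch_Oracle
$$SESSION_ID_CACHE=1
MPLT_SDQ_BatchDedup.$$MATCHSCORE_LIMIT=60
"""

settings = active_settings(sample)
database_type = settings["$$DATABASE_TYPE"]
session_id_cache = effective_session_id_cache(settings)
```

Running a check like this against an edited copy of All.param is a quick way to confirm that exactly one $$DATABASE_TYPE entry is left uncommented before you place the file in the Param directory.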
Installing Reference Data

Informatica provides the batch and realtime reference datasets for Data Quality for Siebel. Informatica also provides the Data Quality Content Installer. Use the Data Quality Content Installer to install address reference datasets after you install all applications. For instructions on running the Content Installer, see the Informatica Data Quality Content Installation Guide.

CHAPTER 3
Configuring Siebel

This chapter includes the following topics:
- Overview
- Adding Library Files
- Adding the JAR File
- Adding Configuration Files
- Editing Configuration Files

Overview

This chapter describes the setup and configuration procedures that enable Siebel to communicate with Data Quality for Siebel. The communication is handled by Informatica library files that reside in the Siebel Data Quality environment. These procedures require Administrator privileges in the Siebel environment.

This chapter does not provide instructions on installing Siebel products. Consult your Siebel documentation for these instructions. You should have access to a copy of Siebel Bookshelf when performing the steps in this chapter.

Adding Library Files

Data Quality for Siebel provides Informatica-Siebel connectivity through the INFADQSiebel library file. This file manages the data exchange interactions between Informatica and Siebel software. Data exchange is by XML. Informatica provides standard library files in addition to INFADQSiebel. The library files are thread-safe and support multiple sessions by using unique session IDs. They support UTF-16 (UCS2) as the default Unicode encoding.

To load the library files to Siebel:
1. Locate the library files in the bin folder of your installable fileset.
2. Copy these files to <Siebel Server Root Directory>\bin for Windows systems, and <Siebel Server Root Directory>/lib for Linux and Unix-based systems.
3. Windows only.
If using a Siebel thick client, copy the files to its bin directory.

Adding the JAR File

The INFADQSiebel JAR file provides communication between Siebel scripts and the Informatica library.

To load the JAR file to Siebel:
1. Locate INFADQSiebel.jar in the Java folder of your installable fileset.
2. Copy this file to a folder on the Siebel server machine, e.g., \Informatica\DQforSiebel\Java. Note this location, as you will need to add it to the Siebel class path variable.

Adding Configuration Files

Informatica provides configuration and property files that provide information on the Web Services host machine and on logging.

To load the configuration files to Siebel:
1. Locate the SDQConnector folder in your installable fileset.
2. Copy the files in this folder to <Siebel Server Root Directory>\SDQConnector. Create the SDQConnector folder if it does not exist.
3. Windows only. If using a Siebel thick client, copy the files to its SDQConnector folder. Create the SDQConnector folder if it does not exist.

Editing Configuration Files

You must edit the INFADQSiebel.cfg file that you copy to the Siebel SDQConnector folder so that the file identifies the PowerCenter Web Services Hub that Data Quality for Siebel will use. If you will connect to Data Quality for Siebel with the Siebel thick client, you must also edit the <SiebelApplication>.cfg file in its bin/[install language] folder. For more information, see “Enable Cleansing and Deduplication for a Thick Client”.

To edit the INFADQSiebel.cfg file:
1. Locate the INFADQSiebel.cfg file in your Siebel installation folder.
2. Open the file.
3. Append the name of your PowerCenter Web Services Hub host machine to the WebServiceHost parameter.
4. At the LogDirectory parameter, type the path to the folder where Data Quality for Siebel will create its log file.
5. Review the WebServicePort and LogLevel settings. The WebServicePort setting identifies a port on the web services host machine for PowerCenter use.
The default number is 7333. The LogLevel setting specifies the quantity of logging information that Data Quality for Siebel will create. The default setting is Warning.
6. Restart the Siebel Server service.

CHAPTER 4
Configuring Siebel for Cleansing and Deduplication

This chapter includes the following topics:
- Overview
- Enable Data Quality Functionality for Siebel
- Configure Siebel Vendor Parameters
- Set Siebel Data Quality Parameters
- Configure Business Components for Data Quality Operations
- Generate Siebel Match Keys
- Enable Cleansing and Deduplication for a Thick Client

Overview

This chapter describes the setup and configuration procedures for cleansing and deduplication in Data Quality for Siebel. The steps outlined in this chapter assume a standard Siebel installation.

Enable Data Quality Functionality for Siebel

To enable data quality functionality and Data Quality for Siebel on the Siebel system, you must perform the following actions:
- Enable Siebel Data Quality at the Enterprise Level and the Object Manager Level.
- Register the Informatica library file with Siebel.

Register the INFADQSiebel Library

To register the INFADQSiebel library, follow the steps below.
1. Log in to Siebel with Administrator privileges.
2. Navigate to the Administration - Data Quality screen.
3. Select the Third Party Administration view.
4. In the Vendor List, create a new record with these values:

   Name        Informatica
   DLL Name    INFADQSiebel

Enable Siebel Data Quality

To enable Siebel Data Quality at the Enterprise Level and the Object Manager Level, follow the steps in Chapter 5 of the Siebel Data Quality Administration Guide in your Siebel 8 bookshelf. When prompted for a vendor name (for example Vendor1), type Informatica.

Configure Siebel Vendor Parameters

You must configure a series of Siebel parameters that determine the types of data that Siebel passes to Informatica applications.
To access the Vendor Parameters tab:
1. Log in to Siebel with Administrator privileges.
2. Navigate to the Administration - Data Quality screen.
3. Select the Third Party Administration view.
4. In the Vendor List, select Informatica.
5. Below the Vendor List, select the Vendor Parameters tab.

Set Siebel Vendor Parameters

To set the vendor parameters, define the parameters using the names and values in this table. Names and values are case-sensitive.

Note: Refresh the match keys in Siebel after editing Token or Query Expressions.

Name                                                    Value
Account DataCleanse Record Type                         Account
Account DataCleansing Conflict Id Field                 S_ORG_EXT.Conflict Id
Account DeDup Record Type                               Account
Account Query Expression                                Left([Name], 3)
Account Token Expression                                Left([Name], 3)
Batch Max Num of Records                                200
UCM only: CUT Address DataCleanse Record Type
CRM only: Business Address DataCleanse Record Type      Business Address
Contact DataCleanse Record Type                         Contact
Contact DataCleansing Conflict Id Field                 S_CONTACT.Conflict Id
Contact DeDup Record Type                               Contact
Contact Query Expression                                Left([Last Name], 3)
Contact Token Expression                                Left([Last Name], 3)
List Mgmt Prospective Contact DataCleanse Record Type   List Mgmt Prospective Contact
List Mgmt Prospective Contact Conflict Id Field         S_PRSP_CONTACT.Conflict Id
List Mgmt Prospective Contact DeDup Record Type         List Mgmt Prospective Contact
List Mgmt Prospective Contact Query Expression          Left([Last Name], 3)
List Mgmt Prospective Contact Token Expression          Left([Last Name], 3)
Max Search Spec Length                                  1000
Realtime Max Num of Records                             200

Set Siebel Data Quality Parameters

The following procedure describes how to set the Siebel Data Quality parameters. Ensure that the Siebel Data Quality parameters match the settings listed in Table 1, Siebel Data Quality Settings.

To set Siebel Data Quality parameters:
1. Log in to Siebel with Administrator privileges.
2. Navigate to the Administration - Data Quality screen.
3. Select the Data Quality Settings view.
4. Verify or apply the following settings for each value in this view:

Table 1. Siebel Data Quality Settings

Parameter:    Enable DataCleansing
Value:        Yes
Description:  Determines whether real-time data cleansing is enabled for the Siebel Server the administrator is currently logged into.

Parameter:    Enable DeDuplication
Value:        Yes
Description:  Determines whether real-time data matching is enabled for the Siebel Server the administrator is currently logged into.

Parameter:    Force User Dedupe - Account
Value:        Yes
Description:  Determines whether duplicate records are displayed in a pop-up window when a user saves a new account record. The user can then merge duplicates. If the value is set to No, duplicates are not displayed in a pop-up window, but the user can merge duplicates in the Duplicate Accounts view.

Parameter:    Force User DeDupe - Contact
Value:        Yes
Description:  Determines whether duplicate records are displayed in a pop-up window when a user saves a new contact record. The user can then merge duplicates. If the value is set to No, duplicates are not displayed in a pop-up window, but the user can merge duplicates in the Duplicate Contacts view.

Parameter:    Force User DeDupe - List Mgmt
Value:        Yes
Description:  Determines whether duplicate records are displayed in a pop-up window when a user saves a new prospect record. The user can then merge duplicates. If the value is set to No, duplicates are not displayed in a pop-up window, but the user can merge duplicates in the Duplicate Prospects view.

Parameter:    Fuzzy Query Enabled
Value:        No
Description:  Determines whether fuzzy querying is enabled for the Siebel Server the administrator is currently logged into.

Parameter:    Fuzzy Query - Max Returned
Value:        500
Description:  Specifies the maximum number of records returned when a fuzzy query is performed.

Parameter:    Match Threshold
Value:        70
Description:  Specifies the minimum score required for Siebel Data Quality to treat a pair of records as a likely match.

If a parameter is not set in this view, the system uses the default value.

Note: The following parameters may appear in the Data Quality Settings view: Key Type, Search Type. These parameters apply to the Oracle Data Quality Matching Server (SSA) and are not relevant to Data Quality for Siebel.

Configure Business Components for Data Quality Operations

Data Quality for Siebel operates on records from four business components: Account, Contact, List Management Prospect, and CUT/Business Address. You must identify in Siebel the fields from each business component that Informatica applications will cleanse or deduplicate. Only the fields you identify are passed to Informatica.

This process has three stages:
1. Select Informatica as a data quality operations application vendor.
2. Define the business components that use Informatica for selected data quality operations.
3. Specify the business component data fields that Siebel sends to Informatica.

Select Informatica as the Data Quality Vendor

To select Informatica as the data quality vendor:
1. Log in to Siebel with Administrator privileges.
2. Navigate to the Administration - Data Quality screen.
3. Select the Third Party Administration view.
4. In the Vendor List, select Informatica.

Define Business Components and Data Quality Operations

To define the business components and their associated data quality operations:
1. Ensure Informatica is selected as a data quality vendor.
2. Below the Vendor List, click the BC Vendor Field Mapping tab.
3. Under BC Operation, add a business component name and add one or more data quality operations.

The following table details the combinations of components and operations that you can use.

Table 2. Business Components and Data Quality Operations

Business Component Name           Operation
Account                           Data Cleansing
Account                           DeDuplication
CUT/Business Address              Data Cleansing
Contact                           Data Cleansing
Contact                           DeDuplication
List Mgmt Prospective Contact     Data Cleansing
List Mgmt Prospective Contact     DeDuplication

Note: The CUT/Business Address component should be disabled if address validation is enabled for Informatica Data Quality for Siebel.

Mapping Business Component Data Fields to Informatica Data Fields

The following tables list the default field mappings for the Informatica processes in the Data Quality for Siebel solution. The Business Component Field column contains the field names used by Siebel. The Mapped Field column contains the field label used by Informatica.

Note: The mapped field names in these business screens must match the field names in the corresponding PowerCenter mappings. If you edit the field names in your mappings, you must edit the business component field names on these screens. The field values are case-sensitive.

Table 3. Account Deduplication Field Mappings

Business Component Field          Mapped Field
Dedup Token                       Account.DedupToken
Id                                Account.Id
Location                          Account.Location
Name                              Account.Name
Primary Account City              Account.City
Primary Account Country           Account.Country
Primary Account Postal Code       Account.PostalCode
Primary Account State             Account.State
Primary Account Street Address    Account.StreetAddress

Table 4. CUT/Business Address Cleansing Fields

Business Component Field          Mapped Field
City                              BusAddr.City
Country                           BusAddr.Country
Id                                BusAddr.Id
Postal Code                       BusAddr.PostalCode
State                             BusAddr.State
Street Address                    BusAddr.StreetAddress
Street Address 2                  BusAddr.StreetAddress2

Table 5. Account Cleansing Field Mappings

Business Component Field          Mapped Field
Dedup Token                       Account.DedupToken
Location                          Account.Location
Name                              Account.Name
Region                            Account.Region

Note: The CUT/Business Address component should be disabled if address validation is enabled for Informatica Data Quality for Siebel.

Table 6. Contact Deduplication Field Mappings

Business Component Field          Mapped Field
Account Location                  Contact.AccountLocation
Dedup Token                       Contact.DedupToken
First Name                        Contact.FirstName
Id                                Contact.Id
Last Name                         Contact.LastName
Middle Name                       Contact.MiddleName
Primary Account Name              Contact.AccountName
Primary City                      Contact.City
Primary Country                   Contact.Country
Primary Postal Code               Contact.PostalCode

Table 7. List Mgmt Prospect Cleansing Field Mappings

Business Component Field          Mapped Field
Account                           ListMgtProspectContact.Account
City                              ListMgtProspectContact.City
Country                           ListMgtProspectContact.Country
First Name                        ListMgtProspectContact.FirstName
Job Title                         ListMgtProspectContact.JobTitle
Last Name                         ListMgtProspectContact.LastName
Middle Name                       ListMgtProspectContact.MiddleName
Postal Code                       ListMgtProspectContact.PostalCode
Primary Account Location          ListMgtProspectContact.AccountLocation
State                             ListMgtProspectContact.State
Street Address                    ListMgtProspectContact.StreetAddress
Street Address 2                  ListMgtProspectContact.StreetAddress2

Table 8. List Mgmt Prospect Deduplication Field Mappings

Business Component Field          Mapped Field
Account                           ListMgtProspectContact.Account
City                              ListMgtProspectContact.City
Country                           ListMgtProspectContact.Country
Dedup Token                       ListMgtProspectContact.DedupToken
First Name                        ListMgtProspectContact.FirstName
Id                                ListMgtProspectContact.Id
Last Name                         ListMgtProspectContact.LastName
Middle Name                       ListMgtProspectContact.MiddleName
Postal Code                       ListMgtProspectContact.PostalCode
Primary Account Location          ListMgtProspectContact.AccountLocation
State                             ListMgtProspectContact.State
Street Address                    ListMgtProspectContact.StreetAddress

Table 9. Contact Cleansing Field Mappings

Business Component Field          Mapped Field
First Name                        Contact.FirstName
Job Title                         Contact.JobTitle
Last Name                         Contact.LastName
Middle Name                       Contact.MiddleName

Generate Siebel Match Keys

As the first step in duplicate analysis, Siebel Data Quality (SDQ) searches the Siebel database for possible matches with the driver record. To do so, SDQ does not use the full driver record but instead uses match key values. Match keys are subsets of data selected from meaningful fields in the input or driver record. For example, a person's surname provides a meaningful match key for prospects and contacts, and an account name provides a meaningful match key for account records.

Siebel Data Quality maintains an index of match keys for use in deduplication. Before you perform deduplication with Data Quality for Siebel, you must generate a set of match keys or refresh the match keys so that they are current.

1. Refer to Chapter 6 of the Siebel Data Quality Administration Guide for instructions on generating and refreshing match keys.

Enable Cleansing and Deduplication for a Thick Client

Enable cleansing and deduplication for a Siebel thick client by editing the <SiebelApplication>.cfg file.

To edit the <SiebelApplication>.cfg file:
1. Locate the <SiebelApplication>.cfg file.
   This file resides in the bin\[install language] folder of your Siebel thick client installation, for example bin\ENU. If you have multiple language folders in your bin folder, edit the <SiebelApplication>.cfg file (for example, UCM.cfg) in each such folder.
2. Set the following configuration parameters in this file.

    [DataCleansing]
    Enable = TRUE
    Type = Informatica

    [Deduplication]
    Enable = TRUE
    Type = Informatica

3. Restart the Siebel thick client.

CHAPTER 5
Configuring Siebel for Address Validation

This chapter includes the following topics:
- Overview
- SIF File Import
- Adding Picks
- System Preferences Setup
- INFADQSiebel JAR Setup
- Batch Setup
- Verifying Your Address Validation Implementation

Overview

This chapter provides address validation configuration procedures for Data Quality for Siebel. Data Quality for Siebel implements address validation through Siebel pick items on address screens. To configure the pick items, you must have administrator access to Siebel and access to Siebel Tools. The steps outlined in this chapter assume a standard Siebel installation.

Note: Before performing the address validation configuration steps in this chapter, ensure that you have completed the configuration steps specified in these chapters:
- Chapter 2, "Installing and Configuring Informatica Components"
- Chapter 3, "Configuring Siebel"

SIF File Import

This section describes the process of importing the Informatica SIF file. This file provides customized settings necessary for setting up Informatica address validation for Siebel.

To import the Informatica SIF file:
1. In Siebel Tools, select Tools > Import from Archive.
2. Browse to the INFADQSiebel.sif file in your installation fileset. Select the SIF file and click Import.
3. Click Next.
4. The dialog box displays a list of the objects to import. Click Next.
5. The dialog box displays the objects to be inserted, modified, and deleted. Click Yes to continue.
6. The dialog box lists the imported objects as the objects are installed. After all objects are installed, click Finish.
7. Navigate to 'Tools' > 'Compile Projects'.
8. Choose the 'Informatica' project from the list.
9. Choose a Siebel Repository file to write to (e.g., siebel.srf or siebel_sia.srf) using the 'Browse' button.
10. Click 'Compile'.

Adding Picks

This section describes the process of adding picks to Siebel fields. The example business components, applets, and field names provided are relevant for clean Siebel installations.

Tip: To identify the name of the applet to set the pick for, right-click on the applet window and choose the option that displays the source HTML. Within the HTML, search for the string 'applet' and identify the actual name of the applet.

Modifying the Address Business Component

Prior to setting up picks for applets, you must add new fields and user properties to the main address business component for your application. Perform the steps in the sections below to make these additions.

Adding New Fields

1. In Siebel Tools, locate and check out the address business component associated with the applet.
2. Navigate to 'Business Component' > 'Field'.
3. Add a new Field 'ValidationSearch'.
4. Set the 'Calculated' column to 'True' (checked).
5. In the Calculated Value field, enter concatenated address fields using pipes to separate them:

    [Street Address] + "|" + [Street Address 2] + "|" + [City] + "|" + [State] + "|" + [Postal Code] + "|" + [Country]

6. Add a new Field 'MatchStatus'.
7. Update the 'Column' column by choosing an unused field from the Address table, for example, 'COMMENTS'.

Adding New User Properties

1. From the menu, choose 'View' > 'Options...', and then select the 'Object Explorer' tab.
2. Make sure that the box beside 'Business Component' > 'Business Component User Prop' is checked.
3. Choose the address business component you are working with (e.g., 'Business Address', 'CUT Address', or a custom address BC).
4. Navigate the tree to 'Business Component' > 'Business Component User Prop'.
5. Determine if there are any existing 'On Field Update Set n' rows, where 'n' equals an integer.
   - If 'On Field Update Set n' rows are present, take note of the highest value of 'n' for use in Step 6.
   - If 'On Field Update Set n' rows are not present, use a value of '0' for 'n' in Step 6.
6. Create a new record for each address field entered in "Adding New Fields". Edit the columns to match the settings described in the following table, ensuring that Name Column settings reflect the value of 'n' from the previous step.

   Field               Name Column (replace "n" with number)   Value Column
   Street Address      On Field Update Set n+1                 "Street Address", "MatchStatus", "Unchecked"
   Street Address 2    On Field Update Set n+2                 "Street Address 2", "MatchStatus", "Unchecked"
   City                On Field Update Set n+3                 "City", "MatchStatus", "Unchecked"
   State               On Field Update Set n+4                 "State", "MatchStatus", "Unchecked"
   Postal Code         On Field Update Set n+5                 "Postal Code", "MatchStatus", "Unchecked"
   Country             On Field Update Set n+6                 "Country", "MatchStatus", "Unchecked"

7. Compile the address business component into the SRF file.

Adding Picks for Address Applets

This section describes the procedures for adding picks to address applets, including list applets and multi-value group applets.

Update the Address Screen Applet

1. Find the applet to change (e.g., 'Account Address Mvg Applet'). To search for the applet, use <CTRL>-Q on the list of applets and search in the 'Name' column.
2. Select the applet row. Ensure that you have the applet checked out.
3. Navigate the tree to 'Applet' > 'List' > 'List Column'.
4. Choose the 'Street Address' column in the right pane.
5. Set the column 'Runtime flag' to 'True' (checked).
6. Set the column 'Pick Applet' to 'Informatica Address Validation Pick' using the dropdown in the cell.
7. Compile the applet into the SRF file.

Update the Parent Business Component

1. Identify the parent business component of the applet by selecting 'Help' > 'About View' on the page the applet launches from. The name of the parent business component (e.g., 'Account') appears after the "BusComp" string.
2. In Siebel Tools, locate and check out the parent business component.
3. Navigate to 'Business Component' > 'Field'. If you are updating the parent BC for a list applet, skip to Step 6. If you are updating the parent BC for a multi-value group applet, continue on to Step 4.
4. Multi-Value Group Applets only. Add a new record to the lower right pane with the following column values:

   Column Name       Value
   Name              MatchStatus
   Multivalued       True (Checked)
   Multivalue Link   Name of child BC (e.g. 'Business Address' or 'CUT Address')
   Dest Field        MatchStatus

5. Multi-Value Group Applets only. Add another record to the lower right pane with the following values:

   Column Name       Value
   Name              ValidationSearch
   Multivalued       True (Checked)
   Multivalue Link   Name of child BC (e.g. 'Business Address' or 'CUT Address')
   Dest Field        ValidationSearch

6. Select the row named 'Street Address'.
7. Set the 'Picklist' column to 'Informatica Address Picklist'.
8. Navigate to 'Field' > 'Pick Map' and add the following fields:

   Name               Picklist Field     Sequence   Constrain
   Street Address     Street Address     1          False
   Street Address 2   Street Address 2   2          False
   City               City               3          False
   State              State              4          False
   Postal Code        ZipCode            5          False
   Country            Country            6          False
   ValidationSearch   Search-spec        7          True (Checked)
   MatchStatus        MatchStatus        8          False

9. Compile the parent business component (e.g., Account) into the destination SRF file.

System Preferences Setup

This section describes the procedure for setting system preferences for Data Quality for Siebel.

1. In the sitemap, navigate to 'Administration - Application' > 'System Preferences'.
2. Ensure that the 'New' button is enabled. If it is not enabled, perform the following steps to change the System Preferences object to allow inserts:
   - Open Siebel Tools.
   - Navigate to 'Business Component'.
   - Find 'System Preferences'.
   - Set the column 'No Insert' to False (Unchecked).
   - Compile the System Preferences object into the SRF.
3. If it is not possible to edit the Name or Value field, perform the following steps to change the System Preferences applet:
   - Open Siebel Tools.
   - Navigate to 'Applet'.
   - Find 'System Preferences'.
   - Navigate to 'List' > 'List Column'.
   - Set the 'Read Only' column to False (Unchecked) for all fields you want to edit.
   - Compile the applet into the destination SRF file.
4. Add the following rows:

   Row:    Informatica Batch MatchStatus
   Value:  'VALIDATED - BT'

   Row:    Informatica Batch Org
   Value:  Set to '0' (Disabled) or to '1' (Enabled)

   Row:    Informatica Batch Org Fields
   Value:  A comma-separated list of additional fields to be sent for validation; if no additional fields are required, set to "0"

   Row:    Informatica Batch Org Search
   Value:  Search used to identify addresses to be sent to the batch address validation process. For example, [MatchStatus] = "UNMATCHED - BT" AND [Country] = "United Kingdom" AND [City] = "London" OR [City] = "Guildford"

   Row:    Informatica Batch Per
   Value:  Set to '0' (Disabled) or to '1' (Enabled)

   Row:    Informatica Batch Per Fields
   Value:  A comma-separated list of additional fields to be sent for validation; if no additional fields are required, set to "0"

   Row:    Informatica Batch Per Search
   Value:  Search used to identify addresses to be sent to the batch address validation process. For example, [MatchStatus] = "UNMATCHED - BT" AND [Country] = "United Kingdom" AND [City] = "London" OR [City] = "Guildford"

   Row:    Informatica Batch Size
   Value:  Number of records to be sent in each batch set, e.g. '500'

   Row:    Informatica Countries
   Value:  A comma-separated list of the countries for which Informatica PowerCenter is configured to provide validation, in both real-time and batch modes. E.g. 'USA,Germany,United Kingdom,France,Australia,Canada,Netherlands'

INFADQSiebel JAR Setup

This section describes the procedures for setting up the INFADQSiebel JAR for Siebel thick clients and thin clients. A 32-bit Java VM (version 1.5 or higher) must be installed on the Siebel machine to set up the INFADQSiebel JAR. Locate and copy the system path of the JVM library file, as this must be added to the INFADQSiebel configuration information during JAR setup.

Note: When following the instructions in the subsections below, replace the filenames in brackets with the full paths to the relevant files. Ensure that these paths include the filename and file extension.

JAR Setup for the Siebel Thick Client

1. Copy the following text:

    [JAVA]
    DLL=[Path To JVM.DLL]
    CLASSPATH=[Path To Siebel.jar];[Path To SiebelJI_enu.jar];[Path To INFADQSiebel.jar]
    VMOPTIONS= -Xrs -Djava.compiler=NONE

2. Paste the text into a text editor. Replace the bracketed text with the file paths corresponding to your system configuration.
3. Copy the edited text and paste it into the <Siebel application>.cfg file (e.g., UCM.cfg).

JAR Setup for the Siebel Thin Client

1. Copy the following two commands:

    [Directory to SiebSrv Bin]\srvrmgr /E [EnterpriseName] /g [GatewayName] /s [ServerName] /u SADMIN /p [SADMINPASS]

    Create named subsystem JAVA for subsystem JVMSubSys with DLL=[Path To JVM.Dll], CLASSPATH=.;[Path To Siebel.jar];[Path To SiebelJI_enu.jar];[Path To INFADQSiebel.jar], VMOPTIONS=" -Xrs -Djava.compiler=NONE"

2. Paste the text into a text editor. Replace the bracketed text with the file paths corresponding to your system configuration.
3. Individually copy the edited commands and paste them into a command shell on the thin client server.
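The thick-client [JAVA] text from the JAR setup procedure can also be assembled by a small script instead of by hand-editing. The following Python sketch is illustrative only and is not part of the product: the function name build_java_section and the example paths are hypothetical, and the only values taken from this guide are the section layout, the semicolon-separated CLASSPATH order, and the VMOPTIONS string.

```python
from pathlib import PureWindowsPath


def build_java_section(jvm_dll, siebel_jar, siebel_ji_jar, infadq_jar):
    """Return the [JAVA] section text for a Siebel thick client .cfg file.

    Hypothetical helper: it only formats text and does not touch any file.
    """
    for path in (jvm_dll, siebel_jar, siebel_ji_jar, infadq_jar):
        parsed = PureWindowsPath(path)
        # The guide asks for full paths that include file name and extension.
        if not parsed.is_absolute() or not parsed.suffix:
            raise ValueError(
                f"expected a full path with file name and extension: {path}"
            )
    # Same JAR order as in the guide, separated by semicolons.
    classpath = ";".join([siebel_jar, siebel_ji_jar, infadq_jar])
    return "\n".join([
        "[JAVA]",
        f"DLL={jvm_dll}",
        f"CLASSPATH={classpath}",
        "VMOPTIONS= -Xrs -Djava.compiler=NONE",
    ])


if __name__ == "__main__":
    # Example paths are assumptions; substitute the paths on your system.
    print(build_java_section(
        r"C:\Java\jre\bin\client\jvm.dll",
        r"C:\Siebel\client\classes\Siebel.jar",
        r"C:\Siebel\client\classes\SiebelJI_enu.jar",
        r"C:\Informatica\INFADQSiebel.jar",
    ))
```

The printed text can then be pasted into the <Siebel application>.cfg file as described in the thick-client procedure. The path check is a convenience that mirrors the note about including the filename and extension; it does not verify that the files exist.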
Batch Setup

To set up batch address validation, you must publish the workflow. You must also ensure that the field used for MatchStatus on the PER and ORG business components is the same field chosen for MatchStatus when setting up realtime validation.

To set up batch address validation:
1. In Siebel Tools, navigate the tree to 'Business Component'.
2. Locate and check out the Informatica Business Component that you are using for Batch ('Informatica Address Org' or 'Informatica Address Per').
3. Navigate the tree to 'Field'.
4. In the lower right pane, select the row named 'MatchStatus'.
5. In the 'Column' column, select the name of the database column in which MatchStatus is to be stored. This should be the same column selected when modifying the Address business component.
6. Check in the Business Component and compile to the destination SRF file.
7. In Siebel Tools, navigate the tree to 'Projects'.
8. Search for the Informatica project.
9. Ensure that the locked column is set to true (checked).
10. Navigate to 'Tools' > 'Check In ...'.
11. In the Check In dialog, select the Informatica project and click the 'Check In' button.
12. In Siebel Site Map, navigate to Administration - Business Process > Workflow Deployment.
13. Query for Informatica*.
14. Choose the Informatica Address Validation Workflow and click the Activate button.

Verifying Your Address Validation Implementation

This section contains instructions on verifying your address validation implementation for both realtime and batch scenarios.

Verifying a Realtime Address Validation Implementation

Before verifying your address validation implementation for realtime scenarios, you must complete all of the configuration instructions specified in this chapter, with the exception of the instructions specified in "Batch Setup". Batch setup procedures are not a prerequisite for realtime address validation.
The following instructions assume that address search functionality is enabled for an 'Account' address applet.

To verify realtime address validation:
1. Navigate to the 'Account' screen.
2. Select an existing account.
3. Open the Address applet and select an existing address.
4. Select the pick icon in the 'Street Address' field. After you select this icon, the applet displays a list of matching addresses.
5. If an address has a check mark in the 'Refine' column, select the record and click the 'Refine' button to identify a full address. It may be necessary to refine the record more than once to locate a full address.
6. Select the full address you require and click the 'Pick' button to replace the existing address.
7. If the applet does not present a suitable address, click the 'Cancel' button to return to the original address.

Verifying a Batch Address Validation Implementation

Before verifying your address validation implementation for batch scenarios, you must complete all of the configuration instructions specified in this chapter, with the exception of the instructions specified in "Adding Picks". Adding a pick is not a prerequisite for batch address validation.

To verify a batch address validation implementation, perform the procedures described below.

Setting Up a Batch Address Validation Workflow

To set up a batch address validation workflow:
1. Navigate to Site Map > 'Administration - Application' > 'System Preferences'.
2. Depending on which underlying Siebel tables store addresses in your Siebel system, place a '1' in the value field of the entry named either 'Informatica Batch Org' or 'Informatica Batch Per'.
3. Update the related search specification to identify the address records you wish to process. For example, set 'Informatica Batch Org Search' to '[Country] = "USA" AND [Region] = "CA"' to select all records from the Org table that are from California, USA.
4. In the entry named 'Informatica Batch Status', set the status values that will be updated with corrected data during the address validation process. For example, to update records that are identified as valid or as corrected, set this entry to 'VALIDATED - BT,CORRECTED - BT'.

Executing the Address Validation Workflow

Execute the address validation workflow using one of the following methods:
1. Execute the workflow through the srvrmgr console. Run the following command from the srvrmgr console:

    run task for component WfProcMgr with processName='Informatica Address Validation'

2. Execute the workflow through a browser. Perform the following actions sequentially:
   - Navigate to Administration - Server Management > Jobs.
   - Create a new Workflow Process Manager job with the following Job Parameter settings:

       Name:   Workflow Process Name
       Value:  Informatica Address Validation

   - Submit the job.
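For sites that script the srvrmgr method of running the batch workflow, the connection arguments and the run task command from this guide can be composed as plain strings before they are pasted into a shell or a scheduler. The Python sketch below is a hedged illustration, not an Informatica or Siebel utility: the function names and the example server names are hypothetical, the connection switches are the ones shown in the thin-client JAR setup (/E, /g, /s, /u, /p), and the script builds text only and runs nothing.

```python
def srvrmgr_connect_args(bin_dir, enterprise, gateway, server, user, password):
    """Return the srvrmgr invocation as an argument list.

    Hypothetical helper; uses only the switches shown in this guide.
    """
    executable = bin_dir.rstrip("\\") + "\\srvrmgr"
    return [
        executable,
        "/E", enterprise,
        "/g", gateway,
        "/s", server,
        "/u", user,
        "/p", password,
    ]


def run_workflow_command(process_name="Informatica Address Validation"):
    """Return the srvrmgr console command that starts the batch workflow."""
    return f"run task for component WfProcMgr with processName='{process_name}'"


if __name__ == "__main__":
    # Enterprise, gateway, and server names below are placeholders.
    args = srvrmgr_connect_args(
        r"C:\siebsrvr\bin", "SBA_ENT", "gateway01", "siebsrv01",
        "SADMIN", "[SADMINPASS]",
    )
    print(" ".join(args))
    print(run_workflow_command())
```

Keeping the two pieces separate mirrors the guide, which first connects to srvrmgr and then issues the command at the console. Avoid storing the real SADMIN password in a script; the placeholder is left in on purpose.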