ATAPY Software in Brief
“Working with ATAPY has been a pleasure. We have been impressed with the high level of concern for producing the best possible text of the works and the accuracy of the results.”
Virginia Laursen, The Royal Danish Library, Copenhagen, Denmark

“Russia’s history is a history of great thinkers. Look at the chess players, for example. ATAPY employs highly educated engineers gathered together in an efficient company. The result is quality tailor-made software at a competitive price.”
Rob Camerlink, EasyData B.V., Apeldoorn, the Netherlands

“ATAPY reached 99.992% accuracy in the German-Russian Dictionary, and 99.997% quality in the Spanish-Russian Dictionary project. They also corrected many mistakes in the source dictionary text, including typographical misprints and even mistakes in special marks that are almost impossible to detect without special programming tools and profound knowledge of linguistics.”
Anna Zhavoronkova, ABBYY Software House, Moscow, Russia

ATAPY Software is a software development company with offices in Russia and Germany, specializing in OCR/document imaging and data capture solutions. The company was established in 2001 with active support from its main partner, ABBYY Software House. Its major specialization is outsourced software development in the OCR/document imaging and data capture fields; the company also possesses its own know-how in document imaging and applied OCR solutions.

ATAPY Software specializes in developing software solutions of the following types:
- Scanning front-end applications
- Document imaging solutions: graphical OCR enhancement filters, page skew/orientation correction tools, image zoning tools
- Document management applications, pre- and post-OCR routines to solve a variety of tasks: intuitive manual data verification, document classification, grouping, routing, validation, conversion, EDMS integration, etc.
- PDF tools and solutions
- Complete data input solutions on the basis of contemporary OCR SDKs
- Applied industry-specific solutions based on OCR technology (ANPR/LPR solutions for traffic control systems, media clipping systems for PR agencies, etc.)
- Professional services based on products/technologies by ABBYY Software House in the areas of data capture, OCR/ICR, computer linguistics, search/information retrieval engines

The main part of the company's own know-how lies in the following fields:
- Document imaging: a number of pre-OCR quality enhancement filters, and a layout analysis tool for segmentation of non-standard pages
- ANPR/LPR: a car license plate recognition SDK capable of reading Russian, German, Swedish and Dutch number plates

The company's first projects were implemented for ABBYY Software House and involved customizing their enterprise-class products to end-user requirements. Today, ATAPY has a considerable track record of projects for customers worldwide, including solutions based on ABBYY products and on other OCR platforms and toolkits.

Among our customers are:
- ABBYY Russia: runs a development team and outsources occasional professional services (PS) projects
- ABBYY Europe and ABBYY USA: run technical support teams and outsource occasional PS projects
- Notable Solutions Inc. (NSi, developer of the AutoStore EDMS system): runs a dedicated team of software engineers and testers
- EasyData, Lucom, PrePress Systeme: run small software development teams at ATAPY
- Springer Verlag: for this publishing house ATAPY is converting a large scientific encyclopedia to electronic format (an ongoing project since 2003)
- Five other smaller clients currently run software development and media service projects at ATAPY

ATAPY Software also provides a range of digitization and data entry services (scanning, OCR, character repair, key-from-image (KFI), data format conversion, etc.). The company employs experienced software engineers, linguists, and multilingual operators.
The company's track record includes more than 150 completed projects for clients from the US, many European countries, Russia, and the Middle East. Directly or through our strategic partnerships, we have had the privilege of supplying our software development experience to such companies and organizations as Océ, RICOH, Fujitsu, Toshiba, Hewlett-Packard, Captiva, Apple Computers, the Government of the Netherlands, and the Meta-E consortium of 17 European and American universities funded by the European Commission.

ATAPY Software has been a Microsoft Certified Partner since 2007 and holds two Microsoft Partner competencies: the Software Development competency and the Data Platform competency.

From European car license plates to Dutch penitentiary system surveys, from American healthcare forms to Turkish printed media, our software works around the world and around the clock so that our clients can forget about paper entry issues and concentrate on their core businesses.

Sergey Borovoy, CEO

©2011 ATAPY Software. All rights reserved. All trademarks used are the property of their respective owners.

ATAPY Software
630090, Engineernaya Street, 4a, 522
Novosibirsk, Russia
Tel. +7 383 33 56 56 9
Fax +7 383 33 56 56 1
www.atapy.com
[email protected]

Virtual Image Processing for AUTOSTORE

An image optimizing component for the powerful document capture system by Notable Solutions, Inc. (NSi) marked the starting point of the partnership between ATAPY and NSi.

NSi AUTOSTORE transforms paper- and manual-based data capture processes into easy and efficient electronic workflows. Companies and organizations that use AUTOSTORE streamline their business processes, improve productivity, reduce operating costs and enforce compliance with laws and regulations. AUTOSTORE enables users to classify documents, extract just the information that's needed, and send it wherever necessary, all with a few easy clicks. AUTOSTORE provides a smart business automation solution for a range of information-intensive industries, including:
- Healthcare
- Financial services
- Retail
- Legal
- Utilities
- Manufacturing
- Transportation and logistics
- Professional services
- Local and state government
- Non-profit

NSi’s AUTOSTORE application is a leading product for document capture, processing and distribution. Adopted by many top manufacturers of on-ramp devices, such as MFPs and digital copiers, AUTOSTORE’s framework is becoming the standard of server-based document capture for devices by HP, RICOH, Kyocera, Xerox, Canon and many other internationally renowned companies.

ATAPY has enhanced AUTOSTORE with a new image treatment tool named Virtual Image Processing (VIP). In terms of AUTOSTORE, VIP is an integrated process component placed between the “capture” and “route” sources in a capture workflow. Such workflows are created visually in the AUTOSTORE Process Designer (an administrative tool) using a drag-and-drop technique.

[Figure: The capture workflow creation process]

VIP optimizes document image quality for subsequent recognition via several configurable filters and their combinations. The number of filters grows with every new version of the AUTOSTORE application. All modifications to the original image are reflected “on the fly” in the preview window of the component configuration interface. Once the desired image quality is achieved, the filter combination profile can be saved and used in other AutoStore workflows containing VIP. This eliminates the need to configure filters every time a new capture-process-route sequence is designed, which is extremely valuable for companies with large workflows of template-based documents.

VIP implements a large number of different image filters, of which the most basic are:
- Color Extraction filter - keeps the selected color, while all other colors are dropped.
Users can also select a color to replace the dropped ones in the resulting image
- Deskew filter - corrects the skew angle of the image
- Despeckle filter - eliminates small speckles and garbage on the image
- Color Dropping filter - drops the selected color and replaces it with the substitute color
- Thresholding filter - converts an image to black-and-white. All pixels with the brightness level less than the specified level become white, while others become black
- Adaptive Thresholding filter - improves recognition results for source images with irregular background using a sophisticated binarization algorithm

[Figure: VIP filter profile configuration dialog]
[Figure: Color dropping configuration dialog]

Notable Solutions, Inc. (www.nsius.com) was founded in 1995 to provide complete technology solutions in the fields of Software Development, System Engineering, Technology Training and Business Consulting. Since that time NSi has evolved towards software and hardware design, development, network integration, and support of document management systems. NSi prides itself on a commitment to quality and a reputation for excellence. It is now a leading provider of content capture software with products in use by Canon, HP, Kodak, Kyocera, RICOH, Sharp, Xerox and others.

AutoStore is a registered trademark of Notable Solutions, Inc. All the other trademarks are the property of their respective owners.

Notable Solutions, Inc. (NSi)
9715 Key West Ave., Suite 200
Rockville, MD 20852, USA
Tel.: +1 240 683-8400
Fax: +1 240 683 8420
www.nsius.com
[email protected]

ABBYY FineReader for Fujitsu ScanSnap!™

“The software supplied is also very good. ABBYY FineReader recognizes text, tables and images in documents and can import to Word or Excel for editing. It will automatically rotate scanned pages to their correct orientation and leave out any blank pages.”
Tim Smith, “Computeractive”, UK

Fujitsu (www.fujitsu.com) is a multinational computer hardware and IT services company based in Tokyo, Japan. The company specializes in semiconductors, air conditioners, computers (supercomputers, personal computers, servers), telecommunications, and services. Fujitsu employs around 400,000 people and has ~500 subsidiary companies.

Under a contract with ABBYY Europe GmbH, ATAPY Software has completed the ABBYY FineReader for ScanSnap!™ software package to be bundled with Fujitsu scanners.

ScanSnap!™ by Fujitsu is a family of high-speed desktop office scanners built around a one-step approach to document conversion. Normally, one push of a button on the scanner's faceplate is enough to see the document image on the screen. If any scanning parameters need to be modified, the ScanSnap Monitor software lets users do that in just a few mouse clicks. But the image is not always the answer: no scanning software suite is complete without a good OCR application. For Fujitsu, the choice was obvious: ABBYY FineReader. ABBYY's European office, which negotiated the deal, transferred the development to ATAPY Software, ABBYY's technology partner.

The application implemented by ATAPY comprised four components. The Scan2Word, Scan2Excel, and Scan2PDF modules converted scanned images to the corresponding file formats at the push of the button on the scanner. The intentional simplicity of the solution did not mean a lack of flexibility: the fourth component, the Exporter Settings module, provided access to a variety of parameters, such as turning preservation of page and line breaks on and off, retaining text color for Word documents, replacing uncertain words with images and reducing picture resolution for PDF, and practically the entire wealth of FineReader settings and options.
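The split between the three conversion modules and one shared settings module can be pictured with a small sketch. Everything below is hypothetical: the class, field, and function names, and the assignment of options to formats, are assumptions made for illustration and are not taken from the actual product or from any ABBYY API.

```python
from dataclasses import dataclass, asdict


# Hypothetical settings object; field names are illustrative assumptions.
@dataclass
class ExporterSettings:
    keep_page_breaks: bool = True
    keep_line_breaks: bool = False
    keep_text_color: bool = True             # meaningful for Word output
    uncertain_words_as_images: bool = False  # meaningful for PDF output
    pdf_picture_dpi: int = 150               # reduced picture resolution for PDF


# One scanner-button module per target format, mirroring the text above.
MODULE_FORMATS = {"Scan2Word": "doc", "Scan2Excel": "xls", "Scan2PDF": "pdf"}

# Which settings apply to which output format (an assumption of this sketch).
FORMAT_OPTIONS = {
    "doc": ("keep_page_breaks", "keep_line_breaks", "keep_text_color"),
    "xls": (),
    "pdf": ("uncertain_words_as_images", "pdf_picture_dpi"),
}


def plan_export(module: str, settings: ExporterSettings) -> dict:
    """Return the output format and the subset of settings that applies to it."""
    fmt = MODULE_FORMATS[module]
    values = asdict(settings)
    options = {name: values[name] for name in FORMAT_OPTIONS[fmt]}
    return {"format": fmt, "options": options}


if __name__ == "__main__":
    print(plan_export("Scan2PDF", ExporterSettings(pdf_picture_dpi=100)))
```

Keeping the settings in one place while the modules stay single-purpose is one plausible way to reconcile one-button simplicity with configurability.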
The package offered a wide choice of recognition and interface languages. A user could select from 7 program interface languages: English, German, French, Russian, Spanish, Portuguese, or Italian. There were also tools for adding new interface languages. And, thanks to the built-in capabilities of ABBYY FineReader, the number of recognition languages was so large (177) that they had to be categorized into 5 groups for manageability.

ABBYY FineReader for ScanSnap!™ combines the simplicity of Fujitsu's one-step scanning approach with the power of ABBYY FineReader technology. ATAPY's expertise enabled these components to work in synergy for the best performance and user satisfaction.

ABBYY and ABBYY FormReader are registered trademarks of ABBYY Software House. All the other trademarks are the property of their respective owners.

ABBYY Europe Software House
Elsenheimerstrasse 49
80687 Munich, Germany
Tel. +49 89 511 159 0
Fax +49 89 511 159 59
www.abbyyeu.com
[email protected]

FineReader-based OCR module for Captiva InputAccel

Captiva Software Corporation, a standard setter in enterprise input solutions, chose ABBYY Software House OCR technologies for its flagship information input solution. Captiva InputAccel is used by hundreds of companies around the world, helping them to collect and integrate external information in their systems. InputAccel works around the globe and around the clock to transform the deluge of external data into usable, business-ready content, whatever its format or point of origin. Needless to say, accuracy in processing data streams that flow in and out 24/7 is vital to any paper-intensive company’s success.
To ensure high data capture accuracy, Captiva Software Corporation entrusted the creation of an OCR module for InputAccel to ATAPY, ABBYY's software development partner experienced in ABBYY FineReader-based solutions for companies of all sizes. A thorough study of InputAccel by specialists from ABBYY resulted in complete project documentation, which was passed to ATAPY for implementation. In addition, joint project management by ABBYY and ATAPY made it possible to improve the module's usability and productivity during development, without additional work or extended project timelines.

Feature highlights

The OCR module integrates into InputAccel workflows as configurable instances with user-defined settings. Settings are provided by FineReader Engine and cover pre-OCR optimization, recognition options, and output document formatting. Each instance can be inserted into the workflow to handle a task, be it processing a single page or digitizing large volumes of incoming documentation.

Captiva Software Corp. manufactures software products for document processing and data capture from paper and electronic documents and provides related services. In 2005 Captiva was acquired by EMC Software Group, a division of EMC Corporation (www.emc.com). This acquisition represented a natural extension to the EMC Documentum enterprise content management platform and added existing integrated technology to the EMC software portfolio.

ABBYY and ABBYY FineReader are registered trademarks of ABBYY Software House. All the other trademarks are the property of their respective owners.

EasyData and ATAPY Software for Seamless Data Capture

Through innovative use of ABBYY FineReader products and long-term cooperation with ATAPY Software, EasyData B.V. is becoming the leading supplier of customized OCR/ICR solutions in the Netherlands.

EasySeparate is the solution for today’s demand for converting and combining the output of an MFP. With EasySeparate it is possible to simplify the data input process by configuring a profile for mass input of each particular type of form. The configurable presets include:
- Combining or separating documents based on a barcode or text string/regular expression
- Blank page removal, a number of image processing options
- Indexing, adding a time and date stamp for extra traceability, collecting metadata in an XML ticket
- Setting a pre-defined output format, including file type and file name

EasySeparate supports integration with ABBYY FormReader, a powerful data capture product by ABBYY Software House that automatically reads, converts, and exports unstructured documents, such as invoices, to any external system. EasySeparate provides a variety of customization options by means of its Scripting module.

As a subcontractor to EasyData B.V., ATAPY has applied its experience in a number of projects for the Dutch IT industry. ATAPY solutions are now part of several efficient software products which streamline the data capture routines for end users in a number of vertical markets.

ATAPY develops a valuable add-on to EasyData's flagship product EasySeparate

ATAPY Software contributed to the development of EasySeparate by implementing Visioneer OneTouch® Link, a solution that integrates the product with the industry-leading Visioneer and XEROX scanners supporting the OneTouch® technology.
ATAPY's Visioneer-certified OneTouch Link works with the device driver, allowing users to enjoy quick and straightforward scanning combined with EasySeparate's outstanding data capture capabilities. When preparing to scan multiple documents of similar type or structure, a user selects one of their pre-configured EasySeparate profiles in the scanner interface settings and specifies EasySeparate as the destination. All the rest of the work, apart from pushing or clicking “Scan”, is done by EasySeparate, including intelligent barcode- and text-based document flow separation, sorting, blank page removal, OCR, document indexing, metadata processing, grouping, format conversion, and export. The solution provides a powerful add-on to EasySeparate, making specific and familiar EasySeparate processing choices immediately available to users.

“With EasySeparate we've been striving to bring a new degree of ease and transparency into the document management process of small to middle organizations, allowing them to increase operating efficiency and cut costs. This new integration takes us one step further in this direction; I'm sure it'll be demanded by our customers,” says Robert Camerlink, the EasyData B.V. CEO.

Participation in Development of the Scan2IT Component for Océ

Among EasyData's customers is Océ, one of the world leaders in hardware and software for document processing. Cooperation started with a relatively simple “Oce Document Interpreter” application, which detected separator pages within the incoming image stream and split the stream into multi-page documents. The stability, short development time, and low cost of this initial application led Océ to order additional features, such as barcode search and logging, empty page detection, and many others. Another application, titled FineRead, packs the power of ABBYY OCR technologies into a silent “image gobbler” that runs as a Windows NT service and monitors selected directories for new images.
Incoming images get recognized and exported according to a sophisticated set of instructions composed by the user. Merged together, these two applications formed the basis for Scan2IT (“Scan to Intelligent Text”), a complex image processing system now offered by Océ offices around the world.

About Visioneer: Visioneer (www.visioneer.com) is a world-class developer of intelligent imaging solutions that provide a faster and easier way to capture documents and photographs and integrate them with popular Windows and document imaging applications.

About Océ: Océ N.V. (www.oce.com) is a Netherlands-based company that manufactures and sells production printing and copying hardware and related software. Océ N.V. has been a listed company since 1958 and is the holding company for the international Océ Group. This group has operating companies in 25 industrialized countries.

EasyData B.V. (www.easydata.nl) is a Netherlands-based company specializing in data capture, document management solutions, and the associated consulting services. Serving the needs of the Dutch and Belgian SMB markets, EasyData’s innovative applications make the user experience more compelling by changing the way paper-intensive organizations manage their document flows.

ATAPY Software (www.atapy.com) is a provider of on-demand software solutions in the fields of OCR/ICR, document imaging, and data capture. In addition to its main activity, ATAPY has been taking part in various archive digitization and knowledge preservation endeavors by offering a range of media services (scanning, data capture, key-from-image, mark-up, verification, etc.) to libraries and data archives all over the world.

EasyData and EasySeparate are registered trademarks of EasyData B.V. ABBYY and ABBYY FormReader are registered trademarks of ABBYY Software House. All the other trademarks used are the property of their respective owners.

EasyData B.V.
Koninginnelaan, 16
7315 BS Apeldoorn, the Netherlands
Tel. +31 55 53 44 886
www.easydata.nl
[email protected]

A Networked Media Clipping System

About PRNet: PRNet is a Media Monitoring and Analysis company serving over 300 corporate clients in Turkey. The company acts as a strategic partner for communication specialists and executives who aim to develop corporate reputation and need to assess the results of their communication strategies. PRNet provides access to its online database, where customers can search among more than 25 thousand clips and 80 million results stored since 2000, survey 4,500 pages of newspapers and magazines, view videos of 74 TV channels recorded on a 24/7 basis, and access more than 1,000 Internet portals. According to ISO 500 research, 7 of the top 10 companies of Turkey, and 84 of the top 100, prefer PRNet for their media-monitoring and industrial information needs.

ABBYY, the ABBYY logo and ABBYY FineReader are registered trademarks of ABBYY Software House. PRNet and the PRNet logo are registered trademarks of PRNet.

For more than a century, daily, systematic analysis of printed media has been an important tool for successful businesses worldwide. Media clipping companies, using tools suited to the last century, provided the analysis that business demanded. All those years, the rustling of pages and the jingle of scissors were the constant audio background of media clipping companies' operations.

The arrival of the digital age healed the callused hands of operators. Fewer and fewer scissors were used as companies switched to scanning printed material. Paper no longer left the scanner room, and reading was done from computer monitors.
But overall processing of newspapers and magazines still required too much human input to automate, so the amount of labor spent by media clipping companies remained largely the same. Early-1990s OCR programs worked for letters and faxes, but turned out to be useless when confronted with the complex layout and font variety of newspapers.

In 1997, a Turkish media research company named PRNet approached ABBYY Software House, the manufacturer of FineReader OCR products, with a request to design a system that would streamline the media clipping process.

Dalian 1.0 went into operation in 1998, delivering subscribers a service previously unheard of. As early as nine in the morning, subscribers could log on to PRNet's web site, click on their own customized albums, and view a new page with clippings from that very day's morning newspapers. Only clippings containing a subscriber's keywords went to that subscriber's albums. Content was delivered as text and pictures in HTML format, with keywords highlighted, allowing the subscriber to copy and paste it into other software for distribution or editing. All major Turkish publications were covered (50 titles). The clippings were preserved in an MS SQL Server database for long-term storage and future reference. All this was achieved with an average staff presence of 14 operators, a fantastic efficiency compared to less sophisticated systems. The workload was largely shifted to unattended computers: OCR PCs had to be rack-mounted 10 units tall to fit into a single room, with one hot-switchable monitor for control.

When the new version of FineReader OCR came out, PRNet invited ABBYY to migrate Dalian to this new platform. Pursuant to new corporate outsourcing policies, ABBYY transferred the project to ATAPY Software, an IT development company specializing in custom OCR tools.
Besides migration, PRNet asked ATAPY to add web-based administration, system statistics and reports, a web client for extended media search, improved output for clippings, and many other features and enhancements.

The new Dalian 2.0 went into operation in 2003, providing media insights to about 80 clients, including the Turkish offices of Alcatel, Compaq, Toyota, Unilever, Vestel, CNN, Reebok, and Siemens, as well as such local giants as members of the Koç Group and the leading banks of Turkey. The dramatic improvement in recognition rate, the possibility of employing home-based operators working through web interfaces, and other serious advancements in system functionality and manageability place Dalian 2.0 in the top rank of modern media clipping software solutions.

PRNet
Spring Giz Plaza B Blok 17/18, Maslak
80670 Istanbul, Turkey
Tel. +90 212 328 18 09
Fax +90 212 328 18 07
www.prnet.com.tr
[email protected]

EasyData and ATAPY Software: Continued Partnership since 2001

EasyData B.V. is a Netherlands-based company specializing in data capture, document management solutions and the associated consulting services. Serving the demands of the Dutch and Belgian SMB market, EasyData makes the user experience more compelling by changing the way paper-intensive organizations manage their document flows. EasyData's flagship data capture product, EasySeparate, reduces the cycle times of mission-critical business transactions by providing quick and user-friendly document capture.
Merging together the exceptional power of ABBYY OCR technology and the experience of ATAPY engineers, EasyData provides Dutch companies with made-to-order yet affordable document management software applications.

Among EasyData's customers are such institutions as Amsterdam International Airport Schiphol, the largest Dutch hospital AMC, the Dutch Institute of War Documentation NIOD, the University of Nijmegen, the Technical University of Eindhoven, the University of Wageningen and others. All those, and many more, have benefited from EasyData's commitment to innovation and excellence. Some of the most notable projects completed with ATAPY participation are described below.

Data Capture Solutions for Logistics Industry

A cost-effective document management solution for Van Gend & Loos

Van Gend & Loos, one of the largest Dutch logistics and cargo companies, turned to EasyData for a solution to automate its document processing through OCR technology. The goal was to extract certain textual and graphical information from an incoming stream of printed forms. Having considered the prohibitive cost of purchasing and operating full-scale form-processing systems, EasyData and ATAPY offered and implemented a custom solution which works only with the documents received by Van Gend & Loos, but does so better and for a fraction of the cost of any ready-to-use product.

EasyData and ATAPY help to deliver frozen food to multinational customers

Frigolanda is a transport and logistics company specializing in the transport of refrigerated cargo, with offices in Belgium, the Netherlands, Germany, and India. In 2003, EasyData negotiated a contract with Frigolanda under which ATAPY designed a custom export automation module for ABBYY FormReader 6.0. Commercial order forms were fed to a scanner, and images were automatically read.
Then ATAPY's export module received recognized data from FormReader and converted it into custom-format files for further analysis and processing.

In 2004, the customer returned to EasyData with a request for further development of the program. The input documents were 2 pages double-sided; sometimes, due to scanning mistakes (double feeds, face-down feeds), the data was getting mixed up. The new version of the program watched the page order and corrected it when possible, or warned the user of a non-recoverable situation.

Both parts of the project were implemented quickly and to the full satisfaction of the customer. The program is currently working on the customer's site, backing up an important part of its business process.

Customizing ABBYY FineReader for processing construction shipping forms in Belgium

The Fernand Georges company (www.georges.be), a construction and industrial equipment dealer, was looking for a means to automate its document flow processing. The task was to detect specific spots on shipping forms, then read the data at these spots and export it for further processing. In addition, the forms had to be re-grouped into multi-page TIFF files, each file representing one form with its attachments.

EasyData and ATAPY proposed ABBYY FineReader 6.0 Engine as the OCR basis. This choice was additionally justified by the fact that the forms were being prepared with a matrix printer, for which FineReader provides a special recognition mode to increase OCR quality. The solution was successfully implemented and is now in heavy-duty production use at Fernand Georges.

About Van Gend & Loos: Van Gend & Loos was a Dutch distribution company. It was established in 1809 by the Antwerp-based innkeeper and carriage driver J.B. van Gend. It was sold to Deutsche Post in 1999.
The three subsidiaries of Deutsche Post (Danzas, DHL Worldwide Express and Van Gend & Loos) were merged to form DHL in 2003, ending the almost 200-year history of Van Gend & Loos.

About FRIGOLANDA: FRIGOLANDA (www.frigolanda.com) is a cold logistics group with a storage capacity of more than 75,000 pallet places (65,000 deep frozen and 10,000 chilled) in Europe. The company has offices in the Netherlands, Germany, Belgium and Poland, and a high quality distribution and transport fleet of 65 lorries. Every day, FRIGOLANDA drivers deliver to some 1,000 addresses throughout the Benelux and Germany, and to a growing number of addresses in France, Italy, Great Britain, Switzerland, Austria and Scandinavia.

EasyData and EasySeparate are registered trademarks of EasyData B.V. ABBYY, ABBYY FormReader and ABBYY FineReader Engine are registered trademarks of ABBYY Software House. All the other trademarks used are the property of their respective owners.

Automated Document Processing Tool for PPS PrePress Systeme GmbH

PPS PrePress Systeme GmbH is a digital paper solutions and media service provider headquartered in Germany. The company specializes in converting paper archives into digital form, allowing full-text information retrieval over decades of newspaper issues and other digitized data. PPS delivers high quality images and recognized text of newspaper pages, which it receives in paper form or on microfilm. For knowledge retrieval, PPS offers flexible and powerful tools from interface projects.

PPS contacted ATAPY Software for the creation of a custom document processing tool.
The goal was to automate four main tasks:

- accept and route images for subsequent OCR
- use ABBYY FineReader Scripting Edition to recognize the images
- export recognition results to a variety of document formats
- save resulting documents in user-defined output directories

The tool designed by ATAPY detected scanned documents in the user-defined input directories, sent them to working directories, and submitted them to ABBYY FineReader for recognition. Recognized documents were stored in the output directories, while problematic images went to special "error directories".

An important feature of the application was the ability to export each recognized document into several formats, so a user received multiple documents from a single processing pass.

Output formats: CSV, DOC, RTF, HTM, HTML, DBF, PDF, XLS, TXT

High flexibility and configurability were the key points of the solution designed and implemented by ATAPY. For each output format, the application allowed a user to set individual parameters, such as page size for RTF/DOC, picture resolution for PDF, codepage for HTML, etc. State-of-the-art OCR technology by ABBYY Software House combined with ATAPY's engineering expertise made this application fast, highly usable, and cost-effective.

PPS PrePress Systeme GmbH (www.prepress-systeme.de) has provided the publishing industry with state-of-the-art software solutions since 1992 and has offered newspaper archive digitization services since 1999. PPS PrePress Systeme GmbH offers a line of innovative search solutions, including the enterprise-class intelligent search system «inter:gator» and the semantic search engine «PPS Finder».

ABBYY, ABBYY FineReader and ABBYY FineReader Scripting Edition are registered trademarks of ABBYY Software House. All the other trademarks are the property of their respective owners.

PPS PrePress Systeme GmbH
Hohemarkstrasse 20
D-61440 Oberursel, Germany
Tel. +49 6171 7085725
www.prepress-systeme.de
[email protected]

Auto-Import Station for DM Dokumenten Management GmbH

DM Dokumenten Management GmbH is a Germany-based provider of end-to-end document management solutions. One of the company's clients is Wüstenrot-Gruppe, an Austrian company providing financial and real-estate services. Its two main subsidiaries are the Wüstenrot construction and the Wüstenrot insurance companies.

For this client DM developed and installed a mass document capture system. The system was based on ABBYY FormReader 6.0 Enterprise Edition, a distributed data capture product comprising a number of computer "stations" performing different tasks. Some stations are responsible for scanning, others for OCR, verification, data validation, system administration, and release.

As its input front-end, the system used industrial scanners that weren't directly compatible with the ABBYY FormReader 6.0 Enterprise Edition Scanning Station. Therefore, the images were captured using the software bundled with the scanners and stored as multi-page TIFF files. After that they needed to be imported automatically. Although the Scanning Station has an "import-from-folder" feature, its functionality did not satisfy the system's requirements.

In search of a solution, DM contacted ATAPY Software. Thanks to its profound knowledge of ABBYY products, ATAPY was able to write a new component for the system named Auto-Import Station (AIS). AIS replaced the Scanning Station and, as far as the rest of the system was concerned, imitated its behavior. The solution was successfully installed at Wüstenrot and demonstrated impressive performance.

DM Dokumenten Management GmbH (www.dokumenten-management.de) has been developing efficient solutions for document management and revision-safe archiving for over 15 years.
DM Dokumenten Management GmbH designed lobo dms, a completely new product generation that meets the latest requirements of the document management industry. DM customers include Aventis AG, Deutsche Post, BMW AG, Deutsche Börse, and Linde AG.

ABBYY and ABBYY FineReader are registered trademarks of ABBYY Software House. All the other trademarks are the property of their respective owners.

DM Dokumenten Management GmbH
Dornierstrasse 4
82178 Puchheim, Germany
Tel. +49 89 800 613 0
Fax +49 89 800 613 99
www.dokumenten-management.de
[email protected]

ATAPY installs ABBYY FormReader at Inmarko, Inc.

Novosibirsk-based Inmarko is the largest ice cream manufacturer in Russia, with an annual production volume of nearly 31,000 tonnes (2003). Today Inmarko, Inc. employs several thousand people and sells ice cream across the entire country, from street kiosks to international retail chains like Metro and Auchan.

Inmarko had developed a complicated procedure for processing product requests from retailers. Request forms were distributed to supply agents, who filled in the numbers coming from each retail point and submitted them to the Data Entry Department. Employees of the department manually entered the forms into the "1C:Enterprise"™ ERP system, from which purchase orders, delivery truck routes, and other documents were automatically generated and passed to the Logistics Department.

This procedure had one bottleneck: the form-keying operators could finish their work only late at night, which delayed and obstructed further business processes. The reason was the tremendous volume of information, as each form contained more than 1,000 fields filled with handprinted text. Another serious problem was entry mistakes.
Single typos were irritating, but much worse were situations when operators skipped or duplicated table columns, or even entire forms. As a result of such mistakes, some retailers received twice the quantity of each ice cream sort that they had actually ordered, while others received nothing.

In search of a solution, Inmarko's IT Department discovered FormReader by ABBYY Software House, a product for automatic input of data from printed and handprinted forms. As FormReader is a "box" product, no costly integration was needed; most of the installation and tune-up work was done in-house, with minimal intervention from ATAPY Software, the local ABBYY dealer. FormReader required no changes in application form design and no special staff training, so the costs of printing, distributing and collecting the forms remained the same. FormReader's hardware requirements were just as realistic: a regular flatbed scanner and a common office computer.

Once FormReader went into operation, work that previously required three typists now required just one operator, and even with that reduction, the form entry process was completed much earlier. Input productivity increased 6 times, and the company's entire distribution logistics improved considerably. Especially important was the fact that the number of mistakes decreased very significantly, with the most "disastrous" mistakes going away completely.

This successful experience has moved Inmarko to use FormReader for capturing other types of corporate documents. This new challenge has required no additional investment at all, as FormReader can be configured to process up to 99 document types in one batch, automatically telling one document type from another. Now the same operator, using the same hardware and software, processes Inmarko's documents of different types. A similar system was installed at Inmarko's plant in another city, and more installations are underway.
Besides Inmarko, the efficiency and stability of ABBYY FormReader are acknowledged by hundreds of users around the world, including:

- the Federal Tax Service of Russia (Personal Income Statements, Taxpayer Identification Number applications, and other tax forms)
- the Russian Ministry of Education (examination papers in the centralized all-Russian students' testing program)
- the Russian State Pension Fund (insurance application forms and premium reports)
- the Ministry of Rural Development of Malaysia (agriculture statistic reports)
- Adidas (retailer order sheets and questionnaires)
- Philip Morris (sweepstakes entry forms)
- Finansbank, Turkey (credit card application forms)
- Target Media/UK (checkbox questionnaires)
- Allianz Poistovna, Poland (vehicle insurance applications)

FormReader contains an application programming interface (API) for interaction with other applications. This makes it possible to use FormReader not only as a standalone application, but also to create entire production lines for mass input of documents. The three largest lines are installed at the Moscow Tax Inspectorate. In each of them, 1 computer is responsible for scanning, up to 10 computers provide automatic OCR, and another 10 computers allow operators to proofread recognition results and correct mistakes. Each line can process up to 3,500 pages of handprinted forms per hour.

[Figure: filled forms pass through scanning stations, recognition stations, and verification stations into the main database]

ASYS Softwareentwicklung GmbH (Germany) integrated ABBYY FormReader into SMARTscan, its workflow automation solution for pharmacies. With ABBYY FormReader, all the work associated with input and processing of handprinted medicine prescriptions is reduced to several clicks and can practically be done by a pharmacy salesperson while talking to the client.

ATAPY Software, a strategic partner of ABBYY Software House, specializes in programming tools and add-ons for ABBYY products, including FormReader.
ATAPY employs ABBYY-trained experts in the cutting-edge FlexiForm technology for processing document types traditionally considered "non-automatable", such as phone bills, invoices, job applicant resumes, library cards and many more. ATAPY can also integrate FormReader into any Enterprise Document Management System for the benefit of prospective customers and partners.

Inmarko, Inc. (www.inmarko.ru/en) is the number one company in the Russian ice cream market by output and sales volume. It has a domestic market share of over 16%. Established in 1993, Inmarko has its own factories and cold storage facilities in Omsk, Tula, and Novosibirsk. Today it employs a staff of over 5,000.

ABBYY and ABBYY FormReader are registered trademarks of ABBYY Software House. All the other trademarks are the property of their respective owners.

Inmarko, Inc.
630050 Elitnoe
Novosibirsk region, Russia
Tel. +7 3832 599 799
Fax +7 3832 48 13 12
www.inmarko.ru
[email protected]

EasyData and ATAPY: Document Imaging Tools

Since 2001, EasyData B.V. (www.easydata.nl) has been partnering with ATAPY Software to provide companies in Western Europe with high-quality data input solutions based on the OCR/ICR technology of their mutual partner ABBYY Software House. As a subcontractor to EasyData, ATAPY has completed more than 40 projects, each one solving a specific task not answered by the off-the-shelf data capture products available on the market.

"Russia's history is a history of great thinkers. Look at the chess players for example. ATAPY employs highly educated engineers gathered together in an efficient company. The result is quality tailor-made software at a competitive price. This partnership has always been a source of innovative solutions for us and our customers. ATAPY has also taken part in the development of our flagship product EasySeparate by implementing the Visioneer OneTouch® Link scanner integration component."
Robert Camerlink, CEO, EasyData B.V.

Document Processing Solutions for Education Industry

A forms processing solution for MEMIC

When MEMIC, the Center for Data and Information Management of the University of Maastricht, faced a huge number of questionnaire images that had to be picked up page by page from different catalogues, merged into multi-page TIFF documents and indexed, it relied on EasyData to deliver the right tool for the job. And just as before, EasyData knew the right people for it. The application developed by ATAPY has now been operating flawlessly at MEMIC for several years and has already processed tens of thousands of forms, meeting the customer's expectations.

EasyData and ATAPY streamline the document flow for Wageningen University

ATAPY's experience with barcode recognition came in useful when EasyData received a request from one of its clients for a program that generates PDF files from a flow of scanned document pages. Wageningen University used a variety of scanners to provide convenient access to academic materials.

ATAPY designed a program that launched the scanning process through its own interface, stored the images in a specific folder, and combined one-page images into multi-page PDF files. The program employed ABBYY FineReader 6.0 Engine to find barcodes and used the barcoded page as the cover page of the PDF. Then it appended the subsequent images as pages to the PDF until it encountered the next barcoded page. The program provided an efficient GUI for selecting a scanner, specifying the output resolution, setting other important parameters, and controlling the process.

The competency of ATAPY engineers and the power of ABBYY FineReader resulted in a reliable and stable application for converting large volumes of incoming material.
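The barcode-based separation described for Wageningen University can be illustrated with a minimal Python sketch. It is not ATAPY's implementation and does not call ABBYY FineReader Engine: barcode detection is assumed to have already happened, and each page arrives as a hypothetical pair of file name and barcode value (or None). The function name and the rule that leading pages without a barcode form their own document are assumptions made for illustration.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical page record: (image file name, barcode value or None).
# In the real system the barcode would come from the OCR engine.
Page = Tuple[str, Optional[str]]


def group_by_cover_page(pages: Iterable[Page]) -> List[List[str]]:
    """Split a flat stream of scanned pages into documents.

    A page carrying a barcode starts a new document and becomes its
    cover page; pages without a barcode are appended to the current
    document. Assumption: pages seen before the first barcoded page
    are collected into a document of their own.
    """
    documents: List[List[str]] = []
    for file_name, barcode in pages:
        if barcode is not None or not documents:
            documents.append([file_name])      # start a new document
        else:
            documents[-1].append(file_name)    # continue the current one
    return documents


scanned = [
    ("p1.tif", "A1"), ("p2.tif", None), ("p3.tif", None),
    ("p4.tif", "B7"), ("p5.tif", None),
]
grouped = group_by_cover_page(scanned)
```

Each inner list would then be written out as one multi-page PDF, with the first entry as its cover page.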
EasyData and ATAPY Deliver an Image Cleaning Solution for Legal Industry

"Raad voor Rechtsbijstand" (www.rvr.org), a Dutch legal agency, faced an unexpected problem while attempting to convert a large archive of various legal records to digital format. Many documents were printed on intensively colored paper. The same background that made the paper documents look nice and distinct turned up on black-and-white scans as heavy jitter, inflating the file sizes and making the scans difficult to read and impossible to OCR.

The agency sought professional advice from EasyData B.V. Having analyzed the issue, EasyData concluded that no ready-made solution, such as the despeckling facilities of modern OCR packages, was able to produce acceptable results, as the jitter was significantly heavier than what those tools were capable of removing. EasyData outsourced the problem to ATAPY Software for more thorough research.

ATAPY engineers designed and implemented a custom algorithm that approached the task more intelligently, taking into account not only the linear characteristics of each dot cluster but also its context (the characteristics of the neighboring clusters). The result was a tool that produced nearly clean images taking up to 10 times less disk space.

[Figure: sample image before cleaning (OCR rate = 1.9%) and after cleaning (OCR rate = 98.9%)]

EasyData and ATAPY contribute to the IT program of the Dutch Government

The Dutch Ministry of Justice carried out an assessment program for the living conditions at the correctional institutions in the Netherlands. For this purpose, it distributed and collected large volumes of multi-page questionnaires, reading them automatically with ABBYY FormReader. As the proprietary format of the statistics-and-reporting system did not allow direct export, the Ministry turned to ABBYY's most experienced integrator in the country for a solution. EasyData, together with ATAPY, designed a custom export module accessible from the ABBYY FormReader toolbar.
The module saved recognition results into a special intermediate file format importable into the target system.

About Wageningen University: Wageningen University and Research Centre (www.wur.nl) is a research and higher education institution which trains specialists (BSc, MSc and PhD) in life sciences.

About the Dutch Ministry of Justice: The Dutch Ministry of Justice (english.justitie.nl) sees its mission as maintaining order in Dutch society, while ensuring that justice, safety and unity come first. The Ministry employs almost 30,000 civil servants; its main office is located in The Hague.

ATAPY Software Participates in Development of International Computer Dictionaries

"ATAPY reached 99.992% text accuracy in the German-Russian Dictionary (1 mistake per 8,760 symbols), and 99.997% quality for the Spanish-Russian Dictionary project (1 mistake per 31,500 symbols). They also corrected many mistakes in the source dictionary text, including typographical misprints and even mistakes in special marks that are almost impossible to detect without special programming tools and profound knowledge of linguistics."
Anna Zhavoronkova, Project Manager, ABBYY Software House

Electronic dictionaries and translation systems are an area of great practical importance in the ever-globalizing world. ABBYY Software House, a world leader in OCR/ICR and linguistic technologies, develops and sells Lingvo electronic dictionaries.
For many years Lingvo has been known as the best English-Russian dictionary on the market. In version 8.0, ABBYY planned to add 3 more languages; to introduce them to Lingvo, it needed to digitize the world's latest best-of-breed dictionaries reflecting the modern state of the new languages to be supported.

The ABBYY Lingvo 8.0 product line includes ABBYY Lingvo 8.0 Multilingual Edition, ABBYY Lingvo 8.0 for Pocket PC, as well as an updated and expanded version of ABBYY Lingvo English-Russian Edition. ABBYY Lingvo 8.0 Multilingual Edition supports eight translation directions: English-Russian, German-Russian, French-Russian, Italian-Russian, Russian-English, Russian-German, Russian-French, and Russian-Italian. This edition of ABBYY Lingvo includes more than 40 dictionaries containing more than 2,400,000 entries.

ABBYY turned to ATAPY Software, its outsourcing partner in Novosibirsk, for digital conversion of two dictionaries from the list picked out by the Linguistics Department. The 3-volume, 1750-page Leping German-Russian Dictionary and the 830-page Narumov Spanish-Russian Dictionary were to be recognized and proofread for subsequent automatic conversion into the ABBYY Lingvo database.

The highest possible text recognition accuracy was obviously a must. A single mistake could break the words' alphabetical order and tear a word away from its paradigm. If the number of such mistakes rose above even a very modest threshold, the dictionary would become unsearchable.

Adequate interpretation of special dictionary marks was no less vital for the project. They were used as field delimiters in the automatic database conversion process and had to be recognized 100% accurately. Special marks appeared either as text characteristics (bold/italics), or as special symbols (brackets, asterisks), or as a combination of the two (e.g., italics + brackets indicated a dictionary comment). Omitting a single bracket or missing italicization would break the article's structure.
This is why the project required both intelligent programming and highly qualified manual effort - a true challenge for any contractor in the media service area.

The dictionaries were scanned and automatically recognized with ABBYY FineReader specially tuned for processing this material. Then a team of qualified operators proofread and cross-checked the results using the double verification technique to ensure recognition accuracy. Double verification made it possible to detect certain unexpected cases, such as typos in the source dictionary text, which were corrected according to ABBYY's guidelines.

In its effort to automate the proofreading work as far as possible, ATAPY developed and customized a number of in-house utilities. One of them was Glyphica, a tool for quick input of characters that cannot be found on the keyboard. For the Leping Dictionary, ATAPY developed a custom converter with built-in spellchecking and punctuation-checking utilities, which made it possible to weed out mistakes missed during the previous stages and finally convert the material into the Lingvo vocabulary database.

ABBYY Software House (www.abbyy.com) is based in Moscow, Russia. The company was founded in 1989. Today ABBYY has over 880 employees worldwide, including offices in Russia, USA, Ukraine, UK, Germany, Taiwan, Japan and Cyprus. ABBYY develops software products in the fields of artificial intelligence, document recognition, data capture and applied linguistics. ABBYY is most notable for its optical character recognition package ABBYY FineReader.

AutoStore is a registered trademark of Notable Solutions, Inc. All the other trademarks are the property of their respective owners.

ABBYY Software House
P.O. Box #20, Moscow, Russia, 127273
Tel. +7 (495) 783 3700
Fax +7 095 783 2663
www.abbyy.com
[email protected]

Meeting the Challenge of Time

ATAPY Software participates in the development of ABBYY FineReader XIX - an OCR system for reading old European books

"I've got FineReader 7.0 installed here on my computer. The Frakturschrift recognition is very good. Even though old text recognition is not a large and growing market, I am sure all the service bureaus here in Germany will be ordering 1 or 2 copies and have it run 7x24."
Johannes Stöpetie, CEO, ABBYY Europe GmbH

Meta-E (http://meta-e.uibk.ac.at) is a collaborative initiative undertaken by a consortium of 14 universities from 7 European countries and the US, co-funded by the European Union. The project is focused on providing a technology basis for digitization and web publishing of valuable old printed sources spanning several centuries of European history. For this purpose, an OCR system was required that was capable of recognizing historical texts from the period 1800-1938, including those printed in Frakturschrift (an old-style black-letter typeface prevalent at that time). At that point no omnifont Frakturschrift systems were available: all OCR products had to be trained on each individual book before processing it.

Meta-E coordinators started looking for a high-quality OCR package to be augmented according to their requirements. ABBYY FineReader was chosen due to its unrivalled recognition accuracy, support for 176 modern languages, and user-friendliness. ABBYY Software House, the international manufacturer of the FineReader product line, took up the project as a direct contractor to carry out the development of the omnifont part (introducing the Frakturschrift graphics to FineReader). The linguistic part of the project was subcontracted to ATAPY Software, ABBYY's long-term partner in OCR and computer linguistics development.
Based on FineReader 7.0

ATAPY's role in the Meta-E project was constructing Old Language Models (LMs) for 5 European languages: English, French, German, Italian, and Spanish. An LM is a computer database that describes the vocabulary of a language. FineReader uses LMs during recognition for building OCR hypotheses and for spellchecking. LMs are not just full lists of words in all possible grammatical forms: such a database would be enormous in size and hardly manageable. FineReader LMs store only the stem of each word and describe the grammar as a set of inflection rules (paradigms). Each stem is assigned a list of paradigms; applying them to the stem produces all possible forms of the word.

ATAPY was to study a large number of authentic dictionaries and original old European texts dating back to the targeted time span, review the word stock, add the words that had been phased out of the languages, and correct the paradigm assignments to synchronize the LMs with the actual grammatical practice of that time.

To complete this task, ATAPY's linguists carefully selected 10 dictionaries reflecting the state of the 5 languages, published between 1808 and 1930. ATAPY also thoroughly analyzed 105 authentic books of that period, comprising more than 50 MB of text.

The next step was to build the FineReader LMs. ATAPY's linguists manually compared the information from authentic dictionaries and texts - about 500,000 entries in total - to the existing FineReader vocabularies. This work turned up a total of 458,767 words, of which 61% remained unchanged and 36% were added to the vocabularies from the analyzed sources. About 3% of the words had their paradigms corrected towards the XVIII-early XX century grammar rules. To carry out this correction, the linguists had to add 159 historic grammar paradigms that were missing from the contemporary models. Finally, the LMs were compiled and tested on the control text corpus.
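The stem-plus-paradigm idea, and the notion of testing a compiled model against a text corpus, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the real FineReader LM format is not described in this document, the paradigm names, endings and sample words below are invented, and vocabulary coverage is assumed here to mean the share of corpus tokens found in the expanded vocabulary.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical paradigms: a name mapped to the endings appended to a stem.
PARADIGMS: Dict[str, List[str]] = {
    "noun_s": ["", "s"],
    "verb_regular": ["", "s", "ed", "ing"],
}

# Hypothetical lexicon: each stem is assigned a list of paradigm names.
LEXICON: Dict[str, List[str]] = {
    "book": ["noun_s"],
    "walk": ["verb_regular", "noun_s"],
}


def expand(stem: str, paradigm_names: Iterable[str]) -> Set[str]:
    """Apply every assigned paradigm to the stem to get all word forms."""
    return {stem + ending
            for name in paradigm_names
            for ending in PARADIGMS[name]}


def build_vocabulary(lexicon: Dict[str, List[str]]) -> Set[str]:
    """Expand the whole lexicon into the full set of word forms."""
    forms: Set[str] = set()
    for stem, names in lexicon.items():
        forms |= expand(stem, names)
    return forms


def coverage(tokens: List[str], vocabulary: Set[str]) -> float:
    """Share of corpus tokens present in the vocabulary (assumed metric)."""
    if not tokens:
        return 0.0
    return sum(token in vocabulary for token in tokens) / len(tokens)


vocabulary = build_vocabulary(LEXICON)
sample_coverage = coverage(["walks", "books", "ran", "walk"], vocabulary)
```

Storing two stems and two paradigms here yields six word forms; the saving grows quickly for highly inflected languages, which is the motivation given above for not storing full word lists.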
The compiled LMs showed 98.91% vocabulary coverage for Old English, 99.16% for Old French, 96.58% for Old German, 98.58% for Old Italian, and 98.79% for Old Spanish.

To illustrate the above, let's look at a few samples. A regular FineReader package, or any other contemporary OCR system, will make a lot of mistakes here. For example, "Alterthumskunde" may become "Allerlhumskunde" in the first fragment; in the second fragment, "UEBERSICHT" ("Übersicht" in modern German) gets recognized as two words, "UEBER SICHT", etc.

These mistakes occur because of two factors. The first is the low printing quality, about which nothing can be done. The second is the old spelling used in the incorrectly recognized words. All existing OCR systems are targeted at modern texts and therefore only know modern spelling.

Once the five LMs were merged into the FineReader shell, ABBYY was able to offer a special version of FineReader that knows the spelling specifics of old European languages. This version has a much lower chance of making mistakes in places similar to those shown above. In effect, users will be able to OCR old texts with higher quality, saving much of the time that previously had to be spent on error correction.

The special version of ABBYY FineReader, officially released by ABBYY under the name ABBYY FineReader XIX (http://www.frakturschrift.com), became a powerful tool assisting the Meta-E consortium in its large-scale digitization work. The product is the industry's first box OCR product to recognize Renaissance and Late Medieval sources, a product specially targeted at European libraries and public organizations engaged in preservation and publishing of cultural assets, and at service bureaus helping them fulfill this mission.

ABBYY Europe GmbH is a European department of ABBYY Software House based in Munich, Germany.
ABBYY Software House is a manufacturer of software products in the fields of artificial intelligence, document recognition and applied linguistics. One of the most notable products by ABBYY Software House is the optical character recognition package ABBYY FineReader.

ABBYY, ABBYY FineReader and FineReader XIX are registered trademarks of ABBYY Software House. All the other trademarks are the property of their respective owners.

ABBYY Europe Software House
Elsenheimerstrasse 49
80687 Munich, Germany
Tel. +49 89 511 159 0
Fax +49 89 511 159 59
www.abbyyeu.com
[email protected]

International media analysis company serves its clients using a suite of PDF tools by ATAPY Software

Presse+, a large media monitoring and research company (500 clients among 800 top French businesses and administrations, over 2,000,000 users daily), dealt with incoming printed media sources, audio and video materials, and digital media to produce synthetic summary documents. To secure its annual growth of 25% and to be able to process over 7,000 media sources 24/7, Presse+ had to constantly adopt innovative technologies and perfect its business procedures. This explains why the company had been investing heavily in the electronic support of its production chain.

The summary documents were delivered in PDF format. Accuracy, efficiency, and speed of PDF generation were crucial for the company's success. However, the software packages available on the market failed to meet the particular needs of Presse+. This is why Presse+ turned to ATAPY Software in search of a solution.

ATAPY's experience with OCR and image processing allowed Presse+ to fill the gaps left by off-the-shelf products. ATAPY developed a suite of customized, reliable, and fast tools built around ABBYY FineReader OCR technology.
The highlights included:

- batch processing for quick and convenient conversion
- integration with the existing technology
- reduced software licensing and maintenance costs; minimal training required
- non-stop processing of input images in multiple formats containing graphics and text, in many languages
- export of recognized data to PDF with user-defined keywords highlighted
- recognition of "image only" PDF as a set of images
- decreasing the sizes of the PDF files by compressing the illustrations stored inside

The result was an effective customized solution that eliminated the need for expensive generic products. Through an innovative approach, ATAPY engineers built a robust and scalable system which met the high quality standards of Presse+.

Presse+ is a leading French provider of Media Monitoring (Press, Broadcast, Internet, News Wires, etc.) and International Press Analysis Services (covering major European, US, and Asian publications). In 2005 the company was acquired by TNS Media Intelligence (www.tns-mi.com), a company of the TNS Media Group. The acquisition gave a considerable boost to TNS's news monitoring service in France.

ABBYY and ABBYY FineReader are registered trademarks of ABBYY Software House. All the other trademarks are the property of their respective owners.