OCR SYSTEM WITH PATTERN, COLOR RECOGNITION AND TRANSLATION FUNCTION
HO CHUN KIT (860523-56-6709, born 23 MAY 1986)
Faculty of Electrical Engineering, Universiti Teknologi Malaysia
Academic session: 2009/2010
Supervisor: Assoc. Prof. Dr. Yahaya Md Sam
Date: 20 APRIL 2010

"I hereby declare that I have read this thesis and in my opinion this thesis is sufficient in terms of scope and quality for the award of the degree of Bachelor of Electrical Engineering (Control and Instrumentation)."

Signature : ....................................................
Name of Supervisor : Assoc. Prof. Dr. Yahaya Md Sam
Date : 20 APRIL 2010

OCR SYSTEM WITH PATTERN, COLOR RECOGNITION AND TRANSLATION FUNCTION

HO CHUN KIT

A thesis submitted in fulfillment of the requirements for the award of the degree of Bachelor of Electrical Engineering (Control and Instrumentation)

Faculty of Electrical Engineering
Universiti Teknologi Malaysia

APRIL 2010

I declare that this thesis entitled "OCR System with Pattern, Color Recognition and Translation Function" is the result of my own research except as cited in the references. The thesis has not been accepted for any degree and is not concurrently submitted in candidature of any other degree.

Signature : ....................................................
Name : HO CHUN KIT
Date : 20 APRIL 2010

To my beloved family and supervisor.

ACKNOWLEDGEMENT

I am grateful to many people for their assistance with the preparation of this work: first, to my supervisor, Assoc. Prof. Dr. Yahaya Md Sam, for his helpful guidance, encouragement and advice. I would also like to thank all of my friends, who gave me their support and opinions; without them, I would not have been able to finish this project so quickly. My training company, TT VISION SDN BHD, inspired me to choose this title.
Thank you for the guidance and valuable experience given to me throughout my training. Thanks also to my family, who have given me support, love and encouragement all this time. To one and all, I extend my appreciation and thanks.

ABSTRACT

Optical character recognition (OCR) is mainly concerned with pattern recognition, artificial intelligence and machine vision, and OCR techniques now also include image processing. OCR knowledge is becoming popular because of its market value in machine vision inspection, character scanning and recognition, and pattern recognition: OCR can greatly decrease error probability and human workload. The construction and operation of this system are based on industrial OCR and IC chip inspection systems, while the translation function of the software is based on OCR character-recognition software. This project develops an intelligent OCR system that can be used for material and character recognition, for inspection, and for language-learning purposes. In addition, it is equipped with an easy-to-use teach function that performs instant image recognition and memorization, so that the system becomes more accurate as it learns.

ABSTRAK

OCR stands for Optical Character Recognition. OCR is generally concerned with pattern recognition, artificial intelligence and machine vision. OCR techniques now also cover image-processing techniques. Nowadays, OCR knowledge is becoming increasingly important because it is very valuable in machine vision inspection systems and in character and pattern recognition. OCR can reduce the probability of error and lighten human workloads. This report concerns the construction of an intelligent OCR (Optical Character Recognition) system for use in various fields. The construction of this system is based on industrial OCR and on systems for detecting integrated-circuit defects during processing. The language-translation software component, in turn, is based on OCR language- and character-recognition software. The aim of this project is to build an intelligent OCR system for detecting product defects and for language learning. In addition, the system contains an easy-to-use recognition function that recognizes and memorizes images instantly, so that the system becomes increasingly 'intelligent' after 'learning'.

TABLE OF CONTENTS

CHAPTER TITLE

DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES

1 INTRODUCTION
  1.1 OBJECTIVES
  1.2 SCOPE
  1.3 PROBLEM STATEMENT
  1.4 BACKGROUND OF THE STUDY
  1.5 COMPARISONS WITH PREVIOUS THESES OF UTM STUDENTS

2 LITERATURE REVIEW
  2.1 METHOD AND TECHNIQUES
    2.1.1 MODERN APPROACHES
    2.1.2 TECHNIQUES
      2.1.2.1 AID OF PROCESSING TOOLS
      2.1.2.2 ACCESS THE IMAGE AND PROCESS THE PIXELS
  2.2 DEVICES
    2.2.1 RS232 DB9 AS SERIAL COMMUNICATOR
    2.2.2 PIC18F452
    2.2.3 SENSOR AND COMPARATOR
    2.2.4 MOTOR DRIVER
  2.3 SOFTWARE
    2.3.1 MICROSOFT VISUAL BASIC 2005
    2.3.2 HARDWARE PROGRAMMING (mikroC)

3 METHODOLOGY
  3.1 FLOW OF WORK
  3.2 EQUIPMENT
    3.2.1 SOFTWARE
    3.2.2 HARDWARE
  3.3 PROGRAM DESIGN
    3.3.1 FLOW CHARTS
    3.3.2 THRESHOLD
    3.3.3 RECOGNITION ALGORITHM
    3.3.4 INTELLIGENT TEACH FUNCTION
    3.3.5 SHAPE RECOGNITION
    3.3.6 COLOR RECOGNITION
    3.3.7 DIFFERENT CHARACTER SIZE RECOGNITION
    3.3.8 CHINESE RECOGNITION
    3.3.9 JAPANESE RECOGNITION AND PRONUNCIATION GUIDES
    3.3.10 SERIAL COMMUNICATION
  3.4 HARDWARE AND CIRCUIT DESIGN
    3.4.1 LIGHTING
    3.4.2 REJECTER DESIGN
    3.4.3 ITEM SLOTS

4 PROJECT IMPLEMENTATION
  4.1 OPERATION DESCRIPTION
  4.2 SYSTEM IMPLEMENTATION AND RESULT
    4.2.1 SOFTWARE IMPLEMENTATION
      4.2.1.1 CHARACTER RECOGNITION AND INSPECTION
      4.2.1.2 SHAPE RECOGNITION AND PCB INSPECTION
      4.2.1.3 COLOR RECOGNITION AND INSPECTION
      4.2.1.4
CHINESE RECOGNITION AND TRANSLATION
      4.2.1.5 JAPANESE RECOGNITION AND PRONUNCIATION GUIDES
      4.2.1.6 DIFFERENT CHARACTER SIZE RECOGNITION
    4.2.2 HARDWARE IMPLEMENTATION
  4.3 RESULT ANALYSIS

5 RECOMMENDATION AND CONCLUSION
  5.1 CONCLUSION
  5.2 FUTURE WORK RECOMMENDATION

REFERENCES
APPENDICES A-D

LIST OF TABLES

TABLE NO. TITLE
3.1 List of components
3.2 Calculation of the distribution of black pixels for several characters

LIST OF FIGURES

FIGURE NO. TITLE
1.1 Basic flow of the project
1.2 Pattern and color inspection
1.3 A demo version of an OCR tool
1.4 Two freeware examples of OCR tools written in VB6 to let people get basic programming concepts
1.5 Industrial OCR tool used for IC inspection, from TT VISION SDN BHD
2.1 Example of image threshold from MATLAB
2.2 Distinguishing two characters from each other
2.3 Color specification of the RGB color cube
2.4 TT VISION epoxy inspection based on segmentation and blob discovery concepts
2.5 The number '0' in pixels
2.6 A pattern of pixel distribution for the number '0'
2.7 Fuzzy output of the number recognition (SOP is sum of pixels; TERM is width)
2.8 RS232 DB9 pinout
2.9 Connection between PIC and RS232 DB9
2.10 Sample program using VB6 to open a COM port
2.11 PIC18F452 pin diagram from Microchip Technology Inc.
2.12 IR sensors and operation
2.13 Pin diagram of LM324
2.14 IR sensor working with comparator
2.15 L293D pin diagram
3.1 Project development flow
3.2 Circuit diagram and circuit built
3.3 Hardware and items: a) 3D drawing of the system b) inspection item models
3.4 Characters inspection program flow chart
3.5 Pattern inspection program flow chart
3.6 Colors inspection program flow chart
3.7 Chinese and Japanese recognition program flow chart
3.8 Flow chart of automatic lighting threshold adjustment
3.9 Image threshold process
3.10 The lighting setting and auto-calibrate function of the software
3.11 Sample 'A' captured under certain lighting conditions
3.12 Distribution of pixels
3.13 Image under different lighting threshold values
3.14 Detecting and processing multiple characters; boundaries formed in sequence a to z and 1 to 2
3.15 Pixel manipulation: (a) forming array (b) pixel detection
3.16 Teach function input box
3.17 Shape inspection selections
3.18 Color selection (red rectangle)
3.19 Percentage color inspection and threshold selections
3.20 Characters of different sizes
3.21 Chinese character recognition and translation interface
3.22 Set of hiragana
3.23 Set of katakana with some modern digraph additions with diacritics
3.24 Japanese recognition interface
3.25 Japanese recognition interface with combination of characters
3.26 Port setting dialog box
3.27 Camera and lighting positioning: a) camera distance fixing b) lighting setup
3.28 Rejecter design and position; red arrow shows the movement path of the rejecter
3.29 Rejecter should be at proper height: a) view from behind b) side view
3.30 Item slots: a) top view b) side view c) bottom view
3.31 Motor for item slots
3.32 First trial (detecting the black strips)
3.33 IR sensor functioning by reflection
3.34 IR sensor detecting the slit
3.35 Shielded IR sensor
3.36 Slits on item slots
4.1 Location of the exe file
4.2 Character recognition and inspection interface
4.3 Port setting user interface
4.4 Image preview from camera
4.5 NOT PASS code generated
4.6 Teach function of the system
4.7 OCR result after teaching the system
4.8 Database in a text file for system recognition
4.9 Shape recognition interface
4.10 Threshold image of a simple PCB pattern
4.11 Marked selection region
4.12 Result of shape inspection
4.13 Color recognition interface
4.14 Regional color inspection mode
4.15 Regional color inspection for PASS item
4.16 Regional color inspection for NOT PASS item
4.17 Percentage color inspection mode
4.18 Improved setting to filter out colors
4.19 Color percentage inspection is independent of the item's pattern
4.20 Item that does not pass the test because of over-percentage
4.21 Chinese recognition interface
4.22 Chinese recognition result
4.23 Chinese recognition result turned into editable text
4.24 Chinese recognition result can be exported
4.25 Chinese recognition teach mode
4.26 Keying in the character with Chinese software in the text box
4.27 New recognition result
4.28 Japanese recognition result
4.29 Japanese recognition result for combination of pronunciation
4.30 Larger-size character recognition
4.31 Smaller-size character recognition
4.32 Overall system hardware
4.33 a) Inspection items position b) System scans through each item
4.34 System flow: a) Rejecter rejects the item b) Process continues until the last item
4.35 System resets to initial position automatically after finishing
5.1 Use a hand phone to translate any language

LIST OF APPENDICES

APPENDIX TITLE
A 2D MECHANICAL DRAWING (BODY TOP VIEW)
B 2D MECHANICAL DRAWING (SLOTS TOP VIEW)
C 2D MECHANICAL DRAWING (OVERALL TOP VIEW)
D MICE EXHIBITION CERTIFICATE
E PIC18F452 CODING

CHAPTER 1

INTRODUCTION

This project can perform mark or character recognition (communicating with hardware to do inspection, such as IC mark inspection), shape recognition (such as PCB inspection), color recognition (such as color inspection), and Chinese recognition with translation to English. As an extra function, the system is also able to perform Japanese recognition with a pronunciation guide.
OCR can still be considered a new industrial field, normally applied in building machine vision systems for various manufacturing applications. This kind of industrial software is normally kept proprietary. Generally, this project studies OCR and the application of this knowledge for different purposes: OCR of numbers and alphabets (applied in IC mark inspection), color recognition (applied in organic material inspection), Chinese and Japanese character recognition, teaching, conversion to editable text, and translation. The following parts of this paper describe in more detail how this system can be created at the lowest cost with easy-to-understand algorithms based on pixel RGB manipulation, pixel division and array matching.

The basic flow and concept of this project is as shown in Figure 1.1. First, an image is acquired under proper lighting conditions and processed; the result is then analyzed by the computer. Next, the computer communicates with the hardware to take further action, for example item rejection.

Figure 1.1: Basic flow of the project

1.1 OBJECTIVES

The aims of building the OCR system are:
i. To create an OCR system that can react or give a signal according to the recognition result (alphabets and words).
ii. To create an OCR system with pattern and color recognition.
iii. To create an OCR system that translates Chinese characters to English.

1.2 SCOPE

i. Create an OCR system to do mark, character or word inspection.
   a. Identify the word/alphabet (specific font) in pictures.
   b. Differentiate it from the desired word/alphabet.
ii. Create an OCR system to do pattern inspection.
   a. Identify and differentiate simple patterns on a PCB or IC lead.
iii. Create an OCR system to do color inspection.
   a. Identify and differentiate colors (3 colors).
iv. Create an OCR system to translate Chinese characters to English.
   a. Users can teach the system.
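The basic flow described above (acquire image, process it, analyze the result, then signal the hardware) can be sketched in outline. This is an illustrative sketch only: the thesis's real implementation is in Visual Basic 2005, and every function name and value below is hypothetical.

```python
# Illustrative sketch of the inspection loop described above. All names and
# values are hypothetical; the thesis implements this in Visual Basic 2005.

def acquire_image():
    # Placeholder: would grab a frame from the webcam under fixed lighting.
    return [[200, 40], [35, 210]]          # tiny grayscale "image"

def to_binary(image, threshold=128):
    # Image processing step: threshold grayscale pixels to black (1) / white (0).
    return [[1 if p < threshold else 0 for p in row] for row in image]

def recognize(binary):
    # Placeholder for the recognition step: here, just count dark pixels.
    dark = sum(sum(row) for row in binary)
    return "PASS" if dark >= 2 else "NOT PASS"

def signal_hardware(result):
    # In the real system a byte like this would go out over RS232 to the PIC.
    return b"P" if result == "PASS" else b"N"

result = recognize(to_binary(acquire_image()))
command = signal_hardware(result)
print(result, command)   # PASS b'P'
```

The point of the sketch is the separation of stages: each stage can be replaced (e.g. a better recognizer) without touching the acquisition or hardware-signaling code.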
1.3 PROBLEM STATEMENT

OCR technology is not solely about transforming scanned text into computer text; it is becoming important to the electronics production industries in vision inspection systems. It would be good to develop a vision system that can do OCR, pattern recognition, and even color recognition in one package. It is also essential for the system to interact with the computer and the user, reacting and alerting the user once recognition results are available, to minimize the work of the operators. Inspection by operators is usually slow and tiring, has a high probability of error, and incurs the cost of the operators' salaries.

Machine inspection is still a new industry, and more exploration of this field is required. Many inspection machines still use color sensors. Color inspection is normally applied in the inspection or sorting of organic substances such as tobacco leaves. Color inspection using color sensors of limited specification is not flexible: additional costs have to be paid for any change in the colors or the number of colors. If the OCR system can recognize colors as well, this cost can be saved. Besides inspection, OCR can also be applied in translating Chinese or other characters into English.

Figure 1.2: Pattern and color inspection

1.4 BACKGROUND OF THE STUDY

OCR techniques have been under development since 1929 in Germany. The first OCR concept used a photodetector: when a lined character passes in front of the photodetectors, no light reaches them, and the pattern of light reflection then signals the computer to continue the analysis. An early commercial use of OCR was by the Reader's Digest Company, to avoid the time spent retyping text. The second system was sold to the Standard Oil Company of California for reading credit card imprints for billing purposes, with many more systems sold to other oil companies.
Other systems sold by IMR during the late 1950s included a bill stub reader for the Ohio Bell Telephone Company and a page scanner for the United States Air Force for reading and transmitting typewritten messages by teletype. IBM and others were later licensed on Shepard's OCR patents. [1] In America, the OCR system has also been applied to mail sorting. [2]

OCR is applied in fields such as barcode reading and text reading for the blind. OCR programs have been sold since 1979, mainly for converting printed text to computer text. Nowadays, OCR technology keeps improving, approaching 99% accuracy. There are many ways to write an OCR program, such as using neural networks, C++, C#, VB and so on; each of these languages performs the task in much the same way. VB is widely used in industry today because of its cost and ease of learning.

OCR is applied to many advanced scanning purposes by companies in Malaysia, for example TT VISION and VITROX in Penang. The vision industry is still new in Malaysia. OCR and pattern recognition are applied in IC chip inspection systems to identify defects in the mark, leads, thickness and labels. These companies use high-resolution cameras, such as Panasonic industrial cameras, and various types of LED lighting systems to get a good image for OCR and pattern recognition. Because an OCR system is highly repeatable, the cost of hiring operators to examine the defects can be cut.
The probability of making errors will also be minimized. The common methods of machine vision are:

a. Pixel counting: counts the number of light or dark pixels
b. Threshold: converts an image with gray tones to simply black and white
c. Segmentation: used to locate and/or count parts
   i. Blob discovery and manipulation: inspecting an image for discrete blobs of connected pixels (e.g. a black hole in a grey object) as image landmarks. These blobs frequently represent optical targets for machining, robotic capture, or manufacturing failure.
   ii. Recognition-by-components: extracting geons from visual input
   iii. Robust pattern recognition: locating an object that may be rotated, partially hidden by another object, or varying in size
d. Barcode reading: decoding of 1D and 2D codes designed to be read or scanned by machines
e. Optical character recognition: automated reading of text such as serial numbers
f. Gauging: measurement of object dimensions in inches or millimeters
g. Edge detection: finding object edges
h. Template matching: finding, matching, and/or counting specific patterns. [2]

There is a lot of OCR software available on the market, sold at high prices, for example ExperVision TypeReader & OpenRTK, ABBYY FineReader OCR, Asprise OCR (shown in Figure 1.3), Microsoft Office Document Imaging, Brainware, SimpleOCR, Tesseract and many more. There is also free OCR learning software, such as Quickwrite OCR, shown in Figure 1.4. An example of industrial OCR software, from TT VISION, is shown in Figure 1.5.

Figure 1.3: A demo version of an OCR tool.

Figure 1.4: Two freeware examples of OCR tools written in VB6 to let people get basic programming concepts.

Figure 1.5: Industrial OCR tool used for IC inspection, from TT VISION SDN BHD.

1.5 COMPARISONS WITH PREVIOUS THESES OF UTM STUDENTS
LEOW YEE RUN (5SEM) wrote a thesis entitled "Neural Network Simulator for Alphabet Recognition with Speech Synthesizer." [3] The author used several software packages to perform the OCR task: VB6, MATLAB, a neural network, and Microsoft Agent were applied to do image recognition and then convert the result to a sound signal using Microsoft Agent. The only hardware the author used was a simple webcam.

NOOR KHAFILAH BINTI KHALID (4SEM) wrote a thesis entitled "An Image Processing Approach Towards Classification of Defects Printed Circuit Board." [4] The purpose of the software design is to compare a sample image to a prototype image; an image processing approach is applied to examine the defects of the PCB. The author did not use any hardware during the process: it is simply a comparison between the pixels of two pictures to obtain a percentage error. If a picture exceeds a certain error threshold, the user of the software is notified.

AZIZUL BIN KEPLI (5SEM) wrote a thesis on "Vision Based Autonomous Color Detection and Object Tracking Robot." [5] This author differs from the previous ones in using only hardware to do the image recognition: a CMUcam2 is used to differentiate the colors of the object so that the robot can react to it.

Lastly, LEE CHEE WEI (4SEI) wrote a thesis on "Smart Automated Parking System Using Plate Number Recognition Technology." The author used Visual Basic 2008 to write the OCR program, with the aid of an image processing tool (Tesseract). The system is able to recognize the plate number and record the numbers in the computer when the webcams are triggered by the PIC.

All these studies applied OCR and pattern recognition technology. There are still certain criteria that need to be improved to optimize the image recognition results and the abilities of an OCR system.
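Of the common vision methods listed in Section 1.4, blob discovery can be sketched compactly: scan the binary image and flood-fill each unvisited dark region. The sketch below is purely illustrative (the thesis's own software is written in Visual Basic 2005), using a tiny hand-made binary image.

```python
# Blob discovery sketch: count connected groups of dark pixels (4-connectivity)
# in a small binary image. Illustrative only; the thesis's software is VB 2005.

def count_blobs(image):
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r, c):
        # Iterative flood fill marking every dark pixel of one blob.
        stack = [(r, c)]
        while stack:
            y, x = stack.pop()
            if 0 <= y < rows and 0 <= x < cols and image[y][x] == 1 and not seen[y][x]:
                seen[y][x] = True
                stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]

    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1 and not seen[r][c]:
                blobs += 1          # a new, previously unvisited blob
                flood(r, c)
    return blobs

# Two separate dark regions -> two blobs (e.g. two epoxy dots on a die pad).
print(count_blobs([[1, 1, 0, 0],
                   [1, 0, 0, 1],
                   [0, 0, 1, 1]]))   # 2
```

Each discovered blob could then be measured (area, bounding box) and compared with a known-good reference, which is how blob-based inspection decides pass or fail.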
CHAPTER 2

LITERATURE REVIEW

2.1 METHOD AND TECHNIQUES

This section mainly discusses the methods and techniques applied in building the OCR system, the devices, and the types of development software involved.

2.1.1 MODERN APPROACHES

The traditional OCR method using a light source and photodetector is no longer applicable due to inaccuracy and resolution problems. Modern approaches use progressive cameras to capture the image, which is then analyzed by a computer; hence, software writing is very important.

There are several common modern approaches to OCR software writing.

Visual Basic: This is very common software for writing graphical user interfaces (GUIs) in many applications, and it is widely applied in developing commercial and industrial OCR software. The reasons are that the programming language is very simple and easy to learn, it has more GUI features, coding references are easy to find because of its popularity, and, most importantly, Microsoft provides this software as a free download for everybody.

MATLAB: MATLAB provides a toolbox for users to do image processing. Although most image processing functions are provided, it is still not widely applied in developing OCR because the coding is more complicated and, most importantly, it is not very capable for developing GUIs.

Neural networks: This method is popular for analyzing images and is widely applied to complicated image processing purposes, for example analyzing maps or blood cells. This software and its concepts are more difficult to learn compared to Visual Basic.

Fuzzy logic: This method requires a microcontroller as a fuzzy engine that can receive fuzzy input data, for example the MC68HC11E9 8-bit microcontroller provided by Freescale Semiconductor [6]. Users must write fuzzy rules to provide inputs to this microcontroller. The coding is complicated, but the concepts are easy to understand.
This method is not popular in OCR because people prefer to use a computer for the analysis rather than a microcontroller. Even so, the concepts behind writing fuzzy rules are widely applied in writing OCR software.

2.1.2 TECHNIQUES

Although there are many different approaches to writing OCR software, such as Visual Basic, MATLAB, neural networks, fuzzy logic, C# and C++, in the end they commonly come down to two choices:

1. Aid of image processing tools
2. Access the image and process the pixels

2.1.2.1 AID OF PROCESSING TOOLS

If we choose the first method, there are many commercially available tools such as Tesseract, the LEADTOOLS SDK, ABBYY FineReader OCR and more. With this method, during programming the user can directly call the DLL files provided by these commercial tools to do the image processing jobs, saving a lot of time otherwise spent figuring out how to manipulate the pixels. There are some drawbacks to this method. First, it is less flexible: the processing tools cannot always meet all the different conditions and system requirements. Second, there is a limit on creativity, and cost is needed to purchase this kind of processing tool.

2.1.2.2 ACCESS THE IMAGE AND PROCESS THE PIXELS

If we choose the second method, we have to do a lot of coding for image processing. It is time consuming, because image processing normally involves a lot of logic behind the coding. The image processing methods we have to apply include pixel manipulation, thresholding, edge detection, blob discovery, and applying fuzzy rules. This method is more flexible, gives more space for creativity and improvement, and saves cost. Figure 2.1 shows one of the image processing techniques, called thresholding.

Figure 2.1: Example of image threshold from MATLAB.

Thresholding is a very important technique in image processing for differentiating and extracting the objects in pictures; it is essentially about image contrast adjustment.
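As a minimal sketch of the thresholding step, each grayscale pixel is compared against a cut-off value. This is illustrative Python rather than the thesis's Visual Basic 2005, and the cut-off of 128 is an arbitrary example value.

```python
# Thresholding sketch: map grayscale values (0-255) to black/white.
# Illustrative only; the thesis performs this step in Visual Basic 2005,
# and the cutoff of 128 is an arbitrary example value.

def threshold(gray_image, cutoff=128):
    # Pixels darker than the cutoff become 1 (black), the rest 0 (white).
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in gray_image]

gray = [[250, 120, 10],
        [130, 127, 240]]

binary = threshold(gray)
print(binary)   # [[0, 1, 1], [0, 1, 0]]
```

In practice the cutoff must track the lighting conditions, which is why the system described later includes an automatic lighting-threshold adjustment (Figure 3.8).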
This technique is applied even in blood cell and map analysis for more advanced purposes [7], [8]. Further techniques include sub-pixels, segmentation, pixel location detection, array matching [6], [10], and fuzzy rules [11]. Captured images contain colors and blur that may affect the OCR result, so we have to convert to grayscale and then to black and white, bit by bit. After this, we rotate the image into position and then do segmentation and recognition. Figure 2.2 shows how pixels are examined to distinguish characters.

Figure 2.2: Distinguishing two characters from each other.

The all-black square is a pixel that belongs to the connected component that comprises the current character. Since none of the pixels belonging to the next character touches the current character, an ending location is set.

For color recognition, the computer defines color in terms of red, green, and blue, each counting from 0 to 255. The possible combinations of the color components give 256*256*256 = 16777216 colors, as shown in the color cube in Figure 2.3. In Visual Basic 2005 we can retrieve a color value with the code Color = Bitmap.GetPixel(x, y). Color values retrieved from the captured image are then compared with a reference.

Figure 2.3: Color specification of RGB color cube.

Real coding examples of industrial and commercial OCR are not easy to get because of copyright issues; most of the time we can only get the techniques and theories of image processing, for example through the journals "International Journal of Pattern Recognition and Artificial Intelligence" and "Photogrammetric Engineering and Remote Sensing". From personal experience during industrial training at TT VISION SDN BHD in Pulau Pinang, their development of inspection systems basically uses the same approaches of pixel manipulation, blob discovery and segmentation, as shown in Figure 2.4. How the pixels are manipulated depends on creativity.
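The GetPixel-based color check described above can be sketched as follows. This is an illustrative Python version of the idea (the thesis retrieves pixels with Bitmap.GetPixel in Visual Basic 2005), and the per-channel tolerance rule is my assumption, not the thesis's exact comparison.

```python
# Color comparison sketch: compare an (R, G, B) pixel against a reference
# color with a per-channel tolerance. The tolerance rule is an assumption
# for illustration; the thesis retrieves pixels via VB 2005's GetPixel.

def color_matches(pixel, reference, tolerance=20):
    # Each channel runs 0-255; accept if every channel is within tolerance.
    return all(abs(p - r) <= tolerance for p, r in zip(pixel, reference))

reference_red = (200, 30, 30)

print(color_matches((210, 25, 40), reference_red))   # True  (close to red)
print(color_matches((30, 200, 40), reference_red))   # False (green-ish)
```

Counting how many pixels in a region match the reference gives the "percentage color inspection" idea used later in Section 3.3.6.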
Figure 2.4: TT VISION epoxy inspection based on segmentation and blob discovery concepts.

Let us again take an example from Freescale Semiconductor of developing fuzzy rules to do character recognition.

Figure 2.5: The number '0' in pixels [6]

From Figures 2.5 and 2.6, the distribution of the black pixels can be observed: from high to low, then high again, and finally decreasing to zero. This holds even for a skewed number, and each number has a different distribution pattern.

Figure 2.6: A pattern of pixel distribution for the number '0'.

The next step is to divide the pixels of the number into a few columns to generate fuzzy rules. For example, if the first column has a large number of black pixels, it provides an output of 'High'; the second column is 'Low'; if a column is in between, the output is 'Medium'. But this is not enough: some numbers have similar distribution patterns, for example '0' with '8', or '2' with '5', and so on. To handle this, we count the total number of pixels as a second level of recognition between these cases; a larger number of pixels produces 'High' and a lower number produces 'Low'. If there is still a conflict between similar cases, we calculate the width of the numbers to produce a further fuzzy output. At last, the fuzzy output is generated to differentiate the numbers, and this output is then loaded into the microcontroller to do OCR.

Figure 2.7: Fuzzy output of the number recognition. SOP is sum of pixels. TERM is width.

2.2 DEVICES

2.2.1 RS232 DB9 AS SERIAL COMMUNICATOR

Figure 2.8: RS232 DB9 pinout.

Figure 2.9: Connection between PIC and RS232 DB9.

The serial connection between the PIC and RS232 is as shown in Figures 2.8 and 2.9. Normally we have to consider the following:

1) Baud rate: The baud unit is named after Jean-Maurice-Émile Baudot, an officer in the French Telegraph Service, who is credited with devising the first uniform-length 5-bit code for characters of the alphabet in the late 19th century.
What baud really refers to is the modulation rate, the number of times per second that a line changes state. This is not always the same as bits per second (BPS); however, if you connect two serial devices together using direct cables, then baud and BPS are in fact the same. Thus, if you are running at 19200 BPS, the line is also changing state 19200 times per second. [12]

2) Data bits: Directly following the start bit, the data bits are sent. A bit value of 1 causes the line to go into the mark state; a bit value of 0 is represented by a space. The least significant bit is always sent first.

3) Stop bit: The stop bit identifying the end of a data frame can have different lengths. Actually, it is not a real bit but a minimum period of time the line must be idle (mark state) at the end of each word. On PCs this period can have three lengths: the time equal to 1, 1.5 or 2 bits. 1.5 bits is only used with data words of 5 bits, and 2 only for longer words; a stop bit length of 1 bit is possible for all data word sizes.

4) Parity bit: For error-detection purposes, an extra bit can be added to the data word automatically. The transmitter calculates the value of this bit depending on the information sent; the receiver performs the same calculation and checks whether the actual parity bit corresponds to the calculated value. [13]

Create an instance of CRs232, then set the COM parameters before invoking the Open method. Here is an example:

    Dim moRS232 As New Rs232()
    With moRs232
        .Port = 1                               '// Uses COM1
        .BaudRate = 2400                        '// 2400 baud rate
        .DataBit = 8                            '// 8 data bits
        .StopBit = Rs232.DataStopBit.StopBit_1  '// 1 stop bit
        .Parity = Rs232.DataParity.Parity_None  '// No parity
        .Timeout = 500                          '// 500 ms timeout to get all required bytes
    End With
    '// Initializes and opens
    moRS232.Open()

    '// You can optionally control the state of the DTR/RTS lines after the port is open
    moRS232.Dtr = True

Figure 2.10: Sample program using VB6 to open a COM port.
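To make the parity idea above concrete, here is a small illustrative computation of an even-parity bit for one data byte. This is Python for illustration, not the thesis's VB; real UART hardware computes this bit automatically.

```python
# Even-parity sketch: the parity bit is chosen so that the total number of
# 1 bits (data + parity) is even. Illustrative only; real UART hardware
# computes and checks this bit automatically.

def even_parity_bit(byte):
    ones = bin(byte).count("1")
    return ones % 2          # 1 if an extra 1 is needed to make the count even

print(even_parity_bit(0b01100001))   # 'a' has three 1 bits -> parity bit 1
print(even_parity_bit(0b01100011))   # four 1 bits -> parity bit 0
```

The receiver repeats the same calculation on the received data bits and flags a parity error if its result disagrees with the received parity bit.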
The baud rate is very important in serial communication so that the PIC and the computer can be synchronized. On the computer side this is done in code, as shown in Figure 2.10. The baud rate calculation is:

Desired Baud Rate = Fosc/(64X + 64) = Fosc/(64(X + 1))

where Fosc is the frequency of the crystal used and X is the value loaded into the SPBRG register. The following is an example command for opening the COM port in the PIC program code:

    OpenUSART(USART_TX_INT_OFF & USART_RX_INT_OFF & USART_ASYNCH_MODE & USART_EIGHT_BIT & USART_CONT_RX & USART_BRGH_HIGH, X)

2.2.2 PIC18F452

The PIC18F452, shown in Figure 2.11, is a 40-pin, high-performance, enhanced-FLASH microcontroller with a 10-bit A/D converter. It is an enhanced microcontroller compared to the PIC16F and PIC17F: it has more memory space than the previous versions, as stated earlier, but its source code is compatible with the PIC16 and PIC17 instruction sets. Popular features of the PIC18F452 include analog-to-digital conversion, enhanced FLASH program memory with a typical 100,000 erase/write cycles, data EEPROM memory with 1,000,000 erase/write cycles, FLASH/data EEPROM retention of more than 40 years, a wide operating voltage range (2.0 V to 5.5 V), low power consumption, and high-speed FLASH/EEPROM technology.

Figure 2.11: PIC18F452 pin diagram from Microchip Technology Inc.

The reasons for choosing the PIC18F452 in this project are its high performance, robustness, ease of programming in C, large number of I/O pins and reasonable price.

2.2.3 SENSOR AND COMPARATOR

The sensor pair consists of an IR emitter and an IR phototransistor. An infrared emitter is an LED made from gallium arsenide, which emits near-infrared energy at about 880 nm. The infrared phototransistor acts as a transistor with its base voltage determined by the amount of light hitting the transistor; hence it acts as a variable current source. A greater amount of IR light causes greater current to flow through the collector-emitter leads. [14]

Figure 2.12: IR sensors and operation.
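Returning briefly to the serial link, the SPBRG formula from Section 2.2.1 can be checked numerically. The crystal frequency (20 MHz) and target baud rate (9600) below are assumed values chosen for illustration, not taken from this project.

```python
# Worked example of the low-speed baud formula from Section 2.2.1:
#   Baud = Fosc / (64 * (X + 1))  =>  X = Fosc / (64 * Baud) - 1
# Fosc = 20 MHz and 9600 baud are assumed values for illustration only.

fosc = 20_000_000
desired_baud = 9600

x = round(fosc / (64 * desired_baud) - 1)     # value loaded into SPBRG
actual_baud = fosc / (64 * (x + 1))
error_pct = (actual_baud - desired_baud) / desired_baud * 100

print(x)                      # 32
print(round(actual_baud))     # 9470
print(round(error_pct, 2))    # -1.36 (% error)
```

Because X must be an integer, the achievable baud rate is slightly off the target; both sides of the link simply have to be configured for the same nominal rate, and the residual error must stay small enough for reliable sampling.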
The IR sensor pair normally works with a comparator in order to trigger or give a signal according to the voltage output of the IR pair, as shown in Figure 2.12. An example of a comparator circuit is the LM324 chip, a low-power quad operational amplifier. Its low power requirement of 2 nA and its +5 V to 30 V operating range make it suitable for this project. The LM324 in Figure 2.13 provides 4 op-amps that can be used as comparators.

Figure 2.13: Pin diagram of the LM324

Figure 2.14: IR sensor working with a comparator.

When the receiver IR RX receives infrared light from the emitter, the potential difference at the negative input of the comparator rises. When it is higher than the positive input voltage, sensor X is triggered. R3 is used to adjust the sensitivity of the IR sensor, because the voltages of the transmitter and receiver should be adjusted to balance under the prevailing lighting conditions before use. The basic circuit is as shown in Figure 2.14.

2.2.4 MOTOR DRIVER

A motor driver is required in this project even though motors with small power requirements are being used. The PIC cannot drive the motors directly, because motors produce a back electromotive force that would damage the PIC; it is always good practice to use a motor driver when driving motors with a PIC. The motor driver chosen for this project is the L293D, a push-pull four-channel driver with diodes. This chip provides 4 channels, enabling users to control 4 motors as shown in Figure 2.15.

Figure 2.15: L293D pin diagram

2.3 SOFTWARE

2.3.1 MICROSOFT VISUAL BASIC 2005

Visual Basic 2005 is one of the most popular programming languages for building Windows and Web applications. The most important thing this language provides is the .NET Framework, a very large collection of functions that users can call to complete their tasks. It is very powerful and user friendly; some applications can even be created without a single line of code.
Visual Basic 2005 provides a large number of tools for users to create Windows forms according to their needs. The debugging tools of Visual Basic itself help users debug their code. Visual Basic 2005 Express Edition is freeware that users can download directly from the official Microsoft website.

2.3.2 HARDWARE PROGRAMMING (mikroC)

The software used to program the PIC18F452 is mikroC. This software allows users to develop their own applications quickly and easily with an intuitive C compiler for PIC microcontrollers. It provides useful implemented tools, many practical code examples, a broad set of built-in routines, and comprehensive Help, making it suitable for experienced engineers and beginners alike. It was chosen for this project because a free student version is available. It also provides an easy-to-use USART module for serial communication.

CHAPTER 3

METHODOLOGY

3.1 FLOW OF WORK

Figure 3.1: Project development flow

Computer software development covers the development of the character recognition, color recognition and translation software using Visual Basic 2005. Here, references had to be consulted: a lot of OCR software and techniques are available in the market, but they are not open source, so getting the basic ideas and concepts was very important. In this project, no image processing toolbox is involved, in order to achieve a true learning and design process.

For the hardware design part, the PIC18F452 is the main hardware controlling the flow of the whole inspection system. In this process there were many problems to solve, for example the dimensions of the mechanical parts, the motors, noise on the serial port, isolation of the supply, orientation of the inspection items, and so on. After that, the hardware communicates with the computer software over serial RS232 communication with the help of a MAX232 chip. Here, communication code in both Visual Basic and mikroC is needed; this is also a substantial topic.
The testing and improvement part mainly involves improving the programming, robustness and speed of the system.

3.2 EQUIPMENTS

3.2.1 SOFTWARE

i. Microsoft Visual Basic 2005, Express Edition
ii. mikroC 8.0 student version
iii. UIC00A PICkit 2 v2.55 (PIC burner)

3.2.2 HARDWARE

For the circuit, the required components are listed below. The mechanical parts are as stated in the mechanical drawings: the tracks are aluminium, while the other parts are zinc. For detailed two-dimensional drawings, refer to APPENDIX A, APPENDIX B and APPENDIX C.

1. PIC18F452
2. L293D motor driver
3. DC motor x2
4. DC supply 12 V x2
5. MAX232
6. IR sensor set x2
7. LED x5
8. 10 MHz crystal
9. 30 pF capacitor x4
10. 10 µF capacitor x7
11. Push button x2
12. 10 kΩ resistor x2
13. 220 Ω resistor
14. RS232 port
15. Webcam (akkord)
16. Adaptor terminal x2
17. 7805 regulator x2
18. Crystal 10 MHz
19. 1 kΩ variable resistor x2
20. Soldering gun
21. PIC programmer
22. Jumper wires
23. Male/female connectors

Table 3.1: List of components

The system uses a simple webcam in a fixed position to minimize noise from lighting changes. There is a sliding plate consisting of 5 slots. Only 4 slots can be used to place items, so that the sliding plate stays in the track of the motor (the motor always touching the plate). The picking mechanism is moved by another motor and can push defective items out. The sliding plate has slits between the slots to let the IR sensors sense the position of the plate and stop it at the right position. The motor speed is controlled by PWM to avoid any overshoot of the plate position. Figure 3.2 shows the circuit of the system built, while Figure 3.3 shows the hardware and the inspection item models.

Figure 3.2: Circuit diagram and circuit built

(a) (b)
Figure 3.3: Hardware and items a) 3D drawing of the system b) inspection item models

3.3 PROGRAM DESIGN

3.3.1 FLOW CHARTS

Each feature of the system has its own programming flow.
The features basically do not interact with each other, but each links to the main program (form) to send signals to the PIC.

Figure 3.4: Character Inspection Program Flow Chart
Figure 3.5: Pattern Inspection Program Flow Chart
Figure 3.6: Color Inspection Program Flow Chart
Figure 3.7: Chinese and Japanese Recognition Program Flow Chart

The reason the program counts to 4 is that the inspection system model built is limited to inspecting 4 items at once. The item counting is done by the PIC18F. The concepts behind this flow are discussed in more detail in the sections below.

3.3.2 THRESHOLD

The threshold process is the first concept to address in image processing. Basically, the picture must be turned into black and white. The first step in OCR is capturing a good picture, where adequate lighting [15] is crucial to the threshold process. Figure 3.9 shows the threshold technique used to filter out undesired colors and noise from a picture for ease of pixel manipulation and calculation.

Each pixel has its own RGB (Red, Green, Blue) values from 0 to 255. To threshold a pixel, we retrieve the RGB values of the pixel and multiply them by certain color ratios using equation 3.1:

H = (x, y).R × CR + (x, y).G × CG + (x, y).B × CB …………… (3.1)

where H is the threshold value, (x, y) is the location of the pixel, (x, y).R × CR is the red component of that pixel multiplied by the red color ratio, (x, y).G × CG is the green component multiplied by the green color ratio, and (x, y).B × CB is the blue component multiplied by the blue color ratio [16]. The H value varies according to the surrounding lighting, and a lower H indicates a darker color. So, in the program, if H < lightsetting the pixel is set to black; otherwise it is white. For user-friendliness, lightsetting is a variable from 0 to 255 set by the user with a simple horizontal drag bar, as shown in Figure 3.10.
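The threshold rule of equation 3.1 can be sketched as below. This is an illustrative Python version (the thesis software is written in Visual Basic 2005), and the color ratios used here are the common luminance weights; the project's actual CR, CG and CB values are not stated in the text.

```python
# Assumed color ratios (standard luminance weights); the project's
# actual CR, CG, CB values are not given in the text.
CR, CG, CB = 0.299, 0.587, 0.114

def threshold_pixel(r, g, b, lightsetting):
    """Apply equation 3.1 to one pixel: return 'black' or 'white'."""
    h = r * CR + g * CG + b * CB
    return "black" if h < lightsetting else "white"

def threshold_image(pixels, lightsetting):
    """pixels: 2-D list of (R, G, B) tuples -> 2-D list of 0/1 (1 = black)."""
    return [[1 if threshold_pixel(r, g, b, lightsetting) == "black" else 0
             for (r, g, b) in row]
            for row in pixels]

img = [[(0, 0, 0), (255, 255, 255)],
       [(200, 10, 10), (30, 30, 30)]]
print(threshold_image(img, 128))  # [[1, 0], [1, 1]]
```

The resulting 0/1 array is exactly the representation used later for array matching.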
This is very important for using the system under different lighting conditions, and for the flexibility of setting any threshold value to acquire images of the quality the user wants.

The first program developed was not good because, every time an image was captured, the lighting threshold had to be adjusted by hand. This is not user friendly, and the user cannot know which threshold value suits a particular lighting condition, which causes difficulty in the pixel manipulation and calculation process. The program was therefore developed further to be more immune to lighting changes. The software is designed so that users calibrate the lighting tolerance using the letter 'A'.

To make the software more user friendly, automatic calibration of the brightness of the captured picture is a good idea. Auto calibration is a process that lets the system adjust the threshold value to one suitable for the current lighting condition. In this case, the character 'A' is used as a sample. First, an image of the character 'A' is captured. Then, all the user needs to do is press the 'Auto Calibrate' button, and the system runs automatically, adjusting the threshold bar until a suitable threshold value is obtained. In the program, this requires some looping; the basic structure of the program flow is shown in Figure 3.8.

Figure 3.8: Flow chart of automatic lighting threshold adjustment

Following this flow, the image is thresholded first, and then the total number of black pixels is calculated. Earlier testing showed that the best threshold quality occurs when the total number of black pixels in 'A' is 415. If the image is captured under a different lighting condition, the total number of black pixels will differ. Hence, the program counts the total number of black pixels, and if it is not equal to 415, the system adjusts the lighting adjustment bar.
The process continues until the total number of black pixels equals 415; the corresponding value is then the best threshold value for that particular lighting condition.

Figure 3.9: Image threshold process.
Figure 3.10: The lighting setting and auto calibrate function of the software.
Figure 3.11: Sample 'A' captured under a certain lighting condition.

3.3.3 RECOGNITION ALGORITHM

The first recognition trial applied Fuzzy Rules to the number of black pixels distributed in different regions, as shown in Figure 3.12.

Figure 3.12: Distribution of Pixels

For ease of explanation, take the character 'T' in Figure 3.12 as an example. Moving from left to right, the distribution of black pixels increases, becomes maximum at the middle of the character, and then decreases towards the right. The same happens moving from top to bottom: the pixel distribution is maximum at the top, decreases towards the middle, and is then maintained. Through these characteristic curves, a character can be differentiated.

In this method, the total number of pixels of each character is analyzed. Different characters form differently shaped graphs. The highest point of the graph is classed as 'High', the middle point as 'Medium', and the lowest point as 'Low'. Different characters have different sequences of 'High', 'Medium' and 'Low'. For similar cases like '0' and 'H', which give the same result, the total number of pixels is considered as well, and so on. After testing, this method was found to be not good enough, because many characters resemble one another and more ways would have to be found to differentiate between them. This method is also sensitive to lighting changes from the surroundings.

Table 3.2: Calculation of the distribution of black pixels for several characters

The second trial, the array matching method, is preferred and applied in this project.
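The auto-calibration loop of Figure 3.8 can be sketched as follows. This is an illustrative Python version: count_black is a stand-in for the threshold-and-count step, and the target of 415 black pixels is the value quoted in the text for the reference 'A'.

```python
TARGET_BLACK = 415  # black-pixel count of the reference 'A' (from the text)

def auto_calibrate(count_black, lo=0, hi=255):
    """Scan threshold values until the black-pixel count matches the target.

    count_black(t) stands in for: threshold the captured image at value t,
    then count the black pixels. Returns the first exact match, or the
    closest threshold if no exact match exists.
    """
    best_t, best_err = lo, float("inf")
    for t in range(lo, hi + 1):
        err = abs(count_black(t) - TARGET_BLACK)
        if err == 0:
            return t
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Toy stand-in: pretend the black count grows linearly with the threshold.
print(auto_calibrate(lambda t: t * 5))  # 415 / 5 = 83
```

In the real program the same loop drives the lighting adjustment bar instead of a function argument.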
Referring to Figure 3.16 (a), each character or alphabet consists of black pixels and white pixels. First, the program finds the first point of the character, which is the intersection of line 'a' and line 'b' in Figure 3.16. Then the program goes through each pixel from that point to the end point, which is the intersection of line 'c' and line 'd'. Black pixels are saved as '1' and white pixels as '0', forming a two-dimensional array. This array (array A) is compared with the arrays stored earlier in text files by the Teaching Function.

Take one of the stored arrays as an example and call it array B. To compare the two arrays, the program runs through the elements of array A and array B simultaneously. When the elements are the same, i.e. array A(x, y) = '1' and array B(x, y) = '1', or array A(x, y) = '0' and array B(x, y) = '0', it is counted as one matching bit. The percentage of similarity can then be calculated as in equation 3.2:

Percentage of similarity = (number_of_same_bits / area_bounded_by_lines_abcd) × 100 …………… (3.2)

Each stored array is recalled in turn and compared with array A, each comparison giving a percentage of similarity; the highest percentage is taken as the OCR result.

The recognition algorithm implemented is better than the Fuzzy Rules approach normally applied, because fewer similarity cases need to be considered and the characteristics of different characters do not overlap. With the Teaching Function, it can handle more cases without reprogramming the software to add extra character-differentiation algorithms. In addition, the array matching method is more immune to lighting changes and poor threshold value selection. For example, case (a) in Figure 3.13 is the captured image. Case (b) is the result of thresholding with a correct value, so the system detects it as 'Y'. Case (c) uses a poorer threshold value, but the system can still recognize it as 'Y'.
This is because it still has the highest percentage of similarity.

Figure 3.13: Image under different lighting threshold values.

For multiple characters, looping is required. After the first character is detected, extra programming is required to detect whether there are further characters in the image, so the image is looped through again to detect any remaining black pixels. If black pixels remain, another array is formed, as shown in Figure 3.14.

Figure 3.14: Detecting and processing multiple characters. Boundaries are formed in sequence a to z and 1 to 2

3.3.4 INTELLIGENT TEACH FUNCTION

The alphabet recognition developed earlier differentiated alphabets by their number of pixels. The user could teach the system to learn alphabets, with the input stored in RAM. This was not good, because many alphabets have the same number of pixels, and the contents of RAM are lost when the program is restarted. After several improvements, the system is now equipped with an intelligent memorizing and recognition function. It also has a camera function that links directly to any camera connected to the computer.

Once the user captures an image, the Teaching Function lets the program detect the position of the captured character or alphabet, memorize the positions of its black and white pixels (as shown in Figure 3.15 (b)), and store them in an array and in files that act as permanent (ROM-like) storage, so the system remembers these characters even after a restart. From Figure 3.15 (a), the system first scans through the x and y axes of the picture to find line 'a' (the first black pixel encountered), then lines 'b', 'c' and 'd'. A two-dimensional array is formed, with black pixels stored as '1' and white pixels as '0'. This array is exported to a text file to serve as ROM. The user is then prompted with a new window asking what this shape represents, and the answer is stored in the proper location. The Teach Function is then complete.
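The scan for lines 'a' to 'd' amounts to finding the bounding box of the black pixels, and the comparison is equation 3.2 applied to the cropped array. A hedged Python sketch of both steps (the project implements them in Visual Basic; function names are illustrative):

```python
def bounding_box(img):
    """img: 2-D list of 0/1 (1 = black). Return (left, top, right, bottom),
    i.e. the lines 'a', 'b', 'c', 'd', or None if there is no black pixel."""
    coords = [(x, y) for y, row in enumerate(img)
              for x, v in enumerate(row) if v == 1]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)

def crop(img, box):
    """Extract the sub-array bounded by lines a, b, c and d."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]

def similarity_percent(a, b):
    """Equation 3.2: matching bits over the bounded area, as a percentage.
    Assumes both arrays have the same dimensions."""
    rows, cols = len(a), len(a[0])
    same = sum(1 for y in range(rows) for x in range(cols)
               if a[y][x] == b[y][x])
    return same / (rows * cols) * 100

img = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 0]]
array_a = crop(img, bounding_box(img))           # [[1, 1], [0, 1]]
taught = {"Y": [[1, 1], [0, 1]], "T": [[1, 1], [1, 0]]}
best = max(taught, key=lambda k: similarity_percent(array_a, taught[k]))
print(best)  # Y
```

Cropping before matching is what gives the method its tolerance to the character's position in the frame.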
Hence the system is capable of learning any type of font or shape and turning it into editable text. For Chinese and Japanese character recognition, the same rules apply.

(a) (b)
Figure 3.15: Pixel manipulation (a) Forming the array (b) Pixel detection

After storing the array, the user is prompted to tell the system the meaning of that character, i.e. of the stored array. This is done with an Input Box function, and the data from the Input Box is stored in another file, as shown in Figure 3.16. If the user wants to teach the system again, the next data is saved at the same file paths for arrays and meanings, each new entry on a new line of its file; hence the number of lines in these files represents the number of data items stored. For example, the first array is stored at line 1 of file A and its meaning at line 1 of file B; the next array and meaning are stored at line 2 of each file. The teaching process then ends.

The System Teach Function is a very good idea because the code does not have to be rewritten whenever a new image must be recognized. It is very easy to use, with just a few simple steps; even a person who knows no programming technique can complete the task.

Figure 3.16: Teach Function Input Box.

3.3.5 SHAPE RECOGNITION

By applying the same array matching concept, the system can be programmed to do shape recognition and inspection. Here a feature is added so that users can select 6 regions of a picture for matching and inspection. The results are returned as percentages of similarity, and the user can select the threshold for an item to become 'PASS' or 'NOT PASS'. Shape recognition is normally applied for PCB or IC lead inspection. The flexibility for the user to select the regions to inspect is very important for increasing the speed of the inspection process.
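The line-paired file storage used by the Teach Function (arrays in one file, meanings at the same line number in another) can be sketched as below. The file names, the flattening of the 2-D array into one text line, and the function names are assumptions for illustration; the thesis does not state the actual paths or format.

```python
import os
import tempfile

def teach(array, meaning, array_file, meaning_file):
    """Append the flattened 0/1 array to array_file and its meaning to
    meaning_file, so that both sit at the same line number."""
    line = "".join(str(v) for row in array for v in row)
    with open(array_file, "a") as fa, open(meaning_file, "a") as fm:
        fa.write(line + "\n")
        fm.write(meaning + "\n")

def recall(line_no, array_file, meaning_file):
    """Return (array_line, meaning) stored at the given line number."""
    with open(array_file) as fa, open(meaning_file) as fm:
        return (fa.readlines()[line_no].strip(),
                fm.readlines()[line_no].strip())

# Demo in a temporary directory (real paths are not named in the text).
d = tempfile.mkdtemp()
fa = os.path.join(d, "arrays.txt")
fm = os.path.join(d, "meanings.txt")
teach([[1, 0], [1, 1]], "Y", fa, fm)
teach([[1, 1], [0, 1]], "T", fa, fm)
print(recall(1, fa, fm))  # ('1101', 'T')
```

Keeping the two files line-aligned is what lets the recognizer turn a best-matching line number directly into a meaning.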
Even though it would be more convenient for the user to let the system memorize the whole circuit, the inspection speed would decrease; so, to increase speed, it is better to process only the regions selected by the user. A user who does not care about speed can still drag-select the whole circuit for the system to memorize. The program is written so that it memorizes the array within the selected region relative to the image reference point, which is the first point of the image (the first black pixel from the left and top).

After memorizing, the inspection process can start. The program first runs through the image from position x = 0, y = 0; the first black pixel detected becomes the reference point. Based on the reference point, the coordinates of the stored array are recalled and the comparison process is carried out, giving a percentage of similarity. If the percentage of similarity is less than the threshold set by the user, a 'NOT PASS' code is generated and the item is rejected; otherwise the item passes. For the user's convenience, the program memorizes the coordinates of the reference points and the arrays in text files, so even if the user restarts the program the stored data is retained unless deleted by the user. An example of shape recognition selection is shown in Figure 3.17.

Figure 3.17: Shape Inspection Selections.

3.3.6 COLOR RECOGNITION

As stated earlier, each pixel has its own RGB values. Testing showed that two pixels may have very different RGB values even when no color difference between the two dots is visible; this difference is caused by uneven lighting when the image is captured. In this project, the color recognition function consists of 2 modes: Regional Color Inspection and Percentage Color Inspection. The user can select a maximum of 3 colors to recognize by dragging the mouse, and any colors can be selected.
The user need not worry about the RGB values; the computer handles the rest.

Figure 3.18: Color Selection (red rectangle).

For the region selected as shown in Figure 3.18, the computer runs through each pixel within that area, gets the R, G and B values of each pixel, and takes the average as in equation 3.3:

Average R = (sum of R values of each pixel) / (total number of pixels) …………… (3.3)

A similar calculation is made for G (green) and B (blue). These values are memorized by the system. The tolerance can be adjusted with a simple scroll bar provided in the software interface. At run time, if the system gets a sufficiently different RGB value in one of the selected regions, the item is rejected. An item passes if it satisfies the rules below:

Average R of item - tolerance ≤ Average R stored ≤ Average R of item + tolerance
Average G of item - tolerance ≤ Average G stored ≤ Average G of item + tolerance
Average B of item - tolerance ≤ Average B stored ≤ Average B of item + tolerance

For Percentage Color Inspection, the user can again select a maximum of 3 colors, with the same easy selection as in Figure 3.18. This time the system takes the average RGB values and scans the whole picture to see whether a similar color is detected. The system filters out the unselected colors and calculates the percentage of the selected colors in the picture. The user can set a threshold percentage; if it is exceeded, the item is rejected.

The program runs through each pixel of the image. If a pixel's RGB values are within range of the selected RGB, the pixel is displayed; all other colors in the image are turned to white. The calculation is then made according to equation 3.4:

Percentage of selected color in image = (colored_pixels / total_number_of_pixels) × 100 …………… (3.4)

Referring to Figure 3.19, suppose the user sets the 'PASS' threshold at 2% for each color and, after processing, the red color amounts to 4%.
The system will then generate a 'NOT PASS' alert and the item will be rejected. Figure 3.19 also shows the interface where the user sets the color thresholds. This concept is applicable to organic material inspection, where the percentage of certain colors in rotten fruit or tobacco leaves determines their grade and quality.

Figure 3.19: Percentage Color Inspection and Threshold Selections

3.3.7 DIFFERENT CHARACTER SIZE RECOGNITION

What about characters of different sizes, the user may ask? For this software, that is no problem, although character recognition for different font sizes is a little more complicated than plain array matching. Different character sizes are not normally encountered in mark inspection, but this project is a study of OCR, so other applications like car plate number recognition should be considered.

A similar method is applied to allow the system to recognize fonts of different sizes. This time we cannot go through every pixel of the different fonts, because they will certainly contain different numbers of pixels. Instead, the character is divided into N x N parts, as in Figure 3.20. Through observation, the proportion of each small part occupied by dark pixels stays the same even when the character 'grows' bigger. So the two-dimensional array can be stored on the basis that if the dark area is more than 50% of the total area of a small part, it is stored as '1'; otherwise it is stored as '0'. After that, the same array matching method can be used.

Figure 3.20: Characters of different sizes

3.3.8 CHINESE RECOGNITION

Chinese recognition is one of the functions of OCR. Much software is available in the market to do such work; the code of that software is, of course, kept private by its developers. The Chinese recognition part of this project is based solely on the author's own ideas, without references or help from image processing tools from other sources.
First, the captured image is automatically thresholded to black and white; if the threshold quality is not good, the user can readjust the threshold value. The distance of the image from the camera is fixed, and for this function the software can only recognize characters of a certain size. The user must first teach the system a set of characters; for this purpose the Teach Function provides instant image learning.

After thresholding the image, a little programming is required to let the user select the character to recognize. As the user moves the mouse over the thresholded image, a red square is drawn as a guide to proper character selection, as shown in Figure 3.21. This is done with the mouse move event and the system drawing functions provided in Visual Basic [17].

(Rich text box / Selection guide / Top 5 arrays)
Figure 3.21: Chinese character recognition and translation interface.

When the user clicks, the image within the selection region is converted to an array and stored in a text file on the computer. An Input Box is then shown asking the user to insert the meaning in English, which is stored at another file path. Chinese characters sometimes cannot stand alone to represent a meaning; to solve this, the recognized image must be turned into editable text (text of the kind normally typed into a word file). Hence, after the Input Box asking for the character's meaning in English, another Input Box prompts the user to type the Chinese character itself, which is stored in the system. Therefore, to teach the system, one must know Chinese well, and the computer must allow the user to enter Chinese characters. After teaching, the system will recognize the same image when the user selects it again. The concept is the same as for alphabet recognition, based on array matching.
When the system is not in Teach Mode and the user clicks a character, the array bounded by the red square is saved in a temporary array inside the program. This array is compared with each line of the arrays stored in the computer text file, with each comparison result expressed as a percentage of similarity. After all stored arrays have been compared, the 5 arrays with the highest percentages of similarity become the recognition result. The locations of these arrays are given by their line numbers in the text file; their meanings are at the same line numbers (the same locations) in another text file. Once the locations of the top 5 arrays are known, the meanings can be recalled and displayed in the text boxes of the software.

When the user clicks one of these text boxes, its text is copied to the rich text box at the right-hand side of the software interface. This is important for letting the user copy the text elsewhere on the computer. If the user clicks the 'EXPORT' button, an event is called (System.Diagnostics.Process.Start()) to open an empty text file and copy the contents of the rich text box into it; the user can then save the recognized text anywhere on the computer in text file format.

3.3.9 JAPANESE RECOGNITION AND PRONUNCIATION GUIDES

Japanese characters cannot stand alone to represent one meaning, so translation of Japanese characters into English cannot be provided in this project; due to time constraints, it was better to develop the software to provide pronunciation guides only. There are 3 sets of Japanese characters: Katakana, Hiragana, and Kanji. Hiragana is part of the Japanese writing system. Japanese writing normally consists of kanji, which are used for the main words in a sentence, and hiragana, which are used for the little words that make up the grammar (in English these would be words like "from" and "his").
Hiragana is also used for the endings of some words [18]. Katakana is a Japanese syllabary, one component of the Japanese writing system along with hiragana, kanji, and in some cases the Latin alphabet. The word katakana means "fragmentary kana", as the katakana scripts are derived from components of more complex kanji. Katakana are characterized by short, straight strokes and angular corners, and are the simplest of the Japanese scripts [19].

Figure 3.22: Set of Hiragana [18]
Figure 3.23: Set of Katakana with some modern digraph additions with diacritics. [19]

The methods of teaching, recognizing and storing arrays are basically the same as for Chinese recognition, except that instead of asking the user to key in a meaning, the Input Box asks the user to key in the pronunciation of that particular character. One part makes the Japanese recognition system more difficult: the 'yoon' part of Hiragana and the diacritics part of Katakana. In these two cases, when certain characters come together their pronunciation changes, and the programming must pay attention to this.

First, the system was taught the vowels of Hiragana and the monographs of Katakana using the same method as in the Chinese recognition part. The system can then recognize and provide pronunciation guides for single Hiragana or Katakana characters. But this is not enough, because it will give a wrong pronunciation guide in the 'yoon' and diacritics cases. For example, referring to Figure 3.22, when 'ki' is followed by 'ya', the pronunciation becomes 'kya' instead of 'ki ya'.

Figure 3.24: Japanese Recognition Interface (Mode labeled)
Figure 3.25: Japanese Recognition Interface with a combination of characters (Tricky part labeled).

Another example of a character combination, shown in Figure 3.25, is 'mya'.
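The combination rule can be sketched as a lookup over (previous, current) selection pairs. This is an illustrative Python rendering of the logic; the table contains only the two pairs mentioned in the text ('ki'+'ya' and 'mi'+'ya'), and the function name is an assumption.

```python
# Combinations whose joint pronunciation differs from the two parts read
# separately; only the pairs mentioned in the text are included here.
YOON_COMBINATIONS = {
    ("ki", "ya"): "kya",
    ("mi", "ya"): "mya",
}

def pronunciation_guide(previous, current):
    """If the previous and current selections form a known combination,
    return the combined reading with a notice; otherwise read them
    separately."""
    combined = YOON_COMBINATIONS.get((previous, current))
    if combined is not None:
        return f"{combined} (not '{previous} {current}')"
    return f"{previous} {current}"

print(pronunciation_guide("ki", "ya"))  # kya (not 'ki ya')
print(pronunciation_guide("ka", "ki"))  # ka ki
```

A dictionary keyed on the pair plays the role of the Select Case construct in the Visual Basic implementation: one entry per special combination, with the separate reading as the default.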
In the programming part, a text box is provided below the rich text box of the pronunciation guide to give the user further notice about the correct pronunciation, here 'mi ya' becoming 'mya'; this text box is hidden again when the user makes a further selection. How is this done? The trick is in the 'Tricky part' labeled in Figure 3.25. This part shows the user the pronunciation of the current selection and the previous selection, and it plays an important role in the program for detecting any combination of characters that has a different pronunciation. For example, when 'mi' is the previous selection and 'ya' is the current selection, the text box appears and provides an extra pronunciation guide. The same method is applied for other combinations. So, in the program, while providing the pronunciation guide, an extra construct such as 'Select Case … Case "mi" … if followed by "ya" … End Select' is applied to prompt the proper pronunciation.

Next, how does the system know whether the captured and selected image is Chinese or Japanese? Referring to Figure 3.25, the interface provides a menu strip that lets the user switch between Chinese recognition mode and Japanese recognition mode. In the program, if the user selects Japanese mode only the Japanese recognition text files are accessed, and if the user selects Chinese mode only the Chinese recognition text files are accessed.

3.3.10 SERIAL COMMUNICATION

In this project, serial communication is used to communicate with the hardware, letting the hardware know what action to take on an item that passes or fails the test. Serial communication is done with RS232 and a MAX232 chip connecting the computer to the PIC18F452. The following code is the serial communication module provided in the mikroC help menu:

void main() {
    unsigned char i;
    USART_init(9600);            // initialize USART module
                                 // (8 bit, 9600 baud rate, no parity bit...)
    while (1) {
        if (USART_Data_Ready()) {     // if data is received
            i = USART_Read();         // read the received data
            USART_Write(i - 32);      // send data back via USART
        }
    }
}

By observing this code, we can modify it to suit our needs: when certain data is received, a certain function can be launched, and after the task finishes, certain data can be sent back to the computer to update the status. APPENDIX D contains the code modified to suit this project. The data is 8-bit data; in this case it is a character (alphabet), because a character is 8 bits long. The code in APPENDIX D is for the PIC18F452.

For the software to "talk" with the PIC, another set of code is needed. In Visual Basic 2005, the serial communication support has been improved compared to previous versions. The libraries must be imported at the beginning of the code as shown below:

Imports System.IO.Ports
Imports System.Runtime.Remoting.Messaging

Detailed code can be found at www.lvr.com [20]. After the declarations, a dialog box (Figure 3.26) has to be created to let the user choose the COM port number, baud rate, handshaking, and so on.

For this project, a lowercase letter is sent to the PIC to tell it to do different work according to the particular letter sent. After finishing the assigned work, the PIC sends back the same letter in uppercase to update the status; the uppercase letter is obtained by subtracting 32 from the ASCII code. In the program, any lowercase letter typed inside the text box of the 'COMPORT ACTIVITY' panel is sent to the PIC, and the echo is displayed in the same text box, so the user can also control the movement of the hardware manually from the keyboard. In the serial data protocol, if the OCR inspection passes, 'a' is sent to the PIC so the hardware moves to the next item; if the inspection does not pass, 'b' is sent, which activates the rejecter after the Item Slots have moved to the next item. The hardware takes action according to the serial data sent to the PIC.
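The pass/fail letter protocol and the case conversion can be sketched as follows. This is plain Python with no serial port involved; pic_reply stands in for the PIC's echo, and the function names are illustrative.

```python
def command_for(inspection_passed):
    """Software side of the protocol: 'a' advances to the next item,
    'b' triggers the rejecter after the Item Slots move on."""
    return "a" if inspection_passed else "b"

def pic_reply(command):
    """Mimic the PIC side: echo the command back in uppercase
    (ASCII code minus 32), confirming the task is done."""
    return chr(ord(command) - 32)

cmd = command_for(False)
print(cmd, pic_reply(cmd))  # b B
```

Subtracting 32 maps any lowercase ASCII letter to its uppercase counterpart, which is why the same trick works for every command letter in the protocol.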
For more detail, refer to APPENDIX E.

Figure 3.26: Port Setting Dialog Box

3.4 HARDWARE AND CIRCUIT DESIGN

3.4.1 LIGHTING

Lighting plays an important role in image capturing: proper lighting helps to reduce noise in the captured image. First of all, the distance of the camera from the image has to be fixed before continuing to write the OCR program. With the help of super-bright LEDs, we can then adjust the threshold value and the image position. Figure 3.27 shows the lighting set-up process and criteria.

Figure 3.27: Camera and lighting positioning a) camera distance fixing b) lighting set-up

The LED lighting is arranged in a 50 mm x 50 mm square surrounding the picture, as shown in Figure 3.27(b). The LEDs used are white. For testing purposes (circuit on a prototype board), 11 LEDs were used; for the real circuit, only 7 LEDs are required. LEDs were chosen as the light source because they are energy saving and have a long lifetime. The lighting angle is about 45 to 50 degrees measured from the horizontal, that is, from the image. The camera distance from the image is 640 mm. This distance should be fixed at all times to avoid any change in the apparent font size.

3.4.2 REJECTER DESIGN

The rejecter consists of a small 5 V DC motor and two ‘legs’ that push out items that fail the inspection test, as shown in Figure 3.28.

Figure 3.28: Rejecter design and position (motor and legs labeled). The red arrow shows the movement path of the rejecter.

The rejecter is one of the motors driven by the motor driver. The legs are small aluminum plates stuck on the movable part of the rejecter, with small pieces of sponge stuck on them to enlarge their bottom area for pushing out items. This part is actually salvaged from an ordinary CD player. The position of the rejecter must be chosen very carefully to avoid any crashing of the system. The rejecter is placed next to the camera so that it can push out an item immediately after the software has processed the image. It can be programmed to move forward and backward.
Care should be taken not to damage the motor while programming, because the rejecter has limited room to move. In the programming part, the motor moves forward for 110 ms, stops for 300 ms, and then moves backward for 120 ms. These values were obtained through trial and error. If the motor is driven forward or backward for longer, the gears will be damaged; likewise, if the motor is not stopped for a while between the forward and backward movements, it will not last long either. The rejecter should be supported at the proper height (100 mm) as shown in Figure 3.29.

Figure 3.29: Rejecter at the proper height (100 mm) a) view from behind b) side view

3.4.3 ITEM SLOTS

The Item Slots, or inspection slots, are where the items to be inspected are placed. The proper dimensions of the Item Slots are shown in APPENDIX B, and the slots as actually built are shown in Figure 3.30. Several magnets are stuck below the Item Slots to hold the inspection items in position. In a real application, with proper mechanical fabrication, magnets would not be required.

Figure 3.30: Item Slots a) top view (slits) b) side view c) bottom view (magnets to hold inspection items)

To move the slots, a small 12 V DC motor is mounted under the Item Slots as shown in Figure 3.31. An anti-slide material (of the kind used on car dashboards) is stuck along the Item Slots and the motor gear to create friction for smooth motor movement, instead of using an expensive conveyor belt and timing belt.

Figure 3.31: Motor for the Item Slots (motor and anti-slide material labeled)

Some effort was needed to make the slots stop at the exact location for the camera to capture the image. The first trial used black strips stuck on the back of the Item Slots, with an IR sensor pair placed below the camera to detect the reflection of light. On detecting a black strip, the sensor signals the PIC to stop the Item Slots.
Unfortunately, after testing the mechanism under different lighting conditions, the Item Slots did not stop at the same position each time. This is due to noise from the surroundings: different light intensities affect the sensitivity of the IR sensor. The first trial is shown in Figures 3.32 and 3.33.

Figure 3.32: First trial (detecting the black strips)
Figure 3.33: IR sensor functioning by reflection

A better method therefore had to be found. Applying the concept of a rotary encoder, several slits were built along the slots as shown in Figure 3.34, and the IR sensor was shielded to make it immune to lighting changes from the surroundings. This time the IR sensor was configured to detect the slits rather than a reflection, and the problem was solved.

Figure 3.34: IR sensor detecting the slit
Figure 3.35: Shielded IR sensor (transmitter and receiver inside)
Figure 3.36: Slits on the Item Slots

CHAPTER 4

PROJECT IMPLEMENTATION

This chapter discusses how to use and implement the software and hardware of the system built, together with the results of the implemented system.

4.1 OPERATION DESCRIPTION

This project consists of a software part and a hardware part. The software part comprises OCR inspection with a teach function, shape recognition, color recognition, Chinese recognition with English translation, Japanese recognition with pronunciation guides, and lastly recognition of fonts of different sizes. The hardware part needs little attention from the user because it is controlled by the software through serial communication.

4.2 SYSTEM IMPLEMENTATION AND RESULT

4.2.1 SOFTWARE IMPLEMENTATION

After the software is written in Visual Basic 2005, an exe file is built when the software is run in the development environment. The exe file is stored inside the bin folder of the project, as shown in Figure 4.1. This exe file lets the user run the system: when the user double-clicks it, the Graphical User Interface is executed.
Figure 4.1: Location of the exe file

4.2.1.1 CHARACTER RECOGNITION AND INSPECTION

After clicking the exe file, the first interface shown is the OCR inspection interface. This interface provides various functions: a menu strip with drop-down lists, a lighting threshold adjustment bar, an Available Devices display listing the cameras connected to the computer, several function buttons, comport activity and status displays, and the Teach function, as shown in Figure 4.2.

Figure 4.2: Character recognition and inspection interface (menu strip, function buttons, available cameras, comport status, image frames, OCR display, teach mode, lighting threshold adjustment, and automatic inspection button labeled)

First of all, the comport should be activated to let the computer communicate with the hardware, and the hardware must be switched on. In the menu strip, click ‘SETTING’ and then select ‘COMPORT’ from the drop-down list. A window like Figure 4.3 pops up to let the user select the comport, baud rate, handshaking, and so on. For this case, select COM1 and a baud rate of 9600, since 9600 is the rate used in the mikroC code.

Figure 4.3: Port setting user interface

After setting the comport, the user clicks ‘OK’ and returns to the OCR interface. This project communicates with the hardware by sending 8-bit characters. The user can observe the characters sent and received by selecting ‘COMPORT ACTIVITY’ in the ‘SETTING’ section of the menu strip; a text box is then displayed above the comport status text box. Next, the user should click ‘OPEN PORT’ in the OCR interface. The button label changes to ‘CLOSE PORT’ because it is programmed as a multi-function button: clicking it again closes the comport. The camera should be connected to the CPU to enable the ‘Start Preview’ button. After clicking it, the camera image is displayed in the picture box as shown in Figure 4.4.
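The lighting threshold adjustment bar introduced above reduces the captured image to black and white by comparing each pixel with the threshold value from the scrollbar. A minimal sketch of that step, assuming an 8-bit grayscale input (C is used purely for illustration; the project itself is written in Visual Basic, and the function name is invented):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch of the 'Black White' step: each 8-bit grayscale
 * pixel is compared against the threshold value taken from the lighting
 * adjustment bar; 1 represents white and 0 represents black. */
void threshold_image(const unsigned char *gray, unsigned char *bw,
                     size_t n, unsigned char threshold)
{
    for (size_t i = 0; i < n; i++)
        bw[i] = (gray[i] > threshold) ? 1 : 0;
}
```

Raising the threshold turns more pixels black, which is why the bar has to be readjusted until a satisfactory black-and-white image appears.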
Then the user can adjust the position of the items based on the camera image. For OCR inspection, the program is written in such a way that it can recognize the image at any position within the picture box, but the image must not be rotated; a rotated image generates an error in the software and the inspection process stops.

Figure 4.4: Image preview from the camera

After that, the user can adjust the lighting bar to change the image threshold value and click the ‘Black White’ button until a satisfactory image appears in the next picture box. The ‘Auto calibrate’ button can also be used, applying the method stated earlier. After capturing a good image with a suitable threshold value, the user can test the OCR result by clicking ‘OCR3’. The software processes the image and displays the recognized characters in the text box labeled ‘OCR result’. If the result does not match the characters on the item, either the threshold value should be adjusted again or the item is wrongly positioned or rotated.

Below the ‘OCR result’ section there is a section called ‘OCR Model’. This is where the user inserts the model for the inspection. If the characters recognized in ‘OCR result’ do not match the inserted ‘OCR Model’, a NOT PASS code is generated as shown in Figure 4.5, and the item is rejected; if they match, a PASS code is generated. The OCR Model can of course be changed: the user selects ‘INSERT OCR MODEL’ in the ‘SETTING’ section of the menu strip, and an Input Box appears asking for the OCR model, with proper guidance on how to key it in.

If the OCR result matches the characters in the image and the hardware is in the correct position, the user can press the green button to start the inspection process. A small text box labeled ‘Item’ displays the item being inspected and counts the inspected items; when the count reaches 4, the inspection ends.
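The PASS / NOT PASS decision described above reduces to comparing the recognized string against the OCR Model. A minimal sketch (illustrative C, not the project's Visual Basic source; the function name and sample strings are invented):

```c
#include <assert.h>
#include <string.h>

/* Illustrative sketch of the inspection decision: the recognized string
 * is compared with the inserted OCR Model; any mismatch yields NOT PASS.
 * Returns 1 for PASS and 0 for NOT PASS. */
int ocr_inspect(const char *ocr_result, const char *ocr_model)
{
    return strcmp(ocr_result, ocr_model) == 0;
}
```

On NOT PASS the reject command ('b') would be sent to the PIC over the serial link; on PASS, the move-on command ('a').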
The slots are then reset to the initial position. To stop the inspection process in an emergency, the user can click the ‘STOP INSPECTION’ button, displayed in red; it is initially green and labeled ‘START INSPECTION’. After the inspection starts, the hardware moves and the software captures and analyzes the images automatically. The movement of the hardware is discussed in the hardware implementation part.

Figure 4.5: NOT PASS code generated

The ‘OCR TEACH ENABLE’ item in the ‘SETTING’ section of the menu strip lets the user change to Teach Mode. This function is password protected to avoid misuse (the password is ckho), and the user needs some knowledge of the program flow to teach the system. Enabling it activates the ‘Teach’ button. The system is not yet “mature”: there are still symbols and characters that it does not recognize. To improve the system, the user can always teach it by pressing the ‘Teach’ button after capturing and thresholding the image. The system memorizes the array pattern and stores it on the computer, like a ROM. The captured image should contain only one character, at any position but not rotated. The teaching process is shown in Figure 4.6. After clicking the Teach button, an Input Box appears asking the user to insert the character shown in the image; the user keys it in according to the image and clicks ‘OK’, and the teach is complete. The user can now test the result by clicking OCR3; the result is displayed as shown in Figure 4.7, where a NOT PASS code is generated because the result does not match the OCR model. The system can now recognize the inverted R.

How can a taught character be erased if a mistake was made during teaching? This is done by opening the text file named “OCRdatabase.txt” and deleting the highlighted part shown in Figure 4.8. Here we can observe that the system saves the characters in the form of array elements ‘1’ and ‘0’, as stated earlier.
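The ‘1’/‘0’ array format just described can be illustrated with a small sketch: the taught character's thresholded pattern is flattened row by row into a line of ‘1’ and ‘0’ characters, matching what is visible in OCRdatabase.txt. C is used here purely for illustration (the project itself is Visual Basic), and the function name is invented:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative sketch: flatten a 2-D binary character pattern, row by
 * row, into a line of '1' and '0' characters of the kind stored in the
 * recognition database text file. The output buffer must hold
 * rows*cols + 1 bytes. */
void pattern_to_line(const unsigned char *pattern, size_t rows, size_t cols,
                     char *line)
{
    size_t k = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            line[k++] = pattern[r * cols + c] ? '1' : '0';
    line[k] = '\0';
}
```

Appending such a line to the database file corresponds to teaching one character, and deleting the last line corresponds to erasing a mistaken teach.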
The system is already programmed to handle 500 alphabet characters, 10,000 Chinese characters, and 10,000 Japanese characters; the capacity can easily be increased with a little extra programming.

Figure 4.6: Teach function of the system
Figure 4.7: OCR result after teaching the system
Figure 4.8: Database in a text file for system recognition

4.2.1.2 SHAPE RECOGNITION AND PCB INSPECTION

In the menu strip of the OCR inspection interface, the user can select the ‘PROJECT’ item; from the drop-down list, the ‘Shape recognition’ item opens the next window, the shape recognition mode. To perform inspection, the comport must be activated and the hardware must be switched on. A new interface for shape recognition appears, as shown in Figure 4.9.

Figure 4.9: Shape recognition interface (menu strip, camera frame, function buttons, image frame, threshold adjustment, percentage setting, selection section, and start inspection button labeled)

First of all, the camera preview is started by pressing the ‘Start Preview’ button. The threshold of the image can then be adjusted with the lighting threshold bar, and the user can press the ‘Black White’ button to see the thresholded image. If the threshold is not satisfactory, it is adjusted again until a clearer black-and-white image is obtained, as shown in Figure 4.10.

Figure 4.10: Thresholded image of a simple PCB pattern

The user can select a maximum of 6 parts for inspection, as stated earlier; this technique is called blob discovery. First of all, remember that the system stores the memorized pattern permanently, like a ROM, so the ‘Delete’ button should be pressed to remove any previously selected pattern from memory; an Input Box prompts the user to confirm. After deleting the previous pattern, selection can start. Selection is done in the Image Frame by dragging the mouse over the parts to inspect; a red square appears at each selected part. Then the ‘Memories Blob’ button is clicked to memorize the parts.
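The comparison applied to the memorized blobs can be sketched as a pixel-by-pixel match between two thresholded patterns of equal size, expressed as a percentage. This is an assumed formulation for illustration only, not the project's exact calculation, and the function name is invented:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative percentage-of-similarity measure for blob inspection:
 * count the pixels on which the memorized pattern and the newly
 * captured pattern agree, as a fraction of the total pixel count. */
double similarity_percent(const unsigned char *memorized,
                          const unsigned char *captured, size_t n)
{
    size_t match = 0;
    for (size_t i = 0; i < n; i++)
        if (memorized[i] == captured[i])
            match++;
    return 100.0 * (double)match / (double)n;
}
```

A pass threshold on this value then decides whether a part is accepted, which is the role of the percentage setting described next.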
The number of parts memorized is shown in the Selection section in light blue. Figure 4.11 shows an example in which two parts are selected.

Figure 4.11: Marked selection region

Then the ‘SET PERCENTAGE TO PASS’ value should be adjusted. The more similar the image, the higher its percentage of similarity; this section lets the user set the threshold that decides whether the item passes. If the calculated percentage of similarity is less than this value, the item is rejected during the inspection process. The user can test the result by capturing an image of a defective PCB, as shown in Figure 4.12: after capturing the image, press the ‘Black White’ button, then the ‘Blob Inspection’ button to observe the result. In this test, the defective PCB is missing one part, the first selected part, and so has a lower percentage of similarity (63.141 %). The percentage is not zero because some of the white region was included in the selection, and that white region still matches the defective circuit. The non-defective part shows 90.833 % similarity (not 100 %, because perfect positioning cannot be achieved). The threshold percentage was set to 83 % (by trial and error). If any one of the selected parts fails, the item is rejected.

After selection and threshold testing, the inspection can be started by clicking the green ‘START INSPECTION’ button, and the system runs automatically.

Figure 4.12: Result of shape inspection

4.2.1.3 COLOR RECOGNITION AND INSPECTION

To enter the color inspection mode, the user selects the ‘PROJECT’ item in the menu strip of the OCR inspection interface and chooses ‘Color recognition’ from the drop-down list to enter the next window, the color recognition mode. First of all, to enable the Group Boxes, select ‘Percentage color inspection’ and ‘Regional color inspection’ under ‘Menu’ in the menu strip.
Then tick the radio button in the ‘Region’ section to enable mouse selection.

Figure 4.13: Color recognition interface (menu strip, mouse marking, camera frame, function buttons, image frame, percentage setting, result frames, RGB threshold setting, and RGB values display labeled)

The color recognition interface has two modes: Regional Color Inspection and Percentage Color Inspection. For regional color inspection, the user can select a maximum of 3 colors to inspect. First of all, the ‘Delete’ button should be pressed to delete previously stored data. Next, click the ‘Start Preview’ button to enable the camera, then press the ‘Capture’ button to capture the image and display it in the Image frame. The user can then drag to select any three regions in the Image frame for the system to memorize; the selected regions are marked with red squares. The system memorizes the colors when ‘Memories Color’ is clicked in the function buttons section. The RGB threshold is adjusted with the ‘REGIONAL’ scrollbar; a tolerance of 20 is sufficient for this camera, and a better camera would probably allow a lower tolerance. The interface should look similar to Figure 4.14.

Figure 4.14: Regional color inspection mode

The system has now memorized the three colors, each at its own position. The user can test the result by clicking the ‘Color Inspect’ button; the result should be PASS, as shown in Figure 4.15. Then a defective item is tried; the result should be NOT PASS, as shown in Figure 4.16. The tolerance should be adjusted if the system gives a PASS signal for the defective colors. If any one of the colors differs, the item is considered NOT PASS. After testing, the user can start the inspection by clicking the green ‘START INSPECTION (REGIONAL)’ button.

Figure 4.15: Regional color inspection for a PASS item
Figure 4.16: Regional color inspection for a NOT PASS item
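The regional color check described above accepts a captured color when every RGB channel lies within the set tolerance (20 for this camera) of the memorized value. A minimal sketch of that test, in illustrative C with an invented function name (the project itself is Visual Basic):

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative sketch of the regional color test: a captured RGB value
 * matches the memorized reference color only if each channel differs
 * by no more than the tolerance. Returns 1 for a match, 0 otherwise. */
int color_matches(int r, int g, int b,
                  int ref_r, int ref_g, int ref_b, int tol)
{
    return abs(r - ref_r) <= tol &&
           abs(g - ref_g) <= tol &&
           abs(b - ref_b) <= tol;
}
```

Lowering the tolerance makes the check stricter, which is why a better camera (less color noise) would allow a smaller value.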
For percentage color inspection, the system detects and calculates the percentage of certain colors over the whole item instead of checking the colors of specific regions. The ‘Delete’ button should be pressed first, and selection is made by the same method. The RGB threshold can be adjusted to obtain a better color filter in the Result frames. After clicking ‘Color % Inspect’ to test the result, the interface should look similar to Figure 4.17. In Figure 4.17 the filtered colors are not yet good enough, so the RGB threshold has to be adjusted again. The result is NOT PASS because the percentage setting to pass is still 0 %. For this case, each selected color on the item is to be limited to 3 %, so the text boxes in the Percentage setting section are changed to 3 %; if any selected color exceeds 3 %, the item is rejected.

Figure 4.17: Percentage color inspection mode

After increasing the RGB tolerance to 20 for each channel and changing the percentages, the result should be PASS after clicking the ‘Color % Inspect’ button, and the filtered colors become clearer. The PERCENTAGE section shows the calculated percentage of the selected colors on the item. The result is shown in Figure 4.18. Figure 4.19 shows a PASS item, while Figure 4.20 shows an item that will be rejected by the system. To start the inspection and let the system run automatically, the user clicks the green ‘START INSPECTION (PERCENTAGE)’ button after carrying out the testing procedures above.

Figure 4.18: Improved setting to filter out colors
Figure 4.19: Color percentage inspection is independent of the item’s pattern
Figure 4.20: Item that fails the test because a percentage is exceeded

4.2.1.4 CHINESE RECOGNITION AND TRANSLATION

To enter Chinese recognition mode, the user clicks the ‘Chinese’ item under ‘PROJECT’ in the menu strip of the OCR inspection interface. A new window pops up, as shown in Figure 4.21.
Figure 4.21: Chinese recognition interface (menu strip, camera frame, recognition result, image frame, lighting threshold adjustment, editable text section, mode display, and counting labeled)

Using the same method, an image can be acquired and thresholded; the lighting threshold should be adjusted each time to obtain a clear black-and-white image. The computer must be equipped with Chinese software in order to display and type Chinese characters; NjStar Communicator is used in this project. After capturing an image, threshold it by clicking the ‘Black White’ button. As the user moves the mouse into the Image frame, a red square appears; the user moves the square to surround a character in the image and then clicks the mouse. The recognition result and translation appear in the Recognition result section, similar to Figure 4.22. Five candidate characters appear simultaneously in the Recognition result section, and the user selects the best result, in this case the fourth one. If the user clicks a text box in the Recognition result section, the result becomes editable text in the Editable text section, as shown in Figure 4.23.

Figure 4.22: Chinese recognition result (text box labeled)
Figure 4.23: Chinese recognition result turned into editable text

After a few recognitions, the user can export the editable text to any path on the computer in text file format by clicking the ‘EXPORT’ button. A text file containing the editable text pops up, and the user can save it to the desired path, as shown in Figure 4.24.

Figure 4.24: Chinese recognition result can be exported

The system is equipped with an intelligent teach function. If the system cannot recognize a character, the teach function is enabled with the same password; ‘Teach Mode Enable’ is under the ‘Menu’ item of the menu strip, and the system indicates Teach Mode in the mode display section. When the character is selected with the red square, an Input Box appears.
The user keys in the English meaning of the Chinese character and clicks OK. A message box then prompts the user to enter the Chinese character in the text box that appears, as shown in Figure 4.26. After keying in the Chinese character, click the ‘SAVE’ button, and end the teach mode by clicking ‘Teach Mode Enable’ again in the Menu. After that, the system recognizes the image when the character is selected again, as shown in Figure 4.27. If the user wants to erase a taught character, the same method applies as for the OCR inspection interface: the files to access are “chinese.txt”, “ChineseEng_database.txt”, and “Chinese_database.txt”, and the last line of each file should be deleted.

Figure 4.25: Chinese recognition Teach Mode
Figure 4.26: Keying in the character with Chinese software in the text box
Figure 4.27: New recognition result

4.2.1.5 JAPANESE RECOGNITION AND PRONUNCIATION GUIDES

The same method applies as for Chinese recognition. The only difference is the extra rich text box provided for pronunciation guides beside the editable text section. To change from Chinese recognition to Japanese recognition, click ‘Japanese Recognition’ in the ‘Setting’ section of the menu strip; the mode indication changes. The teaching and export procedures are the same. The results are shown in Figures 4.28 and 4.29. When there is a combination of pronunciations, the system prompts the user as shown in Figure 4.29. To delete a taught character, this time the user has to access “japanese.txt”, “JapanesePro_databese.txt”, and “Japanese_database.txt” and delete the last line.

Figure 4.28: Japanese recognition result
Figure 4.29: Japanese recognition result for a combination of pronunciations

4.2.1.6 DIFFERENT CHARACTER SIZE RECOGNITION

Different character size recognition is built for other applications such as plate number recognition.
To enter this mode, at the OCR inspection interface, select the ‘Plat Number’ item under ‘PROJECT’ in the menu strip. After the window pops up, acquire an image by the method above. Different character sizes can be tested by adjusting the distance from the camera to the image and clicking the OCR button; the system is still able to recognize the image. Sometimes errors occur, and the user can then teach the system using the same method as in the OCR inspection interface; to teach the system, press the ‘Recognize’ button.

Figure 4.30: Larger-size character recognition
Figure 4.31: Smaller-size character recognition

4.2.2 HARDWARE IMPLEMENTATION

Hardware implementation is very simple. First of all, make sure the Item Slots are at their initial position, as shown in Figure 4.32. Switch on both power supplies and observe that the LEDs light up, also shown in Figure 4.32. The camera should be connected to a USB port of the CPU, and the inspection item models placed on the Item Slots as shown in Figure 4.33. The system scans through the items one by one after the user clicks the ‘START INSPECTION’ button in each mode, and the rejecter rejects items according to the inspection results. After scanning through 4 items, the Item Slots reset to the initial position.

Figure 4.32: Overall system hardware (MAX232 to RS232, track, circuit, camera, PIC programmer, rejecter, lighting, and Item Slots initial position labeled)
Figure 4.33: a) Inspection items in position b) System scanning through each item
Figure 4.34: System flow a) Rejecter rejects the item (red light indicates NOT PASS) b) Process continues until the last item (green light indicates PASS)
Figure 4.35: System resets to the initial position automatically after finishing

4.3 RESULT ANALYSIS

After a number of tests of each mode of the system, errors occurred most frequently in OCR inspection mode. These errors are caused by rotated item positions.
The system cannot analyze a rotated image, which causes it to generate an error and require a restart. Effort was made in the programming part to handle this problem: when an error occurs, the system now stops the inspection and gives an error message to the user instead of requiring the program to be restarted. This is a temporary measure; a more suitable method should be found to recognize rotated images. The type of error for shape inspection is generally the same as for OCR inspection: rotated items decrease the percentage of similarity, so the system rejects rotated items even when they are not defective. For this project, a small metal plate was stuck on the bottom face of each inspection item; together with the magnets below the Item Slots, it helps hold the items in position. There are no problems with the other inspection and recognition modes. The inspection speed is about 3 to 4 seconds per item; color recognition takes longer if the user selects more colors. The processing speed can be increased by closing all other running programs on the computer and by using a better processor.

CHAPTER 5

CONCLUSION AND FUTURE WORKS

5.1 CONCLUSION

An OCR system with various functions has been created. Although there are still some weaknesses in the mechanical parts, these can be solved through accurate mechanical fabrication, and the speed of the whole process can be increased drastically with computers that have greater processor speed. The concepts of the software parts can be applied to materials inspection and to language learning, turning images into editable text. More features can be added in the future to maximize the OCR function. The whole project is basically a study and application of OCR knowledge; once we understand the image processing skills, we can generate many ideas to let computers work for us.
5.2 FUTURE WORK RECOMMENDATION

In the future, the weakness of the mechanical part of this project can be improved by real mechanical fabrication; accurate fabrication can prevent the inspection items from rotating out of their original positions. A better CPU should be used to increase the speed of image processing and pixel calculation. The whole project should be shielded with dark Perspex to avoid lighting noise from the surroundings. More reference points should be taken for shape recognition and inspection, instead of the single reference point used in this project. The quality of the camera should be improved to increase the image capturing speed. The character recognition software could also be modified and installed in a mobile phone: when we are in a foreign country, it could capture and translate any language that we do not understand.

Figure 5.1: Using a mobile phone to translate any language

REFERENCES

1. “Optical Character Recognition”, Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/wiki/OCR
2. “Machine Vision”, Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/wiki/Machine_Vision
3. Leow Yee Run, “Neural Network Simulator for Alphabet Recognition with Speech Synthesizer”, Final Year Thesis, UTM, FKE, 2003/2004.
4. Noor Khafilah Binti Khalid, “An Image Processing Approach Towards Classification of Defects in Printed Circuit Boards”, Final Year Thesis, UTM, FKE, 2007/2008.
5. Azizul Bin Kepli, “Vision Based Autonomous Color Detection and Object Tracking Robot”, Final Year Thesis, UTM, FKE, 2003/2004.
6. William A. Gowan, “Optical Character Recognition Using Fuzzy Logic”, Freescale Semiconductor, http://www.freescale.com/files/.../doc/app.../AN1220_D.pdf, 2004.
7. Hao Song and WeiXing Wang, “A new separation algorithm for overlapping blood cells using shape analysis”, International Journal of Pattern Recognition and Artificial Intelligence, World Scientific Publishing Company, vol. 23, pp. 847-864.
8.
“An Assessment of Geometric Activity Features for Per-pixel Classification of Urban Man-made Objects Using Very High Resolution Satellite Imagery”, Photogrammetric Engineering and Remote Sensing, vol. 75, April 2009, pp. 397-411.
9. Zhangquan, “Modification of pixel swapping algorithm with initialization from a sub pixel / pixel spatial attraction model”, Photogrammetric Engineering and Remote Sensing, vol. 75, April 2009, pp. 557-567.
10. Mikael Laine, “A Standalone OCR System for Mobile Cameraphones”, University of Turku, Department of Information Technology and Turku Centre for Computer Science (TUCS), 2006.
11. Zhidong Lu, “A Robust, Language-Independent OCR System”, Cambridge, MA 02138, http://www.metacarta.com/.../Language-independent-OCR-Kornai.pdf
12. “Baud Rate”, http://www.pccompci.com/Baud_Rate.html
13. Lammert Bies, “RS232 Specifications and Standards”, http://www.lammertbies.nl/comm/info/RS-232_specs.html, 2009.
14. Kai Rider and Kai Tracer, “Use of Infrared Transmitter and Receiver to Detect Black or White Line”, http://irbasic.blogspot.com/, November 24, 2006.
15. Nello Zuech, President, Vision Systems International, “Considerations in OCR/OCV Applications”, Kallett’s Corner, Machine Vision Market Analysis, 2009. http://www.machinevisiononline.org/
16. “KnowledgeBase for ActiveReports for .NET”, http://www.datadynamics.com/forums/76296/ShowPost.aspx
17. Evangelos Petroutsos, “Mastering Microsoft Visual Basic 2005”, Wiley Publishing, Inc., 2006.
18. “Hiragana”, Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/wiki/Hiragana
19. “Katakana”, Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/wiki/Katakana
20. Jan Axelson, Lakeview Research, http://www.lvr.com
21. “Machine Vision”, Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/wiki/Machine_Vision
22. “MAX232, MAX232I Dual EIA-232 Drivers/Receivers”, http://focus.ti.com/lit/ds/symlink/max232.pdf, Texas Instruments, February 1989, revised March 2004.
23.
Wichit Sirichote, “RS232C Level Converter” , http://chaokhun.kmitl.ac.th/~kswichit/MAX232/MAX232.htm 24. Shoaib Ali , “To test MAX232”, Shoaib Ali, http://www.arcelect.com/rs232.htm , 06/30/09. 25. “LOW POWER QUAD OPERATIONAL AMPLIFIERS”, http://www.datasheetcatalog.com/LM324 , STMicroelectronics, 2001. 26. “Tobacco sorter TS3” , http://www.key.net/products/tobacco-sorter/default.html 94 APPENDIX A 2D MECHANICAL DRAWING (BODY TOP VIEW) 95 APPENDIX B 2D MECHANICAL DRAWING (SLOTS TOP VIEW) 96 APPENDIX C 2D MECHANICAL DRAWING (OVERALL TOP VIEW) 97 APPENDIX D MICE EXHIBITION CERTIFICATE 98 APPENDIX E PIC18F452 CODING int i=300; int j=1; unsigned char b; ///////////////////////////////////// void motorleft(void) { while(i--) {PORTA.F5=1; } PORTA.F5=0; } /////////////////////////////////////// void picker(void) { PORTB.F1 = 1; Delay_ms(110); PORTB.F1=0; Delay_ms(300); PORTB.F2 = 1; Delay_ms(120); PORTB.F2 = 0; //function that move motor to left //function for rejecter movement //utilize concept of PWM to control motor speed //forward (push) //backward (pull back) } /////////////////////////////////////// void reachline1(void) { while(PORTD.F0!=0) //if sensor not sense the slit, move. 
    {
        PORTA.F5 = 1;
        Delay_us(4);              // PWM
    }
    PORTA.F5 = 0;
}

//////////////////////////////////////
void escapeline(void)
{
    while(PORTD.F0 == 1)          // while the sensor senses the slit, move away from it
    { PORTA.F5 = 1; }
    PORTA.F5 = 0;
}

///////////////////////////////////////
void reachfollowing(void)
{
    while(PORTD.F0 != 1)          // reach the following slit
    {
        PORTA.F5 = 1;
        Delay_us(8);              // PWM
        PORTA.F5 = 0;
    }
    PORTA.F5 = 0;
}

//////////////////////////////////////
void escapelinej(void)            // while the sensor senses the slit, move away from it;
{                                 // this function handles movement back to the reset position
    while(PORTD.F0 == 1)
    { PORTB.F4 = 1; }
    PORTB.F4 = 0;
}

//////////////////////////////////////
void reachfollowingj(void)        // reach the following slit, for the reset position
{
    while(PORTD.F0 != 1)
    {
        PORTB.F4 = 1;
        Delay_us(13);             // PWM
        PORTB.F4 = 0;
    }
    PORTB.F4 = 0;
}

///////////////////////////////////////
void overshoot_test(void)         // test whether the slot position has overshot
{
    if(PORTD.F0 == 0)
    {
        while(PORTD.F0 != 1)      // if overshot, move back to find the slit
        {
            PORTB.F4 = 1;
            Delay_us(3);          // PWM, very slow movement back to the slit
            PORTB.F4 = 0;
        }
        PORTB.F4 = 0;
    }
    else {}
}

/////////////////////////////////////////
void main()
{
    Usart_Init(9600);             // initialize USART module (8 bit, 9600 baud rate, no parity bit)
    ADCON1 = 6;                   // configure the analog pins as digital I/O
    TRISD = 0xFF;                 // PORTD as input (slit sensor)
    TRISA = 0;  PORTA = 0;        // PORTA as output
    TRISB = 0;  PORTB = 0;        // PORTB as output

    while(1)
    {
        if(Usart_Data_Ready())
        {
            b = Usart_Read();

            if(b == 'i')
            {
                motorleft();
                Usart_Write(b - 32);
            }
            if(b == 'j')
            {
                PORTA.F1 = 0;  PORTA.F0 = 0;
                while(i--)
                { PORTB.F4 = 1; }
                PORTB.F4 = 0;
                Usart_Write(b - 32);
            }
            if(b == 'p')                        // picker or rejecter
            {
                picker();
                Usart_Write(b - 32);
            }
            if(b == 'v')                        // OCR NOT PASS
            {
                PORTA.F1 = 0;  PORTA.F0 = 1;
                escapeline();      Delay_ms(1000);
                reachfollowing();  Delay_ms(500);
                overshoot_test();  Delay_ms(500);
                picker();          Delay_ms(1000);
                Usart_Write(b - 32);
            }
            if(b == 't')                        // OCR PASS
            {
                PORTA.F1 = 1;  PORTA.F0 = 0;
                escapeline();      Delay_ms(1000);
                reachfollowing();  Delay_ms(500);
                overshoot_test();
                Usart_Write(b - 32);
            }
            if(b == 'u')
            {
                reachfollowing();
                Usart_Write(b - 32);
            }
            if(b == 'k')                        // SHAPE PASS
            {
                PORTA.F1 = 1;  PORTA.F0 = 0;
                escapeline();      Delay_ms(1000);
                reachfollowing();  Delay_ms(500);
                overshoot_test();
                Usart_Write(b - 32);
            }
            if(b == 'l')                        // SHAPE NOT PASS
            {
                PORTA.F1 = 0;  PORTA.F0 = 1;
                escapeline();      Delay_ms(1000);
                reachfollowing();  Delay_ms(500);
                overshoot_test();  Delay_ms(500);
                picker();          Delay_ms(1000);
                Usart_Write(b - 32);
            }
            if(b == 'm')                        // COLOR PASS (regional)
            {
                PORTA.F1 = 1;  PORTA.F0 = 0;
                escapeline();      Delay_ms(1000);
                reachfollowing();  Delay_ms(500);
                overshoot_test();
                Usart_Write(b - 32);
            }
            if(b == 'n')                        // COLOR NOT PASS (regional)
            {
                PORTA.F1 = 0;  PORTA.F0 = 1;
                escapeline();      Delay_ms(1000);
                reachfollowing();  Delay_ms(500);
                overshoot_test();  Delay_ms(500);
                picker();          Delay_ms(1000);
                Usart_Write(b - 32);
            }
            if(b == 'o')                        // COLOR PASS (percentage)
            {
                PORTA.F1 = 1;  PORTA.F0 = 0;
                escapeline();      Delay_ms(1000);
                reachfollowing();  Delay_ms(500);
                overshoot_test();
                Usart_Write(b - 32);
            }
            if(b == 'r')                        // COLOR NOT PASS (percentage)
            {
                PORTA.F1 = 0;  PORTA.F0 = 1;
                escapeline();      Delay_ms(1000);
                reachfollowing();  Delay_ms(500);
                overshoot_test();  Delay_ms(500);
                picker();          Delay_ms(1000);
                Usart_Write(b - 32);
            }
            if(b == 'x')
            {
                escapelinej(); reachfollowingj();
                escapelinej(); reachfollowingj();
                escapelinej(); reachfollowingj();
                escapelinej(); reachfollowingj();
                Usart_Write(b - 32);
            }
            if(b == 'z')
            {
                escapeline(); Delay_ms(1000); reachfollowing();
                escapeline(); Delay_ms(1000); reachfollowing();
                escapeline(); Delay_ms(1000); reachfollowing();
                escapeline(); Delay_ms(1000); reachfollowing();
                Usart_Write(b - 32);
            }
            // 'x' moves the motor to the right through each slit;
            // 'z' moves the motor to the left through each slit
            else if(b == 'a' || b == 'b' || b == 'c' || b == 'd' ||
                    b == 'e' || b == 'f' || b == 'g' || b == 'h')
            {
                Usart_Write(b - 32);
            }
        }
    }
}