ELYSIUM JOURNAL OF ENGINEERING RESEARCH AND MANAGEMENT
SEPTEMBER 2014 | VOLUME 1, NO. 1 | SPECIAL ISSUE 1

TABLE OF CONTENTS

1. Integrated Security in Cloud Computing Environment (Page 1)
   S. Srinivasan, Dr. K. Raja
2. A Supervised Web-Scale Forum Crawler Using URL Type Recognition (Page 6)
   A. Anitha, Mrs. R. Angeline
3. A Robust Data Obfuscation Approach for Privacy Preserving Data Mining (Page 16)
   S. Deebika, A. Sathyapriya
4. E-Waste Management – A Global Scenario (Page 22)
   R. Devika
5. An Adequacy Based Multipath Routing in 802.16 WiMAX Networks (Page 24)
   K. Saranya, Dr. M.A. Dorai Rangasamy
6. Calculation of Asymmetry Parameters for Lattice Based Facial Models (Page 29)
   M. Ramasubramanian, Dr. M.A. Dorai Rangaswamy
7. Multi-Scale and Hierarchical Description Using Energy Controlled Active Balloon Model (Page 34)
   T. Gandhimathi, M. Ramasubramanian, M.A. Dorai Rangaswamy
8. Current Literature Review - Web Mining (Page 38)
   K. Dharmarajan, Dr. M.A. Dorairangaswamy
9. The Latchkey of the Research Proposal for Funded (Page 43)
   Mrs. B. Mohana Priya
10. A Combined PCA Model for Denoising of CT Images (Page 46)
    Mredhula L., Dorairangaswamy M.A.
11. RFID Based Personal Medical Data Card for Toll Automation (Page 51)
    Ramalatha M., Ramkumar A.K., Selvaraj S., Suriyakanth S.
12. Adept Identification of Similar Videos for Web Based Video Search (Page 56)
    Packialatha A., Dr. Chandra Sekar A.
13. Predicting Breast Cancer Survivability Using Naïve Bayesian Classifier and C4.5 Algorithm (Page 61)
    R.K. Kavitha, Dr. D. Dorairangasamy
14. Video Summarization Using Color Features and Global Thresholding (Page 64)
    Nishant Kumar, Amit Phadikar
15. Role of Big Data Analytic in Healthcare Using Data Mining (Page 68)
    K. Sharmila, R. Bhuvana
16. The Effect of Cross-Layered Cooperative Communication in Mobile Ad Hoc Networks (Page 71)
    N. Noor Alleema, D. Siva Kumar, Ph.D
17. Secure Cybernetics Protector in Secret Intelligence Agency (Page 76)
    G. Bathuriya, D.E. Dekson
18. Revitalization of Bloom's Taxonomy for the Efficacy of Highers (Page 80)
    Mrs. B. Mohana Priya
19. Security and Privacy-Enhancing Multi Cloud Architectures (Page 85)
    R. Shobana, Dr. Dekson
20. Stratagem of Using Web 2.0 Tools in TL Process (Page 89)
    Mrs. B. Mohana Priya
21. The Collision of Techno-Pedagogical Collaboration (Page 94)
    Mrs. B. Mohana Priya
22. No Mime When Bio-Mimicries Bio-Wave (Page 98)
    J. Stephy Angelin, Sivasankari P.
23. A Novel Client Side Intrusion Detection and Response Framework (Page 100)
    Padhmavathi B., Jyotheeswar Arvind M., Ritikesh G.
24. History Generalized Pattern Taxonomy Model for Frequent Itemset Mining (Page 106)
    Jibin Philip, K. Moorthy
25. IDC Based Protocol in Ad Hoc Networks for Security Transactions (Page 109)
    K. Priyanka, M. Saravanakumar
26. Virtual Image Rendering and Stationary RGB Colour Correction for Mirror Images (Page 115)
    S. Malathy, R. Sureshkumar, V. Rajasekar
27. Secure Cloud Architecture for Hospital Information System (Page 124)
    Menaka C., R.S. Ponmagal
28. Improving System Performance Through Green Computing (Page 129)
    A. Maria Jesintha, G. Hemavathi
29. Finding Probabilistic Prevalent Colocations in Spatially Uncertain Data Mining in Agriculture Using Fuzzy Logics (Page 133)
    Ms. Latha R., Gunasekaran E.
30. Qualitative Behavior of A Second Order Delay Dynamic Equations (Page 140)
    Dr. P. Mohankumar, A.K. Bhuvaneswari
31. Hall Effects on Magneto Hydrodynamic Flow Past an Exponentially Accelerated Vertical Plate in a Rotating Fluid with Mass Transfer Effects (Page 143)
    Thamizhsudar M., Prof. Dr. Pandurangan J.
32. Detection of Car-License Plate Using Modified Vertical Edge Detection Algorithm (Page 150)
    S. Meha Soman, Dr. N. Jaisankar
33. Modified Context Dependent Similarity Algorithm for Logo Matching and Recognition (Page 156)
    S. Shamini, Dr. N. Jaisankar
34. A Journey Towards: To Become the Best Varsity (Page 164)
    Mrs. B. Mohana Priya
35. Extraction of 3D Object from 2D Object (Page 167)
    Diya Sharon Christy, M. Ramasubramanian
36. Cloud Based Mobile Social TV (Page 170)
    Chandan Kumar Srivastawa, Mr. P.T. Sivashankar
37. Blackbox Testing of Orangehrm Organization Configuration (Page 174)
    Subburaj V.
INTEGRATED SECURITY IN CLOUD COMPUTING ENVIRONMENT

1S. Srinivasan, 2Dr. K. Raja
1Research Scholar, Research & Development Center, Bharathiar University, and Associate Professor, Department of M.C.A, K.C.G College of Technology, Chennai, Tamil Nadu, India
2Dean Academics, Alpha College of Engineering, Chennai, Tamil Nadu, India
[email protected]
Abstract - Cloud computing is a standard futuristic computing model that enables society to implement Information Technology and associated functions with low-cost computing capabilities. Cloud computing provides multiple, unrestricted distributed sites, from elastic computing to on-demand provisioning, with dynamic storage and computing capability. However, despite the probable gains attained from cloud computing, the security of open-ended and generously available resources is still uncertain, which hinders cloud adoption. The security problem becomes enlarged under the cloud model as new dimensions enter the problem space related to the model, such as multi-tenancy, layer confidence and extendibility. This paper introduces an in-depth examination of the cloud computing security problem. It appraises the problem of security from the cloud architecture perspective, the cloud delivery model viewpoint, and the cloud characteristics view. The paper examines several of the key research challenges of building cloud-aware security solutions which can reasonably secure the changing and dynamic cloud model. Based on this investigation, it presents a comprehensive specification of the cloud security problem and the main features that must be covered by any proposed security solution for cloud computing.

Keywords - Cloud computing security; cloud security model.
I. INTRODUCTION

Cloud computing [1] is a resource delivery and usage model; it means obtaining resources whereby shared software, hardware, and other information are provided to computers and other devices as a metered service via a network. Cloud computing is the next development of the distributed computing [2] paradigm, which provides extremely resilient resource pooling, storage, and computing resources. Cloud computing [2] has motivated industry, academia, and businesses to adopt the cloud to host everything from heavy, computationally intensive applications down to lightweight applications and services.
Cloud providers should treat privacy and security issues as a matter of high and urgent priority. Cloud providers offer Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS) and many other services. A cloud service has distinct characteristics such as on-demand self-service, ubiquitous network access, resource pooling, rapid elasticity and measured service. A cloud can be private or public. A public cloud sells services to anyone on the Internet. A private cloud is a proprietary network that supplies hosted services to a limited number of people. When a service provider uses public cloud resources to create its private cloud, the result is called a virtual private cloud.
Cloud computing services afford users fast access to their applications and diminish their infrastructure costs. As per a Gartner survey [3], the cloud market was worth USD 138 billion in 2013 and will reach USD 150 billion by 2015. These revenues imply that cloud computing is a promising platform. Despite the potential payback and revenues that could be realized from the cloud computing model, the model still has a set of open questions that affect the cloud's credibility and reputation.

Cloud security [3] is a broad set of policies, technologies, controls, and methods organized to protect data, applications, and the related infrastructure of cloud computing.
The major issues [4] in cloud computing are:
Multi-tenancy
Cloud secure federation
Secure information management
Service level agreement
Vendor lock-in
Loss of control
Confidentiality
Data integrity and privacy
Service availability
Data intrusion
Virtualization vulnerability
Elasticity
In this paper we analyze a few security issues involved in cloud computing models. This paper is organized as follows. Section II discusses several security risks in the cloud environment. Section III presents a short analysis of a few related precise issues of cloud security. Section IV describes an integrated security based architecture for cloud computing. Section V shows current solutions for the issues of the cloud environment. Finally, Section VI concludes the paper and describes future work for secure cloud computing.
II. SECURITY RISKS IN CLOUD ENVIRONMENT

Although cloud service providers can provide benefits to users, security risks play a vital role in the cloud environment [5]. According to a recent International Data Corporation (IDC) survey [6], the top challenge for 75% of CIOs in relation to cloud computing is security. Protecting information such as shared resources or credit card details from malicious insiders is of critical importance. A huge datacenter involves security challenges [7] such as vulnerability, privacy and control issues related to information accessed from third parties, integrity, data loss and confidentiality.
According to Takabi et al. [8], in SaaS, cloud providers are more responsible for the security and privacy of application services than the users. This responsibility is more relevant to the public than to the private cloud environment because users require rigorous security in the public cloud. In PaaS, clients are responsible for the applications which run on the platform, while cloud providers are liable for protecting one client's application from others. In IaaS, users are responsible for defending operating systems and applications, whereas cloud providers must afford protection for the clients' information and shared resources [9].

Ristenpart et al. [9] insist that the levels of security issues in the cloud environment are different. Encryption techniques and secure protocols are not adequate to secure data transmission in the cloud. Data intrusion into the cloud environment through the Internet by hackers and cybercriminals needs to be addressed, and the cloud computing environment needs to be secure and private for clients [10].
We will deal with a few security factors that mainly affect clouds, such as data intrusion and data integrity. Cachin et al. [11] report that when multiple resources such as devices are synchronized by a single user, it is difficult to address the data corruption issue. One of the solutions that they [11] propose is to use a Byzantine fault-tolerant replication protocol within the cloud. Hendricks et al. [12] state that this solution can avoid data corruption caused by some components in the cloud. In order to reduce risks in the cloud environment, users can use cryptographic methods to protect stored data and the sharing of resources in cloud computing [12]. Using a hash function [13] is a solution for data integrity that keeps a short hash in local memory.
According to Garfinkel [14], another security risk that may occur with a cloud provider, such as the Amazon cloud service, is data intrusion. Since Amazon allows a lost password to be reset by short message service (SMS), a hacker may be able to log in to the e-mail account after receiving the new reset password.

Service hijacking allows attackers to compromise services such as sessions and email transactions, thereby launching malicious attacks such as phishing and exploitation of vulnerabilities.
III. ISSUES OF CLOUD SECURITY

There are many security issues associated with a number of dimensions in the cloud environment. Gartner [15] states the following specific security issues: multi-tenancy, service availability, long-term viability, privileged user access and regulatory compliance.

Multi-tenancy means sharing of resources, services, storage and applications with other users residing on the same physical or logical platform at the cloud provider's premises. The defense-in-depth approach [16] is the solution for multi-tenancy; it involves defending the cloud virtual infrastructure at different layers with different protection mechanisms.
Another concern in cloud services is service availability. Amazon [17] points out in its licensing agreement that it is possible that the service may be unavailable from time to time. A user's requested service may terminate for any reason that breaks the cloud policy, or the service may fail; in this case there will be no charge to the cloud provider for this failure. To protect services from failure, cloud providers need measures such as backups and replication techniques [18], combined with encryption methods such as HMAC technology, to solve the service availability issue.
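As a hedged illustration of the HMAC idea mentioned above (the shared key, object names and flow are assumptions for illustration only, not the authors' design), a replica fetched from a backup site can be checked against a keyed digest before it is served:

    import hmac, hashlib

    SECRET_KEY = b"shared-secret-key"  # hypothetical key shared between user and provider

    def sign(data: bytes) -> str:
        # Compute an HMAC-SHA256 tag for a stored object.
        return hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()

    def verify(data: bytes, tag: str) -> bool:
        # Constant-time comparison avoids timing side channels.
        return hmac.compare_digest(sign(data), tag)

    primary_copy = b"customer record 42"
    tag = sign(primary_copy)
    replica_copy = b"customer record 42"   # fetched from a replicated backup site
    print(verify(replica_copy, tag))       # True only if the replica is intact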
Another cloud security issue is long-term viability. Preferably, a cloud computing provider will never go broke or get acquired and swallowed up by a larger company. But users must ensure their data will remain available even after such an event occurs. The data can be secured and protected in a reliable manner by combining service level agreements with law enforcement [17] and the establishment of legacy data centers.
Privileged user access and regulatory compliance are major concerns in cloud security. According to Arjun Kumar et al. [19], authentication and audit control mechanisms, service level agreements, cloud secure federation with single sign-on [20], session key management, and Identity, Authentication, Authorization, and Auditing (IAAA) mechanisms [21] will protect information and restrict unauthorized user access in cloud computing.
IV. INTEGRATED SECURITY BASED CLOUD COMPUTING MODEL

The integrated security based model for the cloud environment ensures security in the sharing of resources in order to avoid threats and vulnerabilities in cloud computing. It ensures security in the distribution of resources, the sharing of services, and service availability by assimilating cryptographic methods and a protective sharing algorithm, and by combining JAR (Java ARchive) files and RAID (redundant array of inexpensive or independent disks) technology with a trusted cloud computing hardware and software platform. The integrated security based cloud computing model is shown in Figure 1.
The cloud platform hardware and software module comprises software security, platform security, and infrastructure security. The software security provides identity management, access control mechanisms, and anti-spam and anti-virus protection. The platform security holds framework security and component security, which help to control and monitor the cloud environment. The infrastructure security secures the virtual environment in the integrated security based cloud architecture.

The cloud service provider controls and monitors privileged user access and regulatory compliance through the service level agreement and an auditing mechanism.

We can use the protective sharing algorithm and cryptographic methods to describe the security and sharing of resources and services in cloud computing:
Bs = A(user-node);  Ds = F * Bs + Ki

where:
A(.) : access to user nodes; an application server of the system is denoted by user-node in the formula;
Bs : byte matrix of the file F;
Ds : bytes of data files in the global center of the system;
Ki : user key.

Figure 1. Integrated security based cloud computing model
The model uses a hierarchical protecting architecture with two layers. Each layer has its own tasks and is incorporated with the other to ensure data security and to avoid cloud vulnerabilities in the integrated security based cloud environment.

The authentication boot and access control mechanism layer provides proper digital signatures, a password protection method, and a one-time password method to users, and manages the user access permission matrix mechanism. An authenticated boot service monitors which software is booted on the computer and keeps an audit log of the boot process.
The integration of the protective sharing algorithm and cryptographic methods with the redundant array of inexpensive disks layer advances the technologies in service availability. The model improves the efficiency of multi-tenancy and protects the information provided by the users. The protective cloud environment provides an integrated, wide-ranging security solution and ensures data confidentiality, integrity and availability in the integrated security based cloud architecture. It constructs the autonomous protection of the secure cloud by associating security services such as authentication and confidentiality, reducing the risk of data intrusion, and verifying integrity in the cloud environment.
F : file; the file F in a user-node is represented as F = {F(1), F(2), F(3), ..., F(n)}, i.e., file F is a group of n bytes.

Based on the values of information security of the cloud environment, we design the protective sharing algorithm with cryptographic methods such as encryption, which maintains a protective secret key for each machine in the integrated security based cloud computing model, indicated as follows:

Bs = A(user-node);  Bs = P . Bs + Ki;  Ds = E(F) Bs

where:
As(.) : authorized application server;
Bs : byte matrix in protected mode;
P : users' protective matrix;
E(F) : encryption of the bytes of file F.
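The formulas above are only given abstractly; as one hedged reading (the per-user key derivation and XOR-based keystream encryption below are assumptions, not the authors' implementation), F can be treated as a byte array that is protected with a per-user key Ki before being stored at the global center:

    import hashlib

    def derive_user_key(user_id: str, length: int) -> bytes:
        # Ki: a per-user keystream of the required length (illustrative derivation only).
        stream = b""
        counter = 0
        while len(stream) < length:
            stream += hashlib.sha256(f"{user_id}:{counter}".encode()).digest()
            counter += 1
        return stream[:length]

    def protect_file(file_bytes: bytes, user_id: str) -> bytes:
        # file_bytes plays the role of Bs (bytes of file F); the result plays the role of Ds.
        ki = derive_user_key(user_id, len(file_bytes))
        return bytes(b ^ k for b, k in zip(file_bytes, ki))

    def recover_file(protected: bytes, user_id: str) -> bytes:
        # XOR with the same keystream recovers the original bytes.
        return protect_file(protected, user_id)

    original = b"patient-record-001"
    stored = protect_file(original, "user-node-7")
    assert recover_file(stored, "user-node-7") == original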
The model adopts a multi-dimensional architecture with a two-layer defense in the cloud environment. The RAID (redundant array of independent disks) layer assures data integrity by data placement in terms of node striping. The cloud service provider audits events, logs, and monitors what happens in the cloud environment.
V. CURRENT SOLUTIONS FOR THE ISSUES IN CLOUD ENVIRONMENT

In order to reduce threats, vulnerabilities, and risks in the cloud environment, consumers can use cryptographic methods to protect the data, information and sharing of resources in the cloud [22]. Using a hash function [13] is a solution for data integrity that maintains a small hash in local memory.
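A minimal sketch of this hash-based integrity check (the object contents and the digest truncation length are illustrative): the client keeps only a short digest locally and recomputes it when the object is read back from the cloud.

    import hashlib

    def short_hash(data: bytes, hex_chars: int = 16) -> str:
        # Keep only a truncated SHA-256 digest in local memory.
        return hashlib.sha256(data).hexdigest()[:hex_chars]

    uploaded = b"encrypted-backup-blob"
    local_digest = short_hash(uploaded)        # stored locally, a few bytes

    downloaded = b"encrypted-backup-blob"      # later fetched back from the cloud
    if short_hash(downloaded) != local_digest:
        raise ValueError("cloud object failed the integrity check")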
Bessani et al. [18] use a Byzantine fault-tolerant method to provide and store data on different clouds, so that if one of the cloud providers is out of order, they are still able to store and retrieve information accurately.

Bessani et al. [18] also use the DepSky system to deal with availability and confidentiality in the cloud computing architecture. Using cryptographic methods, the keys are stored in the cloud by means of a secret sharing algorithm that hides the values of the keys from attackers. Encryption is the solution considered by Bessani et al. to address the issue of loss of data.
Muntés-Mulero et al. discussed the issues that existing privacy protection technologies such as k-anonymity face when applied to large data and analyzed the current solutions [23].
Sharing of account credentials between customers should be strictly prohibited [24]; the cloud service provider should deploy strong authentication, authorization and auditing mechanisms for consumer sessions. The consumer can also enable a Host Intrusion Prevention System (HIPS) at the customer end points in order to achieve confidentiality and secure information management.
The integrated security based model combines RAID technology with the sharing algorithm and cryptographic methods to assure data integrity and service availability in the cloud computing architecture. The authentication boot and access control mechanism ensures security across the cloud deployment models.
VI. CONCLUSION AND FUTURE WORK

It is clear that, although the use of cloud computing has rapidly increased, cloud security is still considered the major issue in the cloud computing environment. To achieve a secure paradigm, this paper focused on vital issues, at a minimum from the cloud computing deployment models viewpoint: the cloud security mechanisms should be self-defending, with the ability to monitor and control user authentication and access control through the booting mechanism of the integrated security cloud computing model. This paper proposes a strong security based framework for the cloud computing environment with many security features, such as protective sharing of resources with cryptographic methods along with the combination of redundant array of independent disks storage technology and Java archive files between the users and the cloud service provider. The analysis shows that our proposed model is more secure under the integrated security based cloud computing environment and efficient in cloud computing.

Future research on this work will include the development of interfaces, standards and specific protocols that can support confidentiality and integrity in the cloud computing environment. We will make the actual design more practical and operational in the future. To welcome the coming cloud computing era, solving the cloud security issues becomes extremely urgent, which will lead cloud computing to a bright future.

REFERENCES
[1] Guoman Lin, "Research on Electronic Data Security Strategy Based on Cloud Computing", 2012 IEEE Second International Conference on Consumer Electronics, ISBN: 978-1-4577-1415-3, 2012, pp. 1228-1231.
[2] Akhil Behl, Kanika Behl, "An Analysis of Cloud Computing Security Issues", 2012 IEEE World Congress on Information and Communication Technologies, ISBN: 978-1-4673-4805-8, 2012, pp. 109-114.
[3] Deyan Chen, Hong Zhao, "Data Security and Privacy Protection Issues in Cloud Computing", 2012 IEEE International Conference on Computer Science and Electronics Engineering, ISBN: 978-0-7695-4647-6, 2012, pp. 647-651.
[4] Mohammed A. AlZain, Eric Pardede, Ben Soh, James A. Thom, "Cloud Computing Security: From Single to Multi-Clouds", IEEE Transactions on Cloud Computing, 9(4), 2012, pp. 5490-5499.
[5] J. Viega, "Cloud computing and the common man", Computer, 42, 2009, pp. 106-108.
[6] Clavister, "Security in the cloud", Clavister White Paper, 2008.
[7] C. Wang, Q. Wang, K. Ren and W. Lou, "Ensuring data storage security in cloud computing", ARTCOM'10: Proc. Intl. Conf. on Advances in Recent Technologies in Communication and Computing, 2010, pp. 1-9.
[8] H. Takabi, J.B.D. Joshi and G.J. Ahn, "Security and Privacy Challenges in Cloud Computing Environments", IEEE Security & Privacy, 8(6), 2010, pp. 24-31.
[9] T. Ristenpart, E. Tromer, H. Shacham and S. Savage, "Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds", CCS'09: Proc. 16th ACM Conf. on Computer and Communications Security, 2009, pp. 199-212.
[10] S. Subashini and V. Kavitha, "A survey on security issues in service delivery models of cloud computing", Journal of Network and Computer Applications, 34(1), 2011, pp. 1-11.
[11] C. Cachin, I. Keidar and A. Shraer, "Trusting the cloud", ACM SIGACT News, 40, 2009, pp. 81-86.
[12] J. Hendricks, G.R. Ganger and M.K. Reiter, "Low-overhead Byzantine fault-tolerant storage", SOSP'07: Proc. 21st ACM SIGOPS Symposium on Operating Systems Principles, 2007, pp. 73-86.
[13] R.C. Merkle, "Protocols for public key cryptosystems", IEEE Symposium on Security and Privacy, 1980, pp. 122-134.
[14] S.L. Garfinkel, "Email-based identification and authentication: An alternative to PKI?", IEEE Security and Privacy, 1(6), 2003, pp. 20-26.
[15] Gartner: Seven cloud computing security risks. InfoWorld, 2008-07-02, http://www.infoworld.com/d/securitycentral/gartner-seven-cloud-computing-securityrisks-853.
[16] Microsoft Research, "Securing Microsoft's Cloud Infrastructure", White Paper, 2009.
[17] Amazon, Amazon Web Services, Web Services licensing agreement, October 3, 2006.
[18] A. Bessani, M. Correia, B. Quaresma, F. Andre and P. Sousa, "DepSky: dependable and secure storage in a cloud-of-clouds", EuroSys'11: Proc. 6th Conf. on Computer Systems, 2011, pp. 31-46.
[19] Arjun Kumar, Byung Gook Lee, Hoon Jae Lee, Anu Kumari, "Secure Storage and Access of Data in Cloud Computing", 2012 IEEE ICT Convergence, ISBN: 978-1-4673-4828-7, 2012, pp. 336-339.
[20] M.S. Blumental, "Hide and Seek in the Cloud", IEEE Security and Privacy, 11(2), 2010, pp. 57-58.
[21] Akhil Behl, Kanika Behl, "Security Paradigms for Cloud Computing", 2012 IEEE Fourth International Conference on Computational Intelligence, Communication Systems and Networks, ISBN: 978-0-7695-4821-0, 2012, pp. 200-205.
[22] R.C. Merkle, "Protocols for public key cryptosystems", IEEE Symposium on Security and Privacy, 1980.
[23] Muntés-Mulero V., Nin J., "Privacy and anonymization for very large datasets", in: Chen P., ed., Proc. of the ACM 18th Int'l Conf. on Information and Knowledge Management (CIKM 2009), New York: Association for Computing Machinery, 2009, pp. 2117-2118, doi:10.1145/1645953.1646333.
[24] Wikipedia - Cloud Computing Security.
A SUPERVISED WEB-SCALE FORUM CRAWLER USING URL TYPE RECOGNITION

1A. Anitha, 2Mrs. R. Angeline
1M.Tech., 2Assistant Professor
1,2Department of Computer Science & Engineering, SRM University, Chennai, India
ABSTRACT - The main goal of the Supervised Web-Scale Forum Crawler Using URL Type Recognition is to discover relevant content from web forums with minimal overhead. The result of the forum crawler is the information content of forum threads. The recent post information of the user is used to refresh the crawled threads in a timely manner. For each user, a regression model is found to predict the time when the next post arrives in the thread page; this information is used for timely refresh of forum data. Although forums are powered by different forum software packages and have different layouts or styles, they always have similar implicit navigation paths. Implicit navigation paths are connected by specific URL types which lead users from entry pages to thread pages. Based on this observation, the web forum crawling problem is reduced to a URL type recognition problem. We show how to learn regular expression patterns of implicit navigation paths from automatically generated training sets using aggregated results from weak page type classifiers. Robust page type classifiers can be trained from as few as three annotated forums. The forum crawler achieved over 98 percent effectiveness and 98 percent coverage on a large set of test forums powered by over 100 different forum software packages.

Index Terms - EIT path, forum crawling, ITF regex, page classification, page type, URL pattern learning, URL type.
1 INTRODUCTION

INTERNET forums [1] (also called web forums) are becoming important online services. Discussions about distinct topics are carried out between users in web forums. For example, the Opera forum board is a place where people can ask and share information related to Opera software. Due to the abundance of information in forums, knowledge mining on forums is becoming an interesting research topic. Zhai and Liu [20], Yang et al. [19], and Song et al. [15] mined structured data from forums. Gao et al. [9] recognized question and answer pairs in forum threads. Glance et al. [10] tried to extract business intelligence from forum data.

To mine knowledge from forums, their content must be downloaded first. Generic crawlers [7] adopt a breadth-first traversal strategy. The two main non-crawler-friendly features of forums are [8], [18]: 1) duplicate links and uninformative pages and 2) page-flipping links. A forum contains many duplicate links which point to a common page but have different URLs [4], e.g., shortcut links pointing to the most recent posts or URLs for user experience tasks such as "view by date" or "view by title." A generic crawler blindly follows these links and crawls many duplicate pages, making it inefficient. A forum also contains many uninformative pages such as forum-software-specific FAQs. Following these links, a crawler will fetch many uninformative pages.

A long forum board or thread is usually divided into multiple pages. These pages are linked by page-flipping links; for example, see Figs. 2b and 2c. Generic crawlers process each page of page-flipping links separately and eliminate the relationships between such pages. To facilitate downstream tasks such as page wrapping and content indexing [19], these relationships between pages should be conserved during crawling. For example, in order to mine all the posts in a thread as well as the reply relationships between posts, the multiple pages of the thread should be concatenated together. Jiang et al. [7] proposed techniques to learn and search web forums using URL patterns but did not discuss the timely refresh of thread pages.
A supervised web-scale forum crawler based on URL type recognition is introduced to address these challenges. The objective of this crawler is to gather relevant content, i.e., user posts, from forums with minimal overhead. Each forum has a different layout or style and is powered by one of a variety of forum software packages, but forums always contain implicit navigation paths that lead users from entry pages to thread pages.

Figure 1. Example link relations in a forum

Fig. 1 illustrates the link structure of the pages in a forum. For example, a user can traverse from the entry page to a thread page through the following paths:
1. entry -> board -> thread
2. entry -> list-of-board -> board -> thread
3. entry -> list-of-board & thread -> thread
4. entry -> list-of-board & thread -> board -> thread
5. entry -> list-of-board -> list-of-board &
thread -> thread
6. entry -> list-of-board -> list-of-board &
thread -> board -> thread
Pages between the entry page and thread page can be called index pages. The implicit navigation paths in a forum can be represented as an entry-index-thread (EIT) path:

entry page -> index page -> thread page
The task of forum crawling is thus reduced to a URL type recognition problem. The URLs are classified into three types: index URLs, thread URLs and page-flipping URLs. We show how to learn URL patterns, i.e., Index-Thread-page-Flipping (ITF) regexes, and the steps to identify these three types of URLs from as few as three annotated forum packages. "Forum package" here refers to "forum site." The timestamp in each thread page is collected. Any change in posts of the same thread that are distributed across various pages can be concatenated using the timestamp details in each thread page. For each thread, a regression model is used to predict the time when the next post arrives in the same page.

The most important contributions of this paper are as follows:
1. The forum crawling problem is reduced to a URL type recognition problem.
2. It is shown how to automatically learn regular expression patterns (ITF regexes) that identify the index URLs, thread URLs, and page-flipping URLs using pre-built page classifiers trained from as few as three annotated forums.
3. To refresh the crawled thread pages, incremental crawling of each page using the timestamp is used.
4. The evaluation of the URL type recognition crawler on a large set of 100 unseen forum packages showed that the learned patterns (ITF regexes) are effective during crawling. The results also showed that the URL type recognition crawler performs better than the structure-driven crawler and iRobot.
The rest of this paper is organized as follows. Section 2 provides a brief review of related work. Section 3 defines the terms used in this paper. Section 4 describes the overview and the algorithms of the proposed approach. Experimental evaluations are reported in Section 5. Section 6 contains the conclusion and future work of the research.
2 RELATED WORKS

Vidal et al. [17] proposed a crawler which crawls from the entry page to thread pages using learned regular expression patterns of URLs. The target pages are found by comparing DOM trees of pages with a preselected sample target page. This method is effective only when the sample page is drawn from the specific site. For each new site the same process must be repeated. Therefore, this method is not suitable for large-scale crawling. In contrast, the URL type recognition crawler automatically learns URL patterns across multiple sites using the training sets and finds a forum's entry page given a page from the forum.

Guo et al. [11] did not mention how to discover and traverse URLs. Li et al. [22] developed some heuristic rules to discover URLs, but the rules can be applied only to the specific forum software packages for which the heuristics were designed. However, on the Internet there are hundreds of different forum software packages. Refer to ForumMatrix [2] for more information about forum software packages. Many forums also have their own customized software.
A more widespread work on forum crawling is iRobot by Cai et al. [8]. iRobot is an intelligent forum crawler based on site-level structure analysis. It crawls by sampling pages, clustering them, selecting informative clusters using an informativeness evaluation, and finding a traversal path using a spanning tree algorithm. But the traversal path selection procedure requires human inspection. From the entry page to the thread page there are six paths, but iRobot will take only the first path (entry -> board -> thread). iRobot discovers new URL links using both URL patterns and location information, but when the page structure changes the URL location might become invalid. Next, Wang et al. [18] followed this work and proposed an algorithm for the traversal path selection problem. They presented the concepts of the skeleton link and the page-flipping link.

Skeleton links are "the valuable links of a forum site." These links are identified by informativeness and coverage metrics. Page-flipping links are identified by a connectivity metric. By following these links, they showed that iRobot can achieve more effectiveness and coverage. The URL type recognition crawler learns URL patterns instead of URL locations to discover new URLs. URL patterns are not affected by page structure modification.
The next related work in forum crawling is near-duplicate detection. A main problem in forum crawling is to identify duplicates and remove them. Content-based duplicate detection [12], [14] first downloads the pages and then applies the detection algorithm, which makes it bandwidth inefficient. URL-based duplicate detection [13] attempts to mine rules for different URLs with similar text; it needs to analyze logs from sites or the results of a previous crawl, which is not always available. In forums, all three types of URLs have specific URL patterns. The URL type recognition crawler adopts a simple URL string de-duplication technique (e.g., a string hashset). This method can avoid duplicates without duplicate detection.
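A minimal sketch of the string-hashset de-duplication described above (the trivial normalization shown here is an assumption, not part of the paper):

    seen_urls = set()

    def should_crawl(url: str) -> bool:
        # Normalize trivially, then skip any URL string seen before.
        key = url.strip().rstrip("/").lower()
        if key in seen_urls:
            return False
        seen_urls.add(key)
        return True

    print(should_crawl("http://forum.example.com/Threads/42/"))  # True
    print(should_crawl("http://forum.example.com/threads/42"))   # False, duplicate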
To reduce unnecessary crawling, industry standard protocols such as "nofollow" [3], the Robots Exclusion Standard (robots.txt) [6], and the Sitemap Protocol [5] have been introduced. Page authors can inform the crawler that the destination page is not informative by specifying the "rel" attribute with the "nofollow" value (i.e., rel="nofollow"). This method is ineffective since the author must specify the "rel" attribute each time. Next, the Robots Exclusion Standard (robots.txt) specifies which pages a crawler is allowed to visit or not. The Sitemap [5] method lists all URLs along with their metadata, including update time, change frequency, etc., in an XML file. The purpose of robots.txt and Sitemap is to allow the site to be crawled intelligently. Although these files are useful, their maintenance is very difficult since they change continually.
3 TERMINOLOGY

In this section, some terms used in this paper are defined to make the presentation clear and to support the discussion that follows.

Page Type: Forum pages are categorized into four page types.

Entry Page: The homepage of a forum, which is the lowest common ancestor of all threads. It contains a list of boards. See Fig. 2a for an example.

Index Page: A board page in a forum which contains a table-like structure. Each row in the table contains information of a board or a thread. See Fig. 2b for an example. List-of-board pages, list-of-board-and-thread pages, and board pages are all treated as index pages.

Thread Page: A page in a forum that contains a list of post content belonging to the same discussion topic generated by users. See Fig. 2c for an example.

Other Page: A page which does not belong to any of the three page types above, i.e., entry page, index page, or thread page.

Figure 2. An example of EIT paths: entry -> board -> thread
URL Type: URLs can be categorized into four different types.

Index URL: A URL linking an entry page to an index page or one index page to another. Its anchor text shows the title of its destination board. Figs. 2a and 2b show an example.

Thread URL: A URL linking an index page to a thread page. Its anchor text is the title of its destination thread. Figs. 2b and 2c show an example.

Page-flipping URL: A URL connecting multiple pages of a board or a thread. Page-flipping URLs allow a crawler to download all threads in a large board or all posts in a long thread. See Figs. 2b and 2c for examples.

Other URL: A URL which does not belong to any of the three URL types above, i.e., index URL, thread URL, or page-flipping URL.

EIT Path: An entry-index-thread path is a navigation path from an entry page to thread pages through a sequence of index pages. See Fig. 2.

ITF Regex: An index-thread-page-flipping regular expression is used to recognize index, thread, or page-flipping URLs. The ITF regexes of the URLs are learned and applied directly in online crawling. Four ITF regexes are learned for each specific site: one for identifying index URLs, one for thread URLs, one for index page-flipping URLs, and one for thread page-flipping URLs. See Table 2 for an example.
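To make the ITF regex idea concrete, the sketch below shows how learned patterns for one site could be applied to classify outgoing URLs. The patterns are simplified stand-ins for the learned regexes of Table 2, not the exact output of the learner:

    import re

    # Hypothetical ITF regexes for one forum site (see Table 2 for the paper's examples).
    ITF_REGEXES = {
        "INDEX_URL":         re.compile(r"^http://www\.aspforums\.net/Forums/\w+/\d+/Threads$"),
        "THREAD_URL":        re.compile(r"^http://www\.aspforums\.net/Threads/\d+/[\w-]+/$"),
        "PAGE_FLIPPING_URL": re.compile(r"^http://www\.aspforums\.net/Threads/\d+/[\w-]+/\d+$"),
    }

    def classify_url(url: str) -> str:
        # Return the URL type of the first matching regex, or OTHER_URL.
        for url_type, pattern in ITF_REGEXES.items():
            if pattern.match(url):
                return url_type
        return "OTHER_URL"

    print(classify_url("http://www.aspforums.net/Threads/679152/TableC-Net/"))  # THREAD_URL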
4 A SUPERVISED WEB-SCALE FORUM CRAWLER - URL TYPE RECOGNITION

In this section some observations related to crawling, the system overview, and the modules are discussed.

4.1 Observations

The following characteristics of forums are observed by investigating 20 forums to make crawling effective:

Figure 3. System Overview

1. Navigation path: Each forum has a different layout and style, but all forums have implicit navigation paths in common which lead the user from the entry page to thread pages. In this crawler, the implicit navigation path is specified as the EIT path, which describes the types of links and pages that a crawler should follow to reach thread pages.

2. URL layout: URL layout information such as the location of a URL on a page and its anchor text length is used for the identification of index URLs and thread URLs. For index URLs the anchor text length is small and the same page contains many URLs. For thread URLs the anchor text length is long and the page contains few or no URLs.

3. Page layout: Index pages and thread pages of different forums have similar layouts. An index page has narrow records like a board. A thread page has large records of user posts.

Using a page type classifier learned from a set of a few annotated pages based on these page characteristics, the page types can be identified. This is the only process in crawling where manual annotation is required. Using the URL layout characteristics, the index URLs, thread URLs, and page-flipping URLs can be detected.
4.2 System Overview

Fig. 3 shows the overall architecture of the crawler. It consists of two major parts: the learning part and the online crawling part. The learning part first learns the ITF regexes of a given forum from constructed URL training sets and then implements incremental crawling using the timestamp when there is a new user post in a thread page. The learned ITF regexes are used to crawl all thread pages in the online crawling part. The crawler finds the index URLs and thread URLs on the entry page using the Index/Thread URL Detection module. The identified index URLs and thread URLs are stored in the index/thread URL training sets. The destination pages of the identified index URLs are fed again into the Index/Thread URL Detection module to find more index and thread URLs until no more index URLs are detected. After that, the page-flipping URLs are found from both index pages and thread pages using the Page-Flipping URL Detection module, and these URLs are stored in the page-flipping URL training sets. From the training sets, the ITF Regexes Learning module learns a set of ITF regexes for each URL type.

Once the learning is completed, the online crawling part is executed: starting from the entry URL, the crawler follows all URLs matched by any learned ITF regex and crawls until no page can be retrieved or another stopping condition is satisfied. It also checks for any change in index/thread pages during the user login time. The next user login time is identified by the regression method. The identified changes in index and thread pages are fed again to the detection modules to identify any changes in the page URLs. The online crawling part outputs the resulting thread pages, together with the modified thread pages, with the help of the learned ITF regexes.
4.3 ITF Regexes Learning

To learn ITF regexes, the crawler has a two-step training procedure. The first step is to construct the training sets. The second step is regex learning.

4.3.1 Constructing URL Training Sets

The aim of this step is to create sets of highly precise index URL, thread URL, and page-flipping URL strings for ITF regex learning. Two separate training sets are created: an index/thread training set and a page-flipping training set.
4.3.1.1 Index URL and thread URL training set

Index URLs are the links between an entry page and an index page or between two index pages; their anchor text shows the title of the destination board. Thread URLs are the links between an index page and a thread page; their anchor text is the title of the destination thread.

Index and thread pages each have their own layout. An index page contains many narrow records with long anchor text and short plain text, whereas a thread page contains a few large records (user posts). Each user post has a very long text block and very short anchor text. Each record of an index page or a thread page is always associated with a timestamp field, but the timestamp order in these two types of pages is reversed: in an index page the timestamps are in descending order, while in a thread page they are in ascending order.
The distinction between index and thread pages is made by pre-built page classifiers. The page classifiers are built with a Support Vector Machine (SVM) [16] to identify the page type. The page layout, outgoing links, metadata, and DOM tree structures of the records are used as the main features, instead of the page content used in generic crawling. The main features and their descriptions are displayed in Table 1.
Feature | Value | Description
Record Count | Float | Number of records
Max/Avg/Var of Width | Float | Maximum/average/variance of record width among all records
Max/Avg/Var of Height | Float | Maximum/average/variance of record height among all records
Max/Avg/Var of Anchor Length | Float | Maximum/average/variance of anchor text length in characters among all records
Max/Avg/Var of Text Length | Float | Maximum/average/variance of plain text length in characters among all records
Max/Avg/Var of Leaf Nodes | Float | Maximum/average/variance of leaf nodes in the HTML DOM tree among all records
Max/Avg/Var of Links | Float | Maximum/average/variance of links among all records
Has Link | Boolean | Whether each record has a link
Has User Link | Boolean | Whether each record has a link pointing to a user profile page
Has Timestamp | Boolean | Whether each record has a timestamp
Time Order | Float | The order of timestamps in the records, if the timestamps exist
Record Tree Similarity | Float | The similarity of HTML DOM trees among all the records
Ratio of Anchor Length to Text Length | Float | The ratio of anchor text length in characters to plain text length in characters
Number of Groups | Float | The number of element groups after HTML DOM tree alignment

Table 1. Main Features for Index/Thread Page Classification
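As a rough illustration of how the page type classifier could be trained on such record-level features (the paper specifies only SVM [16] and the Table 1 features; the feature values and the scikit-learn usage below are assumptions):

    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline

    # Each row: [record_count, avg_anchor_len, avg_text_len, has_timestamp, time_order]
    # (a tiny, made-up subset of the Table 1 features)
    X_train = [
        [45, 38.0,  12.0, 1, -1.0],   # index-page-like layout: many narrow records
        [52, 41.5,  10.0, 1, -1.0],
        [ 8,  5.0, 900.0, 1,  1.0],   # thread-page-like layout: few large posts
        [12,  6.5, 750.0, 1,  1.0],
    ]
    y_train = ["INDEX_PAGE", "INDEX_PAGE", "THREAD_PAGE", "THREAD_PAGE"]

    page_classifier = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    page_classifier.fit(X_train, y_train)

    unseen_page = [[10, 7.0, 820.0, 1, 1.0]]
    print(page_classifier.predict(unseen_page))   # ['THREAD_PAGE']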
Algorithm IndexUrlAndThreadUrlDetection
Input: p: an entry page or index page
Output: it_groups: a group of index/thread URLs
1:  let it_groups be φ;
2:  url_groups = Collect URL groups by aligning HTML DOM tree of p;
3:  foreach urg in url_groups do
4:      urg.anchor_len = Total anchor text length in urg;
5:  end foreach
6:  it_groups = max(urg.anchor_len) in url_groups;
7:  it_groups.DstPageType = select the most common page type of the destination pages of URLs in urg;
8:  if it_groups.DstPageType is INDEX_PAGE
9:      it_groups.UrlType = INDEX_URL;
10: else if it_groups.DstPageType is THREAD_PAGE
11:     it_groups.UrlType = THREAD_URL;
12: else
13:     it_groups = φ;
14: end if
15: return it_groups;

Figure 4. Index and Thread URL Detection Algorithm
Using the same feature set, both index and thread page classifiers can be built. The URL type recognition crawler does not require strong page type classifiers. According to [15], [20], URLs that are displayed in an HTML table-like structure can be mined by aligning DOM trees, and these can be stored in a link table. The partial tree alignment method in [15] is adopted for crawling.

The index and thread URL training sets are created using the algorithm shown in Fig. 4. Lines 2-5 collect all the URL groups and calculate their total anchor text length; line 6 chooses the URL group with the longest total anchor text length as the index/thread URL group; and lines 7-14 decide its URL type. The URL group is discarded if it belongs to neither index nor thread pages.
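A condensed Python rendering of this selection step (assuming URL groups have already been extracted by DOM tree alignment; the data structures and helper functions are illustrative, not the authors' code):

    def detect_index_thread_urls(url_groups, classify_page):
        # url_groups: list of lists of (url, anchor_text) pairs obtained by aligning the DOM tree.
        # classify_page: maps a destination URL to a page type label from the page classifier.
        if not url_groups:
            return None, None
        # Line 6 of the algorithm: pick the group with the largest total anchor text length.
        best = max(url_groups, key=lambda g: sum(len(anchor) for _, anchor in g))
        # Lines 7-14: decide the URL type by the majority page type of the destination pages.
        types = [classify_page(url) for url, _ in best]
        majority = max(set(types), key=types.count)
        if majority == "INDEX_PAGE":
            return best, "INDEX_URL"
        if majority == "THREAD_PAGE":
            return best, "THREAD_URL"
        return None, None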
4.3.1.2 Page-flipping URL training set

Page-flipping URLs are very different from both index and thread URLs. Page-flipping URLs connect multiple pages of an index or a thread. There are two types of page-flipping URLs: grouped page-flipping URLs and single page-flipping URLs. In a single page, grouped page-flipping URLs appear as more than one page-flipping URL, whereas a single page-flipping URL appears as only one page-flipping link. Wang et al. [18] proposed a "connectivity" metric to distinguish page-flipping URLs from other loop-back URLs. However, the metric works well only for grouped page-flipping URLs and is unable to detect single page-flipping URLs. To address both types of page-flipping URLs, their special characteristics are observed, and an algorithm is proposed to detect page-flipping URLs based on these properties.

The observation is that grouped page-flipping URLs have the following properties:
1. Their anchor text is either a series of digits such as 1, 2, 3, or special text such as "last" or "Next."
2. They are found at the same location on the DOM tree of the source page and the DOM trees of their destination pages.
Algorithm PageFlippingUrlDetection
Input: pg: an index page or thread page
Output: pf_groups: a group of page-flipping URLs
1:  let pf_groups be φ;
2:  url_groups = Collect URL groups by aligning HTML DOM tree of pg;
3:  foreach urg in url_groups do
4:      if the anchor texts of urg are digit strings
5:          pages = Download (URLs in urg);
6:          if pages have similar layout to pg and urg is located at the same location in pg
7:              pf_groups = urg;
8:              break;
9:          end if
10:     end if
11: end foreach
12: if pf_groups is φ
13:     foreach url in outgoing URLs in pg
14:         sp = Download (url);
15:         pf_urls = Extract URL in sp at the same location as url in pg;
16:         if pf_urls exists and pf_urls.anchor == url.anchor and pf_urls.UrlString != url.UrlString
17:             add url and cond_url into pf_groups;
18:             break;
19:         end if
20:     end foreach
21: end if
22: pf_groups.UrlType = PAGE_FLIPPING_URL;
23: return pf_groups;

Figure 5. Page-Flipping URL Detection Algorithm
3. The layouts of their source page and destination pages are similar. To determine the similarity between two page layouts, a tree similarity method is used.

Single page-flipping URLs do not have property 1, but they have another special property:

4. For single page-flipping URLs, the source pages and the destination pages have similar anchor text but different URL strings.
The page-flipping URL detection algorithm is based on the above properties. The details are shown in Fig. 5. Lines 1-11 try to identify the grouped page-flipping URLs; if this fails, lines 13-20 examine all the outgoing URLs to detect single page-flipping URLs; and line 22 sets the URL type to page-flipping URL.
4.3.2 Learning ITF Regexes

The algorithms for creating the index URL, thread URL, and page-flipping URL string training sets have been explained above. This section explains how to learn ITF regexes from these training sets. Vidal et al. [17] proposed URL string generalization for learning, but this method is affected by negative URLs and requires very clean, precise URL examples. The URL type recognition crawler cannot guarantee that the training sets it creates are clean and precise, since they are generated automatically. Therefore the method of Vidal et al. [17] cannot be used for learning.
Page Type | URL Type | URL Pattern
Index | Index | http://www.aspforums.net/forum+/\w+/\d+/threads
Index | Page-Flipping | http://www.aspforums.net/forum+/\w+/\d+/\w+\d
Thread | Thread | http://www.aspforums.net/Threads+/\d+/\w/
Thread | Page-Flipping | http://www.aspforums.net/Threads+/\d+/\w+/\d-\w

Table 2. The learned ITF regexes from http://www.aspforums.net
Take these URLs for example:

http://www.aspforums.net/Forums/NetBasics/233/Threads
http://www.aspforums.net/Threads/679152/TableC-Net/
http://www.aspforums.net/Forums/ASPAJAX/212/Threads
http://www.aspforums.net/Threads/446862/AJAXCalen/
http://www.aspforums.net/Threads/113227/AJAXForms/

The regular expression pattern for the above URLs is given as http://www.aspforums.net/\w+/\d+/\w/, and the target pattern is given as http://www.aspforums.net/\w+\d+\w/. Koppula et al. [13] proposed a method to deal with negative examples. Starting with the generic pattern "*", the algorithm discovers the more specific patterns matching a set of URLs. Then each specific pattern is further refined to obtain more specific patterns. Patterns are collected continuously until no more patterns can be refined. When this method is applied to the previous example, "*" is refined to the specific pattern http://www.aspforums.net/\w+\d+\w/, which matches all URLs, both positive and negative. Then this pattern is further refined into two more specific patterns:

1. http://www.aspforums.net/Forums+/\w+/\d+/\threads
2. http://www.aspforums.net/Threads+/\d+/\w

All the URL subsets are matched with each specific pattern. These two patterns cannot be refined further, so the final output consists of these three patterns.
A small modification is made to this technique to reduce the number of patterns while still covering as many URLs as possible with the correct patterns. The adjustment is that a pattern is retained only if the number of its matching URLs is greater than an empirically determined threshold. The threshold is equal to 0.2 times the total count of URLs. For the given example, only the first pattern is retained because its match count exceeds the threshold.
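A small sketch of this retention rule (the refinement step itself is elided; only the 0.2 threshold filter described above is shown):

    import re

    def keep_refined_patterns(candidate_patterns, urls, ratio=0.2):
        # Retain a refined pattern only if it matches more than ratio * len(urls) URLs.
        threshold = ratio * len(urls)
        return [pattern for pattern in candidate_patterns
                if sum(1 for url in urls if re.match(pattern, url)) > threshold]

    # e.g. keep_refined_patterns(refined_candidates, training_urls) drops rarely matching patterns.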
The crawler learns a set of ITF regexes for a given forum, and each ITF regex has three elements: the page type of the destination pages, the URL type, and the URL pattern. Table 2 shows the learned ITF regexes from the forum aspforums. When a user posts a new index or thread page in the forum, it is identified from the timestamp of the user login information. The regression method is used to find any change in the user login information, and crawling is done again to find new index/thread pages.
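The paper does not specify which regression model is used for refresh scheduling; as one hedged illustration, a simple linear regression over the arrival times of past posts (the timestamps and the scikit-learn usage below are assumptions) could predict when the next post is likely to arrive:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Unix timestamps of the posts observed so far in one thread (illustrative values).
    post_times = np.array([0, 3600, 6900, 10500, 13800], dtype=float)

    # Model the arrival time of the k-th post as a function of k.
    k = np.arange(len(post_times)).reshape(-1, 1)
    model = LinearRegression().fit(k, post_times)

    next_index = np.array([[len(post_times)]])
    predicted_next_post = model.predict(next_index)[0]
    print(f"schedule a re-crawl of this thread around t = {predicted_next_post:.0f} s")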
4.4 Online Crawling

The crawler performs online crawling using a breadth-first strategy. The crawler first pushes the entry URL into a URL queue, then fetches a URL from the queue and downloads its page. It then pushes the outgoing URLs of the fetched page which match any learned regex into the URL queue. The crawler continues this process until the URL queue is empty or other stopping conditions are satisfied.
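The crawling loop described above can be sketched as a standard breadth-first traversal (the fetch and link-extraction functions are placeholders supplied by the caller; error handling and politeness delays are omitted):

    from collections import deque

    def crawl_forum(entry_url, itf_regexes, fetch, extract_links, max_pages=10000):
        # itf_regexes: compiled ITF regexes learned for this forum.
        # fetch(url) -> html, extract_links(html) -> outgoing URLs.
        queue = deque([entry_url])
        visited = set()
        crawled_pages = []
        while queue and len(visited) < max_pages:
            url = queue.popleft()
            if url in visited:
                continue
            visited.add(url)
            html = fetch(url)
            crawled_pages.append((url, html))
            for out_url in extract_links(html):
                # Follow only URLs matching a learned index/thread/page-flipping regex.
                if any(rx.match(out_url) for rx in itf_regexes) and out_url not in visited:
                    queue.append(out_url)
        return crawled_pages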
The online crawling of this crawler is very efficient since it only needs to apply the ITF regexes learned in the learning phase to the new outgoing URLs of newly downloaded pages. This reduces the time needed for crawling.

5 EXPERIMENTS AND RESULTS

In this section, the experimental results of the proposed system are presented, including a performance analysis of each module and a comparison of the URL type recognition crawler with other generic crawlers in terms of both effectiveness and coverage.

5.1 Experiment Setup

To carry out the experiments, 200 different forum software packages are selected from ForumMatrix [2], and a forum powered by each software package is found. In total, there are 120 forums powered by 120 different software packages. Among them, 20 forums are selected as the training set and the remaining 100 forums are used for testing. The 20 training packages are installed by 23,672 forums and the 100 test packages are installed by 127,345 forums. A script is created to find the number of threads and users in these packages. It is estimated that these packages cover about 1.2 million threads generated by over 98,349 users.

ID | Forum | Forum Name | Software | #Threads
1 | forums.afterdawn.com | AfterDawn Forums | Customized | 535,383
2 | forums.asp.net | ASP.NET Forums | Community Server | 1,446,264
3 | forum.xda-developers.com | Android Forums | vBulletin | 299,073
4 | forums.crackberry.com | BlackBerry Forums | vBulletin V2 | 525,381
5 | techreport.com/forums | Tech Report | phpBB | 65,083

Table 4. Forums used in Online Crawling Evaluation

5.2 Evaluations of Modules

5.2.1 Evaluation of Index/Thread URL Detection

To build the page classifiers, three index pages, three thread pages, and three other pages from each of the 20 training forums are selected manually and the features of these pages are extracted. For testing, 10 index pages, 10 thread pages, and 10 other pages from each of the 100 test forums are selected manually. This is known as the 10-Page/100 test set. The Index/Thread URL Detection module described in Section 4.3.1 is executed on this test set and the detected URLs are checked manually. The result is computed at page level, not at individual URL level, since a majority voting procedure is applied.

To check how many annotated pages this crawler needs to achieve good performance, the same experiments are conducted with different numbers of training forums (3, 5, 10, 20 and 30) using cross validation. The results are shown in Table 3. They show that a minimum of three annotated forums can achieve over 98 percent precision and recall.

#Train | Index Page Precision % (Avg - SD) | Index Page Recall % (Avg - SD) | Thread Page Precision % (Avg - SD) | Thread Page Recall % (Avg - SD) | Index/Thread URL Detection Precision % (Avg - SD) | Index/Thread URL Detection Recall % (Avg - SD)
3 | 97.51 - 0.83 | 96.98 - 1.33 | 98.24 - 0.55 | 98.12 - 1.15 | 99.02 - 0.17 | 98.14 - 0.21
5 | 97.05 - 0.69 | 97.47 - 1.56 | 98.28 - 0.27 | 98.04 - 1.23 | 99.01 - 0.15 | 98.13 - 0.18
10 | 97.23 - 0.20 | 96.91 - 1.38 | 98.43 - 0.44 | 97.96 - 1.49 | 99.01 - 0.15 | 98.08 - 0.17
20 | 97.34 - 0.18 | 96.18 - 0.56 | 98.66 - 0.26 | 98.00 - 1.18 | 99.00 - 0.10 | 98.10 - 0.12
30 | 97.44 - N/A | 96.38 - N/A | 99.04 - N/A | 97.49 - N/A | 99.03 - N/A | 98.12 - N/A

Table 3. Results of Page Type Classification and URL Detection

5.2.2 Evaluation of Page-Flipping URL Detection

To evaluate the Page-Flipping URL Detection module explained in Section 4.3.1, the module is applied on the 10-Page/100 test set and checked manually. The method achieved over 99 percent precision and 95 percent recall in identifying the page-flipping URLs. The failures in this module are mainly due to JavaScript-based page-flipping URLs or HTML DOM tree alignment errors.

5.3 Evaluation of Online Crawling

Among the 100 test forums, five forums (Table 4) are selected for a comparison study, of which four are powered by popular software packages used by many forum sites. These packages are used by more than 88,245 forums.
5.3.1 Online Crawling Comparison

Based on these metrics, the URL type recognition crawler is compared with other crawlers, namely the structure-driven crawler and iRobot. Even though the structure-driven crawler [25] is not a forum crawler, it can also be applied to forums. Each forum is given as input to each crawler, and the number of thread pages and other pages retrieved during crawling is counted.
Learning efficiency comparison

The learning efficiency of the crawlers is evaluated by the number of pages crawled. The results are estimated under the metric of average coverage over the five forums. The sample for each method is limited to at most N pages, where N varies from 10 to 1,000 pages.
Figure 6. Coverage comparison based on different numbers of sampled pages in the learning phase
Figure 7. Effectiveness comparison between the structure-driven crawler, iRobot, and the URL type recognition crawler
Using the learned knowledge, the forums are crawled with each method and the results are evaluated. Fig. 6 shows the average coverage of each method based on different numbers of sampled pages. The results show that the URL type recognition crawler needs only 100 pages to achieve stable performance, whereas iRobot and the structure-driven crawler need more than 750 pages. This indicates that the URL type recognition crawler can learn better knowledge about forum crawling with less effort.
Crawling effectiveness comparison
Fig. 7 shows the result of the effectiveness comparison. The URL type recognition crawler achieved almost 100% effectiveness on all forums. The average effectiveness of the structure-driven crawler is about 73%; this low effectiveness is mainly due to the absence of specific URL similarity functions for each URL pattern. The average effectiveness of iRobot is about 90%, but it is still considered an ineffective crawler since its random sampling strategy samples many useless and noisy pages during crawling. Compared to iRobot, the URL type recognition crawler learns the EIT path and ITF regexes for crawling, so it is not affected by noisy pages and performs better. This shows that, for a given fixed bandwidth and storage, the URL type recognition crawler can fetch much more valuable content than iRobot.
Crawling coverage comparison
Fig. 8 shows that the URL type recognition crawler had better coverage than the structure-driven crawler and iRobot. The average coverage of the URL type recognition crawler was 99%, compared to 93% for the structure-driven crawler and 86% for iRobot. The low coverage of the structure-driven crawler is due to its limited domain adaptation.
Figure 8: Coverage comparison between the structure-driven crawler, iRobot, and the URL type recognition crawler.
The coverage of iRobot is very low because it learns only one path from the sampled pages, which leads to the loss of many thread pages. In contrast, the URL type recognition crawler learns the EIT path and ITF regexes directly and crawls all the thread pages in forums. This result also shows that the index/thread URL detection and page-flipping URL detection algorithms are very effective.
6 CONCLUSION
In this work, the forum crawling problem is reduced to a URL type recognition problem; it is shown how to leverage the implicit navigation paths of forums, i.e., the EIT path, and methods are designed to learn ITF regexes explicitly. Experimental results confirm that the URL type recognition crawler can effectively learn knowledge of the EIT path from as few as three annotated forums. The test results on five unseen forums show that the URL type recognition crawler has better coverage and effectiveness than other generic crawlers.
In future work, more comprehensive experiments will be conducted to further verify that the URL type recognition crawler can be applied to other social media, and the crawler can be enhanced to handle forums that use JavaScript.
REFERENCES
[1] Internet Forum, http://en.wikipedia.org/wiki/Internetforum, 2012.
[2] ForumMatrix, http://www.forummatrix.org/index.php, 2012.
[3] nofollow, http://en.wikipedia.org/wiki/Nofollow, 2012.
[4] "RFC 1738 - Uniform Resource Locators (URL)," http://www.ietf.org/rfc/rfc1738.txt, 2012.
[5] "The Sitemap Protocol," http://sitemaps.org/protocol.php, 2012.
[6] "The Web Robots Pages," http://www.robotstxt.org/, 2012.
[7] J. Jiang, X. Song, N. Yu, and C.-Y. Lin, "FoCUS: Learning to Crawl Web Forums," IEEE Trans. Knowledge and Data Eng., vol. 25, no. 6, June 2013.
[8] R. Cai, J.-M. Yang, W. Lai, Y. Wang, and L. Zhang, "iRobot: An Intelligent Crawler for Web Forums," Proc. 17th Int'l Conf. World Wide Web, pp. 447-456, 2008.
[9] C. Gao, L. Wang, C.-Y. Lin, and Y.-I. Song, "Finding Question-Answer Pairs from Online Forums," Proc. 31st Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 467-474, 2008.
[10] N. Glance, M. Hurst, K. Nigam, M. Siegler, R. Stockton, and T. Tomokiyo, "Deriving Marketing Intelligence from Online Discussion," Proc. 11th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 419-428, 2005.
[11] Y. Guo, K. Li, K. Zhang, and G. Zhang, "Board Forum Crawling: A Web Crawling Method for Web Forum," Proc. IEEE/WIC/ACM Int'l Conf. Web Intelligence, pp. 475-478, 2006.
[12] M. Henzinger, "Finding Near-Duplicate Web Pages: A Large-Scale Evaluation of Algorithms," Proc. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 284-291, 2006.
[13] H.S. Koppula, K.P. Leela, A. Agarwal, K.P. Chitrapura, S. Garg, and A. Sasturkar, "Learning URL Patterns for Webpage De-Duplication," Proc. ACM Conf. Web Search and Data Mining, pp. 381-390, 2010.
[14] G.S. Manku, A. Jain, and A.D. Sarma, "Detecting Near-Duplicates for Web Crawling," Proc. 16th Int'l Conf. World Wide Web, pp. 141-150, 2007.
[15] X.Y. Song, J. Liu, Y.B. Cao, and C.-Y. Lin, "Automatic Extraction of Web Data Records Containing User-Generated Content," Proc. 19th Int'l Conf. Information and Knowledge Management, pp. 39-48, 2010.
[16] V.N. Vapnik, The Nature of Statistical Learning Theory. Springer, 1995.
[17] M.L.A. Vidal, A.S. Silva, E.S. Moura, and J.M.B. Cavalcanti, "Structure-Driven Crawler Generation by Example," Proc. 29th Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 292-299, 2006.
[18] Y. Wang, J.-M. Yang, W. Lai, R. Cai, L. Zhang, and W.-Y. Ma, "Exploring Traversal Strategy for Web Forum Crawling," Proc. 31st Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 459-466, 2008.
[19] J.-M. Yang, R. Cai, Y. Wang, J. Zhu, L. Zhang, and W.-Y. Ma, "Incorporating Site-Level Knowledge to Extract Structured Data from Web Forums," Proc. 18th Int'l Conf. World Wide Web, pp. 181-190, 2009.
[20] Y. Zhai and B. Liu, "Structured Data Extraction from the Web Based on Partial Tree Alignment," IEEE Trans. Knowledge and Data Eng., vol. 18, no. 12, pp. 1614-1628, Dec. 2006.
[21] K. Li, X.Q. Cheng, Y. Guo, and K. Zhang, "Crawling Dynamic Web Pages in WWW Forums," Computer Eng., vol. 33, no. 6, pp. 80-82, 2007.
A ROBUST DATA OBFUSCATION APPROACH FOR PRIVACY PRESERVING DATA MINING
S. Deebika¹, A. Sathyapriya²
¹PG Student, ²Assistant Professor
Department of Computer Science and Engineering, Vivekananda College of Engineering for Women, Namakkal, India.
¹Email: [email protected]  ²Email: [email protected]
Abstract: Data mining plays an important role in storing and retrieving huge volumes of data from databases. Every user wants to efficiently retrieve the encrypted files containing specific keywords while keeping the keywords themselves secret and without jeopardizing the security of the remotely stored files. Privacy-preserving data mining (PPDM) is needed to meet well-defined security requirements when attributes are globally distributed. Privacy-preserving data mining is used to protect sensitive information from unauthorized disclosure; its aim is to develop methods that release data without increasing the risk of misuse. Anonymization techniques (k-anonymity, l-diversity, t-closeness, p-sensitivity and m-invariance) offer more privacy options than other privacy preservation techniques (randomization, encryption and sanitization). However, these anonymization techniques only offer resistance against prominent attacks such as the homogeneity and background knowledge attacks; none of them provides protection against all known attacks or calculates the overall proportion of the data by comparing the sensitive values. We evaluate a new technique called (n,t)-closeness, which requires the distribution of a sensitive attribute in any equivalence class to be close to the distribution of the attribute in the overall table.
Index Terms: Anonymization, L-Diversity, PPDM, P-Sensitive, T-Closeness, (n,t)-Closeness.
I. INTRODUCTION
The rapid growth of internet technology has made it possible to use remote communication in every aspect of life. Along with this growth in technology, privacy and security in electronic communications have become pressing issues. Securing sensitive data against unauthorized access has been a long-term goal of the database security community. Data mining consists of a number of techniques for automatically and interactively retrieving information from large databases, which may also contain sensitive information. Privacy is therefore a vital issue when sensitive information is transferred from one place to another over the internet.
Most notably, in hospitals, government administrative offices and industries, there is a need to establish privacy for sensitive information or data before it is analyzed and processed further by other departments. Various organizations (e.g., hospital authorities, industries and government organizations) release person-specific data, called microdata, which carries information about the privacy of individuals. The main aim is to protect this information while simultaneously producing useful external knowledge.
A table consisting of microdata is called a micro table [6]. Its attributes fall into three types [3]: i) identifiers - attributes that uniquely identify an individual, e.g., Social Security number; ii) quasi-identifiers - attributes that an adversary may already know and that, taken together, can potentially identify an individual, e.g., birth date, sex and zip code; iii) sensitive attributes - attributes whose values are unknown to the adversary and are sensitive, e.g., disease and salary.
Sensitive information is slightly different from secret and confidential information. Secret information includes passwords, PIN codes, credit card details, etc., whereas sensitive information is mostly linked to diseases such as HIV, cancer and heart problems.
II. RELATED WORKS
The main aim of privacy preservation is to create methods and techniques that prevent the misuse of sensitive data. Techniques are proposed for altering the original data so as to achieve privacy; the alteration should not destroy the usefulness of the original data while improving its privacy. Various privacy methods can prevent unauthorized usage of a sensitive attribute. Some of the privacy methods [11][4] are anonymization, randomization, encryption and data sanitization. Extending these, many advanced techniques have been proposed, such as p-sensitive k-anonymity, (α, k)-anonymity, l-diversity, t-closeness, m-invariance, personalized anonymity, and so on. For multiple sensitive attributes [7], there are three kinds of information disclosure:
i) Identity Disclosure: an individual is linked to a particular record in the published data.
ii) Attribute Disclosure: sensitive information regarding an individual is disclosed.
iii) Membership Disclosure: information about whether an individual's record is present in the data set is disclosed.
When microdata is published, various attacks can occur, such as the record linkage attack and the attribute linkage attack. To avoid these attacks, different anonymization techniques were introduced. We surveyed these anonymization techniques [8]; they are explained below.
A. K-Anonymity
K-anonymity is a property possessed by certain anonymized data. The theory of k-anonymity was first formulated by L. Sweeney [12] in a paper published in 2002 as an attempt to solve the problem: "Given person-specific field-structured data, produce a release of the data with scientific guarantees that the individuals who are the subjects of the data cannot be re-identified while the data remain practically useful." [9][10]. A release of data is said to have the k-anonymity property if the information for each person contained in the release cannot be distinguished from at least k-1 other individuals whose information also appears in the release.
Methods for k-anonymization
In the framework of k-anonymization problems, a database is a table with n rows and m columns. Each row of the table represents a record relating to a specific member of a population, and the entries in the various rows need not be unique. The values in the various columns are the values of attributes associated with the members of the population. Table 1 below is a non-anonymized database consisting of patient records.
Suppression: In this method, certain values of the attributes are replaced by an asterisk '*'. In the anonymized table below, parts of the values in the 'Zip code' attribute have been replaced by '*'.
Generalization: In this method, individual values of attributes are replaced by a broader category. For example, the value '23' is replaced by '20 < Age ≤ 30', and so on. Table 2 below shows the anonymized database.
The k-anonymity model was developed to protect released data from linking attacks, but it can still lead to information disclosure. The protection k-anonymity provides is simple and easy to understand; however, k-anonymity does not provide a shield against attribute disclosure. Table 2 is the anonymized version of the database shown in Table 1.
S.No   Zip code   Age   Disease
1      4369       29    TB
2      4389       24    Viral infection
3      4598       28    No illness
4      4599       27    Viral infection
5      4478       23    Heart-related
Table 1: Non-anonymized database
Table 1 above has four attributes and five records. There are two common methods for achieving k-anonymity [13] for some value of k.
S.No   Zip code   Age             Disease
1      43**       20 < Age ≤ 30   TB
2      43**       20 < Age ≤ 30   Viral infection
3      45**       20 < Age ≤ 30   No illness
4      45**       20 < Age ≤ 30   Viral infection
5      44**       20 < Age ≤ 30   Heart-related
Table 2: Anonymized database
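As a minimal sketch of how the k-anonymity property of such a table can be checked (an illustration, not part of the paper), the records are grouped by their quasi-identifier values and every group is required to contain at least k records:

# Minimal k-anonymity check; the table layout mirrors Table 2 above.
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs >= k times."""
    groups = Counter(tuple(r[qi] for qi in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

table2 = [
    {"zip": "43**", "age": "20<Age<=30", "disease": "TB"},
    {"zip": "43**", "age": "20<Age<=30", "disease": "Viral infection"},
    {"zip": "45**", "age": "20<Age<=30", "disease": "No illness"},
    {"zip": "45**", "age": "20<Age<=30", "disease": "Viral infection"},
    {"zip": "44**", "age": "20<Age<=30", "disease": "Heart-related"},
]
print(is_k_anonymous(table2, ["zip", "age"], k=2))  # False: the 44** group has only one record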
Attacks on k-anonymity
In this section, we study the attacks on k-anonymity. There are two types of attacks: the homogeneity attack and the background knowledge attack. Table 3 illustrates the two types of attack.
Zip      Age    Disease
476**    2*     Heart Disease
476**    2*     Heart Disease
476**    2*     Heart Disease
4790**   >=40   Flu
4790**   >=40   Heart Disease
476**    3*     Heart Disease
476**    3*     Cancer
476**    3*     Cancer
Adversary knowledge: Bob (Zip 47678, Age 27) - homogeneity attack; John (Zip 47673, Age 36) - background knowledge attack.
Table 3: Homogeneity and background knowledge attack
Homogeneity Attack
The sensitive attribute values lack diversity. From the above table, we can easily conclude that Bob's zip code is in the range 476** and his age is between 20 and 29, and therefore that he suffers from heart disease. This is a homogeneity attack.
Background Knowledge Attack
The attacker has additional background knowledge about other sensitive data. L-diversity does not consider the overall distribution of sensitive values.
Similarity Attack
When the sensitive attribute values are distinct but semantically similar, an adversary can still learn important information. Table 4 shows a similarity attack.
Restrictions of K-anonymity
- K-anonymity can expose individuals' sensitive attributes.
- K-anonymity does not protect against the background knowledge attack.
- Plain knowledge of the k-anonymization algorithm can be exploited to violate privacy.
- It cannot be applied effectively to high-dimensional data.
- K-anonymity cannot protect against attribute disclosure.
Variants of K-anonymity
P-sensitive k-anonymity: a microdata set satisfies the p-sensitive k-anonymity property [15] if it satisfies k-anonymity and the number of distinct values for each sensitive attribute is at least p within the same QI group. It reduces information loss through the anatomy approach.
(α, k)-Anonymity: a view of a table is said to be an (α, k)-anonymization [16] of the table if the view modifies the table such that it satisfies both the k-anonymity and the α-deassociation properties with respect to the quasi-identifier.
B. L-diversity
L-diversity was proposed to overcome the shortcomings of k-anonymity and is an extension of it. L-diversity [1] was proposed by Ashwin Machanavajjhala in 2005. An equivalence class has l-diversity if there are l or more well-represented values for the sensitive attribute, and a table is l-diverse if each equivalence class of the table is l-diverse. This guards against attacks by requiring that "many" sensitive values are "well-represented" in each q*-block (a generalization block).
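As a minimal sketch (an illustration, not the paper's code), the simplest form of this requirement, distinct l-diversity, can be checked by counting the distinct sensitive values in each equivalence class:

# Distinct l-diversity check: each equivalence class (grouped by quasi-identifiers)
# must contain at least l distinct sensitive values.
from collections import defaultdict

def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    classes = defaultdict(set)
    for r in records:
        key = tuple(r[qi] for qi in quasi_identifiers)
        classes[key].add(r[sensitive])
    return all(len(values) >= l for values in classes.values())

# Example: a single class with diseases {Flu, Cancer} is 2-diverse but not 3-diverse.
records = [
    {"zip": "476**", "age": "2*", "disease": "Flu"},
    {"zip": "476**", "age": "2*", "disease": "Cancer"},
]
print(is_distinct_l_diverse(records, ["zip", "age"], "disease", 2))  # True
print(is_distinct_l_diverse(records, ["zip", "age"], "disease", 3))  # False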
Attacks on l-diversity
In this section, we study two attacks on l-diversity [2]: the skewness attack and the similarity attack.
Skewness Attack
Suppose there are two sensitive values, HIV positive (1%) and HIV negative (99%). An equivalence class that contains an equal number of positive and negative records poses a serious privacy risk, yet l-diversity does not differentiate between such classes; for example, it treats equivalence class 1 (49 positive + 1 negative) and equivalence class 2 (1 positive + 49 negative) in the same way.
Zip code   Age    Salary   Disease
476**      2*     20k      Gastric ulcer
476**      2*     30k      Gastric
476**      2*     40k      Stomach cancer
479**      >=40   100k     Gastric
476**      >=40   60k      Flu
476**      3*     70k      Bronchitis
Adversary knowledge: Bob (Zip 47678, Age 27).
Table 4: Similarity attack
As can be concluded from the table, Bob's salary is in [20k, 40k], which is relatively low, and Bob has some stomach-related disease.
Variants of L-diversity
Distinct l-diversity: each equivalence class has at least l well-represented sensitive values. This does not prevent probabilistic inference attacks. For example, in one equivalence class there are ten tuples; in the "Disease" attribute, one of them is "Cancer", one is "Lung Disease" and the remaining eight are "Kidney failure". This satisfies 3-diversity, but the attacker can still assert that the target person's disease is "Kidney failure" with 80% accuracy.
Entropy l-diversity: each equivalence class must not only have enough different sensitive values, but the different sensitive values must also be distributed evenly enough. Since the entropy of the entire table may be very low, this leads to the following less conservative notion of diversity.
Recursive (c,l)-diversity: the most frequent value does not appear too frequently.
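The entropy l-diversity condition described above can be stated as requiring the entropy of the sensitive-value distribution in each equivalence class to be at least log(l). The following small sketch of that check is an illustration, not taken from the paper:

# Entropy l-diversity: for each equivalence class, -sum(p * log p) >= log(l),
# where p runs over the frequencies of sensitive values in the class.
import math
from collections import Counter

def entropy(values):
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def is_entropy_l_diverse(equivalence_classes, l):
    """equivalence_classes: list of lists of sensitive values, one list per class."""
    return all(entropy(cls) >= math.log(l) for cls in equivalence_classes)

# Example: a class with eight 'Kidney failure', one 'Cancer' and one 'Lung Disease'
# has low entropy and fails entropy 3-diversity despite having 3 distinct values.
print(is_entropy_l_diverse([["Kidney failure"] * 8 + ["Cancer", "Lung Disease"]], 3))  # False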
Restrictions of L-diversity
- It prevents the homogeneity attack, but l-diversity is insufficient to prevent attribute disclosure.
- L-diversity is unnecessary and difficult to achieve in some cases.
- For a single sensitive attribute with two values, HIV positive (1%) and HIV negative (99%), the two values have very different degrees of sensitivity.
C. T-closeness
The t-closeness model [14] was introduced to overcome attacks that are still possible on l-diversity (such as the similarity attack). The l-diversity model uses all values of a given attribute in the same way (as distinct) even if they are semantically related, and not all values of an attribute are equally sensitive. An equivalence class is said to have t-closeness if the distance between the distribution of a sensitive attribute in this class and the distribution of the attribute in the whole table is no more than a threshold t. Concretely, the earth mover's distance between the distribution of the sensitive attribute within each equivalence class and the distribution of that attribute in the whole table must not exceed the predefined parameter t.
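As a rough sketch of this check (an illustration under a stated simplification, not the paper's procedure): for a categorical sensitive attribute with a uniform ground distance, the earth mover's distance reduces to the total variation distance between the class distribution and the table distribution, which the helper below assumes.

# Simplified t-closeness check for a categorical sensitive attribute.
# Assumes the earth mover's distance with uniform ground distance, which
# reduces to total variation distance; a full EMD would use a ground metric.
from collections import Counter

def distribution(values):
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

def distance(class_values, table_values):
    p, q = distribution(class_values), distribution(table_values)
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def satisfies_t_closeness(equivalence_classes, table_values, t):
    return all(distance(cls, table_values) <= t for cls in equivalence_classes)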
Restrictions of t-closeness
T-closeness is effective when it is combined with generalization and suppression or with slicing [5]. It can lose the correlation between different attributes, because each attribute is generalized separately and their dependencies on each other are therefore lost. There is no efficient computational procedure to enforce t-closeness, and if t is chosen very small the utility of the data is damaged.
III. PROPOSED WORK
(n,t)-CLOSENESS
The (n,t)-closeness principle: an equivalence class E1 is said to have (n,t)-closeness if there exists a set E2 of records that is a natural superset of E1, such that E2 contains at least n records and the distance between the two distributions of the sensitive attribute in E1 and E2 is no more than a threshold t. A table is said to have (n,t)-closeness if all of its equivalence classes have (n,t)-closeness. In other words, (n,t)-closeness requires the distribution of a sensitive attribute in any equivalence class to be close to the distribution of the attribute in a sufficiently large natural superset of that class.
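The following is a minimal sketch (with hypothetical helpers, not the paper's algorithm) of how the (n,t)-closeness condition could be checked when each equivalence class is supplied together with the candidate natural supersets it may be compared against; as above, a simple total-variation distance stands in for the earth mover's distance.

# Sketch of an (n,t)-closeness check. Each equivalence class is paired with the
# sensitive values of its candidate natural supersets (e.g., classes obtained by
# further generalizing the quasi-identifiers).
from collections import Counter

def tv_distance(a, b):
    pa, pb = Counter(a), Counter(b)
    na, nb = sum(pa.values()), sum(pb.values())
    keys = set(pa) | set(pb)
    return 0.5 * sum(abs(pa[k] / na - pb[k] / nb) for k in keys)

def has_n_t_closeness(eq_class, candidate_supersets, n, t):
    """eq_class: list of sensitive values; candidate_supersets: list of such lists."""
    return any(len(sup) >= n and tv_distance(eq_class, sup) <= t
               for sup in candidate_supersets)

def table_has_n_t_closeness(classes_with_supersets, n, t):
    return all(has_n_t_closeness(c, sups, n, t) for c, sups in classes_with_supersets)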
S.No   Zip Code   Age   Disease     Count
1      47696      29    Pneumonia   100
2      47647      21    Flu         100
3      47602      28    Pneumonia   200
4      47606      23    Flu         200
5      47952      49    Pneumonia   100
6      47909      48    Flu         900
7      47906      47    Pneumonia   100
8      47907      45    Flu         900
9      47603      33    Pneumonia   100
10     47601      30    Flu         100
11     47608      35    Pneumonia   100
12     47606      36    Flu         100
Table 5: Original patient data
In the above definition of the (n,t)-closeness principle, the parameter n defines the breadth of the observer's background knowledge: a smaller n means that the observer knows the sensitive information about a smaller group of records. The parameter t bounds the amount of sensitive information that the observer can obtain from the released table: a smaller t implies a stronger privacy requirement.
S.No   ZIP Code   Age   Disease     Count
1      476**      2*    Pneumonia   300
2      476**      2*    Flu         300
3      479**      4*    Pneumonia   100
4      479**      4*    Flu         900
5      476**      3*    Pneumonia   100
6      476**      3*    Flu         100
Table 6: An anonymized version of Table 5
The intuition is that the observer should only be able to learn information about a large enough population (of size at least n). One key term in the above definition is "natural superset". Assume that we want to achieve (1,000, 0.1)-closeness for the above example. The first equivalence class E1 is defined by (Zip code = "476**", 20 ≤ Age ≤ 29) and contains 600 tuples. One equivalence class that naturally contains it would be the one defined by (Zip code = "476**", 20 ≤ Age ≤ 39); another would be the one defined by (Zip code = "47***", 20 ≤ Age ≤ 29). If both of these larger equivalence classes contain at least 1,000 records, and E1's distribution is close to (i.e., at a distance of at most 0.1 from) either of them, then E1 satisfies (1,000, 0.1)-closeness. In fact, Table 6 satisfies (1,000, 0.1)-closeness. The second equivalence class satisfies (1,000, 0.1)-closeness because it contains 2,000 > 1,000 individuals and thus meets the privacy requirement (by setting the large group to be itself). The first and the third equivalence classes also satisfy (1,000, 0.1)-closeness, because both have the same distribution (0.5, 0.5) as the large group formed by the union of these two classes, and that large group contains 1,000 individuals. Choosing the parameters n and t affects the level of privacy and utility: the larger n is and the smaller t is, the more privacy and the less utility one achieves.
IV. EXPERIMENTAL SETUP
We carried out a sample experiment to check the efficiency of the new privacy measure; a sample graph is shown in Fig. 1. We compared the different techniques with the proposed model, using the number of datasets and the privacy degree as parameters: the datasets are given as sample input, and the privacy achieved is measured as the output.
Fig. 1: Comparison of different anonymization techniques (k-anonymity, l-diversity, t-closeness, (n,t)-closeness) with number of datasets and privacy efficiency.
V. CONCLUSION
This paper presents a new approach called (n,t)-closeness to privacy-preserving microdata publishing. We explained in detail the related works and the drawbacks of existing anonymization techniques. The new privacy technique overcomes the drawbacks of anonymization, generalization and suppression; it provides security and a proportional calculation of the data. We illustrated how to calculate the overall proportion of the data and how to prevent attribute disclosure and membership disclosure, and we compared the different types of anonymization. Our experiments show that (n,t)-closeness preserves better data utility than the other anonymization techniques.
VI. REFERENCES
[1] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, "ℓ-Diversity: Privacy Beyond k-Anonymity," available at http://www.cs.cornell.edu/_mvnak, 2005.
[2] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, "ℓ-Diversity: Privacy Beyond k-Anonymity," 2006.
[3] D. Sacharidis, K. Mouratidis, and D. Papadias, "K-Anonymity in the Presence of External Databases," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 3, March 2010.
[4] G. Nayak and S. Devi, "A Survey on Privacy Preserving Data Mining: Approaches and Techniques," India, 2011.
[5] N. Li, T. Li, and S. Venkatasubramanian, "t-Closeness: Privacy Beyond k-Anonymity and l-Diversity," ICDE 2007, pp. 106-115.
[6] In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD 2006), pp. 754-759.
[7] A. Inan, M. Kantarcioglu, and E. Bertino, "Using Anonymized Data for Classification," Proc. IEEE 25th Int'l Conf. Data Eng. (ICDE), pp. 429-440, 2009.
[8] T. Li and N. Li, "Towards Optimal k-Anonymization," Elsevier, CERIAS and Department of Computer Science, Purdue University, West Lafayette, IN, USA, 2007.
[9] L. Sweeney, "Database Security: k-Anonymity," retrieved 19 January 2014.
[10] L. Sweeney, "k-Anonymity: A Model for Protecting Privacy," International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5), pp. 557-570, 2002.
[11] R. Agrawal and R. Srikant, "Privacy-Preserving Data Mining," ACM SIGMOD Record, vol. 29, no. 2, pp. 439-450, 2000.
[12] L. Sweeney, "k-Anonymity: A Model for Protecting Privacy," International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5), pp. 557-570, 2002.
[13] L. Sweeney, "Achieving k-Anonymity Privacy Protection Using Generalization and Suppression," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5), pp. 571-588, 2002.
[14] N. Li, T. Li, and S. Venkatasubramanian, "t-Closeness: Privacy Beyond k-Anonymity and l-Diversity," ICDE Conference, 2007.
[15] T.M. Truta and V. Bindu, "Privacy Protection: P-Sensitive k-Anonymity Property," Proc. Workshop on Privacy Data Management, with ICDE 2006, p. 94.
[16] R.C.W. Wong, J. Li, A.W.C. Fu, and K. Wang, "(α, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy-Preserving Data Publishing," 2006.
E-WASTE MANAGEMENT – A GLOBAL SCENARIO
R. Devika
Department of Biotechnology,
Aarupadai veedu institute of technology, Paiyanoor
INTRODUCTION:
Advances in the field of science and technology in the 18th century brought about the industrial revolution, which marked a new era in human civilization. Later, in the 20th century, Information and Communication Technology brought enormous changes to the Indian economy, industries, etc., which have undoubtedly enhanced the quality of human life. At the same time, it has led to manifold problems, including an enormous amount of hazardous waste which poses a great threat to human health and the environment.
Rapid changes in technology, urbanization, changes in media, planned obsolescence, etc. have resulted in a fast-growing surplus of electronic waste (e-waste) around the globe. About 50 million tonnes of e-waste are produced every year: the USA discards 3 million tonnes each year, amounting to 30 million computers per year, Europe disposes of 100 million phones every year, and China is second with 2.3 million tonnes of e-waste. Electronic waste, also called e-waste, e-scrap or electronic disposal, refers to all discarded electrical or electronic devices such as mobile phones, television sets, computers, refrigerators, etc. [1]. Other categories are re-usable items (working and repairable electronics), secondary scrap (copper, steel, plastic, etc.), and the remainder, which is waste that is dumped or incinerated. Cathode Ray Tubes (CRTs) are considered one of the hardest types to recycle, and the United States Environmental Protection Agency (EPA) classifies CRT monitors as "Hazardous Household Waste" since they contain lead, cadmium, beryllium or brominated flame retardants as contaminants [2].
Guiyu, in the Shantou region of China, is referred to as the "E-Waste Capital of the World" [3], as it employs about 150,000 workers working 16-hour days disassembling old computers and recapturing metals and other reusable parts for resale or reuse. Their work includes snipping cables, prying chips from circuit boards, grinding plastic computer cases, and dipping circuit boards in acid baths to dissolve the lead, cadmium and other toxic metals [4]. Uncontrolled burning, disassembly and disposal cause a variety of environmental problems, such as groundwater contamination, atmospheric pollution, immediate discharge or surface runoff, and occupational health hazards (direct or indirect). Professor Huo Xia of Shantou University Medical College found that, of 165 children examined in Guiyu, 82% had lead in their blood above 100 µg, with an average of 149 µg, levels considered unsafe by international health experts [5]. Tossing equipment onto an open fire to melt plastics and burn away non-valuable metals releases carcinogens and neurotoxins into the air, contributing to an acrid, lingering smog that includes dioxins and furans [6].
ENVIRONMENTAL IMPACTS OF E-WASTE [2]
(E-waste component | Process | Environmental impact)
- Cathode Ray Tubes | Breaking and removal of yoke, then dumping | Leaching of lead, barium and other heavy metals into the water table, in turn releasing toxic phosphor.
- Printed Circuit Board | De-soldering and removal, open burning, acid bath to remove fine metals | Emission of glass dust, tin, lead, brominated dioxins, beryllium, cadmium, mercury, etc.
- Chips and other gold-plated components | Chemical stripping using nitric and hydrochloric acid | Release of hydrocarbons, tin, lead, brominated dioxins, etc.
- Plastics from printers, keyboards, monitors, etc. | Shredding and low-temperature melting | Emission of brominated dioxins, hydrocarbons, etc.
- Computer wires | Open burning and stripping to remove copper | Ashes of hydrocarbons.
Other Hazardous Components of E-Waste
(E-waste | Hazardous components | Environmental impacts)
- Smoke alarms | Americium | Carcinogenic.
- Fluorescent tubes | Mercury | Health effects include sensory impairment, dermatitis, memory loss, muscle weakness, etc.
- Lead-acid batteries | Sulphur | Liver, kidney and heart damage; eye and throat irritation; acid rain formation.
- Resistors, nickel-cadmium batteries | Cadmium (6-18%) | Hazardous waste causing severe damage to lungs, kidneys, etc.
- Cathode Ray Tubes (CRT) | Lead (1.5 pounds of lead in a 15-inch CRT) | Impaired cognitive functions, hyperactivity, behavioural disturbances, lower IQ, etc.
- Thermal grease used as heatsink material for CPUs, power transistors, magnetrons, etc.; vacuum tubes and gas lasers | Beryllium oxide | Health impairments.
- Non-stick cookware (PTFE) | Perfluorooctanoic acid (PFOA) | Risk of spontaneous abortion, preterm birth, stillbirth, etc.
INDIAN SCENARIO
India has the label of being the second largest e-waste generator in Asia. According to a MAIT-GT2 estimate, India generated 3,30,000 tonnes of e-waste, which is equivalent to 110 million laptops ["Imported e-waste seized by customs officials", The Times of India, 20 August 2010]. Guidelines have been formulated with the objective of providing broad guidance on e-waste handling methodologies and disposal.
Extended Producer Responsibility (EPR)
This is an environment protection strategy that makes the producer responsible for the entire life cycle of the product, including take-back, recycling and final disposal.
E-Waste Treatment and Disposal Methods
I. INCINERATION
Complete combustion of waste material at high temperature (900-1000°C).
Advantages:
- Reduction of e-waste volume.
- Maximum utilization of the energy content of combustible material.
- Hazardous organic substances are converted into less hazardous compounds.
Disadvantages:
- Release of a large amount of residues from gas cleaning and combustion.
- Significant emission of cadmium and mercury (heavy metal removal has to be adopted).
II. RECYCLING
Monitors, CRTs, keyboards, modems, telephone boards, mobiles, fax machines, printers, memory chips, etc. can be dismantled into their different parts, with removal of hazardous substances such as PCBs, Hg and plastics, and segregation of ferrous and non-ferrous metals. Strong acids are used to recover heavy metals such as copper, lead, gold, etc.
III. RE-USE
This constitutes direct second-hand use, or use after slight modification, of the original functioning equipment. This method considerably reduces the volume of e-waste generated.
IV. LANDFILLING
This is the most widely used method for the disposal of e-waste. Landfill trenches are dug in the earth, and the waste materials are buried and covered by a thick layer of soil. Modern techniques include an impervious liner made of plastic or clay; the leachates are collected and transferred to a wastewater treatment plant. Care should be taken in the collection of leachates, since they contain toxic metals such as mercury, cadmium and lead, which will contaminate the soil and ground water.
Disadvantages:
- Landfills are prone to uncontrolled fires, which release toxic fumes.
- Persistence of polychlorinated biphenyls (non-biodegradable).
E-Waste Management
- 50-80% of the e-waste collected in the U.S. is exported for recycling.
- Five e-waste recyclers have been identified by the Tamil Nadu Pollution Control Board:
 Thrishyiraya Recycling India Pvt. Ltd.
 INAA Enterprises.
 AER World Wide (India) Pvt. Ltd.
 TESAMM Recycler India Pvt. Ltd.
 Ultrust Solution (I) Pvt. Ltd.
- The Maharashtra Pollution Control Board has authorized the Eco Reco company, Mumbai, for e-waste management across India. TCS, the Oberoi group of hotels, Castrol, Pfizer, Aventis Pharma, Tata Ficosa, etc. recycle their e-waste with Eco Reco.
REFERENCES
1. Prashant and Nitya. 2008. Cash for laptops offers green solution for broken or outdated computers. Green Technology, National Center for Electronics Recycling News Summary. 08-28.
2. Wath S.B., Dutt P.S. and Chakrabarti T. 2011. E-waste scenario in India, its management and implications. Environmental Monitoring and Assessment. 172, 249-252.
3. Frazzoli C. 2010. Diagnostic health risk assessment of electronic waste on the population in developing countries scenarios. Environmental Impact Assessment Review. 388-399.
4. Doctorow, Cory. 2009. Illegal e-waste dumped in Ghana includes unencrypted hard drives full of US security secrets. Boing Boing.
5. Fela. 2010. Developing countries face e-waste crisis. Frontiers in Ecology and the Environment. 8(3), 117.
6. Sthiannopkao S. and Wong M.H. 2012. Handling e-waste in developed and developing countries: initiatives, practices and consequences. Science of the Total Environment.
AN ADEQUACY BASED MULTIPATH ROUTING IN 802.16
WIMAX NETWORKS
K. Saranya¹, Dr. M.A. Dorai Rangasamy²
¹Research Scholar, Bharathiar University, Coimbatore
²Senior Professor & HOD, CSE & IT, AVIT, Chennai
¹[email protected]  ²[email protected]
Abstract: The multipath routing approach for 802.16 WiMAX networks consists of a multipath routing protocol and congestion control. End-to-End Packet Scatter (EPS) alleviates long-term congestion by splitting the flow at the source and performing rate control. EPS selects the paths dynamically and uses a less aggressive congestion control mechanism on non-greedy paths to improve energy efficiency and fairness and to increase throughput in wireless networks with location information.
I. INTRODUCTION
WiMAX (Worldwide Interoperability for Microwave Access), or IEEE 802.16, is regarded as a standard for metropolitan area networks (MANs). It is one of the most reliable wireless access technologies for upcoming-generation all-IP networks. IEEE 802.16 [1] (WiMAX) is the "de facto" standard for broadband wireless communication and is considered the missing link for the "last mile" connection in Wireless Metropolitan Area Networks (WMAN). It represents a serious alternative to wired networks such as DSL and cable modem. Besides Quality of Service (QoS) support, the IEEE 802.16 standard currently offers a nominal data rate of up to 100 megabits per second (Mbps) and a coverage area of around 50 kilometers. Thus, the deployment of multimedia services such as Voice over IP (VoIP), Video on Demand (VoD) and video conferencing is now possible over WiMAX networks [2]. WiMAX is regarded as a disruptive wireless technology and has many potential applications; it is expected to support business applications, for which QoS support will be a necessity [3]. In WiMAX, nodes can communicate without having a direct connection to the base station, which improves coverage and data rates even on uneven terrain [4].
II. ROUTING IN MESH NETWORKS
Unlike the point-to-multipoint (PMP) mode, which only allows communication between the BS and SSs, in mesh mode each station is able to create direct communication links to a number of other stations in the network instead of communicating only with a BS. However, in typical network deployments there will still be certain nodes that provide the BS function of connecting the mesh network to the backbone networks. When using mesh centralized scheduling, described below, these BS nodes perform much the same basic functions as BSs do in PMP mode. Communication on all links in the network is controlled by a centralized algorithm (run either by the BS or, decentralized, by all nodes periodically), scheduled in a distributed manner within each node's extended neighborhood, or scheduled using a combination of these. Stations that have direct links are called neighbors and form a neighborhood; a node's neighbor is considered to be one hop away from the node. A two-hop extended neighborhood contains, additionally, all the neighbors of the neighborhood. Our solution reduces the variance of throughput across all flows by 35%, a reduction which is mainly achieved by increasing the throughput of long-range flows by around 70%. Furthermore, overall network throughput increases by approximately 10%.
There are two basic mechanisms for routing in the IEEE 802.16 mesh network.
A. Centralized Routing
In mesh mode, the BS (Base Station) is the station that has a direct connection to the backhaul services outside the mesh network; all other stations are termed SSs (Subscriber Stations). Within mesh networks there are no downlink or uplink concepts. Nevertheless, a mesh network can operate similarly to PMP, with the difference that not all SSs must be directly connected to the BS. The resources are granted by the mesh BS. This option is termed centralized routing.
B. Distributed Routing
In distributed routing, each node receives some information about the network from its adjacent nodes. This information is used to determine how each router forwards its traffic. When using distributed routing, there is no clearly defined BS in the network [5].
In this paper, we present a solution that seeks to utilize idle or under-loaded nodes to reduce the effects of congestion on throughput.
III. PROBLEM MODELING
In this section we first discuss EPS, end-to-end multipath packet scatter. If packet scattering cannot successfully support the aggregate traffic (i.e., avoid congestion), it will only scatter packets over a wider area, potentially amplifying the effects of congestion collapse due to its longer paths (a larger number of contending nodes leads to a larger probability of loss). In such cases a closed-loop mechanism is required to regulate the source rates. EPS is applied at the endpoints of the flows and regulates the number of paths the flow is scattered on and the rate corresponding to each path. The source requires constant feedback from the destination regarding network conditions, making this mechanism more expensive than its local counterpart. The idea behind EPS is to dynamically search for and use free resources available in the network in order to avoid congestion. When the greedy path becomes congested, EPS starts sending packets on two additional side paths obtained with BGR, searching for free resources.
To avoid disrupting other flows, the side paths perform more aggressive multiplicative rate decrease when congested. EPS dynamically adjusts to changing conditions and selects the best paths on which to send the packets without causing oscillations. We achieve this by doing independent congestion control on each path. If the total available throughput on the three paths is larger than the sender's packet rate, the shortest path is preferred (this means that edge paths will send at a rate smaller than their capacity). On the other hand, if the shortest path and one of the side paths are congested but the other side path has unused capacity, our algorithm will naturally send almost all the traffic on the latter path to increase throughput.
IV. SYSTEM MODELING
A. Congestion Signaling
Choosing an appropriate closed-loop feedback mechanism impacts the performance of EPS. Unlike WTCP [6], which monitors packet inter-arrival times, or CODA [7], which performs local congestion measurements at the destination, we use a more accurate yet lightweight mechanism, similar to Explicit Congestion Notification [8]. Nodes set a congestion bit in each packet they forward when congestion is detected. In our implementation, the receiver sends state messages to the sender to indicate the state of the flow. State messages are triggered by the receipt of a predefined number of messages, as in CODA. The number of packets acknowledged by one feedback message is a parameter of the algorithm, which creates a tradeoff between high overhead with accurate congestion signaling (e.g., each packet is acknowledged) and less expensive but also less accurate signaling. The destination maintains two counters for each path of each incoming flow: packets counts the number of packets received on the path, while congested counts the number of packets that have been lost, or received with the congested bit set to 1. When packets reaches a threshold value (given by a parameter called messages_per_ack), the destination creates a feedback message and sends it to the source. The feedback is negative if at least half of the packets received by the destination have the congestion bit set, and positive otherwise, as suggested in the ECN paper [8]. This effectively implements a low-pass filter that avoids signaling transient congestion, with the positive effect that congestion will not be signaled if it clears quickly.
B. RTT Estimation
When the sender starts the flow, it starts a timer equal to messages_per_ack / packet_rate + 2 · hop_count · hop_time. We estimate the hop count using the expected inter-node distance; hop_time is chosen as an upper bound for the time taken by a packet to travel one hop. Timer expiration is treated as negative feedback. A more accurate timer might be implemented by embedding timestamps in the packets (as in WTCP or TCP), but we avoid that for energy efficiency reasons. However, most of the time the ECN mechanism triggers the end-to-end mechanism, limiting the use of timeouts to the cases when acknowledgements are lost.
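For illustration, the feedback timeout described above can be computed as in the small sketch below; the parameter values shown are hypothetical, not taken from the paper.

# Feedback timeout as described above: messages_per_ack / packet_rate
# plus twice the estimated one-way path delay (hop_count * hop_time).
def feedback_timeout(messages_per_ack, packet_rate, hop_count, hop_time):
    return messages_per_ack / packet_rate + 2 * hop_count * hop_time

# Hypothetical example: 10 packets per ack at 80 packets/s over an
# estimated 6 hops, with 5 ms allowed per hop.
print(feedback_timeout(10, 80.0, 6, 0.005))  # 0.185 seconds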
C. Rate Control
When congestion persists even after the flow has been split at the source, we use congestion control (AIMD) on each individual path to alleviate congestion. When negative feedback is received, multiplicative decrease is performed on the corresponding path's rate. We use a differentiated multiplicative decrease that is more aggressive on exterior paths than on the greedy path, to increase energy efficiency; effectively, this prioritizes greedy traffic when competing with split traffic. Additive increase is uniform for all paths; when the aggregate rate of the paths exceeds the maximum rate, we favor the greedy path to increase energy efficiency. More specifically, if the additive increase is on the shortest (central) path, exterior paths are penalized proportionally to their sending rates; otherwise, the rate of a side path is increased only up to the overall desired rate.
D. Discussion
EPS is suited for long-lived flows and adapts to a wide range of traffic characteristics, relieving persistent or widespread congestion when it appears. The paths created by this technique are more symmetric and thus further away from each other, resulting in less interference. The mechanism requires that each end-node maintain state information for its incoming and outgoing flows of packets, including the number of paths, as well as the spread angle and send rate for each path. The price of source splitting is the periodic signaling messages; if reliable message transfer is required, this cost is amortized, as congestion information can be piggybacked in the acknowledgement messages.
Pseudocode for a simplified version of EPS

// For simplicity, we assume a single destination and three paths.
MaxPaths = 3;
bias = {0, 45, -45};               // path biases in degrees
reduce_rate = {0.85, 0.7, 0.7};

// Sender-side pseudocode
receiveFeedback(int path, bool flowCongested) {
    if (!EPS_Split) {              // flow not already split
        if (flowCongested) splitSinglePath();
        else sendingRates[0] += increase_rate;           // additive increase
    } else {                       // flow already split into multiple paths
        if (flowCongested) sendingRates[path] *= reduce_rate[path];
        else {                     // no congestion: increase the path's sending rate
            if (path == 0) {       // main path
                sendingRates[0] += increase_rate;         // additive increase
                totalAvailableRate = sum(sendingRates);
                if (totalAvailableRate > 1) {             // we can transmit more than we want
                    diff = totalAvailableRate - 1;
                    for (int i = 1; i < MaxPaths; i++)
                        sendingRates[i] -= diff * sendingRates[i] /
                                           (totalAvailableRate - sendingRates[0]);
                }
            }
            else sendingRates[path] += min(increase_rate, 1 - sum(sendingRates));
        }
    }
}

splitSinglePath() {
    for (int i = 0; i < MaxPaths; i++) sendingRates[i] = 1 / MaxPaths;
    EPS_Split = true;
}

sendPacketTimerFired() {
    path_choice = LotteryScheduling(sendingRates);
    Packet p = Buffer.getNext();           // orthogonal buffer policy
    p.split = EPS_Split;                   // whether we split or not
    p.bias = bias[path_choice];
    next = chooseBGRNextHop(p);            // ... other variables
    sendLinkLayerPacket(next, p);
}

// Receiver-side pseudocode
receivePacket(Packet p) {
    receivedPackets[p.source][p.path]++;
    if (p.congested) congestedPackets[p.source][p.path]++;
    if (receivedPackets[p.source][p.path] > messagesPerAck) {
        boolean isCongested = (congestedPackets[p.source][p.path] >
                               receivedPackets[p.source][p.path] / 2);
        sendFeedback(p.source, isCongested);
        // ... reinitialize state variables
    }
}
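The LotteryScheduling(sendingRates) call above picks a path at random with probability proportional to its sending rate; the listing does not spell this out, so the following Python sketch is only an illustrative assumption of that weighted choice.

# Illustrative weighted path choice for LotteryScheduling(sendingRates):
# each path is selected with probability proportional to its sending rate.
import random

def lottery_scheduling(sending_rates):
    total = sum(sending_rates)
    ticket = random.uniform(0.0, total)
    running = 0.0
    for path, rate in enumerate(sending_rates):
        running += rate
        if ticket <= running:
            return path
    return len(sending_rates) - 1  # numerical safety fallback

# Example: with rates [0.6, 0.2, 0.2], path 0 is chosen about 60% of the time.
print(lottery_scheduling([0.6, 0.2, 0.2]))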
When congestion is widespread and long-lived, splitting might make things worse, since paths are longer and the entire network is already congested. However, as we show in the evaluation, this only happens when the individual flow throughput becomes dramatically small (10% of the normal value), and in that regime the cost of path splitting, in terms of loss in throughput, is insignificant. Also, if paths interfere severely, splitting traffic might make things worse due to media access collisions, as more nodes are transmitting. This is not to say that we can only use completely non-interfering paths; in fact, as we show later, our approach exploits the tradeoff between contention (when nodes hear each other and contend for the medium) and interference (when nodes do not hear each other but their packets collide): throughput is more affected by high contention than by interference.
V. IMPLEMENTATION
In this section we present simulation results obtained through ns-2 simulations [9]. We use two main metrics for our measurements: throughput increase and fairness among flows. We ran tests on a network of 400 nodes, distributed uniformly on a grid in a square area of 6000 m x 6000 m. We assume events occur uniformly at random in the geographical area; the node closest to an event triggers a communication burst to a uniformly selected destination. To emulate this model, we select a set of random source-destination pairs and run 20-second synchronous communications among all pairs. The data we present is averaged over hundreds of such iterations. The parameters are summarized in Table 1. An important parameter of our solution is the number of paths a flow should be split into and their corresponding biases. Simulation measurements show that the number of non-interfering paths between a source and a destination is usually quite small (more paths would only make sense in very large networks). Therefore we choose to split a flow exactly once, into 3 sub-flows, if congestion is detected. We prefer this to splitting into two flows for energy efficiency considerations (the cheaper, greedy path is also used). We have experimentally chosen the biases to be +/-45 degrees for EPS.
TABLE 1. SUMMARY OF PARAMETERS
Parameter                       Value
Number of Nodes                 400
Area size                       6000 m x 6000 m
MAC                             802.11
Radio Range                     250 m
Contention Range                550 m
Average Node Degree             8
Link Layer Transmission Rate    2 Mbps
RTS/CTS                         No
Retransmission Count (ARQ)      4
Interface queue                 4
Packet size                     100 B
Packet frequency                80/s
VI. RESULTS
Figure 1: Throughput vs Transmissions
Figure 4: Received vs Transmissions
As expected, our solution works well for flows where the distance between the source and the destination is large enough to allow the use of non-interfering multiple paths. EPS increases long-range flow throughput by around 70% compared to single-path transmission (both with and without AIMD). For short-range flows, where multiple paths cannot be used, the throughput obtained by our solution is smaller by at most 14%, as the short-range flows interfere with the split flows of long-range communications. However, by increasing the throughput of long-range flows we improve fairness among the different flows, achieving 35% lower throughput variance across flows of different lengths compared to a single path with AIMD. Moreover, the overall throughput is increased by around 10% for a moderate level of load (e.g., 3-6 concurrent transmissions). Finally, we show that our algorithm EPS does not increase the number of losses compared to AIMD.
A. Throughput and Transmissions
Fig. 1 presents how the number of transmissions in the network affects the average flow throughput. Throughput decreases drastically as the network becomes congested, regardless of the mechanism used. For a moderate number of transmissions (3-5), EPS increases the overall throughput by around 10%. Without rate control, however, many of the sent packets are lost, leading to inefficiency.
B. Impact of Packet Rate
Fig. 2a shows that EPS has a packet loss rate similar to AIMD. Fig. 2b displays the overall throughput for different transmission rates. As can be seen, the throughput flattens as congestion builds in the network, but the (small) overall increase remains approximately steady.
C. Received Packets and Transmissions
Fig. 3 shows that this also holds when the transmission rate varies. This is important on two counts: first, for energy efficiency reasons, and second, to implement reliable transmission.
VII. CONCLUSION
In this paper, we have presented a solution that increases fairness and throughput in dense wireless networks. Our solution achieves its goals by using multipath geographic routing to find available resources in the network. EPS (end-to-end packet scatter) splits a flow into multiple paths when it is experiencing congestion. When activated, EPS performs rate control to minimize losses while maintaining high throughput, and it uses a less aggressive congestion response for the non-greedy paths to gracefully capture resources available in the network.
REFERENCES
[1] Murali Prasad and Dr. P. Satish Kumar, "An Adaptive Power Efficient Packet Scheduling Algorithm for WiMAX Networks," (IJCSIS) International Journal of Computer Science and Information Security, vol. 8, no. 1, April 2010.
[2] Adlen Ksentini, "IPv6 over IEEE 802.16 (WiMAX) Networks: Facts and Challenges," Journal of Communications, vol. 3, no. 3, July 2008.
[3] Jianhua He, Xiaoming Fu, Jie Xiang, Yan Zhang, and Zuoyin Tang, "Routing and Scheduling for WiMAX Mesh Networks," in WiMAX Network Planning and Optimization, edited by Y. Zhang, CRC Press, USA, 2009.
[4] Vinod Sharma, A. Anil Kumar, S.R. Sandeep, and M. Siddhartha Sankaran, "Providing QoS to Real and Data Applications in WiMAX Mesh Networks," in Proc. WCNC, 2008.
[5] Yaaqob A.A. Qassem, A. Al-Hemyari, Chee Kyun Ng, N.K. Noordin, and M.F.A. Rasid, "Review of Network Routing in IEEE 802.16 WiMAX Mesh Networks," Australian Journal of Basic and Applied Sciences, 3(4): 3980-3996, 2009.
[6] Sinha P., Nandagopal T., Venkitaraman N., Sivakumar R., and Bhargavan V., "A Reliable Transport Protocol for Wireless Wide-Area Networks," in Proc. of MobiHoc, 2003.
[7] Wan C.Y., Eisenman S.B., and Campbell A.T., "CODA: Congestion Detection and Avoidance in Sensor Networks," in Proc. of SenSys, 2003.
[8] Ramakrishnan K.K. and Jain R., "A Binary Feedback Scheme for Congestion Avoidance in Computer Networks," Transactions on Computer Systems, vol. 8, 1990.
[9] The NS-2 simulator, http://www.isi.edu/nsnam/ns/.
CALCULATION OF ASYMMETRY PARAMETERS FOR LATTICE
BASED FACIAL MODELS
M. Ramasubramanian¹, Dr. M.A. Dorai Rangaswamy²
¹Research Scholar & Associate Professor, ²Research Supervisor & Sr. Professor
¹,²Department of Computer Science and Engineering, Aarupadai Veedu Institute of Technology, Vinayaka Missions University, Rajiv Gandhi Salai (OMR), Paiyanoor-603104, Kancheepuram District, Tamil Nadu, India
¹[email protected]  ²[email protected]
Abstract: The construction of human-like avatars is key to producing realistic animation in virtual reality environments and has become commonplace in present-day applications. However, most of the models proposed to date intuitively assume that the human face is a symmetric entity. Such assumptions produce unfavorable drawbacks in applications where the analysis and estimation of facial deformation patterns play a major role. Thus, in this work we propose an approach to define asymmetry parameters of facial expressions and a method to evaluate them. The proposed method is based on capturing facial expressions in three dimensions using a rangefinder system. Three-dimensional range data acquired by the system are analyzed by adapting a generic LATTICE with facial topology. The asymmetry parameters are defined based on the elements of the generic LATTICE and evaluated for facial expressions of normal subjects and of patients with facial nerve paralysis disorders. The proposed system can be used to store the asymmetric details of expressions and is well suited to remote doctor-patient environments.
Keywords: generic 3D models, morphing, animation, texture, etc.
I. INTRODUCTION
The construction of facial models that interpret human-like behaviors dates back to the 1970s, when Parke [1] introduced the first known "realistic" CG animation model that moves facial parts to mimic human-like expressions. Since then, a noticeable interest in producing virtually realistic facial models with different levels of sophistication has been seen in the animation industry, telecommunication, identification and medical related areas. However, most of these models have inherently assumed that the human face is a symmetric entity. The relevance and importance of defining the asymmetric properties of facial models can be illustrated in many application areas; in this study, its relevance in the field of otorhinolaryngology in medicine is illustrated. A major requirement in such an application is to construct robust facial parameters that determine the asymmetric deformation patterns in the expressions of patients with facial nerve paralysis disorders. These parameters can be used to estimate the level of deformation in different facial parts, as well as to transmit and receive them at the ends of remote doctor-patient environments.
Acknowledgement: the authors would like to thank Dr. Toshiyuki Amam (Nagoya Institute of Technology) and Dr. Seiichi Nakata (School of Medicine, Nagoya University) for their support rendered towards the success of this work.
Many attempts have been made in the past by researchers to develop systems to analyze and represent the levels of facial motion dysfunction in expressions. The pioneering work of Neely et al. [2] reported a method to analyze movement dysfunction on the paretic side of a face by capturing expressions with 2D video frames; selected frames of expressions are subtracted from similar frames captured at the rest condition using image subtraction techniques. Similarly, most other attempts proposed to date are based on 2D intensity images and inherently possess the drawbacks associated with inconstant lighting in the environment, changes in skin color, etc. To eliminate these drawbacks, the use of 3D models has become commonplace. Although many techniques are available today for the construction of 3D models, a laser-scanning method that accurately produces high-density range data is used here to acquire 3D facial data of expressions. The construction of 3D models from scanned data can be done by approximating the measured surface with continuous or discrete techniques. Continuous forms, such as spline curve approximations, can be found in some previous animation works [3], [4]. A great disadvantage of these approaches is the inevitable loss of subtle information about the facial surface during the approximation; making them more realistic while preserving the subtle information requires intensive computation, in the form of introducing more control points, which makes them difficult to implement in analysis stages. On the contrary, LATTICE-based methods are less complicated to implement and are widely used in modeling tasks. Thus the approach proposed here adheres to LATTICE-based 3D facial models in deriving asymmetry parameters.
2. CONSTRUCTION OF 3D MODEL
Predesigned facial actions are measured by a rangefinder system [5], which produces 8-bit 512 x 242 resolution frontal range images and color texture images. A symmetric generic face LATTICE with triangular patches is adapted to each of these range images to produce 3D models to be used in asymmetry estimations.
The LATTICE adaptation is a tedious and time-consuming process if it involves segmentation of the range images to extract feature points for mapping. Instead, here we resort to a much simpler approach by extracting feature points from the texture images, since both range and texture images captured by this system have one-to-one correspondence. Forty-two evenly distributed feature points are selected as mapping points, of which the corresponding LATTICE locations are predetermined.
We then calculate the displacements between these feature points and the corresponding LATTICE locations. The least squares approximation method is used to fit the face LATTICE to the feature points, by minimizing the displacements between mapping points. We use a polynomial of degree N, shown in Eq. (1), as the mapping function f(x, y) of the least squares estimation.
Thus, 3D model construction for each facial action consists of the following steps.
- Extract 42 feature points from the color image, whose corresponding mapping nodes on the generic face LATTICE are known.
- Calculate the displacement vectors between the mapping points and the feature points.
- Apply the polynomial function in Eq. (1) with order N = 2 for the initial mapping and calculate the coefficients a00, a01, ..., a02 by minimizing the error term of the least squares estimator for the best fit.
- Once the second-order mapping function [Eq. (1)] has been evaluated with the coefficients a00, ..., a02, map all other points accordingly.
- Calculate the error term of the least squares estimator for all these points and compare it with a pre-defined threshold value.
- If the fitting error exceeds the threshold value, increase the order of the polynomial (N) and repeat the fitting process by evaluating new coefficients.
Thus, once the fitting error lies within a satisfactory margin of the threshold value, depth values from the corresponding range images are mapped to the vertices of the face LATTICE to produce a 3D model of the measured expression [6]. Constructed 3D models of a patient with Bell's palsy are depicted in Fig. 1 with eye-closure and grin facial expressions.
Figure 1: 3D models of eye closure and grin expressions of a patient.
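The fitting loop above can be illustrated with a short numerical sketch. This is not the authors' implementation; it assumes hypothetical arrays of 42 corresponding LATTICE locations and image feature points, and the helper names design_matrix and fit_mapping are illustrative only.

```python
import numpy as np

def design_matrix(points, order):
    """Monomial terms x^i * y^j with i + j <= order, for a 2D polynomial map."""
    x, y = points[:, 0], points[:, 1]
    cols = [x**i * y**j for i in range(order + 1) for j in range(order + 1 - i)]
    return np.column_stack(cols)

def fit_mapping(lattice_pts, feature_pts, order=2, threshold=1.0, max_order=5):
    """Least-squares polynomial map from LATTICE locations to image feature points.

    lattice_pts, feature_pts: (42, 2) arrays of corresponding 2D points (hypothetical data).
    The order N is raised until the fitting error falls below the threshold.
    """
    while True:
        A = design_matrix(lattice_pts, order)
        # One coefficient vector per output coordinate.
        coeffs, *_ = np.linalg.lstsq(A, feature_pts, rcond=None)
        residual = feature_pts - A @ coeffs
        err = np.sqrt(np.mean(np.sum(residual**2, axis=1)))
        if err <= threshold or order >= max_order:
            return coeffs, order, err
        order += 1  # fitting error too large: increase the polynomial order and refit

# Synthetic correspondences standing in for the 42 mapping points.
rng = np.random.default_rng(0)
lattice = rng.uniform(0, 511, size=(42, 2))
features = lattice + 0.05 * lattice[:, :1] + rng.normal(0, 0.5, size=(42, 2))
coeffs, used_order, err = fit_mapping(lattice, features)
print(used_order, round(err, 3))
```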
3. ESTIMATION OF DEFORMATION ASYMMETRY
The facial deformations during expressions are calculated based on the 3D models generated for each expression as described in the previous section. Since facial expressions modify the facial surface relative to its rest state, the 3D models we generate also reflect these deformations in their constituent triangular patches. To estimate these deformations, we implement the 3D LATTICE model as a LATTICE of connected linear springs. Suppose a particular patch on the left side consists of three springs, with the lengths gained at the rest condition, measured from equilibrium, as shown in Fig. 2.
Figure 2: Patch deformation during an expression.
Thus, the energy stored in the patch at the rest condition is the sum of the energies of its three springs, where k is the spring constant identical for all springs. Suppose that during an expression this patch deforms to a new state, with each edge modified to a new length.
Thus, the change of energy from the rest condition can be stated as in Eq. (3). Similarly, the energy change of its mirror patch on the right side can be stated as in Eq. (4). Thus, if we let wL and wR denote the deformations of the left and right sides respectively, from Eq. (3) and Eq. (4) we can deduce ΔEL = ½ k wL² and ΔER = ½ k wR². Ignoring the constant parts, wL and wR can be considered as candidate parameters to describe the deformation of the triangular patches.
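A minimal sketch of this per-patch deformation measure, under the assumptions made in the reconstruction above: each patch is given by its three vertex coordinates, the spring constant k is taken as 1, and w is read as the root-sum-square of the edge-length changes so that ½·k·w² approximates the change in stored energy. The function names and sample patches are illustrative.

```python
import numpy as np

def edge_lengths(patch):
    """Side lengths of a triangular patch given as a (3, 3) array of vertex coordinates."""
    a, b, c = patch
    return np.array([np.linalg.norm(a - b), np.linalg.norm(b - c), np.linalg.norm(c - a)])

def deformation_parameter(patch_rest, patch_expr):
    """w such that 0.5 * k * w**2 approximates the patch's change in spring energy
    (one reading of the reconstructed Eq. (3)): root-sum-square of edge-length changes."""
    dl = edge_lengths(patch_expr) - edge_lengths(patch_rest)
    return float(np.sqrt(np.sum(dl ** 2)))

# Hypothetical left-side patch and its right-side mirror, at rest and during an expression.
left_rest = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
left_expr = left_rest + np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.1], [0.0, 0.1, 0.0]])
right_rest = left_rest * np.array([-1.0, 1.0, 1.0])   # mirrored about the YZ plane
right_expr = right_rest                               # the paretic side barely moves

w_L = deformation_parameter(left_rest, left_expr)
w_R = deformation_parameter(right_rest, right_expr)
print(round(w_L, 4), round(w_R, 4))   # unequal values indicate asymmetric deformation
```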
4. ESTIMATION OF ORIENTATION ASYMMETRY
Apart from the measure of asymmetry in deformations, which is local to the patches, another factor that contributes to the asymmetry is the global orientation of the patches on both sides, even when they have identical deformations. Suppose a particular patch on the left side has the orientation PL in the rest condition and changes to the orientation PL' during an expression. The change in orientation during the expression can be estimated by considering the following transformations.
- Let the centers of gravity of patches PL and PL' be GL and GL' respectively.
- Let NL and NL' denote the surface normal vectors of patches PL and PL'.
- Translate GL and GL' to the origin, so that they coincide with each other.
- Align the surface normal vectors NL and NL' along the Z-axis so as to make the patches co-planar with the XY-plane.
- Calculate the direction vectors r1 and r2 from the center of gravity of each patch to a similar vertex.
- Rotate the patches in the XY-plane so that r1 and r2 coincide with the X-axis.
This transformation scenario is depicted in Fig. 3. Thus, the resulting transformation can be expressed in terms of these translation and rotation steps. We can now define the transformation parameter uL for the left side patches from this transformation; similarly, the transformation parameter uR for the right side can be derived.
Figure 3: Transformation of a left side patch between rest and a facial action.
Figure 5: Correlation between ηL and ηR of a normal subject in the grin action.
Therefore, the composite orientation parameter u can be stated accordingly. For identical orientations of the left and right side patches, u = 0. For patches with little or no deformation during expressions compared to the rest condition, the corresponding transformation components at rest and during the expression are approximately equal; therefore uL ≈ uR ≈ 0. Thus the orientation asymmetry can be estimated for all the patches on the left and right sides. Let η be the composite asymmetry parameter, where η = w + u. Evaluating η for the left and right side patches of different expressions gives a measure of the asymmetric deformation in different expressions.
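The transformation steps of Section 4 can be sketched as follows. The exact definition of u is not recoverable from the text, so this illustrative sketch only builds the alignment transform (centroid to the origin, normal onto +Z, reference direction onto +X); the residual rotation and translation needed for each patch are the quantities from which u would be derived. Function names and sample patches are assumptions.

```python
import numpy as np

def rotation_between(a, b):
    """Rotation matrix taking unit vector a onto unit vector b (Rodrigues' formula).
    The degenerate antiparallel case is not handled in this sketch."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.allclose(v, 0.0):
        return np.eye(3)
    vx = np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx / (1.0 + c)

def align_patch(patch):
    """Translate a triangular patch's centroid to the origin, rotate its normal onto +Z,
    then rotate in the XY plane so the direction to its first vertex lies on the +X axis."""
    g = patch.mean(axis=0)
    p = patch - g                                               # centroid at the origin
    n = np.cross(p[1] - p[0], p[2] - p[0])
    p = p @ rotation_between(n, np.array([0.0, 0.0, 1.0])).T    # normal onto +Z
    theta = np.arctan2(p[0][1], p[0][0])
    c, s = np.cos(-theta), np.sin(-theta)
    rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return p @ rz.T                                             # reference direction on +X

# A hypothetical patch at rest and during an expression; after alignment, the difference
# between the two carries the orientation change that the parameter u is meant to capture.
rest = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
expr = rest @ np.array([[0.96, -0.28, 0.0], [0.28, 0.96, 0.0], [0.0, 0.0, 1.0]]).T + 0.1
print(np.round(align_patch(rest), 3))
print(np.round(align_patch(expr), 3))
```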
5. RESULTS
In this work we measured patients as well as normal subjects to assess the reliability of the estimation. Five facial expressions, namely eye closure, lines on the forehead, sniff, grin and lip purse, are measured. In each case, frontal range and texture images are obtained by the rangefinder system. Then we construct 3D models of these expressions as described in Section 2. Once the 3D LATTICE models are generated, surface deformations are estimated for each facial action as described in Section 3. To calculate the composite deformation asymmetries, the orientation of the patches in 3D space is evaluated as described in Section 4. Here we present the results of the eye-closure and grin actions of two subjects: one is a normal subject with no apparent expression asymmetries and the other is a patient with Bell's paralysis. Surface deformation and 3D orientation estimations are done for the left and right sides separately. The composite asymmetry η is calculated for each patch on the left and right sides. For the left side patches ηL = wL + uL and for the right side patches ηR = wR + uR is evaluated. For ideally symmetric deformations, the correlation between ηL and ηR should conform to a straight line of the y = mx type. Fig. 4 and Fig. 5 depict the respective correlations of the eye-closure and grin actions of the normal subject. Similarly, Fig. 6 and Fig. 7 depict the eye-closure and grin actions of a patient with facial paralysis. Table 1 and Table 2 summarize the mean and standard deviation of ηL and ηR of the normal subject as well as the patient in the eye closure and grin actions respectively.
Figure 7: Correlation between ηL and ηR of a patient in the grin action.
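As a small illustration of the y = mx check described above, the following sketch fits the slope of ηR against ηL through the origin and reports the correlation; the η values are made up for illustration and are not the measured results.

```python
import numpy as np

# Hypothetical per-patch composite asymmetry values for the left and right sides.
eta_L = np.array([0.12, 0.40, 0.75, 1.10, 1.60])
eta_R = np.array([0.10, 0.33, 0.60, 0.85, 1.20])

# Least-squares slope of the line y = m x through the origin; m close to 1 suggests
# symmetric deformation, while m well below 1 suggests reduced movement on one side.
m = float(np.dot(eta_L, eta_R) / np.dot(eta_L, eta_L))
r = float(np.corrcoef(eta_L, eta_R)[0, 1])
print(round(m, 3), round(r, 3))
```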
6. SUMMARY
In this work we have presented an approach to estimate the asymmetric deformations in facial expressions in 3D. By analyzing the correlations of asymmetry between the left and right sides of the normal subject and the patient, we can confirm that the patient has paralysis on the right side in both facial actions. The patient's distributions in both expressions lean towards the X-axis (left side), since that side produces most of the movements during expressions. Therefore, with the two proposed parameters w and u we have shown that it is possible to encode the asymmetric properties of facial expressions. Although the proposed method is illustrated on a triangular patch based model, it does not impose constraints on the underlying LATTICE structure. Thus it can be readily applied to different LATTICE topologies.
REFERENCES
[1] Xiaogang Wang and Xiaoou Tang, "Unified Subspace Analysis for Face Recognition", Ninth IEEE International Conference on Computer Vision (ICCV'03), 0-7695-1950-4/03.
[2] P. N. Belhumeur, J. Hespanha, and D. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection", IEEE Trans. on PAMI, Vol. 19, No. 7, pp. 711-720, July 1997.
[3] K. Fukunaga, "Introduction to Statistical Pattern Recognition", Academic Press, second edition, 1991.
[4] B. Moghaddam, T. Jebara, and A. Pentland, "Bayesian Face Recognition", Pattern Recognition, Vol. 33, pp. 1771-1782, 2000.
[5] P. J. Phillips, H. Moon, and S. A. Rizvi, "The FERET Evaluation Methodology for Face Recognition Algorithms", IEEE Trans. PAMI, Vol. 22, No. 10, pp. 1090-1104, Oct. 2000.
[6] M. Turk and A. Pentland, "Eigenfaces for Recognition", J. of Cognitive Neuroscience, Vol. 3, No. 1, pp. 71-86, 1991.
[7] W. Zhao, R. Chellappa, and P. Phillips, "Face Recognition: A Literature Survey", Technical Report, 2002.
[8] W. Zhao, R. Chellappa, and P. Phillips, "Subspace Linear Discriminant Analysis for Face Recognition", Technical Report CAR-TR-914, 1996.
[9] M. Ramasubramanian and Dr. M. A. Dorai Rangaswamy, "Reconstruction of Practical 3D Face Matching Part from 2D Images - A Cross Breed Approach", National Conference on Recent Approaches in Communication and Information Technology (NCRACIT 2012), Department of Information Technology, Madha Engineering College, Chennai, 20 March 2012.
[10] M. Ramasubramanian and Dr. M. A. Dorai Rangaswamy, "Reconstruction of Sensible 3D Face Counterpart from 2D Images - A Hybrid Approach", International Research Journal of Computer Systems Engineering, ISSN: 2250-3005, pp. 139-144, July 2012.
[11] M. Ramasubramanian and Dr. M. A. Dorai Rangaswamy, "Efficient 3D Object Extraction from 2D Images", National Conference on Emerging Trends in Computer Applications & Management (NCETCAM 2013), Department of Computer Application and Management, Aarupadai Veedu Institute of Technology, Chennai, 17 April 2013.
[12] M. Ramasubramanian and Dr. M. A. Dorai Rangaswamy, "3D Object Extraction from 2D Object", National Conference on Architecture, Software Systems and Green Computing (NCASG-2013), Department of Computer Science and Engineering, Aarupadai Veedu Institute of Technology, Chennai, 3 April 2013.
[13] M. Ramasubramanian and Dr. M. A. Dorai Rangaswamy, "3D Object Conversion via 2D Images - A Survey Report", National Conference on Architecture, Software Systems and Green Computing (NCASG-2013), Department of Computer Science and Engineering, Aarupadai Veedu Institute of Technology, Chennai, 3 April 2013.
[14] M. Ramasubramanian and Dr. M. A. Dorai Rangaswamy, "3D Object Extraction from 2D Object", International Journal on Emerging Trends & Technology in Computer Science.
[15] M. Ramasubramanian, P. Shankar and Dr. M. A. Dorai Rangaswamy, "3D Object Conversion via 2D Images - A Survey Report", International Journal on Emerging Trends & Technology in Computer Science.
[16] M. Ramasubramanian and Dr. M. A. Dorai Rangaswamy, "Efficient 3D Object Extraction from 2D Images", International Conference on Intelligent Interactive Systems and Assistive Technologies (IISAT-2013), Coimbatore, 2013.
[1] Mr. M. Ramasubramanian worked in the School of Computer Science and Engineering, CEG, Guindy, for more than 4 years. He received his M.E. degree in the field of Computer Science and Engineering from Vinayaka Missions University in the year 2009. He is presently working as a Senior Assistant Professor in Aarupadai Veedu Institute of Technology, Vinayaka Missions University, India. He is a life member of the International Society for Technology in Education (ISTE) and the Indian Science Congress (ISCA). He has nearly 20 publications in reputed refereed journals and conferences. He is doing his research in the area of Image Processing in the same University, under the guidance of Dr. M.A. Dorai Rangaswamy.
[2] Dr. M.A. Dorai Rangaswamy is currently Head and Senior Professor in the Department of Computer Science and Engineering in Aarupadai Veedu Institute of Technology, Vinayaka Missions University. His specializations include mining, image processing, computer architecture and micro-controllers. He has nearly 40 publications in reputed refereed journals and conferences. He is an active IEEE member.
MULTI-SCALE AND HIERARCHICAL DESCRIPTION USING ENERGY CONTROLLED ACTIVE BALLOON MODEL
T. Gandhimathi¹, Assistant Professor
M. Ramasubramanian², Associate Professor
M.A. Dorai Rangaswamy³, Sr. Professor & Head
Department of Computer Science and Engineering, Aarupadai Veedu Institute of Technology,
Vinayaka Missions University, Rajiv Gandhi Salai (OMR), Paiyanoor 603 104,
Kancheepuram District, Tamil Nadu, India
¹ [email protected]
² [email protected]
³ [email protected]
Abstract
A novel multi-scale tree construction algorithm for three-dimensional shape using the "Energy Controlled Active Balloon Model (ECABM)" is proposed. The key idea is that (1) the internal and external energies and the number of surfaces which make up the ABM are controlled at each shrinking step, and (2) each converged shape is regarded as one multi-scale shape. Some experimental three-dimensional recognition results of the proposed method with human face data are shown.
1. INTRODUCTION
When humans recognize the shape of an object, rough observation of the whole object shape and detailed observation of partial shapes are used at the same time. If these procedures are applied to a recognition algorithm for computer vision, flexible and robust matching can be achieved. Many recognition algorithms for one-dimensional signals and shape contours using multi-scale data have been proposed [1]. Shapes of 3D objects are transformed into a multi-scale representation in order to observe each discrete portion of the shape contours at different view scales. The reason for using a multi-scale representation is that the inflection points, which are important characteristic information for image recognition, should increase monotonically as the resolution increases. While making 3D multi-scale resolution images, convolution with a Gaussian function is commonly used. However, features such as inflection points do not increase monotonically with increasing resolution. Consequently, a multi-scale tree structure which has suitable features for matching is impossible to construct. In recent years, segmentation techniques which combine a local edge extraction operation with the use of active contour models, or snakes, to perform a global region extraction have achieved considerable success for certain applications [2]. These models simulate elastic material which can dynamically conform to object shapes in response to internal forces, external forces, and user specified constraints. The result is an elegant method of linking sparse or noisy local edge information into a coherent object description. Cohen [3] proposed a three-dimensional extension of snakes, and Tsuchiya [4] also proposed a three-dimensional extension model, the "Active Balloon Model (ABM)". In this paper, we propose a novel multi-scale tree construction algorithm for three-dimensional shape using the "Energy Controlled Active Balloon Model (ECABM)". The key idea is that (1) the internal and external energies and the number of surfaces which make up the ABM are controlled at each shrinking step, and (2) each converged shape is regarded as one multi-scale shape.
2. THE ACTIVE BALLOON MODEL
The active balloon model is a discrete dynamic deformable model constructed from a set of mobile nodes and adjustable strings which describes 3D triangle patch models from scattered 3D points in space. Each node of the ABM moves according to its local energy iteratively and constructs the 3D shape. The ABM is a model which expands snakes into a three-dimensional shell structure. The initial shape of the ABM has 1280 triangle patches, and its structure is the well-known geodesic dome. The energy function which acts on each node is defined by equation (1), and the next position of each node is decided using a greedy algorithm which uses only the energies of connected nodes [5]. The energy of a point x is defined by the summation of an internal energy Eint and an external energy Eext.
Formula 1
The internal energy Eint corresponds to the smoothing factor in regularization theory, which is defined by the following equation.
Formula 2
where p(x) denotes the position vector of each node, e0, ..., e5 denote the unit vectors along the positive and negative X, Y and Z axes respectively, a weighting parameter defines the ratio of internal and external energies, and J denotes the set of nodes connected to x. This factor controls the smoothness of connections between nodes. The external energy Eext corresponds to a penalty factor. This external energy makes the node approach the object shape. During the iteration procedure, the location having the smallest energy value is chosen as the new position. The external energy is the force produced by the 3D points in space. It is the space potential energy, which is defined using a Gauss function. Consequently, the correspondence procedure between each node and the measured 3D points in space is eliminated.
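A rough sketch of the greedy node update described above, under several assumptions: the external energy is modelled as a negative Gaussian space potential over the measured points, the internal energy as the squared distance from the centroid of the connected nodes, and the candidate set is the current position plus small axis-aligned steps. The parameter names (alpha, sigma, step) and the toy data are illustrative, not taken from the paper.

```python
import numpy as np

def external_energy(x, data_points, sigma):
    """Gaussian space potential produced by the measured 3D points (lower near the data)."""
    d2 = np.sum((data_points - x) ** 2, axis=1)
    return -np.sum(np.exp(-d2 / (2.0 * sigma ** 2)))

def internal_energy(x, neighbor_positions):
    """Smoothness term: squared distance of the node from the centroid of its neighbors."""
    return float(np.sum((x - neighbor_positions.mean(axis=0)) ** 2))

def greedy_update(node, neighbor_positions, data_points, alpha, sigma, step=0.05):
    """Move the node to whichever candidate position (itself or a small axis step)
    has the lowest combined energy alpha * E_int + E_ext."""
    candidates = [node] + [node + step * d for d in np.vstack([np.eye(3), -np.eye(3)])]
    energies = [alpha * internal_energy(c, neighbor_positions)
                + external_energy(c, data_points, sigma) for c in candidates]
    return candidates[int(np.argmin(energies))]

# Toy data: scattered points near the unit sphere, one node and its mesh neighbors.
rng = np.random.default_rng(1)
pts = rng.normal(size=(200, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
node = np.array([0.0, 0.0, 1.5])
neighbors = np.array([[0.3, 0.0, 1.4], [-0.3, 0.0, 1.4], [0.0, 0.3, 1.4]])
for _ in range(50):
    node = greedy_update(node, neighbors, pts, alpha=0.5, sigma=0.3)
print(np.round(node, 3))
```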
3. ENERGY CONTROLLED ACTIVE BALLOON MODEL
3.1. The Problem of Constructing a Multi-scale Tree
MEGI (More Extended Gaussian Image) [6] is a description model to represent arbitrary shapes. However, many MEGI elements are necessary to represent uneven or curved surfaces with accuracy; hence it is difficult to use them for recognition. As a solution, Matsuo [7] proposed a multi-scale matching model using a multi-scale description, by tracing the tree from the root, which corresponds to the coarsest representation, to the leaf and using a matching algorithm. Matsuo [7] also proposed a simple method to construct a multi-scale tree for 3D objects. Making the multi-scale tree with ASMEGI is a "bottom-up method"; that is, the tree is generated from high resolution to low resolution. Therefore, at low resolution, the generated multi-scale tree changes drastically when the shape or pose of the object is slightly changed. Consequently, the features of the multi-scale tree (i.e. (1) the detailed shape of the object is shown at high resolution, and (2) the outline shape, which neglects object deformation and object displacement, is shown at low resolution) are lost. Figure 1 shows the normal vector description for a circle at low resolution. If the phase of the polygon is moved a little, the features of the set of normal vectors change drastically. However, if the multi-scale tree is made using a "top-down method", these problems can be solved.
Figure 1: The normal vector description for a circle in low resolution.
3.2. Multi-scale Description using ECABM
The key idea of this paper is that each shrinking step of the ABM is regarded as a multi-scale description. The initial shape of the ECABM is changed from 1280 triangle patches to an icosahedron. Through the iteration procedure, the initial icosahedron shape converges to a certain shape. This converged shape is regarded as one scale of a multi-scale description. To increase the resolution, each triangle element is divided into four subdividing triangle elements and the iteration procedure is continued. However, a lot of calculation is needed for recognition if unconditional division is used. A division method will be described in Section 4. This subdividing operation enables the ABM to describe a more detailed shape. The parameters which control the shrinking process of the ABM are changed in order to construct a multi-scale description that is robust against noise and partial shape differences. Three parameters are defined: the ratio of internal and external energies in equation (2), the variance of the potential distribution of the external energy in equation (4), and the number of triangle patches that make up the active balloon model, which controls the multi-scale resolution.
Formula 3
3.3. External Energy Control
In order to get a smooth shrinking procedure, the spatial potential defined by equation (4) is controlled. When constructing a low resolution shape, the standard deviation is set to be small (this means the space potential decreases gently). To increase the resolution, the standard deviation is increased. Therefore, the movement of each node depends on the general shape at low resolution and on the partial shape at high resolution. When this parameter is set large, the external potential represents a blurred object shape. On the other hand, if it is small, the distribution of the potential is similar to the object shape.
3.4. Internal Energy Control
By changing the ratio in equation (2) between the external energy and the internal energy, the smoothness (discontinuity) factor can be controlled. Therefore, during the low resolution construction procedure (the first stage of shrinking), the external energy is dominated by the internal energy. With the progress of the shrinking, the balance shifts towards the external energy. As a result, the exact 3D shape can be reconstructed.
4. CONSTRUCTING THE MULTI-SCALE TREE
In Section 2, the algorithm to generate a 3D multi-scale space using the ECABM was discussed. The 3D multi-scale tree cannot be constructed directly, because there is no relation between the triangle patches of the different resolutions. In this section, the method of composing the 3D multi-scale tree from a low resolution to a high resolution is proposed. When the resolution is increased by dividing a single triangle into four, the hierarchical relations between these triangles are defined. Doing this operation from low resolution to high resolution, the multi-scale tree can be constructed. However, when all triangles are divided into four, the triangles which describe flat parts of the object are almost motionless after dividing. Such divisions not only produce a tedious description, but also increase the number of matching candidates for recognition. Useless division is limited by the following division rule for a triangle patch; a sketch of the test appears after the list.
1. Let the three vertices of a triangle patch T be Pa, Pb, Pc respectively.
2. A new vertex Pab is added at the middle of Pa and Pb; Pbc and Pca are added similarly.
3. Calculate the summation e1 of the external energies at these points: e1 = Eext(Pab) + Eext(Pbc) + Eext(Pca) (5)
4. The vertex Pab is moved in the direction where the external energy becomes lower. Let the new point be P'ab; P'bc and P'ca are calculated similarly.
5. The summation e2 of these three nodes is calculated: e2 = Eext(P'ab) + Eext(P'bc) + Eext(P'ca) (6)
6. If e1 - e2 < Th, then the division procedure from patch T to patches T1, T2, T3 and T4 is stopped; else this division step is confirmed. Here Th is a threshold parameter.
7. The above steps are performed for all triangle patches.
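The sketch referenced in the list above tests the division rule for one patch, assuming a hypothetical Gaussian external-energy function and a simple one-step search for the moved midpoints; Eext, the step size and the threshold Th are stand-ins rather than the authors' settings.

```python
import numpy as np

def external_energy(p, data_points, sigma=0.3):
    """Hypothetical Gaussian space potential; stands in for Eext in Eq. (5)-(6)."""
    d2 = np.sum((data_points - p) ** 2, axis=1)
    return -np.sum(np.exp(-d2 / (2.0 * sigma ** 2)))

def lower_energy_position(p, data_points, step=0.05):
    """Move a midpoint one small step in the direction that lowers Eext the most."""
    candidates = [p] + [p + step * d for d in np.vstack([np.eye(3), -np.eye(3)])]
    return min(candidates, key=lambda c: external_energy(c, data_points))

def should_divide(pa, pb, pc, data_points, th=0.01):
    """Division rule: subdivide patch (pa, pb, pc) only if moving the three edge
    midpoints can lower the summed external energy by at least the threshold Th."""
    mids = [(pa + pb) / 2, (pb + pc) / 2, (pc + pa) / 2]
    e1 = sum(external_energy(m, data_points) for m in mids)
    e2 = sum(external_energy(lower_energy_position(m, data_points), data_points) for m in mids)
    return (e1 - e2) >= th   # small gain: the patch already fits the data, so stop dividing

rng = np.random.default_rng(2)
surface = rng.normal(size=(300, 3))
surface /= np.linalg.norm(surface, axis=1, keepdims=True)    # points on a unit sphere
pa, pb, pc = np.array([1.2, 0, 0]), np.array([0, 1.2, 0]), np.array([0, 0, 1.2])
print(should_divide(pa, pb, pc, surface))
```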
Figure 2 shows a division step. The figure shows that flat areas are not divided into high resolution triangles. The outline of the algorithm to construct the multi-scale shape and the multi-scale tree using the ECABM is as follows.
Step 1: The initial shape of the active balloon model is set to an icosahedron. The two control parameters are set to their initial values.
Step 2: Perform the iteration procedure until the shape converges to a certain shape.
Step 3: This converged shape is regarded as one scale of the multi-scale description.
Step 4: If the present resolution reaches the maximum resolution, then exit this procedure.
Step 5: All triangle patches are divided into four new triangle patches, and the division rule described in Section 4 is checked.
Step 6: The two control parameters are updated by their respective factors, and the procedure returns to Step 2.
5. THE EXPERIMENT
The experiment was performed with the range data of human full face data (25 faces) produced by the National Research Council Canada (NRCC) [8]. The hair part of each full face data set was eliminated by hand. In this experiment, a 3D shape model (CAD model) was also generated using the range data. Changed viewpoints of the range data were also rebuilt using the original range data. Figure 3 shows one human full face data set. Figure 4 shows the multi-scale shapes of each resolution for the face shown in Figure 3. In this matching experiment, the view angle of elevation is fixed at 0 degrees, the azimuthal angle is changed from 0 degrees to 10 degrees, and range data is measured. In all experiments, the initial values of the two control parameters were set at 3.5 and 10, respectively, their update factors were set at 2 and 0.5 respectively, and five scales were constructed.
Figure 3
Figure 4
Figure 5 (a) (b)
Figure 5 shows the correlation coefficient between one face in Figure 3 with no rotation and all 25 faces when the rotation angle was changed. The upper figure shows the correlation coefficient using the bottom-up method; the lower figure shows the proposed top-down method using ASMEGI. The correlation coefficients between the correct pair are plotted, and for the correlation coefficients to the other 24 faces only the maximum, the minimum, the average and the standard deviation values are plotted. If the maximum value exceeds the correlation coefficient between the correct pair, recognition is considered to fail at this rotation angle. The correlation coefficient was calculated by the multi-scale tree matching algorithm proposed by Matsuo [7]. When the rotation angle is increased, the recognition rate becomes low. This is because the original face range data is not a complete 3D image; therefore the rotated image has a lot of occluded parts compared with the original image. The proposed top-down method obtained higher correlation coefficients than the bottom-up method at all rotation angles and for both correct and incorrect matching pairs. The reason is that all low resolution shapes constructed by the top-down method become almost the same shape, whereas with the bottom-up method all low resolution shapes become quite different shapes.
Figure 6 shows the recognition rate using the top-down multi-scale tree construction algorithm using ASMEGI and the bottom-up algorithm which was used in [7]. The recognition result is defined as the maximum correlation coefficient between the rotated angle data and the original data. Using the top-down method, the recognition rate becomes 100% for all rotation angles. These results show the extremely high recognition ability of the top-down method, even when applied to images containing curved surfaces which have very few features, like human faces.
REFERENCES
[1] M. Leyton, "A process grammar for shape", Artificial Intelligence, 34:213-247, 1988.
[2] M. Kass, A. Witkin and D. Terzopoulos, "Snakes: Active Contour Models", Int. J. Comput. Vision, 1(4):321-331, 1988.
[3] L. D. Cohen, "On active contour models and balloons", CVGIP: Image Understanding, 53(2):211-218, 1991.
[4] K. Tsuchiya, H. Matsuo and A. Iwata, "3D Shape Reconstruction from Range Data Using Active Balloon Model and Symmetry Restriction", Trans. of the Institute of Electronics, Information and Communication Engineers (in Japanese), J76-D-II(9):1967-1976, Sep. 1993.
[5] D. J. Williams and M. Shah, "A Fast Algorithm for Active Contours", in Proc. of the Third Int. Conf. on Computer Vision, pp. 592-595, 1990.
[6] H. Matsuo and A. Iwata, "3D Object Recognition using MEGI Model from Range Data", 12th Int. Conf. on Pattern Recognition (ICPR), I:843-846, Sep. 1994.
[7] H. Matsuo, J. Funabashi and A. Iwata, "3D Object Recognition using Adaptive Scale MEGI", 13th International Conference on Pattern Recognition (ICPR), IV:117-122, Aug. 1996.
[8] M. Rioux and L. Cournoyer, "The NRCC Three-dimensional Image Data Files", Report CNRC 29077, National Research Council Canada, Ottawa, Canada, 1988.
[1] Miss T. Gandhimathi completed her Bachelor of Engineering at VMKV Engineering College, Vinayaka Missions University, and her postgraduate M.Tech at Periyar Maniammai University, Thanjavur. Presently she is working as an Assistant Professor at Aarupadai Veedu Institute of Technology, Vinayaka Missions University.
[2] Mr. M. Ramasubramanian worked in the school of
computer science and engineering CEG, Guindy, for
more than 4 years. He received his M.E. degree in the
field of Computer Science and Engineering from
Vinayaka Missions University in the year of 2009. He
is presently working as a Senior Assistant Professor,
in Aarupadai Veedu Institute of Technology,
Vinayaka Missions University, India. He is a Life
member in International Society for Technology in
Education
(ISTE) and Indian Science Congress(ISCA). He has
nearly 20 publications in reputed referred Journals and
conferences. He is doing his Research in the area of
Image Processing in the same University, under the
guidance of Dr.M.A. Dorai Rangaswamy.
[3] Dr.M.A.Dorai Rangaswamy is currently Head and the
Senior Professor in Department of Computer Science
and Engineering in Aarupadai Veedu Institute of
Technology, Vinayaka Missions University. His
specializations include Mining, Image processing,
computer architecture and micro-controllers. He has
nearly 40 publications in reputed referred Journals and
conferences. He is an active IEEE member.
CURRENT LITERATURE REVIEW - WEB MINING
K. Dharmarajan¹, Scholar
Dr. M.A. Dorairangaswamy², Dean, CSE
¹ Research and Development Centre, Bharathiar University, Coimbatore - 641 046, India
² AVIT, Chennai, India
¹ [email protected]
² [email protected]
Abstract — This study presents the role of Web mining in the context of the explosive growth of the World Wide Web, where websites provide information and knowledge to end users. This review paper presents a deep and intense study of the various technologies available for web mining, which is the application of data mining techniques to extract knowledge from the Web. Current advances in each of the three different types of web mining are reviewed in the categories of web content mining, web usage mining, and web structure mining.
Index Terms — web mining, web content mining, web usage mining, web structure mining.
I. INTRODUCTION
The World Wide Web, or Web, is the biggest and most popular source of information available; it is reachable and accessible at low cost, provides quick responses to users and reduces the burden of physical movement on users. The data on the Web is noisy. The noise comes from two major sources. First, a typical Web page contains many pieces of information, e.g., the main content of the page, navigation links, advertisements, copyright notices, privacy policies, etc. Second, due to the fact that the Web does not have quality control of information, i.e., one can write almost anything that one likes, a large amount of information on the Web is of low quality, erroneous, or even misleading. Retrieving the required web page on the web, efficiently and effectively, is becoming difficult.
Web mining is an application of data mining which has become an important area of research due to the vast amount of World Wide Web services in recent years. The emerging field of web mining aims at finding and extracting relevant information that is hidden in Web-related data, in particular in text documents published on the Web. The survey on data mining techniques is made with respect to clustering, classification, sequence pattern mining, association rule mining and visualization [1]. The research work done by different users, depicting the pros and cons, is discussed.
II. WEB MINING
Web mining aims to discover useful information or knowledge from the Web hyperlink structure, page content, and usage data. Although Web mining uses many data mining techniques, as mentioned above it is not purely an application of traditional data mining, due to the heterogeneity and semi-structured or unstructured nature of the Web data. Many new mining tasks and algorithms were invented in the past decade. Based on the main kinds of data used in the mining process, Web mining tasks can be categorized into three types: Web structure mining, Web content mining and Web usage mining.
Fig. 1. Web Mining Categories
Web structure mining: Web structure mining discovers
useful knowledge from hyperlinks (or links for short),
which represent the structure of the Web. For example,
from the links, we can discover important Web pages,
which, incidentally, is a key technology used in search
engines. We can also discover communities of users who
share common interests. Traditional data mining does not
perform such tasks because there is usually no link
structure in a relational table.
Web content mining: Web content mining extracts or
mines useful information or knowledge from Web page
contents. For example, we can automatically classify and
cluster Web pages according to their topics. These tasks
are similar to those in traditional data mining. However,
we can also discover patterns in Web pages to extract
useful data such as descriptions of products, postings of
forums, etc, for many purposes. Furthermore, we can
mine customer reviews and forum postings to discover
consumer sentiments. These are not traditional data
mining tasks.
Web usage mining: Web usage mining refers to the
discovery of user access patterns from Web usage logs,
which record every click made by each user. Web usage
mining applies many data mining algorithms. One of the
key issues in Web usage mining is the pre-processing of
click stream data in usage logs in order to produce the
right data for mining.
III. SURVEY ON WEB CONTENT MINING
Web content mining is the process of extracting useful
information from the contents of web documents. Content
data is the collection of facts a web page is designed to
contain [6]. It may consist of text, images, audio, video,
or structured records such as lists and tables [1].
TABLE 1: WEB CONTENT MINING USING
DIFFERENT ALGORITHMS
The web content data consist of unstructured data such as free text, semi-structured data such as HTML documents, and more structured data such as data in tables or database-generated HTML pages. So, two main approaches in web content mining arise: (1) the unstructured text mining approach and (2) the semi-structured and structured mining approach. In this section we begin by reviewing some of the important problems that Web content mining aims to solve. We then list some of the different approaches in this field, classified depending on the different types of Web content data. In each approach we list some of the most used techniques.
The various clustering techniques are as follows. Text-based clustering: the text-based clustering approaches characterize each document according to its content, i.e. the words (or phrases or snippets) contained in it. The basic idea is that if two documents contain many common words then it is very likely that the two documents are very similar; a small sketch of this idea follows. The approaches in this category can be further categorized, according to the clustering method used, into the following categories: partitioned, hierarchical, graph based and probabilistic algorithms [7].
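A small sketch of the "common words" idea behind text-based clustering: documents are compared by the cosine similarity of their word-count vectors, so documents sharing many words score as more similar. The documents and function name here are illustrative.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between two documents' word-count vectors."""
    a, b = Counter(doc_a.lower().split()), Counter(doc_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = [
    "web mining extracts knowledge from web data",
    "web usage mining analyzes server logs and click streams",
    "fuzzy clustering groups data points by membership degree",
]
# Documents sharing many words score higher, so they would fall into the same cluster.
print(round(cosine_similarity(docs[0], docs[1]), 3), round(cosine_similarity(docs[0], docs[2]), 3))
```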
IV. SURVEY ON STRUCTURE MINING
The challenge for Web structure mining is to deal with the structure of the hyperlinks within the Web itself. Link analysis approaches such as InDegree and the Stochastic Approach for Link-Structure Analysis (SALSA) are an old area of research [4]. The Web contains a variety of objects with almost no unifying structure, with differences in authoring style and content much greater than in traditional collections of text documents. The link analysis algorithms include PageRank, Weighted PageRank and HITS [3].
TABLE 2: WEB STRUCTURE MINING USING
DIFFERENT ALGORITHMS
A. HITS (Hyper-link Induced Topic Search)
HITS is a purely link-based algorithm. It is used to rank pages that are retrieved from the Web based on the relevance of their textual content to a given query. Once these pages have been assembled, the HITS algorithm ignores textual content and focuses only on the structure of the Web.
TABLE 3: COMPARISON OF DIFFERENT ALGORITHMS
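A minimal sketch of the HITS iteration on a link graph; the adjacency matrix is a made-up root set, and the normalisation and iteration count are typical choices rather than anything prescribed by the survey.

```python
import numpy as np

def hits(adjacency, iterations=50):
    """Iteratively update hub and authority scores on a link graph.

    adjacency[i, j] = 1 when page i links to page j; scores are L2-normalised
    each round, as in the standard HITS formulation.
    """
    n = adjacency.shape[0]
    hubs = np.ones(n)
    auths = np.ones(n)
    for _ in range(iterations):
        auths = adjacency.T @ hubs           # a page is a good authority if good hubs link to it
        auths /= np.linalg.norm(auths)
        hubs = adjacency @ auths             # a page is a good hub if it links to good authorities
        hubs /= np.linalg.norm(hubs)
    return hubs, auths

# A toy 4-page root set assembled for some query (hypothetical link structure).
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)
hubs, auths = hits(A)
print(np.round(hubs, 3), np.round(auths, 3))
```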
B. Weighted Page Rank (WPR)
The more popular a webpage is, the more links other webpages tend to have to it, or the more it is linked to by them. The extended PageRank algorithm, the Weighted PageRank algorithm, assigns larger rank values to more important (popular) pages instead of dividing the rank value of a page evenly among its outlink pages. Each outlink page gets a value proportional to its popularity (its number of inlinks and outlinks).
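A sketch of one common formulation of Weighted PageRank, in which a page's score is passed to its outlinks in proportion to their in-link and out-link counts instead of being split evenly; the link graph, damping factor and helper names are illustrative assumptions.

```python
def weighted_pagerank(links, d=0.85, iterations=50):
    """One common formulation of Weighted PageRank: a page shares its score with its
    outlinks in proportion to their in-link and out-link popularity, instead of evenly.

    links: dict mapping each page to the list of pages it links to (hypothetical graph).
    """
    pages = sorted(links)
    in_deg = {p: sum(p in outs for outs in links.values()) for p in pages}
    out_deg = {p: len(links[p]) for p in pages}
    wpr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new = {}
        for u in pages:
            backlinks = [v for v in pages if u in links[v]]
            total = 0.0
            for v in backlinks:
                ref = links[v]                      # pages referenced by v
                w_in = in_deg[u] / max(sum(in_deg[p] for p in ref), 1)
                w_out = out_deg[u] / max(sum(out_deg[p] for p in ref), 1)
                total += wpr[v] * w_in * w_out
            new[u] = (1 - d) + d * total
        wpr = new
    return wpr

# Hypothetical four-page web graph.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
print({p: round(r, 3) for p, r in weighted_pagerank(links).items()})
```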
C. Page Rank Algorithm
Page ranking algorithms are the heart of search engines and give results that best suit user expectations. The need for best-quality results is the main reason for the innovation of different page ranking algorithms; HITS, PageRank, Weighted PageRank, DistanceRank, the DirichletRank algorithm and page content ranking are examples of page ranking used in different scenarios. Since the Google search engine has great importance nowadays and affects many web users, the PageRank algorithm used by Google has become very important to researchers [2].
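For comparison with the variants discussed above, a compact sketch of the classic PageRank iteration, where each page splits its rank evenly among its outlinks; the toy graph and the handling of dangling pages are illustrative choices.

```python
import numpy as np

def pagerank(links, d=0.85, iterations=100):
    """Classic PageRank by power iteration: a page's rank is split evenly among its outlinks."""
    pages = sorted(links)
    index = {p: i for i, p in enumerate(pages)}
    n = len(pages)
    rank = np.full(n, 1.0 / n)
    for _ in range(iterations):
        new = np.full(n, (1.0 - d) / n)
        for page, outs in links.items():
            share = rank[index[page]] / len(outs) if outs else rank[index[page]] / n
            targets = outs if outs else pages      # dangling pages spread rank everywhere
            for t in targets:
                new[index[t]] += d * share
        rank = new
    return dict(zip(pages, np.round(rank, 4)))

print(pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}))
```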
D. Page Rank Based on VOL
We have seen that in the original PageRank algorithm the rank score of a page p is equally divided among its outgoing links; in other words, an inbound link brings a rank value from its base page (the rank value of page p divided by the number of links on that page) [3]. Here, more rank value is assigned to the outgoing links which are most visited by users. In this manner a page rank value is calculated based on the visits to inbound links.
E. Result Analysis
This section compares the page rank of web pages using the standard Weighted PageRank (WPR), Weighted PageRank using VOL (WPRVOL) and the proposed algorithm. The rank value of each page has been calculated based on WPR, WPRVOL and the proposed algorithm, i.e. EWPRVOL, for the web graph shown in Table 2. The values of page rank using WPR, WPRVOL and EWPRVOL have been compared [5]. The values retrieved by EWPRVOL are better than those of the original WPR and WPRVOL. WPR uses only web structure mining to calculate the value of page rank; WPRVOL uses both web structure mining and web usage mining to calculate the value of page rank, but it uses popularity only from the number of inlinks and not from the number of outlinks. The proposed EWPRVOL method uses the number of visits of inlinks and outlinks to calculate the values of page rank and gives more rank to important pages.
V. SURVEY ON WEB USAGE MINING
Web Usage Mining is the application of data mining to discover and analyze patterns from click streams, user transactions and other logged user interactions with a website. The goal is to capture, model and analyze the behavior of users, and to define patterns and profiles from the captured behaviors. There are three phases: data collection and pre-processing, pattern discovery, and pattern analysis.
Data collection and pre-processing: this concerns the generation and cleaning of web data and its transformation into a set of user transactions representing the activities of each user during his or her website visit. This step influences the quality and result of the pattern discovery and analysis that follow; therefore it needs to be done very carefully [8].
Pattern discovery: during pattern discovery, information
is analyzed using methods and algorithms to identify
patterns. Patterns can be found using various techniques
such as statistics, data mining, machine learning and
pattern recognition
Pattern analysis: it describes the filtering of
uninteresting and misleading patterns. The content and
structure information of a website can be used to filter
patterns.
Fig. 2: Web Usage Mining process (Bing Liu's)
The data sources are web server data, application server data and application level data. Web server data correspond to the user logs that are collected at the Web server. Some of the typical data collected at a Web server include IP addresses, page references, and access times of the users, and these are the main input to the present research. This research work concentrates on web usage mining and in particular focuses on discovering the web usage patterns of websites from the server log files. The result of the Web Usage Mining process is usually an aggregated user model, which describes the behavior of user groups or pinpoints a trend in user behavior. We then list some of the different approaches in this field, classified depending on the different types of Web usage data. In each approach we list some of the most used techniques.
TABLE 4: WEB USAGE MINING USING DIFFERENT ALGORITHMS
Algorithms Used - Author - Year
fuzzy clustering - Bezdek - 1981
Self-Organizing Map - Kohonen - 1982
Association Rules - Agrawal - 1993
Ontologies - Gruber - 1993
Apriori or FP Growth Module - Agrawal and R. Srikant - 1994
Direct Hashing and Pruning - J. S. Park, M. Chen, P. S. Yu - 1995
Sequential Patterns - R. Agrawal and R. Srikant - 1995
Generalized Sequential Pattern - R. Srikant and R. Agrawal - 1996
Parameter Space Partition - Shiffrin & Nobel - 1998
FP-GROWTH - Jiawei Han, Jian Pei, Yiwen Yin - 2000
Vertical data format - Zaki - 2000
TREE-PROJECTION - Ramesh C. Agarwal, Charu C. Aggarwal, V. V. V. Prasad - 2000
SUGGEST - Baraglia and Palmerini - 2002
An average linear time algorithm - José Borges, Mark Levene - 2004
Harmony - Wang et al. - 2005
semantic web mining - Berendt - 2005
Frequent pattern-based classification - Cheng et al. - 2007
Tree-based frequent patterns - Lee and Fu
Sequential pattern mining with Kth-order Markov model - Zhihua Zhang
clustering - Mehrdad, Norwati, Ali, Md Nasir
pattern-growth principle - 2008
Fan et al. - 2008
intelligent algorithm - 2009
A. Anitha - 2010
LCS Algorithm, clustering - 2010
tools & technology - 2011
process mining techniques - Nicolas Poggi, Vinod Muthusamy, David Carrera, and Rania Khalaf - 2013
VI. CONCLUSION
Nowadays Web mining has become a very popular, interactive and innovative technique; it is an application of data mining that automatically discovers or extracts information from web documents. In this paper we have provided a current evaluation of research papers on the various algorithms, methods, techniques and phases that are used for web mining and its three categories. This paper has presented the efficient algorithms of web mining to give an idea about their application and effectiveness. With Weighted Page Content Rank the user can get relevant and important pages easily, as it employs web structure mining. A new approach uses a different technique based on the Genetic Algorithm (GA) for web content mining. Web usage mining can combine the FP-tree with the Apriori candidate generation method to overcome the disadvantages of both Apriori and FP-growth. Since this is a broad area and there is a lot of work to do, we hope this paper can be useful for identifying opportunities for further research.
REFERENCES
[1] Dushyant Rathod, "A Review on Web Mining", IJERT, Vol. 1, Issue 2, April 2012.
[2] Kaushal Kumar, Abhaya and Fungayi Donewell Mukoko, "PageRank algorithm and its variations: A Survey report", IOSR-JCE, Vol. 14, Issue 1, Sep.-Oct. 2013, pp. 38-45.
[3] Sonal Tuteja, "Enhancement in Weighted PageRank Algorithm Using VOL", IOSR-JCE, Vol. 14, Issue 5, Sep.-Oct. 2013, pp. 135-141.
[4] Tamanna Bhatia, "Link Analysis Algorithms For Web Mining", IJCST, Vol. 2, Issue 2, June 2011.
[5] Allan Borodin, Gareth O. Roberts, Jeffrey S. Rosenthal and Panayiotis Tsaparas, "Link Analysis Ranking: Algorithms, Theory, and Experiments", ACM Transactions on Internet Technology, Vol. 5, No. 1, February 2005, pp. 231-297.
[6] D. Jayalatchumy and P. Thambidurai, "Web Mining Research Issues and Future Directions - A Survey", IOSR-JCE, Vol. 14, Issue 3, Sep.-Oct. 2013, pp. 20-27.
[7] Michael Azmy, "Web Content Mining Research: A Survey", DRAFT Version 1, Nov. 2005.
[8] J. Vellingiri and S. Chenthur Pandian, "A Survey on Web Usage Mining", Global Journals Inc. (USA), Vol. 1, Issue 4, Version 1.0, March 2011.
THE LATCHKEY OF THE RESEARCH PROPOSAL
FOR FUNDED
Mrs. B. Mohana Priya,
Assistant Professor in English,
AVIT, Paiyanoor, Chennai.
[email protected]
ABSTRACT:
A research proposal describes planned
research activities. It is the plan submitted to an
institutional Review Board for review and may be
submitted to a sponsor for research support. The
research plan includes a description of the research
design or methodology, how prospective subjects are
chosen, a description of what will happen during the
study and what data analysis will be used on the data
collected. Whether a proposal receives funding will rely in large part on whether its purpose and goals closely match the priorities of granting agencies. Locating possible grantors is a time-consuming task, but in the long run it will yield the greatest benefits. This paper aims at explaining how the proposal should be drafted keeping the key elements in mind.
Keywords: statement of research plan, title,
introduction, Literature Review, personnel, budget,
methodology
INTRODUCTION
Grant writing varies widely across the
disciplines, and research intended for epistemological
purposes (philosophy or the arts) rests on very different
assumptions than research intended for practical
applications (medicine or social policy research).
Nonetheless, this article attempts to provide a general
introduction to grant writing across the disciplines.
Although some scholars in the humanities and
arts may not have thought about their projects in terms of
research design, hypotheses, research questions, or
results, reviewers and funding agencies expect us to
frame our project in these terms. Learning the language
of grant writing can be a lucrative endeavor, so give it a
try. We may also find that thinking about our project in
these terms reveals new aspects of it to us.
Writing successful grant applications is a long
process that begins with an idea. Although many people
think of grant writing as a linear process (from idea to
proposal to award), it is a circular process. We need to
plan accordingly.
PROJECT TITLE
The title page usually includes a brief yet explicit title for the research project, the names of the principal investigator(s), the institutional affiliation of the applicants (the department and university), the name and address of the granting agency, project dates, the amount of funding requested, and the signatures of university personnel authorizing the proposal (when necessary). Most funding agencies have specific requirements for the title page; make sure to follow them.
ABSTRACT
The abstract provides readers with their first
impression of our project. To remind themselves of our
proposal, readers may glance at our abstract when making
their final recommendations, so it may also serve as their
last impression of our project. The abstract should explain
the key elements of our research project in the future
tense. Most abstracts state: (1) the general purpose, (2)
specific goals, (3) research design, (4) methods, and (5)
significance (contribution and rationale). Be as explicit as
possible in our abstract. Use statements such as, " The
objective of this study is to ..."
INTRODUCTION
The introduction should cover the key elements
of our proposal, including a statement of the problem, the
purpose of research, research goals or objectives, and
significance of the research. The statement of problem
should provide a background and rationale for the project
and establish the need and relevance of the research.
How is our project different from previous research on
the same topic? Will we be using new methodologies or
covering new theoretical territory? The research goals or
objectives should identify the anticipated outcomes of the
research and should match up to the needs identified in
the statement of problem. List only the principle goal(s)
or objective(s) of our research and save sub-objectives for
the project narrative.
BACKGROUND/RATIONALE/LITERATURE
REVIEW
This section gives the basis for doing the research study: explain why the research should be done and why it is a good research question.
There is no need for an extensive literature review for a
simple study. The literature review can be the
bibliography compiled to support the research question.
Many proposals require a literature review. Reviewers
want to know whether we have done the necessary
preliminary research to undertake our project. Literature
reviews should be selective and critical, not exhaustive.
Reviewers want to see our evaluation of pertinent works.
PROJECT NARRATIVE
The project narrative provides the meat of our proposal and may require several subsections. The project narrative should supply all the details of the project, including a detailed statement of the problem, research objectives or goals, hypotheses, methods, procedures, outcomes or deliverables, and the evaluation and dissemination of the research.
For the project narrative, pre-empt and/or answer all of the reviewers' questions. Don't leave them wondering about anything. For example, if we propose to conduct unstructured interviews with open-ended questions, we should explain why this methodology is best suited to the specific research questions in our proposal. Or, if we're using item response theory rather than classical test theory to verify the validity of our survey instrument, explain the advantages of this innovative methodology. Or, if we need to travel abroad to India to access historical archives, clearly and explicitly state the connections between our research objectives, research questions, hypotheses, methodologies, and outcomes. The requirements for a strong project narrative vary widely by discipline.
PERSONNEL
Explain staffing requirements in detail and make
sure that staffing makes sense. Be very explicit about the
skill sets of the personnel already in place (we will
probably include their Curriculum Vitae as part of the
proposal). Explain the necessary skill sets and functions
of personnel we will recruit. To minimize expenses,
phase out personnel who are not relevant to later phases
of a project.
BUDGET
The budget spells out project costs and usually
consists of a spreadsheet or table with the budget detailed
as line items and a budget narrative (also known as a
budget justification) that explains the various expenses.
Even when proposal guidelines do not specifically
mention a narrative, be sure to include a one or two page
explanation of the budget. Consider including an
exhaustive budget for our project, even if it exceeds the
normal grant size of a particular funding organization.
Simply make it clear that we are seeking additional
funding from other sources. This technique will make it
easier for us to combine awards down the road should we
have the good fortune of receiving multiple grants.
Make sure that all budget items meet the funding
agency's requirements. If a line item falls outside an
agency's requirements (e.g. some organizations will not
cover equipment purchases or other capital expenses),
explain in the budget justification that other grant sources
will pay for the item.
Many universities require that indirect costs
(overhead) be added to grants that they administer. Check
with the appropriate offices to find out what the standard
(or required) rates are for overhead. Pass a draft budget
by the university officer in charge of grant administration
for assistance with indirect costs and costs not directly
associated with research (e.g. facilities use charges).
TIME FRAME
Explain the timeframe for the research project in
some detail. When will we begin and complete each step?
It may be helpful to reviewers if we present a visual
version of our timeline. For less complicated research, a
table summarizing the timeline for the project will help
reviewers understand and evaluate the planning and
feasibility. For multi-year research proposals with
numerous procedures and a large staff, a time line
diagram can help clarify the feasibility and planning of
the study.
RESEARCH METHODS
Study design: Explain the study design and choice
of methodology.
Statistical bias: Measures taken to avoid bias (if relevant). If a random sample is used, how will the sample be chosen?
Study procedures: What will happen to the
people participating in the study?
Study duration: How long will the study last;
expected duration of subject participation?
Standard tools: Will any standard tools be
utilized (e.g. Beck Depression Inventory)?
Study Participants
Who will participate in the research?
How will research participants be recruited?
Sampling: (If applicable) explain how sampling
will occur?
Selection and withdrawal of subjects
Statistical Analysis (only if applicable)
Statistical methods including interim analysis if
appropriate Number of subjects to be enrolled
Rationale for choice of sample size (power
calculation and justification) Level of significance
to be used Criteria for terminating the study
Procedures for reporting deviations from the
original plan Selection of subjects for inclusion in
the analysis
ANTICIPATED RESULTS AND POTENTIAL
PITFALLS
Obviously we do not have results at the proposal
stage. However, we need to have some idea about what
kind of data we will be collecting, and what statistical
procedures will be used in order to answer our research question or test our hypothesis.
TIPS TO GET FUNDING
1.
Make a cost/benefit decision. Decide whether
you want to go after external funding. There are
two units of academic currency: articles and
grants. The opportunity cost of writing a
competitive grant proposal is high, and we may be
better suited to writing articles.
2.
Make ourselves valuable. Develop a set of
demonstrable core competencies through our
publications. Our Curriculum Vitae is our portfolio
of skill sets, and we will be judged on our ability
to deliver. Don’t submit a proposal before we have
a few publications under our belt in the relevant
area.
3.
Get to know the funding sources. Different
funding sources have different missions and
different criteria. Our sponsored research office
(SRO) should be able to help us get this
information, and we should also peruse the
foundation websites. NSF, for example, funds
basic research, so intellectual merit and broader
impact, are the key criteria. Foundations have
specific goals in terms of advancing a particular
agenda. Government agencies have specific
missions. Don’t forget about doing consulting
work, particularly if we can turn the information
gleaned from the work into an insightful
publication. Identify the funding source which has
the greatest overlap with your research interest and
invest heavily in getting to know more about their
interests.
4.
Get to know the key people. If we are going after
grants, get in touch with the cognizant program
officer. It is their job to know about their
foundation, and they will often know about
upcoming opportunities at both their foundation
and others. But don’t waste their time. A courteous
email which provides a concise outline of our
research idea, and connects it to their mission is a
much better introduction than a phone call out of
the blue.
Get to know the community by presenting at
their conferences. This helps in several ways.
First, a good presentation helps establish us as
competent and explains our research agenda
beyond our proposal. Second, the networking with
others who have been successful at getting grants
helps us get a better sense of the funding source’s
portfolio, and the style of research they support.
Third, members of the community will typically be
asked to review any grant proposal we submit.
6. Submit our first few grants with senior colleagues who have been successful in getting grants. Grant writing is a skill that is not typically taught in graduate school, and on-the-job training is the best way to acquire that skill.
7. Write well and have a focus. In our opening paragraph, state our focus. Every sentence that we write in the grant should develop our key idea. Write clear prose that assumes the reader is an expert, but not necessarily deeply embedded in our project. We should have a clear and logical beginning, middle, and end to our proposal. Write multiple drafts and eliminate verbosity, jargon and extraneous sentences. Cite other research that relates to our idea, but make it clear how our work fills an important gap in that research.
8. Ask for feedback. It's very important to get others to read our proposal and make critical suggestions so that we submit the strongest possible proposal to the funder. There are reputation consequences to submitting poor proposals.
9. Resubmit. If we get good, constructive reviews, consider resubmitting the proposal. Consult with the program officer before doing so, and spend a lot of time making sure we address each point carefully.
10. Deliver. Most foundations are interested in developing an academic community that studies a set of problems related to their mission. Once we get that first grant, make sure we deliver on what we promised. Let the program officer know about our publications, presentations, and other visible consequences of their investment in us. The more valuable our research is, and the more active we are in the professional community, the more likely it is that the funding agency will continue to support us throughout our career.
DISCUSSION
It is important to convince the reader of the potential impact of our proposed research. We need to communicate a sense of enthusiasm and confidence without exaggerating the merits of our proposal. That is why we also need to mention the limitations and weaknesses of the proposed research, which may be justified by time and financial constraints as well as by the early developmental stage of our research area.
A COMBINED PCA MODEL FOR DENOISING OF CT IMAGES
1 Mredhula.L, 2 Dorairangaswamy M A
1 Research Scholar, Sathyabama University, Chennai
2 Senior Professor and Head, CSE & IT, AVIT, Chennai
1 [email protected], 2 [email protected]
Abstract—Digital images play a vital part in the medical field, where they are used to analyze anatomy. These medical images are used in the identification of different diseases. Regrettably, medical images contain noise arising from the different sources by which they are produced. Removing such noise from medical images is extremely crucial, because the noise may degrade the quality of the images and also confound the identification of the disease. Hence, denoising of medical images is indispensable. In medical imaging, the different imaging techniques are called modalities. Anatomical modalities provide insight into the anatomical morphology. They include radiography, ultrasonography or ultrasound (US), computed tomography (CT), and magnetic resonance imaging (MRI). Image denoising is one of the classical problems in digital image processing, and it has an important role as a pre-processing step in various applications. The main objective of image denoising is to recover the best estimate of the original image from its noisy version.
Index Terms—Denoising, CT images, Principal component analysis, Gaussian
I. INTRODUCTION
Distortion is most prevalently caused by additive white Gaussian noise, which can arise from poor image acquisition or from transferring the image data over noisy communication channels. Impulse and speckle noise are among the other types of noise [1]. Denoising is frequently necessary and is the initial step to be taken before the image data is analyzed. To compensate for such data corruption, it is essential to apply an efficient denoising technique [1]. An image acquires a mottled, grainy, textured or snowy appearance in the presence of noise. Hence, in recent years, considerable interest has been observed in recovering an original image from its noisy version [2]. The challenge of removing noise from images has a long history. In computer vision and image processing, noise reduction is an essential step for any complicated algorithm [3]. As a consequence, in order to make quantitative post-processing more robust and efficient, image processing procedures often entail removing image artifacts in advance [4].
The rapid progress in medical imaging technology and the introduction of new imaging modalities have invited novel image processing techniques, comprising specialized noise filtering, classification, enhancement and segmentation techniques. With the extensive utilization of digital imaging in medicine, the quality of digital medical images has become a serious issue. In order to accomplish the best diagnoses, it is essential that medical images be sharp, clear, and devoid of noise and artifacts. Even though the technologies for obtaining digital medical images continue to progress, providing images of higher resolution and quality, noise remains an open problem for medical images. Eliminating noise from these digital images continues to be a big issue in the study of medical imaging.
In general, image processing refers to a broad class of algorithms for the modification and analysis of an image. Image processing also refers to the initial image manipulation during acquisition, post-processing, or rendering/visualization [5]. For converting a captured RGB image obtained from the real source, preprocessing steps are essential so that it can be qualified for performing any binary operations on it [6]. Image processing alters pictures to improve them (enhancement, restoration), extract information (analysis, recognition), and change their structure (composition, image editing) [7]. Image processing is exploited for two different purposes: a) improving the visual appearance of images to a human viewer, and b) preparing images for measurement of the features and structures present [8]. Denoising, restoration, pre-segmentation, enhancement, sharpening and brightness correction are some of the steps included in image pre-processing [9]. The difficulty of image denoising is to recover an image that is cleaner than its noisy observation. Thus, noise reduction is a significant technology in image analysis and the initial step to be taken before images are analyzed [10].
II .RELATED SURVEY
In image processing, data-driven descriptions of structure
are becoming increasingly important. Traditionally, many
models used in applications such as denoising and
segmentation have been based on the assumption of
piecewise smoothness. Unfortunately, this type of model
is too simple to capture the textures present in a large
percentage of real images. This drawback has limited the
performance of such models, and motivated data-driven
representations. One data-driven strategy is to use image
neighborhoods or patches as a feature vector for
September 2014, Volume-1, Special Issue-1
representing local structure. Image neighbourhoods are
rich enough to capture the local structures of real images,
but do not impose an explicit model. This representation
has been used as a basis for image denoising [11].
Gaussian white noise models have become increasingly popular as a canonical type of model in which to address certain statistical problems. The Gaussian noise model has a very significant feature: regardless of the variance, the histogram of the noise follows the Gaussian distribution. In the Gaussian method, first, according to the image features, it is estimated whether a pixel lies on an image edge, is a noise point, or is an edge texture point. Then, according to the local continuity of the image edge and the texture features, the noise points are located. Lastly, for noise which is not on an edge or texture, the mean value of the non-noise points in the adaptive neighbourhood is used to eliminate the noise; noise on the edge and texture regions uses the pixel points of the neighbourhood edge. With the help of this method the Gaussian noise in the image can be removed well, and the number of residual noise points decreases sharply [12].
Principal component analysis (PCA) is a mathematical
procedure that uses an orthogonal transformation to
convert a set of observations of possibly correlated
variables into a set of values of uncorrelated variables
called principal components. The number of principal
components is less than or equal to the number of original
variables. This transformation is defined in such a way
that the first principal component has the largest possible
variance (that is, accounts for as much of the variability in
the data as possible), and each succeeding component in
turn has the highest variance possible under the constraint
that it be orthogonal to (i.e., uncorrelated with) the
preceding components [13].
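As a brief illustration of this transformation (a minimal sketch, not tied to any particular dataset in this paper), the principal components of a data matrix can be obtained from the eigendecomposition of its covariance matrix:

import numpy as np

def pca(data, n_components):
    """Project rows of `data` (samples x variables) onto the top principal components.

    Returns the projected data, the component vectors, and the eigenvalues
    (variances along each component), sorted in decreasing order.
    """
    # Centre each variable so the covariance matrix is meaningful.
    mean = data.mean(axis=0)
    centred = data - mean

    # Covariance matrix of the variables and its eigendecomposition.
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: the covariance matrix is symmetric

    # Sort components by decreasing variance and keep the first n_components.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]

    return centred @ components, components, eigvals[:n_components]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 5))            # toy data: 200 samples, 5 variables
    projected, comps, variances = pca(x, n_components=2)
    print(projected.shape, variances)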
PCA fully de-correlates the original data set, so that the energy of the signal concentrates on a small subset of the PCA-transformed dataset. Since the energy of random noise spreads evenly over the whole data set, we can easily distinguish signal from random noise in the PCA domain. PCA is a way of identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences. Since patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available, PCA is a powerful tool for analyzing data. The other main advantage of PCA is that, once these patterns have been found, the data can be compressed by reducing the number of dimensions without much loss of information [13].
PCA is a classical de-correlation technique in statistical signal processing and is pervasively used in pattern recognition, dimensionality reduction, etc. By transforming the original dataset into the PCA domain and preserving only the several most significant principal components, noise and trivial information can be removed. A PCA-based scheme was proposed for image denoising that uses a moving window to calculate the local statistics, from which the local PCA transformation matrix is estimated. However, this scheme applies PCA directly to the noisy image without data selection, and many noise residuals and visual artifacts appear in the denoised outputs.
For a better preservation of image local structures, a pixel and its nearest neighbors are modelled as a vector variable, whose training samples are selected from the local window by using block-matching-based local pixel grouping (LPG). The LPG algorithm guarantees that only the sample blocks with similar contents are used in the local statistics calculation for PCA transform estimation, so that the image local features can be well preserved after coefficient shrinkage in the PCA domain to remove the random noise.
PCA-based denoising with local pixel grouping (LPG), or with self-similarity indexing, is regarded as a highly efficient method [14]. It is comparable to other non-local methods such as Block Matching 3D (BM3D). The non-local means (NLM) image denoising algorithm averages pixel intensities using a weighting scheme based on the similarity of image neighborhoods [11].
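The LPG-PCA idea described above might be sketched as follows for a single reference block. This is only an illustrative outline under simplifying assumptions (fixed block and window sizes, a known noise level sigma, no boundary handling), not the implementation of [14] or [15]:

import numpy as np

def lpg_pca_denoise_block(image, top, left, block=5, window=21, n_similar=40, sigma=10.0):
    """Denoise one block of a grayscale image (float array) with LPG-PCA.

    The block at (top, left) is grouped with the most similar blocks found in a
    local search window, PCA is applied to the group, the PCA coefficients are
    shrunk according to the noise variance, and the denoised reference block is
    returned.
    """
    ref = image[top:top + block, left:left + block].ravel()

    # Local pixel grouping: collect candidate blocks inside the search window
    # and keep those closest to the reference block (block matching by SSD).
    candidates = []
    r0, r1 = max(0, top - window // 2), min(image.shape[0] - block, top + window // 2)
    c0, c1 = max(0, left - window // 2), min(image.shape[1] - block, left + window // 2)
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            candidates.append(image[r:r + block, c:c + block].ravel())
    candidates = np.array(candidates)
    dist = ((candidates - ref) ** 2).sum(axis=1)
    group = candidates[np.argsort(dist)[:n_similar]]        # training samples

    # PCA transform estimated from the grouped samples.
    mean = group.mean(axis=0)
    centred = group - mean
    cov = np.cov(centred, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)

    # Shrink each PCA coefficient by an empirical Wiener-like factor.
    coeffs = centred @ eigvecs
    signal_var = np.maximum(coeffs.var(axis=0) - sigma ** 2, 0.0)
    shrink = signal_var / (signal_var + sigma ** 2 + 1e-12)
    denoised_group = (coeffs * shrink) @ eigvecs.T + mean

    # The first row of the sorted group is the reference block itself (distance 0).
    return denoised_group[0].reshape(block, block)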
In the modified Self Similarity pixel Strategy (SSS)-PCA
method, a pixel and its nearest neighbors are modelled as
a vector variable. The training samples of this variable are
selected by identifying the pixels with self similarity
based local spatial structures to the underlying one in the
local window. With such an SSS procedure, the local
structural statistics of the variables can be accurately
computed so that the image edge structures can be well
preserved after shrinkage in the PCA domain for noise
removal. The SSS-PCA algorithm is implemented as two
stages. The first stage yields an initial estimation of the
image by removing most of the noise and the second stage
will further refine the output of the first stage. The two
stages have the same procedures except for the parameter
of noise level. Since the noise is significantly reduced in
the first stage, the SSS accuracy will be much improved
in the second stage so that the final denoising result is
visually much better [15].
PCA-based image processing has also been done in other transform domains such as the wavelet [13] and Contourlet [16] domains. The wavelet transform has been used for various image analysis problems due to its nice multi-resolution properties and decoupling characteristics. For the wavelet transform, the coefficients at the coarse level represent a larger time interval but a narrower band of frequencies. This feature of the wavelet transform is very important for image coding. In the active areas, the image data is more localized in the spatial domain, while in the smooth areas, the image data is more localized in the frequency domain [13]. Wavelets may be a more useful image representation than pixels. Hence, we consider PCA dimensionality reduction of wavelet coefficients in order to maximize edge information in the reduced-dimensionality set of images. The wavelet transform takes place spatially over each image band, while the PCA transform takes place spectrally over the set of images. Thus, the two transforms operate over different domains. Still, PCA over a complete set of wavelet and approximation coefficients will result in exactly the same eigenspectra as PCA over the pixels [11].
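Since the proposed method is later compared against wavelet decomposition with soft thresholding, a minimal sketch of that baseline is given below. It assumes the PyWavelets package and a simple universal threshold, which may differ from the exact baseline configuration used by the authors:

import numpy as np
import pywt

def wavelet_soft_threshold_denoise(image, wavelet="db4", level=2):
    """Baseline denoiser: multilevel 2-D wavelet decomposition, soft thresholding
    of the detail coefficients with a universal threshold, then reconstruction."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)

    # Estimate the noise standard deviation from the finest diagonal detail band.
    finest_diag = coeffs[-1][-1]
    sigma = np.median(np.abs(finest_diag)) / 0.6745
    threshold = sigma * np.sqrt(2.0 * np.log(image.size))

    # Soft-threshold every detail sub-band; keep the approximation untouched.
    denoised = [coeffs[0]]
    for detail_level in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(band, threshold, mode="soft")
                              for band in detail_level))

    return pywt.waverec2(denoised, wavelet)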
The Contourlet transform provides a multiscale and multidirectional representation of an image. As the directional filter banks can capture the directional information of images, the Contourlet transform is very effective at representing the detailed information of images [16]. The Contourlet transform was developed as a true two-dimensional representation for images that can efficiently capture the intrinsic geometrical structure of pictorial information. Because of the employment of directional filter banks (DFB), the Contourlet transform can provide a much more detailed representation of natural images with abundant textural information than wavelets. In [16], an image denoising algorithm was proposed based on the Contourlet transform and 2DPCA: the Contourlet transform performs the multiresolutional and multidirectional decomposition of the image, while 2DPCA is carried out to estimate the threshold for the soft thresholding.
III. PROPOSED METHODOLOGY
The block diagram below shows the steps involved in the image denoising. The proposed method is compared with wavelet decomposition with soft thresholding.
A. Image acquisition:
CT scanned images of a patient are displayed as an array of pixels and stored in Matlab 7.0. The following figure displays a CT brain image in Matlab 7.0. A grey-scale image can be specified by a large matrix of numbers between 0 and 255, with 0 corresponding to black and 255 to white. The images are stored in the database in JPEG format.
B. Preprocessing:
The main step here is to rearrange the pixel values in order to find the noisy area without affecting the lower dimensional areas. The pixel values are computed using the difference between nearest neighbours, and then the intensity values of the clustering are found. After finding the intensity values, the values in the image are updated by setting a threshold of 0.5. The process is repeated until the eligibility criterion is satisfied.
C. PCA:
PCA is then applied to the data set obtained above, and the eigenvectors and eigenvalues are calculated. The Sobel edge detector is then used to detect thick edges, and blur is removed.
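The preprocessing, PCA and edge steps described above might be stitched together roughly as follows. The neighbour-difference threshold rule and the exact way PCA is applied here are assumptions made for illustration, since the paper does not give the precise formulas:

import numpy as np
from scipy import ndimage

def preprocess(image, threshold=0.5, max_iter=10):
    """Rearrange pixel values using nearest-neighbour differences (assumed scheme):
    pixels whose difference from the 3x3 neighbourhood mean is large are replaced
    by that mean, repeated until no pixel changes."""
    img = image.astype(float) / 255.0
    for _ in range(max_iter):
        neighbour_mean = ndimage.uniform_filter(img, size=3)
        noisy = np.abs(img - neighbour_mean) > threshold * img.std()
        if not noisy.any():
            break
        img[noisy] = neighbour_mean[noisy]
    return img

def pca_denoise(image, patch=8, keep=4):
    """Apply PCA over non-overlapping patches and keep only the leading eigenvectors."""
    h, w = (np.array(image.shape) // patch) * patch
    patches = (image[:h, :w]
               .reshape(h // patch, patch, w // patch, patch)
               .swapaxes(1, 2)
               .reshape(-1, patch * patch))
    mean = patches.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(patches - mean, rowvar=False))
    basis = eigvecs[:, -keep:]                      # leading components
    restored = (patches - mean) @ basis @ basis.T + mean
    return (restored.reshape(h // patch, w // patch, patch, patch)
                    .swapaxes(1, 2).reshape(h, w))

def denoise_ct(image):
    """Full sketch of the pipeline: preprocessing, PCA, then Sobel edge emphasis."""
    smooth = pca_denoise(preprocess(image))
    edges = np.hypot(ndimage.sobel(smooth, axis=0), ndimage.sobel(smooth, axis=1))
    return np.clip(smooth + 0.1 * edges, 0.0, 1.0)  # add back thick edges lightly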
IV. RESULTS
D. Input Image 1
Figure 1
Figure 2
Figure 3
Figure 1 shows the original input image 1 of CT lungs, figure 2 shows the noisy image obtained by adding Gaussian noise to the original image 1, and figure 3 shows the final output image, which is the denoised image.
B. Input Image 2
Figure 4
Figure 5
Figure 6
Figure 4 shows the original input image 2 of CT lungs, figure 5 shows the noisy image obtained by adding Gaussian noise to the original image 2, and figure 6 shows the final output image, which is the denoised image.
V. PERFORMANCE EVALUATION
It is very difficult to measure the improvement of an image. If the restored image allows the observer to perceive the region of interest better, then we say that the restored image has been improved. Parameters such as the mean and the PSNR help to measure the local contrast enhancement. The PSNR measure is not ideal, but it is in common use. The PSNR values of the proposed method were compared with those of the existing soft thresholding with wavelet decomposition; the comparison is shown in Table 1 and Table 2 for the two input images and for various values of the noise variance.
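For reference, the PSNR used in this comparison can be computed as follows (a small helper, assuming 8-bit images with a peak value of 255):

import numpy as np

def psnr(original, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    mse = np.mean((original.astype(float) - denoised.astype(float)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)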
TABLE 1: PSNR COMPARISON FOR INPUT IMAGE 1
TABLE 2: PSNR COMPARISON FOR INPUT IMAGE 2
VI. CONCLUSIONS
In this paper, an effective way of denoising images has been achieved using rearrangement of pixels and PCA. The Sobel edge detector was used to obtain a better edge effect. This approach was developed for CT images, as CT is one of the most common and most significant modalities employed in medical imaging.
REFERENCES
[1] Arivazhagan, Deivalakshmi and Kannan, "Performance Analysis of Image Denoising System for different levels of Wavelet decomposition", International Journal of Imaging Science and Engineering, Vol. 3, 2007.
[2] Syed Amjad Ali, Srinivasan Vathsal and Lal Kishore, "A GA-based Window Selection Methodology to Enhance Window-based Multi-wavelet Transformation and Thresholding aided CT Image Denoising Technique", (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 2, 2010.
[3] Syed Amjad Ali, Srinivasan Vathsal and Lal Kishore, "CT Image Denoising Technique using GA aided Window-based Multiwavelet Transformation and Thresholding with the Incorporation of an Effective Quality Enhancement Method", International Journal of Digital Content Technology and its Applications, Vol. 4, No. 4, 2010.
[4] Pierrick Coupé, Pierre Yger, Sylvain Prima, Pierre Hellier, Charles Kervrann, Christian Barillot, "An optimized blockwise nonlocal means denoising filter for 3-D magnetic resonance images", Vol. 27, No. 4, pp. 425-441, 2008.
[5] K. Arulmozhi, S. Arumuga Perumal, K. Kannan, S. Bharathi, "Contrast Improvement of Radiographic Images in Spatial Domain by Edge Preserving Filters", IJCSNS International Journal of Computer Science and Network Security, Vol. 10, No. 2, February 2010.
[6] G. M. Atiqur Rahaman and Md. Mobarak Hossain, "Automatic Defect Detection and Classification Technique From Image: A Special Case Using Ceramic Tiles", (IJCSIS) International Journal of Computer Science and Information Security, Vol. 1, No. 1, May 2009.
[7] Carl Steven Rapp, "Image Processing and Image Enhancement", Frontiers in Physiology, 1996.
[8] John Russ, "The Image Processing Handbook", Library of Congress Cataloging-in-Publication Data, Vol. 3, 1999.
[9] Hiroyuki Takeda, Sina Farsiu and Peyman Milanfar, "Kernel Regression for Image Processing and Reconstruction", IEEE Transactions on Image Processing, Vol. 16, No. 2, Feb 2007.
[10] Yong-Hwan Lee and Sang-Burm Rhee, "Wavelet-based Image Denoising with Optimal Filter", International Journal of Information Processing Systems, Vol. 1, No. 1, 2005.
[11] Tolga Tasdizen, "Principal Neighborhood Dictionaries for Non-local Means Image Denoising", IEEE Transactions on Image Processing, Vol. 20, No. 10, January 2009.
[12] Sushil Kumar Singh and Aruna Kathane, "Various Methods for Edge Detection in Digital Image Processing", IJCST, Vol. 2, Issue 2, June 2011.
[13] Vikas D Patil and Sachin D. Ruikar, "PCA Based Image Enhancement in Wavelet Domain", International Journal of Engineering Trends and Technology, Vol. 3, Issue 1, 2012.
[14] Sabita Pal, Rina Mahakud and Madhusmita Sahoo, "PCA based Image Denoising using LPG", IJCA Special Issue on 2nd National Conference - Computing, Communication and Sensor Network (CCSN), 2011.
[15] K. John Peter, Dr K. Senthamarai Kannan, Dr S. Arumugan and G. Nagarajan, "Two-stage image denoising by Principal Component Analysis with Self Similarity pixel Strategy", International Journal of Computer Science and Network Security, Vol. 11, No. 5, May 2011.
[16] Sivakumar, Ravi, Senthilkumar and Keerhikumar, "Image Denoising Using Contourlet and 2D PCA", International Journal of Communications and Engineering, Vol. 05, No. 5, Issue 02, March 2012.
RFID BASED PERSONAL MEDICAL DATA CARD FOR TOLL AUTOMATION
Ramalatha M 1, Ramkumar A K 2, Selvaraj S 3, Suriyakanth S 4
Kumaraguru College of Technology, Coimbatore, India.
1 [email protected], 2 [email protected], 3 [email protected], 4 [email protected]
Abstract--- In the traditional toll gate system, vehicles passing through must pay their toll manually at the gate to obtain entry across the toll gate. The proposed RFID system uses tags that are mounted on the windshields of vehicles, through which the information embedded on the tags is read by RFID readers. This eliminates the need to pay the toll manually, enabling automatic toll collection with the transaction carried out on the account held for the vehicle. This enables more efficient toll collection by reducing traffic and eliminating possible human errors. In an emergency, paramedics or doctors can read the RFID device to retrieve the medical records of the customer who owns the tag. This plays a vital role during emergencies, where one need not wait for basic tests to be done; by referring to the customer's medical details, treatment can be given at once, which may save a precious human life.
Keywords--- Electronic toll collection, medical RFID, GSM
1. INTRODUCTION
In the traditional toll gate system, vehicles passing through must pay their toll manually at the gate to obtain entry across the toll gate, and the tax is normally paid in the form of cash. The main objective of this project is to pay the toll gate tax using a smart card that also carries medical details to be used during an emergency. The smart card must be recharged with some amount, and whenever a person wants to pay the toll gate tax, he needs to insert his smart card and deduct the amount using the keypad [6]. By using this kind of device there is no need to carry the amount in the form of cash, so we gain security as well. This system is capable of determining whether the car is registered or not, and then informing the authorities of toll payment violations, debits, and participating accounts [1]. These electronic toll collection systems are a combination of completely automated toll collection systems (requiring no manual operation of toll barriers or collection of tolls) and semi-automatic lanes.
The most obvious advantage of this technology is the opportunity to eliminate congestion at tollbooths, especially during festive seasons when traffic tends to be heavier than normal. It is also a method by which to curb complaints from motorists regarding the inconveniences involved in manually making payments at the tollbooths [1]. Other than this obvious advantage, applying ETC could also benefit the toll operators.
The benefits for the motorists include:
Fewer or shorter queues at toll plazas by increasing toll booth service turnaround rates;
Faster and more efficient service (no exchanging toll fees by hand);
The ability to make payments by keeping a balance on the card itself or by loading a registered credit card; and
The use of postpaid toll statements (no need to request receipts).
Other general advantages for the motorists include fuel savings and reduced mobile emissions by reducing or eliminating deceleration, waiting time, and acceleration. Meanwhile, for the toll operators, the benefits include:
Lowered toll collection costs;
Better audit control through centralized user accounts; and
Expanded capacity without building more infrastructure.
In case of emergency, the customer's medical details such as blood group, diabetes reports, blood pressure reports, etc. can be viewed.
The next sections of this paper are organized as follows. Section 2 deals with RFID technology and Section 3 with ATC (automatic toll collection) components. Microcontroller programming is discussed in Section 4. Section 5 deals with VB programming. Section 6 contains the procedure for transactions. Finally, Section 7 contains the concluding remarks.
2. RFID TECHNOLOGY
Radio frequency identification (RFID) technology is a
non-contact method of item identification based on the
use of radio waves to communicate data about an item
between a tag and a reader. The RFID data is stored on
tags which respond to the reader by transforming the
energy of radio frequency queries from the reader (or
transceiver), and sending back the information they
enclose. The ability of RFID to read objects in motion
and out of the line-of-sight is its major advantage. The
tags can be read under harsh conditions of temperature,
chemicals and high pressure[2]. The use of RFID
technology reduces operational costs by reducing the
need for human operators in systems that collect
information and in revenue collection.
2.1 RFID TAGS
An RFID tag is an object that can be attached to or incorporated into a product, animal, or person for the purpose of identification using radio waves. The RFID tag is essentially a memory device with the means of revealing and communicating its memory contents when prompted to do so.
RFID tags come in three general varieties: passive, active, or semi-passive (also known as battery-assisted). Passive tags require no internal power source, thus being purely passive devices (they are only active when a reader is nearby to power them), whereas semi-passive and active tags require a power source, usually a small battery.
Passive RFID tags have no internal power supply. The minute electrical current induced in the antenna by the incoming radio frequency signal provides just enough power for the CMOS integrated circuit in the tag to power up and transmit a response. Most passive tags signal by backscattering the carrier wave from the reader. Unlike passive RFID tags, active RFID tags have their own internal power source, which is used to power the integrated circuits and broadcast the signal to the reader. Active tags are typically much more reliable (i.e. fewer errors) than passive tags.
To communicate, tags respond to queries generating
signals that must not create interference with the readers,
as arriving signals can be very weak and must be told
apart. The RFID tags which have been used in the system
consist of user details such as username, userid, address,
contact number and medical details. The most common
type of tag is mounted on the inside of the vehicle's
windshield behind the rear-view mirror.
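The tag payload described above (user details plus medical details) could be modelled, purely for illustration, as a small record like the following; the field names are assumptions, not the authors' actual tag layout:

from dataclasses import dataclass, field

@dataclass
class TollTagRecord:
    """Hypothetical layout of the data associated with one RFID toll tag."""
    user_id: str                 # unique id assigned at registration
    username: str
    address: str
    contact_number: str
    vehicle_class: str           # used to pick the toll amount
    balance: float               # prepaid balance
    blood_group: str
    medical_notes: dict = field(default_factory=dict)   # e.g. {"diabetes": "no", "bp": "normal"}

# Example record as it might be stored against a tag id in the database.
record = TollTagRecord("U1001", "Asha", "Coimbatore", "9876543210",
                       vehicle_class="car", balance=500.0,
                       blood_group="B+", medical_notes={"diabetes": "no", "bp": "normal"})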
2.2 RFID READER
RFID reader is the device which is used to convert the
received radio signals of a particular frequency into the
digital form for the usage by the controller and PC. This
reader has on-chip power supply[5]. It incorporates
energy transfer circuit to supply the transponder.
3. ATC COMPONENTS
Automatic Tollgate Collection (ATC) is a technology that permits vehicles to pay highway tolls automatically using RFID. Automatic toll collection is a concept that is being readily accepted globally. The process is less time consuming. ATCs are an open system; toll stations are located along the facility, so a single trip may require payment at several toll stations. Each system has toll booths designated for ATC collection.
Automatic vehicle identification
These are electronic tags placed in vehicles which communicate with the reader. Automatic Vehicle Identification tags are electronically encoded with unique identification numbers. The user information is scanned for identification, and communication is carried out using these unique numbers. The system then classifies the type of vehicle, and the amount is deducted based on the vehicle classification.
4. MICROCONTROLLER PROGRAMMING
In this project the microcontroller plays a major role. Microcontrollers were originally used as components in complicated process-control systems. However, because of their small size and low price, microcontrollers are now also being used in regulators for individual control loops. In several areas microcontrollers are now outperforming their analog counterparts and are cheaper as well.
The purpose of this project work is to present control theory that is relevant to the analysis and design of a microcontroller system, with an emphasis on basic concepts and ideas. It is assumed that a microcontroller with reasonable software is available for computations and simulations, so that many tedious details can be left to the microcontroller. In this project we use the ATMEL 89C51 microcontroller, which consists of four ports, namely port0, port1, port2 and port3. The control system design is carried out up to the stage of implementation, in the form of controller programs in assembly language or in C.
5. VB PROGRAMMING
The Visual Basic communication application consists of four different parts: the part which communicates with the RFID hardware, the part which communicates with the database, the part which communicates with the microcontroller, and the part which enables the addition of new users.
The system was developed with the aim of indicating the registration number of a car as it passes, according to the RFID tag details taken from the database. The station attendant has a chance to see if there is any difference between the plate in the database, as displayed on the application window, and that on the car. It also displays the current account balance on the card from the database. There is an automatic deduction of the balance, which works according to an algorithm in the Visual Basic (VB) code. The deduction occurs with respect to the type of car which has passed. The system shows the status of the gate, i.e. whether it is closed or opened. This helps the station attendant switch to the emergency operation of opening and closing the boom if the RFID system fails.
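As an illustration only (this is not the authors' VB code), the deduction algorithm described above can be sketched in Python, with the tariff table and the database represented by plain dictionaries:

TOLL_BY_CLASS = {"car": 50.0, "van": 75.0, "truck": 120.0}   # assumed tariff table

# Toy in-memory stand-in for the Access database, keyed by RFID tag id.
DATABASE = {
    "TAG001": {"plate": "TN 37 AB 1234", "vehicle_class": "car", "balance": 500.0},
}

def process_vehicle(tag_id, db=DATABASE):
    """Look up the tag, deduct the toll for its vehicle class and decide the gate action.

    Returns a (gate_open, message) pair; the message stands in for what would be
    shown on the LCD or sent over GSM in the real system.
    """
    record = db.get(tag_id)
    if record is None:
        return False, "Unregistered tag - report violation"

    toll = TOLL_BY_CLASS[record["vehicle_class"]]
    if record["balance"] < toll:
        return False, f"Insufficient balance ({record['balance']:.2f}) - please recharge"

    record["balance"] -= toll                       # automatic deduction
    return True, f"{record['plate']}: deducted {toll:.2f}, balance {record['balance']:.2f}"

if __name__ == "__main__":
    opened, msg = process_vehicle("TAG001")
    print(opened, msg)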
For security, a login window for privacy and authentication was developed to reduce fraud, since only authorized users are held accountable for any losses incurred. Visual Basic is very good at designing smart and user-friendly windows, just like any other Microsoft Windows application [2]. This can be seen from the graphical user interface image shown in Figure 2.
Fig 2: Window for entering new users into the system
6. TRANSACTION RELATED OPERATIONS
Microsoft Access is a relational database management system from Microsoft that combines the relational Microsoft Jet Database Engine with a graphical user interface and software-development tools. It also has the ability to link to data in its existing location and use it for viewing, querying, editing, and reporting. This allows the existing data to change while ensuring that Access uses the latest data [4]. Here the transaction related operations such as insertion, deletion and updating are carried out. The transaction details are maintained by accessing the database, as each user has their own user id.
7. BLOCK DIAGRAM
Fig 3: Components of the system (block diagram)
GSM
Global System for Mobile communications. The main application of GSM in our project is to provide SMS support to inform the customer about the transaction.
LCD
Liquid Crystal Display. The main application of the LCD in our project is to display information about the transaction being carried out near the toll booth.
RS232
This port is used to connect the RFID module to the personal computer (PC); through it the PC is connected and information is shared.
DC MOTOR
This circuit is designed to control the motor in the forward and reverse directions. It consists of two relays, relay1 and relay2; the relay ON and OFF is controlled by a pair of switching transistors. In this project the DC motor is used to open the gate if the transaction is successful.
The following flowchart gives the process flow of ATC during the passing of a vehicle through the toll gate.
Fig.4 Flowchart for RFID Toll gate automation
8. PROPOSED SYSTEM
The main objective behind this proposal is to create a suitable ETC (electronic toll collection) system to be implemented in India. The term "suitable" here refers to minimal changes in the current infrastructure with maximum increase in efficiency. In India there are many toll gates, but they are operated manually. As India is one of the most heavily populated countries, the number of vehicles is obviously increasing, so manually operated toll gates lead to heavy traffic and time wasted waiting in queues. Our proposed system eliminates all these issues and ensures a safer journey.
Another main advantage of our system is the customer's medical details, which are stored in the RFID tag and used in case of emergencies. Here the customer's medical details such as blood group, blood pressure (BP), diabetes, etc. are stored using the unique customer id given during registration. During emergencies these details play a vital role, allowing doctors to take decisions easily and give treatment according to the reports stored in the tag.
The proposed system also considers the size issue. All the system requires is a tag the size of a sticker, which can be affixed on the windshield [4]. In this system, the tag used is capable of withstanding all kinds of weather, and is much more durable compared with the one used in the
existing system. The advantages of this proposed system are summarized as follows:
1. Higher efficiency in toll collection;
2. Cheaper cost;
3. Smaller size compared with the existing system;
4. Durable tags;
5. Medical details; and
6. Life saver.
Fig 5: Flowchart for medical details usage
9. CONCLUSION
The automatic toll gate system we have discussed has been implemented in various countries, though not in India. But it is essential in India, where the population is much higher and the condition of the roads is not very good. This creates long queues at the toll gates, inconveniencing a large part of the population, particularly during peak times. It also leads to many accidents, since people tend to go faster to avoid queues.
Hence the smart card, which also contains the medical details of the card holder, will definitely be a boon to the public in case of emergencies. Nowadays all toll gates are equipped with a mobile emergency medical assistance unit, and these details will be particularly useful if the person who needs attention has an additional longstanding ailment such as a heart problem or diabetes. The screen shots of our system are given below. In future this can be enhanced to give an instantaneous alert to the nearby medical unit.
The following screen shots show the vehicle identification and amount deduction from the card using the Automatic Toll Collecting System.
Fig. 6 Vehicle identification
Fig. 7 Vehicle classification and fee deduction
REFERENCES
[1] Khadijah Kamarulazizi, Dr. Widad Ismail (2011), "Electronic Toll Collection System Using Passive RFID Technology", Journal of Theoretical and Applied Information Technology.
[2] Lovemore Gonad, Lee Masuka, Reginald Gonge, "RFID Based Automatic Toll System (RATS)", presented at CIE42, 16-18 July 2012, Cape Town, South Africa.
[3] V. Sandhaya, A. Pravin, "Automatic Toll Gate Management and Vehicle Access Intelligent Control System Based on ARM7 Microcontroller", International Journal of Engineering Research Technology (IJERT), ISSN: 2278-0181, Vol. 1, Issue 5, July 2012.
[4] Janani SP, Meena S, "Atomized Toll Gate System Using Passive RFID Technology and GSM Technology", Journal of Computer Applications, ISSN: 0974-1925, Volume 5, Issue EICA 2012-13, Feb 10, 2012.
[5] Sagar Mupalla, G. Rajesh Chandra and N. Rajesh Babu, "Toll Gate Billing And Security of Vehicles using RFID", International Journal of Mathematics Trends and Technology, Volume 3, Issue 3, 2012.
[6] V. Sridhar, M. Nagendra, "Smart Card Based Toll Gate Automated System", International Journal of Advanced Research in Computer Science and Engineering.
[7] RFIDs in medical technology and latest trends, from the article by The Institute of Engineering and Technology.
[8] http://en.wikipedia.org/wiki/Radio-frequency identification
[9] www.btechzone.com
ADEPT IDENTIFICATION OF SIMILAR VIDEOS FOR WEB-BASED VIDEO SEARCH
1 Packialatha A., 2 Dr. Chandra Sekar A.
[1] Research scholar, Sathyabama University, Chennai-119.
[2] Professor, St. Joseph's College of Engineering, Chennai-119.
Abstract— Adept identification of similar videos is an important and consequential issue in content-based video retrieval. Video search is done on the basis of keywords mapped to the tags associated with each video, which does not produce the expected results. So we propose a video-based summarization to improve the video browsing process, with more relevant search results from a large database. In our method, a stable visual dictionary is built by clustering videos on the basis of the rate of disturbance caused in the pixellize by the object. An efficient two-layered index strategy, with background texture classification followed by disturbance rate disposition, is used as the core mapping methodology.
Key terms – Video summarization, Pixellize, Two-layered indexing, Mapping.
1. INTRODUCTION
Currently videos cover a large part of the data on online servers. Videos have become one of the most important bases for information, and their range of use in the teaching field is widely improving. More than all this, they provide great reach for people who expect online entertainment. We have a large number of websites dedicated especially to browsing and viewing videos. We type in keywords and retrieve videos, as simple as that. But not much importance is given to the criterion of 'relevancy' where videos are concerned. Since videos have great entertainment value, they reach almost all age groups of people who are online. In such a case, relevancy and faster retrieval become a major need, which has been ignored till now.
In all the video-oriented websites, searching is based on the keywords which we type in. Using the keywords, the engine will search for all the matching tags available in the videos. Each video will have many tags associated with it. Here tags refer to the concepts on which the video is based. A video will definitely contain a minimum of five tags. Most websites allow the user who uploads the video to specify their own tags, so the tags are completely independent of the website's vision. In other websites, the words in the name of the video specified by the user are used as the tag words. Neither of these methods deals with the actual content of the video; they just take words as the filtering criteria for a video-based search.
Thus the existing system shows the following flaws:
1. Browsing time is very high, since the results produced are vast.
2. Results are not relevant. Since the tag words may be generic, the database search is lengthy and time consuming.
3. There is no filtering of redundant videos.
Thus we propose a better method of filtration with the help of content-based video retrieval. Here we take into account the actual content of the video and not any words which are provided by the user over it.
2. RELATED WORK
There are many works related to our proposal which have adopted a similar objective with a different perspective. The initiative started way back in 2002, when video was getting more attention from online users. At that time, however, only a theoretical procedure could be proposed for structure analysis of the images of a video for better filtration and retrieval [4], and that proposal failed to explain its practical implementation.
Later, to overcome the difficulty of variation in dimension between videos, a proposal came up to match low with high dimensional videos through comparison, which contributed to the video comparison factor [7]. With all the advancements came the new idea of feature extraction for comparing the content of videos with the help of video signatures [6]. Even though this notion gave good similarity results, it is not practical to implement in a busy network like the internet because of its high complexity and time consumption. Since time matters, indexing was simplified with the help of vector-based mapping, which slices the videos [8] and uses pointers, and which performed well on its own. Later, dynamic binary tree generation [9] came into being to avoid storage problems; it saved storage space but consumed time.
A proposal very similar to ours, but more complicated in its implementation, came up which uses thresholds and colour histograms [10] to do content-based analysis; it has a large complexity which we have resolved. Later came a completely dedicated searching and retrieval method for MPEG-7 [5], which is not of much use nowadays. Personalized video searching with reusability depending on the user came up with high caches [3], which can be used for private use but not so much for public deployment. When queries become difficult to express, a proposal came up to implement an application-based technology combined with multi-touch exploitation, which would result in compelling the user to give entry to an external application inside their browser [2].
Finally, the base of our proposal was a content-based retrieval idea [1] which uses a complex B+ tree method to retrieve videos using symbolization, which is feasible except for its complexity. Here we try to keep the complexity level at a minimum, with highly responsive and relevant videos and limited time consumption.
3. SYSTEM DESCRIPTION
We propose an efficient CBVR (content-based video retrieval) system for identifying and retrieving similar videos from a very large video database. Here searching is based on an input video clip rather than a caption.
We store the videos in an efficient manner so that retrieval is easier and more relevant. This is mapped by two-level indexing, first segregated on the basis of background texture, followed by object pixellize disturbance.
There are three major modules which define the proposed system, as figure (1) shows. The first module, key frame generation, is the major part: the videos are divided into multiple images as keyframes, and then the actual background of the video is traced. The background key frame is used as the first level of filtration done in the database.
The second module includes mapping in the database, which is done by two levels of mapping techniques: background comparison, shown in figure (3), and object identification, which is done by comparing the rate of disturbances caused by the pixels, as shown in figure (4).
We further develop a number of query optimization methods, including effective query video summarization and frame sequence filtering strategies, to enhance the search. We have conducted an extensive performance study over a large video database. This background keyframe filtering method tends to produce a stable visual dictionary of manageable size, which is desirable for fast video retrieval.
Sequentially scanning all the summaries to identify similar videos from a large database is clearly undesirable, and efficient indexing of the video summaries is necessary to avoid unnecessary space access and costs. So we propose a most effective two-layered mapping, which reduces the browsing time and retrieves more relevant videos from a large database. Some of the advantages of the proposed system are: effective processing on a large video database; retrieval of more relevant video content; saving the time needed to check the whole video to know its contents; and mitigating the computational and storage cost by reducing redundancy.
Fig. 1 System architecture diagram
4. IMPLEMENTATION
The proposed idea can be implemented in a system using the following modules.
4.1 KEYFRAME GENERATION
This major module covers key frame generation. The initial steps include the following:
=> Break up the video into multiple images.
=> Map out the background key frame.
=> Plot the position of the object in the video by sorting out the disturbance.
4.1.1 BACKGROUND KEYFRAME GENERATION
Here we are going to trace the actual background of the video. This background key frame is used in the first level of filtration done in the database. We apply the following steps to trace the background of a video.
ALGORITHM
i. Initially the video is converted into multiple frames of pictures.
ii. Number the frames Ni, Ni+1, ..., Nn.
iii. Compare pixel k of successive frames: Ni[k] == Ni+1[k].
iv. If they are the same, store them in the key frame kfi[k].
v. More than one key frame may result if the video has highly different backgrounds.
vi. Continue the same process with the kfi to produce a single background key frame.
vii. Some pixels may not be filled; they can be computed from the surrounding pixels.
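A minimal sketch of this background keyframe step, assuming the frames are already available as equally sized grayscale numpy arrays (the tolerance and the median-based hole filling are illustrative assumptions for steps iii and vii):

import numpy as np

def background_keyframe(frames, tol=2.0):
    """Estimate a single background keyframe from a list of grayscale frames.

    A pixel is copied into the keyframe wherever consecutive frames agree
    (within `tol`); remaining holes are filled with the median over all frames,
    standing in for the 'compute from surrounding pixels' step.
    """
    stack = np.stack([f.astype(float) for f in frames])          # shape (n, H, W)
    keyframe = np.full(stack.shape[1:], np.nan)

    # Steps ii-iv: compare pixel k of frame Ni with frame Ni+1.
    for current, following in zip(stack[:-1], stack[1:]):
        agree = np.abs(current - following) <= tol
        unset = np.isnan(keyframe)
        keyframe[agree & unset] = current[agree & unset]

    # Step vii: fill the pixels that never agreed.
    holes = np.isnan(keyframe)
    keyframe[holes] = np.median(stack, axis=0)[holes]
    return keyframe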
Fig. 3 Background keyframe generation
4.1.2 OBJECT POSITION IDENTIFICATION
ALGORITHM
i. We now have kfi, the key frame of the video segment which has the same background.
ii. Compare pixel k of kfi, kfi[k], with the same pixel of the middle frame of that video segment.
iii. Fill the object key frame pixels with black when they match.
iv. Only a few pixels will not match.
v. Those positions are filled with the colour of the selected frame's corresponding pixels.
Fig. 4 Identification of the object position by filtering the pixels that do not match
4.2 MAPPING IN DATABASE
Database mapping is typically implemented by manipulating object comparison to retrieve relevant search videos. It includes the two levels of filtering used to find relevant videos in the database. Given a query video, first the backgrounds of each keyframe are mapped by looking up the visual dictionary in which the related videos have been stored. Then, the video segments (frames) containing these backgrounds are retrieved, and the potentially similar video results are displayed according to their matching rate.
The second level is identifying the object position, which compares the pixels of two keyframes; we assume that they are matched if they are similar, and unmatched otherwise. However, since the neighbouring clusters in multidimensional space may overlap with each other, two similar subdescriptors falling into the overlapping part of the clusters may be represented as different pixel matchings and thus misjudged or misplaced as dissimilar ones. These dissimilar pixels are considered to be the disturbance rate, called the objects. Accordingly, keyframes matched to pixels in overlapping clusters may be considered as unmatched, which degrades the accuracy of pixel sequence matching. Therefore, the unmatched pixels are considered to be the rate of disturbance, called the error rate. With this error rate we can perform the second level of filtration, that is, object identification. As a result, retrieval will be easier and more effective with our two-layered filtration, as shown in figures 5 and 6.
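The matching rate and error (disturbance) rate described here could be computed, in the simplest possible reading of the scheme, as the fraction of matching versus non-matching pixels between a query keyframe and a candidate keyframe; the tolerance value below is an assumption:

import numpy as np

def disturbance_rate(query_keyframe, candidate_keyframe, tol=2.0):
    """Return (matching_rate, error_rate) between two equally sized keyframes.

    Pixels whose intensities differ by more than `tol` are the 'disturbance'
    (unmatched pixels); the candidate with the lowest error rate is considered
    the most similar video segment.
    """
    diff = np.abs(query_keyframe.astype(float) - candidate_keyframe.astype(float))
    matching_rate = (diff <= tol).mean()
    return matching_rate, 1.0 - matching_rate

def rank_candidates(query_keyframe, candidates):
    """Rank candidate keyframes (dict of name -> array) by increasing error rate."""
    scores = {name: disturbance_rate(query_keyframe, kf)[1] for name, kf in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])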
4.3 RETRIEVING RELEVANT VIDEOS
To retrieve similar videos more efficiently and effectively, several key issues need to be noted. First, a video has to be represented compactly and informatively. This issue is important, as a video typically contains a number of keyframes, and the similarity between two videos is usually measured by finding the closest match, or the minimum error rate, for every single keyframe of them. Thus, searching large databases over raw video data is computationally expensive. The second issue is how to measure the similarity between videos based on their pixel matching rate. To overcome this, we use the most effective and efficient two-layered filtration, the first layer being background keyframe generation and the second object identification.
Therefore, the user can select any retrieved video and play back the video clip. Figure 7 shows a sample retrieval result. The retrieval results will be even better when the backgrounds are masked out. On the other hand, if the background becomes very clumsy or its area increases, the results will degrade gradually.
Fig.5 Sample background keyframes mapping
with query keyframes.
Fig.6 Mapping by Object disposition.
The current video search engines, however, are based on lexicons of semantic concepts and perform tag-based queries. These systems are generally desktop applications or have simple web interfaces that show the results of the query as a ranked list of keyframes. For each result of the query, the first or most similar frames of the video clip are shown. These frames are obtained from the video streaming database and are shown within a small video player. Users can then play the video sequence and, if interested, zoom into each result, displaying it in a larger player that shows more details and allows better video detection. The extended video player also allows searching for visually similar video clips.
At the bottom of the result list are the concepts which are related to the video results. By selecting one or more of these concepts, the video clips returned are filtered in order to improve the information retrieval process. The user can select any video element from the results list and play it as needed. This action can be repeated for other videos, returned by the same or other queries. Videos from the list can be moved along the screen, resized or played. Therefore the overall retrieval process is simple and effective, and gives results faster.
Fig. 7 Sample example of a retrieval result
5. CONCLUSION
In this paper, we discussed our proposal for video search engines and the related issues. It extracts various video metadata from a video query on a large database and displays the most relevant videos on the webpage. The paper also deals with the identification and extraction of keyframes and pixellate matching, followed by the video retrieval. We then presented an effective disturbance rate disposition to measure the similarity between two video clips by taking into account their visual features and sequence contexts. Finally, we introduced a two-tier indexing scheme, which outperforms the existing solutions in terms of efficiency and effectiveness.
6. REFERENCES
[1] Xiangmin Zhou, Xiaofang Zhou, Lei Chen, Yanfeng Shu, A. Bouguettaya, A. Taylor, CSIRO ICT Centre, Canberra, ACT, Australia, "Adaptive subspace symbolization for content based video detection", IEEE, Vol. 22, No. 10, 2010.
[2] Marco Bertini, Alberto Del Bimbo, Andrea Feracani, "Interactive Video Search and Browsing System", CMBI 2011.
[3] Victor Valdes, Jose M. Martinez, "Efficient Video Summarization and Retrieval", CMBI 2011.
[4] Ng Chung Wing, "ADVISE: Advanced Digital Video Information Segmentation Engine", IEEE 2002.
[5] Quan Zheng, Zhiwei Zhou, "An MPEG-7 Compatible Video Retrieval System with Support for Semantic Queries", IEEE Transactions, 2011.
[6] Sen-Ching S. Cheung and Avideh Zakhor, "Fast Similarity Search and Clustering of Video Sequences on the World Wide Web", Vol. 7, No. 3, IEEE 2005.
[7] B. Cui, B. C. Ooi, J. Su, and K. L. Tan, "Indexing High Dimensional Data for Efficient Similarity Search", IEEE 2005.
[8] H. Lu, B. C. Ooi, H. T. Shen, and X. Xue, "Hierarchical indexing structure for efficient similarity search in video retrieval", 18:1544-1559, IEEE 2006.
[9] V. Valdes, J. M. Martinez, "Binary tree based on-line video summarization", in TVS '08, pages 134-138, New York, NY, USA, 2008, ACM.
[10] Sathishkumar L. Varma, Sanjay N. Talbar, "Dynamic Threshold in Clip Analysis and Retrieval", IEEE 2011.
PREDICTING BREAST CANCER SURVIVABILITY USING NAÏVE BAYESIAN CLASSIFIER AND C4.5 ALGORITHM
R. K. Kavitha 1, Dr. D. Dorairangasamy 2
1 Research Scholar, Vinayaka Mission University, Salem, Tamil Nadu
2 HOD, Computer Science Engineering, AVIT, Vinayaka Missions University, Chennai, Tamil Nadu
1 [email protected], 2 [email protected]
Abstract-This paper analyses the performance of the Naïve Bayesian classifier and the C4.5 algorithm in predicting the survival rate of breast cancer patients. The data set used for the analysis is the SEER data set, which is a pre-classified data set. These techniques help the physician to take decisions on the prognosis of breast cancer patients. At the end of the analysis, C4.5 shows better performance than the Naïve Bayesian classifier.
Keywords: SEER, Breast cancer, C4.5
I INTRODUCTION
Breast cancer has become one of the most hazardous types of cancer among women in the world, and its occurrence is increasing globally. Breast cancer begins in the cells of the lobules or the ducts. About 5-10% of cancers are due to an abnormality inherited from the parents, and about 90% of breast cancers are due to genetic abnormalities that happen as a result of the aging process. According to the statistical reports of the WHO, breast cancer is the number one form of cancer among women.
In the United States, approximately one in eight women has a risk of developing breast cancer. An analysis of the most recent data has shown that the survival rate is 88% after 5 years of diagnosis and 80% after 10 years of diagnosis. Hence, it can be seen from the study that an early diagnosis improves the survival rate. In 2007, it was reported that 202,964 women in the United States were diagnosed with breast cancer and 40,598 women in the United States died because of breast cancer. A comparison of breast cancer in India with the US, obtained from Globocon data, shows that the incidence of cancer is 1 in 30. However, the actual numbers of cases reported in 2008 were comparable: about 1,82,000 breast cancer cases in the US and 1,15,000 in India.
A study at the Cancer Institute, Chennai, shows that breast cancer is the second most common cancer among women in Madras and southern India, after cervical cancer. Early detection of breast cancer is essential in reducing loss of life. However, earlier treatment requires the ability to detect breast cancer in its early stages. Early diagnosis requires an accurate and reliable diagnostic procedure that allows physicians to distinguish benign breast tumors from malignant ones without going for surgical biopsy.
With the use of computers and automated tools, large volumes of medical data are being collected and made available to medical research groups. As a result, data mining techniques have become a popular research tool for medical researchers to identify and exploit patterns and relationships among large numbers of variables, enabling them to predict the outcome of a disease using historical datasets. Data mining is the process of extracting hidden knowledge from large volumes of raw data. It has been defined as "the nontrivial extraction of previously unknown, implicit and potentially useful information from data" and as "the science of extracting information from large databases". Data mining is used to discover knowledge from data and to present it in a form that is easily understood by humans. The objective of this paper is to analyse the performance of the Naïve Bayesian classifier and the C4.5 algorithm.
II. RELATED WORK
Jyoti Soni et al. [8] evaluated three different supervised machine learning algorithms: Naïve Bayes, K-NN and the Decision List algorithm. These algorithms were used for analyzing a heart disease dataset, with the Tanagra data mining tool used for classification. The classified data were evaluated using 10-fold cross validation and the results were compared.
Santi Wulan Purnami et al. [7], in their research work, used support vector machines for feature selection and classification of breast cancer. They emphasized how the 1-norm SVM can be used for feature selection and smooth SVM (SSVM) for classification. The Wisconsin breast cancer dataset was used for breast cancer diagnosis.
Delen et al. [6], in their work, developed models for predicting the survivability of diagnosed cases using the SEER breast cancer dataset. Two algorithms, artificial neural network (ANN) and the C5 decision tree, were used to develop prediction models; the diagnosis was carried out based on nine chosen attributes. C5 gave an accuracy of 93.6% while ANN gave an accuracy of 91.2%.
Bellachia et al. [2] used the SEER data to compare three prediction models for detecting breast cancer. They reported that the C4.5 algorithm gave the best performance, with 86.7% accuracy.
Endo et al. [3] implemented common machine learning algorithms to predict the survival rate of breast cancer patients. Their study is based upon data from the SEER program with a high rate of positive examples (18.5%). Logistic regression had the highest accuracy, the artificial neural network showed the highest specificity and the J48 decision tree model had the best sensitivity.
III DATA CLEANING AND PREPARATION
Before the dataset is used, it needs to be properly preprocessed and a complete relevancy analysis needs to be carried out. Preprocessing functions include replacing missing values, normalizing numeric attributes and converting discrete attributes to the nominal type. Feature selection involves selecting the attributes that are most relevant to the classification problem. The method used for relevancy analysis is the information gain ranker.
Data pre-processing was applied to SEER data
to prepare the raw data. Pre-processing is an important
step that is used to transform the raw data into a format
that makes it possible to apply data mining techniques
and also to improve the quality of data. It can be noted
from the related work, that attribute selection plays an
important role in identifying parameters that are
important and significant for proper breast cancer
diagnosis. It was also found that the prediction quality
was retained even with a small number of non-redundant
attributes. As a first step, non cancer related parameters;
also termed as socio demographic parameters were
identified and removed. For example, parameters relating
to race, ethnicity etc. was discarded.
The number of attributes removed in this
process was 18 and the total number of attributes was
reduced from 124 to 106. Next the attributes having
missing values in more than 60% of the records were
discarded. For example, the parameter EOD TUMOR
SIZE had no values in all the records. 34 attributes were
removed in this way and the number of attributes became
72. Then, attributes which were duplicated, that is
contained the same values, were overridden or re-coded
were discarded. Finally, the records which had missing
values in any of these 5 attributes were discarded. Hence,
out of the total 1403 records, 1153 records without
missing values were selected for further processing.
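As an illustration of the cleaning steps just described, the following Python sketch (using pandas) drops a hypothetical list of socio-demographic columns, removes attributes with more than 60% missing values, and then removes records that still contain missing values. The file name and column names are assumptions for illustration, not the actual SEER field codes.

import pandas as pd

# Hypothetical file and column names; the real SEER export uses its own field codes.
df = pd.read_csv("seer_breast_cancer.csv")

# 1) Drop socio-demographic attributes (assumed column names for illustration).
socio_demographic = ["race", "ethnicity", "marital_status"]
df = df.drop(columns=[c for c in socio_demographic if c in df.columns])

# 2) Drop attributes with missing values in more than 60% of the records.
missing_ratio = df.isna().mean()
df = df.loc[:, missing_ratio <= 0.60]

# 3) Drop records that still have missing values in the retained attributes.
df = df.dropna()

print(df.shape)  # after cleaning, a reduced table such as the 1153 records reported above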
TABLE 1: SEER ATTRIBUTES AFTER PREPROCESSING

S.NO.  ATTRIBUTE         DOMAIN
1      Age               20-80
2      Clump thickness   1-10
3      Menopause         35-55
4      Tumour Size       1-10
5      CS Extension      1-10
IV CLASSIFICATION METHOD
NAÏVE BAYESIAN CLASSIFIER
The Naive Bayes classifier is a quick method for the creation of statistical predictive models. It is based on Bayes' theorem and is commonly used to solve prediction problems because of its ease of implementation and usage. This classification technique analyses the relationship between each attribute and the class for each instance to derive a conditional probability for the relationship between the attribute values and the class.
During training, the probability of each class is computed by counting how many times it occurs in the training dataset. This is called the "prior probability" P(C = c). In addition to the prior probability, the algorithm also computes the probability of the instance x given c, under the assumption that the attributes are independent. This probability becomes the product of the probabilities of each single attribute, and the probabilities can be estimated from the frequencies of the instances in the training set.
P(Ci | X) > P(Cj | X) for 1 ≤ j ≤ m, j ≠ i    (1)

To maximize P(Ci | X), Bayes' rule is applied as stated in Eq. (2):

P(Ci | X) = P(X | Ci) P(Ci) / P(X)    (2)

P(X) is constant for all classes, and P(Ci) is calculated as in Eq. (3):

P(Ci) = (number of training samples in class Ci) / (total number of training samples)    (3)

To evaluate P(X | Ci), the naïve assumption of class conditional independence is used as in Eq. (4):

P(X | Ci) = Π_{k=1..n} P(xk | Ci)    (4)

The given sample X is assigned to the class Ci for which P(X | Ci) P(Ci) is the maximum.
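A minimal Python sketch of the classifier described by Eqs. (1)-(4), written for categorical attributes: it estimates P(Ci) and P(xk | Ci) by counting frequencies in the training set and assigns a sample to the class with the largest P(X | Ci)P(Ci). The tiny data layout is an assumption for illustration, and no smoothing is applied, unlike most practical implementations.

from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    # Prior P(Ci): class frequency in the training set, Eq. (3).
    class_counts = Counter(labels)
    n = len(labels)
    priors = {c: class_counts[c] / n for c in class_counts}
    # Conditional counts for P(xk | Ci): one frequency table per (class, attribute index).
    cond = defaultdict(Counter)
    for x, c in zip(samples, labels):
        for k, value in enumerate(x):
            cond[(c, k)][value] += 1
    return priors, cond, class_counts

def classify(x, priors, cond, class_counts):
    best_class, best_score = None, -1.0
    for c, prior in priors.items():
        # P(X | Ci) = product over attributes of P(xk | Ci), Eq. (4); no Laplace smoothing here.
        likelihood = 1.0
        for k, value in enumerate(x):
            likelihood *= cond[(c, k)][value] / class_counts[c]
        score = likelihood * prior  # numerator of Eq. (2); P(X) is constant across classes
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Tiny illustrative example (attribute values are arbitrary categories).
X_train = [("low", "yes"), ("high", "no"), ("low", "no"), ("high", "yes")]
y_train = ["survived", "not", "survived", "not"]
model = train_naive_bayes(X_train, y_train)
print(classify(("low", "yes"), *model))  # prints 'survived'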
C4.5 DECISION TREE ALGORITHM
A C4.5 decision tree is a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of the test, and the leaf nodes represent classes or class distributions. In order to classify an unknown sample, the attribute values of the sample are tested against the decision tree. Decision trees can be easily converted into classification rules. Decision tree induction is based on a greedy algorithm which constructs the tree in a top-down, recursive, divide-and-conquer manner.
The C4.5 decision tree algorithm is a software extension of the basic ID3 algorithm designed by Quinlan. It recursively visits each decision node, selecting the optimal split, and the process continues until no further split is possible. The algorithm uses the concept of information gain, or entropy reduction, to select the optimal split. Information gain is the increase in information produced by partitioning the training data according to the candidate split.
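The following Python sketch illustrates the information-gain computation used to rank candidate splits: the entropy of the class labels minus the weighted entropy after partitioning on a candidate attribute. It is a generic illustration, not the full C4.5 algorithm (which additionally uses the gain ratio, handles continuous attributes, and applies post-pruning).

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(attribute_values, labels):
    # Entropy before the split minus the weighted entropy of each partition.
    n = len(labels)
    partitions = {}
    for v, y in zip(attribute_values, labels):
        partitions.setdefault(v, []).append(y)
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

# Example: evaluating a split on a hypothetical binary attribute.
attr   = ["small", "small", "large", "large", "large"]
labels = ["benign", "benign", "malignant", "malignant", "benign"]
print(round(information_gain(attr, labels), 3))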
TABLE 2: CONFUSION MATRIX

ACTUAL      PREDICTED POSITIVE   PREDICTED NEGATIVE
Positive    TP                   FN
Negative    FP                   TN
The C4.5 algorithm chooses the split with the highest information gain as the optimal split; the information gain measure is used to select the best test attribute at each node in the tree. To avoid the over-fitting problem, C4.5 uses post-pruning, which increases the accuracy of the classification. Its refinements include avoidance of over-fitting the data, reduced-error pruning, rule post-pruning, handling continuous attributes and handling data with missing attribute values. In the training phase, we used training data with known results, and the C4.5 algorithm was applied to obtain the rule set. In the testing phase, the classification rules obtained were applied to the whole pre-processed data. The results obtained are analysed below.
TABLE 3: PERFORMANCE ON TEST DATA

METHOD           Acc     Sens    Spec    Err. rate   Time
Naïve Bayesian   95.79   0.966   0.972   3.21        0.09
C4.5             97.9    0.988   0.962   2.41        0.05
V CLASSIFIER EVALUATION
The experiments are done using WEKA. WEKA is an ensemble of tools for data classification, regression, clustering, association rules and visualization. WEKA version 3.6.9 was utilized as the data mining tool to evaluate the performance and effectiveness of the breast cancer prediction models built from the two techniques, because the WEKA program offers a well-defined framework for experimenters and developers to build and evaluate their models.
The performance of a chosen classifier is validated based on its error rate and computation time. The classification accuracy is reported in terms of sensitivity and specificity, and the computation time of each classifier is also taken into account. The evaluation parameters are therefore specificity, sensitivity and overall accuracy. The sensitivity, or true positive rate (TPR), is defined by TP / (TP + FN); the specificity, or true negative rate (TNR), is defined by TN / (TN + FP); and the accuracy is defined by (TP + TN) / (TP + FP + TN + FN), where:
True positive (TP) = number of positive samples correctly
predicted.
False negative (FN) = number of positive samples
wrongly predicted.
False positive (FP) = number of negative samples
wrongly predicted as positive.
True negative (TN) = number of negative samples
correctly predicted.
These values are often displayed in a confusion matrix, as presented in Table 2. The classification matrix displays the frequency of correct and incorrect predictions; it compares the actual values in the test dataset with the values predicted by the trained model.
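The evaluation measures above follow directly from the confusion-matrix counts; a small Python helper such as the one below makes the definitions concrete (the counts in the example call are hypothetical, chosen only for illustration).

def evaluate(tp, fn, fp, tn):
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy    = (tp + tn) / (tp + fp + tn + fn)
    error_rate  = 1.0 - accuracy
    return sensitivity, specificity, accuracy, error_rate

# Hypothetical counts for illustration only.
print(evaluate(tp=540, fn=20, fp=15, tn=578))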
VI CONCLUSION
In this paper, the performance of the Naïve Bayesian classifier and the C4.5 algorithm on the SEER data set for predicting the survivability of breast cancer is analysed. C4.5 shows a higher level of performance than the other classifier. Therefore, the C4.5 decision tree is suggested for classification-based prediction of breast cancer survivability, as it gives better accuracy, a lower error rate and better overall performance.
REFERENCES
[1] American Cancer Society, Breast Cancer Facts & Figures 2005-2006. Atlanta: American Cancer Society, Inc. (http://www.cancer.org/).
[2] A. Bellachia and E. Guvan, "Predicting breast cancer survivability using data mining techniques," Scientific Data Mining Workshop, in conjunction with the 2006 SIAM Conference on Data Mining, 2006.
[3] A. Endo, T. Shibata and H. Tanaka, "Comparison of seven algorithms to predict breast cancer survival," Biomedical Soft Computing and Human Sciences, vol. 13, pp. 11-16, 2008.
[4] Breast Cancer Wisconsin Data [online]. Available: http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data.
[5] H. Brenner, "Long-term survival rates of cancer patients achieved by the end of the 20th century: a period analysis," Lancet, 360:1131-1135, 2002.
[6] D. Delen, G. Walker and A. Kadam, "Predicting breast cancer survivability: a comparison of three data mining methods," Artificial Intelligence in Medicine, 2005.
[7] Santi Wulan Purnami, S. P. Rahayu and Abdullah Embong, "Feature selection and classification of breast cancer diagnosis based on support vector machine," IEEE, 2008.
[8] Jyoti Soni, Ujma Ansari, Dipesh Sharma and Sunita Soni, "Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction," IJCSE, vol. 3, no. 6, June 2011.
VIDEO SUMMARIZATION USING COLOR FEATURES AND
GLOBAL THRESHOLDING
1 Nishant Kumar, 2 Amit Phadikar
1 Department of Computer Science & Engineering, 2 Department of Information Technology
MCKV Institute of Engineering, Liluah, Howrah, India
1 [email protected], 2 [email protected]
Abstract— Compact representations of video data can enable efficient video browsing. Such representations provide the user with information about the content of the particular sequence being examined. Most methods for video summarization rely on complicated clustering algorithms that make them too computationally complex for real-time applications. This paper presents an efficient approach for video summary generation that does not rely on complex clustering algorithms and does not require the summary length as a parameter. Our method combines a color feature with global thresholding to detect key frames. For each shot a key frame is extracted, and similar key frames are eliminated in a simple manner.
Index Terms — Video Summarization, YCbCr Color Space,
Color Histogram.
I. INTRODUCTION
The enormous popularity of Internet video repository sites like YouTube or Yahoo Video has caused an increasing amount of video content to become available over the Internet. In
such a scenario, it is necessary to have efficient tools that
allow fast video browsing. These tools should provide
concise representation of the video content as a sequence
of still or moving pictures - i.e. video summary. There are
two main categories of video summarization [1]: static
video summary and dynamic video skimming. In static
video summary methods, a number of representative
frames, often called keyframes, are selected from the
source video sequence and are presented to the user.
Dynamic video summary methods generate a new, much
shorter video sequence from the source video. Since static
video summaries are the most common technique used in
practical video browsing applications, we focused our
research on static video summarization. Most of the
existing work on static video summarization is performed
by clustering similar frames and selecting representatives
per clusters [2-6]. A variety of clustering algorithms were
applied such as: Delaunay Triangulation [2], k-medoids
[3], k-means [4], Furthest Point First [5] and [6] etc.
Although they produce acceptable visual quality, most of these methods rely on complicated clustering algorithms applied directly to features extracted from sampled frames, which makes them too computationally complex for real-time applications. Another restriction of these approaches is that they require the number of clusters, i.e. representative frames, to be set a priori.
The contribution of this paper is to propose a fast and effective approach for video summary generation that does not rely on complicated clustering algorithms and does not require the length of the summary (the number of summary frames) as a parameter. The proposed model is based upon a color histogram in the YCbCr color space.
The rest of the paper is organized as follows. Section II discusses the color space, Section III presents the proposed work, performance evaluation is discussed in Section IV and, finally, Section V concludes the paper.
II. COLOR SPACE
RGB Color Space: This is an additive color system based on the tri-chromatic theory, often found in systems that use a CRT (Cathode Ray Tube) to display images. It is device dependent and the specification of colors is only semi-intuitive. RGB (Red, Green & Blue) is very common, being used in virtually every computer system as well as in television, etc.
YCbCr Color Space: The difference between YCbCr
and RGB is that YCbCr represents color as brightness and
two color difference signals, while RGB represents color
as red, green and blue. In YCbCr, the Y is the brightness
(luma), Cb is blue minus luma (B-Y) and Cr is red minus
luma (R-Y). This color space exploits the properties of the
human eye. The eye is more sensitive to light intensity
changes and less sensitive to hue changes. When the
amount of information is to be minimized, the intensity
component can be stored with higher accuracy than the
Cb and Cr components. The JPEG (Joint Photographic Experts Group) file format makes use of this color space to throw away unimportant information [7]. RGB images can be converted to the YCbCr color space using the conversion process given in matrix form in Eq. (1), where the Y component is luminance, Cb is blue chromaticity and Cr is red chromaticity.
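The conversion matrix referred to as Eq. (1) is not reproduced in this transcription; the Python sketch below applies the commonly used ITU-R BT.601 full-range RGB-to-YCbCr transform, which is assumed here to be the conversion the authors intend.

import numpy as np

def rgb_to_ycbcr(rgb):
    # Convert an HxWx3 uint8 RGB image to YCbCr (BT.601 full-range; an assumption).
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return np.stack([y, cb, cr], axis=-1)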
III. OVERVIEW OF THE PROPOSED METHOD
We propose an approach which is based on several efficient video processing procedures. At first, video frames are sampled in order to reduce the further computational burden. Then, a color feature is extracted from the pre-sampled video frames and the Euclidean distance measure is used to measure the similarity between the frames. These features are used for key frame detection with a threshold approach: based on the preset threshold, a key frame is said to be detected at places where the frame difference is maximal and larger than the global threshold. Then, a representative key frame is extracted and similar key frames are eliminated in a simple manner. As a final result, the most informative key frames are selected as the video summary. In the rest of this section, a detailed description of every step of the method is presented.
STEP 1: Frame Sampling: The video is sampled at 24 frames per second. This sampling may contain redundant frames.

STEP 2: Frame Feature Extraction: Frame feature extraction is a crucial part of a key frame extraction algorithm and directly affects the performance of the algorithm. Several methods for retrieving images on the basis of color features have been described in the literature. The color feature is easy and simple to compute, and the color histogram is one of the most commonly used color feature representations in image retrieval, as it is invariant to scaling and rotation. The color histogram of each frame is calculated in the Y (Luminance), Cb (Chrominance of blue) and Cr (Chrominance of red) color space. Color histograms are very effective for color-based image analysis and are especially important for the classification of images based on color.

STEP 3: Dissimilarity Measure: The next important step is the dissimilarity measure, which plays an important role in the system. It compares the feature vector of a frame with the feature vector of the previous frame and calculates the distance between them. The dissimilarity between frames fi and fi+1 is computed as the Euclidean distance between their feature vectors Fi and Fi+1, which contain the components of the Y, Cb and Cr channels of the frames:

D(fi, fi+1) = sqrt( Σk (Fi(k) - Fi+1(k))^2 )    (2)

Frames at a high distance from their predecessor are tagged as candidate key frames.

STEP 4: Threshold Selection: The problem of choosing an appropriate threshold is a key issue in key frame algorithms. Here, we have chosen a global threshold, calculated as the average value of the distances over all frames.

Fig. 1. Threshold value on frame difference (video 2).

Figure 1 shows key frames being detected: the bars that cross the threshold value are selected as key frames. For example, frame numbers such as 9, 154, 176, 195 and 257 have been selected as key frames because they crossed the threshold value in the experimental data set (video 2).

STEP 5: Detection of Key Frames: The proposed model is based on the color histogram. Given a video which contains many frames, the color histogram for each frame is computed and the Euclidean distance measure is used to measure the dissimilarities between the frames. Based on the predefined threshold, a key frame is said to be detected if the dissimilarity between the frames is higher than the threshold value.
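A compact Python sketch of Steps 1-5, assuming the frames are already sampled and available as YCbCr arrays: it builds a per-frame color histogram, computes the Euclidean distance between consecutive histograms (Eq. (2)), sets the global threshold to the average of those distances as described in Step 4, and keeps the frames whose distance exceeds the threshold. The histogram bin count and the use of the plain average as the threshold are assumptions about details the paper does not spell out.

import numpy as np

def frame_histogram(ycbcr_frame, bins=16):
    # Concatenate per-channel histograms of the Y, Cb and Cr planes (Step 2).
    hists = [np.histogram(ycbcr_frame[..., ch], bins=bins, range=(0, 256),
                          density=True)[0] for ch in range(3)]
    return np.concatenate(hists)

def detect_key_frames(ycbcr_frames):
    feats = [frame_histogram(f) for f in ycbcr_frames]
    # Step 3: Euclidean distance between consecutive feature vectors, Eq. (2).
    dists = [np.linalg.norm(feats[i + 1] - feats[i]) for i in range(len(feats) - 1)]
    threshold = np.mean(dists)   # Step 4: global threshold from the average distance
    # Step 5: frame i+1 is a key frame when its distance to frame i exceeds the threshold.
    return [i + 1 for i, d in enumerate(dists) if d > threshold]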
IV. PERFORMANCE EVALUATION
This section presents the results of the experiments conducted to corroborate the success of the proposed model. The experiments are conducted on a set of YouTube videos and The Open Video Project videos. The performance of the proposed model is evaluated using precision and recall as evaluation metrics. Precision is defined as the ratio of the number of correctly detected key frames to the sum of correctly detected and falsely detected key frames of a video, and recall is defined as the ratio of the number of correctly detected key frames to the sum of detected and undetected key frames. These parameters were obtained for the proposed model on three different video samples.
Fig. 3. Preview of generated summaries of test videos: (a) Wildlife, (b) New Horizon 1 and (c) New Horizon 2.
TABLE 1: KEY FRAME DETECTION PERFORMANCE OF THE PROPOSED WORK.

Video     Size      No. of frames tested   Precision   Recall
Video 1   7.89 MB   901                    95.72%      80.00%
Video 2   8.81 MB   1813                   89.90%      87.10%
Video 3   8.73 MB   1795                   91.40%      91.80%
Fig. 2. Plot of frame dissimilarity for video 1, with the global threshold indicated.
The results for three test videos randomly selected from YouTube and The Open Video Project are presented in Table 1. Figure 2 is the plot of the frame dissimilarity of video 1; the frame numbers that have crossed the threshold value have been selected as key frames. Figure 3 presents the results of our method: previews of the generated summaries of the test videos Video 1, Video 2 and Video 3, respectively.
The precision and recall comparisons between our method and Angadi et al. [9] are shown in Table 2. It is found that our method offers results very similar to those of S. A. Angadi and Vilas Naik [9]. Moreover, it is to be noted that the method proposed by S. A. Angadi and Vilas Naik [9] has high computational complexity, as that scheme uses color moments, whereas our scheme has low computational complexity as it uses a simple color histogram.
TABLE 2: PRECISION AND RECALL COMPARISONS.

Method                              Precision   Recall
S. A. Angadi and Vilas Naik [9]     90.66%      95.23%
Proposed                            92.34%      91.80%
V. CONCLUSION
In this paper, we proposed an efficient method for video summary generation. The color histogram computed for each image in the YCbCr color space is used to find the difference between two frames in a video. The difference between consecutive frames, used to detect similarity or dissimilarity, is computed as the Euclidean distance between feature vectors containing the Y (Luminance), Cb (Chrominance of blue) and Cr (Chrominance of red) values of the frames. The key frames are detected wherever the difference value is more than the predefined threshold. Experimental results on standard YouTube videos and on The Open Video Project videos reveal that the proposed model is robust and generates video summaries efficiently.
Future work will focus on further performance improvement of the proposed scheme by selecting an adaptive threshold based on a genetic algorithm (GA) and by combining motion, edge and color features to increase the efficiency of key frame detection.
REFERENCES
[1] B. T. Truong and S. Venkatesh, "Video abstraction: A systematic review and classification," ACM Transactions on Multimedia Computing, Communications and Applications, vol. 3, pp. 1-37, 2007.
[2] P. Mundur, Y. Rao and Y. Yesha, "Keyframe-based video summarization using Delaunay clustering," International Journal on Digital Libraries, vol. 6, pp. 219-232, 2006.
[3] Y. Hadi, F. Essannouni and R. O. H. Thami, "Video summarization by k-medoid clustering," in Proceedings of the ACM Symposium on Applied Computing, New York, pp. 1400-1401, 2006.
[4] S. E. F. De Avila, A. P. B. Lopes, A. Luz and A. Albuquerque Araújo, "VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method," Pattern Recognition Letters, vol. 32, pp. 56-68, 2011.
[5] M. Furini, F. Geraci, M. Montangero and M. Pellegrini, "VISTO: Visual storyboard for web video browsing," in Proceedings of the ACM International Conference on Image and Video Retrieval, pp. 635-642, 2007.
[6] M. Furini, F. Geraci, M. Montangero and M. Pellegrini, "STIMO: Still and moving video storyboard for the web scenario," Multimedia Tools and Applications, pp. 47-69, 2007.
[7] H. B. Kekre, S. D. Thepade and R. Chaturvedi, "Walsh, Sine, Haar & Cosine Transform with Various Color Spaces for 'Color to Gray and Back'," International Journal of Image Processing, vol. 6, pp. 349-356, 2012.
[8] S. Cvetkovic, M. Jelenkovic and S. V. Nikolic, "Video summarization using color features and efficient adaptive threshold technique," Przegląd Elektrotechniczny, R. 89, NR 2a, pp. 274-250, 2013.
[9] S. A. Angadi and Vilas Naik, "A shot boundary detection technique based on local color moments in YCbCr color space," Computer Science and Information Technology, vol. 2, pp. 57-65, 2012.
ROLE OF BIG DATA ANALYTIC IN HEALTHCARE USING DATA
MINING
1 K.Sharmila, 2 R.Bhuvana
Asst. Prof. & Research Scholar, Dept. of BCA & IT, Vels University, Chennai, India
1 [email protected], 2 [email protected]
Abstract— This paper describes the promising field of big data analytics in healthcare using data mining techniques. Big data concerns large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage and data collection capacity, big data is now rapidly expanding for healthcare researchers and practitioners. In healthcare, data mining is becoming progressively more popular, if not increasingly essential. The healthcare industry today generates large amounts of complex data about patients, hospital resources, disease diagnoses, electronic patient records, medical devices, etc. This large amount of data is a key resource to be processed and analyzed for knowledge extraction that enables support for cost savings and decision making. Data mining provides a set of tools and techniques that can be applied to this processed data to discover hidden patterns, and it also provides healthcare professionals an additional source of knowledge for making decisions. In the past few years, data collection related to the medical field has seen a massive increase, referred to as big data. These massive datasets bring challenges in storage, processing and analysis. In the healthcare industry, big data is expected to play an important role in predicting patient symptoms and the hazards of disease occurrence or recurrence, and in improving primary-care quality.
Index Terms— Healthcare, Big data, Analytics, Hadoop.
I. INTRODUCTION
Big data refers to very large datasets with complex structures that are difficult to process using traditional methods and tools. The term "process" here includes capture, storage, formatting, extraction, curation, integration, analysis and visualization. A popular definition of big data is the "3V" model proposed by Gartner, which attributes three fundamental features to big data: a high volume of data mass, a high velocity of data flow and a high variety of data types. Big data in healthcare refers to electronic health data sets so large and complex that they are difficult (or impossible) to manage with traditional software and/or hardware; nor can they be easily managed with traditional or common data management tools and methods. Big data in healthcare is overwhelming not only because of its volume but also because of the diversity of data types and the speed at which it must be managed. The totality of data related to patient healthcare and well-being makes up "big data" in the healthcare industry. It includes clinical data from CPOE and clinical decision support systems (physicians' written notes and prescriptions, medical imaging, laboratory, pharmacy, insurance and other administrative data) and patient data in electronic patient records (EPRs). The table below shows the growth of the global big data volume and of computer science papers on big data since 2009. It exemplifies that stored data will be in the tens of zettabytes range by 2020, and that research on how to deal with big data will grow exponentially as well.
TABLE 1: GLOBAL GROWTH OF BIG DATA AND COMPUTER SCIENCE PAPERS ON BIG DATA
(a: data from Oracle; b: data from Research Trends. CS, computer science; ZB, zettabytes (1 zettabyte = 10^6 petabytes = 10^9 terabytes = 10^12 gigabytes, GB).)
This paper provides an outline of big data analytics in healthcare as it is evolving as a discipline and discusses the various advantages and characteristics of big data analytics in healthcare. It then outlines the architectural framework of big data analytics in healthcare and the big data analytics application development methodology. Lastly, it provides examples of big data analytics in healthcare reported in the literature and the challenges that have been identified.
BIG DATA ANALYTICS IN HEALTHCARE:
Health data volume is expected to grow dramatically in the years ahead. Although profit is not and should not be the primary motivator, it is vitally important for healthcare organizations to acquire the available tools, infrastructure and techniques to leverage big data effectively, or else risk losing potentially millions of dollars in revenue and profits. The chief applications of big data in healthcare lie in two distinct areas. The first is the filtering of vast amounts of data to discover trends and patterns within them that help direct the course of treatments, generate new research, and focus on causes that were thus far unclear. Secondly, the sheer volume of data that can be processed using big data techniques is an enabler for fields such as drug discovery and molecular medicine.
Big data can enable new types of applications which in the past might not have been feasible due to scalability or cost constraints. In the past, scalability was in many cases limited by symmetric multiprocessing (SMP) environments; massively parallel processing (MPP), on the other hand, enables nearly limitless scalability. Many NoSQL big data platforms such as Hadoop and Cassandra are open source software that can run on commodity hardware, thus driving down hardware and software costs.
The theoretical framework for a big data analytics project in healthcare is similar to that of a traditional health informatics or analytics project. The key difference lies in how processing is executed. In a regular health analytics project, the analysis can be performed with a business intelligence tool installed on a stand-alone system, such as a desktop or laptop. Because big data is by definition large, processing is broken down and executed across multiple nodes. Furthermore, open source platforms such as Hadoop/MapReduce, available on the cloud, have encouraged the application of big data analytics in healthcare.
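As a sketch of how processing is "broken down and executed across multiple nodes", the pair of Python scripts below follows the Hadoop Streaming convention (a mapper and a reducer that read stdin and emit tab-separated key/value pairs). The input format, a CSV of patient records with a diagnosis code in the third column, is a made-up example, not a real healthcare schema.

# ----- mapper.py: emits (diagnosis_code, 1) for every patient record -----
import sys

for line in sys.stdin:
    fields = line.strip().split(",")
    if len(fields) >= 3:
        diagnosis_code = fields[2]        # assumed column position
        print(f"{diagnosis_code}\t1")

# ----- reducer.py: Hadoop Streaming delivers keys sorted, so counts accumulate per key -----
import sys

current_key, count = None, 0
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t")
    if key != current_key and current_key is not None:
        print(f"{current_key}\t{count}")
        count = 0
    current_key = key
    count += int(value)
if current_key is not None:
    print(f"{current_key}\t{count}")

In a cluster, these two scripts would typically be submitted through the Hadoop Streaming jar, passing them via its -mapper and -reducer options, so that the same logic runs in parallel across many nodes.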
PROTAGONIST OF BIG DATA IN HEALTHCARE:
Healthcare and the life sciences are among the fastest growing and biggest-impact industries today when it comes to big data. Disease research is also being supported by big data to help tackle conditions such as diabetes and cancer. The ability to create and capture data is exploding and offers huge potential to save both lives and scarce resources.
DATA MINING CHALLENGES WITH BIG DATA:
BIG DATA PROCESSING FRAMEWORK:
A big data processing framework forms a three-tier structure centred on the "big data mining platform" (Tier I), which focuses on low-level data access and computing. Challenges in information sharing and privacy, together with big data application domains and knowledge, form Tier II, which concentrates on high-level semantics, application domain knowledge and user privacy issues. The outermost circle, Tier III, covers challenges in the actual mining algorithms.
The limitations of big data include "adequacy, accuracy, completeness, nature of the reporting sources, and other measures of the quality of the data". Some of the data mining challenges with big data in healthcare are: inferring knowledge from complex heterogeneous patient sources; leveraging the patient/data correlations in longitudinal records; understanding unstructured clinical notes in the right context; efficiently handling large volumes of medical imaging data and extracting potentially useful information and biomarkers; analyzing genomic data, which is a computationally intensive task, and combining it with standard clinical data, which adds additional layers of complexity; and capturing the patient's behavioral data through several sensors, along with their various social interactions and communications.
TOOLS USED IN BIG DATA ANALYTICS IN HEALTHCARE:
The key obstacle in the healthcare market is data liquidity, and some organizations are using Apache Hadoop, as part of a modern data architecture, to overcome this challenge. Hadoop can ease the pain caused by poor data liquidity. For loading the data, the tool Sqoop efficiently transfers bulk data between Apache Hadoop and structured datastores such as relational databases. It imports data from external structured datastores into HDFS or related systems like Hive and HBase. Sqoop can also be used to extract data from Hadoop and export it to external structured datastores such as relational databases and enterprise data warehouses. Sqoop works with relational databases such as Teradata, Netezza, Oracle, MySQL, Postgres and HSQLDB.
To process the health data, depending on the use case, healthcare organizations process data in batch using Apache Hadoop MapReduce and Apache Pig; interactively with Apache Hive; online with Apache HBase; or as streams with Apache Storm.
To analyze the data, the data once stored and processed in Hadoop can either be analyzed in the cluster or exported to relational data stores for analysis there.
CONCLUSION:
Big data has recently helped a major healthcare provider determine its strategy, use cases and roadmap for utilizing it as part of its strategic plan through 2020. Perficient is currently assisting a client in using big data technologies for leveraging medical device data in real time. Big data analytics is progressing into a promising field for providing insight from very large data sets and improving outcomes while reducing costs. As it progresses, however, it faces several challenges: widespread implementation, guaranteeing privacy, safeguarding security, establishing standards and governance, and continually improving the tools and technologies will all garner attention. The data is often contained within non-integrated systems, and hospitals and health systems lack the software applications needed to transform this data into actionable clinical information and business intelligence. If these challenges are addressed in the future, health organizations can bring to the forefront better patient care and better business value.
REFERENCES
[1] Xindong Wu, Xingquan Zhu, Gong-Qing Wu and Wei Ding, "Data Mining with Big Data."
[2] Weiqi Wang and Eswar Krishnan, "Big Data and Clinicians: A Review on the State of the Science," JMIR Med Inform, vol. 2, iss. 1, e1, 2014.
[3] Jimeng Sun and Chandan K. Reddy, "Big Data Analytics for Healthcare."
[4] Ho Viet Lam and Nguyen Thi My Dung, "Data mining concepts," May 14, 2007.
[5] "Data Mining Over Large Datasets Using Hadoop in Cloud Environment."
[6] A. N. Nandakumar and Nandita Ambem, "A Survey on Data Mining Algorithms on Apache Hadoop Platform," ISSN 2250-2459, ISO 9001:2008 Certified Journal, vol. 4, issue 1, January 2014.
[7] "An Interview with Pete Stiglich and Hari Rajagopal on big data."
[8] Mary K. Obenshain, "Application of Data Mining Techniques to Healthcare Data," Infection Control and Hospital Epidemiology, August 2004.
[9] Kuperman GJ, Gardner RM and Pryor TA, "HELP: A Dynamic Hospital Information System," Springer-Verlag, 1991.
[10] Divya Tomar and Sonali Agarwal, "A survey on Data Mining approaches for Healthcare," International Journal of Bio-Science and Bio-Technology, vol. 5, no. 5, pp. 241-266, 2013.
[11] D. S. Kumar, G. Sathyadevi and S. Sivanesh, "Decision Support System for Medical Diagnosis Using Data Mining," 2011.
THE EFFECT OF CROSS-LAYERED COOPERATIVE
COMMUNICATION IN MOBILE AD HOC NETWORKS
1 N. Noor Alleema, 2 D. Siva Kumar, Ph.D
1 Research Scholar, Sathyabama University
2 Professor, Department of Information Technology, Easwari Engineering College
Abstract:
In the emerging trends, a need has arisen for cooperative communication that ensures reliability in data communication across wireless networks, especially those that change network topology quite often. Most existing works on cooperative communications are focused on link-level physical layer issues; as a result, most of the issues related to the physical layer and other routing issues are ignored and assumed to be good, without actually providing a solution for them. In this article, a cooperative topology control scheme that is also capacity optimized (COCO) is used to improve the network capacity in MANETs. This is performed by jointly considering both upper layer network capacity and physical layer cooperative communications. The Radio Interference Detection (RID) protocol is enhanced using COCO for mobile ad hoc networks in this paper. Simulations are performed using the network simulator to prove the efficiency of the proposed scheme.
1. Introduction:
Wireless ad hoc networks are usually designed without regard for cross-layer adaptability when novel schemes are proposed. Network capacity is a scarce asset which has to be used in resourceful ways so that a large number of paths or links can provide exceptional throughput. In cooperative communication, single-antenna devices cooperate to attain spatial diversity, and they can harvest the benefits of a MIMO system such as fading resistance, large throughput, network connectivity and lower power consumption. There are issues which are jointly considered with topology control in a network: power control and channel maintenance. Controlling the network topology is essential, along with the appropriate use of network capacity. Cooperative communication has emerged as a new dimension of diversity to emulate the strategies designed for multiple-antenna systems [1]. This is mainly because a wireless mobile device may not be able to support multiple transmit antennas due to limitations of cost and size.
A virtual antenna array can be formed through cooperative communication, where antennas can be shared due to the broadcast nature of the wireless channel. The IEEE 802.16j standard has been designed with this feature in mind, and it is budding in Long Term Evolution (LTE) multi-hop cellular networks [2].
Some existing works have been carried out on the outage behavior of selective relaying schemes [3] and on distributed optimal relay selection in wireless cooperative networks with finite-state Markov channels [4], which have to some extent brought out the cooperative advantages of MANETs.
Figure 1: Example of a MANET, showing mobile nodes and their transmission ranges.
2. Related Work
Relay selection is crucial in improving the performance of wireless cooperative networks. For the most part, previous works on relay selection use the currently observed channel conditions to make the relay-selection decision for the subsequent frame. However, this memoryless channel assumption is often not realistic, given the time-varying nature of some mobile environments. In this paper the authors consider finite-state Markov channels in the relay-selection problem. Moreover, they incorporate adaptive modulation and coding, as well as residual relay energy, in the relay-selection process. The objectives of the proposed scheme are to increase spectral efficiency, mitigate transmission errors and maximize the network lifetime. The formulation of the proposed relay-selection scheme is based on recent advances in stochastic control algorithms. The obtained relay-selection policy has an indexability property that dramatically reduces the computation and implementation complexity. In addition, there is no need for a centralized control point in the network, and relays can freely join and leave the set of potential relays. Simulation results are presented to show the effectiveness of the proposed scheme [4].
Topology control in ad hoc networks tries to lower node energy consumption by reducing transmission power and by limiting interference, collisions and, consequently, retransmissions. Generally, low interference is claimed to be a consequence of the sparseness of the resulting topology. In this paper the authors invalidate this implication. In contrast to most of the related work, which claims to solve the interference issue by graph sparseness without providing clear argumentation or proofs, they provide a concise and intuitive definition of interference. Based on this definition it is shown that most currently proposed topology control algorithms do not effectively constrain interference. Moreover, they propose connectivity-preserving and spanner constructions that are interference-minimal [10].
Cooperative diversity techniques can improve the transmission rate and reliability of wireless networks. For systems employing such diversity techniques in slow-fading channels, outage probability and outage capacity are important performance measures. Existing studies have derived approximate expressions for these performance measures in different scenarios. In this paper the authors derive exact expressions for the outage probabilities and outage capacities of three proactive cooperative diversity schemes that select the best relay from a set of relays to forward the information. The derived expressions are valid for arbitrary network topology and operating signal-to-noise ratio, and serve as a helpful tool for network design [3].
Capacity-based topology management is essential for a wireless ad hoc network due to its restricted capacity, so topology control becomes indispensable when deploying large wireless ad hoc networks. This paper discusses the impact of topology control on network capacity by introducing a new definition of the estimated capacity, which is first analyzed from the perspective of cross-layer optimization. Based on the analytical result, the most favorable schemes for neighbor selection and transmission power control, which are two functions of topology control, are studied to exploit the capacity. A hopeful conclusion indicates that topology control with a stable node degree keeps the capacity from decreasing as the number of nodes in the network increases. The analytical results in this paper can provide a guideline for the design of topology control schemes [8].
3. Radio Interference Detection Protocol: RID
Besides interference in one direction, interference in different directions is also measured, and Figure 3 shows the experimental results. As Figure 3 shows, neither the radio interference pattern nor the radio communication pattern is spherical, which is consistent with the earlier result.
The basic idea of RID is that a transmitter broadcasts a High Power Detection packet (HD packet), and immediately follows it with a Normal Power Detection packet (ND packet). This is called an HD-ND detection sequence. The receiver uses the HD-ND detection sequence to estimate the transmitter's interference strength. An HD packet includes the transmitter's ID, from which the receiver knows from which transmitter the following ND packet comes. The receiver estimates the possible interference caused by the transmitter by sensing the power level of the transmitter's ND packet. To make sure that every node within the transmitter's interference range is able to receive the HD packet, it is assumed that the communication range when the high sending power is used is at least as large as the interference range when the normal sending power is used. After the HD-ND detection, each node begins to exchange the detected interference information among its neighborhood, and then uses this information to figure out all collision cases within the system. In what follows, the three stages of RID, (i) HD-ND detection, (ii) information sharing, and (iii) interference calculation, are discussed in detail. 1) HD-ND Detection: With a high sending power, the transmitter first sends out an HD packet, which contains only its own ID information (two bytes) and the packet type (one byte), to minimize the packet length and to save transmission energy.
Figure 3: Interference Pattern [11]
Then the transmitter waits until the hardware is ready to send again. After the Minimal Hardware Wait Time (MHWT), the transmitter immediately sends out a fixed-length ND packet with the normal sending power. The ND packet's length is fixed so that the receiver is able to estimate when the ND packet's transmission will end once it starts to be sensed. At the receiver side, the HD-ND detection sequences are used to estimate the interference strength from the corresponding transmitters [11].
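To make the HD-ND sequence concrete, the toy Python sketch below models one transmitter broadcasting an HD packet (high power) followed by an ND packet (normal power), with the receiver recording the sensed ND power as its interference estimate for that transmitter. The log-distance path-loss model and all numeric constants are illustrative assumptions, not part of the RID specification.

import math

def received_power_dbm(tx_power_dbm, distance_m, path_loss_exp=3.0, ref_loss_db=40.0):
    # Simple log-distance path-loss model (an assumption for illustration).
    return tx_power_dbm - ref_loss_db - 10.0 * path_loss_exp * math.log10(max(distance_m, 1.0))

def hd_nd_detection(tx_id, distance_m, high_power_dbm=10.0, normal_power_dbm=0.0,
                    sensing_floor_dbm=-90.0):
    # HD packet: sent at high power so every node in the interference range can decode the ID.
    hd_power = received_power_dbm(high_power_dbm, distance_m)
    if hd_power < sensing_floor_dbm:
        return None                       # receiver never learns about this transmitter
    # ND packet: sent at normal power; its sensed strength estimates the interference.
    nd_power = received_power_dbm(normal_power_dbm, distance_m)
    return {"transmitter": tx_id, "estimated_interference_dbm": nd_power}

print(hd_nd_detection(tx_id=7, distance_m=30.0))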
4. Cooperative Communication:
Cooperative communication is based on the principle of relay communication. In COCO, all the intermediate nodes send data across the network in a cooperative manner. In the physical layer of MANETs there are three types of transmission modes: direct transmission, multi-hop transmission and cooperative transmission.
For cooperative communication in MANETs, the topology control problem can be formulated as

G* = arg max f(G)    (1)

where G represents the original network topology, with the mobile nodes and their link connections as input, and f is the network capacity function. The most favorable topology G* is derived as the algorithm output. The two different types of network capacity are transport capacity and throughput capacity, as proposed by Gupta and Kumar [9].
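Equation (1) simply selects the topology that maximizes the capacity function. The brute-force Python sketch below makes the formulation explicit for a tiny network: candidate topologies are subsets of links and f is a placeholder capacity function. Both the candidate set and the capacity function are assumptions made purely for illustration, since COCO itself uses a far more elaborate capacity model and search.

from itertools import combinations

def capacity(topology):
    # Placeholder capacity function f(G): rewards covered nodes, lightly penalizes the
    # number of active links (purely illustrative, not the COCO capacity model).
    nodes = {n for link in topology for n in link}
    return len(nodes) - 0.3 * len(topology)

def best_topology(links, min_links=1):
    # G* = arg max f(G) over candidate link subsets, Eq. (1).
    best, best_f = None, float("-inf")
    for k in range(min_links, len(links) + 1):
        for subset in combinations(links, k):
            f = capacity(subset)
            if f > best_f:
                best, best_f = subset, f
    return best, best_f

links = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(best_topology(links))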
The proposed work is explained using the flowchart in Figure 4. The RID protocol is modified using cooperative communications. A Capacity-Optimized Cooperative (COCO) topology control scheme that improves the network capacity in MANETs by jointly optimizing transmission mode selection, relay node selection and interference control in MANETs with cooperative communications is explained in Figure 4.
5. Simulation Analysis:
The NS2 simulator is mainly used in the research field of networks and communication. NS2 is a discrete-event, time-driven simulator which is used to evaluate the performance of a network. Two languages, C++ and OTcl (Object-oriented Tool Command Language), are used in NS2: C++ acts as the back end and OTcl is used as the front end. X-graph is used to plot the graphs. The parameters used in the simulation are tabulated as follows:
Table 1: Simulation parameters used

Parameter                 Value
Channel Type              Wireless Channel
Radio Propagation Model   TwoRayGround
Network Interface Type    WirelessPhy
MAC Type                  IEEE 802.11
Interface Queue Type      PriQueue
Link Layer Type           LL
Antenna Model             OmniAntenna
Figure 4: Working of the proposed scheme
The COCO topology control scheme is proposed to solve the topology control problem in cooperative communication by considering link-level issues in the physical layer and upper layer issues such as network capacity. Two necessary conditions are taken into account in the COCO scheme. The packet data rate, packet loss, delay and network capacity are the parameters used in the simulation to evaluate the proposed method. The red colored curves in the following comparison figures correspond to the proposed scheme.
A. Data Rate
The rate at which data is transmitted from node to node is called the data rate. Figure 5 shows that the proposed system achieves a good data rate.
Figure 5: Data Rate Comparison

B. Packet Loss
Packet loss indicates the number of packets lost while data is transmitted from node to node. Figure 6 indicates that the proposed scheme has a reduced amount of loss.
Figure 6: Packet Loss Comparison

C. Packet Delay
The delay that occurs during data transmission is given in Figure 7. It shows that the proposed system has the least delay.
Figure 7: Packet Delay Comparison

D. Network Capacity
The network capacity is estimated from the energy of the nodes. As the energy per unit time can also be termed as power in watts, the network capacity is plotted in Figure 8, and it is greatly improved for the proposed system.
Figure 8: Network Capacity Comparison
6. Conclusion:
In this article, physical layer cooperative communications, topology control and network capacity in MANETs were introduced. To improve the network capacity of MANETs with cooperative communications, a Capacity-Optimized Cooperative (COCO) topology control scheme that considers both upper layer network capacity and physical layer relay selection in cooperative communications is proposed. Simulation results have shown that physical layer cooperative communication techniques have significant impacts on the network capacity.
References
[1] J. Laneman, D. Tse and G. Wornell, "Cooperative Diversity in Wireless Networks: Efficient Protocols and Outage Behavior," IEEE Trans. Info. Theory, vol. 50, no. 12, 2004, pp. 3062-80.
[2] P. H. J. Chong et al., "Technologies in Multihop Cellular Network," IEEE Commun. Mag., vol. 45, Sept. 2007, pp. 64-65.
[3] K. Woradit et al., "Outage Behavior of Selective Relaying Schemes," IEEE Trans. Wireless Commun., vol. 8, no. 8, 2009, pp. 3890-95.
[4] Y. Wei, F. R. Yu and M. Song, "Distributed Optimal Relay Selection in Wireless Cooperative Networks with Finite-State Markov Channels," IEEE Trans. Vehic. Tech., vol. 59, June 2010, pp. 2149-58.
[5] Q. Guan et al., "Capacity-Optimized Topology Control for MANETs with Cooperative Communications," IEEE Trans. Wireless Commun., vol. 10, July 2011, pp. 2162-70.
[6] P. Santi, "Topology Control in Wireless Ad Hoc and Sensor Networks," ACM Computing Surveys, vol. 37, no. 2, 2005, pp. 164-94.
[7] T. Cover and A. E. Gamal, "Capacity Theorems for the Relay Channel," IEEE Trans. Info. Theory, vol. 25, Sept. 1979, pp. 572-84.
[8] Q. Guan et al., "Impact of Topology Control on Capacity of Wireless Ad Hoc Networks," Proc. IEEE ICCS, Guangzhou, P. R. China, Nov. 2008.
[9] P. Gupta and P. Kumar, "The Capacity of Wireless Networks," IEEE Trans. Info. Theory, vol. 46, no. 2, 2000, pp. 388-404.
[10] M. Burkhart et al., "Does Topology Control Reduce Interference?," Proc. 5th ACM Int'l Symp. Mobile Ad Hoc Networking and Computing, Tokyo, Japan, May 2004, pp. 9-19.
[11] Gang Zhou, Tian He, John A. Stankovic and Tarek Abdelzaher, Proc. 24th Annual Joint Conference of the IEEE Computer and Communications Societies (IEEE INFOCOM), March 2005, pp. 891-901.
SECURE CYBERNETICS PROTECTOR IN SECRET
INTELLIGENCE AGENCY
1 G.Bathuriya, 2 D.E.Dekson
1 PG Scholar, 2 Professor
Department of Computer Science and Engineering, Aarupadai Veedu Institute of Technology, Vinayaka Missions University, Rajiv Gandhi Salai (OMR), Paiyanoor-603104, Kancheepuram District, Tamilnadu, India
1 [email protected], 2 [email protected]
Abstract:
Cybernetics Protector provides a secure way of communicating and transferring evidence in the Secret Intelligence Agency of a defence system, which always uses undercover agents to solve complex cases and dismantle criminal organizations. We are conceptualizing this software as a solution so that Secret Intelligence Agencies and their agents can communicate through it to exchange evidence in a secure way, and maintain the details of the main officers.
Keywords— Secret Intelligence Agency, Security, Face Recognition, Digital Signature.
I. INTRODUCTION
The Secret Intelligence Agency is the nation's first line of defence. It accomplishes what others cannot accomplish and goes where others cannot go. It carries out its mission by collecting information that reveals the plans, intentions and capabilities of the adversaries and provides the basis for decision and action.
The Cybernetics Protector is software which allows a security agency to handle various confidential missions in a secured way.
The Cybernetics Protector software is concerned
with the security of the country and thus proper care has
to be taken that confidential data from within the database
is not leaked out.
Every country requires a secret agency which undertakes cases that are a threat to national security. These agencies operate with the help of undercover agents who help solve these cases. Since these cases deal with the nation's security, the communication and data transfer between the agents and the higher authorities need to be protected. Hence, developing such a system is necessary to help these agencies operate in a secret and secured way. The system will be used by a set of five different users: the Defence Ministry, the Chief, Agents, employees and the Citizens of the country.
II. SYSTEM FUNCTIONALITY
Figure 1 explains the overall functionality of the system along with the different sets of users involved. In the information systems development field, requirements form the foundation for the rest of the software development process, since building a high-quality requirements specification is essential to ensure that the product satisfies the users.
Figure 1: Cybernetics Protector Users
The Defense Ministry-The Defense Ministry assigns
cases to the Secret Agency and allocates resources to it. It
should be able to receive reports regarding the cases.
The Security Chief-The Chief of the Secret Agency has
the highest powers. He can administer the agents, assign
cases and resources. Also he has right to view the
database.
The Agent-The undercover agent can send the evidence
and data collected in an encrypted fashion so that the data
cannot be intercepted.
Citizen-A citizen has the lowest access rights. A citizen
can only view the success stories of the agency and chat
with the officials.
The functions of these different users, shown in Figure 1, are listed here:
1. Agent Manipulation
This feature is provided to the Chief of Security. The
Chief will be able to Add/Delete/Edit Agent Records.
2. Agent Appointment
This feature is provided to the Chief of Security. The
chief will appoint an agent for the case.
3. Secure sending and retrieval of data
This feature is provided to the Chief of Security, Agent
and the Defense Ministry. This feature basically enhances
the security of the software.
4. Access of Data Logs
This feature is provided to the Chief of Security. This
feature enables him to analyze the data logs.
5. View Case Details
This feature is provided to the Agent. The agent will
receive the entire case details from the Chief of Security.
6. View Resources
The chief and agents can view the resources available.
7. Report Management
This feature is provided to the Chief of Security, Agent
and the Defense Ministry. The Chief will use this feature
to generate reports and send them to the Defense
Ministry. The agents can use this feature to send the
reports to the Chief. The Defense Ministry will be able to
receive the reports.
8. Send Resources to Secret Agency
This feature is provided to the Defense Ministry. The
Defense Ministry is responsible for any resources that are
to be made available to the agents.
9. Assign Case to Agency
This feature is provided to the Defense Ministry. The
Defense Ministry will create a new case, and the case details along with the mission objectives are sent to the agency.
10. View Success Stories
This feature is provided to the Citizen. The citizen has the
least powers. The citizen can view the details of
completed missions which are posted by the agency.
11. Provide Tips and Feedback
This feature is provided to the Citizen. The citizen can
provide tips and feedback regarding any article that is
posted by the agency.
12. Apply for Job
This feature is provided to the Citizen. The citizen can
inquire about the different job profiles available with the agency and about the various qualifications required for each profile.
III. BACKGROUND
The Cybernetics Protector software is concerned with the security of the country, and thus proper care has to be taken that confidential data within the database is not leaked. The main focus of the system is on security, and thus the following set of features is used to provide high security:
o Encryption and Decryption
o RSA Algorithm
o Login
o Case details
A. Encryption and Decryption
This system is based on the three pillars of information security: Confidentiality, Integrity and Availability. The digital signature used here protects the integrity and authenticity of a message. However, other techniques are still required to provide confidentiality of the message being sent. Encryption is the process of transforming information (referred to as plaintext) using an algorithm (called a cipher) to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. The result of the process is encrypted information (in cryptography, referred to as ciphertext).
To provide higher integrity and confidentiality, the project uses both the digital signature and encryption mechanisms: the document is digitally signed by the sender, and the document is also encrypted.
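As an illustration of this sign-then-encrypt idea, the following minimal Java sketch uses the standard java.security and javax.crypto APIs. It is not the project's actual implementation: the class and variable names are hypothetical, and a symmetric session key is used for the document because plain RSA can only encrypt a small payload directly.

import java.security.*;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch only: the agent signs the evidence for integrity and
// authenticity, then encrypts it so that only the chief can read it.
public class SecureTransferSketch {
    public static void main(String[] args) throws Exception {
        // Sender (agent) and receiver (chief) each hold an RSA key pair.
        KeyPairGenerator rsaGen = KeyPairGenerator.getInstance("RSA");
        rsaGen.initialize(2048);
        KeyPair agent = rsaGen.generateKeyPair();
        KeyPair chief = rsaGen.generateKeyPair();

        byte[] evidence = "case report: evidence collected".getBytes("UTF-8");

        // 1. Integrity/authenticity: the agent signs the evidence.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(agent.getPrivate());
        signer.update(evidence);
        byte[] signature = signer.sign();

        // 2. Confidentiality: encrypt the evidence with a fresh AES session key,
        //    then encrypt that AES key with the chief's RSA public key.
        SecretKey sessionKey = KeyGenerator.getInstance("AES").generateKey();
        Cipher aes = Cipher.getInstance("AES");
        aes.init(Cipher.ENCRYPT_MODE, sessionKey);
        byte[] cipherText = aes.doFinal(evidence);

        Cipher rsa = Cipher.getInstance("RSA");
        rsa.init(Cipher.ENCRYPT_MODE, chief.getPublic());
        byte[] wrappedKey = rsa.doFinal(sessionKey.getEncoded());

        // 3. Receiver side: unwrap the AES key, decrypt, then verify the signature.
        rsa.init(Cipher.DECRYPT_MODE, chief.getPrivate());
        byte[] keyBytes = rsa.doFinal(wrappedKey);
        Cipher aesDec = Cipher.getInstance("AES");
        aesDec.init(Cipher.DECRYPT_MODE, new SecretKeySpec(keyBytes, "AES"));
        byte[] plainText = aesDec.doFinal(cipherText);

        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(agent.getPublic());
        verifier.update(plainText);
        System.out.println("signature valid: " + verifier.verify(signature));
    }
}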
B. RSA:
The RSA algorithm was publicly described in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman.
The RSA algorithm involves three steps:
key generation, encryption and decryption.
Key generation
RSA involves a public key and a private key.
The public key can be known to everyone and is used
for encrypting messages.
Messages encrypted with the public key can only be
decrypted using the private key.
The keys for the RSA algorithm are generated the following way:
1. Choose two distinct prime numbers p and q. For security purposes, the integers p and q should be chosen at random, and should be of similar bit-length.
2. Compute n = pq. n is used as the modulus for both the public and private keys. Its length, usually expressed in bits, is the key length.
3. Compute φ(n) = (p − 1)(q − 1), where φ is Euler's totient function.
4. Choose an integer e such that 1 < e < φ(n) and the greatest common divisor gcd(e, φ(n)) = 1; i.e., e and φ(n) are coprime. e is released as the public key exponent. An e having a short bit-length and small Hamming weight results in more efficient encryption – most commonly 2^16 + 1 = 65,537. However, much smaller values of e (such as 3) have been shown to be less secure in some settings. [4]
5. Determine d as d ≡ e^−1 (mod φ(n)), i.e., d is the multiplicative inverse of e (modulo φ(n)). This is more clearly stated as: solve for d given d·e ≡ 1 (mod φ(n)). This is often computed using the extended Euclidean algorithm. d is kept as the private key exponent. By construction, d·e ≡ 1 (mod φ(n)).
The public key consists of the modulus n and the public
(or encryption) exponent e.
The private key consists of the modulus n and the
private (or decryption) exponent d, which must be kept
secret. p, q, and φ(n) must also be kept secret because
they can be used to calculate d.
An alternative, used by PKCS#1, is to choose
d matching de ≡ 1 (mod λ) with λ = lcm(p − 1,
q − 1), where lcm is the least common
multiple. Using λ instead of φ(n) allows more
choices for d. λ can also be defined using the
Carmichael function, λ(n).
The ANSI X9.31 standard prescribes, IEEE
1363 describes, and PKCS#1 allows, that p
and q match additional requirements: being
strong primes, and being different enough that
Fermat factorization fails.
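A minimal sketch of these five key-generation steps, assuming java.math.BigInteger, is shown below. The chosen prime size and public exponent are common defaults rather than values prescribed by this paper.

import java.math.BigInteger;
import java.security.SecureRandom;

// Minimal illustration of the five key-generation steps above.
public class RsaKeyGenSketch {
    public static void main(String[] args) {
        SecureRandom rnd = new SecureRandom();

        // Step 1: two distinct random primes of similar bit-length.
        BigInteger p = BigInteger.probablePrime(1024, rnd);
        BigInteger q = BigInteger.probablePrime(1024, rnd);

        // Step 2: modulus n = p * q.
        BigInteger n = p.multiply(q);

        // Step 3: Euler's totient phi(n) = (p - 1)(q - 1).
        BigInteger phi = p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE));

        // Step 4: public exponent e, coprime to phi(n); 2^16 + 1 is the usual choice.
        BigInteger e = BigInteger.valueOf(65537);
        if (!phi.gcd(e).equals(BigInteger.ONE)) {
            throw new IllegalStateException("e and phi(n) are not coprime; pick new primes");
        }

        // Step 5: private exponent d = e^-1 mod phi(n) (extended Euclidean algorithm).
        BigInteger d = e.modInverse(phi);

        System.out.println("public key  (n, e): (" + n + ", " + e + ")");
        System.out.println("private key (n, d): (" + n + ", " + d + ")");
    }
}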
In practice, there are more efficient methods of calculating c^d than direct exponentiation, using precomputed values as described below.
Using the Chinese remainder algorithm
For efficiency, many popular crypto libraries (like OpenSSL, Java and .NET) use the following optimization for decryption and signing, based on the Chinese remainder theorem. The following values are precomputed and stored as part of the private key:
p and q: the primes from the key generation,
dP = d mod (p − 1),
dQ = d mod (q − 1), and
qInv = q^−1 mod p.
These values allow the recipient to compute the exponentiation m = c^d (mod pq) more efficiently as follows:
m1 = c^dP mod p
m2 = c^dQ mod q
h = qInv · (m1 − m2) mod p
(if m1 < m2, then some libraries compute h as qInv · (m1 − m2 + p) mod p)
m = m2 + h · q
This is more efficient than computing m ≡ c^d (mod pq)
even though two modular exponentiations have to be
computed. The reason is that these two modular
exponentiations both use a smaller exponent and a
smaller modulus.
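The recombination above can be sketched in a few lines of Java. The method below is illustrative only and uses the textbook-sized parameters of the working example that follows (p = 61, q = 53, d = 2753).

import java.math.BigInteger;

// Sketch of the CRT optimization described above: decrypt with the smaller
// exponents dP and dQ modulo p and q, then recombine with qInv.
public class RsaCrtSketch {
    static BigInteger crtDecrypt(BigInteger c, BigInteger p, BigInteger q, BigInteger d) {
        BigInteger dP = d.mod(p.subtract(BigInteger.ONE));    // dP = d mod (p - 1)
        BigInteger dQ = d.mod(q.subtract(BigInteger.ONE));    // dQ = d mod (q - 1)
        BigInteger qInv = q.modInverse(p);                    // qInv = q^-1 mod p

        BigInteger m1 = c.modPow(dP, p);                      // m1 = c^dP mod p
        BigInteger m2 = c.modPow(dQ, q);                      // m2 = c^dQ mod q
        // BigInteger.mod always returns a non-negative value, so the
        // "m1 < m2" case mentioned above is handled automatically here.
        BigInteger h = qInv.multiply(m1.subtract(m2)).mod(p); // h = qInv (m1 - m2) mod p
        return m2.add(h.multiply(q));                         // m = m2 + h * q
    }

    public static void main(String[] args) {
        // Textbook-sized example: p = 61, q = 53, d = 2753, ciphertext c = 2790.
        BigInteger m = crtDecrypt(BigInteger.valueOf(2790),
                BigInteger.valueOf(61), BigInteger.valueOf(53),
                BigInteger.valueOf(2753));
        System.out.println(m); // prints 65, the original padded message
    }
}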
Encryption
Alice transmits her public key (n, e) to Bob and keeps the private key secret. Bob then wishes to send message M to Alice. He first turns M into an integer m, such that 0 ≤ m < n, by using an agreed-upon reversible protocol known as a padding scheme. He then computes the ciphertext c corresponding to c ≡ m^e (mod n). This can be done quickly using the method of exponentiation by squaring. Bob then transmits c to Alice.
Decryption
Alice can recover m from c by using her private key exponent d via computing m ≡ c^d (mod n). Given m, she can recover the original message M by reversing the padding scheme.
A working example
Here is an example of RSA encryption and decryption. The parameters used here are artificially small, but one can also use OpenSSL to generate and examine a real keypair.
1. Choose two distinct prime numbers, such as p = 61 and q = 53.
2. Compute n = pq, giving n = 61 · 53 = 3233.
3. Compute the totient of the product as φ(n) = (p − 1)(q − 1), giving φ(3233) = 60 · 52 = 3120.
4. Choose any number 1 < e < 3120 that is coprime to 3120. Choosing a prime number for e leaves us only to check that e is not a divisor of 3120. Let e = 17.
5. Compute d, the modular multiplicative inverse of e (mod φ(n)), yielding d = 2753.
The public key is (n = 3233, e = 17). For a padded plaintext message m, the encryption function is c = m^17 mod 3233.
The private key is (n = 3233, d = 2753). For an encrypted ciphertext c, the decryption function is m = c^2753 mod 3233.
For instance, in order to encrypt m = 65, we calculate c = 65^17 mod 3233 = 2790. To decrypt c = 2790, we calculate m = 2790^2753 mod 3233 = 65.
Both of these calculations can be computed efficiently using the square-and-multiply algorithm for modular exponentiation. In real-life situations the primes selected would be much larger; in our example it would be relatively trivial to factor n = 3233, obtained from the freely available public key, back to the primes p and q. Given e, also from the public key, we could then compute d and so acquire the private key.
Practical implementations use the Chinese remainder theorem to speed up the calculation using the modulus of factors (mod pq using mod p and mod q). The values dP, dQ and qInv, which are part of the private key, are computed as follows:
dP = d mod (p − 1) = 2753 mod 60 = 53
dQ = d mod (q − 1) = 2753 mod 52 = 49
qInv = q^−1 mod p = 53^−1 mod 61 = 38
(hence qInv · 53 mod 61 = 1).
Here is how dP, dQ and qInv are used for efficient decryption (encryption is efficient by choice of the public exponent e):
m1 = c^dP mod p = 2790^53 mod 61 = 4
m2 = c^dQ mod q = 2790^49 mod 53 = 12
h = qInv · (m1 − m2) mod p = 38 · (4 − 12) mod 61 = 1
m = m2 + h · q = 12 + 1 · 53 = 65
(the same result as above, but computed more efficiently).
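The numbers above can be checked directly with BigInteger.modPow, which implements the square-and-multiply method just mentioned; the snippet below is a verification aid only.

import java.math.BigInteger;

// Quick check of the worked example using modular exponentiation.
public class RsaExampleCheck {
    public static void main(String[] args) {
        BigInteger n = BigInteger.valueOf(3233);   // n = 61 * 53
        BigInteger e = BigInteger.valueOf(17);     // public exponent
        BigInteger d = BigInteger.valueOf(2753);   // private exponent

        BigInteger m = BigInteger.valueOf(65);     // padded plaintext
        BigInteger c = m.modPow(e, n);             // encryption: c = m^17 mod 3233
        System.out.println(c);                     // prints 2790
        System.out.println(c.modPow(d, n));        // decryption: prints 65 again
    }
}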
LOGIN
The following snapshot of the project screens (Figure 2) explains how login is implemented and used by the different authorities of the Intelligence system.
Figure 2 User login
CASE DETAILS
The following snapshot of the project screens (Figure 3) explains the case details of the Cybernetics Protector as implemented and used by the different authorities of the Intelligence system.
Figure 3 Case details
IV. FUTURE ENHANCEMENT
Some of the future enhancements that can be made to this system are: as technology emerges, it is possible to upgrade the system and adapt it to the desired environment. Because it is based on object-oriented design, any further changes can be easily accommodated. Based on future security issues, security can be improved using emerging technologies. A sub-admin module can also be added.
V. CONCLUSION
This project thus allows secret agencies to manage secret cases in a secured and confidential way. The application software has been completed successfully; the software is developed using Java as the front end and Oracle as the back end in a Windows environment. The goals achieved by the software are: optimum utilization of resources, efficient management of records, simplification of operations, less processing time and quick retrieval of required information.
REFERENCES
(1) Java: The Complete Reference by Herbert Schildt
(2) Database Programming with JDBC and Java by George Reese
(3) Java and XML by Brett McLaughlin
(4) Wikipedia, URL: http://www.wikipedia.org
(5) Answers.com, Online Dictionary, Encyclopedia and much more, URL: http://www.answers.com
(6) Google, URL: http://www.google.co.in
(7) Project Management, URL: http://www.startwright.com/project.htm
REVITALIZATION OF BLOOM’S TAXONOMY FOR THE
EFFICACY OF HIGHERS
Mrs. B. Mohana Priya,
Assistant Professor in English,
AVIT, Paiyanoor, Chennai.
[email protected]
Abstract: Bloom's Taxonomy is the most widely used and applied taxonomy in higher education today. This paper is roughly divided into three parts: the need to use Bloom's Taxonomy, the role of ICT in English language learning, and Web resources for developing language skills. Bloom's Taxonomy for Higher Education can be invoked in this context. The paper focuses on the use of internet resources in designing a curriculum for English language teaching as a skill, as opposed to teaching English as a subject. This leads us to the need to develop Critical Thinking.
Key Words: Critical Thinking, Problem based Learning,
Domains, Websites.
INTRODUCTION
Bloom's Taxonomy is the most widely used and applied taxonomy in higher education today. The cognitive domain includes Knowledge, Comprehension, Application, Analysis, Synthesis and Evaluation. The cognitive and affective domains were researched and the strategies that can be used to foster thinking skills were identified. They were grouped into micro-skills and macro-abilities. Critical Thinking is the disciplined activity of evaluating arguments or propositions and making judgments that can guide the development of beliefs and the taking of action. Non-critical thinking can be compared and contrasted with Critical Thinking in order to understand it. Non-critical thinking can be habitual thinking (based on past practices without considering current data), brainstorming (saying whatever comes to mind without evaluation), creative thinking (putting facts, concepts and principles together in new and original ways), prejudicial thinking (gathering evidence to support a particular position without questioning the position itself), or emotive thinking (responding to the emotion of a message rather than its content). The use of internet resources in designing a curriculum for English language teaching treats English as a skill, as opposed to teaching English as a subject. This leads us to the need to develop Critical Thinking.
Thinking is a very important skill to be developed as it
Generates purposes
Raises questions
Uses information
Utilizes concepts
Makes inferences
Makes assumptions
Generates implications
Embodies a point of view
The major idea of the taxonomy is that what
educators want students to know (encompassed in
statements of educational objectives) can be arranged in a
hierarchy from less to the more complex. Students can
know about a topic or subject at different levels.
While most teacher-made tests still test at the
lower levels of the taxonomy, research has shown that
students remember more when they have learned to
handle the topic at the higher levels of the taxonomy. For example, within this domain, for a reading comprehension exercise the multiple-choice questions will require the student:
To identify an author's purpose in a passage
To rate selected inferences as justified or unjustified with substantiated statements
To select among formulations of the problem at
issue in a passage isolating the reasonable ones
from unreasonable ideas
To recognize unstated assumptions
To rate described evidence as reliable or
unreliable.
Abilities play a central role in a rich and
substantive concept of Critical Thinking. They are
essential to approaching actual issues, problems, and
situations rationally. Understanding the rights and duties
of citizenship, for example, requires that one at least have
the ability to compare perspectives and interpretations to
read and listen critically, to analyze and evaluate policies.
Similarly the capacity to make sound decisions, to
participate knowledgeably in the work place, to function
as part of a global economy, to master the content in
anything as complex as academic disciplines, to apply subject-area insights to real-life situations, to make insightful cross-disciplinary connections, to communicate effectively: each of these relies in a fundamental way on having a significant number of the abilities listed. Take, for example, the capacity to make sound decisions: such decision-making is hardly possible without an attendant ability to refine generalizations, compare analogous situations, develop one's perspective, clarify issues and so
forth. Thinking can be at the lower order or higher order.
Higher order thinking requires more than higher order
thinking skills.
COGNITIVE STRATEGIES - MACRO-ABILITIES
S-10 refining generalizations and avoiding oversimplifications
S-11 comparing analogous situations: transferring
insights to new contexts
S-12 developing ones perspective: creating or
exploring beliefs, arguments, or theories
S-13 clarifying issues, conclusions, or beliefs
S-14 clarifying and analyzing the meanings of
words or phrases
S-15 developing criteria for evaluation: clarifying
values and standards
S-16 evaluating the credibility of sources of
information
S-17 questioning deeply: raising and pursuing root
or significant questions
S-18 analyzing or evaluating arguments,
interpretations, beliefs, or theories
S-19 generating or assessing solutions
S-20 analyzing or evaluating actions or policies
S-21 reading critically: clarifying or critiquing
texts
S-22 listening critically: the art of silent dialogue
S-23 making interdisciplinary connections
S-24 practicing Socratic discussion: clarifying and
questioning beliefs, theories, or perspectives
S-25 reasoning dialogically: comparing perspectives, interpretations, or theories
S-26 reasoning dialectically: evaluating perspectives, interpretations, or theories.
COGNITIVE STRATEGIES - MICRO-SKILLS
S-27 comparing and contrasting ideals with actual
practice
S-28 thinking precisely about thinking: using
critical vocabulary
S-29 noting significant similarities and differences
S-30 examining or evaluating assumptions
S-31 distinguishing relevant from irrelevant facts
S-32 making plausible inferences, predictions, or
interpretations
S-33 giving reasons and evaluating evidence and
alleged facts
S-34 recognizing contradictions
S-35 exploring implications and consequences
Critical Thinking, in a substantive sense, includes
more than abilities. The concept also includes, in a
critical way, certain attitudes, dispositions,
passions, traits of mind.
AFFECTIVE STRATEGIES
S-1 thinking independently
S-2 developing insight into egocentricity or sociocentricity
S-3 exercising fair-mindedness
S-4 exploring thoughts underlying feelings and
feelings underlying thoughts
S-5 developing intellectual humility and suspending
judgment
S-6 developing intellectual courage
S-7 developing intellectual good faith or integrity
S-8 developing intellectual perseverance
S-9 developing confidence in reason
Questions play a very crucial and dominant role
in teaching content. Questions engage learners in active
thinking and it has to be accepted that every declarative
statement in the textbook is an answer to a question.
Hence it is possible to rewrite textbooks in the
interrogative mode by translating every statement into a
question. Every intellectual field is born out of a cluster
of questions. Questions define tasks, express problems
and delineate issues. Answers, on the other hand, often
signal a full stop in thought. Only when an answer
generates a further question does thought continue its life
as such. That we do not test students by asking them to
list questions and explain their significance is again
evidence of the privileged status we give to answers. That
is, we ask questions only to get thought-stopping answers,
not to generate further questions. If we want to engage
students in thinking through content we must stimulate
their thinking with questions that lead them to further
questions. We must give students what might be called
artificial cogitation (the intellectual equivalent of artificial
respiration). Socratic questioning is important for a
critical thinker. Socrates believed in a synthesis of the
knowledge of the past with that of the present and future.
Mere studying of the past was not a vital use of the
historical tradition. Socratic discussion adds to depth, and
a keen interest to assess the truth and plausibility of
things. The following logical prior questions for Socratic
discussion on History will highlight the nature of
questioning that is necessary.
1. What is history?
2. What do historians write about?
3. What is the past?
4. Is it possible to include all of the past in a history
book?
5. How many events in a particular period are left
out?
6. Is more left out than is included?
7. How does a historian know what to emphasize or
focus on?
8. Do historians make value judgments?
9. Is history an interpretation?
This leads us to the need to develop Critical
Thinking. Thinking is a very important skill to be
developed as it
Generates purposes
Raises questions
Uses information
Utilizes concepts
Makes inferences
Makes assumptions
Generates implications
Embodies a point of view
Problem based Learning can help to develop these thinking skills. Problem based Learning (PBL) is an educational approach that challenges students to learn to learn. Students work cooperatively in groups to seek solutions to real-world problems and, more importantly, to develop the skills to become self-directed learners. In PBL, learning is much more than the process of mere knowledge seeking. In the PBL model, learners use
Cooperative learning skills
Inquiry skills
Reflection skills
Assessment.
Some instructional techniques that teachers can use to make students' thinking public during the class are the following. Think-Pair-Share (TPS) can be used to involve students more actively with the material through interaction with peers in the class. Concept Tests are like TPS but can be used for debates. Think Aloud Pair Problem Solving (TAPPS) can be used for elaborate thinking: one student can be the problem solver and the other can be the listener. The Minute Paper, in which students write a brief answer about their learning during the class period, can be used at any point in the class to monitor student
thinking. The activities listed above and a host of others
can be used to promote conceptual change by having
students articulate and examine their own ideas and then
try to reconcile them with alternative views.
Writing instructional objectives to include
Critical Thinking and to chart out the learner roles and
learning outcomes is important. Questions that
instructional objectives should help answer are:
• What is the purpose of this instruction?
• What can the learner do to demonstrate he/she
understands the material?
• How can you assess if the learner has mastered
the content?
The function of the objectives is to enable the
teacher to select and organize instructional activities and
resources that will facilitate effective learning, provide an
evaluation framework and guide the learning. There are 4
elements in an instructional Objective: ABCD.
A is for Audience. It specifies the learner(s) for whom the objective is intended. Example: The tenth grade Biology.
B is for Behavior (action verb). It describes the capability expected of the learner following instruction, stated as a learner performance and as observable behavior, and describes a real-world skill versus mere test performance.
C is for Conditions (materials and/or environment). It describes the conditions under which the performance is to be demonstrated: equipment, tools, aids, or references the learner may or may not use, and special environmental conditions in which the learner has to perform.
D is for Degree (criterion). It identifies the standard for acceptable performance: time limit, accuracy tolerances, proportion of correct responses required, qualitative standards. E.g.: Randi will write correct answers to five of five inference questions on a grade-level reading passage. Kathy and Chuck will correctly compute the amount of wallpaper needed to cover a wall of given dimensions. Richard will complete all of his independent seatwork assignments during class with at least 90% accuracy for 2 consecutive weeks.
Cognitive objectives include objectives related
to information or knowledge, naming, solving, predicting,
and other intellectual aspects of learning. While writing
objectives care should be taken to see that it is observable
as learning outcome. A precise statement that answers the
question: “What behavior can the learner demonstrate to
indicate he/she has mastered the knowledge or skills
specified in the instruction?”
Objectives need to be: related to intended (performance-based) outcomes rather than the process; specific and measurable, rather than broad and intangible; and concerned with students, not teachers.
FIVE FACTORS THAT UNDERPIN LEARNING EFFECTIVELY ARE:
Learning by doing: includes practice, repetition
and learning by mistakes
Learning from feedback: other people’s
reactions
Wanting to learn: intrinsic motivation
Needing to learn: extrinsic motivation
Making sense of what has been learned:
digesting or understanding the material.
It is necessary to teach the way learners like to
be taught. Learners have four perceptual styles of
learning: visual, auditory, kinesthetic and tactile. Use of
the computers and the internet can make learning more
fun and purposeful. There are a variety of web resources
for second language learning. Learners can have hands on
experience in an ICT enabled classroom as there is scope
for self-learning which is absent in a traditional set up.
The World Wide Web is a vast database of current authentic materials that present information in multimedia form and react instantly to a user's input. A major drawback of the Web, however, is that it is easy to plagiarize online content. The teacher's role as facilitator is to help the learner locate the resources and provide guidance to learners in using them.
variety of activities to foster language learning.
Reading and literacy development: creating own
newspaper, stories with audio component
Writing and Grammar: Grammar, punctuation,
interactive grammar exercises, quiz, games,
vocabulary worksheets
Listening and Speaking: American rhetoric,
BBC world English, Voice of America English,
Cyber listening lab, repetition lab
Online interactive Games: dictionary games,
Hangman.
The Encyclopedia Britannica Reference Library is available in an online format and on CDs. The Oxford Talking Dictionary, Merriam-Webster's Collegiate Dictionary, Power Vocabulary and a host of other interactive modules are available on the internet. Frameworks for a language curriculum can be devised to impart the four skills very effectively.
Listening: Audio versions of famous speeches
from youtube.
Speaking: Audio and recordable versions of the
spoken texts are available and can also be
prepared to suit the level of the learners.
Reading: Texts can be graded and pitched at the learner's level separately and also as an integrated listening and speaking activity.
Writing: A process-oriented method can be more effective when monitored by software than by human intervention.
Howard Gardner's Multiple Intelligences has paved the way for a learning-centered curriculum as opposed to the learner-centered curriculum propagated by the Communicative Language Teaching paradigm. Designing modules that enable learners to learn in their perceptual styles makes learning more purposeful and meaningful. For example, the creative learning center provides a range of assessment instruments designed to improve study skills. Results are delivered in the form of personal profiles with pictures and graphs, in an easy-to-read, objective and non-judgmental way.
The primary problem of doing research on the
Web is not merely identifying the myriad sites available,
but being able to evaluate them and discover what each
does and how well it does it. New information
technologies will transform notions of literacy, making
online navigation and online research two critical skills
for learners of English. The new reading skills required of
the students include
Efficiently locating information
Rapidly evaluating the source, credibility, and
timeliness of the information located.
Rapidly making navigational decisions about
whether to read the current page of information,
pursue links internal to the page, or revert to
further searching.
SOME WEBSITES THAT CAN BE USED ARE:
Reading and literacy:
http://www.earobics.com/gamegoo/gooey.html
For beginning readers, aimed at younger learners.
http://www.bartleby.com
Thousands of free online texts (poetry, fiction, etc.).
The Online Books Page: http://onlinebooks.library.upenn.edu/
Project Gutenberg: http://www.gutenberg.org/wiki/Main_Page
CNN Student News: http://edition.cnn.com/studentnews/
Multimedia news resources.
Create Your Own Newspaper (CRAYON): http://www.crayon.net/
A tool for managing news sources on the internet and making a newspaper. No fee.
Moonlit Road: http://www.themoonlitroad.com/
Spooky stories with an audio component so students can listen while they read.
Writing and Grammar:
Grammar, Punctuation and Spelling from Purdue's Online Writing Lab (OWL): http://owl.english.purdue.edu/owl/resource/679/01/
Reference materials and practice activities. This OWL also contains many helpful writing guides and exercises, including business-related writing (CVs, memos, etc.).
http://www.marks-english-school.com/games.html
Interactive grammar exercises.
Grammar Bytes, Interactive Grammar Review: http://www.chompchomp.com/menu.htm
Index of grammar terms, interactive exercises, handouts, and a section on grammar rules.
Guide to Grammar and Writing: http://grammar.ccc.commnet.edu/grammar/
Guides and quizzes for grammar and writing from Capital Community College, USA.
ESL Galaxy: http://www.esl-galaxy.com/
Handouts, lesson plans, links to other ESL sites.
ESL Gold: http://www.eslgold.com/
Lesson plans, links to grammar quizzes, good listening section with clear audio.
ESL Tower: http://www.esltower.com/guide.html
Online grammar quizzes, grammar and vocabulary worksheets, pronunciation guides.
Listening and Speaking:
American Rhetoric: http://www.americanrhetoric.com/
Speeches and voice recordings from authors, leaders, comedians and hundreds of notable figures (MP3 format). Some material has an accompanying video.
Voice of America Special English: http://www.voanews.com/specialenglish/
News reports in language adapted for English Language Learners. Includes a glossary and podcasts for English Learners. Broadcasts can be downloaded and played while offline, and transcripts of broadcasts are also available.
BBC World English, Learning English: http://www.bbc.co.uk/worldservice/learningenglish/index.shtml
Music, audio and interactivity to help students learn English. Language study modules are based on news events from the radio.
Listening Skill Practice: http://esl.about.com/od/englishlistening English_Listening_Skills_and_Activities Effective_Listening_Practice.htm
This resource provides listening quizzes, interviews, and specific English learning listening resources.
Randall's ESL Cyber Listening Lab: http://www.esl-lab.com/
A good selection of listening exercises for easy to advanced levels.
Repeat After Us: http://repeatafterus.com/
Copyright-free classics with audio clips, including poems, fables, essays, soliloquies, historical speeches, memorable audio quotes, nursery rhymes, and children's stories from around the world.
Shaggy Dog Stories: http://www.antimoon.com/other/shaggydog.htm
Humorous stories with actors who speak clearly and slowly. Recordings can be downloaded, saved and played while offline.
Online Interactive Games:
Dictionary Online Games: http://www.yourdictionary.com/esl/Free-Online-ESL-Games.html
Links to online English games.
Online Hangman: http://www.manythings.org/hmf/
Online, interactive hangman vocabulary game.
Teacher Development: Online ELT Journals
http://eltj.oxfordjournals.org/
http://iteslj.org/
http://www.languageinindia.com/
http://eca.state.gov/forum/
WORKS CONSULTED
1. Atkinson, D. 1997. A Critical Approach to Critical Thinking. TESOL Quarterly 31, 71-94.
2. Hongladarom, Soraj. Asian Philosophy and Critical Thinking: Divergence or Convergence. Department of Philosophy, Chulalongkorn University.
3. www.u.oregon.edu
4. Matilal, Bimal Krishna. Logic, Language and Reality: Indian Philosophy and Contemporary Issues. Delhi: Motilal Banarsidass. 1990.
5. Paul, Richard. Critical Thinking: How to Prepare Students for a Rapidly Changing World. 1993.
6. Paul, Richard and Linda Elder. The Miniature Guide to Critical Thinking Concepts and Tools. Foundation for Critical Thinking Press. 2008.
7. Needham, Joseph. The Grand Titration: Science and Society in East and West. London: Allen & Unwin. 1969.
ONLINE RESOURCES
1. http://www.adprima.com/wl05.htm
2. http://www.ojp.usdoj.gov/BJA/evaluation/glossary/glossary_c.htm
3. http://www.bsu.edu/IRAA/AA/WB/chapter2.htm
4. http://www.serc.carleton.edu/338
5. http://www.wcer.wisc.edu/archivecl1/CL/doingcl/thinkps.htm
SECURITY AND PRIVACY-ENHANCING MULTI CLOUD
ARCHITECTURES
R.Shobana1
Dr.Dekson2
Department of Computer Science and Engineering
Vinayaka Missions University Chennai, Tamil Nadu, India.
1
[email protected], [email protected]
Abstract— Security challenges are still among the
biggest obstacles when considering the adoption of cloud
services. This triggered a lot of research activities,
resulting in a quantity of proposals targeting the various
cloud security threats. Alongside with these security
issues, the cloud paradigm comes with a new set of
unique features, which open the path toward novel
security approaches, techniques, and architectures. This
paper provides a survey on the achievable security merits
by making use of multiple distinct clouds simultaneously.
Various distinct architectures are introduced and
discussed according to their security and privacy
capabilities and prospects.
Keywords— Cloud, security, privacy, multi cloud,
application partitioning, tier partitioning, data
partitioning, multiparty computation.
INTRODUCTION
CLOUD computing offers dynamically scalable resources
provisioned as a service over the Internet. The third-party,
on-demand, self-service, pay-per-use, and seamlessly
scalable computing resources and services offered by the
cloud paradigm promise to reduce capital as well as
operational expenditures for hardware and software.
Clouds can be categorized taking the physical location
from the viewpoint of the user into account. A public cloud
is offered by third-party service providers and involves
resources outside the user’s premises.
In case the cloud system is installed on the user's premises, usually in the user's own data center, this setup is called a private cloud. A combined approach is denoted as a hybrid cloud. This paper will concentrate on public clouds, because these services demand the highest security requirements but also, as this paper will argue, include high potential for security prospects.
In public clouds, all of the three common cloud service layers (IaaS, PaaS, SaaS) share the commonality that the end-users' digital assets are taken from an intra-organizational to an inter-organizational context. This creates a number of issues, among which security aspects, legislation and compliance are regarded as the most critical factors when considering cloud computing adoption.
A.CLOUD SECURITY ISSUES
Cloud computing creates a large number of security issues
and challenges. A list of security threats to cloud computing has been presented in the literature; these issues range from the required trust in the cloud provider and attacks on cloud interfaces to misusing the cloud services for attacks on
other systems. The main problem that the cloud computing
paradigm implicitly contains is that of secure outsourcing
of sensitive as well as business-critical data and processes.
When considering using a cloud service, the user must be
aware of the fact that all data given to the cloud provider
leave the own control and protection sphere. Even more, if
deploying data-processing applications to the cloud (via
IaaS or PaaS), a cloud provider gains full control on these
processes.
Hence, a strong trust relationship between the
cloud provider and the cloud user is considered a general
prerequisite in cloud computing. Depending on the
political context this trust may touch legal obligations. For
instance, Italian legislation requires that government data
of Italian citizens, if collected by official agencies, have to
remain within Italy. Thus, using a cloud provider from
outside of Italy for realizing an e-government service
provided to Italian citizens would immediately violate this
obligation. Hence, the cloud users must trust the cloud
provider hosting their data within the borders of the
country and never copying them to an off-country location
(not even for backup or in case of local failure) nor
providing access to the data to entities from abroad. An
attacker that has access to the cloud storage component is
able to take snapshots or alter data in the storage. This
might be done once, multiple times, or continuously.
An attacker that also has access to the processing
logic of the cloud can also modify the functions and their
input and output data. Even though in the majority of cases
it may be legitimate to assume a cloud provider to be
honest and handling the customers’ affairs in a respectful
and responsible manner, there still remains a risk of
malicious employees of the cloud provider, successful
attacks and compromisation by third parties, or of actions
ordered by a subpoena. An overview of security flaws and attacks on cloud infrastructures has been given in the literature. Some examples and more recent advances are briefly discussed in the following. Ristenpart et al. presented some attack
techniques for the virtualization of the Amazon EC2 IaaS
service. In their approach, the attacker allocates new
virtual machines until one runs on the same physical
machine as the victim’s machine. Then, the attacker can
perform cross-VM side channel attacks to learn or modify
the victim’s data.
The authors present strategies to reach the desired victim machine with a high probability, and show how to exploit this position for extracting confidential data, e.g., a cryptographic key, from the victim's VM. Finally, they propose the usage of blinding techniques to fend off cross-VM side-channel attacks. Furthermore, a flaw was found in the management interface of Amazon's EC2. The SOAP-based interface uses XML Signature as defined in WS-Security for integrity protection and authenticity verification. Gruschka and Iacono discovered that the EC2 implementation for signature verification is vulnerable to the Signature Wrapping Attack.
B.SECURITY PROSPECTS BY MULTICLOUD ARCHITECTURES
The basic underlying idea is to use multiple distinct clouds
at the same time to mitigate the risks of malicious data
manipulation, disclosure, and process tampering. By
integrating distinct clouds, the trust assumption can be
lowered to an assumption of non collaborating cloud
service providers. Further, this setting makes it much
harder for an external attacker to retrieve or tamper hosted
data or applications of a specific cloud user. The idea of
making use of multiple clouds has been proposed by
Bernstein and Celeste. However, this previous work did
not focus on security. Since then, other approaches
considering the security effects have been proposed. These
approaches are operating on different cloud service levels,
are partly combined with cryptographic methods, and
targeting different usage scenarios.
Replication of applications allows the user to receive multiple results from one operation performed in distinct clouds and to compare them within the own premise. This enables the user to get evidence on the integrity of the result. Partition of the application system into tiers allows separating the logic from the data. This gives additional protection against data leakage due to flaws in the application logic. Partition of the application logic into fragments allows distributing the application logic to distinct clouds. This has two benefits: first, no cloud provider learns the complete application logic; second, no cloud provider learns the overall calculated result of the application. Thus, this leads to data and application confidentiality. Partition of the application data into fragments allows distributing fine-grained fragments of the data to distinct clouds. None of the involved cloud providers gains access to all the data, which safeguards the data's confidentiality.
Each of the introduced architectural patterns provides
individual security merits, which map to different
application scenarios and their security needs. Obviously,
the patterns can be combined resulting in combined
security merits, but also in higher deployment and runtime
effort. The following sections present the four patterns in
more detail and investigate their merits and flaws with
respect to the stated security requirements under the
assumption of one or more compromised cloud systems.
Fig. 1. Replication of application systems.
Assume that n > 1 clouds are available (like, e.g., Clouds A and B in Fig. 1). All of the n adopted clouds perform the same task. Assume further that f denotes the number of malicious clouds and that n − f > f, i.e., the majority of the clouds are honest. The correct result can then be obtained by the cloud user by comparing the results and taking the majority as the correct one. There are other methods of deriving the correct result, for instance using the Turpin-Coan algorithm for solving the General Byzantine Agreement problem. Instead of having the cloud user perform the verification task, another viable approach consists in having one cloud monitor the execution of the other clouds. For instance, Cloud A may announce intermediate results of its computations to an associated monitoring process running at Cloud B. This way, Cloud B can verify that Cloud A makes progress and sticks to the computation intended by the cloud user. As an extension of this approach, Cloud B may run a model checker service that verifies the execution path taken by Cloud A on the fly, allowing for immediate detection of irregularities.
This architecture enables the user to verify the integrity of results obtained from tasks deployed to the cloud. On the other hand, it needs to be noted that it does not provide any protection with respect to the confidentiality of data or processes. On the contrary, this approach might have a negative impact on confidentiality because, due to the deployment of multiple clouds, the risk rises that one of them is malicious or compromised. To implement protection against unauthorized access to data and logic, this architecture needs to be combined with the architectures introduced in the following sections. The idea of resource replication can be found in many other disciplines. In the design of dependable systems, for example, it is used to increase the robustness of the system, especially against system failures.
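A minimal sketch of the result-comparison step of this replication pattern is given below; the cloud invocations themselves are abstracted away and all names are hypothetical. It simply takes the strict majority of the returned results, under the assumption stated above that fewer than half of the clouds are malicious.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: the same task was submitted to n clouds and the
// already-collected results are compared; the strict majority is accepted.
public class MajorityVoteSketch {
    static String majorityResult(List<String> resultsFromClouds) {
        Map<String, Integer> counts = new HashMap<>();
        for (String r : resultsFromClouds) {
            counts.merge(r, 1, Integer::sum);   // count identical results
        }
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() * 2 > resultsFromClouds.size()) {
                return entry.getKey();          // strict majority found
            }
        }
        throw new IllegalStateException("no majority: integrity cannot be established");
    }

    public static void main(String[] args) {
        // e.g., Clouds A and C agree while Cloud B returned a tampered value.
        System.out.println(majorityResult(List.of("42", "999", "42")));
    }
}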
C.PARTITION OF APPLICATION SYSTEM INTO
TIERS
The architectural pattern described in the previous section enables the cloud user to get some evidence on the integrity of the computations performed on a third party's resources or services. The architecture introduced in this section targets
the risk of undesired data leakage. It answers the question
on how a cloud user can be sure that the data access is
implemented and enforced effectively and that errors in the
application logic do not affect the user’s data. To limit the
risk of undesired data leakage due to application logic
flaws, the separation of the application system’s tiers and
their delegation to distinct clouds is proposed (see Fig. 2).
In case of an application failure, the data are not
immediately at risk since it is physically separated and
protected by an independent access control scheme.
Moreover, the cloud user has the choice to select a
particular—probably specially trusted—cloud provider for
data storage services and a different cloud provider for
applications.
It needs to be noted, that the security services provided by
this architecture can only be fully exploited if the
execution of the application logic on the data is performed
on the cloud user’s system. Only in this case, the
application provider does not learn anything on the users’
data. Thus, the SaaS-based delivery of an application to the
user side in conjunction with the controlled access to the
user’s data performed from the same user’s system is the
most far-reaching instantiation.
Fig. 2. Partition of application system into tiers.
Besides the introduced overhead due to the additionally involved cloud, this architecture moreover requires standardized interfaces to couple applications with data services provided by distinct parties. Although generic data services might serve a wide range of applications, there will be a need for application-specific services as well. The
partitioning of application systems into tiers and
distributing the tiers to distinct clouds provides some
coarse-grained protection against data leakage in the
presence of flaws in application design or implementation.
This architectural concept can be applied to all three cloud
layers. In the next section, a case study at the SaaS-layer is
discussed.
D.PARTITION OF APPLICATION LOGIC INTO
FRAGMENTS
Fig. 3. Partition of application logic into fragments.
This architecture variant targets the confidentiality of data
and processing logic. It gives an answer to the following
question: How can a cloud user avoid fully revealing the
data or processing logic to the cloud provider. The data
should not only be protected while in the persistent
storage, but in particular when it is processed.
The idea of this architecture is that the application logic
needs to be partitioned into fine-grained parts and these
parts are distributed to distinct clouds (see Fig. 3). This
approach can be instantiated in different ways depending
on how the partitioning is performed. The clouds
participating in the fragmented applications can be
symmetric or asymmetric in terms of computing power
and trust. Two concepts are common. The first involves a
trusted private cloud that takes a small critical share of the
computation, and a untrusted public cloud that takes most
of the computational load. The second distributes the
computation among several untrusted public clouds, with
the assumption that these clouds will not collude to break
the security.
E.OBFUSCATING SPLITTING
By this approach, application parts are distributed to
different clouds in such a way, that every single cloud has
only a partial view on the application and gains only
limited knowledge. Therefore, this method can also hide
parts of the application logic from the clouds. For
application splitting, a first approach is using the existing
sequential or parallel logic separation. Thus, depending on
the application, every cloud provider just performs
subtasks on a subset of the data. An approach by Danezis and Livshits is built around a secure storage architecture and focuses on online service provisioning, where the service depends on the result of function evaluations on the user's data.
This proposal uses the cloud as a secure storage, with keys
remaining on client side, e.g., in a private cloud. The
application is split in the following way: The service sends
the function to be evaluated to the client. The client
retrieves his necessary raw data and processes it according
to the service needs. The result and a proof of correctness are given back to the service-providing public cloud.
F.PARTITION OF APPLICATION LOGIC/DATA
Pseudonymization based on the Obfuscated
Splitting approach could be used, e.g., in Human
Resources or Customer Relationship Management. A
potential cloud customer would have to remove all directly
identifying data in the first place, like name, social security
number, credit card information, or address, and store this
information separately, either on premise or in a cloud
with adequately high-security controls. The remaining data
can still be linked to the directly identifying data by means
of an unobvious identifier (the pseudonym), which is
unusable for any malicious third parties. The unlinkability of the combined pseudonymized data to a person can be ensured by performing a carefully conducted privacy risk assessment. These assessments are always constrained by the assumptions of an adversary's “reasonable means”. The
cloud customer has the option to outsource the
pseudonymized data to a cloud service provider with fewer
security controls, which may result in additional cost
savings. If the customer decides to outsource the directly
identifiable data to a different cloud service provider, she
has to ensure that these two providers do not cooperate,
e.g., by using the same IaaS provider in the backend.
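As an illustration of this pseudonymization idea, the following Java sketch splits a record into a directly identifying part and a pseudonymized part linked only by a random identifier; the two maps stand in for the differently trusted cloud providers, and all names are hypothetical.

import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of obfuscated splitting via pseudonymization: the
// identifying attributes and the remaining record are linked only by a
// random, unobvious pseudonym and stored with different providers.
public class PseudonymSplitSketch {
    record IdentifyingData(String name, String socialSecurityNumber, String address) {}
    record PseudonymizedData(String pseudonym, String jobTitle, String salaryBand) {}

    public static void main(String[] args) {
        // Generate an unobvious pseudonym that links the two halves.
        String pseudonym = UUID.randomUUID().toString();

        // Store the identifying half on premise or in a high-security cloud ...
        Map<String, IdentifyingData> highSecurityStore = Map.of(
                pseudonym, new IdentifyingData("A. Agent", "123-45-6789", "Chennai"));

        // ... and the pseudonymized half with a cheaper, less trusted provider.
        Map<String, PseudonymizedData> lowSecurityStore = Map.of(
                pseudonym, new PseudonymizedData(pseudonym, "Analyst", "Band 3"));

        // Only a party holding both stores (and the pseudonym) can re-link them.
        System.out.println(highSecurityStore.get(pseudonym).name() + " -> "
                + lowSecurityStore.get(pseudonym).jobTitle());
    }
}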
G.CONCLUSION
Using multiple cloud providers to gain security and privacy benefits is nontrivial. As the approaches investigated in this paper clearly show, there is no single optimal approach to foster both security and legal compliance in an omniapplicable manner. Moreover, the approaches that are favorable from a technical perspective appear less appealing from a regulatory point of view, and vice versa. The few approaches that score sufficiently in both these dimensions lack versatility and ease of use, and hence can be used in very rare circumstances only. As can be seen from the discussions of the four major multi cloud approaches, each of them has its pitfalls and weak spots, either in terms of security guarantees, in terms of compliance with legal obligations, or in terms of feasibility. Given that every type of multi cloud approach falls into one of these four categories, this implies a state of the art that is somewhat dissatisfying. However, two major indications for improvement can be taken from the examinations performed in this paper. First of all, given that for each type of security problem there exists at least one technical solution approach, a highly interesting field for future research lies in combining the approaches presented here. For instance, using the n clouds approach (and its integrity guarantees) in combination with sound data encryption (and its confidentiality guarantees) may result in approaches that suffice for both technical and regulatory requirements.
REFERENCES
[1] P. Mell and T. Grance, "The NIST Definition of Cloud Computing, Version 15," Nat'l Inst. of Standards and Technology, Information Technology Laboratory, vol. 53, p. 50, http://csrc.nist.gov/groups/SNS/cloud-computing/, 2010.
[2] F. Gens, "IT Cloud Services User Survey, pt.2: Top Benefits & Challenges," blog, http://blogs.idc.com/ie/?p=210, 2008.
[3] Gartner, "Gartner Says Cloud Adoption in Europe Will Trail U.S. by at Least Two Years," http://www.gartner.com/it/page.jsp?id=2032215, May 2012.
[4] J.-M. Bohli, M. Jensen, N. Gruschka, J. Schwenk, and L. Lo Iacono, "Security Prospects through Cloud Computing by Adopting Multiple Clouds," Proc. IEEE Fourth Int'l Conf. Cloud Computing (CLOUD), 2011.
[5] D. Hubbard and M. Sutton, "Top Threats to Cloud Computing V1.0," Cloud Security Alliance, http://www.cloudsecurityalliance.org/topthreats, 2010.
[6] M. Jensen, J. Schwenk, N. Gruschka, and L. Lo Iacono, "On Technical Security Issues in Cloud Computing," Proc. IEEE Int'l Conf. Cloud Computing (CLOUD-II), 2009.
[7] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage, "Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds," Proc. 16th ACM Conf. Computer and Comm. Security (CCS '09), pp. 199-212, 2009.
[8] Y. Zhang, A. Juels, M.K. Reiter, and T. Ristenpart, "Cross-VM Side Channels and Their Use to Extract Private Keys," Proc. ACM Conf. Computer and Comm. Security (CCS '12), pp. 305-316, 2012.
[9] N. Gruschka and L. Lo Iacono, "Vulnerable Cloud: SOAP Message Security Validation Revisited," Proc. IEEE Int'l Conf. Web Services (ICWS '09), 2009.
[10] M. McIntosh and P. Austel, "XML Signature Element Wrapping Attacks and Countermeasures," Proc. Workshop Secure Web Services, pp. 20-27, 2005.
[11] J. Kincaid, "Google Privacy Blunder Shares Your Docs without Permission," TechCrunch, http://techcrunch.com/2009/03/07/huge-google-privacy-blunder-shares-your-docs-without-permission/, 2009.
[12] J. Somorovsky, M. Heiderich, M. Jensen, J. Schwenk, N. Gruschka, and L. Lo Iacono, "All Your Clouds Are Belong to Us: Security Analysis of Cloud Management Interfaces," Proc. Third ACM Workshop on Cloud Computing Security (CCSW '11), pp. 3-14, 2011.
[13] S. Bugiel, S. Nürnberger, T. Pöppelmann, A.-R. Sadeghi, and T. Schneider, "AmazonIA: When Elasticity Snaps Back," Proc. 18th ACM Conf. Computer and Comm. Security (CCS '11), pp. 389-400, 2011.
[14] D. Bernstein, E. Ludvigson, K. Sankar, S. Diamond, and M. Morrow, "Blueprint for the Intercloud - Protocols and Formats for Cloud Computing Interoperability," Proc. Int'l Conf. Internet and Web Applications and Services, pp. 328-336, 2009.
STRATAGEM OF USING WEB 2.0 TOOLS IN TL PROCESS
Mrs. B. Mohana Priya,
Assistant Professor in English,
AVIT, Paiyanoor, Chennai.
[email protected]
Abstract : Just as pen, paper, scissors, glue,
crayons, construction paper, typewriter and
watercolors were some of the tools many of us used
to produce reports and share what we were
learning, blogs, wikis, photo sharing sites, podcasts
and other new online resources are the tools of
today’s students. And just as we had to learn to cut,
to color, to use cursive writing, our students must
learn how to use these new tools. That means we
must use the tools, evaluate their usefulness, and
teach students to use them effectively as well. It is
the teachers and teacher educators who must
embrace these new digital tools— hopefully,
leading the way as we have in other areas of
technology in the past. Early websites were passive: one could read information from the page, but couldn't add to the information or change it in any way. This paper will focus on the newer tools that are commonly called Web 2.0 tools because they allow for much interactivity and user-created content. It is said that Web 1.0 was about
locating information, and Web 2.0 is about using
websites as application software much as one uses
MSWord or PowerPoint or other software on our
computer. Web 2.0 sites allow one to read and write.
In addition, most Web 2.0 sites offer the opportunity
to share and/or collaborate on the work. Web
2.0 tools provide digital equity, too, providing
knowledge about tools students and teachers can
use outside of school. This paper discusses some
web 2 tools and how they can be used in teaching
and learning process. Actually the list of web 2 tools
is endless; however here only the important ones
are given.
Key words: Blogs, Wikis, Social Bookmarking,
Media-Sharing Services, Podcasting, flickr,
INTRODUCTION
It is the teachers and teacher educators who
must embrace these new digital tools— hopefully,
leading the way as we have in other areas of
technology in the past. Early websites were passive: one could read information from the page, but couldn't add to the information or change it in any way. The newer tools are commonly called Web 2.0 tools because they allow for much interactivity and user-created content. It is said that
Web 1.0 was about locating information, and Web
2.0 is about using websites as application software
much as one uses MSWord or PowerPoint or other
software on our computer. Web 2.0 sites allow one to
read and write. In addition, most Web 2.0 sites offer
the opportunity to share and/or collaborate on the
work. Web 2.0 tools provide digital equity, too,
providing knowledge about tools students and
teachers can use outside of school. Actually, the list of Web 2.0 tools is endless; however, here only the important ones are given.
BLOGS
A blog is a system that allows a single
author, or sometimes, but less often, a group of
authors to write and publicly display time-ordered
articles called posts. Readers can add comment to
posts. It is usually maintained by an individual with
regular entries of commentary, descriptions of
events, or other material such as graphics or video.
Entries are commonly displayed in reverse-chronological
They provide commentary or news on a
particular subject; others function as more personal
online diaries. They combine text, images, and links
to other blogs, WebPages, and other media related to
its topic. The ability for readers to leave comments in
an interactive format is an important part of many
blogs. The author of a blog usually organizes it as a
chronological series of postings. Although some
groups of people contribute to blogs, there is usually
only one central author for each. As of December
2007, blog search engine Technorati was tracking
more than 112 million blogs. The uses of blogs are as
follows:
The user can view entries in other users' blogs. If the author prohibits this capability, the user will not be able to read any blogs on the system.
The user can create new blog entries.
The user can edit and manage entries in her own blog or change and delete other users' entries.
The user can create and delete user-defined tags that others may use.
A user can create and delete the official tags that all users see.
A group of bloggers using their individual
blogs can build up a corpus of interrelated
knowledge via posts and comments. This might
be a group of learners in a class, encouraged
and facilitated by a teacher, or a group of
relatively dedicated life-long learners.
Teachers can use a blog for course
announcements, news and feedback to students.
Blogs can be used with syndication
technologies to enable groups of learners and
teachers to
easily keep track of new posts
WIKIS
A wiki is a system that allows one or more
people to build up a corpus of knowledge in a set of
interlinked web pages, using a process of creating
and editing pages. The most famous wiki is
Wikipedia. A wiki is collaborative learning software used to enhance group learning. Leveraging such social networks helps facilitate collaborative learning and knowledge sharing amongst learners, especially the younger generation. This medium augurs well for Gen Y learners who are tech-savvy and used to collaborating with each other in a networked environment. A wiki can be used as a support group learning platform in learning programmes after participants have completed the initial traditional classroom teaching. A wiki site would be set up to allow participants to work in teams to resolve their common challenges, which had been discussed in the classroom. The participants will then work together over the web for some weeks after the training.
EDUCATIONAL USES OF WIKIS
• Wikis can be used for the creation of annotated
reading lists by one or more teachers
• Wikis can be used in class projects, and are
particularly suited to the incremental accretion
of knowledge by a group, or production of
collaboratively edited material, including
material documenting group projects.
• Wikis can be used by teachers to supply
scaffolding for writing activities thus in a group
project a teacher can supply page structure, hints
as to desirable content, and then provide
feedback on student generated content.
• Students can flag areas of the wiki that need attention, and provide feedback on each other's writing.
SOCIAL BOOKMARKING
A social bookmarking service provides
users the ability to record (bookmark) web pages,
and tag those records with significant words (tags)
that describe the pages being recorded. Examples
include Delicious and BibSonomy. Over time users
build up collections of records with common tags,
and users can search for bookmarked items by likely
tags. Since items have been deemed worthy of being
bookmarked and classified with one or more tags,
social bookmarking services can sometimes be more
effective than search engines for finding Internet
resources.
Users can find other users who use the same
tag and who are likely to be interested in the same
topic(s). In some social bookmarking systems, users
with common interests can be added to an
individual’s own network to enable easy monitoring
of the other users' tagging activity for interesting
items. Syndication (discussed below) can be used to
monitor tagging activity by users, by tags or by both
of these.
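As a rough illustration of the tag-based organisation just described (not any particular service's implementation; the class and names below are hypothetical), a bookmarking index can be modelled as a mapping from tags to the users and URLs that carry them:

```python
from collections import defaultdict

class BookmarkIndex:
    """Toy in-memory model of a social bookmarking service (illustrative only)."""

    def __init__(self):
        # tag -> set of (user, url) records filed under that tag
        self.by_tag = defaultdict(set)

    def bookmark(self, user, url, tags):
        """Record a web page for a user under one or more descriptive tags."""
        for tag in tags:
            self.by_tag[tag.lower()].add((user, url))

    def items_for(self, tag):
        """All bookmarked URLs filed under a tag."""
        return {url for _, url in self.by_tag[tag.lower()]}

    def users_for(self, tag):
        """Users who have used the same tag and so share an interest."""
        return {user for user, _ in self.by_tag[tag.lower()]}

index = BookmarkIndex()
index.bookmark("alice", "https://example.org/web2.0-overview", ["elearning", "web2.0"])
index.bookmark("bob", "https://example.org/wiki-guide", ["elearning", "wiki"])
print(index.items_for("elearning"))   # both URLs
print(index.users_for("elearning"))   # {'alice', 'bob'}
```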
EDUCATIONAL USES OF SOCIAL BOOKMARKING
• Teachers and learners can build up collections of resources, and with a little ingenuity can also use social bookmarking systems to bookmark resources that are not on the web. In this way it is easy to build up reading lists and resource lists. These may, with the use of multiple tags, be structured into subcategories.
• Groups of users with a common interest can team together to use the same bookmarking service to bookmark items of common interest. If they have individual bookmarking accounts, they all need to use the same tag to identify their resources.
MEDIA-SHARING SERVICES
These services store user-contributed media,
and allow users to search for and display content.
Besides being a showcase for creative endeavour,
these services can form valuable educational
resources. Compelling examples include YouTube
(movies), iTunes (podcasts and vidcasts), Flickr
(photos), Slideshare (presentations), DeviantArt (art
work) and Scribd (documents). Scribd is particularly
interesting as it provides the ability to upload
documents in different formats and then, for
accessibility, to choose different download formats,
including computer-generated speech, which
provides a breadth of affordances not found in
traditional systems. Podcasting is a way in which a
listener may conveniently keep up-to-date with
recent audio or video content. Behind the scenes
podcasting is a combination of audio or video
content, RSS, and a program that deals with (a) RSS
notifications of new content, and (b) playback or
download of that new content to a personal
audio/video player. Vidcasts are video versions of
podcasts.
EDUCATIONAL USES OF MEDIA SHARING
SERVICES
• Podcasts can be used to provide introductory material before lectures or, more commonly, to record lectures and allow students to listen to the lectures again, either because they were unable to attend or to reinforce their learning.
• Podcasts can be used to make lectures redundant while still supplying (possibly didactic) presentations of learning material by lecturers.
• Vidcasts can be used to supply videos of experimental procedures in advance of lab sessions.
• Podcasts can be used to supply audio tutorial material and/or exemplar recordings of native speakers to foreign language learners.
• Distribution and sharing of educational media and resources. For example, an art history class could have access to a set of art works via a photo-sharing system.
• The ability to comment on and critique each other's work, including by people on other courses or at other institutions.
• Flickr allows annotations to be associated with different areas of an image and comments to be made on the image as a whole, thereby facilitating teacher explanations, class discussion and collaborative comment. It could be used for the example above.
• For Flickr, FlickrCC is a particularly useful ancillary service that allows users to find Creative Commons licensed images that are freely reusable as educational resources.
• Instructional videos and seminar recordings can be hosted on video-sharing systems. Google Video allows for longer, higher-quality videos than YouTube, and contains a specific genre of educational videos.
PODCASTING
A podcast is a series of audio or video digital media files distributed over the Internet by syndicated download, through web feeds, to portable media players and personal computers. Though the content may be made available by direct download or streaming, a podcast is distinguished from most other digital media formats by its ability to be syndicated, subscribed to, and downloaded automatically when new content is added. The author of a podcast is called a podcaster. Podcasting is becoming increasingly popular in education. Podcasts enable students and teachers to share information with anyone at any time. An absent student can download the podcast of the recorded lesson. It can be a tool for teachers or administrators to communicate curriculum, assignments and other information with parents and the community. Teachers can record book discussions, vocabulary or foreign language lessons, international pen pal letters, music performances, interviews, and debates. Podcasting can be a publishing tool for student oral presentations. Video podcasts can be used in all these ways as well.
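Behind the scenes, a podcast client does little more than poll a feed and fetch enclosures it has not seen before. The sketch below illustrates that automatic-download step; the feed URL is hypothetical and the handling is deliberately minimal, using only the Python standard library:

```python
import os
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://example.edu/lectures/feed.xml"   # hypothetical course podcast feed
DOWNLOAD_DIR = "episodes"

def fetch_new_episodes(feed_url=FEED_URL, download_dir=DOWNLOAD_DIR):
    """Download any enclosure (audio/video file) not already saved locally."""
    os.makedirs(download_dir, exist_ok=True)
    with urllib.request.urlopen(feed_url) as resp:
        channel = ET.parse(resp).getroot().find("channel")
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")        # RSS 2.0 <enclosure url=... type=.../>
        if enclosure is None:
            continue
        url = enclosure.get("url")
        target = os.path.join(download_dir, os.path.basename(url))
        if not os.path.exists(target):            # only new content is fetched
            urllib.request.urlretrieve(url, target)
            print("downloaded:", item.findtext("title"), "->", target)

if __name__ == "__main__":
    fetch_new_episodes()
```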
FLICKR
Flickr is an image and video hosting
website, web services suite, and online community
platform. It was one of the earliest Web 2.0
applications. In addition to being a popular Web site
for users to share personal photographs, the service is
widely used by bloggers as a photo repository. Its
popularity has been fueled by its organization tools,
which allow photos to be tagged and browsed by
folksonomic means. As of November 2008, it claims
to host more than 3 billion images. The steps in
Flickr are as follows:
• Upload
• Edit
• Organize
• Share
• Maps
• Make Stuff
• Keep in Touch
YOUTUBE
YouTube is a video-sharing website where users can upload, view and share video clips. It uses the Adobe Flash Video technology to display a wide variety of user-generated video content, including movie clips, TV clips, and music videos, as well as amateur content such as video blogging and short original videos. Most of the content is uploaded by members of the public.
SKYPE
It is software that allows users to make
telephone calls over the Internet. Calls to other users
of the service and to free-of-charge numbers are free,
while calls to other landlines and mobile phones can
be made for a fee. Additional features include instant
messaging, file transfer and video conferencing.
Skypecasting is the podcasting of recorded Skype voice-over-IP calls and teleconferences; the recordings can then be published as podcasts, making the audio/video content available over the Internet. Some of the common features of Skype are Great Value Calls, an online number, an SMS facility, voicemail and call forwarding.
SOCIAL NETWORKING AND SOCIAL
PRESENCE SYSTEMS
Systems allow people to network together
for various purposes. Examples include Facebook
and MySpace (for social networking/socialising),
LinkedIn (for professional networking), Second Life
(virtual world) and Elgg (for knowledge accretion
and learning). Social networking systems allow users
to describe themselves and their interests, and they
generally implement notions of friends, ranking, and
communities. The ability to record who one’s friends
are is a common feature that enables traversal and
navigation of social networks via sequences of
friends. Ranking and communities are more
selectively implemented. Ranking of user
contributions by community members allows for
reputations to be built and for individuals to become
members of good standing; this can be an important
motivator for the individual contributions that make
for a thriving community. The ability to create sub-communities allows for the nurturing and growth of sub-community interests in an environment that provides a degree of insulation from the general hubbub of system activity.
EDUCATIONAL USES OF SOCIAL NETWORKING SYSTEMS
• LinkedIn acts, at a professional level, as a model of educational use in the way in which it can be used to disseminate questions across the community for users seeking particular information.
• There are a wide variety of educational experiments being carried out in Second Life. These vary from the mundane with a virtual-world gloss to more adventurous experiments that take advantage of the virtual reality facilities (e.g. construction of ancient environments for exploration by students).
• Students can create their end-of-year show in Second Life.
• Other varieties of social networking systems are used at a professional level for community learning and act as potential models for educational use: e.g. Confluence, a corporate wiki system with a social network focus, is currently being used in a pilot project by Manchester Business School to promote the spread of knowledge in Local Government communities.
FACEBOOK
With a Facebook account, one can post updates of one's activities to one's friends. But social blogging is different from blogging in a learning environment, and we will need to work closely with our students to create effective blogs. It is recommended that we allow each student to create his or her own blogging goals.
As David Hawkins writes in his book The
Roots of Literacy, “Children can learn to read and
write with commitment and quality just in proportion
as they are engaged with matters of importance to
them, and about which at some point they wish to
read and write.”
MYSPACE
MySpace is a social networking website with an interactive, user-submitted network of friends, personal profiles, blogs, groups, photos, music, and videos for teenagers and adults internationally. MySpace is the most popular social networking site, attracting 230,000 new users per day. Bulletin boards, MySpace IM, MySpace TV, MySpace Mobile, MySpace News, MySpace Classifieds, MySpace Karaoke, MySpace Polls and MySpace forums are some of the features of MySpace.
COLLABORATIVE EDITING TOOLS
These allow users in different locations to
collaboratively edit the same document at the same
time. As yet most of these services do not allow for
synchronous voice or video communication, so the
use of third party synchronous communication
systems is often needed to co-ordinate editing
activity. Examples are Google Docs & Spreadsheets
(for text documents and spreadsheets), and Gliffy
(for diagrams). There are over 600 such applications.
EDUCATIONAL USES OF COLLABORATIVE
EDITING TOOLS
• For collaborative work over the web, either edited simultaneously or simply to share work edited by different individuals at different times.
• Creation of works of art or design across disciplines. For instance, architecture and interior design students from different universities are working together to complete a commercial brief.
SYNDICATION AND NOTIFICATION TECHNOLOGIES
In a world of newly added and updated
shared content, it is useful to be able to easily keep
up-to-date with new and changed content,
particularly if one is interested in multiple sources of
information on multiple web sites. A feed reader
(sometimes called an aggregator) can be used to
centralize all the recent changes in the sources of
interest, and a user can easily use the
reader/aggregator to view recent additions and
changes. Behind the scenes this relies on protocols
called RSS (Really Simple Syndication) and Atom to
list changes (these lists of changes are called feeds,
giving rise to the name feed reader). A feed reader
regularly polls nominated sites for their feeds,
displays changes in summary form, and allows the
user to see the complete changes.
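As a concrete (and deliberately simplified) picture of what such a reader does, the sketch below polls a few nominated feeds and prints items it has not shown before; the feed URLs are examples, and a real aggregator would persist its state and poll on a schedule:

```python
import urllib.request
import xml.etree.ElementTree as ET

# Feeds a learner or teacher might follow (example URLs).
FEEDS = [
    "https://example.edu/class-blog/rss.xml",
    "https://feeds.bbci.co.uk/news/rss.xml",
]

seen_links = set()   # a real feed reader would persist this between polls

def poll_once(feed_urls=FEEDS):
    """Fetch each nominated feed and list items not seen on earlier polls."""
    for feed_url in feed_urls:
        with urllib.request.urlopen(feed_url) as resp:
            channel = ET.parse(resp).getroot().find("channel")
        for item in channel.findall("item"):
            link = item.findtext("link")
            if link and link not in seen_links:
                seen_links.add(link)
                print(item.findtext("title"), "-", link)

if __name__ == "__main__":
    poll_once()   # run periodically (e.g. from a scheduler) to stay up to date
```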
EDUCATIONAL USES OF SYNDICATION
AND NOTIFICATION TECHNOLOGIES
• In a group project where a wiki is being developed collaboratively, RSS feeds can be used to keep all members of the group up to date, as they can be automatically notified of changes as they are made. Similarly for new blog posts made by class members.
• Feed readers enable students and teachers to become aware of new blog posts in educational blogging scenarios, to track the use of tags in social bookmarking systems, to keep track of new shared media, and to be aware of current news, e.g. from the BBC.
TWITTER
Twitter is an online tool that lets you write
brief text updates of up to 140 characters and
broadcast them. People can choose to follow these updates, or tweets, and you can follow others and receive their tweets. To join, one goes to www.twitter.com and signs up by choosing a unique username and giving an email address; after sign-up, the Twitter home page is set up. It is one of the best-known applications to be integrated with mobile phones.
CONCLUSION
This paper highlights that Web 2.0 will have profound implications for learners and teachers in formal, informal, work-based and life-long education. Web 2.0 will affect how universities go about the business of education, from learning, teaching and assessment, through contact with school communities, widening participation, interfacing with industry, and maintaining contact with alumni. However, it would be a mistake to consider Web 2.0 as the sole driver of these changes; instead, Web 2.0 is just one part of the education system. Other drivers include, for example, pressures for greater efficiency, changes in the student population, and an ongoing emphasis on better learning and teaching methods. Only then can web tools be utilized with greater efficiency and efficacy.
THE COLLISION OF TECHNO- PEDAGOGICAL
COLLABORATION
Mrs. B. Mohana Priya,
Assistant Professor in English,
AVIT, Paiyanoor, Chennai.
[email protected]
Abstract: Information and communications technology
adds value to learning by providing real-world contexts
for learning; connections to outside experts;
visualizations and analysis tools; scaffolds for problem
solving; and opportunities for feedback, reflection, and
revision. The framework discussed here focuses on standards, engaged learning, teachers developing curriculum locally, and professional development in which teacher trainers build capacity as they become experts and take that expertise into their local school systems.
This paper provides a review of existing policy
guidelines as well as a short discussion of possible project guidelines centered on professional development for in-service teachers, and it describes several innovative approaches to integrating ICT into the curriculum. This paper does not address pre-service
education, although many of the concepts can be
integrated into a teacher education program. The paper
builds upon tools that already exist and revises those
tools as needed to the national, regional, and local
contexts of project stakeholders. It is vital to provide
ongoing professional development so that all educators
will participate in decisions about learning and
technology.
Key words: Techno-pedagogical Training in
ICT, Professional Development, WWK (What We
Know)
PRELUDE
“Universal participation means that all students in all
schools have access to and
are active on the information highway in ways that
support engaged learning”.
Information and communications technology
(ICT) adds value to learning by providing real-world
contexts for learning; connections to outside experts;
visualizations and analysis tools; scaffolds for problem
solving; and opportunities for feedback, reflection, and
revision. Nowadays, the framework is prepared on
standards, engaged learning, teachers developing
curriculum locally, and professional development in
which teacher trainers build capacity as they become
experts and take that expertise into their local school
systems.
REVIEW OF EXISTING POLICY GUIDELINES:
Several sets of policy issues affect a school's ability to use technology for engaged learning experiences and should be factored into a professional development plan: equity, standards, finance, coordination, commitment, and the role of parents and community members.
EQUITY:
If we believe that all students can learn, we must overcome barriers to all students using technology. For schools with high populations at risk, policymakers must:
• Provide opportunities for administrators, teachers, and students to become informed about and experience the best technologies and technology-enhanced programs.
• Establish curricula and assessments that reflect engaged learning to the highest degree for students at risk.
• Give teachers permission and time to explore and experiment with new learning and instructional methods.
• Provide ongoing professional development to develop new learner outcomes, and assessments that use the best technologies and programs.
STANDARDS:
This issue involves making sure that there are high
standards for all children and that students have
opportunities to complete challenging tasks using
technology. Policies need to integrate curriculum,
instruction, assessment, and technology to ensure support
of engaged learning. Additionally, standards for what
constitutes high-performance technologies that promote
learning need to be agreed upon.
FINANCE:
If education is to change, in whatever form that
applies to this project, the funding structures of schooling
must be a part of that change.
COORDINATION:
Coordination involves many different policy
players and many different configurations of technology
and telecommunications in the private and public sector.
Shared financing and improved technology access and use in school-to-work programs are essential for promoting workplace technologies for students.
COMMITMENT:
It is vital to provide ongoing professional development so that all educators will participate in decisions about learning and technology. It involves time, financing, staffing, and powerful models based on research on learning, professional development, and technology emerging from cognitive science and related fields.
ROLE OF THE PARENT/COMMUNITY:
Many parents or community members do not
understand the educational shift toward technology use.
They do not understand its significance in their children’s
schooling and on their children’s later capability in the
workplace. It is essential to place these policy issues into
the context of a teacher education curriculum and
professional development programs. Teaching in-service
and pre-service teachers just to use technologies is not
enough. Teachers can, and must, play a critical role as
instructional leaders who are aware of the policy
implications associated with instructional decisions.
Specifically, the success or failure of technology is more
dependent on human and contextual factors than on
hardware or software.
Human factors include 1.) The extent to which
teachers are given time and access to pertinent training to
use technology to support learning; 2.) Seeing technology
as a valuable resource; 3.) Determining where it can have
the highest payoff and then matching the design of the
application with the intended purpose and learning goal;
4.) Having significant critical access to hardware and
applications that are appropriate to the learning
expectations of the activity; and 5.) Teachers’ perception
that technology has improved the climate for learning.
Technology implementation requires a well-designed
systemic plan and extensive professional development.
SUGGESTED GUIDELINES:
Professional Development can be designed in such
a way as to deliver experiences that meet the unique
needs of diverse learners and build capacity to improve
educational practice. Two questions are central to all
activities: 1) In what ways does this technology promote
engaged and meaningful learning? 2) How does
technology enhance and extend this lesson in ways that
would not be possible without it? Print, video, and other
electronic resources can help project participants address
these questions. Resources should reflect research about
teaching, learning, and technology, but with guidance
from the wisdom of practitioners.
INNOVATIVE APPROACHES TO TECHNO-PEDAGOGICAL TRAINING IN ICT:
1. NCREL’s Learning with Technology (LWT) is a
professional development experience that has been
structured around five adult-oriented instructional
phases that are cyclical and serve as scaffolds for
each other: Build a Knowledge Base; Observe
Models and Cases; Reflect on Practice; Change
Practice; and Gain and Share Expertise.
2. Creating online learning environments for teachers to extend face-to-face professional development and to connect them to their peers seems to have great potential in planning a professional development program. Experienced teachers do not necessarily turn to institutions to get help in using technology or integrating it into curriculum.
3. The role of the local, regional, and national levels in the effectiveness of a professional development program for technology is to integrate cultural factors related to pedagogy; the factors that need attention include communal versus individual orientation, student discipline, assessments, forms of communication, group work versus individual work, notions of duty and responsibility, and the amount of structuring of educational
experiences, and so on. Local contexts range from
rural to urban, many languages to English, Non-ICT
environments to ICT-rich environments, agricultural
to commercial/industrial, low literacy to high
literacy, educational goals from minimal education to
university graduates, less funding to more funding,
few Professional Development (PD) opportunities to
many PD opportunities, and finally, more localized
schooling to more centralized or nationalized
schooling. A professional development framework
that can standardize a good amount of professional
development activities, perhaps with 70 percent
overlap across all countries, with the remaining
professional development customizable for local and
regional contexts, may provide a successful model
for technology integration. The following table
illustrates this approach as an example:
1. Survey targeted regions to identify educational contexts.
2. Describe generic contexts, in terms related to: cultural factors, systemic factors, pedagogy, instructional use of ICT.
3. Adapt existing resources: case studies, assessments, lesson plans, classroom resources, teacher training materials.
4. Package the resources for use within the generic contexts (local, regional, national).
POSSIBILITIES FOR PROFESSIONAL DEVELOPMENT:
We can provide school leaders with more and
better access to procedural knowledge necessary to
implement systemic applications of technology to
learning; provide educators with high-quality professional
development resources related to the application of
technology to learning; help leaders, policymakers, and
administrators align governance and administration
around technology integration; and provide educators and
policymakers with information to help them understand
the issues related to technology access and equity.
CONCRETE/EXPECTED OUTCOMES:
• A holistic framework for pre-service and in-service teacher education in the use of ICT as tools, and a master plan for project implementation.
• A regional guideline developed on policies, approaches and curriculum framework for both pre-service teacher education and in-service teacher training in ICT use as tools and educational resources.
• A set of ICT standards proposed for both teacher candidates and in-service teachers, and a set of developed course units and training modules reviewed, revised and finalized for publication.
• Experimental evaluation schemes/rubrics developed in the form of self-evaluation, peer evaluation and other assessment methods.
• Teacher trainers trained in performance-based or process-based assessment methods in evaluating ICT-enabled teaching and learning.
• Education leaders' understanding of the contribution of ICT to improved teaching and learning enhanced.
• Awarded modules reproduced on CD-ROMs and posted on Web sites for wider use in teacher training.
• The online teacher resource base linked with other teacher-oriented Web sites.
• Most-needed laptop computers, printers, LCDs, overhead projectors, and Internet access fee subsidies provided to the country's leading institutions.
EXISTING RESOURCES AVAILABLE TO SUPPORT GOALS:
• International Society for Technology in Education (ISTE)
• Professional Development (PD) Program
• Technology Connections for School Improvement Blueprints
• Online teacher facilitator certification
WHAT WE KNOW:
Professional Development Program identifies Six
Essential Conditions – system wide factors critical to
effective uses of technology for student learning:
Educator Proficiency, Vision, Effective Practice, Equity,
Systems and Leadership, and Access. Attention to these
Essential Conditions supports high-performance learning
of academic content using 21st Century Skills and tools.
Real improvement begins when educators, as team
members, use data to clarify their goals. Data patterns can
reveal system weaknesses and provide direction to
combat those weaknesses. As the impact of strategies and
practices is measured, collaborative and reflective data
study allows for a deeper understanding of learning.
Ongoing data study and team collaboration efforts
perpetuate the school improvement cycle. All classroom
teachers should be prepared to meet technology standards
and performance indicators, including technology
operations and concepts; planning and designing learning
environments and experiences; teaching, learning, and the
curriculum; assessment and evaluation; productivity and
professional practice; and social, ethical, legal, and
human issues. The development of Blueprints was based
on these values: belief in the importance of continuous,
active, and collaborative learning; recognition of the
worth of reflection; and commitment to the design of
tools that enable facilitators to construct their own
meaning and tailor content and processes to meet the
unique needs of their participants. These are meant to be learning experiences that allow each participant to become actively engaged with new ideas and applications while developing ownership of the process, which requires the facilitator to be sensitive to adult learners.
Online learning is one of the most important and potentially significant new instructional approaches available for supporting the improvement of teaching and learning.
makers on the full range of issues concerning
development and deployment of e-learning is considered
a critical priority. Educators can apply this knowledge to
support e-learning strategies and online collaborative
environments in the classroom and in professional
development activities.
EFFECTIVE PROFESSIONAL DEVELOPMENT:
Coherent: Consistent with agreed-upon goals; aligned
with other school improvement initiatives; clearly
articulated; purposeful.
Research Based: Meeting a demanding standard in that
all decisions are based on careful, systematic examination
of effective practice.
Capacity Building: A willingness to work together to
learn new skills and understandings, with ultimate goal of
self-sufficiency; gaining ability to independently plan,
implement, and evaluate PD.
Customized: Designed according to the unique needs of
the client; implemented with an understanding of the specific
context.
Comprehensive: Understanding complexity and
addressing it effectively; engaging key stakeholders in
designing long-term solutions.
Cost-Effective: Producing good results for the amount of
money spent; efficient; economical.
Compilation is based on the premise that children become
educated, successful, and happy individuals through the
combined efforts of parents, guardians, family members,
teachers, administrators, and community who come
together over time for children’s benefit.
FINALE: POSSIBLE PROJECT STRATEGIES
We can use an online framework that helps schools
plan and evaluate their system wide use of educational
technology; provide online assessments to help schools
gauge their progress with learning technology and
develop an informed plan of action. We can train teams to
collect, analyze, and report data on their schools’ use of
technology for teaching and learning; develop capacity of
teams to use these data to inform school improvement;
build a collaborative learning community where teams
learn with and from each other how to use a framework to
improve teaching and learning with technology. We can
provide school leadership teams a unique opportunity to
analyze and uncover patterns in their school’s data.
Teams can identify problems and successes, establish
clear and specific goals, develop strategies for
improvement, and create a data-based school
improvement plan. We can help technology planners to
develop vision and policy, analyze technology needs,
focus on student-centered learning, involve parents and
community, support professional development, build tech
infrastructure, establish multiyear funding strategies, and
evaluate process and outcomes. We can use benchmarks
to assist teachers in reviewing research studies linked to
content standards for information that can inform their
practice and provide a review and synthesis of current
literature on e-learning; use narratives connecting e-learning Web curriculum and standards-based content,
teaching and learning, instructional technology systems,
and cultural and organizational context; and provide a
strategic framework to assist schools in developing and
implementing customized professional development
plans. We can have teams from schools participate in a
Data Retreat and an intensive Leadership Institute to train
in the essentials of developing, implementing,
monitoring, and sustaining high-quality PD plans;
Coaches work with teams to customize plans for local
needs. Web site supports teams by providing information,
tools, and opportunities for collaboration; focuses on
planning and actions essential for implementing,
managing, and supporting educational technology in
schools; uses modules that provide goals and resources
for creating a workshop. We can provide access to
resources that include a range of technology-oriented
analysis, planning, and skill development; Provide a
compilation of information, research, landmark articles,
and activities to be used by educators, parents, and
community members as they work together to improve
student learning and provide training for developing
online facilitators for ongoing professional development
or for providing content to students.
REFERENCES:
www.iste.org
www.ncrel.org/engauge/framewk/efp/range/efpranin.htm/
NO MIME WHEN BIO-MIMICRIES BIO-WAVE
J.Stephy Angelin1, Sivasankari.P2
1,2
Department of EEE, Jayaram College of Engineering and Technology,
Trichy, India
1
[email protected] [email protected]
Abstract— Today's modern world is in search of green methodologies to safeguard our mother Earth. The world is moving like a wild wind to light up the map with green power. Lending a hand to this wind, this paper focuses on producing blue-green power, the bio-wave. The bio-wave is used for utility-scale power production from ocean waves. Its nature-inspired design combines high conversion ability with the means to avoid excessive wave forces, enabling the supply of grid-connected electricity at a competitive price per MWh. The bio-wave is designed to operate in ocean swell waves, absorbing energy both from the surface and from the bottom. It is a bottom-mounted pitching device which spans the full depth. The energy capture and conversion process is handled by the O-Drive system. Its main motto is to produce 20,000 MW of power by 2020.
Keywords— Bio-mimicry, O-drive testing, Sub-sea cable.
I. INTRODUCTION
Nature is a gift for human life. The needs of human life on earth are fresh water, unpolluted air and readily available energy. The world is moving entirely towards natural energy, and this paper focuses on renewable energy. The various renewable energy resources are solar energy, wind energy, biogas, geothermal and tidal energy. This paper presents an alternative source to these energies, THE BIO-WAVE. It is a nature-inspired design, an example of bio-mimicry. Reference [1] surveys the theories and practices that have been developed. Bio-mimicry is the imitation of models proposed by nature to solve complex problems. The word bio-mimicry comes from Greek: bios means life and mimesis means to imitate. The related field is bionics [2, 3]. The emerging field of bio-mimetics has given rise to newer technologies inspired by nature and created biologically at the macro and nano scales; it is a fresh way of solving problems. The nine principles of life obtained from nature are shown in [4], which also discusses the relation between transdisciplinarity and bio-mimicry.
II. BIO-MIMICRY
Bio-mimicry studies nature's ideas and imitates these designs to solve human problems; it is innovation inspired by nature. Like the viceroy butterfly imitating the monarch, we humans imitate the organisms in our environment. We are learning, for example, how to produce energy like a leaf, grow food like a prairie, self-medicate like an ape, create colour like a peacock, and multiply like a cell. Our world is a natural home, and we need to be aware of natural energies since the home that is ours is not ours alone. The life of this home, i.e. nature, can be summed up in three Ms: nature enters human life as model, mentor and measure.
A. Nature as Model
Bio-mimicry takes nature as a model and imitates its processes or systems to solve human problems. The bio-mimicry association and its faculty developed the design approach known as bio-mimicry design on nature's model.
B. Nature as Measure
Nature as measure is expressed in life's principles and is built into the evaluation step of bio-mimicry design.
C. Nature as Mentor
Nature as mentor points to what we can learn from, rather than merely obtain from, the natural world.
III. BIO-WAVE – THE FUTURE POWER
The bio-wave is mounted on the seafloor, with a
pivot near the bottom. The array of buoyant floats, or
"blades", interacts with the rising and falling sea surface
(potential energy) and the sub-surface back-and-forth
water movement (kinetic energy).
As a result, the
pivoting structure sways back-and-forth in tune with the
waves, and the energy contained in this motion is
converted to electricity by an onboard self-contained
power conversion module, called O-Drive. The O-Drive
contains a hydraulic system that converts the mechanical
energy from this motion into fluid pressure, which is used
to spin a generator. Power is then delivered to shore by a
subsea cable. The result: efficient clean energy from the
ocean.
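The paper does not give the conversion mathematics; purely as a back-of-envelope orientation (a textbook deep-water approximation, not the bio-wave's actual capture model), the power carried by ocean swell per metre of wave crest can be estimated as P = ρ g² Hm0² Te / (64π):

```python
import math

RHO = 1025.0   # sea-water density, kg/m^3
G = 9.81       # gravitational acceleration, m/s^2

def wave_power_per_metre(significant_height_m, energy_period_s):
    """Deep-water wave energy flux in W per metre of crest:
    P = rho * g^2 * Hm0^2 * Te / (64 * pi).
    Device geometry and conversion efficiency are not modelled here."""
    return RHO * G**2 * significant_height_m**2 * energy_period_s / (64 * math.pi)

# Example sea state: 2 m significant wave height, 8 s energy period.
p = wave_power_per_metre(2.0, 8.0)
print(f"{p / 1000:.1f} kW per metre of wave crest")   # roughly 16 kW/m
```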
An ocean-based 250 kW bio-wave demonstration project is currently under development at a grid-connected site, with further plans in place to develop a 1 MW demonstration, followed by multi-unit wave energy farms. After three years of development and testing, Biowave Industries Inc. presents the New Green Revolution. Bio-wave machines emit subsonic harmonic waves that resonate with plant frequencies and cause the stomata to dilate.
Fig. 1. Working of bio-wave

The bio-wave's proprietary technology is patent pending in 160 countries. Bio-wave power is a form of wave power that is less of an eyesore and more friendly to shipping, as well as being more efficient than units that simply bob up and down on the surface. The energy from the wave is high and large torques can be generated. To make the most of that torque, and take as much power as possible through into a generator, the torque should react against a fixed surface, and the only fixed surface nearby is the seabed. Unlike many other wave power units, bio-power produces its energy at sea. The experimental CETO, which stands for Cylindrical Energy Transfer Oscillating unit, merely acts as a pump, pushing water along a seabed pipe that leads to the generator on land.

Bio-wave offers two products presently. One is solar-assisted for outside farm use; the other is for greenhouses and hydroponics facilities. All machines are made of stainless steel, all carry a one-year warranty, and all are manufactured and assembled in the U.S.A.

IV. ADVANTAGES
• Eco-friendly.
• Free fuel source.
• Power delivered is nearly 250 kW.
• Maintenance cost is low.
• Highly efficient energy conversion.

V. RESULTS
• Design inspired by sea sponges.
• Can power 500 homes.

VI. DISADVANTAGES
• Initial cost is high.
• Skilled labour is required.

VII. CONCLUSION
We use the bio-mimicry technique for getting green energy; it is an effective and important technique. Modelling the echolocation of bats in darkness has led to a cane for the visually impaired: research at the University of Leeds, in the United Kingdom, led to the UltraCane, a product formerly manufactured, marketed and sold by Sound Foresight Ltd.

VIII. REFERENCES
1) Vincent, Julian F.V., Bogatyreva, Olga A., Bogatyrev, Nikolaj R., Bowyer, Adrian, Pahl, Anja-Karina, "Biomimetics: its practice and theory", Journal of the Royal Society Interface, 3(9): 471-482, Aug 2006.
2) Reading University, "What is Bio-Mimetic".
3) Mary McCarty, "Life of bionics founder a fine adventure", Dayton Daily News, 29 Jan 2009.
4) Sue L.T. McGregor, "Transdisciplinarity and Biomimicry", Transdisciplinary Journal of Engineering and Science, Vol. 4, pp. 57-65.
A NOVEL CLIENT SIDE INTRUSION DETECTION AND
RESPONSE FRAMEWORK
Padhmavathi B1, Jyotheeswar Arvind M2, Ritikesh G3
1,2,3
Dept. of Computer Science and Engineering, SRM University,
1, Jawaharlal Nehru Road, Vadapalani, Chennai-600026, Tamil Nadu
1
[email protected]
2
[email protected]
3
[email protected]
Abstract— This paper proposes a secure,
platform independent tool to detect intrusions and
respond to intrusion attacks. Current web application
intrusion systems function predominantly on the
network layer and are web platform dependent. Our tool
detects intrusion attacks on the client side (application
layer) of the application and thus prevents any damage
to the application such as the loss of confidential data.
A major requirement is to create a tool that can be
easily integrated into any web application, is easy to use
and doesn't slow down the application's performance.
This tool implements an intrusion system by matching
behavior patterns with an attack rule library. This
implementation improves existing systems by reducing
the number of false alarms generated by traditional
systems eg: similar username matching. A statistical
model is used to validate the detection and take the
necessary responsive action only if it is validated by the
test.
Index Terms—Web Applications, Security, Intrusion Detection System, IDPS, Application Layer Security, Web Application Attacks.

I. INTRODUCTION

Web applications, after the internet revolution, are becoming the favored interface for providing services. They provide a simple and intuitive interface that can be accessed anywhere, giving users mobility and simplicity. Advancements in cloud computing, and concepts such as Software as a Service and Platform as a Service, rely on web applications to function. However, the focus of web application development is usually on the implementation and on providing a service to the customer at the earliest, with minimal concentration on security. Cost constraints also lead developers to reduce the level of importance given to security and to testing for vulnerabilities. Software giants with large development teams can dedicate programmer hours and resources to work on security; however, for startups and organizations that do not have such resources at their disposal, security becomes a major concern. New developers also often leave vulnerabilities in the application that can be easily exploited, owing to a lack of security-focused development experience.

The Imperva Web Application report indicates that web applications were probed and attacked 10 times per second in 2012 [3]. According to Cenzic, 99% of the applications tested in 2011 proved to have vulnerabilities. The Open Web Application Security Project (OWASP) in 2013 [4] lists Injection, Broken Authentication and Session Management, Cross Site Scripting (XSS), Insecure Direct Object References, Security Misconfiguration, Sensitive Data Exposure, Missing Function Level Access Control, Cross Site Request Forgery (CSRF), Using Components with Known Vulnerabilities, and Unvalidated Redirects and Forwards as the 10 major categories of web application attacks. The Cenzic 2013 [2] and WhiteHat Security [5] reports state that, among the top web application attacks, SQLi and XSS injection attacks were found to constitute 32%, authentication, authorization and session attacks 29%, and information leakage attacks 16% of the total attacks. These common and often preventable attacks are employed in order to attack the web application and extract confidential information or infect the system.

Security for web applications on the application layer is often not given due importance. Although Web Application Firewalls, proxy servers and Intrusion Detection and Response Systems are employed, these predominantly function on the network layer. Advanced security systems such as firewalls are often expensive [6], have limited functionality and may not thwart attacks on the application layer.
II. PREVIOUS RELATED WORK
The Secure Web Application Response Tool (SWART) [1] specifies an ASP.NET web-application-based approach that can detect and prevent web application based attacks at the time of occurrence. A chi-squared test has been performed in order to validate the assumptions made in the development of this tool. It has future potential to detect and prevent attacks with less complexity.
AMNESIA[7], a tool that proposes detection and
prevention of SQLi attacks by combining static and
dynamic analysis, functions by identifying a hotspot and
building a SQL Query model for each hotspot. If the
input query match fails, it is marked as malicious. A
major drawback of this tool is that a lot of false positives
might occur, and it detects and responds only to SQLi
attacks.
SAFELI[8], a static analysis framework for
identifying SQLi vulnerabilities, inspects the byte code of
an ASP.NET application using symbolic execution. The
attack patterns are stored in an attack library for pattern
matching and a hybrid constraint solver is employed to
find the malicious query for each hotspot and then error
trace it step by step. The drawback is that this system
functions only on ASP.NET based web applications and
also can prevent only SQLi attacks.
A Web Application Based Intrusion Detection
Framework (WAIDS)[9] proposes a profile matching
based approach for the web request data. Keyword
extraction and similarity measure for detecting malicious
activity are the main techniques employed in this tool.
This tool however requires extensive developer
knowledge and is complex to implement.
A Web Application Firewall (WAF) [10] is an appliance, server plugin or filter that applies a set of rules to an HTTP conversation. Generally, these rules cover common attacks such as Cross-Site Scripting (XSS) and SQL Injection. By customizing the rules according to the application, many attacks can be identified and blocked. The effort to perform this customization can be significant and needs to be maintained as the application is modified. This system functions by filtering the data packets at the network layer.

Intrusion Detection Systems (IDS) and Intrusion Detection and Prevention Systems (IDPS) are available from third-party vendors. However, these systems are costly and also function only on the network layer.

Other proposed systems and tools to detect and prevent web application based attacks are discussed in [11, 12, 13, 14, 15].

III. LACUNA OF CURRENT SYSTEMS

Though many systems are currently available for detecting and preventing web attacks, they are often limited in scope and functionality. Many of the systems discussed above focus on network layer security alone, many proposed tools can only respond to certain types of attacks, and most of the systems are also platform specific. Thus, for a developer who wants to build a secure system, it is extremely difficult to implement the different tools across different layers and still keep the result platform independent. Advancements in technology such as cloud computing also lead to new platforms, and modifying an existing security system to function on new platforms is a tedious and expensive task. Many of the existing systems provided by third-party vendors are costly and need extensive customization in order to fit the needs of the client. False alarms are also frequently generated by these systems, causing unwanted delay or resource wastage due to the responses made.

IV. PROPOSED TOOL

Our proposed system aims at creating an open source, cross-platform, application-side intrusion detection and response framework to detect and respond to web application based intrusion attacks. The system employs statistical models such as the chi-squared fitness test and Bayesian probability in order to validate the attacks and reduce the number of false alarms. Using the power of open source, the tool can be further expanded and validated with the input of the open source community. We also use only open source software and tools in the development of this framework.

Our proposed system functions on the application layer of the OSI architecture, whereas most of the current systems function only on the network layer. This is represented in Figure 1.
Fig.1. Comparison of existing and proposed system
Our system has a Domain Authenticator, a
Detection Engine, an Analysis Engine and a Response
Engine. These constituents together form our Web
Application Based Intrusion Detection and Response
Tool. The architecture diagram for the system is show in
Figure 2.
Fig.2. Architecture Diagram
The overall flow chart of the system is shown in
Figure 3. The system functions by executing the login
sensor module. The inputs are then parsed by the system
and any prospective intrusion detection is done by the
Detection Engine using the input from the SQLi Cheat
Sheet and the attack rule library. If the patterns from the
input are matched, then the analysis engine is run. Based
on the evaluation of the severity and risk of the attack by
the Analysis Engine, the corresponding response is
performed by the Response Engine. The detected attack
and the user log that triggered the intrusion detection are
stored in the Attack Detected library in the database.
Fig. 3. Overall flow chart of the system
A. DOMAIN AUTHENTICATOR
Our tool will be utilized by many web
applications hosted across several domains. A Denial of
Service (DoS) attack might be attempted to slow the tool
down or prevent the registered applications from utilizing
it properly. To prevent such situations, every application
will need to be registered to our database. After
successful registration, a salted sha3 hash value of their
domain URL is provided to them as a unique
Authentication-token key. When an application
communicates with our server, the authentication token
and the domain from the request headers are validated
and only if they're authenticated, a short-lived session
will be set on the client side for near future access and the
server will continue processing requests from the same.
Otherwise, a 401 response is sent restricting the user from
accessing the server any further.
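A minimal sketch of that token scheme, assuming a salted SHA-3 hash of the registered domain URL (the salt handling and function names here are illustrative, not the authors' implementation):

```python
import hashlib
import hmac
import secrets

def issue_token(domain_url: str, salt: bytes) -> str:
    """Derive the Authentication-token key as a salted SHA-3 hash of the domain URL."""
    return hashlib.sha3_256(salt + domain_url.encode("utf-8")).hexdigest()

def is_authentic(request_domain: str, presented_token: str, salt: bytes) -> bool:
    """Recompute the expected token for the requesting domain and compare in constant time."""
    expected = issue_token(request_domain, salt)
    return hmac.compare_digest(expected, presented_token)

# Registration time: store a per-application salt and hand the token back to the client.
salt = secrets.token_bytes(16)
token = issue_token("https://client-app.example.com", salt)

# Request time: the headers carry the domain and token; failures would receive a 401.
assert is_authentic("https://client-app.example.com", token, salt)
assert not is_authentic("https://evil.example.net", token, salt)
```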
B. DETECTION ENGINE
Web Applications rely on HTML forms for
obtaining information from the user. All applications
provide an interface for users to input their data and based
on the input, interact with the system. Irrespective of how
the application looks on the front end, all applications
rely on forms for getting input and sending it across to the
web server. Inputs can be broadly classified into two
modules: Login Input and User Input. The detection
engine checks for the following attacks: Dictionary
attack, SQL Injection (SQLi), Cross Site Scripting (XSS),
URI attack, Unsafe content, Code Injection, HTTP
Attack, Cookie Attack.
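The paper does not publish its rule library or cheat sheet; the sketch below only shows the general shape of such signature matching, with a few illustrative SQLi and XSS patterns standing in for the real rules:

```python
import re

# Illustrative attack rule library: attack category -> compiled signature patterns.
ATTACK_RULES = {
    "SQL Injection": [
        re.compile(r"(?i)\bunion\b.+\bselect\b"),
        re.compile(r'(?i)["\']\s*or\s+\d+\s*=\s*\d+'),
        re.compile(r"(?i);\s*drop\s+table\b"),
    ],
    "Cross Site Scripting": [
        re.compile(r"(?i)<\s*script\b"),
        re.compile(r'(?i)\bon\w+\s*=\s*["\']'),   # inline event handlers
        re.compile(r"(?i)javascript\s*:"),
    ],
}

def match_rules(user_input: str):
    """Return the attack categories whose signatures match the given input."""
    return [attack for attack, patterns in ATTACK_RULES.items()
            if any(p.search(user_input) for p in patterns)]

print(match_rules("admin' OR 1=1 --"))           # ['SQL Injection']
print(match_rules("<script>alert(1)</script>"))  # ['Cross Site Scripting']
print(match_rules("jane.doe"))                   # []
```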
Login Input
The login input module mainly deals with the
login page of a web application that is used to
authenticate and authorize a user accessing the system. It
generally consists of a username field and a password
field, the values of which are hashed and passed across to
the web server. However, they are still forms and most of
the attacks are targeted at this type of input as they
provide access to the password of the user, which can
then be used to gain complete access of the information
of the user from the system. The algorithm behind this
module functions by obtaining the user input and then
matching it against the attack rule library. If a match
occurs, then the analysis engine is invoked. If three
attempts with wrong password are made, the system
redirects the user to an alarm page. If more than 5
incorrect entries are made, then the IP is blackmarked in
order to reduce the possibility of brute forcing. A general
use case diagram of this module is shown in Figure 4.
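The alarm-page and blackmarking thresholds above (three wrong passwords, more than five incorrect entries) can be tracked with a simple per-IP counter. The sketch below is illustrative only and keeps its state in memory rather than in the tool's database:

```python
from collections import defaultdict

ALARM_THRESHOLD = 3       # redirect the user to the alarm page
BLACKMARK_THRESHOLD = 5   # blackmark the source IP beyond this many failures

failed_attempts = defaultdict(int)   # ip -> consecutive wrong-password count
blackmarked_ips = set()

def record_login(ip: str, success: bool) -> str:
    """Return the action the framework should take for this login attempt."""
    if ip in blackmarked_ips:
        return "reject"                   # blackmarked IPs are refused outright
    if success:
        failed_attempts.pop(ip, None)     # reset the counter on a good login
        return "allow"
    failed_attempts[ip] += 1
    if failed_attempts[ip] > BLACKMARK_THRESHOLD:
        blackmarked_ips.add(ip)
        return "blackmark"
    if failed_attempts[ip] >= ALARM_THRESHOLD:
        return "alarm_page"
    return "retry"

for _ in range(7):                        # repeated failures escalate the response
    print(record_login("203.0.113.7", success=False))
```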
User Input
The user input module mainly deals with the
input given by the user after logging into the system post
authorization and authentication from the login module.
User inputs mainly deal with the majority of the
functionality provided by the system. Because of this,
attacks can be made across the different functionalities. A
general algorithm for this module functions by initially
verifying the session ID, parsing the input data and
matching it against an attack rule library. If a match
occurs, the number of intrusions value is incremented and
the attack pattern from the user along with the data log is
stored in the attack detected database. The corresponding
attack point is then forwarded to the analysis engine. A
general use case diagram of this system is given in Figure
5.
Fig. 4. Use case for Login Input
Fig. 5. Use case for User Input
B. ANALYSIS ENGINE
The Analysis Engine consists of the False Alarm
Detector Module, the Statistical Analyzer Module and the
Categorizer and Threshold Evaluator. Figure 6 shows the
flowchart of the functioning of the False Alarm Detector,
Figure 7 the functioning of the Statistical Analyzer and
Figure 8 the functioning of the Categorizer and Threshold
Evaluator.
False Alarm Detector
• In many cases, the user might repeatedly enter a combination of wrong id or wrong password. These cases are not malicious, yet such a user can potentially trigger a false intrusion detection. This module aims to prevent such typos from causing a false alarm, and thus prevents unnecessary responses, by using the Levenshtein edit distance algorithm to identify similar inputs and ignore them, thereby reducing false alarms (a minimal sketch of this check follows this list).
• If the edit distance is more than 3 for the threshold maximum number of inputs (e.g. 5 attempts), then a possible intrusion is detected and the Human Verification module is triggered. If not, an alert with a warning message is generated.
• A technical forum or a discussion board will in general carry a lot of technical discussion. Users there might add content which could be falsely detected as an attack, e.g. a user posting a query in response to another user's request. In such cases, the application, when registering to avail of the service, should mention in advance whether it includes any such forums. If so, an appropriate class/form name is provided to avoid false detections.
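A minimal version of that edit-distance check (the dynamic-programming routine is the standard Levenshtein algorithm, not code from the paper; the threshold of 3 follows the description above):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def looks_like_typo(attempt: str, registered_username: str, max_edits: int = 3) -> bool:
    """Treat near-misses (small edit distance) as typos rather than intrusions."""
    return levenshtein(attempt, registered_username) <= max_edits

print(looks_like_typo("jhon.smith", "john.smith"))   # True  -> likely a typo, no alarm
print(looks_like_typo("admin'--", "john.smith"))     # False -> escalate for analysis
```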
Statistical Analyzer
• The user inputs are parsed and tested using a fitness test to check if there are any attempts made by the user to attack the system.
• If a deviation is detected, an alarm is raised and the Response Engine is invoked.
• The system has the ability to learn through experience: it logs every alert raised and uses it to perform the test the next time.
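The fitness test above is described in the proposal as a chi-squared test. A hand-rolled sketch of such a goodness-of-fit check is shown below; the feature used here (character-class frequencies of the input), the assumed benign profile and the critical value are illustrative assumptions, not the authors' statistical model:

```python
# Compare character-class frequencies of an input against an assumed benign profile.
EXPECTED_SHARE = {"letters": 0.80, "digits": 0.15, "symbols": 0.05}  # assumed profile
CRITICAL_VALUE = 5.991   # chi-squared critical value, 2 degrees of freedom, alpha = 0.05

def char_class_counts(text: str):
    counts = {"letters": 0, "digits": 0, "symbols": 0}
    for ch in text:
        if ch.isalpha():
            counts["letters"] += 1
        elif ch.isdigit():
            counts["digits"] += 1
        else:
            counts["symbols"] += 1
    return counts

def deviates_from_profile(text: str) -> bool:
    """Raise an alarm when the chi-squared statistic exceeds the critical value."""
    observed = char_class_counts(text)
    n = max(sum(observed.values()), 1)
    chi2 = sum((observed[c] - n * EXPECTED_SHARE[c]) ** 2 / (n * EXPECTED_SHARE[c])
               for c in EXPECTED_SHARE)
    return chi2 > CRITICAL_VALUE

print(deviates_from_profile("jane smith"))             # False: looks like ordinary text
print(deviates_from_profile("' OR 1=1; -- <script>"))  # True: symbol-heavy, flagged
```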
Categorizer and Threshold Evaluator
• This module is used to categorize the type of attack attempted by the attacker.
• The module uses a library to categorize the attacks and rate them as a function of the degree of risk associated with the attack, using the data fed to it initially.
• After determining the type and degree of risk of an attack, the appropriate modules from the Response Engine are invoked.
Fig. 6. Statistical Analyzer
Fig. 7. Categorizer and Threshold Evaluator
Degree of Risk points for the different intrusion
attacks are associated based on the table used by the
SWART system [1]. The table associates risk points for
each attack: An input containing SQL Query is associated
3 risk points, input containing scripts such as JavaScript
and HTML are associated 3 points, Session Invalidation
attacks are associated 4 points, and so on. It is detailed in full
in the SWART[1] system proposal.
C. RESPONSE ENGINE
The Response Engine consists of the
Blackmarked IP module, the Privilege Reduction Module,
the Human Verification Module and the Redirector
Module. This engine redirects the user according to the
threshold value as described in Table I. At runtime, the
validation responses of the application are checked for
analyzing intrusions. Figure 9 shows the flowchart of the
functioning of the Response Engine.
Blackmarked IP
IP Addresses that are blackmarked based on the
threshold value obtained by the analysis engine are stored
in the BlackmarkIP table in the database.
Reducing Privileges
In a web application, the HTML components are
usually grouped under CLASSES. At a certain threshold
value, the response tool reduces the privileges and
functions accessible by the user by hiding the
corresponding forms or information based on the class
name.
Human Verification
Many attacks on web applications are automated
and executed using bots. At the corresponding threshold
value obtained by the Analysis engine, the tool generates
a CAPTCHA to verify that the system is not under attack
from bots.
Redirection
Redirection module redirects the user to a
warning page and provides information as to the
consequences they might face if they involve in attacks.
For the lowest threshold value, this response is generated
by the tool.
V. EXPERIMENTAL RESULTS
Our tool provides the framework that users can
implement in their web applications to provide security.
The tool takes the input and then processes it in order to
detect attacks, and if an intrusion is detected, it takes the
necessary response.
For testing our tool, we have implemented a
sample web application on Ruby on Rails. The tool has
been developed using JavaScript, JQuery, MongoDB,
Node.js and related open source tools. Figure 10 shows
an access of the tool by an unauthenticated application,
Figure 11 shows an authorized access by an application –
both demonstrating the functionality of the Domain
Authenticator module. Figure 12 shows the SQLi attack
inputs being detected by the tool. Figure 13 shows the
effect of such an attack on an application that doesn’t use
our tool. Figure 14 demonstrates the Human Verification
module and Figure 15 demonstrates the Redirector
module of the Response Engine in action.
Fig. 8. Response Engine
Fig. 9. Unauthorized access
TABLE I. RESPONSE TYPE TABLE

Risk Score      Level     Response Type
1-10            Low       Redirection
10-15           Medium    Blackmarked IP
15 and above    High      Reduce Privilege, Human verification and …
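Table I maps directly onto a small dispatcher. The sketch below follows the thresholds in the table; the response names are placeholders for the modules described above:

```python
def choose_response(risk_score: int) -> list:
    """Map an accumulated risk score to the response types in Table I."""
    if risk_score >= 15:      # High
        return ["reduce_privileges", "human_verification"]
    if risk_score >= 10:      # Medium
        return ["blackmark_ip"]
    if risk_score >= 1:       # Low
        return ["redirect_to_warning_page"]
    return []                 # no intrusion points accumulated

print(choose_response(3))     # ['redirect_to_warning_page']
print(choose_response(12))    # ['blackmark_ip']
print(choose_response(18))    # ['reduce_privileges', 'human_verification']
```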
Fig. 10. Authorized access
Fig. 11. SQLi detection
Fig. 12. SQLi attack in action on unprotected
applications
Fig. 13. Human Verification Module
Fig. 14. Redirection Module
VI. FUTURE WORK
Our proposed system can be extended in the future by implementing a system to check, verify and validate the content of file uploads made through forms, to ensure that there is no malicious content in the system. This can aid in preventing the remote execution of any malicious files that have been uploaded by any user.
REFERENCES
[1] Kanika Sharma, Naresh Kumar, "SWART: Secure Web Application Response Tool", International Conference on Control, Computing, Communication and Materials (ICCCCM), pp. 1-7, 2013.
[2] Cenzic Application Vulnerability Trends Report 2013, https://www.info.cenzic.com/rs/cenzic/images/Cenzic-Application-Vulnerability-Trends-Report-2013.pdf [Last accessed on: 02/01/2014]
[3] Imperva Web Application Report 2012, http://www.imperva.com
[4] https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project [Last accessed on: 02/01/2014]
[5] White Hat report, https://www.whitehatsec.com/assets/WPstats_winter11_11th.pdf [Last accessed on: 02/01/2014]
[6] https://www.owasp.org/index.php/Web_Application_Firewall
[7] William G.J. Halfond and Alessandro Orso, "Preventing SQL Injection Attacks using AMNESIA", ACM International Conference on Software Engineering, pp. 795-798, 2006.
[8] Xiang Fu, Xin Lu, "A Static Analysis Framework for Detecting SQL Injection Vulnerabilities", IEEE 31st Annual International Computer Software and Applications Conference, pp. 87-96, 2007.
[9] YongJoon Park, JaeChul Park, "Web Application Intrusion Detection System for Input Validation Attack", IEEE Third International Conference on Convergence and Hybrid Information Technology, pp. 498-504, 2008.
[10] https://www.owasp.org/index.php/Web_Application_Firewall
[11] Jin-Cherng Lin, Jan-Min Chen, Cheng-Hsiung Liu, "An Automatic Mechanism for Sanitizing Malicious Injection", IEEE 9th International Conference for Young Computer Scientists, pp. 1470-1475, 2008.
[12] Anyi Liu, Yi Yuan, "SQLProb: A Proxy-based Architecture towards Preventing SQL Injection Attacks", ACM SAC, pp. 2054-2061, March 2009.
[13] Abdul Razzaq, Ali Hur, Nasir Haider, "Multi Layer Defense Against Web Application", IEEE Sixth International Conference on Information Technology: New Generations, pp. 492-497, 2009.
[14] Yang Haixia and Nan Zhihong, "A Database Security Testing Scheme of Web Application", IEEE 4th International Conference on Computer Science and Education, pp. 953-955, 2009.
[15] Yang Haixia and Nan Zhihong, "A Database Security Testing Scheme of Web Application", IEEE 4th International Conference on Computer Science and Education, pp. 953-955, 2009.
HISTORY GENERALIZED PATTERN TAXONOMY MODEL FOR
FREQUENT ITEMSET MINING
1 Jibin Philip, 2 K. Moorthy
1 Second Year M.E. (Computer Science and Engineering)
2 Assistant Professor (Computer Science and Engineering)
Maharaja Prithvi Engineering College, Avinashi - 641654
1 [email protected]
ABSTRACT
Frequent itemset mining is a widely exploratory
technique that focuses on discovering recurrent
correlations among data. The steadfast evolution of
markets and business environments prompts the need
of data mining algorithms to discover significant
correlation changes in order to reactively suit product
and service provision to customer needs. Change
mining, in the context of frequent itemsets, focuses on
detecting and reporting significant changes in the
set of mined itemsets from one time period to
another. The discovery of frequent generalized
itemsets, i.e., itemsets that 1) frequently occur in the
source data, and 2) provide a high-level abstraction of
the mined knowledge, issues new challenges in the
analysis of itemsets that become rare, and thus are no
longer extracted, from a certain point. This paper
proposes a novel kind of dynamic pattern, namely the
HIstory GENeralized Pattern (HIGEN),
that
represents the evolution of an itemset in consecutive
time periods, by reporting the information about its
frequent generalizations characterized by minimal
redundancy (i.e., minimum level of abstraction) in case
it becomes infrequent in a certain time period. To
address HIGEN mining, it proposes HIGEN MINER,
an algorithm that focuses on avoiding itemset mining
followed by postprocessing by exploiting a support-driven itemset generalization approach.
1. INTRODUCTION
Frequent itemset mining is a widely exploratory
technique that focuses on discovering recurrent
correlations among data. The steadfast evolution of
markets and business environments prompts the need of
data mining algorithms to discover significant
correlation changes in order to reactively suit product
and service provision to customer needs. Change
mining, in the context of frequent itemsets, focuses
on detecting and reporting significant changes in the
set of mined itemsets from one time period to another.
The discovery of frequent generalized itemsets, i.e.,
itemsets that 1) frequently occur in the source data, and
2) provide a high-level abstraction of the mined
knowledge, issues new challenges in the analysis of
itemsets that become rare, and thus are no longer
extracted, from a certain point. This paper proposes a
novel kind of dynamic pattern, namely the HIstory
GENeralized Pattern (HIGEN), that represents the
evolution of an itemset in consecutive time periods, by
reporting the information about its frequent
generalizations characterized by minimal redundancy
(i.e., minimum level of abstraction) in case it becomes
infrequent in a certain time period. To address
HIGEN mining, it proposes HIGEN MINER, an
algorithm that focuses on avoiding itemset mining
followed by postprocessing by exploiting a support-driven itemset generalization approach. To focus the
attention on the minimally redundant frequent
generalizations and thus reduce the amount of the
generated patterns, the discovery of a smart subset of
HIGENs, namely the NONREDUNDANT HIGENs, is
addressed as well. Experiments performed on both real
and synthetic datasets show the efficiency and the
effectiveness of the proposed approach as well as its
usefulness in a real application context.
2. EXISTING SYSTEM
HIGEN mining may be addressed by means of a postprocessing step after performing the traditional generalized itemset mining step, constrained by the minimum support threshold and driven by the input taxonomy, from each timestamped dataset. However, this approach may become computationally expensive, especially at lower support thresholds, as it requires 1) generating all the possible item combinations by exhaustively evaluating the taxonomy, 2) performing multiple taxonomy evaluations over the same pattern mined several times from different time periods, and 3) selecting HIGENs by means of a possibly time-consuming postprocessing step. To address the above issues, I propose a more efficient algorithm, called HIGEN MINER. It introduces the following expedients: 1) to avoid generating all the possible combinations, it adopts, similarly to earlier approaches, an Apriori-based support-driven generalized itemset mining approach, in which the generalization procedure is triggered on infrequent itemsets only.
2.1 Disadvantages of Existing System
More Resource Consumption.
More Processing Time.
3. PROPOSED SYSTEM
Frequent weighted itemsets represent correlations frequently holding in data in which items may weigh differently. However, in some contexts, e.g., when the need is to minimize a certain cost function, discovering rare data correlations is more interesting than mining frequent ones. This paper tackles the issue of discovering rare and weighted itemsets, i.e., the Infrequent Weighted Itemset (IWI) mining problem. Two novel quality measures are proposed to drive the IWI mining process.
3.1 Advantages of Proposed System
Less Resource Consumption.
Less Processing Time.
Fast Access
Easy Interaction to System
4. IMPLEMENTATION
This paper tackles the issue of discovering
rare and weighted itemsets, i.e., the Infrequent Weighted
Itemset (IWI) mining problem. Two novel quality
measures are proposed to drive the IWI mining process.
Furthermore, two algorithms that perform IWI and
Minimal IWI mining efficiently, driven by the proposed
measures, are presented. Experimental results show
efficiency and effectiveness of the proposed approach.
The modules used are listed below:
1. Data Acquisition
2. HIGEN
3. FP-GROWTH
4. Result
5. Comparison
4.1 Data Acquisition
This module is where the data required for testing the project is acquired. There are two kinds of dataset available for processing in data mining applications: one is a synthetic dataset and the other is a real-time dataset. The process of acquiring the dataset is carried out in this module. Once the dataset is acquired, it has to be converted into a suitable structure for further processing by the algorithm. Java collections are used to represent the data from the dataset.
4.2 HIGEN Algorithm
Algorithm 1 reports the pseudocode of the HIGEN MINER. The HIGEN MINER algorithm iteratively extracts frequent generalized itemsets of increasing length from each timestamped dataset by following an Apriori-based level-wise approach and directly includes them into the HIGENs.
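Algorithm 1 itself is not reproduced here. The following minimal sketch, written in Java (Section 4.1 notes that Java collections represent the dataset), only illustrates the Apriori-based level-wise loop the HIGEN MINER builds on; the taxonomy-driven generalization of infrequent itemsets that distinguishes HIGEN MINER is deliberately omitted.

import java.util.*;

// Minimal Apriori-style level-wise frequent itemset miner (illustrative sketch only).
public class AprioriSketch {

    // Mines all itemsets whose absolute support count is >= minSup.
    public static Map<Set<String>, Integer> mine(List<Set<String>> transactions, int minSup) {
        Map<Set<String>, Integer> allFrequent = new LinkedHashMap<>();

        // Level 1: count single items and keep the frequent ones.
        Map<Set<String>, Integer> level = new HashMap<>();
        for (Set<String> t : transactions)
            for (String item : t)
                level.merge(new HashSet<>(Set.of(item)), 1, Integer::sum);
        level.values().removeIf(count -> count < minSup);

        // Level k -> k+1: join frequent k-itemsets, then count supports and prune.
        while (!level.isEmpty()) {
            allFrequent.putAll(level);
            Set<Set<String>> candidates = new HashSet<>();
            List<Set<String>> keys = new ArrayList<>(level.keySet());
            for (int i = 0; i < keys.size(); i++)
                for (int j = i + 1; j < keys.size(); j++) {
                    Set<String> union = new HashSet<>(keys.get(i));
                    union.addAll(keys.get(j));
                    if (union.size() == keys.get(i).size() + 1)
                        candidates.add(union);
                }
            Map<Set<String>, Integer> next = new HashMap<>();
            for (Set<String> c : candidates)
                for (Set<String> t : transactions)
                    if (t.containsAll(c)) next.merge(c, 1, Integer::sum);
            next.values().removeIf(count -> count < minSup);
            level = next;
        }
        return allFrequent;
    }

    public static void main(String[] args) {
        List<Set<String>> data = List.of(
            Set.of("milk", "bread"),
            Set.of("milk", "bread", "butter"),
            Set.of("bread", "butter"),
            Set.of("milk", "butter"));
        System.out.println(mine(data, 2));
    }
}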
4.3 FP-Growth
Given a weighted transactional dataset and a maximum IWI-support (IWI-support-min or IWI-support-max) threshold ξ, the Infrequent Weighted Itemset Miner (IWI Miner) algorithm extracts all IWIs whose IWI-support satisfies ξ (cf. task (A)). Since the IWI Miner mining steps are the same by enforcing either IWI-support-min or IWI-support-max thresholds, we will not distinguish between the two IWI-support measure types in the rest of this section.
4.4 Result
The Result module displays the output of the mining process; the output is shown as tabular data.
4.5 Comparison
In the Comparison module, the algorithm is compared based on different measures. The basic measures include time and space complexity.
Time Complexity
The total time required for the algorithm to run successfully and produce an output:
T(A) = End Time – Start Time.
Space Complexity
The space complexity is denoted by the amount of space occupied by the variables and data structures while running the algorithm:
S(A) = End{Space(Variables) + Space(Data Structures)} – Start{Space(Variables) + Space(Data Structures)}
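As a hedged illustration of how T(A) and S(A) could be measured in Java around a run of the algorithm (JVM memory readings only approximate the space actually in use; AprioriSketch.mine refers to the sketch shown after Section 4.2):

import java.util.*;

// Illustrative measurement of T(A) and S(A) as defined above; memory figures are approximate.
public class ComplexityProbe {
    public static void main(String[] args) {
        List<Set<String>> data = List.of(
            Set.of("milk", "bread"), Set.of("milk", "bread", "butter"),
            Set.of("bread", "butter"), Set.of("milk", "butter"));

        Runtime rt = Runtime.getRuntime();
        long startSpace = rt.totalMemory() - rt.freeMemory();
        long startTime  = System.nanoTime();

        Map<Set<String>, Integer> result = AprioriSketch.mine(data, 2);  // algorithm under test

        long endTime  = System.nanoTime();
        long endSpace = rt.totalMemory() - rt.freeMemory();

        System.out.println("T(A) = " + (endTime - startTime) / 1_000_000.0 + " ms");
        System.out.println("S(A) = " + (endSpace - startSpace) + " bytes (approx.)");
        System.out.println(result.size() + " frequent itemsets mined");
    }
}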
5. CONCLUSION
This paper proposes a novel kind of dynamic
pattern, namely the HIstory GENeralized Pattern (HIGEN), that represents the evolution of an
itemset in consecutive time periods, by reporting the
information about its frequent generalizations
characterized by minimal redundancy (i.e., minimum
level of abstraction) in case it becomes infrequent in a
certain time period. To address HIGEN mining, it
proposes HIGEN MINER, an algorithm that focuses on
avoiding itemset mining followed by postprocessing by
exploiting a support-driven itemset generalization
approach. To focus the attention on the minimally
redundant frequent generalizations and thus reduce the
amount of the generated patterns, the discovery of a
smart subset of HIGENs, namely the NONREDUNDANT HIGENs, is addressed as well.
Experiments performed on both real and synthetic
datasets show the efficiency and the effectiveness of the
proposed approach as well as its usefulness in a real
application context.
Different types of facilities are included in the future enhancement model. FP-Growth and its advanced variants provide both frequent and infrequent itemset mining in a fast and easy way. Mining can be performed in a smaller amount of time, and fast access to the database can be provided. The main advantages of the future enhancement are fast mining with a smaller amount of itemsets; it also provides easier interaction with the system.
6. ACKNOWLEDGMENTS
I express my sincere and heartfelt thanks to
our chairman Thiru. K.PARAMASIVAM B.Sc., and
our Correspondent Thiru. P.SATHIYAMOORTHY B.E., MBA., MS., for giving this opportunity and providing all the
facilities to carry out this project work.
I express my sincere and deep heartfelt
special thanks to our Respected Principal Dr.
A.S.RAMKUMAR, M.E., Ph.D., MIE., for providing
me this opportunity to carry out this project work.
I wish to express my sincere thanks to
Mrs.A.BRINDA M.E., Assistant Professor, and
Head of the Department of Computer Science and
Engineering for all the blessings and help rendered
during the tenure of my project work.
I am indebted to my project guide,
Mr.K.Moorthy M.E., Assistant Professor for his
constant help and creative ideas over the period of
project work.
I express my sincere words of thankfulness to
members of my family, friends and all staff members
of the Department of Computer Science and
Engineering for their support and blessings to complete
this project successfully.
IDC BASED PROTOCOL IN
AD HOC NETWORKS FOR SECURITY TRANSACTIONS
1 K. Priyanka, 2 M. Saravanakumar
1 Student, M.E. CSE
2 Asst. Professor, Department of CSE
Maharaja Prithvi Engineering College, Avinashi.
1 [email protected]
Abstract— This paper describes a self-configured, self-organizing protocol that encompasses an IDC (Identity Card), a unique identity used to provide security and trust in spontaneous wireless ad hoc networks. The IDC is generated, encrypted into signatures, and certified for trust by a Distributed Certification Authority. Trusted nodes exchange IDCs to ensure authentication. Data services can be offered to the authenticated neighbouring nodes without any infrastructure, which saves time. Services can be discovered using the Web Services Description Language (WSDL). Untrustworthy nodes enrolling with intruded signatures are blocked by the Distributed Agent Network Intrusion Detection System (DANIDS), which alerts all other nodes in the network.
Keywords— Identity Card Security, Distributed Authority, Signatures, Authentication, Intrusion Detection System.
1. INTRODUCTION
MANET (Mobile Ad hoc Network) refers to a
multi hop packet based wireless network entangled
with a set of mobile nodes that can communicate
spontaneously. The design of a protocol allows the
creation and management of a spontaneous wireless
ad hoc network with highly secured transactions
and with little human intervention. No
Infrastructure is required and is intended to self
organize based upon the environments and
availability. Security, trust and authentication are the key features included. The network is self-configured based upon the physical and logical parameters provided; it begins with the first node and spreads by attaching forthcoming nodes as neighbour nodes in the network, thereby achieving scalability. The protocol encloses an IDC (Identity Card) having two components, public and private, to provide security and trust in networks.
The encrypted form of the IDC yields digital signatures, which are certified and trusted. No centralized certificate authority is included. Joining a node to the configured network and communication between the nodes are done only based on trust and a certificate issued by the Distributed Certificate Authority. Trusted nodes exchange their IDCs with each other to ensure authentication. Thus reliable and secure
communication is enabled. Data services can be
offered to the authenticated neighboring nodes
without any infrastructure, which saves time. Various paths to reach the destination can be determined by the nodes themselves. Services can be discovered using the Web Services Description Language (WSDL). A node receives a data packet that is ciphered by a public key; when the server process receives the packet, it is in charge of deciphering it with the private key of the user. When the data is not delivered properly, it is not acknowledged and retransmission is done by the user. Untrustworthy nodes are blocked by the Intrusion Detection Mechanism within the protocol.
1.1 MANET
A MANET is a type of ad hoc network that can
change locations and configure itself on the fly.
Because MANETs are mobile, they use wireless
connections to connect to various networks. This
can be a standard Wi-Fi connection, or another
medium, such as a cellular or satellite transmission.
Working of MANET
The purpose of the MANET working group is to
standardize IP routing protocol functionality
suitable for wireless routing application within both
static and dynamic topologies with increased
dynamics. Approaches are intended to be relatively
lightweight in nature, suitable for multiple
hardware and wireless environments, and address
scenarios where MANET are deployed at the edges
of an IP infrastructure. Hybrid mesh infrastructure
(e.g., a mixture of fixed and mobile routers) is
supported.
Characteristics of MANET
In MANET, each node acts as both host and
router. It is autonomous in behavior. The nodes can
join or leave the network anytime, making the
network topology dynamic in nature. Mobile nodes
are characterized with less memory, power and
light weight features. Mobile and spontaneous
behavior demands minimum human intervention to
configure the network. All nodes have identical
features with similar responsibilities and
capabilities and hence it forms a completely
symmetric environment. High user density and
large level of user mobility is present. Nodal
connectivity is intermittent.
Forms of Connections
Infrastructure-based Networks
It is a form of network with a backbone and an access point, in which the authority control is centralized. Nodes communicate with the access point, and this form is suitable for areas where an AP is provided. Figure 1 depicts this form of network.
Infrastructure-less Networks
It is a form of network without any backbone or access point. Every station is simultaneously a router, and the authority control is distributed. Figure 2 depicts a network formed with no backbone or access point; any node can access any other node without centralized control.
2. Implementing IDC Security in Protocol
The protocol proposed in this paper can establish a secure self-configured ad hoc environment for data distribution among users. Security is established based on the service required by the users, by building a trust network to obtain a distributed certification authority between the users that trust the new user. We apply asymmetric cryptography, where each device has a public-private key pair for device identification, and symmetric cryptography to exchange session keys between nodes. There are no anonymous users, because confidentiality and validity are based on user identification.
Advantages
The basis is to set up a secure spontaneous network and solve several security issues. An authentication phase is included based on the IDC (Identity Card), which helps in the unique identification of a node. Each node is identified uniquely with a public key and LID after the authentication process, which verifies the integrity of the data. The trust phase enables each and every trusted node to behave as a distributed authority and to have direct communication without any central control. Validation of integrity and authentication is done automatically in each node. There exists a mechanism that allows nodes to check the authenticity of their IP addresses while not generating duplicated IP addresses; this mechanism helps nodes to authenticate by using their IP addresses. The protocol does not require any infrastructure, and every node self-configures its logical and physical parameters automatically without any user intervention. Flooding of data to all nodes in the network is avoided by allowing each individual node to choose a path to reach the destination. Hacking signatures, a class of intrusion, can be detected, blocked and prevented, and intrusions can be alerted to all individual users in the network. This is shown in fig. 3.
2.1 Registration
The user accesses the application and provides Identity Card information to the system protocol. A new node and network are created. The Node Join approach (a distributed algorithm) authenticates the information to join the node to the network. Services are discovered, and data is delivered and acknowledged. Hacked nodes are detected and blocked with an alert. The public component of the IDC comprises the Logical Identity (a unique ID), a public key and an IP; the private component of the IDC includes the private key.
2.2 Node Creation
The basic idea is to encrypt the registered IDC information along with the encrypted message. A message digest of the IDC is generated by the SHA algorithm and encrypted with the user's private key, forming the Digital Signature. Each node is validated with the Distributed Certificate Authority and is considered a trusted node, which completes node creation. A public key, LID and private key are assigned and used for data exchange; if validation fails, the device will not exchange data. The user introduces personal data at first login and the security information is generated. Data are stored persistently in the device. Both clients and servers can request or serve requests for information or authentication from other nodes. The user certificate has an expiration.
SHA-1 Algorithm
The SHA-1 algorithm produces a 160-bit digest. A hash value is generated by a function H of the form h = H(M), where M is a variable-length message and H(M) is the fixed-length hash value. It takes as input a message with a maximum length of less than 2^64 bits and produces as output a 160-bit message digest. The input is processed in 512-bit blocks; the word size is 32 bits and the number of steps is 80. The process includes appending padding bits and the length, initializing the MD buffer, processing the message in 512-bit blocks, and producing the output.
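A minimal Java sketch of the message digest step, assuming the IDC's public component is serialized to a string (the field layout shown is hypothetical):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Illustrative computation of the 160-bit SHA-1 message digest of an IDC's
// public component (field names here are hypothetical).
public class DigestSketch {
    public static void main(String[] args) throws Exception {
        String idcPublicComponent = "LID=node-01;publicKey=...;ip=192.168.1.10";
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] digest = sha1.digest(idcPublicComponent.getBytes(StandardCharsets.UTF_8));
        System.out.println("Digest length: " + digest.length * 8 + " bits"); // 160 bits
    }
}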
Network Configuration Module
In this module, we create a new network for the trusted users. The network is created from the logical and physical configuration parameters that are passed, and each node generates the session key that will be exchanged with new nodes after authentication. Each node configures its own data, including the first node; the data include the IP, port, user data and security data. The network begins with the first node and spreads by attaching forthcoming nodes as neighbour nodes in the network without restrictions, thereby achieving scalability. Nodes can also send requests to update network information; the reply will contain the identity cards of all nodes in the network, and nodes replying to the request must sign this data to ensure authenticity. The owner provides the session key. The data is shared between two trusted users through the session key for their respective data and by encrypting their files.
AES Algorithm
The session key is generated by the AES (Advanced Encryption Standard) algorithm. A symmetric key is used as the session key to cipher confidential messages between trusted nodes; AES uses a 128-bit key length and block length, with 10 rounds. The round key size is 128 bits and the expanded key size is 176 bytes. It offers high security because its design structure removes sub-key symmetry. Also, execution time and energy consumption are adequate for low-power devices.
The user can only access the data file with the
encrypted key if the user has the privilege to access
the file. Encryption process includes Add round
key, Substitute bytes, Shift rows and Mix columns.
Decryption involves inverse sub bytes, Inverse shift
rows, Inverse mix columns and Add round key.
Session key has an expiration time, so it is revoked
periodically. All these values are stored in each
node.
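A minimal Java sketch of generating a 128-bit AES session key and ciphering a confidential message with it; the CBC mode and PKCS5 padding are assumptions, since the text does not specify them:

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Illustrative generation of a 128-bit AES session key and message encryption.
public class SessionKeySketch {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);                       // 128-bit key length, 10 rounds
        SecretKey sessionKey = keyGen.generateKey();

        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);       // fresh IV for each message

        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");   // assumed mode/padding
        cipher.init(Cipher.ENCRYPT_MODE, sessionKey, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal("confidential message".getBytes(StandardCharsets.UTF_8));
        System.out.println("Ciphertext bytes: " + ciphertext.length);
    }
}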
2.3 Node Joining
It employs a distributed algorithm called the Node Join approach. A node joins the network only if it attains trust and obtains a certificate from a valid Distributed CA. Next, trusted nodes exchange IDCs, and authentication is done using the IDC (Identity Card) and the certificate. A node also authenticates a requesting node by validating the received information and verifying the non-duplication of the LID and IDC. IP assignment is then done if authentication succeeds; if authentication fails, intrusions in the network are determined. WSDL (Web Services Description Language) configures the network, delivers the data and acknowledges it if delivered to the destination. When the node is authenticated it is able to perform operations either transparently or as an individual user. The authenticated node can display the nodes, send data to all nodes, and join and leave the network. After authentication, nodes are provided with the IDC (Identity Card and Certificate) for further communication. There are only two trust levels in the system: any two nodes can trust each other directly or through a trusted mutual neighbour node.
RSA Algorithm
The asymmetric key encryption scheme RSA is used for distribution of the session key and for the authentication process. RSA includes 512-bit and 1024-bit keys. The RSA scheme is a block cipher in which the plain text (M) and cipher text (C) are integers between 0 and n-1 for some n; the typical size of n is 1024 bits. Plain text is encrypted in blocks, with each block having a binary value less than n; the block size is k bits, where k is less than or equal to log2(n). Both sender and receiver must know the value of n. The sender knows the value of e and only the receiver knows the value of d.
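A minimal Java sketch of RSA key-pair generation and of ciphering data with the receiver's public key and deciphering it with the private key, as described above (the 1024-bit size follows the text; the padding scheme is an assumption):

import javax.crypto.Cipher;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;

// Illustrative RSA key-pair generation and public-key encryption / private-key decryption.
public class RsaSketch {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(1024);                                      // typical size cited in the text
        KeyPair pair = gen.generateKeyPair();

        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");   // assumed padding
        rsa.init(Cipher.ENCRYPT_MODE, pair.getPublic());           // sender uses receiver's public key
        byte[] cipherText = rsa.doFinal("session key bytes".getBytes(StandardCharsets.UTF_8));

        rsa.init(Cipher.DECRYPT_MODE, pair.getPrivate());          // receiver deciphers with private key
        System.out.println(new String(rsa.doFinal(cipherText), StandardCharsets.UTF_8));
    }
}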
2.4 Data Transfer
Services can be discovered using the Web Services Description Language (WSDL) when a node asks for the available services. Services include data packet delivery to any of the trusted nodes. A node will forward the packet to its neighbours. Any path to reach the destination can be determined by the user, and flooding of information to all nodes is avoided. This is helpful when a neighbour is an intruder and is blocked; at that time, the user can choose another path to reach the destination, which saves time. When the data is properly delivered to the trusted nodes, an acknowledgement is returned to the sender. When the data is not delivered properly, it is not acknowledged, or the timer expires, and retransmission is done by the user. A node receives a data packet that is ciphered by a public key. When the server process receives the packet, it is in charge of deciphering it with the private key of the user. To send data encrypted with the public key to a node, the user selects the remote node and writes the data. The message is encrypted using the remote node's public key; the application encrypts the data with the public key, generates the packet and sends it to the selected node.
3. Intrusion Detection
An intruder is a threat that produces a sudden onslaught, making the system deteriorate or become blocked. Abnormal, unsafe behaviour (an attack) is often detected using various intrusion detection systems. The channel is shared, and due to the lack of centralized control, the nodes in the network are vulnerable to various attacks from intruders.
3.1 DANIDS Architecture
The Distributed Agent Network Intrusion Detection System (DANIDS) is a collection of autonomous agents running within distributed systems that detects intrusions. A response-based intrusion detection system is proposed, which uses several mobile IDS agents for detecting different malicious activities in a node. These multiple IDS agents detect and locate the malicious nodes. The proposed system relies on the data obtained from its local neighbourhood. Thus, each node possesses signatures (data from neighbours) found in logs or input data streams and detects attacks. The log is known as audit trail data. Signature analysis is based on the attacking knowledge; the semantic description of an attack is transformed into the appropriate audit trail format, depicted in fig. 4. Each audit trail record contains the following fields. Subject: initiators of actions; a subject is typically a node user or a process acting on behalf of users or groups of users. Subjects issuing commands constitute entire activities and may belong to different access classes that may overlap. Action: an operation performed by the subject on or with an object; for example, login, read, perform I/O, execute. Object: receptors of actions; examples include files, programs, messages, and records. Object granularity may vary by object type and by environment. Exception-Condition: denotes which, if any, exception condition is raised on return. Resource-Usage: a list of quantitative elements in which each element gives the amount used of some resource (e.g., the number of records read or written, session elapsed time). Time-Stamp: a unique time-and-date stamp identifying when the action took place. From this data the system constructs information about the entire network. Each agent continuously overhears the neighbour nodes' activities and records them in the audit trail. The node prepares the control data embedded in each packet that helps to identify malicious nodes; the neighbour node utilizes this data and updates it further to detect the malicious nodes. On detection, all other nodes are sent multiple ALERTS about the malicious activities. Figure 4 shows the overall architecture, which consists of two main components:
Agent module: An audit collection module
operating as a background process on a monitored
system. Its purpose is to collect data on security
related events on the host and transmit these to the
central manager. Each agent can be configured to
the operating environment in which it runs. It filters
the needed details from the Audit Trail Record and
ensures with the fuzzy logic determined.
Central manager module: Receives reports from
agents and processes and correlates these reports to
detect intrusion. In addition, other third party tools
-- LAN monitors, expert systems, intruder
recognition and accountability, and intruder tracing
tools -- can be "plugged in" to augment the basic
host-monitoring infrastructure. The Manager communicates with each agent, relaying alert messages when an attack is detected by an agent in the audit log. Simple fuzzy rules were designed to identify misbehaving nodes.
Filters: The Medium Access Control layer plays an important role in DANIDS. The address registration process guarantees the uniqueness of a node's IP address. Three new data structures are created at the edge routers: the filtering database, the Internet client's address table, and the Internet client blacklist table. Information extracted from the new ARO and DAR messages is used to fill the filtering database, and packets are filtered accordingly. The database is filled based on the data received from other nodes and the client request rate.
Data Structures
The Client Address Table (CAT) includes the client IP address, the lifetime, and the number of times the client has been added to the blacklist (a counter). The Client Blacklist Table (CBT) holds all client addresses whose lifetime reaches 0, IP addresses determined at more than one node, and nodes that do not match the encrypted IDC and signature.
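A minimal Java sketch of how the CAT and CBT could be represented; the field names are assumptions based on the description above:

import java.util.ArrayList;
import java.util.List;

// Illustrative data structures for the Client Address Table (CAT) and the
// Client Blacklist Table (CBT); field names are assumptions from the text.
public class FilterTables {

    static class CatEntry {
        String clientIp;        // client IP address
        long   lifetime;        // remaining lifetime
        int    blacklistCount;  // number of times added to the blacklist

        CatEntry(String clientIp, long lifetime) {
            this.clientIp = clientIp;
            this.lifetime = lifetime;
        }
    }

    final List<CatEntry> clientAddressTable   = new ArrayList<>();
    final List<String>   clientBlacklistTable = new ArrayList<>();

    // Move an entry to the blacklist when its lifetime reaches 0, its IP is
    // duplicated at more than one node, or its IDC/signature does not match.
    void blacklist(CatEntry e) {
        e.blacklistCount++;
        clientBlacklistTable.add(e.clientIp);
    }
}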
Filtering Packets Received: When the edge router receives or sends a packet to a neighbour, the agent filters the details and stores them in the audit trail record. It verifies the IDC, and the address of a node that matches the signatures is filtered into the Client Address Table. Also, if it is a retransmission request, its address is filtered into the Client Address Table. Finally, if the signature does not match or the requesting node's lifetime expires, it is filtered into the Blacklist Table.
Working Procedure
When a node receives a packet from another node, it views the audit trail log and decrypts the encrypted IDC and signature. It verifies two cases. Case 1: the IDC and signature match; the node also checks the blacklist table to see whether the IP address is replicated. If the signature does not match or the IP is replicated, a hacked node is detected by the agent, which reports to the central manager; if they match, the node communicates and delivers data, and an acknowledgement is sent. Case 2: if the request is a retransmission caused by expiry of the lifetime, the node checks the client address table for confirmation, resends the data and waits for acknowledgement. In the route-over routing approach, the process is similar to the mesh-under approach. 6LRs are used, together with the new DAR and DAC messages, to verify address uniqueness on the edge router. New address registration option (ARO) and duplicate address request (DAR) message formats are included. The ARO option contains two fields reserved for future use, the first 8 bits long and the second 16 bits long. Moreover, the DAR messages also contain an 8-bit reserved field to implement the security mechanism. Figure 5 depicts this data flow.
Packet Send Ratio (PSR): The ratio of packets
that are successfully sent out by a legitimate traffic
source compared to the number of packets it
intends to send out at the MAC layer. If too many
packets are buffered in the MAC layer, the newly
arrived packets will be dropped. It is also possible
that a packet stays in the MAC layer for too long,
resulting in a timeout. If A intends to send out n
messages, but only m of them go through, the PSR
is m/n. The PSR can be easily measured by a
wireless device by keeping track of the number of
packets it intends to send and the number of
packets that is successfully sent out.
Packet Delivery Ratio (PDR): The ratio of
packets that are successfully delivered to a
destination compared to the number of packets that
have been sent out by the sender. B may not be able
to decode packet sent by A, due to the interference
introduced by X with an unsuccessful delivery. The
PDR may be measured at the receiver that passes
the CRC check with respect to the number of
packets received. PDR may also be calculated at
the sender A by having B send back an
acknowledge packet. In either case, if no packets
are received, the PDR is defined to be 0.
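A small Java sketch of the two ratios as defined above:

// Illustrative computation of the Packet Send Ratio and Packet Delivery Ratio.
public class LinkMetrics {

    // PSR: packets successfully sent out vs. packets the source intended to send (m / n).
    static double packetSendRatio(int sentOk, int intendedToSend) {
        return intendedToSend == 0 ? 0.0 : (double) sentOk / intendedToSend;
    }

    // PDR: packets successfully delivered (passing the CRC check) vs. packets sent by the sender.
    static double packetDeliveryRatio(int deliveredOk, int sentBySender) {
        return sentBySender == 0 ? 0.0 : (double) deliveredOk / sentBySender;  // 0 if nothing received
    }

    public static void main(String[] args) {
        System.out.println(packetSendRatio(90, 100));     // 0.9
        System.out.println(packetDeliveryRatio(72, 90));  // 0.8
    }
}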
3.2 Result Analysis
Table 1 shows the audit trail record data for attacks detected on the MAC layer and failed logins due to time expiry, for various numbers of events. The graph for the data is presented in figure 6. Simple fuzzy rules are used to detect attacks. The behaviour of the nodes is observed for the past N intervals from a Backup Window (similar to a sliding window). Time expiration is calculated by setting a threshold time interval value ΔT from small to large, set as ΔT = 15 sec, 25 sec and 35 sec, with T = 1000 milliseconds. In a hacked access, a login with a wrong password is compared with the stored IDC, captured as a mismatched IDC, and the attack is detected.
CONCLUSION
In this paper, a complete secured protocol implemented in ad hoc networks is well defined with little user intervention. No infrastructure or central authority control is required. Each node is identified uniquely with an IDC and LID after the authentication process, which verifies the integrity of the data. The encrypted form of the IDC yields digital signatures certified by the Distributed Authority. The network is self-configured based upon the physical and logical parameters provided; it begins with the first node and spreads by attaching forthcoming nodes as neighbour nodes in the network, thereby achieving scalability. Joining a node to the configured network and communication between the nodes are done only based on trust and authentication. Thus reliable and secure communication is enabled. Data services can be offered to the authenticated neighbouring nodes; flooding is avoided by allowing each individual node to choose a path to reach the destination, which reduces network traffic and saves time. Services can be discovered using the Web Services Description Language (WSDL). Time-expired packets can be retransmitted. Hacking signatures, a class of intrusion caused by untrustworthy nodes, can be detected and blocked by DANIDS (the Intrusion Detection System) within the protocol, and intrusions can be alerted to all individual users in the network.
VIRTUAL IMAGE RENDERING AND STATIONARY RGB
COLOUR CORRECTION FOR MIRROR IMAGES
1 S. Malathy, 2 R. Sureshkumar, 3 V. Rajasekar
1 Second Year M.E. (Computer Science and Engineering)
2 Assistant Professor (Computer Science and Engineering)
3 Assistant Professor (Computer Science and Engineering)
1,2 Maharaja Prithvi Engineering College, Avinashi - 641654
3 Muthayammal College of Engineering, Rasipuram
1 [email protected], 2 [email protected], 3 [email protected]
ABSTRACT
The key idea is to develop an application for digital image processing which concurrently processes both stationary images and RGB colour corrections. It deals with two types of input, namely system-based input and camera-based input. The system-based input is manual input from the user without any hardware devices: the mirror and RGB images are available in the system, and the user can use these images for image rendering. The other method of input is camera-based: a black-box-based camera enclosure is created, and the camera is connected to a personal computer through a universal serial port. Whenever an image is placed before the camera, the camera can be operated from the system, so the image can be captured from the camera and given as the input image. Using image rendering, edge detection and the support vector machine method, the image is recognized by the application. Despite rapid progress in mass-storage density, processor speeds, and digital communication system performance, demand for data storage capacity and data-transmission bandwidth continues to outstrip the capabilities of available technologies.
General Terms
Edge Detection, Laplacian of Gaussian, Canny Edge
Detection, Sobel Edge Detection Methodologies, SVM
Classification.
Keywords -- Mirrors, Image reconstruction, RGB
system, Depth image denoising, Support Vector
Machine.
1. INTRODUCTION
Digital image processing is an area that has seen great
development in recent years. It is an area which is
supported by a broad range of disciplines, where signal
processing and software engineering are among the
most important. Digital image processing applications
tend to substitute or complement an increasing range of
activities. Applications such as automatic visual
inspection, optical character recognition, object identification, etc., are increasingly common.
Digital image processing studies the processing of
digital images, i.e., images that have been converted
into a digital format and consist of a matrix of points
representing the intensities of some function at that
point. The main objectives are related to the image
improvement for human understanding and 'the
processing of scene data for autonomous machine
perception'.
This later task normally comprises a number of steps.
The initial processing step is the segmentation of the
image into meaningful regions, in order to distinguish
and separate various components. From this division,
objects can then be identified by their shape or from
other features. This task usually starts with the detection
of the limits of the objects, commonly designated as
edges. Effectively a human observer is able to
recognize and describe most parts of an object by its
contour, if this is properly traced and reflects the shape
of the object itself. In a machine vision system this
recognition task has been approached using a similar
technique, with the difference being that a computer
system does not have other information than that which
is built into the program. The success of such a
recognition method is directly related to the quality of
the marked edges.
Under 'engineered' conditions, e.g. backlighting, edges
are easily and correctly marked. However, under
normal conditions where high contrast and sharp image
are not achievable detecting edges become difficult.
Effectively as contrast decreases the difficulty of
marking borders increases. This is also the case when
the amount of noise present in the image increase,
which is 'endemic' in some applications such as x-rays.
Common images e.g. from interior scenes although
containing only small amounts of noise, present uneven
illumination conditions. This diminishes contrast which
affects the relative intensity of edges and thus
complicates their classification. Finally, image blur due
to imperfections in focus and lens manufacture
smoothes the discontinuities that are typical from edges
and thus once again makes the edges difficult to detect.
These problems prompted the development of edge
detection algorithms which, to a certain degree of
success, are able to cope with the above adverse
conditions. Under suitable conditions most of the edge
detection algorithms produce clear and well defined
edge maps, from which objects within the image are
easily identified. However, the produced edge maps
degrade as the image conditions degrade. Not only
misplacements of the shape occur, spurious features
appear and edge widths differ from algorithm to
algorithm. It may be hypothesized that edge maps
produced by different algorithms complement each
other. Thus it may be possible to override some of the
vagueness by comparison between different edge maps.
2. EDGE DETECTION OBJECTIVES
The interest in digital image processing came from two
principal application areas: improvement of pictorial
information and processing of scene data for
autonomous robot classification within autonomous
machine perception. In the second area, where the most
primordial motivation of this thesis is based and in
which edge detection is used. The first processing steps
are the identification of meaningful areas within the
picture. This process is called segmentation. It
represents an important early stage in image analysis
and image identification. Segmentation is a grouping
process in which the components of a group are similar
in regard to some feature or set of features. Given a
definition of "uniformity", segmentation is a
partitioning of the picture into connected subsets each
of which is uniform but such that no union of adjacent
subsets is uniform. There are two complementary
approaches to the problem of segmenting images and
isolating objects - boundary detection and region
growing. Edge oriented methods generally lead to
incomplete segmentation. The resulting contours may
have gaps or there may be additional erroneous edge
elements within the area. The results can be converted
into complete image segmentation by suitable post
processing methods, such as contour following and
edge elimination. Region growing is a process that
starts from some suitable initial pixels and using
iterations neighboring pixels with similar properties are
assigned, step by step, to sub regions. A suitable basis
for this type of segmentation could be a thresholding
process.
An edge the grey level is relatively consistent in each of
two adjacent, extensive regions, and changes abruptly
as the order between the regions is crossed. Effectively
there are many well known paradoxes in which an edge
or contour is clearly seen where none physically exists.
This is due to the characteristics of our perception
capabilities, and our tendency to group similar
information as described in Gestalt's approaches to
perception or to infer edges from the context of the
image.
The intensity of reflected light is a function of the angle
of the surface to the direction of the incident light. This
Leeds to a smooth variation of the light intensity
reflected across the surface as its orientation to the
direction of the light source changes, which cannot be
considered as an edge. Also shadows give sharp
changes in the brightness within an image of a smooth
and flat surface. This does not represent the limit of an
object. In the other extreme, such as in the case of
technical drawing, where there are thin lines drawn,
where no discontinuity on the represented object exists,
but which are important for the understanding of the shape of an object. The relation between edges and grey level discontinuities is not clear, and a decision can only be made where an understanding of the image exists (which, in some way, is the ultimate goal of the whole image processing process). As Vicky Bruce states:
There is a relationship between the places in an image
where light intensity and spectral composition change,
and the places in the surroundings where one surface or
object ends and another begins, but this relation is by no
means a simple one. There are a number of reasons why
we cannot assume that every intensity or spectral
change in an image specifies the edge of an object or
surface in the world
The weakness and limitations of the concept can be shown through an image with two areas, between which the grey levels present a linearly varying discontinuity from 0 to S. Let us assume that the two areas present linearly varying grey levels with the ranges [0..n] and [0..n+S] respectively, and such that n/x « S. A schematic three-dimensional graph of such an image is presented in figure 1.
Fig 1 : Grey level 3D plot of an image which consists
of two distinct areas
In an image defined like this, the edge, as the discontinuity between the two surfaces, will be marked by the same operators with a different extension depending on the value of the parameter n, although the discontinuity itself is independent of it. All the above definitions include some ambiguity in the definition of an edge: effectively, it will only be known at the end of processing, when segmentation has been performed, where the edges are. An edge is a subjective entity, as defined by Blicher. As this author said, "The term 'edge' has been fairly abused and we will continue that
tradition here". Edges appear with several profiles, of which the most common are step edges; however, ramp edges and roof edges (figure 2) are also common.
3. CANNY EDGE DETECTION
Edges characterize boundaries and are therefore a
problem of fundamental importance in image
processing. Edges in images are areas with strong
intensity contrasts – a jump in intensity from one pixel
to the next. Edge detecting an image significantly
reduces the amount of data and filters out useless
information, while preserving the important structural
properties in an image. This point is also made in the Sobel and Laplacian edge detection discussion below, but it is worth reemphasizing why one would want to detect edges.
Fig 2 : Common models for edge profiles
Edges also appear with several shapes like curves or
straight lines. The most common shape in our
environment is vertical or horizontal straight lines.
Some particular scenes for instance blood cells do not
have straight edges at all. So a particular edge shape
cannot be assumed a priori. All these shapes have
associated with them some quantity of noise inherent to
the acquisition process. These problems can be seen in
figure 4 which is a three dimensional plot of the image
from the C key of the keyboard image in figure 3.
Fig 3 : Image of a keyboard
In particular, the grey level variation in the C print, clearly visible in the middle as a re-entrance darker than the key, is not abrupt.
Fig 4: Three dimensional plot of the
image from the C key of the computer keyboard
With the known exception of some images used in robotics, which are acquired with very high contrast and upon which binary thresholding is carried out, all images from real scenes have a small amount of noise. This can also originate from sources external to the acquisition system.
The Canny edge detection algorithm is known to many
as the optimal edge detector. Canny's intentions were to
enhance the many edge detectors already out at the time
he started his work. He was very successful in
achieving his goal and his ideas and methods can be
found in his paper, "A Computational Approach to
Edge Detection". In his paper, he followed a list of
criteria to improve current methods of edge detection.
The first and most obvious is a low error rate. It is important that edges occurring in images should not be missed and that there be NO responses to non-edges.
The second criterion is that the edge points be well
localized. In other words, the distance between the edge
pixels as found by the detector and the actual edge is to
be at a minimum. A third criterion is to have only one
response to a single edge. This was implemented
because the first 2 were not substantial enough to
completely eliminate the possibility of multiple
responses to an edge.
Based on these criteria, the canny edge detector first
smoothes the image to eliminate any noise. It then finds
the image gradient to highlight regions with high spatial
derivatives. The algorithm then tracks along these
regions and suppresses any pixel that is not at the
maximum. The gradient array is now further reduced by
hysteresis. Hysteresis is used to track along the
remaining pixels that have not been suppressed.
Hysteresis uses two thresholds and if the magnitude is
below the first threshold, it is set to zero. If the
magnitude is above the high threshold, it is made an
edge. And if the magnitude is between the two thresholds, then it is set to zero unless there is a path from this pixel to a pixel with a gradient above the high threshold.
In order to implement the canny edge detector
algorithm, a series of steps must be followed. The first
step is to filter out any noise in the original image
before trying to locate and detect any edges. And
because the Gaussian filter can be computed using a
simple mask, it is used exclusively in the Canny
algorithm. Once a suitable mask has been calculated,
the Gaussian smoothing can be performed using
standard convolution methods. A convolution mask is
usually much smaller than the actual image. As a result,
the mask is slid over the image, manipulating a square
of pixels at a time. The larger the width of the Gaussian
mask, the lower is the detector's sensitivity to noise.
The localization error in the detected edges also
increases slightly as the Gaussian width is increased.
The Gaussian mask used in this implementation is shown below in figure 5.
Fig 5: Implementation of Gaussian Mask
After smoothing the image and eliminating the noise,
the next step is to find the edge strength by taking the
gradient of the image. The Sobel operator performs a 2D spatial gradient measurement on an image. Then, the
approximate absolute gradient magnitude (edge
strength) at each point can be found. The Sobel operator
uses a pair of 3x3 convolution masks, one estimating
the gradient in the x-direction (columns) and the other
estimating the gradient in the y-direction (rows). They
are shown below figure 6.
Fig 6: Estimation of the Gradient in x and y direction
The magnitude, or EDGE STRENGTH, of the gradient is then approximated using the formula |G| = |Gx| + |Gy|. Finding the edge direction is trivial once the gradients in the x and y directions are known. However, an error will be generated whenever sumX is equal to zero, so the code has to handle this case: whenever the gradient in the x-direction is equal to zero, the edge direction has to be equal to 90 degrees or 0 degrees, depending on the value of the gradient in the y-direction. If Gy has a value of zero, the edge direction will equal 0 degrees; otherwise the edge direction will equal 90 degrees. The formula for finding the edge direction is just theta = invtan(Gy / Gx).
Once the edge direction is known, the next step is to relate the edge direction to a direction that can be traced in an image. So if the pixels of a 5x5 image are aligned as follows:
x x x x x
x x x x x
x x a x x
x x x x x
x x x x x
then it can be seen, by looking at pixel "a", that there are only four possible directions when describing the surrounding pixels - 0 degrees in the horizontal direction, 45 degrees along the positive diagonal, 90 degrees in the vertical direction, or 135 degrees along the negative diagonal. So now the edge orientation has to be resolved into one of these four directions depending on which direction it is closest to (e.g. if the orientation angle is found to be 3 degrees, make it zero degrees). Think of this as taking a semicircle and dividing it into 5 regions as shown in figure 7.
Fig 7: Dividing the semicircle and orientation angle of Edge orientation
Therefore, any edge direction falling within the yellow range (0 to 22.5 and 157.5 to 180 degrees) is set to 0 degrees. Any edge direction falling in the green range (22.5 to 67.5 degrees) is set to 45 degrees. Any edge direction falling in the blue range (67.5 to 112.5 degrees) is set to 90 degrees. And finally, any edge direction falling within the red range (112.5 to 157.5 degrees) is set to 135 degrees.
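A small illustrative sketch, in Java, of the direction computation and quantization just described (not part of the original implementation):

// Illustrative edge-direction computation and quantization into the four
// traceable directions (0, 45, 90, 135 degrees) described above.
public class EdgeDirection {

    static int quantize(double gx, double gy) {
        double theta;
        if (gx == 0.0) {                                  // avoid division by zero
            theta = (gy == 0.0) ? 0.0 : 90.0;
        } else {
            theta = Math.toDegrees(Math.atan(gy / gx));   // theta = invtan(Gy / Gx)
            if (theta < 0) theta += 180.0;                // fold into [0, 180)
        }
        if (theta < 22.5 || theta >= 157.5) return 0;
        if (theta < 67.5)  return 45;
        if (theta < 112.5) return 90;
        return 135;
    }

    public static void main(String[] args) {
        System.out.println(quantize(10, 0));   // 0
        System.out.println(quantize(10, 10));  // 45
        System.out.println(quantize(0, 10));   // 90
        System.out.println(quantize(10, -10)); // 135
    }
}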
After the edge directions are known, non maximum
suppression now has to be applied. Non maximum
suppression is used to trace along the edge in the edge
direction and suppress any pixel value (sets it equal to
0) that is not considered to be an edge.
Finally, hysteresis is used as a means of eliminating
streaking. Streaking is the breaking up of an edge
contour caused by the operator output fluctuating above
and below the threshold. If a single threshold, T1 is
applied to an image, and an edge has an average
strength equal to T1, then due to noise, there will be
instances where the edge dips below the threshold.
Equally it will also extend above the threshold making
an edge look like a dashed line. To avoid this, hysteresis
uses 2 thresholds, a high and a low. Any pixel in the
image that has a value greater than T1 is presumed to be
an edge pixel, and is marked as such immediately.
Then, any pixels that are connected to this edge pixel
and that have a value greater than T2 are also selected
as edge pixels. If you think of following an edge, you
need a gradient of T2 to start but you don't stop till you
hit a gradient below T1.
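A minimal sketch of this double-threshold idea (Python with NumPy and SciPy assumed; written here for illustration and not taken from the paper), where t_high plays the role of T1 and t_low the role of T2:

import numpy as np
from scipy import ndimage

def hysteresis(magnitude, t_low, t_high):
    # Pixels above the high threshold are edge pixels immediately.
    strong = magnitude >= t_high
    # Pixels above the low threshold are kept only if they connect to a strong pixel.
    candidates = magnitude >= t_low
    labels, n = ndimage.label(candidates)
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False                      # label 0 is the background
    return keep[labels]                  # boolean edge map without streaking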
4. SOBEL EDGE DETECTION
Edges characterize boundaries and are therefore a
problem of fundamental importance in image
processing. Edges in images are areas with strong
intensity contrasts – a jump in intensity from one pixel
to the next. Edge detecting an image significantly
reduces the amount of data and filters out useless
information, while preserving the important structural
properties in an image. There are many ways to perform
edge detection. However, the majority of different
methods may be grouped into two categories, gradient
and Laplacian. The gradient method detects the edges
by looking for the maximum and minimum in the first
derivative of the image. The Laplacian method searches
for zero crossings in the second derivative of the image
to find edges. An edge has the one-dimensional shape
of a ramp and calculating the derivative of the image
can highlight its location as shown in figure 8
Fig 8: Signal with an edge shown by the jump in intensity
If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect to t), we get the signal shown in figure 9.
Fig 9: Gradient signal with respect to t
Clearly, the derivative shows a maximum located at the
center of the edge in the original signal. This method of
locating an edge is characteristic of the “gradient filter”
family of edge detection filters and includes the Sobel
method. A pixel location is declared an edge location if
the value of the gradient exceeds some threshold. As
mentioned before, edges will have higher pixel intensity
values than the pixels surrounding them. So once a threshold is set, the gradient value can be compared to the threshold value and an edge detected whenever the threshold is exceeded. Furthermore, when the first derivative is at a maximum, the second derivative is zero. As a result, an alternative for finding the location of an edge is to locate the zeros in the second derivative. This method is known as the Laplacian, and the second derivative of the signal is shown in figure 10.
Fig 10: Second Derivative of Laplacian
4.1 Sobel
Based on this one-dimensional analysis, the theory can
be carried over to two-dimensions as long as there is an
accurate approximation to calculate the derivative of a
two-dimensional image. The Sobel operator performs a
2-D spatial gradient measurement on an image.
Typically it is used to find the approximate absolute
gradient magnitude at each point in an input grayscale
image. The Sobel edge detector uses a pair of 3x3
convolution masks, one estimating the gradient in the x-direction (columns) and the other estimating the
gradient in the y-direction (rows). A convolution mask
is usually much smaller than the actual image. As a
result, the mask is slid over the image, manipulating a
square of pixels at a time. The actual Sobel masks are
shown in the figure 11
Fig 11: Actual Sobel Masks
The magnitude of the gradient is then calculated using the formula |G| = √(Gx² + Gy²). An approximate magnitude can be calculated using |G| = |Gx| + |Gy|.
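The convolution step can be sketched as follows (Python with NumPy/SciPy assumed; the masks are the standard Sobel kernels and the helper name is ours):

import numpy as np
from scipy import ndimage

# Standard 3x3 Sobel masks: SOBEL_X estimates the gradient along the columns,
# SOBEL_Y (its transpose) estimates the gradient along the rows.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(image):
    # Slide each mask over the image, manipulating a square of pixels at a time.
    gx = ndimage.convolve(image.astype(float), SOBEL_X, mode='reflect')
    gy = ndimage.convolve(image.astype(float), SOBEL_Y, mode='reflect')
    return np.abs(gx) + np.abs(gy)       # approximate |G| = |Gx| + |Gy|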
4.2 Laplacian of Gaussian
The Laplacian is a 2-D isotropic measure of the 2nd
spatial derivative of an image. The Laplacian of an
image highlights regions of rapid intensity change and
is therefore often used for edge detection. The
Laplacian is often applied to an image that has first
been smoothed with something approximating a
Gaussian Smoothing filter in order to reduce its
sensitivity to noise. The operator normally takes a
single gray level image as input and produces another
gray level image as output. The Laplacian L(x,y) of an
image with pixel intensity values I(x,y) is given by:
L(x,y) = ∂²I/∂x² + ∂²I/∂y²
Since the input image is represented as a set of discrete
pixels, we have to find a discrete convolution kernel
that can approximate the second derivatives in the
definition of the Laplacian. Three commonly used small
kernels are shown in Figure 12.
Fig 12: Small Kernels of Discrete Convolution
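The kernels themselves are not reproduced in this transcription; purely as an illustration, one commonly used 3x3 approximation (our assumption, not necessarily identical to the kernels of Figure 12) can be applied as follows in Python:

import numpy as np
from scipy import ndimage

# A widely used 4-neighbour approximation of the Laplacian.
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def laplacian_response(image):
    # Zero crossings of this second-derivative response mark candidate edges.
    return ndimage.convolve(image.astype(float), LAPLACIAN, mode='reflect')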
Because these kernels are approximating a second
derivative measurement on the image, they are very
sensitive to noise. To counter this, the image is often
Gaussian Smoothed before applying the Laplacian
filter. This pre-processing step reduces the high
frequency noise components prior to the differentiation
step. The LoG (`Laplacian of Gaussian') kernel can be
pre-calculated in advance so only one convolution
needs to be performed at run-time on the image.
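A pre-computed kernel can be obtained by sampling the analytic LoG function, as in the following sketch (Python/NumPy assumed; the constant normalisation factor is omitted, so this is an illustration rather than the exact kernel of Figure 14):

import numpy as np

def log_kernel(sigma=1.4, size=9):
    # Sample the Laplacian-of-Gaussian on a size x size grid centred at the origin.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    kernel = (r2 - 2 * sigma**2) / sigma**4 * np.exp(-r2 / (2 * sigma**2))
    # Shift so the kernel sums to zero and flat regions give no response.
    return kernel - kernel.mean()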
Fig 13 : The 2-D Laplacian of Gaussian (LoG)
function
The x and y
axes are marked in standard deviations. A discrete
kernel that approximates this function (for a Gaussian σ
= 1.4) is shown in Figure 14
Fig 14: Approximate discrete kernel
Note that as the Gaussian is made increasingly narrow, the LoG kernel becomes the same as the simple Laplacian kernels shown in Figure 12. This is because smoothing with a very narrow Gaussian (σ < 0.5 pixels) on a discrete grid has no effect. Hence, on a discrete grid, the simple Laplacian can be seen as a limiting case of the LoG for narrow Gaussians. A comparison of the edge detection techniques is shown in figure 15.
Fig 15: Comparison of Edge Detection Techniques
(a) Original Image (b) Sobel (c) Prewitt (d) Robert (e) Laplacian (f) Laplacian of Gaussian
5. SVM CLASSIFICATION
The SVM gives us a simple way to obtain good classification results with a reduced knowledge of the problem. The principles of SVM have been developed by Vapnik and have been presented in several works. In the decision problem we have a number of vectors divided into two sets, and we must find the optimal decision frontier to divide the sets. This optimal choice will be the one that maximizes the distance from the frontier to the data. In the two-dimensional case the frontier will be a line; in a multidimensional space the frontier will be a hyperplane. The decision function that we are searching for has the form
f(x) = sign( Σ_{i=1..l} αi yi ⟨xi, x⟩ + b )
The y values that appear in this expression are +1 for the positive classification training vectors and −1 for the negative training vectors. Also, the inner product is performed between each training input xi and the vector x that must be classified. Thus, we need a set of training data (x, y) in order to find the classification function. The αi values are the Lagrange multipliers obtained in the minimization process, and l is the number of vectors that contribute in the training process to form the decision frontier. These vectors, those with an αi value not equal to zero, are known as support vectors.
When the data are not linearly separable this scheme cannot be used directly. To avoid this problem, the SVM can map the input data into a high-dimensional feature space. The SVM constructs an optimal hyperplane in the high-dimensional space and then returns to the original space, transforming this hyperplane into a non-linear decision frontier. The non-linear expression for the classification function is given below, where K is the kernel that performs the non-linear mapping:
f(x) = sign( Σ_{i=1..l} αi yi K(xi, x) + b )
The choice of this non-linear mapping function or kernel is very important for the performance of the SVM. One kernel used in our previous work is the radial basis function, which has the expression
K(xi, x) = exp( −γ ‖xi − x‖² )
The γ parameter in this expression must be chosen to reflect the degree of generalization that is applied to the data used. Also, when the input data are not normalized, this parameter performs a normalization task. When some data in the sets cannot be separated, the SVM can include a penalty term (C) in the minimization that makes the misclassification more or less important: the greater this parameter, the more important the misclassification error.
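As a concrete illustration of the roles of γ and C, the following sketch trains an RBF-kernel SVM with scikit-learn (our choice of toolkit for this illustration; the authors cite LIBSVM):

import numpy as np
from sklearn.svm import SVC

# Toy two-class training set: rows are feature vectors, y holds the +1 / -1 labels.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]])
y = np.array([-1, -1, 1, 1])

# gamma controls the generalisation of the RBF kernel; C weights misclassification.
clf = SVC(kernel='rbf', gamma=1.0, C=10.0)
clf.fit(X, y)

print(clf.support_vectors_)      # training vectors with non-zero Lagrange multipliers
print(clf.decision_function(X))  # signed distance to the decision frontier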
5.1 Edge detection using SVM
In this section we present a new way to detect edges by using SVM classification. The decision needed in this case is between "the pixel is part of an edge" and "the pixel is not part of an edge". In order to obtain this decision we must extract the information from the images, since the entire image is not useful as the input to the SVM. The solution is to form a vector from the pixels in a window around every pixel in the image. The window size may be changed to improve the detection. In the case of a 3x3 window, a nine-component vector is calculated at each pixel except for the border of the image. The formed vectors are used as inputs to the SVM in the training process and when it is applied to the images.
5.2 Detection Method
If we apply the trained SVM to an image, a value is obtained for each pixel in the image. This value is either positive or negative. We could use only the sign of these values to say whether a pixel is an edge or not, but in this way a loss of information is produced. It is better to use the values themselves and say that there is a gradual change between "no edge" and "edge". Thus, the value obtained indicates a measure of being an edge or not.
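A sketch of this window-based detection (Python/NumPy with any trained classifier exposing decision_function, such as the SVM above; the function name is ours) that keeps the gradual decision values rather than only their sign:

import numpy as np

def svm_edge_map(image, clf, window=3):
    # Build one feature vector per interior pixel from the grey values in its window.
    half = window // 2
    h, w = image.shape
    positions, vectors = [], []
    for i in range(half, h - half):
        for j in range(half, w - half):
            patch = image[i - half:i + half + 1, j - half:j + half + 1]
            vectors.append(patch.ravel())       # a nine-component vector for a 3x3 window
            positions.append((i, j))
    scores = clf.decision_function(np.asarray(vectors, dtype=float))
    edge_map = np.zeros((h, w))
    for (i, j), s in zip(positions, scores):
        edge_map[i, j] = s                      # gradual measure of "edgeness", not just a sign
    return edge_map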
6. ARTIFICIAL NEURAL NETWORK
A multi-layer feed forward artificial neural network
(ANN) model for edge detection is discussed. It is well
known that ANN can learn the input-output mapping of
a system through an iterative training and learning
process; thus ANN is an ideal candidate for pattern
recognition and data analysis.
The ANN model employed in this research has one
input layer, one output layer, and one hidden layer.
There are 9 neurons in the input layer; in other words,
the input of this network is a 9×1 vector which is
converted from a 3×3 mask. There are 10 hidden
neurons in the hidden layer; and one neuron in the
output layer which indicates where an edge is detected.
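The 9-10-1 topology can be sketched as follows; this uses scikit-learn's MLPClassifier purely for illustration (the paper reports a MATLAB implementation), and the training patterns below are random placeholders, not the 17 binary patterns used by the authors:

import numpy as np
from sklearn.neural_network import MLPClassifier

# Each input is a 3x3 mask flattened to a 9-element vector, normalised to [0, 1];
# the single output indicates whether an edge is detected at the mask centre.
X_train = np.random.rand(200, 9)                           # placeholder training patterns
y_train = (X_train.mean(axis=1) > 0.5).astype(int)         # placeholder edge labels

net = MLPClassifier(hidden_layer_sizes=(10,),              # one hidden layer of 10 neurons
                    activation='logistic',
                    max_iter=2000)
net.fit(X_train, y_train)
print(net.predict(X_train[:5]))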
Initial test results show that, though the neural network
is fully trained by the above 17 binary patterns, the
performance of the neural network detector is poor
when it is applied to test images. The reason is that all
the test images are gray-scale (i.e., the intensities of
images ranging from 0 to 255), not binary. Thus, we
normalize the gray-scale intensities so they are within
the range between 0 and 1. Furthermore, to improve the
generalization ability of neural network, fuzzy concepts
are introduced during the training phase so that more
training patterns can be employed by the neural
network. The membership functions are shown in Fig.
4. The grade of membership function, μ(x) , can be
defined as:
where ξ = 1 for "high intensity" and ξ = 0 for "low intensity". In this research, we choose σ = 0.25.
7. CONCLUSION
Since edge detection is the initial step in object recognition, it is important to know the differences between edge detection techniques. In this paper we studied the most commonly used edge detection techniques of Gradient-based and Laplacian-based edge detection. The software is developed using MATLAB 7.0.
This work presents a new edge detector that uses the SVM to perform the pixel classification between edge and no edge. This new detector reduces the execution time compared with our prior implementation by reducing the number of support vectors and by using a linear SVM. Also, the window size can be changed with a reduced time increment, and the results obtained with larger window sizes may be compared to those from the Canny algorithm, which is considered a standard of comparison.
8. ACKNOWLEDGMENTS
I express my sincere and heartfelt thanks to our chairman Thiru. K. Paramasivam B.Sc., and our Correspondent Thiru. P. Sathiyamoorthy B.E., MBA., MS., for giving this opportunity and providing all the facilities to carry out this paper. I express my sincere and heartfelt thanks to our respected principal Dr. A. S. Ramkumar M.E., Ph.D., MIE., for providing me this opportunity to carry out this paper.
I wish to express my sincere thanks to Mrs. A. Brinda M.E., Assistant Professor and Head of the Department of Computer Science and Engineering, for all the blessings and help rendered during the tenure of this paper.
I am indebted to my project guides Mr. R. Sureshkumar M.Tech., Assistant Professor, and Mr. V. Rajasekar M.E., Assistant Professor in Muthayammal College of Engineering, for their constant help and creative ideas over the period of the project work.
I express my sincere words of thankfulness to the members of my family, friends, and all staff members of the Department of Computer Science and Engineering for their support and blessings to complete this paper successfully.
9. REFERENCES
1. Jain, A.K., Fundamentals of Digital Image Processing, Prentice Hall, Englewood Cliffs, NJ, 1989.
2. Canny, J.F., A Computational Approach to Edge Detection, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 8, 1986, pp. 679-698.
3. Gómez-Moreno, H., Maldonado-Bascón, S., López-Ferreras, F., Edge Detection in Noisy Images by Using the Support Vector Machines, IWANN, Lecture Notes on Computer Science, Vol. 2084, Springer-Verlag, Heidelberg, 2001, pp. 685-692.
4. Vapnik, V., The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
5. Chih-Chung Chang and Chih-Jen Lin, LIBSVM: A Library for Support Vector Machines, 2001.
6. Gonzalez, R., Woods, R., Digital Image Processing, 3rd Edition, Prentice Hall, 2008.
7. Terry, P., Vu, D., "Edge Detection Using Neural Networks", Conference Record of the Twenty-Seventh Asilomar Conference on Signals, Systems and Computers, Nov. 1993, pp. 391-395.
8. Li, W., Wang, C., Wang, Q., Chen, G., "An Edge Detection Method Based on Optimized BP Neural Network", Proceedings of the International Symposium on Information Science and Engineering, Dec. 2008, pp. 40-44.
9. He, Z., Siyal, M., "Edge Detection with BP Neural Networks", Proceedings of the International Conference on Signal Processing, 1998, pp. 1382-1384.
10. Mehrara, H., Zahedinejad, M., Pourmohammad, A., "Novel Edge Detection Using BP Neural Network Based on Threshold Binarization", Proceedings of the Second International Conference on Computer and Electrical Engineering, Dec. 2009, pp. 408-412.
11. Haykin, S., Neural Networks: A Comprehensive Foundation, Prentice Hall, 1999.
12. Neural Network Toolbox User's Guide, The Mathworks.
13. http://en.wikipedia.org/wiki/Bladder_cancer
SECURE CLOUD ARCHITECTURE FOR HOSPITAL
INFORMATION SYSTEM
Menaka.C¹, R.S.Ponmagal²
¹Research Scholar, Bharathiyar University
²Professor, Dept. of Computer Science and Engineering, Dr.MGR Educational and Research Institute, Chennai, Tamil Nadu, India
¹[email protected]
²[email protected]
ABSTRACT
Using cloud storage, users can remotely store
their data and enjoy the on-demand high-quality
applications and services from a shared pool of
configurable computing resources, without the burden
of local data storage and maintenance. Hospital
Information system such as Telemedicine is an
important application which is recently gaining
momentum on the cloud. Telemedicine not only promises to dramatically reduce costs, but at the same time makes access to care easier for patients and makes more revenue attainable for practices. Despite the cloud's attractiveness, it has tremendous security concerns
including accessibility issues, user authentication,
confidentiality concerns, verification of data integrity,
risk identification and mitigation, as well as insider
threats from cloud provider staff. Precise identification
of the patient/clinician during authentication process is
a vital requirement for telemedicine cloud services as it
involves sensitive physiological data. This paper
proposes a secure cloud architecture which includes an
authentication system for telemedicine cloud using a set
of different unobtrusive physiological sensors (ECG)
and web camera that continuously authenticate the
identity of the user. This new type of authentication is
called dynamic authentication. We further extend our
result to enable the TPA to perform audits for multiple
patients simultaneously and efficiently. Extensive
security and performance analysis show the proposed
schemes are provably secure and highly efficient.
Keywords— Cloud computing, TPA, Telemedicine, authentication.
I. INTRODUCTION
Cloud Computing has been envisioned as the
next-generation architecture of IT enterprise, due to its
long list of unprecedented advantages in the IT history:
on-demand self-service, ubiquitous network access,
location independent resource pooling, rapid resource
elasticity, usage-based pricing. As a disruptive
technology with profound implications, Cloud Computing
is transforming the very nature of how businesses use
information technology. One fundamental aspect of this
paradigm shifting is that data is being centralized or
outsourced into the Cloud. From users’ perspective,
including both individuals and enterprises, storing data
remotely into the cloud in a flexible on-demand manner
brings appealing benefits: relief of the burden for storage
management, universal data access with independent
geographical locations, and avoidance of capital
expenditure on hardware, software, and personnel
maintenance, etc. Although the infrastructures under the
cloud are much more powerful and reliable than personal
computing devices, they are still facing the broad range of
both internal and external threats for data integrity. As
users no longer physically possess the storage of their
data, traditional cryptographic primitives for the purpose
of data security protection cannot be directly adopted.
Thus, how to efficiently verify the correctness of
outsourced cloud data without the local copy of data files
becomes a big challenge for data storage security in
Cloud Computing.
To avail the telemedicine services through
cloud, it is necessary that the identity of the user who may
be a patient or clinician need to be ensured throughout the
session. This can be done using a process called continuous authentication. With conventional static authentication, by contrast, authentication is performed only when first accessing a cloud service and remains valid throughout a full session, until the user logs off from that session. Hence, in this paper a continuous authentication scheme using ECG or keystroke dynamics along with facial recognition is used.
To fully ensure the data security and save the cloud
users’ computation resources, it is of critical importance
to enable public auditability for cloud data storage so that
the users may resort to a third party auditor (TPA), who
has expertise and capabilities that the users do not, to
audit the outsourced data when needed. Based on the
audit result, TPA could release an audit report, which
would not only help users to evaluate the risk of their
subscribed cloud data services, but also be beneficial for
the cloud service provider to improve their cloud based
service platform. Section II discusses the related issues, Section III details the proposed system architecture, Section IV explains the implementation, and Section V concludes the work.
II. RELATED WORK
Kallahalla et al. proposed [2] a cryptographic
storage system that enables secure file sharing on
untrusted servers, named Plutus. By dividing files into
filegroups and encrypting each filegroup with a unique
file-block key, the data owner can share the filegroups
with others through delivering the corresponding lockbox
key, where the lockbox key is used to encrypt the file-block keys. However, it brings about a heavy key
distribution overhead for large-scale file sharing.
Additionally, the file-block key needs to be updated and
distributed again for a user revocation.
Cloud storage enables users to remotely store
their data and enjoy the on-demand high quality cloud
applications without the burden of local hardware and
software management. Though the benefits are clear,
such a service is also relinquishing users’ physical
possession of their outsourced data, which inevitably
poses new security risks towards the correctness of the
data in cloud. In order to address this new problem and
further achieve a secure and dependable cloud storage
service, we propose in this paper a flexible distributed
storage integrity auditing mechanism, utilizing the
homomorphic token and distributed erasure-coded data.
The proposed design allows users to audit the cloud
storage with very lightweight communication and
computation cost. The auditing result not only ensures
strong cloud storage correctness guarantee, but also
simultaneously achieves fast data error localization, i.e.,
the identification of misbehaving server. Considering the
cloud data are dynamic in nature, the proposed design
further supports secure and efficient dynamic operations
on outsourced data, including block modification,
deletion, and append. Analysis shows the proposed
scheme is highly efficient and resilient against Byzantine
failure, malicious data modification attack, and even
server colluding attacks.
Considering TPA might [3] learn unauthorized
information through the auditing process, especially from
owners' unencrypted cloud data, new privacy-preserving
storage auditing solutions are further entailed in the cloud
to eliminate such new data privacy vulnerabilities.
Moreover, for practical service deployment, secure cloud
storage auditing should maintain the same level of data
correctness assurance even under the condition that data
is dynamically changing, and/or multiple auditing requests
are performed simultaneously for improved efficiency.
Techniques we are investigating/developing for these
research tasks include proof of storage, random-masking
sampling, sequence-enforced Merkle Hash Tree, and their
various extensions/novel combinations.
Guennoun, M et. al. [5] proposes a framework for
continuous authentication of the user based on the
electrocardiogram data collected from the user's heart
signal. The electrocardiogram (ECG) data is used as a
soft biometric to continuously authenticate the identity of
the user. Continuous User Authentication Using Multimodal Biometrics for Cloud Based Telemedicine Application [6] has been discussed with a two-phase algorithm.
To securely introduce an effective third party
auditor (TPA), the following two fundamental
requirements have to be met: 1) TPA should be able to
efficiently audit the cloud data storage without
demanding the local copy of data, and introduce no
additional on-line burden to the cloud user; 2) the third
party auditing process should bring in no new
vulnerabilities towards user data privacy. We utilize and
uniquely combine the public key based homomorphic
authenticator with random masking to achieve the
privacy-preserving public cloud data auditing system,
which meets all above requirements. Extensive security
and performance analysis shows the proposed schemes
are provably secure and highly efficient.
Another concern is that the computation overhead of
encryption linearly increases with the sharing scale.
Ateniese et al. [8] leveraged proxy re-encryption to secure distributed storage. Specifically, the data owner encrypts
blocks of content with unique and symmetric content
keys, which are further encrypted under a master public
key.
III. PROPOSED SYSTEM ARCHITECTURE
The cloud hospital information system, called the Telemedicine system, has a large amount of data to be stored in the cloud, and the user is not able to check the integrity of the data stored in the cloud storage.
• Patient Members/Clinician: cloud user has a large
amount of data files to be stored in the cloud
• Hospital Manager: cloud server which is managed by
the CSP and has significant data storage and computing
power.
• TPA: third party auditor has expertise and capabilities
that Patient and Manager don’t have. TPA is trusted to
assess the CSP’s storage security upon request from
Patient/Clinician.
The situation that has been envisaged is where a
user provides an identity and gives proof of his
identity[4], in order to get access to certain medical
services. To avail the telemedicine services through
cloud, it is necessary that the identity of the user who may
be a patient or a clinician needs to be ensured throughout the session. This can be done using a process called continuous authentication.
Fig 1. Secure Cloud Architecture
Each user has to compute revocation parameters to protect confidentiality from the revoked users in the dynamic broadcast encryption scheme, which results in both the computation overhead of the encryption and the size of the ciphertext increasing with the number of revoked users. To tackle this challenging issue, it is proposed that the manager compute the revocation parameters and make the result publicly available by migrating them into the cloud. Such a design can significantly reduce the computation overhead of users to encrypt files and the ciphertext size.
There are two phases in this proposed method: the registration and login processes. Continuous
Authentication (CA) systems represent a new
generation of security mechanisms that continuously
monitor the patient behavior/ physiological signal and
use this as basis to re-authenticate periodically throughout
a login session. Different technologies can be used to
develop a CA system. In this paper a face recognition
camera on a computer that can detect when a user has
changed is the first biometric and ECG/keystroke can be
used as the second biometric. These two can be combined
to provide a robust and efficient authentication for the
user.
3.1
Implementation is the stage of the project when the
theoretical design is turned out into a working system.
Thus it can be considered to be the most critical stage in
achieving a successful new system and in giving the user,
confidence that the new system will work and be
effective. The implementation stage involves careful planning, investigation of the existing system and its constraints on implementation, designing of methods to achieve changeover, and evaluation of changeover methods.
1. Periodic Sampling Batch Audit
The Batch TPA (or other applications) issues a
“Random Sampling” challenge to audit the integrity and
availability of outsourced data in terms of the verification
information stored in TPA.
2. Audit for Dynamic Operations:
An authorized application, which holds data
owner’s secret key (sk), can manipulate the outsourced
data and update the associated index hash table stored in
TPA. The privacy of (sk) and the checking algorithm
ensure that the storage server cannot cheat the authorized
applications and forge the valid audit records.
3. Third Party Auditor
In this module, the auditor views and verifies all user data. The auditor can view all user data directly without a key; the admin provides this permission to the auditor. After the data is audited, it is stored to the cloud.
3.2
Registration Phase:
Step 1: During registration the user has to render the face and ECG/Keystroke biometric; the features are extracted and stored into the biometric database of the server.
Login Phase:
Step 1: Acquire a frame containing the face using the web camera and imagegrab.
Step 2: Extract the facial features into a vector facial_features[] using MATLAB from the acquired frame.
Step 3: Acquire the ECG signal periodically through the ECG sensor and extract the RR interval RR, or extract key stroke characteristics from the keystroke biometrics.
Step 4: Eks(RR, Facial_feature[], Request)
Step 5: Repeat steps 1, 2, 3 and 4 periodically within a session.
Step 6: During authentication the extracted facial feature template is verified against the stored template, and the ECG/Keystroke features are compared with the previously acquired features in the biometric database.
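Purely as an illustration of how these steps could be orchestrated, the following Python sketch uses hypothetical stand-in functions (capture_frame, extract_facial_features, read_rr_interval, encrypt_with_session_key and verify_against_templates are placeholders invented here, not APIs from the paper or any toolkit):

import random
import time

# Hypothetical stand-ins for the web camera, the ECG sensor and the biometric back end.
def capture_frame():                  return [[0.0] * 8 for _ in range(8)]
def extract_facial_features(frame):   return [sum(row) for row in frame]
def read_rr_interval():               return 0.8 + 0.1 * random.random()
def encrypt_with_session_key(*parts): return ("Eks",) + parts
def verify_against_templates(msg):    return True     # placeholder template comparison

def continuous_authentication(rounds=3, period_seconds=0):
    # Repeat the login-phase steps periodically within a session (Steps 1-6 above).
    for _ in range(rounds):
        frame = capture_frame()                           # Step 1: acquire a face frame
        features = extract_facial_features(frame)         # Step 2: facial feature vector
        rr = read_rr_interval()                           # Step 3: ECG RR interval / keystrokes
        message = encrypt_with_session_key(rr, features)  # Step 4: Eks(RR, features, request)
        if not verify_against_templates(message):         # Step 6: check against stored templates
            return False                                   # identity no longer confirmed
        time.sleep(period_seconds)                        # Step 5: wait, then repeat
    return True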
Fig 2. Third Party Auditing System Architecture.
4. User Registration and Control:
In this module, the user registration process is done by the admin. Every user gives their personal details for the registration process. After registration, every user gets an ID for accessing the cloud space. If any user wants to edit their information, they submit the details to the admin, and the admin then edits and updates the information. This process is controlled by the admin.
5. Sharing Information:
In this module, every user shares their information and data in their own cloud space provided by the admin. That information may be sensitive or important data. To provide security for their information, every user stores the information in their specific cloud. Only registered users can store data in the cloud.
6. Proxy Re-Encryption:
Proxy re-encryption schemes are cryptosystems which allow third parties (proxies) to alter a ciphertext which has been encrypted for one user, so that it may be decrypted by another user. By using the proxy re-encryption technique, the encrypted data (ciphertext) in the cloud is altered again by the user. This provides a high level of security for the information stored in the cloud. Every user has a public key and a private key. The public key of every user is known to everyone, but the private key is known only to the particular user.
7. Integrity Checking:
Integrity checking is the process of comparing the stored encrypted information with the current ciphertext. If any change is detected, a message is sent to the user that the encryption process was not done properly. If no change is detected, the next process is allowed to proceed. Integrity checking is mainly used for anti-malware controls.
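As a minimal illustration of this idea (our own sketch using Python's hashlib, not the mechanism implemented in the paper), a digest recorded at upload time can be compared with a freshly computed one to detect any alteration of the stored ciphertext:

import hashlib

def digest(ciphertext: bytes) -> str:
    # Fingerprint of the stored ciphertext.
    return hashlib.sha256(ciphertext).hexdigest()

def integrity_ok(ciphertext: bytes, stored_digest: str) -> bool:
    # True if the ciphertext has not been altered since the digest was recorded.
    return digest(ciphertext) == stored_digest

original = b"encrypted patient record"
fingerprint = digest(original)                      # recorded when the data is stored
print(integrity_ok(original, fingerprint))          # True: proceed to the next process
print(integrity_ok(original + b"x", fingerprint))   # False: notify the user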
8. Data Forwarding:
In this module, the encrypted data or information stored in the cloud is forwarded to another user account by using that user's public key. If any user wants to share their information with friends or someone else, they can directly forward the encrypted data to them. The user can forward the information to another user without downloading the data.
IV. IMPLEMENTATION
A private cloud with a minimum number of systems was developed using LAN connectivity. The
cloud infrastructure is implemented successfully by using
the Ultidev, a cloud deployment tool. The cloud user,
auditor, cloud admin are using different systems. The file
is uploaded by the user from a system and the auditor is
able to check the integrity of the data from another
system.
We also extended the work by performing batch auditing, that is, performing multiple audits at the same time for different users. With computer networks spreading
into a variety of new environments, the need to
authenticate and secure communication grows. Many of
these new environments have particular requirements on
the applicable cryptographic primitives. For instance,
several applications require that communication overhead
be small and that many messages be processed at the
same time. In this paper we consider the suitability of
public key signatures in the latter scenario. That is, we
consider signatures that are 1) short and 2) where many
signatures from (possibly) different signers on (possibly)
different messages can be verified quickly. We propose
the first batch verifier for messages from many (certified)
signers without random oracles and with a verification
time where the dominant operation is independent of the
number of signatures to verify. We further propose a new
signature scheme with very short signatures, for which
batch verification for many signers is also highly
efficient. Prior work focused almost exclusively on
batching signatures from the same signer. Combining our
new signatures with the best known techniques for
batching certificates from the same authority, we get a
fast batch verifier for certificates and messages combined.
Although our new signature scheme has some
restrictions, it is the only solution, to our knowledge, that
is a candidate for some pervasive communication
applications.
We designed the work in ASP.NET, and the code for the front end is written in VB.NET. The reports are built in Crystal Reports within Microsoft Visual Studio 2008. VB.NET is very flexible and easy to understand for any application developer. Microsoft SQL Server is a Structured Query Language (SQL) based, client/server relational database; each of these terms describes a fundamental part of the architecture of SQL Server.
V. CONCLUSION
In this paper, we propose a secure cloud
architecture system which has a number of applications
including the health service for defense services wherein
the health condition of the soldier can be continuously
monitored with strong authentication. The defense
personnel need not carry their health record along with
them as the clinician or the defense personnel can access
the health record from the cloud.
Privacy-preserving public auditing system for
data storage security in Cloud Computing is also
proposed where TPA can perform the storage auditing
without demanding the local copy of data. Considering
TPA may concurrently handle multiple audit sessions
from different users for their outsourced data files, we
further extend our privacy-preserving public auditing
protocol into a multi-user setting, where TPA can
perform the multiple auditing tasks in a batch manner,
i.e., simultaneously. Extensive security and performance
analysis shows that the proposed schemes are provably
secure and highly efficient. We believe all these
advantages of the proposed schemes will shed light on
economies of scale for Cloud Computing.
REFERENCES
[1] P. Mell and T. Grance, "Draft NIST Working Definition of Cloud Computing," referenced on June 3rd, 2009, online at http://csrc.nist.gov/groups/SNS/cloud-computing/index.html, 2009.
[2] M. Kallahalla, E. Riedel, R. Swaminathan, Q. Wang, and K. Fu, "Plutus: Scalable Secure File Sharing on Untrusted Storage," Proc. USENIX Conf. File and Storage Technologies, pp. 29-42, 2003.
[3] Justin J Sam, V. Cyril Raj, "System Privacy Preserving Third Party Auditing for Ensuring Data Integrity in Cloud Computing," Proceedings of the National Conference NCICT 2014.
[4] Zheng Hua Ten, "Biometrics and the Cloud," CWICTiF Workshop on Cloud Communication and Applications, 2011, Copenhagen.
[5] M. Guennoun, N. Abbad, J. Talom, Sk. Md. M. Rahman, and K. El-Khatib, "Continuous Authentication by Electrocardiogram Data," 2009 IEEE Toronto International Conference Science and Technology for Humanity (TIC-STH 2009), ISBN: 978-1-4244-3877-8, 26-27 September, Toronto, ON, Canada, pp. 40-42, 2009.
[6] Rajeswari Mukesh, "Continuous User Authentication Using Multimodal Biometrics for Cloud Based Telemedicine Application," Proceedings of the National Conference NCICT 2014.
[7] M. A. Shah, R. Swaminathan, and M. Baker, "Privacy-Preserving Audit and Extraction of Digital Contents," Cryptology ePrint Archive, Report 2008/186, 2008.
[8] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song, "Provable Data Possession at Untrusted Stores," in Proc. of CCS'07, Alexandria, VA, October 2007, pp. 598-609.
IMPROVING SYSTEM PERFORMANCE THROUGH
GREEN COMPUTING
A. Maria Jesintha¹, G. Hemavathi²
¹M.Tech. Information Technology
²I year
¹,²Aarupadai Veedu Institute of Technology, Chennai
¹[email protected]
²[email protected]
ABSTRACT
Green computing, or green IT, refers to environmentally
sustainable computing or IT. It is the study and practice
of designing, manufacturing, using, and disposing of
computers, servers, and associated subsystems—such as
monitors, printers, storage devices, and networking and
communications systems—efficiently and effectively
with minimal or no impact on the environment. Green
IT strives to achieve economic viability and improved
system performance and use, while abiding by our
social and ethical responsibilities.
Thus, green IT includes the dimensions of
environmental sustainability, the economics of energy
efficiency, and the total cost of ownership, which
includes the cost of disposal and recycling. It is the
study and practice of using computing resources
efficiently.
This paper will take a look at several green initiatives
currently under way in the computer industry.
Keywords: Environment, carbon, electricity, power,
solar, VIA technology.
Introduction
Green computing researchers look at key issues and topics related to energy efficiency in computing and to promoting environmentally friendly computer technologies and systems. These include energy-efficient use of computers, the design of algorithms and systems for environmentally friendly computer technologies, and a wide range of related topics.
With increasing recognition that man-made greenhouse gas emissions are a major contributing factor to global warming, enterprises, governments, and society at large now have an important new agenda: tackling environmental issues and adopting environmentally sound practices. Greening our IT products, applications, services, and practices is both an economic and an environmental imperative, as well as our social responsibility.
Therefore, a growing number of IT vendors and
users are moving toward green IT and thereby assisting in
building a green society and economy.
A Brief History of Green Computing
One of the first manifestations of the green
computing movement was the launch of the Energy Star
program back in 1992. Energy Star served as a kind of
voluntary label awarded to computing products that
succeeded in minimizing use of energy while maximizing
efficiency.
Energy Star applied to products like computer
monitors, television sets and temperature control devices
like refrigerators, air conditioners, and similar items.
One of the first results of green computing was the Sleep
mode function of computer monitors which places a
consumer's electronic equipment on standby mode when
a pre-set period of time passes when user activity is not
detected. As the concept developed, green computing
began to encompass thin client solutions, energy cost
accounting, virtualization practices, e-Waste, etc.
Roads to Green Computing
To comprehensively and effectively address the
environmental impacts of computing/IT, we must adopt a
holistic approach and make the entire IT lifecycle greener
by addressing environmental sustainability along the
following four complementary paths:
Green use — reducing the energy consumption of
computers and other information systems as well as
using them in an environmentally sound manner
Green disposal — refurbishing and reusing old
computers and properly recycling unwanted computers
and other electronic equipment
Green design — designing energy-efficient and
environmentally sound components, computers,
servers, cooling equipment, and data centers
Green manufacturing — manufacturing electronic
components, computers, and other associated
subsystems with minimal impact on the environment.
Governments go Green
Many governments worldwide have initiated
energy-management programs, such as Energy Star, an
international standard for energy-efficient electronic
equipment that was created by the United States
Environmental Protection Agency in 1992 and has now
been adopted by several other countries. Energy Star
reduces the amount of energy consumed by a product by automatically switching it into "sleep" mode when not in use or reducing the amount of power used by a product when in "standby" mode. Surprisingly, standby "leaking," the electricity consumed by appliances when they are switched off, can represent as much as 12 percent of a typical household's electricity consumption.
In Australia, standby power is a primary factor
for the country’s increased greenhouse gas emissions —
more than 5 megatons (CO2 equivalent) annually.
Worldwide, standby power is estimated to account for as
much as 1 percent of global greenhouse emissions. Most of the energy used by products on standby does not result in any useful function. A small amount can be needed for
maintaining memory or an internal clock, remote-control
activation, or other features; but most standby power is
wasted energy. Energy Star–enabled products minimize
this waste.
Approaches to Green Computing
(a) Algorithmic efficiency
The efficiency of algorithms has an impact on
the amount of computer resources required for any given
computing function and there are many efficiency tradeoffs in writing programs. As computers have become
more numerous and the cost of hardware has declined
relative to the cost of energy, the energy efficiency and
environmental impact of computing systems and
programs has received increased attention. A study by
Alex Wissner-Gross, a physicist at Harvard, estimated
that the average Google search released 7 grams of
carbon dioxide (CO₂). However, Google disputes this figure, arguing instead that a typical search produces only 0.2 grams of CO₂.
(b) Power management
The Advanced Configuration and Power
Interface (ACPI), an open industry standard, allows an
operating system to directly control the power saving
aspects of its underlying hardware. This allows a system
to automatically turn off components such as monitors
and hard drives after set periods of inactivity. In addition,
a system may hibernate, where most components
(including the CPU and the system RAM) are turned off.
ACPI is a successor to an earlier Intel-Microsoft standard
called Advanced Power Management, which allows a
computer's BIOS to control power management
functions.
Some programs allow the user to manually
adjust the voltages supplied to the CPU, which reduces
both the amount of heat produced and electricity
consumed. This process is called undervolting.
Some CPUs can automatically undervolt the processor depending on the workload; this technology is called "SpeedStep" on Intel processors, "PowerNow!"/"Cool'n'Quiet" on AMD chips, LongHaul on VIA CPUs, and LongRun with Transmeta processors. Recently, software that monitors computer activity and puts computers into power-saving mode if they are idle has also become available; 1000 PCs and more can be administered very easily, resulting in an energy consumption reduction of 40-80%.
(c) Storage
Smaller form factor (e.g. 2.5 inch) hard disk drives often consume less power per gigabyte than physically larger drives. Unlike hard disk drives, solid-state drives store data in flash memory or DRAM. With no moving parts, power consumption may be reduced somewhat for low-capacity flash-based devices. Even at modest sizes, DRAM-based SSDs may use more power than hard disks (e.g., a 4GB i-RAM uses more power and space than laptop drives), and most flash-based drives are generally slower for writing than hard disks. In a recent case study, Fusion-io, manufacturer of the world's fastest solid-state storage devices, managed to reduce the carbon footprint and operating costs of MySpace data centers by 80% while increasing performance speeds beyond what was attainable by multiple hard disk drives in RAID 0. In response, MySpace was able to permanently retire several of their servers, including all heavy-load servers, further reducing their carbon footprint.
(d) Display
LCD monitors typically use a cold-cathode
fluorescent bulb to provide light for the display. Some
newer displays use an array of light-emitting diodes
(LEDs) in place of the fluorescent bulb, which reduces the amount of electricity used by the display.
(e) Operating system issues
Microsoft has been heavily criticized for producing operating systems that, out of the box, are not energy efficient. Due to Microsoft's dominance of the huge desktop operating system market, this may have resulted in more energy waste than any other initiative by other vendors. Microsoft claims to have improved this in Vista, though the claim is disputed. This problem has been compounded because Windows versions before Vista did not allow power management features to be configured centrally by a system administrator. This has meant that most organisations have been unable to improve this situation.
Again, Microsoft Windows Vista has improved
this by adding basic central power management
configuration. The basic support offered has been
unpopular with system administrators who want to
change policy to meet changing user requirements or
schedules. Several software products have been developed to fill this gap, including Auto Shutdown Manager, Data Synergy PowerMAN, Faronics Power Save, 1E NightWatchman, Verdiem Surveyor/Edison, Verismic Power Manager, WakeupOnStandBy (WOSB), TOff and Greentrac (which also promotes behavioral change), among others.
(f) Materials recycling
Computer systems that have outlived their
particular function can be repurposed, or donated to
various charities and non-profit organizations.
However, many charities have recently imposed
minimum system requirements for donated equipment.
Additionally, parts from outdated systems may be
salvaged and recycled through certain retail outlets and
municipal or private recycling centers.
Recycling computing equipment can keep
harmful materials such as lead, mercury, and hexavalent
chromium out of landfills, but often computers gathered
through recycling drives are shipped to developing
countries where environmental standards are less strict
than in North America and Europe. The Silicon Valley
Toxics Coalition estimates that 80% of the post-consumer
e-waste collected for recycling is shipped abroad to
countries such as China and Pakistan.
VIA Technologies Green Computing
VIA Technologies, a Taiwanese company that
manufactures motherboard chipsets, CPUs, and other
computer hardware, introduced its initiative for "green
computing" in 2001. With this green vision, the company
has been focusing on power efficiency throughout the
design and manufacturing process of its products. Its
environmentally friendly products are manufactured
using a range of clean-computing strategies, and the
company is striving to educate markets on the benefits of
green computing for the sake of the environment, as well
as productivity and overall user experience.
(a) Carbon-free computing
One of the VIA Technologies’ ideas is to
reduce the "carbon footprint" of users — the amount of
greenhouse gases produced, measured in units of carbon
dioxide (CO2). Greenhouse gases naturally blanket the
Earth and are responsible for its more or less stable
temperature. An increase in the concentration of the main
greenhouse gases — carbon dioxide, methane, nitrous
oxide, and fluorocarbons — is believed to be responsible
for Earth's increasing temperature, which could lead to
severe floods and droughts, rising sea levels, and other
environmental effects, affecting both life and the world's
economy. After the 1997 Kyoto Protocol for the United
Nations Framework Convention on Climate Change, the
world has finally taken the first step in reducing
emissions. The emissions are mainly a result of fossil-fuel-burning power plants. (In the United States, such electricity generation is responsible for 38 percent of the country's carbon dioxide emissions.) In addition, VIA
promotes the use of such alternative energy sources as
solar power, so power plants wouldn't need to burn as
much fossil fuels, reducing the amount of energy used.
Wetlands also provide a great service in sequestering
some of the carbon dioxide emitted into the atmosphere.
Although they make up only 4 to 6 percent of the Earth's
landmass, wetlands are capable of absorbing 20 to 25
percent of the atmospheric carbon dioxide. VIA is
working closely with organizations responsible for
preserving wetlands and other natural habitats, and others
who support extensive recycling programs for ICT
equipment. The amount paid to these organizations will
be represented by a proportion of the carbon-free
product’s price.
131 | Page
Carbon-emissions control has been a key issue
for many companies who have expressed a firm
commitment to sustainability. Dell is a good example of a
company with a green image, known for its free
worldwide product-recycling program. Dell’s Plant a
Tree for Me project allows customers to offset their
carbon emissions by paying an extra $2 to $4, depending
on the product purchased. AMD, a global microprocessor
manufacturer, is also working toward reducing energy
consumption in its products, cutting back on hazardous
waste and reducing its eco-impact. The company's use of silicon-on-insulator (SOI) technology in its manufacturing, and strained silicon capping films on transistors (known as "dual stress liner" technology), have contributed to reduced power consumption in its products.
(b) Solar Computing
Amid the international race toward alternative-energy sources, VIA is setting its eyes on the sun, and the
company's Solar Computing initiative is a significant part
of its green-computing projects.
Solar powered computing
For that purpose VIA partnered with Motech Industries, one of the largest producers of solar cells worldwide. Solar cells fit VIA's power-efficient silicon, platform, and system technologies and enable the company to develop fully solar-powered devices that are
require very little maintenance throughout their lifetime,
and once initial installation costs are covered, they
provide energy at virtually no cost. Worldwide
production of solar cells has increased rapidly over the
last few years; and as more governments begin to
recognize the benefits of solar power, and the
development of photovoltaic technologies goes on, costs
are expected to continue to decline.
As part of VIA's "pc-1" initiative, the company
established the first-ever solar-powered cyber community
center in the South Pacific, powered entirely by solar
technology.
(c) Energy-efficient computing
A central goal of VIA’s green-computing
initiative is the development of energy-efficient platforms
for low-power, small-form-factor (SFF) computing
devices. In 2005, the company introduced the VIA C7-M
and VIA C7 processors that have a maximum power
consumption of 20W at 2.0GHz and an average power
consumption of 1W. These energy-efficient processors
produce over four times less carbon during their operation
and can be efficiently embedded in solar-powered
devices.
VIA isn’t the only company to address
environmental concerns: Intel, the world's largest
semiconductor maker, revealed eco-friendly products at a
recent conference in London. The company uses
virtualization software, a technique that enables Intel to
combine several physical systems into a virtual machine
that runs on a single, powerful base system, thus
significantly reducing power consumption. Earlier this
year, Intel joined Google, Microsoft, and other companies
in the launch of the Climate Savers Computing Initiative
that commits businesses to meet the Environmental
Protection Agency's Energy Star guidelines for energy-efficient devices.
On the horizon
Green technology is gaining more and more
public attention through the work of environmental
organizations and government initiatives. VIA is one of
the first corporations to concentrate on green computing
that seems less a passing trend than a first step toward
significant changes in technology. In May 2007, IBM
unveiled its Project Big Green, dedicated to increasing
energy efficiency across the company's branches around
the world. Experts say that businesses will continue to
invest in clean computing, not only because of future
regulations, policies, and social demands to reduce their
carbon footprint, but also due to the significant long-term
savings it can make.
Several companies have already dived headfirst into the green-computing business. Located in Silicon Valley and founded in 2006, Zonbu was the first company to introduce a completely environmentally responsible computer. Their "Zonbox" computer is a carbon-emission-neutral computer, thanks to a low-power design and regulatory-grade carbon offsets. The device, which complies with both Energy Star standards and the Restriction of Hazardous Substances Directive (RoHS), consumes only 15W, compared to the 175W consumed by a typical desktop PC. Zonbu also provides a free take-back program to minimize environmental e-waste.
Conclusion
Green computing is thus a mindset that asks how we can satisfy the growing demand for network computing without putting such pressure on the environment. There is an alternative way to design a processor and a system such that we do not increase demands on the environment, but still provide an increased amount of processing capability to customers to satisfy their business needs. Green computing is not about going out and designing biodegradable packaging for products. The time has now come to think about the efficient use of computers and of resources which are non-renewable. It also opens a new window for new entrepreneurs to harvest e-waste material and scrap computers.
FINDING PROBABILISTIC PREVALENT COLOCATIONS IN
SPATIALLY UNCERTAIN DATA MINING IN AGRICULTURE
USING FUZZY LOGICS
Ms.Latha.R¹, Gunasekaran E²
¹Assistant Professor
²Student
¹,²Department of Computer Science & Engg., Aarupadai Veedu Institute of Technology, Vinayaka Missions University
Abstract: A spatial collocation pattern is a group of
spatial features whose instances are frequently
located together in geographic space. Discovering
collocations has many useful applications. For
example, collocated plant species discovered from plant distribution datasets can contribute to the analysis of plant geography, phytosociology studies, and plant protection recommendations. In this paper, we study the collocation mining problem in the context of uncertain data, as the data generated from a wide range of data sources are inherently uncertain. One straightforward method to mine the prevalent collocations in a spatially uncertain data set is to simply compute the expected participation index of a candidate and decide if it exceeds a minimum prevalence threshold. Although this definition has been widely adopted, it misses important information about the confidence which can be associated with the participation index of a colocation. We propose another definition, probabilistic prevalent colocations, trying to find all the collocations that are likely to be prevalent in a randomly generated possible world. Finding probabilistic prevalent colocations (PPCs) turns out to be difficult. First, we propose pruning strategies for candidates to reduce the amount of computation of the probabilistic participation index values. Next, we design an improved dynamic programming algorithm for identifying candidates. This algorithm is suitable for parallel computation and approximate computation. Finally, the effectiveness and efficiency of the proposed methods, as well as the pruning strategies and the optimization techniques, are verified by extensive experiments with "real + synthetic" spatially uncertain data sets.
I. INTRODUCTION
Co-location patterns represent subsets of
Boolean spatial features whose instances are often located
in close geographic proximity. Figure 1 shows a dataset
consisting of instances of several Boolean spatial
features, each represented by a distinct shape. A careful
review reveals two co-location patterns. Real-world
examples of co-location patterns include symbiotic
species, e.g., the Nile crocodile and Egyptian Plover in
ecology. Boolean spatial features describe the presence or
absence of geographic object types at different locations
in a two dimensional or three dimensional metric space,
such as the surface of the Earth. Examples of Boolean
spatial features include plant species, animal species, road
types, cancers, crime, and business types. Advanced
spatial data collecting systems, such as NASA Earth’s
Observing System (EOS) and Global Positioning System
(GPS), have been accumulating increasingly large spatial
data sets. For instance, since 1999, more than a terabyte
of data has been produced by EOS every day. These
spatial data sets with explosive growth rate are considered
nuggets of valuable information.
The automatic discovery of interesting,
potentially useful, and previously unknown patterns from
large spatial datasets is being widely investigated via
various spatial data mining techniques. Classical spatial
pattern mining methods include spatial clustering, spatial characterization, spatial outlier detection, spatial prediction, and spatial boundary shape matching. Mining spatial co-location patterns is an important spatial data mining task. A spatial co-location pattern is a set of spatial features that are frequently located together in spatial proximity. To illustrate the idea of spatial co-location patterns, let us consider a sample spatial data set, as shown in Fig. 1. In the figure, there are various spatial instances with different spatial features that are denoted by different symbols. As can be seen, spatial features +
and × tend to be located together because their instances
are frequently located in spatial proximity. The problem
of mining spatial co-location patterns can be related to
various application domains. For example, in location-based services, different services are requested by service subscribers from their mobile PDAs equipped with locating devices such as GPS. Some types of services may be requested in a proximate geographic area, such as finding the nearest Italian restaurant and the nearest parking place. Location-based service providers are very interested in finding which services are requested frequently together and located in spatial proximity. This information can help them improve the effectiveness of their location-based recommendation systems, where a user who requested a service in one location will be recommended a service in a nearby location.
Knowing co-location patterns in location based services
may also enable the use of pre-fetching to speed up
service delivery. In ecology, scientists are interested in
finding frequent co-occurrences among spatial features,
such as drought, El Niño, substantial increases or drops in vegetation, and extremely high precipitation. The previous studies on co-location pattern mining emphasize frequent co-occurrences of all the features involved. This marks off some valuable patterns involving rare spatial features. We say a spatial feature is rare if its instances are substantially fewer than those of the other features in a co-location. This definition of "rareness" is relative with respect to the other features in a co-location. A feature could be rare in one co-location but not rare in another. For example, if the spatial feature A has 10 instances, the spatial feature B has 20 instances, and the spatial feature C.
II. SYSTEM ANALYSIS
System analysis is a combined process of dissecting the system responsibilities based on the problem domain characteristics and the user requirements.
Existing System: A spatial colocation pattern is
a group of spatial features whose instances are
frequently located together in geographic space.
Discovering colocations has many useful applications.
For example, colocated plant species discovered from
plant distribution data sets can contribute to the analysis
of plant geography, phytosociology studies, and plant
protection recommendations.
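As a minimal illustration of the expected-participation-index test mentioned in the abstract, the following Python sketch (not taken from the paper; the data structures feature_instances and row_instances are hypothetical) computes the participation index of a candidate co-location and compares it with a prevalence threshold:

# Illustrative sketch, not the authors' implementation: participation index
# of a candidate co-location versus a minimum prevalence threshold.
from collections import defaultdict

def participation_index(candidate, feature_instances, row_instances):
    """candidate: tuple of features, e.g. ('A', 'B')
    feature_instances: {feature: set of instance ids of that feature}
    row_instances: list of dicts {feature: instance id}, each one a
                   neighbourhood containing every feature of the candidate."""
    participating = defaultdict(set)
    for row in row_instances:
        for feature in candidate:
            participating[feature].add(row[feature])
    # participation ratio of a feature = fraction of its instances that
    # appear in at least one row instance of the candidate
    ratios = [len(participating[f]) / len(feature_instances[f]) for f in candidate]
    # participation index = minimum participation ratio over the candidate
    return min(ratios)

# toy usage
feature_instances = {'A': {'a1', 'a2', 'a3'}, 'B': {'b1', 'b2'}}
row_instances = [{'A': 'a1', 'B': 'b1'}, {'A': 'a2', 'B': 'b1'}]
pi = participation_index(('A', 'B'), feature_instances, row_instances)
print(pi >= 0.5)   # prevalent if the index reaches the chosen threshold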
Proposed System: New techniques for feature selection and classification have been proposed for improving cultivation production. New feature selection algorithms have been proposed and implemented for selecting the optimal number of features to improve the classification accuracy. An Intelligent Conditional Probabilistic based Feature Selection Algorithm has been proposed in this work for selecting the optimal number of features to detect the intruders. This conditional probabilistic method reduces redundancy in feature selection and therefore reduces the computation time required to identify the relevant features. The effectiveness and efficiency of the proposed methods, as well as the pruning strategies and the optimization techniques, are to be verified by extensive experiments with real + synthetic spatially uncertain data mining in agriculture using fuzzy rules. The proposed system considers temporal constraints for effective classification.
III. SYSTEM DESCRIPTION
.NET introduced a unified programming
environment. All .NET-enabled languages compile to
"Microsoft Intermediate Language" before being
assembled into platform-specific machine code. Visual
Basic and C# are language wrappers around this common
.NET "language." Because all .NET-enabled compilers
speak the same underlying language, they no longer
suffer from the many data and language conflicts inherent
in other component-based systems such as COM. The
.NET version of Visual Studio also unified the standard
user interface that lets programmers craft source code.
.NET committed developers to object-oriented
technologies. Not only does .NET fully embrace the
object-oriented programming paradigm, everything in
.NET is contained in an object: all data values, all source
code blocks, and the plumbing for all user-initiated
events. Everything appears in the context of an object.
.NET simplified Windows programming. Programming
in Visual Basic before .NET was easy enough, until it
came time to interact with one of the API libraries,
something that happened a lot in professional
programming.
With .NET, most of these APIs are replaced with
a hierarchy of objects providing access to many
commonly needed Windows features. Because the
hierarchy is extensible, other vendors can add new
functionality without disrupting the existing framework.
.NET enhanced security. Users and administrators can
now establish security rules for different .NET features to
limit malicious programs from doing their damage.
.NET's "managed" environment also resolved buffer
overrun issues and memory leaks through features such as
strong data typing and garbage collection. .NET enhanced
developer productivity through standards. The .NET
Framework is built upon and uses many new and existing
standards, such as XML and SOAP. This enhances data
interchange not only on the Windows platform, but also
in interactions with other platforms and systems.
.NET enhanced Web-based development. Until
.NET, a lot of Web-based development was done using
scripting languages. .NET brings the power of compiled,
desktop development to the Internet. .NET simplified the
deployment of applications. If .NET is installed on a
system, releasing a program is as simple as copying its
EXE file to the target system (although an install program
is much more user-friendly). Features such as side-by-side deployment, ClickOnce deployment (new in 2005),
and an end to file version conflicts and "DLL hell" (the
presence of multiple versions of the same DLL on a
system, or the inability to remove a version of a DLL)
make desktop and Web-based deployments a snap.
IV. SYSTEM DESIGN
System design involves identification of classes, their relationships, and their collaboration. In Objectory, classes are divided into entity classes and control classes. The Computer Aided Software Engineering (CASE) tools that are available commercially do not provide any assistance in this transition. CASE tools take advantage of meta-modeling, which is helpful only after the construction of the class diagram. In the FUSION method, object-oriented approaches such as the Object Modeling Technique (OMT) and Class-Responsibility-Collaborator (CRC) cards are used. Objectory used the term "agents" to represent some of the hardware and software systems. In the Fusion method there is no requirements phase, where a user would supply the initial requirements document. Any software project is worked out by both the analyst and the designer. The analyst creates the use case diagram, and the designer creates the class diagram; the designer can do this only after the analyst creates the use case diagram.
Once the design is over, it is essential to decide
which software is suitable for the application.
UML Diagram of the Project: UML is a standard language for specifying, visualizing, and documenting software systems, created by the Object Management Group (OMG) in 1997. There are three important types of UML models: the structural model, the behavioral model, and the architecture model. To model a system, the most important aspect is to capture the dynamic behavior, which involves internal or external factors making the interaction. These internal or external agents are known as actors. A use case diagram consists of actors, use cases, and their relationships; the use case diagram for our project is shown in the figure.
Use Case Diagram: A use case is a set of scenarios describing an interaction between a user and a system. A use case diagram displays the relationship among actors and use cases; an actor is a user or another system that interacts with the system being modeled. A use case is an external view of the system that represents some action the user might perform in order to complete a task.
Activity Diagram: Activity diagrams are typically used for business process modeling, for modeling the logic captured by a single use case or usage scenario, or for modeling the detailed logic of a business rule. Although UML activity diagrams could potentially model the internal logic of a complex operation, it is usually better to rewrite the operation so that it is simple enough not to require an activity diagram. In many ways UML activity diagrams are the object-oriented equivalent of flow charts and data flow diagrams (DFDs) from structured development.
Sequence Diagram: A sequence diagram, in the context
of UML, represents object collaboration and is used to
define event sequences between objects for a certain
outcome. A sequence diagram is an essential component
used in processes related to analysis, design and
documentation.
Collaboration Diagram:A collaboration diagram
describes interactions among objects in terms of
sequenced messages. Collaboration diagrams represent a
combination of information taken from class, sequence,
and use case diagrams describing both the static structure
and dynamic behavior of a system.
Class Diagram: A class diagram provides an
overview of a system by showing its classes and the
relationships among them. Class diagrams are static: they
display what interacts but not what happens during the
interaction. UML class notation is a rectangle divided
into three parts: class name, fields, and methods. Names
of abstract classes and interfaces are in italics.
Relationships between classes are the connecting links.
In Modeling, the rectangle is further divided with
separate partitions for properties and inner classes.
Data Flow Diagram: The data flow diagram is a graphic tool used for expressing system requirements in a graphical form. The DFD, also known as the "bubble chart", has the purpose of clarifying system requirements and identifying the major transformations that become programs in system design. Thus the DFD can be regarded as the starting point of the design phase that functionally decomposes the requirements specification down to the lowest level of detail. The DFD consists of a series of bubbles joined by lines; the bubbles represent data transformations and the lines represent data flows in the system. A DFD describes what data flows through the system rather than how it is processed, so it does not depend on hardware, software, data structures, or file organization.
Architectural Diagram:
V. IMPLEMENTATION
Implementation is the stage of the project when the theoretical design is turned into a working system.
Modules Description:
Spatial Information Input Module
Co-Location Mining Identification
Co-Location Prediction Module
Filtering and Input Classifier Validation Module
Top-K Preferable Fuzzy Temporal List Generation
Performance Evaluation - Co-Location Validations
Spatial Information Input Module:
Basically, spatial means pertaining to, involving, or having the nature of space. Spatial data is any data with a direct or indirect reference to a specific location or geographical area.
Spatial data is the modern means of digitally mapping features on the earth. Spatial data is used in geographic information systems (GIS), which merge cartography, statistical analysis, and database techniques.
In this module, the spatial information, in the form of latitude and longitude, is gathered from the user as an input.
Co-Location Mining Identification:
Co-locations are the neighborhood locations of the particular location under examination.
In this module, the neighborhood locations are mined based on the spatial location's latitude and longitude.
The mining of data takes place in terms of a distance relation expressed in miles and kilometres; these are the units used to calculate the distance, as sketched below.
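The paper does not give the distance computation itself, so the following Python sketch assumes a standard haversine great-circle formula for deciding whether two latitude/longitude points are neighbours within a radius given in miles or kilometres (all names and sample coordinates are illustrative):

# Minimal sketch (assumed approach, not from the paper): great-circle
# distance between two latitude/longitude points via the haversine formula.
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0
KM_PER_MILE = 1.609344

def haversine_km(lat1, lon1, lat2, lon2):
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def are_neighbours(p, q, radius_miles=5.0):
    # p and q are (latitude, longitude) tuples
    return haversine_km(*p, *q) <= radius_miles * KM_PER_MILE

# two illustrative points, roughly 14 km apart
print(are_neighbours((13.0827, 80.2707), (12.9941, 80.1709), radius_miles=10))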
Co-Location Prediction Module:
A prediction or forecast is a statement about the way things will happen in the future, often but not always based on experience or knowledge.
According to the neighborhood location identification, the predictions about the location are formulated.
In this module, the co-location is predicted together with its best and intermediate resource availability.
Filtering and Input Classifier Validation Module:
Filtering is the process used here to segregate the co-locations, based on the available resources, from the co-location prediction.
Land resources can be taken to mean the resources available from the land: the agricultural land, which contains natural fertiliser for the growth of the products sown; the underground water; and the various minerals like coal, bauxite, gold, and other raw materials.
In this module, the filtration is carried out based on the type of input that defines the resources required in the particular location, thereby validating the input.
Top-K Preferable Fuzzy Temporal List Generation:
Top-k queries are widely applied for retrieving a ranked set of the k most interesting objects based on the preferences. Here, preference means selecting an item or a piece of land for decision or consideration.
In this module, a selection technique is used to define a list of co-locations with the available resources, thereby defining the degree of resource availability; a small sketch follows.
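As a rough sketch only (the fuzzy membership function and scores below are assumptions, not the paper's rules), a top-k preferable list could be generated as follows:

# Hedged sketch: rank candidate locations by a simple fuzzy
# "resource availability" grade and return the top-k preferable ones.
import heapq

def fuzzy_availability(fraction):
    """Map a resource-availability fraction in [0, 1] to a fuzzy grade."""
    if fraction >= 0.75:
        return 1.0                              # high availability
    if fraction >= 0.40:
        return (fraction - 0.40) / 0.35         # medium, graded linearly
    return 0.0                                  # low availability

def top_k_locations(candidates, k=3):
    """candidates: list of (location_name, availability_fraction) pairs."""
    scored = [(fuzzy_availability(f), name) for name, f in candidates]
    return heapq.nlargest(k, scored)

print(top_k_locations([("plot-1", 0.9), ("plot-2", 0.5), ("plot-3", 0.2)], k=2))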
Performance Evaluation - Co-Location Validations:
Performance is characterized by the amount of useful work accomplished by an application compared to the time and resources used.
In this module, the validation of the co-location with its available resources is checked for accuracy, and the validation of the co-locations by the system is visualized graphically.
VI. SYSTEM TESTING
Testing Objectives
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, subassemblies, assemblies, and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of test, and each test type addresses a specific testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application; it is done after the completion of an individual unit and before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform basic tests at component level and test a specific business process, application, and/or system configuration. Unit tests ensure that each unique path of a business process performs accurately to the documented specifications and contains clearly defined inputs and expected results.
Integration testing
Integration tests are designed to test integrated software components to determine whether they actually run as one program. Testing is event driven and is more concerned with the basic outcome of screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.
Functional test
Functional tests provide systematic demonstrations that the functions tested are available as specified by the business and technical requirements, system documentation, and user manuals.
Functional testing is centered on the following items:
Valid Input: identified classes of valid input must be accepted.
Invalid Input: identified classes of invalid input must be rejected.
Functions: identified functions must be exercised.
Output: identified classes of application outputs must be exercised.
Systems/Procedures: interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on requirements, key functions, or special test cases. In addition, systematic coverage of identified business process flows, data fields, predefined processes, and successive processes must be considered for testing. Before functional testing is complete, additional tests are identified and the effective value of the current tests is determined.
Test objectives
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must
not be delayed.
Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.
SYSTEM TEST
System testing ensures that the entire integrated
software system meets requirements. It tests a
configuration to ensure known and predictable results. An
example of system testing is the configuration oriented
system integration test. System testing is based on
process descriptions and flows, emphasizing pre-driven
process links and integration points.
White Box Testing
White box testing is testing in which the software tester has knowledge of the inner workings, structure, and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black-box level.
Black Box Testing
Black box testing is testing the software without any knowledge of the inner workings, structure, or language of the module being tested. Black box tests, like most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document. It is testing in which the software under test is treated as a black box: you cannot "see" into it. The test provides inputs and responds to outputs without considering how the software works.
Unit Testing:
Unit testing is usually conducted as part of a
combined code and unit test phase of the software
lifecycle, although it is not uncommon for coding and
unit testing to be conducted as two distinct phases.
Test strategy and approach
Field testing will be performed manually and
functional tests will be written in detail.
Test objectives
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must not
be delayed.
Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.
Integration Testing
Software integration testing is the incremental
integration testing of two or more integrated software
components on a single platform to produce failures
caused by interface defects.
The task of the integration test is to check that
components or software applications, e.g. components in
a software system or – one step up – software
applications at the company level – interact without error.
Test Results: All the test cases mentioned above passed
successfully. No defects encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of
any project and requires significant participation by the
end user. It also ensures that the system meets the
functional requirements.
Test Results: All the test cases mentioned above passed
successfully. No defects encountered.
VII. CONCLUSION
This paper studies the problem of extracting co-locations from spatially uncertain data with probability intervals. It has defined the possible world model with probability intervals and proved that the probability intervals of all possible worlds are feasible. It has also defined the interrelated concepts of probabilistic prevalent co-locations, and it has proved the closure properties of the prevalence point probability.
QUALITATIVE BEHAVIOR OF A SECOND ORDER DELAY
DYNAMIC EQUATIONS
Dr. P. Mohankumar1, A.K. Bhuvaneswari2
1Professor of Mathematics, 2Asst. Professor of Mathematics
Aarupadai Veedu Institute of Technology, Vinayaka Missions University,
Paiyanoor, Kancheepuram Dist - 603104, Tamilnadu, India
Abstract: In this paper we study the qualitative behavior of the delay dynamic equation of the form
\[
\left(\frac{1}{r(t)}\, y^{\Delta}(t)\right)^{\Delta} + p(t)\, y(\tau(t)) = 0, \qquad t \in \mathbb{T} \qquad (1)
\]
using the Riccati substitution method.
1. INTRODUCTION
The theory of time scales, which has recently received a lot of attention, was introduced by Hilger in his Ph.D. thesis in 1988 in order to combine continuous and discrete analysis. A time scale T is an arbitrary closed subset of the reals, and the cases when this time scale is equal to the reals or to the integers represent the usual theories of differential and of difference equations. Many other remarkable time scales exist, and they give rise to plenty of applications, among them the study of population dynamic models which are discrete in season and may follow a difference scheme with variable step size, or which are modeled by continuous dynamic systems that die out, say in winter, while their eggs are incubating or hidden, and then, in season again, hatching gives rise to a non-overlapping population. The theory not only unifies the theories of differential equations and difference equations but also extends these classical cases to cases "in between", for example to the so-called q-difference equations, which have important applications in quantum theory, and it can be applied on various types of time scales like T = hN, T = N^2, and T = T_n, the space of the harmonic numbers.
Consider the second order delay dynamic equation (1),
\[
\left(\frac{1}{r(t)}\, y^{\Delta}(t)\right)^{\Delta} + p(t)\, y(\tau(t)) = 0,
\]
where p(t) and r(t) are positive right-dense continuous functions defined on T, and \(\tau : \mathbb{T} \to \mathbb{T}\) satisfies \(\tau(t) \le t\) for every \(t \in \mathbb{T}\) and \(\lim_{t \to \infty} \tau(t) = \infty\).

A solution y(t) of equation (1) is said to be oscillatory if it is neither eventually positive nor eventually negative, that is, if for every b >= a there exists t >= b such that y(t) = 0 or y(t) y(σ(t)) < 0; otherwise it is called nonoscillatory. Since we are interested in the qualitative behavior of solutions, we will suppose that the time scale T under consideration is not bounded above and is therefore of the form \([t_0, \infty)_{\mathbb{T}} = [t_0, \infty) \cap \mathbb{T}\).

Note that if T = N we have \(\sigma(n) = n + 1\), \(\mu(n) = 1\), \(y^{\Delta}(n) = \Delta y(n)\), and then (1) becomes
\[
\Delta\!\left(\frac{1}{r(n)}\, \Delta y(n)\right) + p(n)\, y(\tau(n)) = 0, \qquad n \in \mathbb{N}.
\]
If T = R we have \(\sigma(t) = t\), \(\mu(t) = 0\), \(f^{\Delta}(t) = f'(t)\), and then equation (1) becomes
\[
\left(\frac{1}{r(t)}\, y'(t)\right)' + p(t)\, y(\tau(t)) = 0.
\]
If T = hN, h > 0, we have \(\sigma(t) = t + h\), \(\mu(t) = h\),
\[
y^{\Delta}(t) = \Delta_h y(t) = \frac{y(t + h) - y(t)}{h},
\]
and then equation (1) becomes
\[
\Delta_h\!\left(\frac{1}{r(t)}\, \Delta_h y(t)\right) + p(t)\, y(\tau(t)) = 0.
\]
MAIN RESULT
Theorem 1. Assume that
(i) \(M(t) = \int_{t_0}^{t} r(s)\,\Delta s \to \infty\) as \(t \to \infty\);
(ii) \(\tau^{\Delta}(t) \ge 0\);
(iii) \(\displaystyle\int^{\infty} \left[ M(\tau(t))\, p(t) - \frac{r(\tau(t))\, \tau^{\Delta}(t)}{4\, M(\tau(t))} \right] \Delta t = \infty\).
Then (1) is oscillatory.

Proof. Let y(t) be a nonoscillatory solution of (1). Without loss of generality we may assume that y(t) > 0 and y(τ(t)) > 0 for t >= t_1. From (1) we have
\[
\left(\frac{1}{r(t)}\, y^{\Delta}(t)\right)^{\Delta} \le 0,
\]
so \(\frac{y^{\Delta}(t)}{r(t)}\) is decreasing and \(\frac{y^{\Delta}(t)}{r(t)} > 0\). Now make the Riccati substitution
\[
V(t) = \frac{M(\tau(t))}{y(\tau(t))} \left(\frac{y^{\Delta}(t)}{r(t)}\right).
\]
Then V(t) > 0. Differentiating,
\[
V^{\Delta}(t)
= \left(\frac{M(\tau(t))}{y(\tau(t))}\right)^{\Delta} \left(\frac{y^{\Delta}(t)}{r(t)}\right)
+ \left(\frac{M(\tau(t))}{y(\tau(t))}\right) \left(\frac{y^{\Delta}(t)}{r(t)}\right)^{\Delta},
\]
and since \(\frac{y^{\Delta}(t)}{r(t)}\) is decreasing,
\[
\frac{y^{\Delta}(\tau(t))}{r(\tau(t))} \ge \frac{y^{\Delta}(t)}{r(t)},
\qquad
\left(\frac{M(\tau(t))}{y(\tau(t))}\right)^{\Delta}
= \frac{y(\tau(t))\, r(\tau(t))\, \tau^{\Delta}(t) - M(\tau(t))\, y^{\Delta}(\tau(t))\, \tau^{\Delta}(t)}{y(\tau(t))\, y(\tau(\sigma(t)))}.
\]
Using these together with (1) we obtain
\[
V^{\Delta}(t) \le -M(\tau(t))\, p(t)
+ \frac{r(\tau(t))\, \tau^{\Delta}(t)}{M(\tau(t))}\, V(t)
- \frac{r(\tau(t))\, \tau^{\Delta}(t)}{M^{2}(\tau(t))}\, V^{2}(t).
\]
Maximizing the polynomial in V(t) on the right-hand side gives
\[
V^{\Delta}(t) \le -M(\tau(t))\, p(t) + \frac{r(\tau(t))\, \tau^{\Delta}(t)}{4\, M(\tau(t))}.
\]
Integrating from t_1 to t we get
\[
V(t) \le V(t_1) - \int_{t_1}^{t} \left[ M(\tau(s))\, p(s) - \frac{r(\tau(s))\, \tau^{\Delta}(s)}{4\, M(\tau(s))} \right] \Delta s.
\]
Letting \(t \to \infty\), V(t) tends to \(-\infty\), which contradicts V(t) > 0, and hence the result.

Corollary. Assume that (i) and (ii) are satisfied and
\[
\liminf_{t \to \infty} \frac{M^{2}(t)\, p(t)}{r(t)} > \frac{1}{4}.
\]
Then the equation
\[
\left(\frac{1}{r(t)}\, y^{\Delta}(t)\right)^{\Delta} + p(t)\, y(t) = 0
\]
is oscillatory.

Proof. Put \(\tau(t) = t\) in
\[
\int^{\infty} \left[ M(\tau(t))\, p(t) - \frac{r(\tau(t))\, \tau^{\Delta}(t)}{4\, M(\tau(t))} \right] \Delta t = \infty .
\]
We get the result.

Example. Consider the dynamic equation of the form
\[
\left(\frac{1}{t}\, y^{\Delta}(t)\right)^{\Delta} + \frac{1}{t^{2}}\, y(t - 1) = 0, \qquad t \in [1, \infty)_{\mathbb{T}}. \qquad (2)
\]
Hence every solution of equation (2) is oscillatory.
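As an illustrative check only, take the time scale T = R and t_0 = 1 (both chosen here for the sketch, not stated in the example). Equation (2) has r(t) = t, p(t) = 1/t^2 and tau(t) = t - 1, so
\[
M(t) = \int_{1}^{t} s\, ds = \frac{t^{2}-1}{2} \to \infty,
\qquad
\tau^{\Delta}(t) = 1 \ge 0,
\]
\[
M(\tau(t))\,p(t) - \frac{r(\tau(t))\,\tau^{\Delta}(t)}{4\,M(\tau(t))}
= \frac{(t-1)^{2}-1}{2t^{2}} - \frac{t-1}{2\left[(t-1)^{2}-1\right]}
\;\to\; \frac{1}{2} \quad (t \to \infty),
\]
so the integrand in condition (iii) stays bounded away from zero for large t, the integral diverges, and Theorem 1 yields the oscillation of every solution of (2).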
HALL EFFECTS ON MAGNETO HYDRODYNAMIC FLOW PAST
AN EXPONENTIALLY ACCELERATED VERTICAL PLATE IN A
ROTATING FLUID WITH MASS TRANSFER EFFECTS.
Thamizhsudar. M1, Prof (Dr.) Pandurangan. J2
1Assistant Professor, 2H.O.D
1,2Department of Mathematics
Aarupadai Veedu Institute of Technology, Paiyanoor-603104, Chennai, Tamil Nadu, India
[email protected]
ABSTRACT
The theoretical solution of flow past an exponentially accelerated vertical plate in the presence of Hall current and magnetohydrodynamic effects, relative to a rotating fluid with uniform temperature and variable mass diffusion, is presented. The dimensionless equations are solved using the Laplace transform method. The axial and transverse velocity, temperature, and concentration fields are studied for different parameters such as the Hall parameter, Hartmann number, rotation parameter, Schmidt number, Prandtl number, thermal Grashof number, and mass Grashof number. It has been observed that the temperature of the plate decreases with increasing values of the Prandtl number and the concentration near the plate increases with decreasing values of the Schmidt number. It is also observed that the axial velocity increases with decreasing values of the magnetic field parameter, Hall parameter, and rotation parameter, whereas the transverse velocity increases with increasing values of the rotation parameter and magnetic parameter, but the trend gets reversed with respect to the Hall parameter. The effects of all parameters on the axial and transverse velocity profiles are shown graphically.
Index Terms: Hall effect, MHD flow, rotation, exponentially accelerated plate, variable mass diffusion.

Nomenclature
a, A, a'   Constants
B0         Applied magnetic field (T)
C          Dimensionless concentration
c'         Species concentration in the fluid (mol/m3)
cp         Specific heat at constant pressure (J/(kg.K))
cw         Concentration of the plate
c(inf)     Concentration of the fluid far away from the plate
D          Mass diffusion coefficient
erfc       Complementary error function
Gc         Mass Grashof number
Gr         Thermal Grashof number
g          Acceleration due to gravity (m/s2)
k          Thermal conductivity (W/m.K)
M          Hartmann number
m          Hall parameter
Pr         Prandtl number
Sc         Schmidt number
T          Temperature of the fluid near the plate (K)
Tw         Temperature of the plate (K)
T(inf)     Temperature of the fluid far away from the plate (K)
t          Dimensionless time
t'         Time (s)
u0         Velocity of the plate
(u', v', w')  Components of the velocity field F (m/s)
(u, v, w)  Non-dimensional velocity components
(x, y, z)  Cartesian coordinates
z          Non-dimensional coordinate normal to the plate
mu_e       Magnetic permeability (H/m)
nu         Kinematic viscosity (m2/s)
Omega'     Component of angular velocity (rad/s)
Omega      Non-dimensional angular velocity
rho        Fluid density (kg/m3)
sigma      Electric conductivity (Siemens/m)
theta      Dimensionless temperature
eta        Similarity parameter
beta       Volumetric coefficient of thermal expansion
beta*      Volumetric coefficient of expansion with concentration
I Introduction
Magnetohydrodynamic (MHD) flows of an electrically conducting fluid are encountered in many industrial applications such as the purification of molten metals, non-metallic inclusions, liquid metals, plasma studies, geothermal energy extraction, nuclear reactors, and boundary layer control in the fields of aerodynamics and aeronautics.
The rotating flow of an electrically conducting fluid in the presence of a magnetic field is encountered in cosmical and geophysical fluid dynamics. It is also involved in solar physics, in sunspot development, the solar cycle, and the structure of rotating magnetic stars. The study of MHD viscous flows with Hall currents has important engineering applications in problems of MHD generators, Hall accelerators, as well as in flight magnetohydrodynamics.
The effect of Hall currents on hydromagnetic flow near an accelerated plate was studied by Pop (1971). Rotation effects on hydromagnetic free convective flow past an accelerated isothermal vertical plate were studied by Raptis and Singh (1981). Takhar et al. (1992) studied the Hall effects on heat and mass transfer flow with variable suction and heat generation. Watanabe and Pop (1995) studied the effect of the Hall current on the steady MHD flow over a continuously moving plate when the liquid is permeated by a uniform transverse magnetic field. Takhar et al. (2002) investigated the simultaneous effects of Hall current and free stream velocity on the magnetohydrodynamic flow over a moving plate in a rotating fluid. Hayat and Abbas (2007) studied the fluctuating rotating flow of a second-grade fluid past a porous plate with variable suction and Hall current. Muthucumaraswamy et al. (2008) obtained the heat transfer effects on flow past an exponentially accelerated vertical plate with variable temperature. Magnetohydrodynamic convective flow past an accelerated isothermal vertical plate with variable mass diffusion was studied by Muthucumaraswamy et al. (2011).
In all the above studies, the combined effect of rotation and MHD flow in addition to the Hall current has not been considered simultaneously. Here we have made an attempt to study the Hall current effects on the MHD flow of an exponentially accelerated plate relative to a rotating fluid with uniform temperature and variable mass diffusion.

II Mathematical Formulation
Here we consider an electrically conducting viscous incompressible fluid past an infinite plate occupying the plane z' = 0. The x'-axis is taken in the direction of the motion of the plate and the y'-axis is normal to both the x' and z' axes. Initially, the fluid and the plate rotate in unison with a uniform angular velocity Omega' about the z'-axis normal to the plate; also the temperature of the plate and the concentration near the plate are assumed to be T(inf) and c'(inf). At time t' > 0, the plate is exponentially accelerated with a velocity u' = (u0/A) exp(a't') in its own plane along the x'-axis, the temperature of the plate is raised to Tw, and the concentration level near the plate is raised linearly with time. Here the plate is electrically non-conducting. Also, a uniform magnetic field B0 is applied parallel to the z'-axis, and the pressure is uniform in the flow field. If u', v', w' represent the components of the velocity vector F, then the equation of continuity, div F = 0, gives w' = 0 everywhere in the flow, so that the boundary condition w' = 0 is satisfied at the plate. Here the flow quantities depend on z' and t' only, and it is assumed that the flow far away from the plate is undisturbed. Under these assumptions the unsteady flow is governed by the following equations:
\[
\frac{\partial u'}{\partial t'} = \nu \frac{\partial^2 u'}{\partial z'^2} + 2\Omega' v'
 - \frac{\sigma \mu_e^2 B_0^2}{\rho (1 + m^2)} (u' + m v')
 + g\beta (T - T_\infty) + g\beta^* (c' - c'_\infty) \qquad (1)
\]
\[
\frac{\partial v'}{\partial t'} = \nu \frac{\partial^2 v'}{\partial z'^2} - 2\Omega' u'
 + \frac{\sigma \mu_e^2 B_0^2}{\rho (1 + m^2)} (m u' - v') \qquad (2)
\]
\[
\rho c_p \frac{\partial T}{\partial t'} = k \frac{\partial^2 T}{\partial z'^2} \qquad (3)
\]
\[
\frac{\partial c'}{\partial t'} = D \frac{\partial^2 c'}{\partial z'^2} \qquad (4)
\]
where u' is the axial velocity and v' is the transverse velocity. The prescribed initial and boundary conditions are
\[
u' = 0, \quad v' = 0, \quad T = T_\infty, \quad c' = c'_\infty \quad \text{for all } z', \; t' \le 0 \qquad (5)
\]
\[
u' = \frac{u_0}{A} e^{a' t'}, \quad v' = 0, \quad T = T_w, \quad c' = c'_\infty + (c'_w - c'_\infty)\, A t' \quad \text{at } z' = 0, \; t' > 0 \qquad (6)
\]
\[
u' \to 0, \quad v' \to 0, \quad T \to T_\infty, \quad c' \to c'_\infty \quad \text{as } z' \to \infty \qquad (7)
\]
where A = (u0^2/nu)^(1/3) is a constant.
On introducing the following non-dimensional quantities
\[
u = \frac{u'}{(u_0 \nu)^{1/3}}, \quad
v = \frac{v'}{(u_0 \nu)^{1/3}}, \quad
t = t' \left(\frac{u_0^2}{\nu}\right)^{1/3}, \quad
z = z' \left(\frac{u_0}{\nu^2}\right)^{1/3}, \quad
a = a' \left(\frac{\nu}{u_0^2}\right)^{1/3},
\]
\[
\theta = \frac{T - T_\infty}{T_w - T_\infty}, \quad
C = \frac{c' - c'_\infty}{c'_w - c'_\infty}, \quad
Gr = \frac{g\beta (T_w - T_\infty)}{u_0}, \quad
Gc = \frac{g\beta^* (c'_w - c'_\infty)}{u_0},
\]
\[
M^2 = \frac{\sigma \mu_e^2 B_0^2\, \nu^{1/3}}{\rho\, u_0^{2/3}}, \quad
\Omega = \Omega' \left(\frac{\nu}{u_0^2}\right)^{1/3}, \quad
Pr = \frac{\mu c_p}{k}, \quad
Sc = \frac{\nu}{D},
\]
the equations (1)-(7) reduce to the following non-dimensional form of the governing equations:
\[
\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial z^2} + 2\Omega v - \frac{M^2}{1 + m^2}(u + m v) + Gr\,\theta + Gc\, C \qquad (8)
\]
\[
\frac{\partial v}{\partial t} = \frac{\partial^2 v}{\partial z^2} - 2\Omega u + \frac{M^2}{1 + m^2}(m u - v) \qquad (9)
\]
\[
\frac{\partial \theta}{\partial t} = \frac{1}{Pr} \frac{\partial^2 \theta}{\partial z^2} \qquad (10)
\]
\[
\frac{\partial C}{\partial t} = \frac{1}{Sc} \frac{\partial^2 C}{\partial z^2} \qquad (11)
\]
with initial and boundary conditions
\[
u = 0, \quad v = 0, \quad \theta = 0, \quad C = 0 \quad \text{for all } z, \; t \le 0 \qquad (12)
\]
\[
u = e^{a t}, \quad v = 0, \quad \theta = 1, \quad C = t \quad \text{at } z = 0, \; t > 0 \qquad (13)
\]
\[
u \to 0, \quad v \to 0, \quad \theta \to 0, \quad C \to 0 \quad \text{as } z \to \infty \qquad (14)
\]
The above equations (8)-(9) and boundary conditions (12)-(14) can be combined, with q = u + i v, as
\[
\frac{\partial q}{\partial t} = \frac{\partial^2 q}{\partial z^2}
 - \left[ \frac{M^2}{1 + m^2} + i\left( 2\Omega - \frac{M^2 m}{1 + m^2} \right) \right] q
 + Gr\,\theta + Gc\, C \qquad (15)
\]
\[
\frac{\partial \theta}{\partial t} = \frac{1}{Pr} \frac{\partial^2 \theta}{\partial z^2} \qquad (16)
\]
\[
\frac{\partial C}{\partial t} = \frac{1}{Sc} \frac{\partial^2 C}{\partial z^2} \qquad (17)
\]
with
\[
q = 0, \quad \theta = 0, \quad C = 0 \quad \text{for all } z, \; t \le 0 \qquad (18)
\]
\[
q = e^{a t}, \quad \theta = 1, \quad C = t \quad \text{at } z = 0, \; t > 0 \qquad (19)
\]
\[
q \to 0, \quad \theta \to 0, \quad C \to 0 \quad \text{as } z \to \infty \qquad (20)
\]
where q = u + iv.

III Solution of the Problem
To solve the dimensionless governing equations (15) to (17), subject to the initial and boundary conditions (18)-(20), the Laplace transform technique is used. The solutions are in terms of exponentials and the complementary error function:
\[
C = t \left[ \left(1 + 2\eta^2 Sc\right) \mathrm{erfc}\!\left(\eta \sqrt{Sc}\right)
 - \frac{2\eta \sqrt{Sc}}{\sqrt{\pi}}\, e^{-\eta^2 Sc} \right] \qquad (21)
\]
\[
\theta = \mathrm{erfc}\!\left(\eta \sqrt{Pr}\right) \qquad (22)
\]
and q is given by a closed-form expression, equation (23), built from exponentials and complementary error functions of the arguments \(\eta \pm \sqrt{(a+b)t}\), \(\eta \pm \sqrt{bt}\), \(\eta \pm \sqrt{(b+e_1)t}\), \(\eta \pm \sqrt{(b+e_2)t}\), \(\eta\sqrt{Pr} \pm \sqrt{e_1 t}\) and \(\eta\sqrt{Sc} \pm \sqrt{e_2 t}\), where
\[
b = \frac{M^2}{1 + m^2} + i\left( 2\Omega - \frac{M^2 m}{1 + m^2} \right), \quad
d_1 = \frac{Gr}{Pr - 1}, \quad e_1 = \frac{b}{Pr - 1}, \quad
d_2 = \frac{Gc}{Sc - 1}, \quad e_2 = \frac{b}{Sc - 1}, \quad
\eta = \frac{z}{2\sqrt{t}} .
\]
In order to get a clear understanding of the flow field, we have separated q into real and imaginary parts to obtain the axial and transverse components u and v.

IV Results and Discussion
To interpret the results for a better understanding of the problem, numerical calculations are carried out for different physical parameters M, m, Omega, Gr, Gc, Pr, and Sc. The value of the Prandtl number is chosen to be 7.0, which corresponds to water.
Figure 1 illustrates the effect of the Schmidt number (Sc = 0.16, 0.3, 0.6), with M = m = 0.5, Omega = 0.1, a = 2.0, t = 0.2, on the concentration field. It is observed that, as the Schmidt number increases, the concentration of the fluid medium decreases.
Fig. 1. Concentration profiles for different values of Sc.
The effect of the Prandtl number (Pr) on the temperature field is shown in Figure 2. It is noticed that an increase in the Prandtl number leads to a decrease in the temperature.
Fig. 2. Temperature profiles for different values of Pr.
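A short Python sketch (assuming NumPy and SciPy are available; this is not the authors' code) can evaluate the closed-form profiles (21) and (22) and reproduce the trends reported for Figs. 1 and 2:

# Sketch: evaluate equations (21) and (22) with scipy.special.erfc.
# Concentration falls as Sc rises; temperature falls as Pr rises.
import numpy as np
from scipy.special import erfc

def concentration(eta, Sc, t):
    s = eta * np.sqrt(Sc)
    return t * ((1.0 + 2.0 * s**2) * erfc(s)
                - (2.0 * s / np.sqrt(np.pi)) * np.exp(-s**2))

def temperature(eta, Pr):
    return erfc(eta * np.sqrt(Pr))

z = np.linspace(0.0, 3.0, 7)
t = 0.2
eta = z / (2.0 * np.sqrt(t))          # similarity variable eta = z / (2 sqrt(t))
for Sc in (0.16, 0.3, 0.6):
    print("Sc =", Sc, np.round(concentration(eta, Sc, t), 4))
for Pr in (0.71, 7.0):
    print("Pr =", Pr, np.round(temperature(eta, Pr), 4))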
Fig. 3. Axial velocity profiles for different values of M (M = 1.0, 3.0, 5.0).
The effect of the rotation parameter on the axial velocity is shown in Figure 4. It is observed that the velocity increases with decreasing values of Omega (3.0, 5.0).
Fig. 4. Axial velocity profiles for different values of Omega.
Figure 5 demonstrates the effect of the Hall parameter m on the axial velocity. It has been noticed that the velocity decreases with increasing values of the Hall parameter.
Fig. 5. Axial velocity profiles for different values of m.
Figures 6 and 7 show the effects of the thermal Grashof number Gr and the mass Grashof number Gc. It has been noticed that the axial velocity increases with increasing values of both Gr and Gc.
Fig. 6. Axial velocity profiles for different values of Gr.
Fig. 7. Axial velocity profiles for different values of Gc.
Figure 8 illustrates the effect of the magnetic field parameter M on the transverse velocity. It is observed that the transverse velocity increases with increasing values of M, and it is also observed that the transverse velocity peaks closer to the wall as M is increased.
Fig. 8. Transverse velocity profiles for different values of M.
The transverse velocity profiles for different values of the rotation parameter Omega are shown in Figure 9. It is observed that the velocity increases with decreasing values of Omega.
Fig. 9. Transverse velocity profiles for different values of Omega.
Figure 10 shows the effect of the Hall parameter m on the transverse velocity. It is found that the velocity increases with increasing values of m.
Fig. 10. Transverse velocity profiles for different values of m.
Figure 11 demonstrates the effect of the thermal Grashof number Gr on the transverse velocity. It is observed that there is an increase in the transverse velocity as there is an increase in Gr.
Fig. 11. Transverse velocity profiles for different values of Gr.
The effect of the mass Grashof number on the transverse velocity is shown in Figure 12. Numerical calculations were carried out for different values of Gc, namely 3.0, 5.0, 10.0. From the figure it has been noticed that with decreasing values of Gc the transverse velocity is increased.
Fig. 12. Transverse velocity profiles for different values of Gc.

V Conclusion
In this paper we have studied the effects of the Hall current and rotation on the MHD flow past an exponentially accelerated vertical plate with uniform temperature and variable mass diffusion. In the analysis of the flow the following conclusions are made.
(1) The concentration near the plate increases with decreasing values of the Schmidt number.
(2) The temperature of the plate decreases with increasing values of the Prandtl number.
(3) Both the axial velocity and the transverse velocity increase with decreasing values of the magnetic field parameter or the rotation parameter. Also, the axial velocity increases with decreasing values of the Hall parameter, but the trend is reversed for the transverse velocity.
(4) Both the axial velocity and the transverse velocity increase with increasing values of the thermal Grashof number.
(5) The axial velocity increases with increasing values of the mass Grashof number, whereas the transverse velocity increases with decreasing values of the mass Grashof number.

References
[1] Pop, I. (1971): The effect of Hall currents on hydromagnetic flow near an accelerated plate. J. Math. Phys. Sci., vol. 5, pp. 375-379.
[2] Raptis, A., and Singh, A.K. (1981): MHD free convection flow past an accelerated vertical plate. Letters in Heat and Mass Transfer, vol. 8, pp. 137-143.
[3] Takhar, H.S., Ram, P.C., and Singh, S.S. (1992): Hall effects on heat and mass transfer flow with variable suction and heat generation. Astrophysics and Space Science, vol. 191, issue 1, pp. 101-106.
[4] Watanabe, T., and Pop, I. (1995): Hall effects on magnetohydrodynamic boundary layer flow over a continuous moving flat plate. Acta Mechanica, vol. 108, pp. 35-47.
[5] Takhar, H.S., Chamkha, A.J., and Nath, G. (2002): MHD flow over a moving plate in a rotating fluid with a magnetic field, Hall currents and free stream velocity. Intl. J. Engng. Sci., vol. 40(13), pp. 1511-1527.
[6] Hayat, T., and Abbas, Z. (2007): Effects of Hall current and heat transfer on the flow in a porous medium with slip conditions. Journal of Porous Media, vol. 10(1), pp. 35-50.
[7] Muthucumaraswamy, R., Sathappan, K.E., and Natarajan, R. (2008): Heat transfer effects on flow past an exponentially accelerated vertical plate with variable temperature. Theoretical Applied Mechanics, vol. 35, pp. 323-331.
[8] Muthucumaraswamy, R., Sundar, R.M., and Subramanian, V.S.A. (2011): Magneto hydro dynamic convective flow past an accelerated isothermal vertical plate with variable mass diffusion. International Journal of Applied Mechanics and Engineering, vol. 16, pp. 885-891.
DETECTION OF CAR-LICENSE PLATE USING MODIFIED
VERTICAL EDGE DETECTION ALGORITHM
S. Meha Soman1, Dr. N. Jaisankar2
1PG Student, Applied Electronics, 2Professor & Head, Dept of ECE
1,2Misrimal Navajee Munoth Jain Engineering, Chennai, India
[email protected], [email protected]
Abstract - The car license plate detection and recognition system has become an important area of research due to its various applications, such as the payment of parking fees, highway toll fees, traffic data collection, and crime prevention. There are many issues to be resolved to create a successful and fast car license plate detection system, e.g., poor image quality, plate sizes and designs, processing time, and background details and complexity. To overcome these issues we propose an algorithm called the modified VEDA for detecting vertical edges, which enhances the performance of car license plate detection in terms of computation time and increases the detection rate.
Keywords - Edge detection, license plate, vertical edge detection algorithm, modified VEDA.
I. INTRODUCTION
Localization of potential license plate region(s) from vehicle images is a challenging task on account of huge variations in the size, shape, colour, texture, and spatial orientation of license plate regions in such images. Normally, the objective of any Automatic License Plate Recognition (ALPR) system is to localize potential license plate region(s) from the vehicle images captured with a road-side camera and interpret them using an Optical Character Recognition (OCR) system to obtain the license number of the vehicle. Various techniques have been developed recently for the efficient detection of license plate regions from offline vehicular images. Normally, most ALPR systems apply edge-based features for localizing standardized license plate regions.
Many of these works capture the image of a vehicle carefully placed in front of a camera, occupying the complete view and giving a clear picture of the license plate. However, in an unconstrained outdoor environment there may be huge variations in lighting conditions, wind speed, pollution levels, motion, etc., that make the localization of true license plate regions difficult.
M. Fukumi et al., 2000 have introduced a method to recognize the characters of vehicle license plates in Malaysia by using a neural network based threshold method. For the separation of characters and background, a digitalization threshold is important and is determined using a three-layered neural network. Furthermore, in the extracted character portions, the characters are segmented and recognized by obtaining their features [5].
H. Zhang et al., 2008 have proposed a fast algorithm for detecting license plates in various conditions. It defines a new vertical edge map, with which the license plate detection algorithm is extremely fast, and then constructs a cascade classifier which is composed of two kinds of classifiers [9].
F. Shafait et al., 2011 have introduced adaptive binarization, an important step in many document analysis and OCR processes. They describe a fast adaptive binarization algorithm that yields the same quality of binarization as the Sauvola method but runs in time close to that of global thresholding methods (like Otsu's method), independent of the window size. The algorithm combines the statistical constraints of Sauvola's method with integral images. Testing on the UW-1 dataset demonstrates a 20-fold speedup compared to the original Sauvola algorithm [7].
F.L. Cheng et al., 2012 have discussed a novel method to recognize license plates robustly. First, a segmentation phase locates the license plate within the image using its salient features. Then, a procedure based upon feature projection estimates is used to separate the license plate into seven characters. Finally, the character recognizer extracts some salient features of the characters and uses template-matching operators to get a robust solution [3].
The proposed method uses a modified vertical edge detection algorithm to distinguish the plate detail region, particularly the beginning and the end of each character. Therefore, the plate details will be easily detected, and the character recognition process will be done faster. The VEDA concentrates on the intersections of black-white and white-black regions.
II. SYSTEM DESIGN
A. Block Diagram
The flow diagram for the detection of the car license plate is shown in Fig. 1.
Fig 1. Detection of Car-License Plate
B. Pre-Processing
In image preprocessing, the RGB image is first converted into a gray scale image. Then an adaptive thresholding technique is applied to the image to constitute the binarized image. After that, the Unwanted Lines Elimination Algorithm (ULEA) is applied to the image. This algorithm performs morphological operations and an enhancement process: it removes noise and enhances the binarized image.
C. Segmentation
Edge detection is probably the most widely used operation in image analysis, and there are probably more algorithms in the literature for enhancing and detecting edges than for any other detection approach. This is due to the fact that edges establish the outline of an object. An edge is the boundary between an object and the background, and it indicates the boundary between overlapping objects. Recently, license plate edge detection has become a vital part of vision navigation, which is a key method of intelligent vehicle assistance. The detection outcome is seriously affected by noise and image quality. This means that if the edges in an image can be identified accurately, all of the objects can easily be located and basic properties such as area, perimeter, and shape can easily be measured.
Fig 3. Design of Proposed Mask
a) Moving Mask
b) left mask (0, 0), (1, 0)
c) Centre Mask (0, 1), (0, 2), (1, 1) (1, 2)
d) Right Mask (0,3),(1,3)
The mask is divided into three sub-masks: the first sub-mask is the left mask (2 x 1), the second sub-mask is the centre mask (2 x 2), and the third sub-mask is the right mask (2 x 1). Simply, after each two pixels are checked at once, the first sub-mask is applied so that a 2-pixel width (because two columns are processed) can be considered for detecting. This process is specified to detect the vertical edges at the intersections of black-white regions.
Unwanted vertical edges are then removed from the image by using the binary area open operator. For Sobel, both vertical edges of a detected object have the same thickness. With VEDA's output form, the searching process for the LP details can be faster and easier, because this process searches for the availability of a 2-pixel width followed by a 1-pixel width to stand for a vertical edge. In addition, there is no need to search again once the 1-pixel width is reached. These two features make the searching process faster.
i) Modified VEDA
VEDA (Vertical Edge Detection Algorithm) is applied to the ULEA output of the image. It serves to distinguish the plate detail region, particularly the beginning and the end of each character. Therefore, the plate details will be easily detected, and the character recognition process will be done faster. The VEDA concentrates on the intersections of black-white and white-black regions. The center pixels of the mask are located at points (0, 1) and (1, 1). By moving the mask from left to right, the black-white regions will be found, and only the last two black pixels will be kept; similarly, the first black pixel will be kept in the case of white-black regions.
Fig 2. a) Black-white region b) White-black region
A 2 x 4 mask is proposed for this process.
ii) Algorithm for Edge Detection
Input: test number plate image.
Output: segmented image.
Step 1: From the given image, a 2 x 4 mask is proposed.
Step 2: This mask is divided into three sub-masks: the first is the left sub-mask of size 2 x 1, the second is the centre sub-mask of size 2 x 2, and the third is the right sub-mask of size 2 x 1:
l = x(i, j-1) && x(i+1, j-1)
m = x(i, j) && x(i, j+1) && x(i+1, j) && x(i+1, j+1)
n = x(i, j+2) && x(i+1, j+2)
Step 3: Here x denotes the binary image, i ranges over its rows (the height of the image), and j over its columns (the width of the image).
Step 4: If the four pixels of the centre mask are black, then the other mask values are tested to check whether they are black or not.
Step 5: If all the values are black, then the first column of the centre mask is converted to white.
Step 6: If the first column of the centre mask and any other column have different values, the pixel value of column one is taken.
Step 7: This process is repeated for all the pixels in the image.
Step 8: Finally, the white pixels are checked. If the mask contains fewer than five white pixels, the binary area open operator is used to remove unwanted edges from the image:
BW2 = bwareaopen(BW, P)
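The following Python sketch is a simplified illustration of the transition rule behind the VEDA (keep the last two black pixels at a black-to-white transition and the first black pixel at a white-to-black transition), with skimage's remove_small_objects standing in for MATLAB's bwareaopen; it is not the exact 2 x 4 mask implementation of the paper:

# Simplified sketch of the VEDA transition idea (0 = black, 1 = white).
import numpy as np
from skimage.morphology import remove_small_objects

def vertical_edges(binary, min_size=5):
    h, w = binary.shape
    edges = np.zeros_like(binary, dtype=bool)
    for i in range(h):
        for j in range(1, w):
            if binary[i, j - 1] == 0 and binary[i, j] == 1:    # black -> white
                start = j - 2 if j >= 2 and binary[i, j - 2] == 0 else j - 1
                edges[i, start:j] = True                       # keep last two black pixels
            elif binary[i, j - 1] == 1 and binary[i, j] == 0:  # white -> black
                edges[i, j] = True                             # keep first black pixel
    # drop tiny edge fragments, playing the role of bwareaopen
    return remove_small_objects(edges, min_size=min_size)

# usage: edges = vertical_edges(binarized_plate_image)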
D. Extraction
i) Highly Desired Details (HDD)
This step highlights the desired details, such as the plate details and vertical edges, in the image. The HDD performs a NAND-AND operation on each pair of corresponding pixel values taken from the ULEA and VEDA output images. This process depends on the VEDA output in highlighting the plate region; all the pixels in the vertical edge image will be scanned.
Fig 4. NAND- AND Procedure
ii) Candidate Region Extraction
Candidates are extracted from the image using four steps:
1) Count the drawn lines per each row
2) Divide the image into multi-groups
3) Count and store satisfied group indexes and boundaries
4) Select the boundaries of candidate regions
Count the drawn lines per each row: The number of lines that have been drawn per each row will be counted and stored in the matrix variable HwMnyLines[a], where a = 0, 1, ..., Height-1.
Divide the image into multi-groups: A huge number of rows would delay the processing in the next steps. Thus, to reduce the consumed time, many rows are gathered into one group. The image is divided into multi-groups using
how_many_groups = Height / C
where how_many_groups represents the total number of groups, Height represents the total number of image rows, and C is the candidate region extraction (CRE) constant, i.e., the number of rows in each group. In this paper, C = 10, because grouping every ten rows saves computation time while avoiding either losing much desired detail or consuming more computation time to process the image. Therefore, each group consists of ten rows.
Count and store satisfied group indexes and boundaries: When there are two neighbouring black pixels followed by one black pixel, as in the VEDA output form, the two edges will be checked and the desired details highlighted by drawing black horizontal lines connecting each pair of vertical edges. Most of the group lines are not parts of the plate details. Therefore, it is useful to use a threshold to eliminate the unsatisfied groups and to keep the satisfied groups in which the LP details exist. Each group will be checked; if it has at least 15 lines, then it is considered as a part of the LP region. Thus, the total number of groups including parts of LP regions will be counted and stored. This threshold value (i.e., 15 lines) is chosen to make sure that small-sized LPs are included in the plate searching process. If the predefined threshold is selected with a smaller value than that given, wrong results can be yielded, because noise and/or non-plate regions will be considered as parts of the true LP even if they are not. A small sketch of this step follows.
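A small sketch of this grouping and thresholding step, with illustrative names (lines_per_row stands in for the HwMnyLines[] array), is given below:

# Sketch: count drawn lines per row, gather every C = 10 rows into a
# group, and keep the groups containing at least 15 lines.
def satisfied_groups(lines_per_row, rows_per_group=10, min_lines=15):
    height = len(lines_per_row)
    how_many_groups = height // rows_per_group
    kept = []
    for g in range(how_many_groups):
        start = g * rows_per_group
        total = sum(lines_per_row[start:start + rows_per_group])
        if total >= min_lines:                    # group holds plate-like detail
            kept.append((g, start, start + rows_per_group - 1))  # index and row boundaries
    return kept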
4) Select the boundaries of candidate regions: This step draws the horizontal boundaries above and below each candidate region. As shown, there can be two candidate regions interpreted from the horizontal-line plotting, and these conditions require an additional step before the LP region can be correctly extracted.
E) Localization
This process aims to select and extract one correct
license plate. Some of the processed images are blurry, or
the plate region might be defective. The plate region can
be checked pixel by pixel, whether it belongs to the LP
region or not. A mathematical formulation is proposed for
this purpose, and once this formulation is applied on each
pixel, the probability of the pixel being an element of the
LP can be decided.
Plate region selection and plate detection:
For the candidate regions, each column is checked one by one. If the column blackness ratio exceeds 50%, the current column belongs to the LP region, and this column is replaced by a vertical black line in the result image. Each column is checked by the condition that, if blckPix ≥ 0.5 × colmnHght, then the current column is an element of the LP region. Here blckPix represents the total number of black pixels in the column of the current candidate region, and colmnHght is the column height.
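A small sketch of this column test, together with a simplified version of the later voting idea (picking the candidate region with the most qualifying columns), might look as follows; the candidate regions are assumed to be binary sub-images with black pixels equal to 0, and the 0.5 factor is the PRS factor discussed below.

```python
import numpy as np

def plate_columns(region, prs=0.5):
    """Boolean flag per column: True if blckPix >= prs * colmnHght for that column."""
    colmn_hght = region.shape[0]
    blck_pix = np.sum(region == 0, axis=0)       # black pixels per column
    return blck_pix >= prs * colmn_hght

def select_plate_region(candidate_regions, prs=0.5):
    """Pick the candidate region with the largest number of qualifying columns (votes)."""
    votes = [int(np.sum(plate_columns(r, prs))) for r in candidate_regions]
    return int(np.argmax(votes))
```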
Fig 5. Flowchart of PRS and PD
The first part covers the selection process of the LP region from the mathematical perspective only. The second part applies the proposed equation to the image. The third part gives the proof of the proposed equation using statistical calculations and graphs. The fourth part explains the voting step. The final part introduces the procedure for detecting the LP using the proposed equation. The flowchart of plate region selection (PRS) and plate detection (PD) is shown in Fig. 5.
1) Selection Process of the LP Region.
2) Applying the Mathematical Formulation.
3) Making a Vote.
1) Selection Process of the LP Region:
The condition is modified as follows: blckPix ≥ PRS × colmnHght, where PRS represents the PRS factor.
2) Applying the Mathematical Formulation:
After applying the formulation to the image that contains the candidate regions, the output is obtained.
3) Making a Vote:
The columns whose top and bottom neighbours have high ratios of blackness are each given one vote. This process is done for all candidate regions. Hence, the candidate region with the highest vote count is selected as the true LP.
III. RESULTS AND DISCUSSION
Detection of car-license plate using vertical edge
detection algorithm and extraction was performed and the
results are shown below.
Fig 6. Car image
Fig 7. Application of the vertical edge detection algorithm (VEDA)
Fig 8. Modified VEDA
Fig 9. Candidate region extraction
Fig 10. Plate region selection
Fig 11. Plate detection
Fig 12. Localization
IV. CONCLUSION
A robust technique for license plate detection is presented here. It exploits the fact that the license plate area contains rich edge and texture information. First, the vertical edges are extracted and the edge map is adaptively binarized. Then, the license plate candidates are detected. The proposed approach was tested on various images and produced fair and stable results. Consistent, acceptable outputs over various kinds of real-life images have proved the robustness of the proposed scheme. Thus, the proposed method could be handy for any computer vision task where extraction of edge maps is necessary to produce a large set of images for feature extraction.
MODIFIED CONTEXT DEPENDENT SIMILARITY ALGORITHM
FOR LOGO MATCHING AND RECOGNITION
S. Shamini1, Dr. N. Jaisankar2
1PG Student, Applied Electronics
2Professor & Head, Dept. of ECE
1,2Misrimal Navajee Munoth Jain Engineering, Chennai, India
1[email protected]
2dr.jai235@gmail
1
Abstract - Visual data are widely used by companies, institutions, individuals and social systems such as Flickr and YouTube for the diffusion and sharing of images and video. Processing visual data from an image that has been corrupted by noise or subjected to a transformation, and the accuracy of logo matching in such images, are among the current research issues. To overcome these issues, we propose a new class of similarities based on a Modified Context Dependent Similarity algorithm, which improves performance in terms of logo-matching accuracy and computation time.
Keywords - Visual data, logo matching, context, computation time, accuracy.
I. INTRODUCTION
Social media include all media formats by which
groups of users interact to produce, share, and augment
information in a distributed, networked, and parallel
process. The most popular examples include Twitter
(short text messages), blogs and discussion forums
(commentary and discourse), Flickr (photos), YouTube
(videos), and Open Street Map (geospatial data).
Social media produces tremendous amounts of
data that can contain valuable information in many
contexts. Moreover, anyone can access this data either
freely or by means of subscriptions or provided service
interfaces, enabling completely new applications.
Graphic logos are a special class of visual
objects extremely important to assess the identity of
something or someone. In industry and commerce, they
have the essential role to recall in the customer the
expectations associated with a particular product or
service. This economical relevance has motivated the
active involvement of companies in soliciting smart
image analysis solutions to scan logo archives, to find evidence of similar already existing logos, and to discover improper or non-authorized use of their logos.
A. Smeulders et al. (1998) proposed a content-based retrieval system which depends upon patterns, types of picture, the role of semantics and the sensory gap. Features for retrieval are sorted into accumulative and global features, salient points, object and shape features, signs, and structural combinations. The CBIR implementation improves image retrieval based on features [14].
J. Matas et al. (2004) introduced a novel rotation-invariant detector, coined the Maximally Stable Extremal Region (MSER). A new robust similarity measure for establishing tentative correspondences is proposed. The robustness ensures that invariants can be taken from multiple measurement regions (regions obtained by invariant constructions from extremal regions), some of which are significantly larger (and hence more discriminative) than the MSERs [10].
Y. Jing et al. (2008) discussed image ranking, carried out through an iterative procedure based on the PageRank computation. A numerical weight is assigned to each image, and an algorithm is provided to analyse the visual links rather than relying solely on text clues [5].
J. Rodriguez et al. (2009) proposed a representation of 2D shapes described by sets of 2D points that is invariant to relevant transformations such as translation, rotation and scaling, and to the permutation of the landmarks. Within the framework, a shape is mapped to an analytic function on the complex plane, leading to an analytic signature (ANSIG) [12].
D. Lowe (2004) proposed distinctive invariant features for feature extraction and an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbour algorithm, followed by a Hough transform [8].
The proposed method uses a modified context dependent similarity algorithm, which involves preprocessing the test image followed by interest point extraction, context computation and similarity design. This overcomes the limitation of processing an unclear or corrupted image containing a logo and checking its genuineness.
II. SYSTEM DESIGN
A. Block Diagram
The flow diagram of the modified context dependent similarity algorithm is shown in Fig. 1.
Fig 1. Modified CDS algorithm

B. Pre-Processing
Pre-processing is an important step which is usually carried out to filter noise and to enhance the image before any further processing. Four filters, namely the mean, median, Gaussian and Wiener filters, are used to remove noise here, and their peak signal-to-noise ratios are calculated. The image with the highest PSNR value is used for further processing.

i) Mean filter
The mean filter is a simple spatial, sliding-window filter that replaces the centre value in the window with the average of all the pixel values in the kernel or window.

ii) Median filter
The median filter is a simple and powerful non-linear filter. It is used to reduce the amount of intensity variation between one pixel and the next. The median is calculated by first sorting all the pixel values into ascending order and then replacing the pixel being processed with the middle pixel value. If the neighbourhood under consideration contains an even number of pixels, the average of the two middle pixel values is used instead.

iii) Wiener filter
The goal of the Wiener filter is to reduce the mean square error as much as possible. This filter is capable of reducing both the noise and the degradation function. The Fourier domain of the Wiener filter is

W(u, v) = H*(u, v) / ( |H(u, v)|^2 + Pn(u, v) / Ps(u, v) )

where H*(u, v) is the complex conjugate of the degradation function, Pn(u, v) is the power spectral density of the noise, and Ps(u, v) is the power spectral density of the non-degraded image.

iv) Gaussian filter
Gaussian filters are used in image processing because their support in the time domain is equal to their support in the frequency domain; the Gaussian filter has the minimum time-bandwidth product. The Gaussian smoothing operator performs a weighted average of surrounding pixels based on the Gaussian distribution. It is used to remove Gaussian noise and is a realistic model of a defocused lens; sigma defines the amount of blurring. The probability distribution function of the normalised random variable is given by

f(x) = (1 / (sigma * sqrt(2*pi))) * exp( -(x - mu)^2 / (2*sigma^2) )

where mu is the expected value and sigma^2 is the variance.

v) Peak signal-to-noise ratio (PSNR)
The peak signal-to-noise ratio is calculated in order to assess image quality after noise removal, which simplifies the further processing steps. PSNR is the ratio between the maximum possible power of a signal and the power of the distorting noise that affects the quality of its representation. It is defined by

PSNR = 10 * log10( MAXf^2 / MSE )

where MAXf is the maximum signal value that exists in the original "known to be good" image. Each of the above filters produces a separate output; the output with the best PSNR is selected and passed on for interest point extraction.
C) Interest Points Extraction
Interest point extraction is a recent term in computer vision that refers to the detection of interest points for subsequent processing. An interest point is a point in the image that has a well-defined position in image space. The interest points are extracted using key points obtained from the Scale Invariant Feature Transform.
For any object in an image, interesting points on
the object can be extracted to provide a "feature
description" of the object. This description, extracted
from a training image, can then be used to identify the
object when attempting to locate the object in a test image
Such points usually lie on high-contrast regions of the
image, such as object edges.Another important
characteristic of these features is that the relative
positions between them in the original scene shouldn't
change from one image to another.
i) Scale Invariant Feature Transform
Scale-invariant feature transform (SIFT) is an algorithm in computer vision for detecting and describing local features in images. The algorithm performs feature detection in scale space and determines the location and scale of the feature points.
The algorithm of the scale invariant feature transform is as follows:
Constructing a scale space.
Scale-space extreme value detection (uses the difference-of-Gaussian function).
Key point localization (sub-pixel location and scale fit to a model).
Orientation assignment (one or more for each key point).
Key point descriptor (created from local image gradients).

Constructing a scale space:
This is the initial preparation: internal representations of the original image are generated to ensure scale invariance. This is done by building a "scale space".

LoG Approximation:
The Laplacian of Gaussian is well suited to finding interesting points (key points) in an image.

Finding key points:
With this fast approximation, the key points can be found. They are the maxima and minima of the difference-of-Gaussian images.

Assigning an orientation to the key points:
An orientation is calculated for each key point, and any further calculations are done relative to this orientation. This effectively cancels out the effect of orientation, making the features rotation invariant.

Generating SIFT features:
Finally, with scale and rotation invariance in place, one more representation is generated. This helps uniquely identify features.

Fig 2. SIFT keypoints mapped

D) Context
The context is defined by the local spatial configuration of interest points in both SX and SY. Formally, in order to take spatial information into account, an interest point xi ∈ SX is defined as xi = (ψg(xi), ψf(xi), ψo(xi), ψs(xi), ω(xi)), where ψg(xi) ∈ R2 stands for the 2D coordinates of xi while ψf(xi) ∈ Rc corresponds to the feature of xi.
Here (ψg(xj) − ψg(xi)) is the vector between the two point coordinates ψg(xj) and ψg(xi). The radius of a neighbourhood disk surrounding xi, denoted εp, is obtained by multiplying a constant value ε by the scale ψs(xi) of the interest point xi. In the above definition, θ = 1, ..., Na and ρ = 1, ..., Nr correspond to indices of the different parts of the disk; Na and Nr correspond to 8 sectors and 8 bands. Figure 3 shows the definition and partitioning of the context of an interest point xi into different sectors (for orientations) and bands (for locations).

Fig 3. Partitioning of the context of an interest point into different sectors and bands
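To make the sector/band partitioning concrete, the following sketch (an illustration only, not the authors' code) assigns each neighbouring interest point to one of the Na × Nr context cells of a reference point, using the angle and length of the displacement vector (ψg(xj) − ψg(xi)) and a disk radius proportional to the reference point's scale; the proportionality constant EPS is an assumption.

```python
import numpy as np

NA, NR = 8, 8          # 8 angular sectors and 8 radial bands, as in the paper
EPS = 4.0              # assumed proportionality constant between scale and disk radius

def context_cells(pts_xy, scales, i):
    """For interest point i, return (sector, band) indices of every other point,
    or None when a point falls outside the context disk of point i."""
    d = pts_xy - pts_xy[i]                        # displacement vectors psi_g(xj) - psi_g(xi)
    dist = np.hypot(d[:, 0], d[:, 1])
    radius = EPS * scales[i]                      # disk radius proportional to the scale of xi
    angle = np.mod(np.arctan2(d[:, 1], d[:, 0]), 2 * np.pi)
    sector = np.floor(angle / (2 * np.pi / NA)).astype(int)
    band = np.floor(dist / (radius / NR)).astype(int)
    cells = []
    for j in range(len(pts_xy)):
        if j == i or dist[j] >= radius:
            cells.append(None)                    # outside the context disk
        else:
            cells.append((sector[j], min(band[j], NR - 1)))
    return cells
```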
E) Similarity design
We define k as a function which, given two interest points (x, y) ∈ SX × SY, provides a similarity measure between them. For a finite collection of interest points the sets SX, SY are finite, and provided that we put some (arbitrary) order on SX, SY, we can view the function k as a matrix K.
Let Dx,y = d(x, y) = ||ψf(x) − ψf(y)||2. Using this notation, the similarity K between the two objects SX, SY is obtained by solving a minimization problem in which α, β ≥ 0 and the operations log (natural) and ≥ are applied individually to every entry of the matrix.
Solution: consider the adjacency matrices {Pθ,ρ}θ,ρ and {Qθ,ρ}θ,ρ related to a reference logo SX and a test image SY respectively, where ||·|| is the "entry-wise" norm (i.e., the sum of the squared values of the vector coefficients).

Fig 4. Collection of SIFT points with their locations, orientations and scales

III. SYSTEM IMPLEMENTATION
The proposed system has been implemented using the following modified context dependent similarity algorithm.
Input: Test logo image.
Output: Detected logo image.
Step 1: Take two input images, namely the reference logo image Ix and the test logo image Iy.
Step 2: Convert the colour images to gray scale.
Step 3: Extract Scale Invariant Feature Transform (SIFT) features from Ix and Iy.
Step 4: Compute the first octave of SIFT. Assume k2 = 0, k1 = 0:3, k = sqrt(2) and sigma = (k^(k1 + 2*k2)) * 1.6.
Step 5: Compute the CDS matrix by (1 / (2*pi*(k*sigma)*(k*sigma))) * exp(-((x*x) + (y*y)) / (2*(k*k)*(sigma*sigma))).
Step 6: Store the matrix result and resize the image.
Step 7: Compute the second and third octaves by repeating the above steps, changing the value of k2 to 1 and 2 respectively.
Step 8: Obtain and plot the key points on the image using kpl.
Step 9: Calculate the magnitude and orientation of the key points:
p1 = mag(k1-2:k1+2, j1-2:j1+2);
q1 = oric(k1-2:k1+2, j1-2:j1+2);
Step 10: Plot the key points of the test and reference logo images.
Step 11: Determine the CDS matrix K.
Step 12: Compare the key points and find the value of tow (tow = count / length(kp)).

IV. RESULTS AND DISCUSSION
Logo matching and recognition using the modified context dependent similarity algorithm was performed and the results are shown below.
a) Filtered image
b) Test image with keypoint mapped into it
c) Reference image with keypoints
d) Genuine logo detected
e) Fake logo detected

V. CONCLUSION
We have proposed a new class of similarities based on a Modified Context Dependent Similarity algorithm. We implemented the scale invariant feature transform algorithm for key point extraction together with an enhanced context computation technique, which improves performance in terms of logo-matching accuracy and computation time.

VI. REFERENCES
1. Ballan L. and Jain A. (2008) 'A system for automatic detection and recognition of advertising trademarks in sports videos', ACM Multimedia, pp. 991–992.
2. Bay H. and Tuytelaars T. (2008) 'Speeded-up robust features (SURF)', Computer Vision and Image Understanding, Vol. 110, No. 3, pp. 346–359.
3. Carneiro G. and Jepson A. (2004) 'Flexible spatial models for grouping local image features', in Proc. Conf., Vol. 2, pp. 747–754.
4. Eakins J.P. and Boardman J.M. (1998) 'Similarity retrieval of trademark images', IEEE Multimedia, Vol. 5, No. 2, pp. 53–63.
5. Jing Y. and Baluja S. (2008) 'PageRank for product image search', in Proc., Beijing, China, pp. 307–316.
6. Kalantidis Y. and Trevisiol M. (2011) 'Scalable triangulation-based logo recognition', in Proc. ACM Int. Conf. Multimedia Retrieval, Italy, pp. 1–7.
7. Kim Y.S. and Kim W.Y. (1997) 'Content-based trademark retrieval system using visually salient feature', in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., San Juan, Puerto Rico, pp. 307–312.
8. Lowe D. (2004) 'Distinctive image features from scale-invariant key points', Int. J. Comput. Vis., Vol. 60, No. 2, pp. 91–110.
9. Luo J. and Crandall D. (2006) 'Color object detection using spatial-color joint probability functions', IEEE Transactions, Vol. 15, No. 6, pp. 1443–1453.
10. Matas J. and Chum O. (2004) 'Robust wide-baseline stereo from maximally stable extremal regions', Image Vis. Comput., Vol. 22, No. 10, pp. 761–767.
11. Mortensen E. and Shapiro L. (2005) 'A SIFT descriptor with global context', in Proc. Conf. Comput. Vis. Pattern Recognit., pp. 184–190.
12. Rodriguez J. and Aguiar P. (2009) 'ANSIG—An analytic signature for permutation-invariant two-dimensional shape representation', in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 1–8.
13. Sahbi H., Ballan L., Serra G. and Del Bimbo A. (2013) 'Context-Dependent Logo Matching and Recognition', IEEE.
14. Smeulders A. and Worring S. (2010) 'Content based image retrieval at the end of the early years', IEEE Trans. Pattern Analysis, Vol. 22, No. 12, pp. 1349–1380.
A JOURNEY TOWARDS: TO BECOME THE BEST VARSITY
Mrs. B. Mohana Priya
Assistant Professor in English,
AVIT, Paiyanoor, Chennai.
[email protected]
Abstract : Nowadays, students and parents usually decide to join a university or college based on its reputation rather than its location. This is evident in the case of BITS, a foremost engineering college in India, which was set up in a very small village named Pilani. The best industries usually recruit candidates from the best universities, which creates more competition among universities to become the best in the region. Desiring fruitful results, more and more Indian universities are going ahead with Memoranda of Understanding (MoUs) with best-in-class industries and universities. This creates a gradual improvement in the overall academics and working of those universities, thereby improving their ranking year on year. This article gives an overview of how to improve the reputation of universities from a "soft power" perspective with respect to other universities in the region. The study mainly focuses on the Indian context and how universities benefit from it.
Keywords: Best Indian University, Soft Power, Requirements for betterment
INTRODUCTION: IDEAS IN CONFLICT
"Indian higher education is completely
regulated. It's very difficult to start a private university.
It's very difficult for a foreign university to come to India.
As a result of that our higher education is simply not
keeping pace with India's demands. And that is leading to
a lot of problems which we need to address" Nandan
Nilekani (2009).
As the University Grants Commission (UGC),
the apex body regulating higher education in India, marks
its 60th anniversary — it was inaugurated on December
28, 1953 — some introspection is in order. The
democratisation of the higher education system and
improved and expanded access and opportunities are
some of the milestones of the last half-a-century.
However, there are concerns expressed by all
stakeholders that the current models of governance of
universities do not inspire confidence about an
appropriate framework to regulate them. Several issues
need to be examined in the context of the existing
framework for regulating universities.
The existing model is based on deep and
pervasive distrust among regulators over the possibility of
universities doing things on their own, and doing it well.
The current framework, which requires universities to be constantly regulated by laws, rules, regulations, guidelines and policies set by the government and the regulatory bodies, has not produced the best results.
Universities need to see students as their customers, and their aim should be to satisfy customer needs. This study mainly identifies the needs of the students, which drives universities to do more and more for them. There is both demand and supply involved in the case of universities, so universities need to look for the best among the students to create a better cycle between the two. In this connection, the term "soft power" is now widely used in international affairs by analysts and statesmen. The phrase was coined by Joseph Nye of Harvard University in his 1991 book, Bound to Lead: The Changing Nature of
American Power. He further developed the concept in his
2004 book, Soft Power: The Means to Success in World
Politics. For example, in 2007, Chinese President Hu
Jintao told the 17th Communist Party Congress that
China needed to increase its soft power, and the
American Secretary of Defense Robert Gates spoke of the
need to enhance American soft power by "a dramatic
increase in spending on the civilian instruments of
national security--diplomacy, strategic communications,
foreign assistance, civic action and economic
reconstruction and development.
"Soft power is the ability to obtain what you
want through co-option and attraction. It is in
contradistinction to 'hard power', which is the use of
coercion and payment. It is similar in substance but not
identical to a combination of t he second dimension
(agenda setting) and the
third dimensions (or the
radical dimension) of power as expounded by Steven
Lukes in Power a Radical View. Soft power can be
wielded not just by states, but by all actors in
international politics, such as NGO's or international
institutions. The idea of attraction as a form of power did
not originate with Nye or Lukes, and can be dated back to
such ancient Chinese philosophers as Lao Tsu in the 7th
century BC, but the modern development dates back only
to the late 20th century. Soft power is also widely talked about in India. "It's not the side of the bigger army that wins, it's the country that tells a better story that prevails." This was said by the Minister of State for External Affairs, Shashi Tharoor (2009), in Mysore.
LEVELS IN UNIVERSITIES:
While considering the level or ranking of universities, two varied factors need to be taken care of. Various surveys have been analysed, including the Dataquest-IDC-Nasscom survey, the Outlook survey and the India Today survey.
From these, five basic factors have been identified; in this article they are referred to as the "Basic Requirement Level". For the public, the parents and the students, the first and foremost consideration is this Basic Requirement Level: universities have to comply with the rules and regulations laid down by the UGC and the Government of India, and they also need to look into the future needs of the students, who are the customers of the universities. Looking at basic requirements, this study mainly focuses on the following five factors:
1. Infrastructure
2. Faculty Eminence
3. Accessibility to Industries
4. Quality of Students
5. Placement Offered

The second factor is the "Soft Power Level" of the university. A university with a better Soft Power Level will get a better response from its customers, including the students and the parents. A university will have the highest level of Soft Power if it has best-in-class features for all five factors of the Basic Requirement Level. Since it is not possible to achieve best-in-class features in all five factors in a short period, some universities are coming up with a variety of measures to improve their Soft Power Level. Considering educational institutions, we can place universities in three levels in the context of Soft Power:

1. Universities with High Soft Power
2. Universities with Medium Soft Power
3. Universities with Low Soft Power

Universities with Low Soft Power have just fulfilled the Basic Requirement Level, which complies with the criteria set by the government/UGC, but they lack attention from the public or the media. It is fairly easy for such universities to move up the ladder to the second level (Universities with Medium Soft Power) through various public-attention activities, such as convocations involving famous persons or by organising renowned seminars and conferences. Usually, these universities take two to five years to achieve the second (medium) level of Soft Power. The time varies mainly with the university's capability and the measures it takes to attract the public and the media.

                            Universities with    Universities with     Universities with
                            Low Soft Power       Medium Soft Power     High Soft Power
Five Basic Requirements     Available            Available             Best-in-class
Activities for Soft Power   No                   Yes                   Yes

From the above table, it seems comparatively easy for a university to reach the second (medium) level of Soft Power with some moderate measures. But it is not that easy for a university to reach the topmost level of Soft Power; for that it needs very structured plans for achieving best-in-class facilities for all five basic requirements. This study therefore concentrates mainly on how to become the best university in terms of Soft Power.

ACTIVITIES FOR BECOMING THE BEST UNIVERSITY - SOFT POWER CONTEXT
As mentioned initially, the best universities will have high Soft Power compared with other universities. For that, a university should have best-in-class facilities for all five factors of the Basic Requirement Level. Activities for improving each of those factors are mentioned below.

Industrial Participation: Industrial participation can be enhanced by including eminent industrialists on the university board and involving them in critical decisions, such as the syllabus of industry-specific courses. A good alumni base can also improve industrial participation, since the university's old students will happily improve its standing. Seminars and conferences also bridge the gap between universities and industries.
Benchmarking: Benchmarking with the best universities
needs to be done at regular intervals. The team monitoring this should be capable of fixing the
criteria. Also, the benchmarked universities need to be
carefully fixed after a thorough evaluation on the same.
This process will help in improving all the five factors in
the "Basic Requirement Level" which comprises of
Infrastructure availability, faculty eminence, accessibility
to industries, quality of students and placement offered.
Quality Students ↔ Placement link: There is always a
strong link between quality of students and the
placement. If there are more quality students in a
university, there is high chance of better placement there.
Also, this is a cycle too. Quality students will join a
university only if it has better placement assistance. At
the same time, placement will be high if the student
quality is high. While assessing the student quality level, both their initial knowledge level and the university's contribution to enhancing it play a vital role.
Good Alumni Association: A strong alumni base is a real long-term asset for the university. Alumni help keep the university updated on the real issues happening in industry, which helps the university to update its curriculum accordingly. Also, there will be continuous
support from the alumni group for the improvement of
the university.
ANALYSIS & VIEWS:
This article is an initial study concentrating mainly on improving soft power among universities, which in turn creates a best university. Further, more detailed study is required for implementing it in a real case study in universities. This study was conducted mainly in the Indian context. For future articles, further study comparing Indian universities with foreign universities will give positive results and new ideas to the universities, and the impact of university soft power on students also needs to be analysed. A mathematical approach involving various samples can give more relevant data and can be studied further in future articles.
REFERENCES:
1. Joseph S. Nye Jr., 'Bound to Lead: The Changing Nature of American Power,' Basic Books, New York, 1991.
2. Joseph S. Nye Jr., 'Soft Power: The Means to Success in World Politics,' Public Affairs, 2004.
3. Nandan Nilekani (February 2009), talk by Nandan Nilekani in Long Beach, California, on his ideas for India's future. http://www.ted.com/talks/nandan_nilekani_s_ideas_for_india_s_future.html
4. Shashi Tharoor (November 2009), talk by Shashi Tharoor in Mysore, India, on why nations should pursue soft power. http://www.ted.com/talks/lang/eng/shashi_tharoor.html
5. Steven Lukes, 'Power and the battle for hearts and minds: on the bluntness of soft power,' in Felix Berenskoetter and M.J. Williams, eds., Power in World Politics, Routledge, 2007.
EXTRACTION OF 3D OBJECT FROM 2D OBJECT
Diya Sharon Christy1, M. Ramasubramanian2
1PG Scholar, Lecturer; 2Associate Professor
Department of Computer Science and Engineering, Aarupadai Veedu Institute of Technology, Vinayaka Missions University, Rajiv Gandhi Salai (OMR), Paiyanoor-603104, Kancheepuram District, Tamil Nadu, India
Abstract - Gesture recognition has been studied vigorously for communication between human and computer in an interactive computing environment. Many studies have proposed efficient recognition algorithms using 2D camera-captured images, but these methods have a limitation: the extracted features cannot fully represent the object in the real world. Although many studies use 3D features instead of 2D features for more accurate gesture recognition, the processing time needed to generate 3D objects is still an unsolved problem in the related research. Therefore, we propose a new method to extract 3D features combined with 3D object reconstruction. Our proposed method uses an enhanced GPU-based visual hull generation algorithm. This algorithm disables unnecessary processes, such as the texture calculation, and produces a nearest boundary, a farthest boundary, and a thickness of the object projected on the base plane. In the experimental section, we present the results of the proposed method on ten human postures (T shape, both hands up, right hand up, left hand up, hands front, bend with hands up, bend with hands front, stand, sit and bend) and compare the computation time of the proposed method with that of previous methods.
Keywords: 3D Object Extraction, Gesture Recognition, Computer Vision.
I. INTRODUCTION
The recognition algorithm is much significant to
the interactive computing environment. In addition to
that, the processing time and recognition accuracy are the
main factors of the recognition algorithm. Hence, various
researches related to these concerns have been studied in
the last decade. Generally, computer vision-based
recognition algorithms will use 2D images for extracting
features. The 2D images can be used efficiently, when the
position of the camera and viewing direction are fixed.
The features extracted from 2D input images are invariant to scale, translation, and rotation in 2D planes. However, even though the targets being captured and recognized are 3D objects, features extracted from 2D images carry only 2D information or limited 3D information. In order to solve this problem, many studies have proposed using multi-view images. These methods recognize the objects or
postures using comparison results between camera input
images and multi-view camera captured images that are
captured by real or virtual cameras around the objects of
recognition.
However, these methods spend a very long time generating features and comparing them with the input data, since the accuracy of recognition is proportional to the number of camera-view images. The major problem of these methods is that the features extracted from multi-view images cannot fully represent the 3D information, because the images still contain only 2D information. Therefore, many recent studies propose methods that use reconstructed 3D objects.
Reconstructed 3D objects can represent the positions of the components that make up the 3D object and can provide 3D information to the extracted features. Therefore, these approaches can recognize more accurately than methods that use 2D images. Table I shows the kinds of computer vision-based feature extraction methods using reconstructed 3D objects.
TABLE I: Vision-based 3D feature extraction methods using reconstructed 3D objects

Type of Extracted Features | Algorithm for Feature Extraction | Authors [paper no.]                                          | Feature Extraction Time (s)
Graph                      | Reeb graph                       | M. Hilaga et al. [8]                                         | 1
Graph                      | 3D thinning                      | H. Sundar et al. [9]                                         | 10
Graph                      | Curve-skeleton                   | N. D. Cornea et al. [10,11]; A. Brennecke, T. Isenberg [12]  | 10^3
Histogram                  | 3D bin-distribution              | C. Chu, I. Cohen [5]; D. Kyoung et al. [6]                   | Less than 0.1
Histogram                  | Spherical harmonic               | T. Funkhouser et al. [7]                                     | Less than 1
The methods using the structural features of 3D objects are more accurate for recognition, because they extract the features using 3D data of each component that constitutes the subject of recognition (Table I). However, the methods producing skeletons from 3D objects [8-12] require a long time, since they are divided into two processes: 3D object reconstruction and feature extraction.
To solve this problem, methods have been proposed that use a spherical harmonic [7] or a 3D bin-distribution algorithm [5,6] for fast feature extraction and represent the distances between the centre point and the boundary points by a histogram. However, even though these methods can represent the global shape, they cannot represent local characteristics.
Because the 3D object reconstruction part still exists, it is difficult to apply these 3D-object-based methods to a real-time recognition environment. In this paper, we propose a method for real-time 3D feature extraction without explicit 3D object reconstruction. Fig. 1 shows the difference between the previous feature extraction methods and the proposed one in terms of their processes.
II. VISUAL HULL
For extracting the features of dynamic 3D
objects, we can use the video streams or images as input
from multiple cameras, and reconstruct an approximate
shape of the target object from multiple images.
By rendering the reconstructed object, we are
able to obtain projection maps, which can be used as
important features of object. For the purpose of
reconstructing and visualizing the dynamic object, the
visual hull can be used. It has been widely used as 3D
geometry proxy, which represents a conservative
approximation of true geometry [12].
We can reconstruct a visual hull, of an object
with the calibrated cameras and the object's silhouette, in
multiple images. The shadow or the silhouette of the
object in an input image refers to the contour separating
the target object from the background. Using this
information, combined with camera calibration data, the
silhouette is projected back into the 3D scene space from
the cameras' center of the projection. This generates a
cone-like volume (silhouette cone) containing the actual
3D object. With multiple views, these cones can be
intersected. This produces the visual hull of the object (Fig. 2).
Fig. 1 The 3D feature extraction processes:
(a) process of existing studies and (b) proposed method
This method can generate three kinds of features
which contain different types of 3D information. Nearest
boundary, Farthest boundary, and thickness of the object
projected on a base plane. The projection map can be
obtained by rendering the target object. For this purpose,
the visual hulls can be used as a 3D geometry proxy. It is
an approximate geometry representation resulting from
the shape-from-silhouette 3D reconstruction method[13].
The visual hull reconstruction algorithm and
rendering can be accelerated by the modern graphics
hardware. Li et al.[14] presented a hardware-accelerated
visual hull (HAVH) rendering technique. As we are
extracting features from the results of the visual hull
rendering, our proposed method does not need explicit
geometric representation. Therefore we use the enhanced
HAVH algorithm which disables unnecessary processes,
such as the calculation of texture, in the general HAVH
algorithm. Moreover, we can save the drawing time by
disabling all lighting and texture calculations for this
rendering. These processes are not necessary for feature
extraction (Fig. 1(b)). The structure of the paper is as
follows. We describe the visual hull in Section II. Next,
we describe the details of our methods in Section III: the
silhouette extraction (Section III.A), the visual hull
rendering (Section III.B) and the projection map
generation (Section III.C). The Experimental results are
provided in Section IV. And, we conclude in Section V.
Fig. 2 Visual hull reconstruction: reconstructed 3D surface
Different implementations of visual hull
reconstruction are described in the literature [14-17].
Some of them computes an explicit geometric
representation of the visual hull, either as voxel volume
[15] or polygonal mesh [16]. However, if the goal is
rendering visual hulls from novel viewpoints, the
reconstruction does not need to be explicit. Li et al. [14]
present a hardware-accelerated visual hull (HAVH)
rendering technique. It is a method for rendering of visual
hull without reconstruction of the actual object. The
implicit 3D reconstruction is done in rendering process by
exploiting projective texture mapping and alpha map
trimming. It runs on the modern graphics hardware and
achieves high frame rates.
We can obtain projection maps for feature
extraction by rendering the visual hull. The explicit
geometry representation is not needed for this process. Moreover, explicit geometry reconstruction is a very time-consuming process. Instead of reconstructing the 3D visual hull geometry, we render the visual hull directly from the silhouettes of the input images using the HAVH method and obtain the projection maps from the rendering results.
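The paper's pipeline renders the visual hull on the GPU (HAVH) rather than reconstructing it explicitly, but the underlying visual hull idea can be illustrated with a simple CPU voxel-carving sketch: every voxel that projects inside all silhouettes is kept. The projection function and the voxel grid below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def carve_visual_hull(silhouettes, project, grid_points):
    """Naive voxel carving: keep grid points whose projections fall inside every silhouette.

    silhouettes : list of HxW boolean masks (True = foreground)
    project     : project(cam_index, points_Nx3) -> Nx2 integer pixel coordinates (assumed given)
    grid_points : Nx3 array of voxel centre coordinates
    """
    keep = np.ones(len(grid_points), dtype=bool)
    for cam, sil in enumerate(silhouettes):
        uv = project(cam, grid_points)                       # project all voxels into this view
        h, w = sil.shape
        inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        on_fg = np.zeros(len(grid_points), dtype=bool)
        on_fg[inside] = sil[uv[inside, 1], uv[inside, 0]]    # foreground test per projected voxel
        keep &= on_fg                                        # a hull voxel must lie inside all silhouettes
    return grid_points[keep]
```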
III. FAST FEATURE EXTRACTION
In order to extract the features of a dynamic 3D
object, we render a visual hull of the target object from
multiple input images. By using HAVH rendering
method, we can render the visual hull without
reconstructing the actual object in the real time. From the
rendering results of the visual hull, we can obtain the
projection maps which contain 3D information of the
target object, such as nearest boundary, farthest boundary,
and thickness of the object (Fig. 3). They can be used as
important features of the target object.
The projection maps are obtained by rendering
the target object. When an object is rendered by the 3D
graphics card, depth of a generated pixel is stored in a
depth buffer. The depth buffer can be extracted and will
be saved as a texture [18], called a depth map. By
rendering the front-most surfaces of the visual hull, we
will get the depth map which stores the distance from a
projection plane to the nearest boundary. It is called a
nearest boundary projection map (Fig. 3(a)). Likewise,
we will get a farthest boundary projection map by
rendering the rear-most surfaces of the visual hull (Fig.
3(b)). By subtracting the values from the two maps, we
can get a thickness map which stores the distance
between the front-most surfaces and rear-most surfaces
Fig. 3 Projection maps: (a) nearest boundary projection
map, (b) farthest boundary projection map, and (c)
thickness map
A. Silhouette Extraction: When the images are captured from multiple cameras, an object's silhouette can be computed in each of them. The target object in each captured image (Ic) is segmented from the background (Ib), and the information is stored in silhouette images (S). The alpha values of a silhouette image are set to 1 for the foreground object and to 0 for the background, as in (1).
(1)
Silhouettes or shadows are then generated from
each silhouette image. The silhouette of the object in a
silhouette image refers to the collection of all edges
separating the foreground object from the background.
Using this information, combined with calibrated
cameras, we are able to generate silhouette cones by
projecting back each silhouette into 3D scene space.
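A minimal sketch of the silhouette image S in (1), computed by comparing the captured image Ic against the background image Ib, is shown below; the per-pixel threshold tau is an assumption, since the paper does not specify one.

```python
import numpy as np

def silhouette(ic, ib, tau=30):
    """Binary silhouette: alpha = 1 where the captured image differs from the background."""
    diff = np.abs(ic.astype(np.int32) - ib.astype(np.int32))
    if diff.ndim == 3:                    # colour images: use the maximum channel difference
        diff = diff.max(axis=2)
    return (diff > tau).astype(np.uint8)  # 1 = foreground object, 0 = background
```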
B. Visual Hull Rendering: The visual hull
surfaces can be determined on the graphics hardware by
exploiting projective texturing in conjunction with alpha
blending while rendering silhouette cones. As shown by
Fig. 5(a), for rendering the silhouette cone of the nth camera, all the silhouette images S1, S2, S3, …, Sn-1 are used as masks, eliminating the portions of each cone
that do not lie on the surface of the visual hull. In the
texture unit, the alpha values projected from multiple
textures are modulated. As a result, only those pixels,
projected with the alpha value 1 from all the other
silhouette images will produce the output alpha value
1(Fig. 5(b)). Thus, the visual hull faces are drawn. All the
polygons of silhouette cones are still rendered entirely,
but using the alpha testing, only the correct parts of them
are actually generating the pixels in image.
Our method consists of two major parts as
shown in Fig. 4. When images are captured from
cameras, an object's silhouette can be extracted in the
multiple images. Using this information, combined with
calibration data, we can render the visual hull of the
target object. We are able to obtain projection maps of the
object while rendering the visual hull.
Fig. 5 Silhouette cone rendering:(a) while rendering each
silhouette cone, it is protectively textured by the
silhouette images from all other views. (b) alpha map
trimming, alpha values from multiple textures are
modulated. Hence, visual hull faces are drawn
Fig. 4 Work flow of our method.
C. Generation of Projection Map: We can compute the
distance from the projection plane to front surfaces of a
target object as well as the distance to rear-most surfaces.
Now we are able to compute the thickness of the object,
which is the distance between front-most surfaces and
rear-most surfaces. Consider the example in Fig. 6. Given
a vector perpendicular to the projection plane, we can
find hit points: P1 on the front-most surface and P2 on the
rear-most surface. The distance between P1 and P2 can be
computed. It equals to ||P2-P1||.
Thus we can generate the projection map by
rendering the target object. First, we set a virtual camera
to be able to view the 3D object. The object from a
viewpoint is then projected onto the camera's projection
plane. An orthographic projection can be used in order to avoid perspective projection distortion. Rasterization, which is the process of converting geometric primitives into pixels, determines the viewing direction and its hit point. In rendering the object's front-most surface, the hit point P1 on the front-most surface along the viewing direction is easily extracted for each pixel and saved in a buffer. Likewise, the hit point P2 on the rear-most surface can be obtained by re-rendering the object from the same viewpoint and saved
in another buffer. With the information from the two
buffers, now we can compute the distance.
Fig. 6 Distance from a projective plane to front-most of
an object, distance to rearmost surfaces, and its thickness
To implement, we can generate projection maps
using depth information from the viewpoint by rendering
the target object. When an object is rendered by 3D
graphics card, the depth of a generated pixel is stored in a
depth buffer. It is done in hardware. The depth buffer can
be extracted and saved as a texture, called a depth map. It
is usual to avoid updating the color buffers and disable all
lighting and texture calculations for this rendering in
order to save drawing time. We render the target object
from a viewpoint with the depth test reversed in order to
draw the rear-most faces of the object. From this
rendering, the depth buffer is extracted and stored in a
texture, which is a farthest boundary map (Fig. 7(a)). To
obtain a nearest boundary map, we have to render the
object, again from the same viewpoint with the normal
depth test by passing the fragments closer to the
viewpoint (Fig. 7(b)). We can compute the distance by
subtracting the values from the two depth buffers in order
to generate a thickness map. This can be done with the multiple-texture blending function (Fig. 7(c)).
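In NumPy terms, the projection-map arithmetic described here reduces to a per-pixel subtraction of the two extracted depth buffers. The sketch below assumes the near and far depth maps have already been read back from the depth buffer, and that background pixels hold a sentinel depth of 1.0; both are assumptions for illustration.

```python
import numpy as np

def thickness_map(near_depth, far_depth, background=1.0):
    """Thickness = distance between rear-most and front-most surfaces per pixel."""
    fg = near_depth < background                 # pixels covered by the rendered visual hull
    thick = np.zeros_like(near_depth)
    thick[fg] = far_depth[fg] - near_depth[fg]   # ||P2 - P1|| along the orthographic view ray
    return thick
```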
Fig. 7 Projection map generation using depth map. (a),(b),
and (c) are 1D version of projection maps of an object
shown in left: (a) farthest boundary projection map stores
the depth from view plane to rear-most surface, (b)
nearest boundary projection map stores the depth values
of front-most surface, (c) thickness map is generated by
subtracting (b) from (a)
IV. EXPERIMENTAL RESULTS
This section demonstrates our results of the fast
feature extraction. All images have been generated on a
2.13 GHz CPU with 2Gbyte memory and an nVidia
GeForce 8800GTX graphic card, using Direct3D. We
used ten cameras to acquire input images. The cameras
were positioned in and around the object accurately
calibrated system. The resolution of both acquired images
and rendered result images was set to 640X480. Under
this setting, we have to measure the speed of our method.
We obtained the ten silhouette cones from silhouette
images. It took around 8ms per image on the CPU.
However, we did not include the calculation time of this process, since it is a common factor for all algorithms. Generating a single projection map by rendering the front-most or rear-most surfaces of the visual hulls, which is the process for a nearest or farthest boundary projection map, took around 1.50 ms. The generation time for a thickness map, including the generation of two projection maps and the distance computation by rendering the visual hull twice, was about 3.0 ms (Table II).
TABLE II: Comparison of the proposed method with existing 3D feature extraction methods which use explicit 3D models

Method                          | Visual Hull Generation | Feature Extraction
Thinning-based skeletonization  | 370 ms                 | 107 ms
3D bin distribution             | 370 ms                 | 10 ms
Proposed method                 | 3 ms                   | 3 ms
Experimental results show that the proposed
method provides high accuracy of recognition and fast
feature extraction. Table II shows the comparison of the
proposed method with the 3D feature extraction methods
which use explicit 3D models. For this experiment, we
use the 3D models that are reconstructed in a voxel space of 300X300X300 size. Because we generate the projection maps using GPU programming without explicit 3D object reconstruction, the proposed method is faster than the other methods and can manage 14 or 15 image sets per second; therefore it is suitable for a real-time recognition system. Fig. 10 shows the silhouette images, in which only the foreground objects are extracted from the camera-captured images (Fig. 8(a)), and the projection maps generated using the reconstructed 3D objects. In this paper, we use ten kinds of human posture images, and the projection maps are generated using a top-view camera with an orthographic projection. Because the human postures are limited to the z=0 plane and the top-view image is invariant to translation, scaling and rotation, we can use the top-view image. As shown by
Fig. 8(b), there are many similar silhouette images in
different posture and different camera views. However,
the projection maps can represent the difference of each
posture, since they have the 3D information of each
posture (Fig. 8(c-e))
Fig. 10 Extracted features from the captured images of 10
human postures: (a) camera captured images, (b)
silhouette images, (c) nearest boundary projection map,
(d) farthest boundary projection map and (e) thickness
map
V. CONCLUSION
In this paper, we proposed a 3D feature extraction method. The proposed method generates three kinds of projection maps, which project all data onto the z=0 plane using the input images of the multi-view camera system, instead of a 3D object. This method presents the 3D information of the object in the input images quickly, because we use the enhanced HAVH algorithm in which unnecessary processes, such as the lighting and texture calculations, are disabled. Therefore, the proposed method can be applied to a real-time recognition system. However, some problems remain: error in the visual hull rendering, a limitation on the number of cameras, data transfer time between memories, and distance calculation where components overlap. In our method, we use the silhouette-based visual hull rendering algorithm, but this algorithm cannot generate an accurate 3D object, because the silhouette images are binary and do not carry the input object's texture information, and our method cannot use more than 16 camera images. However, this is a hardware limitation, and we can solve this problem using a parallel visual hull rendering method. Finally, the proposed method cannot detect the z-position of arms or legs, because we calculate only the distance between the nearest and the farthest parts from the camera. We are now studying how to reduce the transfer time and improve accuracy to provide good performance.
REFERENCES
[1] Kwangjin Hong, Chulhan Lee, Keechul Jung, and
Kyoungsu Oh, “Real-time 3D Feature Extraction
without Explicit 3D Object Reconstruction”2008,
pp. 283-288.
[2] J. Loffler, “Content-based retrieval of 3d models in
distributed web databases by visual shape
information,” in Proc. 4th International Conf.
Information Visualization, 2000, pp. 82.
[3] C.M. Cyr, B.B. Kimia, "A similarity-based aspect-graph approach to 3d object recognition," International J. Computer Vision, vol. 57, 2004, pp. 5-22.
[4] P. Min, J. Chen, T. Funkhouser, “A 2d sketch
interface for a 3d model search engine,” in Proc. the
International Conf. Computer Graphics and
Interactive Techniques, 2002, pp. 138.
[5] C. Chu, I. Cohen, “Posture and gesture recognition
using 3d body shapes decomposition,” in Proc. the
IEEE Computer Society Conf. CVPR, vol. 3, 2005,
pp. 69.
[6] D. Kyoung, Y. Lee, W. Baek, E. Han, J. Yang, K.
Jung, "Efficient 3d voxel reconstruction using pre-computing method for gesture recognition," in Proc.
Korea-Japan Joint Workshop, 2006, pp. 67-73.
[7] T. Funkhouser, P. Min, M. Kazhdan, J. Chen, A.
Halderman, D. Dobkin, D. Jacobs, “A search engine
for 3d models,” ACM Trans. Graphics, vol. 22,
2003, pp. 83-105.
[8] M. Hilaga, Y. Shinagawa, T. Kohmura, T. Kunii,
“Topology matching for fully automatic similarity
estimation of 3d shapes.” in Proc. the 28th annual
Conf. Computer graphics and interactive techniques,
2001, pp.203-212.
[9] H. Sundar, D. Silver, N. Gagvani, S. Dickinson,
“Skeleton based shape matching and retrieval,” in
Proc. International Conf. Shape Modeling
International, 2003, pp. 130-139.
[10] N.D. Cornea, D. Silver, P. Min, “Curve-skeleton
properties, applications and algorithms,” IEEE
Trans. Visualization and Computer Graphics, vol.
13, 2007, pp. 530-548.
[11] N.D. Cornea, D. Silver, X. Yuan, R. Balasubramanian, "Computing hierarchical curve-skeletons of 3d objects," The Visual Computer, vol. 21, 2005, pp. 945-955.
[12] A. Brennecke, T. Isenberg, “3d shape matching
using skeleton graphs,” in Proc. Simulation and
Visualization, vol. 13, 2004, pp. 299-31
[13] A. Laurentini, “The visual hull concept for
silhouette-based image understanding,” IEEE Trans.
Pattern Analysis and Machine Intelligence, vol. 16,
1994, pp. 150-162.
[14] M. Li, M. Magnor, H. Seidel, "Hardware-accelerated visual hull reconstruction and rendering," in Proc. Graphics Interface, 2003, pp. 65-71.
[15] R. Szeliski, "Rapid octree construction from image sequences," CVGIP: Image Underst., vol. 58, 1993, pp. 23-32.
[16] W. Matusik, C. Buehler, L. McMillan, "Polyhedral visual hulls for real-time rendering," in Proc. the 12th Eurographics Workshop on Rendering Techniques, 2001, pp. 115-126.
[17] C. Lee, J. Cho, K. Oh, “Hardware-accelerated
jaggy-free visual hulls with silhouette maps,” in
Proc. the ACM Sym. Virtual Reality Software and
Technology, 2006, pp. 87-90
[18] C. Everitt, A. Rege, C. Cebenoyan, “Hardware
shadow mapping,” Technical report, NVIDIA.
AUTHOR
Mr. M. Ramasubramanian received his B.Sc. and M.C.A. degrees from Bharathidasan University in 1997 and 2000 respectively, and received his M.E. degree in the field of Computer Science and Engineering from Vinayaka Missions University in 2009. He is presently working as a Senior Assistant Professor at Aarupadai Veedu Institute of Technology, Vinayaka Missions University, India. He is doing his research in the area of Image Processing at the same university, under the guidance of Dr. M.A. Dorai Rangaswamy.
CLOUD BASED MOBILE SOCIAL TV
Chandan Kumar Srivastawa¹, Mr. P.T. Sivashankar²
¹Student – M.E. (C.S.E.), ²Associate Professor
¹,²Department of Computer Science & Engineering, Aarupadai Veedu Institute of Technology, Vinayaka Nagar, Paiyanoor
¹[email protected]
Abstract
The rapidly increasing power of personal mobile devices is providing much richer content and social interactions to users on the move. This trend, however, is throttled by the limited battery lifetime of mobile devices and unstable wireless connectivity, making the highest possible quality of service infeasible for mobile users. The recent cloud computing technology, with its rich resources to compensate for the limitations of mobile devices and connections, can potentially provide an ideal platform to support the desired mobile services. Tough challenges arise in how to effectively exploit cloud resources to facilitate mobile services, especially those with stringent interaction delay requirements. In this paper, we propose the design of a novel cloud-based mobile social TV system.
Literature survey
A literature survey is the most important step in the software development process. Before developing the tool, it is necessary to determine the time factor, economy and company strength. Once these things are satisfied, the next step is to determine which operating system and language can be used for developing the tool. Once the programmers start building the tool, they need a lot of external support. This support can be obtained from senior programmers, from books or from websites. Before building the system, the above considerations were taken into account for developing the proposed system.
INTRODUCTION
Thanks to the revolutionary “reinventing the
phone” campaigns initiated by Apple Inc. in 2007, smart
phones nowadays are shipped with multiple
microprocessor cores and gigabytes of RAM; they possess
more computation power than personal computers of a
few years ago. On the other hand, the wide deployment of
3G broadband cellular infrastructures further fuels the
trend. Apart from common productivity tasks like emails
and web surfing, smart phones are flexing their strengths
in more challenging scenarios such as real time video
streaming and online gaming, as well as serving as a main
tool for social exchanges.
Although many mobile social or media
applications have emerged, truly killer ones gaining
mass acceptance are still impeded by the limitations of
the current mobile and wireless technologies, among
which battery lifetime and unstable connection bandwidth
are the most difficult ones. It is natural to resort to cloud
computing, the newly-emerged computing paradigm for
low-cost, agile, scalable resource supply, to support
power-efficient mobile data communication. With
virtually infinite hardware and software resources, the
cloud can offload the computation and other tasks
involved in a mobile application and may significantly
reduce battery consumption at the mobile devices, if a
proper design is in place. The big challenge in front of us
is how to effectively exploit cloud services to facilitate
mobile applications. There have been a few studies on
designing mobile cloud computing systems, but none of
them deal in particular with stringent delay requirements
for spontaneous social interactivity among mobile users.
In this paper, we describe the design of a novel mobile
social TV system, CloudMoV, which can effectively
utilize the cloud computing paradigm to offer a living-room experience of video watching to disparate mobile
users with spontaneous social interactions. In CloudMoV,
mobile users can import a live or on-demand video to
watch from any video streaming site, invite their friends
to watch the video concurrently, and chat with their
friends while enjoying the video. It therefore blends
viewing experience and social awareness among friends
on the go. As opposed to traditional TV watching, mobile
social TV is well suited to today’s life style, where family
and friends may be separated geographically but hope to
share a co-viewing experience. While social TV enabled
by set-top boxes over the traditional TV systems is
already available, it remains a challenge to achieve
mobile social TV, where the concurrent viewing
experience with friends is enabled on mobile devices.
We design CloudMoV to seamlessly utilize agile resource
support and rich functionalities offered by both an IaaS
(Infrastructure-as-a-Service) cloud and a PaaS (Platform-as-a-Service) cloud. Our design achieves the following
goals.
Encoding flexibility: Different mobile devices have differently sized displays, customized playback hardware, and various codecs. Traditional solutions would adopt a few encoding formats ahead of the release of a video program. But even the most generous content providers would not be able to attend to all possible mobile platforms, if not only to the current hottest models. CloudMoV customizes the streams for different devices in real time, by offloading the transcoding tasks to an IaaS cloud. In particular, we employ a novel surrogate for each user, which is a virtual
machine (VM) in the IaaS cloud. The surrogate
downloads the video on behalf of the user and
transcodes it into the desired formats, while
catering to the specific configurations of the
mobile device as well as the current connectivity
quality.
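As a rough illustration of this per-device customization, the sketch below (in Java, with class names and parameter values that are illustrative assumptions, not taken from the CloudMoV implementation) picks a stream dimension and bit rate from the handset’s reported display size and the bandwidth currently measured by the surrogate.

```java
// Hypothetical sketch: choosing per-device encoding parameters on the surrogate.
public class EncodingProfileSelector {

    /** Simple value object describing the target stream. */
    public static class Profile {
        final int width, height, bitrateKbps;
        final String container;

        Profile(int width, int height, int bitrateKbps, String container) {
            this.width = width;
            this.height = height;
            this.bitrateKbps = bitrateKbps;
            this.container = container;
        }

        @Override
        public String toString() {
            return width + "x" + height + " @ " + bitrateKbps + " kbps (" + container + ")";
        }
    }

    /**
     * Pick a profile from the device's reported display size and the bandwidth
     * currently measured between the surrogate and the handset.
     */
    public Profile select(int displayWidth, int displayHeight, int measuredKbps) {
        // Do not encode larger than the device display (720p assumed as an upper bound).
        int width = Math.min(displayWidth, 1280);
        int height = Math.min(displayHeight, 720);

        // Leave ~20% headroom so short bandwidth dips do not stall playback.
        int bitrate = (int) (measuredKbps * 0.8);

        return new Profile(width, height, bitrate, "mpegts");
    }

    public static void main(String[] args) {
        Profile p = new EncodingProfileSelector().select(960, 640, 1500);
        System.out.println("Chosen profile: " + p);   // e.g. 960x640 @ 1200 kbps (mpegts)
    }
}
```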
Battery efficiency: A breakdown analysis conducted by Carroll et al. indicates that the network modules (both Wi-Fi and 3G) and the display contribute a significant portion of the overall power consumption in a mobile device, dwarfing the usage of other hardware modules including the CPU, memory, etc. We target energy savings in the network module of smartphones through an efficient data transmission mechanism design. We focus on 3G wireless networking as it is more widely used and more challenging in our design than Wi-Fi-based transmissions. Based on cellular network traces from real-world 3G carriers, we investigate the key 3G configuration parameters, such as the power states and the inactivity timers, and design a novel burst transmission mechanism for streaming from the surrogates to the mobile devices. The burst transmission mechanism makes careful decisions on burst sizes and opportunistic transitions among high/low power consumption modes at the devices, in order to effectively increase the battery lifetime.
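The following back-of-the-envelope sketch uses assumed power and timer values (not measured carrier traces) to illustrate why larger bursts help: each burst pays one “tail” of high-power radio time while the inactivity timer runs, so fewer, larger bursts waste less energy.

```java
// A minimal sketch of the intuition behind burst transmission: send the stream in
// large bursts so the 3G radio can fall back to a low-power state between bursts,
// instead of trickling data and keeping the radio in its high-power state.
public class BurstScheduler {

    // Illustrative 3G parameters; real values would come from carrier traces.
    static final long INACTIVITY_TIMER_MS = 5_000;   // time before the radio is demoted to idle
    static final double HIGH_POWER_WATTS  = 0.8;     // radio power while active
    static final double TAIL_ENERGY_J     = HIGH_POWER_WATTS * (INACTIVITY_TIMER_MS / 1000.0);

    /**
     * Estimate the energy of delivering totalBytes in bursts of burstBytes at
     * throughputBps. Each burst pays its transmission energy plus one "tail"
     * spent waiting for the inactivity timer to expire.
     */
    static double estimateEnergyJoules(long totalBytes, long burstBytes, long throughputBps) {
        long bursts = (totalBytes + burstBytes - 1) / burstBytes;     // ceiling division
        double txSeconds = (double) totalBytes / throughputBps;
        return txSeconds * HIGH_POWER_WATTS + bursts * TAIL_ENERGY_J;
    }

    public static void main(String[] args) {
        long video = 50L * 1024 * 1024;   // a 50 MB stream segment
        long rate  = 500_000;             // ~500 KB/s downlink
        for (long burst : new long[] {256 * 1024, 1024 * 1024, 8 * 1024 * 1024}) {
            System.out.printf("burst=%d KB -> ~%.1f J%n",
                    burst / 1024, estimateEnergyJoules(video, burst, rate));
        }
        // Larger bursts amortize the tail energy over fewer radio wake-ups.
    }
}
```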
Spontaneous social interactivity: Multiple mechanisms are included in the design of CloudMoV to enable a spontaneous, social co-viewing experience. First, efficient synchronization mechanisms are proposed to guarantee that friends joining a video program may watch the same portion (if they choose to) and share immediate reactions and comments. Although synchronized playback is inherently a feature of traditional TV, current Internet video services (e.g., Web 2.0 TV) rarely offer such a service. Second, efficient message communication mechanisms are designed for social interactions among friends, and different types of messages are prioritized in their retrieval frequencies to avoid unnecessary interruptions of the viewing progress. For example, online friend lists can be retrieved at longer intervals at each user, while invitation and chat messages should be delivered more promptly. We adopt textual chat messages rather than voice in our current design, believing that text chats are less distracting to viewers and easier to read, write and manage for any user.
These mechanisms are seamlessly integrated with functionalities provided by a typical PaaS cloud, via an efficient design of data storage with BigTable and dynamic handling of large volumes of concurrent messages. We exploit a PaaS cloud for social interaction support due to its provision of robust underlying platforms (other than simply hardware resources provided by an IaaS cloud), with transparent, automatic scaling of users’ applications onto the cloud.
Portability: A prototype CloudMoV system is implemented following the philosophy of “Write Once, Run Anywhere” (WORA): both the front-end mobile modules and the back-end server modules are implemented in “100% Pure Java”, with well-designed generic data models suitable for any BigTable-like data store; the only exception is the transcoding module, which is implemented in ANSI C for performance reasons and uses no platform-dependent or proprietary APIs.
The client module can run on any mobile device supporting HTML5, including Android phones, iOS devices, etc. To showcase its performance, we deploy the system on Amazon EC2 and Google App Engine, and conduct thorough tests on iOS platforms. Our prototype can be readily migrated to various cloud and mobile platforms with little effort. The remainder of this paper is organized as follows. In Sec. II, we compare our work with the existing literature and highlight our novelties. In Sec. III, we present the architecture of CloudMoV and the design of the individual modules.
SYSTEM ARCHITECTURE
IMPLEMENTATION
• Implementation is the stage of the project when the theoretical design is turned into a working system. It can thus be considered the most critical stage in achieving a successful new system and in giving the user confidence that the new system will work and be effective.
• The implementation stage involves careful planning, investigation of the existing system and its constraints on implementation, design of methods to achieve the changeover, and evaluation of the changeover methods.
MODULE DESCRIPTION:
1. Transcoder
2. Social Cloud
3. Messenger
4. Gateway
5. Subscribe
Transcoder
• It resides in each surrogate and is responsible for dynamically deciding how to encode the video stream from the video source in the appropriate format, dimension, and bit rate. Before delivery to the user, the video stream is further encapsulated into a proper transport stream. Each video is exported as MPEG-2 transport streams, which is the de facto standard nowadays for delivering digital video and audio streams over lossy media.
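The paper’s transcoding module is written in ANSI C; purely as an illustration, the sketch below assumes an external tool such as ffmpeg is available on the surrogate and shows how a source video could be scaled, rate-limited and wrapped into an MPEG-2 transport stream. The input URL and output name are hypothetical.

```java
import java.io.IOException;

// A sketch only: launches an assumed external transcoder to produce an MPEG-2
// transport stream at the dimension and bit rate chosen for the device.
public class TranscodeToTs {

    static Process transcode(String inputUrl, String output,
                             int width, int height, int bitrateKbps) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(
                "ffmpeg", "-i", inputUrl,
                "-vf", "scale=" + width + ":" + height,   // fit the device display
                "-b:v", bitrateKbps + "k",                // track current connectivity
                "-f", "mpegts",                           // MPEG-2 transport stream container
                output);
        pb.redirectErrorStream(true);                     // merge stderr into stdout
        return pb.start();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical source URL and output segment name.
        Process p = transcode("http://example.com/source.mp4", "segment0.ts", 960, 640, 1200);
        System.out.println("transcoder exited with " + p.waitFor());
    }
}
```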
Social Cloud
• A social network is a dynamic virtual organization with inherent trust relationships between friends. This dynamic virtual organization can be created because these social networks reflect real-world relationships. It allows users to interact, form connections and share information with one another. This trust can be used as a foundation for information, hardware and service sharing in a Social Cloud.
Messenger
• It is the client side of the social cloud, residing in each surrogate in the IaaS cloud. The Messenger periodically queries the social cloud for social data on behalf of the mobile user and pre-processes the data into a lightweight format (plain text files), at a much lower frequency. The plain text files are asynchronously delivered from the surrogate to the user in a traffic-friendly manner, i.e., little traffic is incurred. In the reverse direction, the Messenger disseminates the user’s messages (invitations and chat messages) to other users via the data store of the social cloud.
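A minimal sketch of this prioritized retrieval is given below; the polling intervals and method names are assumptions for illustration, not the actual Messenger API. Chat and invitation messages are polled frequently, while the online friend list is refreshed at a much longer interval.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of prioritized polling in a surrogate-side Messenger.
public class MessengerPoller {

    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);

    void start() {
        // High-priority data: invitations and chat messages, every 2 seconds (assumed value).
        scheduler.scheduleAtFixedRate(this::fetchChatAndInvitations, 0, 2, TimeUnit.SECONDS);
        // Low-priority data: online friend list, every 30 seconds (assumed value).
        scheduler.scheduleAtFixedRate(this::fetchFriendList, 0, 30, TimeUnit.SECONDS);
    }

    void fetchChatAndInvitations() {
        // Placeholder: query the social cloud's data store and write a small plain-text
        // file that the mobile client later pulls asynchronously.
        System.out.println("polling chat/invitation messages");
    }

    void fetchFriendList() {
        System.out.println("refreshing online friend list");
    }

    public static void main(String[] args) {
        new MessengerPoller().start();
    }
}
```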
Gateway
• The gateway provides authentication services for users to log in to the CloudMoV system, and stores users’ credentials in a permanent table of a MySQL database installed on it. It also stores information about the pool of currently available VMs in the IaaS cloud in another, in-memory table. After a user successfully logs in to the system, a VM surrogate is assigned to the user from the pool. The in-memory table is used to guarantee small query latencies, since the VM pool is updated frequently as the gateway reserves and destroys VM instances according to the current workload. In addition, the gateway stores each user’s friend list in a plain text file (in XML format), which is uploaded to the surrogate immediately after it is assigned to the user.
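The sketch below illustrates one way such a pair of tables could look, using JDBC with assumed table names, column names and connection credentials: a permanent credentials table plus a MEMORY-engine table for the frequently changing VM pool, so that pool lookups stay fast.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch of the gateway's two tables and the VM-assignment step (assumed schema).
public class GatewaySchema {

    static void createTables(Connection c) throws SQLException {
        try (Statement st = c.createStatement()) {
            st.execute("CREATE TABLE IF NOT EXISTS users ("
                    + " username VARCHAR(64) PRIMARY KEY,"
                    + " password_hash CHAR(64) NOT NULL)");           // permanent table
            st.execute("CREATE TABLE IF NOT EXISTS vm_pool ("
                    + " vm_id VARCHAR(64) PRIMARY KEY,"
                    + " public_ip VARCHAR(45) NOT NULL,"
                    + " assigned_to VARCHAR(64) NULL"
                    + ") ENGINE=MEMORY");                             // in-memory table
        }
    }

    /** Assign any free surrogate VM to the user who just logged in. */
    static String assignSurrogate(Connection c, String user) throws SQLException {
        try (PreparedStatement find = c.prepareStatement(
                 "SELECT vm_id FROM vm_pool WHERE assigned_to IS NULL LIMIT 1");
             ResultSet rs = find.executeQuery()) {
            if (!rs.next()) return null;                              // pool exhausted
            String vmId = rs.getString(1);
            try (PreparedStatement claim = c.prepareStatement(
                     "UPDATE vm_pool SET assigned_to = ? WHERE vm_id = ?")) {
                claim.setString(1, user);
                claim.setString(2, vmId);
                claim.executeUpdate();
            }
            return vmId;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical connection string and credentials.
        try (Connection c = DriverManager.getConnection(
                "jdbc:mysql://localhost/cloudmov", "gateway", "secret")) {
            createTables(c);
            System.out.println("assigned VM: " + assignSurrogate(c, "alice"));
        }
    }
}
```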
Subscribe
• In this module, the user can download videos. The Subscribe module provides high-speed downloads and clear video streaming. Every authorized user can download and watch the videos.
Transcoding mechanism
• Only one high-quality compressed video is stored.
• No or much less computation is spent on motion estimation.
• It can produce video quality comparable to direct encoding.
CONCLUSION
• Our results prove the superior performance of CloudMoV in terms of transcoding efficiency, timely social interaction, and scalability. In CloudMoV, mobile users can import a live or on-demand video to watch from any video streaming site, invite their friends to watch the video concurrently, and chat with their friends while enjoying the video.
REFERENCES
[1] M. Satyanarayanan, P. Bahl, R. Caceres, and N.
Davies, “The case for vm-based cloudlets in mobile
computing,” IEEE Pervasive Computing, vol. 8, pp.
14–23, 2009.
[2] S. Kosta, A. Aucinas, P. Hui, R. Mortier, and X.
Zhang, “Thinkair: Dynamic resource allocation and
parallel execution in the cloud for mobile code
offloading,” in Proc. of IEEE INFOCOM, 2012.
[3] Z. Huang, C. Mei, L. E. Li, and T. Woo,
“Cloudstream: Delivering high-quality streaming
videos through a cloud-based svc proxy,” in
INFOCOM’11, 2011, pp. 201–205.
[4] T. Coppens, L. Trappeniers, and M. Godon,
“AmigoTV: towards a social TV experience,” in
Proc. of EuroITV, 2004.
[5] N. Ducheneaut, R. J. Moore, L. Oehlberg, J. D. Thornton, and E. Nickell, “Social TV: Designing for Distributed, Sociable Television Viewing,” International Journal of Human-Computer Interaction, vol. 24, no. 2, pp. 136–154, 2008.
[6] A. Carroll and G. Heiser, “An analysis of power
consumption in a smartphone,” in Proc. of
USENIXATC, 2010.
[7] J. Santos, D. Gomes, S. Sargento, R. L. Aguiar, N.
Baker, M. Zafar, and A. Ikram, “Multicast/broadcast
network convergence in next generation mobile
networks,” Comput. Netw., vol. 52, pp. 228–247,
January 2008.
[8] DVB-H, http://www.dvb-h.org/.
[9] K. Chorianopoulos and G. Lekakos, “Introduction to
social tv: Enhancing the shared experience with
interactive tv,” International Journal of HumanComputer Interaction, vol. 24, no. 2, pp. 113–120,
2008.
[10] M. Chuah, “Reality instant messaging: injecting a
dose of reality into online chat,” in CHI ’03 extended
abstracts on Human factors in computing systems,
ser. CHI EA ’03, 2003, pp. 926–927.
[11] R. Schatz, S. Wagner, S. Egger, and N. Jordan,
“Mobile TV becomes Social - Integrating Content
with Communications,” in Proc. of ITI, 2007.
[12] R. Schatz and S. Egger, “Social Interaction Features
for Mobile TV Services,” in Proc. of 2008 IEEE
International Symposium on Broadband Multimedia
Systems and Broadcasting, 2008.
[13] J. Flinn and M. Satyanarayanan, “Energy-aware
adaptation for mobile applications,” in Proceedings
of the seventeenth ACM symposium on Operating
systems principles, ser. SOSP ’99, 1999, pp. 48–63.
[14] W. Yuan and K. Nahrstedt, “Energy-efficient soft
real-time cpu scheduling for mobile multimedia
systems,” in Proceedings of the nineteenth ACM
symposium on Operating systems principles, ser.
SOSP ’03, 2003, pp. 149–16
[15] Website: http://java.sun.com
[16] Website: http://www.sourcefordgde.com
[17] Website: http://www.networkcomputing.com/
[18] Website: http://www.roseindia.com/
[19] Website: http://www.java2s.com/
BLACK-BOX TESTING OF ORANGEHRM-ORGANIZATION CONFIGURATION
Subburaj.V
Dept. of CSE
Aarupadai Veedu Institute of Technology
[email protected]
ABSTRACT- This project deals with the black-box testing of the OrangeHRM-Organization Configuration application. The prime focus is to uncover the critical errors present in the system by testing it repeatedly. This project is basically done by implementing the concepts of iterative testing, exploratory testing and customer feedback. The various modules of the project are classified into several iterations. These iterations not only help in delivering high-quality software but also ensure that the customer is getting software that works. The various increments of the software are tested, and the work proceeds further based upon the customer’s feedback. The test plans and test cases for unit testing, integration testing and system testing are prepared and executed for each iteration. Hence the effective communication between the testing team and the customer representative helps in ensuring the reliability of the delivered product.
I. INTRODUCTION
The purpose of this document is to describe all
the requirements for the OrangeHRM. This forms the
basis for acceptance between the OrangeHRM (customer)
and the software vendor. OrangeHRM offers a complete
suite of human capital management / human resource
management tools. The intended audiences include
Customer Representative, Development team and Testing
Team.
The proposed software product is the
OrangeHRM. The system will be used to add the
employee information, employee leave processing,
recruitment management, performance evaluation.
OrangeHRM aims to be the world’s leading
open source HRM (HRIS) solution for small and
medium sized enterprises (SMEs) by providing a flexible
and easy to use HRM system affordable for any company
worldwide. HR personnel will transform their work from
more paper-based work to less paper-based work due to
their awareness of the convenience provided by the
system, saving time and cost.
Fig 1: Test Plan
II. SCREEN LEVEL REQUIREMENTS
1. Login and Registration Module
This module enables an employee to log in and access the details. It also enables the admin to register any employee and update the employee’s general information along with contact, qualification and other details. Employee registration can be done only by an admin-type user having this privilege.
Login:
In this module, the Employee or Administrator enters the system using their respective user names.
New Employee Registration:
If the Employee is new to the organization then
he/she has to register in the new Employee registration
form.
Update Employee:
If an employee wants to update his or her profile, he or she has to do so in the Update Employee form.
Forgot Password:
Using this, an employee can retrieve his or her password.
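As an illustration of how a black-box test case for this Login and Registration module might be automated, the sketch below uses JUnit and Selenium WebDriver (the paper does not name a test tool); the deployment URL, field names and expected messages are hypothetical, not taken from the actual OrangeHRM installation under test.

```java
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import static org.junit.jupiter.api.Assertions.assertTrue;

// Sketch of two black-box test cases for the Login screen: behaviour is checked only
// through the user interface, without looking at the application's internals.
public class LoginBlackBoxTest {

    private WebDriver driver;

    @BeforeEach
    void openBrowser() {
        driver = new ChromeDriver();
        driver.get("http://localhost/orangehrm/login.php");   // hypothetical URL
    }

    @Test
    void validCredentialsReachDashboard() {
        driver.findElement(By.name("txtUsername")).sendKeys("admin");        // illustrative field name
        driver.findElement(By.name("txtPassword")).sendKeys("admin123");     // illustrative credentials
        driver.findElement(By.name("Submit")).click();
        // Expected result in the test case: the dashboard page is shown.
        assertTrue(driver.getCurrentUrl().contains("dashboard"));
    }

    @Test
    void invalidCredentialsShowError() {
        driver.findElement(By.name("txtUsername")).sendKeys("admin");
        driver.findElement(By.name("txtPassword")).sendKeys("wrong-password");
        driver.findElement(By.name("Submit")).click();
        // Expected result: an error message is displayed and login is refused.
        assertTrue(driver.getPageSource().contains("Invalid credentials"));
    }

    @AfterEach
    void closeBrowser() {
        driver.quit();
    }
}
```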
2. Delete, View and Update Employee Information Module:
This module has control over the system and is able to manage the human resources by adding, viewing and updating employee information. This module is based on hierarchy, and employees can see their own profile and the profiles of other employees who are lower in the hierarchy.
Delete Employee:
In this, the administrator can delete an employee from the organization using the employee ID.
Time Sheet:
In this, the administrator generates a time sheet for an employee.
Salary Report:
In this, the administrator generates a salary report for an employee.
Leave Report:
In this, the administrator sees the leaves applied for by employees and manages them.
Search Work Sheet:
Using this, the administrator can see the employees’ work sheets.
Fig 1.1: Test case
Employee Salary and Payroll Module:
This module deals with employee salaries. Any employee can see his or her salary details. An employee having the admin type of privilege can see his or her own salary as well as the payroll of the other employees.
III. CONCLUSION
The project of a Human Resource Management System is a requirement of almost all organizations to manage manpower in a proper and efficient manner. Throughout the training, we were able to put in our efforts to make the project a success. The environment provided by the company enabled us to work in a positive manner.
IV. REFERENCES
1. http://en.wikipedia.org/wiki/Software_testing
2. www.guru99.com/software-testing.html
3. http://SoftwareTesting_Help.com
4. Roger Pressman, Software Engineering: A Practitioner’s Approach
5. B. Beizer, Software Testing Techniques, Van Nostrand Reinhold, New York, NY, 1990