Study Group Part 5 - ARMA Liberty Bell Chapter of Philadelphia
ARMA Liberty Bell Chapter
CRM Study Group
CRM Examination – Part 5
April 12, 2007
TABLE OF CONTENTS
TAB 1 – Part 5 Sample Examination
TAB 2 – Part 5 Exam Outline
TAB 3 – Notes taken from “Robek Brown” textbook (compiled
by Pete Casey)
TAB 4 – Technical definitions “section F” - Data Management
(compiled by Ellie Kidd, Jon Cohen, Robert Tocher, Mary
Nyce and Kathleen Roth)
PART 5
TECHNOLOGY, EQUIPMENT AND SUPPLIES
A. Micrographics
1. Standards
2. Equipment and supplies
a. Film
b. Camera
c. Processors and duplicators
d. Readers and printers
e. Hybrid systems
3. Methods and systems
4. Quality controls
5. Storage
B. Reprographics
1. Standards
2. Selection criteria
3. Printers and duplicators
4. Copiers
5. Computerized reprographics
C. Imaging Systems
1. Standards
2. Selection criteria
3. Media
4. Recording and processing
5. Drives and peripherals
6. Storage devices
D. Systems
1. Basic Concepts
a. Planning
b. Standards
c. Business rules and workflow
d. Evaluation of vendors
e. Implementation
f. Administration
2. Organizing
a. Collaboration with IT
b. Addressing customer needs
3. Directing
a. Training users
b. Documentation
4. Controlling
a. Compliance
b. Quality control and audits
E. Records Creation
1. System Architecture
2. Devices
3. Applications
a. Sources of input
b. Choosing formats
c. Implementation
4. Capture
a. Legacy systems
b. Current
c. Migration
5. Organizing data
a. Classification and auto-classification
b. Record status
c. Validation
d. Selection
F. Data Management
1. System architecture
a. Telecommunications
b. Networks
c. Shared servers
d. Internet and intranets
e. Websites
f. Portals
2. Devices
a. Personal
b. Enterprise
3. Programs, Software, and Applications
a. Database
b. Data mining
c. Data warehousing
d. Enterprise Content Management (ECM)
e. Website management
f. Electronic Document Management
g. Operating systems, utilities and diagnostics
h. RIM software
i. Email
j. Instant Messaging
k. Artificial Intelligence
l. Other
4. Security/Accessibility
a. Access rights
b. Customer service
c. Confidentiality/privacy
d. Methods of implementation
5. Data use
a. Serving multiple users
b. Shared drives
c. Electronic document rooms
d. Manipulation and processing
e. Search and retrieval
f. Output
6. Data storage
a. Methods
b. Backup
c. Hot sites
G. Data/system Disposition
1. Applying the retention schedule
a. Manual
b. Electronic
2. Preservation issues
a. Data migration
b. Long term retention
c. Software upgrades and updates
d. Data destruction
3. Data repositories
H. Preservation, Recovery and Destruction Techniques
1. Preservation
a. Paper
b. Film
c. Electronic or magnetic
2. Recovery
a. Paper
b. Film
c. Electronic or magnetic
3. Destruction
a. Paper
b. Film
c. Electronic or magnetic
d. Data destruction
PART 5
EQUIPMENT, SUPPLIES, AND TECHNOLOGY
A. MICROGRAPHICS
1. Standards. Review the standards established and recognized by the American
National Standards Institute (ANSI) and the Association for Information and Image
Management (AIIM) for producing, processing and storing microfilm.
2. Equipment and supplies. Know the various types of microfilm cameras and film
technology. Be able to identify the features of the various microfilm formats and to cite
the common usage of each format. Be familiar with the types of processors available and
the major factors that affect film processing. Review the various types of readers and
reader-printers that are used. Know how to select a reader for specific microfilm formats.
Understand the importance of matching the reader’s magnification powers with the film’s
reduction ratio. Review the current technology used for indexing, searching and
retrieving images. Review COM equipment configurations.
3. Methods and systems. Critical to the use of microfilm is the ability to locate the
images on film. Know that the most important facet of developing a microfilm system is
determining what type of indexing system is required. Be familiar with indexing
systems, such as blips, sequential numbering, and microfiche indexes.
4. Quality controls. Know and understand the importance of image quality. Know
what tests are performed to measure density (D-Min and D-Max) and resolution. Know
why these tests are important to image quality. Know the film developing processes and
what steps are necessary to ensure film quality. Be familiar with the Methylene Blue
Test.
5. Storage. Know how to properly store and maintain microfilm. Review the published
standards that specify proper conditions for preservation and maintenance of master and
duplicate microfilm. Know how light, humidity, temperature, and chemicals can
adversely affect the long-term storage of microfilm.
B. REPROGRAPHICS.
1. Standards. Review industry standards pertaining to copying and reprographics
equipment. Review how industry guidelines and market leaders may influence
equipment design.
2. Selection criteria. Know how to evaluate copying requirements and profile copying
activities. Know how to complete a cost justification analysis that may include cost per
copy, total copy, project savings, maintenance, power consumption, change-back options,
etc. Know how to properly match equipment to copying needs. Review vendor selection
criteria, such as reliability, service, quality, and price.
3. Printers and duplicators. Review and compare the various types of printers and
their use. Review desktop publishing and its effect on office technology and printers.
Know the types of non-electronic duplicators and their requirements for intermediary
masters. Be able to recommend the different duplicators based upon quality and quantity
requirements.
4. Copiers. Know the various types of copiers, such as personal, convenience, copy
center, color, etc. Be able to describe each type’s distinct use. Review the variety of
special features and attachments for copiers that are designed to meet users’ special
requirements. Compare the applications of analog versus digital networked copiers.
Review specialty copiers, such as blueline, diazo, and oversize copiers.
5. Computerized reprographics. Be able to identify functions of intelligent copiers.
Be aware of hybrid technologies, such as phototypesetting, and systems that digitize
input, print output, and scan microfilm. Review the use of multifunction systems that
combine duplicate office functions such as printing, faxing, copying, and scanning, into
one device.
C. IMAGING SYSTEMS
1. Standards. Standards on image formatting and recording are in effect; however,
some equipment may be proprietary and some equipment components may be
incompatible, which can present significant problems. Know and understand the
importance of standards, the hazards of obsolescence, and the methods for migrating
information.
2. Selection criteria. Know how to determine user requirements - workflow
(transaction processing and image enabling) or storage (retrieval and reference only).
Review both PC-networked and standalone systems. Be able to assess selection criteria
such as data transfer rate, disk access time, seek time, media tolerance, error corrections,
estimated drive life, resolution, compression ratios, system costs, etc. Be familiar with
the technological problems, such as records retention; records destruction; document
preparation and indexing; and system expense issues.
3. Media. Know the primary types of optical disks. Know which can be updated and/or
erased. Be able to determine the optimum media for any record type. Know the most
common size platters for each type of optical disk and have an understanding of the
capacity in terms of both mega or gigabytes of information and corresponding pages of
recorded information per disk.
4. Recording and processing. Know the types of scanners available. Know the
meaning and importance of drop-out color and throughput. Define single session,
incremental, or multi-session recording. Explain the processes for capturing the image
and appropriate metadata; choosing software (including OCR and ICR); using templates;
storing or converting files to different formats; and burning or transmitting to storage
media. Review hybrid systems, such as scan on demand micrographics, simultaneous
scanning to microfilm and optical disks, aperture card scanning, and COLD.
5. Drives and peripherals. Know why electronic imaging systems require both
magnetic and optical drives. Assess why high-resolution monitors are preferred and
identify the acceptable monitor resolutions. Explain display dpi and its relationship to the
scanned image. Explain refresh rate. Identify image output options, such as output to
laser printers and fax machines.
6. Storage devices. Know how to select storage devices that meet user needs. Examine
on-line, near-line, and off-line storage and access. Describe remote libraries, jukeboxes,
and expansion units.
D. SYSTEMS
1. Basic Concepts. A RIM Manager knows that business problems can be examined
and solved through systems analysis and systems development. Some of the decisions
they must make include identifying who has responsibility for planning the system;
determining where technology can enhance productivity; assessing the needs of the users;
determining the scope of the project; establishing resource allocation, budget and
funding; weighing the benefits of outsourcing vs. in-house development; testing and
evaluating the system; forming strategic partnerships; and planning for the conversion.
Be able to describe the purpose of and components of a requirements analysis.
There are numerous international, national and joint voluntary standards for information
technology. The primary standards bodies are: ISO, ANSI, IEEE and AIIM. Be able to
describe the purpose for their existence and the benefits of compliance with standards.
Understand the effect of globalization of business with the need for standards. Within
computer technology, recognize some of the more common acronyms such as SGML,
SCSI, SQL, and HTML. In addition, many organizations maintain internal standards. Be
able to discuss the definition and purpose of benchmarking, and how project management
can be used for information systems development.
A RIM manager knows that how an organization operates is important to ensure business
needs are satisfied. This can be accomplished through writing effective policies,
procedures and implementing best practices. Describe the use of workflow analysis to be
able to streamline processes, introduce technology where appropriate, and increase
organizational efficiency. Understand the concept of information flow through the use of
data dictionaries, data elements, and data structure.
A RIM Manager may be involved in the evaluation and selection of vendors. Be able to
distinguish between RFIs, RFQs, RFPs, and project proposals, and how they interface
with the requirements analysis. Be familiar with evaluation methods, such as scoring and
weighting.
RIM managers are often charged to manage the transition from an old system to the new,
including completing final operating documentation, developing procedures, designing a
conversion plan, and initial training. Be able to discuss the advantages and disadvantages
of prototypes, pilot projects, phased implementation, and parallel implementation.
Be able to discuss the role and responsibilities of the system administrator regarding
ongoing operation, maintenance managing transition, troubleshooting, controlling user
account information and monitoring system security.
2. Organizing. Understand the requirements of the various groups who work with the
records system and RIM staff, including establishing a partnership with IT, defining
duties, obtaining funding, and management support. Also, be able to identify the needs
of the different customer groups.
3. Directing. Understand the process, attributes, and value of training computer users
using computer-based methods such as tutorials, distance learning and web-based online
courses, in addition to traditional classroom instruction and books. Explain the
importance of maintaining system and user documentation updates.
4. Controlling. Know how an evaluation of the system performance and human input
into the system can affect data quality and system performance regarding compliance
with standards, regulatory requirements, laws, and internal policies and procedures. Be
able to describe the role of feedback for evaluation purposes and methods for collecting
it.
E. RECORDS CREATION
1. System Architecture. A RIM manager should be able to assist in the design and
planning, and be able to evaluate and assess the choices for and the performance of
components of the system. Data should be able to flow seamlessly across space and time.
Identify issues related to hardware and software integration; connectivity; and
interoperability. Be able to identify basic components of a computer.
2. Devices. Data is captured into an information system through an originating device.
Be able to discuss different pieces of hardware, the possible configurations and
combinations for data input. Discuss and compare the features of keyboards, voice
recognition, handwriting recognition, scanners, barcode readers, video recorders, wireless
devices, and scientific and medical instruments. Discuss the types of metadata that need
to be captured at records creation to ensure authentic and reliable records.
3. Applications of data.
Sources of input. Recognize that data can be obtained from varied sources, such as
scanned documents, keyed data entry, and electronic recording devices, in addition to
other computers (timeshare, mainframe, minicomputers, and PCs).
Know the importance of how files are created, structured, and stored. Be able to
distinguish between the different possible formats for text, image, data, or sound files.
Was analog or digital encoding used to create the electronic records? Are the files
compressed and can they be decompressed in the future? What implications do these
characteristics have on how data will be stored and used? What is the role of metadata in
capturing and managing this information over time?
During the implementation phase of applying data to a system many factors need to be
addressed. What considerations need to be made in the care and processing of data?
Will there need to be any conversion? What is the quality of the data? Is there any
documentation on how the data was collected? Is it readable? What is the role of
metadata in capturing and managing this information over time?
4. Capture. A RIM manager knows that data can come from many sources - data that
has already been captured in legacy systems and migrated; and data captured through
current input. Be able to describe the features and requirements for each type.
5. Organizing data. A RIM manager recognizes that completeness and accuracy of the
content and classification of data is important to the overall success of a system and
business process. Data can be classified either manually, or through software, to
determine if data is a record and into which record series it belongs. Can the files be
organized under a file plan or structure, whether electronic or paper? Understand the
process by which data can be indexed and classified using keywords, taxonomies, or
metadata. Understand the value of validation to check data for correctness and adherence
to standards and conventions. A RIM Manager should be aware of technology and
techniques to select and collect relevant data for specific purposes and to make data
available to users for multiple uses in order to turn data into meaningful information for
the organization.
F. DATA MANAGEMENT
1. System architecture. A RIM manager should understand how information can be
distributed throughout a computer system and how the system can vary in scale, number
of users, access, and geographic coverage. Be aware of how systems are designed,
organized, and optimized. Look at factors that determine whether the system could
benefit from mirroring or clustering. Compare the advantages of turnkey systems,
commercial off the shelf systems (COTS), and custom systems.
A RIM manager should have a basic understanding of telecommunications in order to
work with IT staff, including the technologies that allow transmission of analog and
digital data. Explain the characteristics and uses for voice mail, fax and teleconferencing.
Understand the purpose for, and types of, networks and be able to describe the
components and structure to a network that enable information to be shared, through such
processes as Electronic Data Interchange (EDI). Differentiate between the features and
purposes of LANs, MANs, and WANs. Explain the differences between computer and
network operating systems. Know the difference between the Internet and an intranet.
Describe the capabilities and limitations of each. Web pages are electronic documents;
explain how they are accessed, structured, maintained, updated, secured, and linked
within and between other websites. Be able to distinguish when a web page is being used
for information or as a portal. Define homepages, URLs, and webmasters.
2. Devices. Be able to distinguish between the purposes and the functions of devices
used in a personal or single user system and those in an enterprise system.
Describe the features and uses for personal devices such as cell phones, PDAs,
BlackBerries, laptops, desktop PCs, and peripherals. Be able to expand on how personal
devices can communicate with, and become components of, enterprise or shared systems.
Discuss how wireless devices; servers or mainframe computers; printers; networks; and
applications can be shared on an enterprise-wide basis.
3. Programs, Software, and Applications. A RIM manager is often called upon to
evaluate and choose software and manage the information within. Understand the basic
process of creating computer programs and the different levels of programming, such as
machine language, high-level languages, and interactive programs. Describe the
selection criteria, features, and purpose of the following types of software: database; data
mining; data warehousing; Enterprise Content Management (ECM); website
management; Electronic Document Management; operating systems, utilities and
diagnostics; RIM; email; instant messaging; artificial intelligence; and other types.
Explain how one would verify that the software, once installed, is capable of creating
reliable and authentic records.
4. Security/Accessibility. Be able to define access rights. Describe the purposes and
types of different levels of accessibility to a computer system that address balancing the
needs to perform customer service, yet protect data confidentiality and privacy.
Understand different threats to a computer system and the methods that can be
implemented to provide security. Explain how methods such as RFID and barcoding
can assist in tracking and monitoring the location of physical manifestations of
information. Explain how audit and history files can be used to verify the integrity of
data and records.
5. Data Use. A RIM manager understands that the same data can serve different
purposes to different groups and may need to be shared among those groups.
Understand the role of a database administrator in the balancing of data quality and the
needs of users. Be able to explain the different methods that data can be shared,
including shared drives and electronic document rooms. The value of information
contained in data can be maintained and distributed as output in hardcopy and electronic
forms, including COLD, COM, digital photographs, videos, x-rays, and sound recordings.
Understand how data can be processed in batch or online modes and how it can be
manipulated through sorting, filtering, calculating, and generating reports. Identify the
purpose for tagging metadata (SGML, HTML, XML). Be able to describe how retrieval
tools, such as indexes and search engines work and are used, including the role of
indexing, structured searches, text retrieval, natural language processing, Boolean
searches, and data tagging. Know the different metrics to measure the success of a
search, including recall and precision. Explain the methods and forms of data output.
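For reference, recall and precision can be computed directly from a search result. A
minimal sketch in Python (the record identifiers and counts below are invented for
illustration, not taken from the text):

# Recall: fraction of all relevant records that were retrieved.
# Precision: fraction of retrieved records that are relevant.
def recall_and_precision(retrieved, relevant):
    hits = retrieved & relevant              # relevant records actually found
    return len(hits) / len(relevant), len(hits) / len(retrieved)

retrieved = set(range(1, 21))                # search returned records 1-20
relevant = set(range(1, 9)) | {30, 31}       # 10 relevant; 8 were retrieved
print(recall_and_precision(retrieved, relevant))  # (0.8, 0.4)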
6. Data storage. A RIM manager has a good perspective on how data has been stored
over time and may encounter older, historic storage methods, formats, and media of
storage and be responsible for the safe transfer of that data to new media, using new
methods. Understand the effects of storage methods, media, and recording format on the
potential for short-term, intermediate-term, and long-term archival storage. Media
stability may also be a factor in reliable recording and playback of data and can influence
data retrieval due to the effects of wear, corrosion, handling, environmental conditions,
and tape tension, in the case of magnetic tapes.
Be able to list the characteristics and uses for floppy and hard magnetic disks, magnetic
tapes, optical disks, CDs, DVDs, USB flash drives, storage area networks (SANs) and
RAID storage devices. Recognize the environmental conditions that are optimal for
storage of data using the different methods. Distinguish between primary and secondary
storage. Be able to list the advantages and disadvantages of preserving data in its native
format, proprietary formats, or de facto standard formats. Recognize the differences
between storage copies and working copies of files. Be able to describe the different
types and purposes of backups for data. Describe different malfunctions of storage
devices and media and how data can be restored. Define and distinguish between hot
sites and cold sites. Understand the use of reciprocal agreements in data storage.
G. DATA/SYSTEM DISPOSITION
1. Applying the retention schedule. Be able to explain how a RIM manager can
influence the appropriate retention and destruction of data, including vital records,
through the application of records retention schedules to all record formats and how
software can be used to manage the process for both electronic and physical records.
2. Preservation issues. A RIM manager knows that records need to be available for
their entire life cycle. Strategic plans should be made and implemented for migrating
data to meet long-term retention requirements. Recognize the need not only to patch
and upgrade software as new releases become available, but also to update it in order
to ensure backward compatibility with data, optimize use of the data, and incorporate
enhanced features for managing it. A RIM manager must also recognize that technical
support for software may not always be available in the future, and must plan for the
possibility of converting the data to new software so that it remains readable and usable.
Be able to list possible requirements for destruction of data based on data properties,
storage media type, and security concerns. Which methods ensure complete destruction
of the information and which methods leave open the possibility for data restoration?
Why is this important?
3. Data repositories. Understand the purpose for, and advantages of, storing data in
repositories. Be able to discuss the pros and cons of storing documents in their native
formats vs. open standard formats and the effects on searchability and usability.
H. PRESERVATION, RECOVERY AND DESTRUCTION TECHNIQUES
1. Preservation. Preservation is necessary for those records determined to have
long-term value and/or historical significance. Preservation methods are media dependent and
in the case of electronic or digital records, require a movement to another form for long
term preservation and availability of the information. Be familiar with the conditions that
can damage records and the processes used to reverse or halt the further deterioration of
records in any media. Identify the major problems that affect the preservation of
electronic, magnetic, digital, and optical media. Discuss the role of metadata in ensuring
a complete and accurate history of the preservation of records over time.
2. Recovery. Identify the immediate, short term, and long term recovery procedures
following a disaster in which records of any media type were damaged or destroyed.
Understand the importance of a disaster recovery plan and know the procedures
associated with recovery of records damaged by water, fire, smoke, or chemicals.
3. Destruction. Know the various records destruction methods available, such as
shredding, recycling, maceration, pulverization, pulping, erasing, and writing over.
Know what methods are suitable for which medium. Be able to address cost factors as
well as environmental, security, and confidentiality concerns. For electronic
recordkeeping systems, be able to discuss what metadata about the destroyed records
should be maintained as evidence of their destruction.
PART 5
TECHNOLOGY, EQUIPMENT AND SUPPLIES
Basic Outline for Part 5
A. Micrographics
B. Reprographics
C. Imaging Systems
D. Systems
E. Records Creation
F. Data Management
G. Data/System Disposition
H. Preservation, Recovery and Destruction Techniques
The following notes are taken from the “Robek Brown” textbook – Records and
Information Management
Some of the concepts from the Outline are elaborated in the following notes. Please
note that information for Part 5 of the examination may be found using additional
sources. Please consult the ICRM Bibliography for this study material.
Robek Brown - CHAPTER 4 - VITAL RECORDS PROTECTION AND DISASTER
RECOVERY TRAINING
Business importance: Ensure the very survival of an organization.
Businesses can be faced with lawsuits for not providing disaster recovery plans.
Disaster recovery plans produce no return on investment unless a disaster actually occurs.
Vital Records: Their essential Characteristics:
Vital: irreplaceable and required to operate the business. Business records are vital
mainly because of their intrinsic uniqueness.
Info is a vital asset because:
1. Establishes the legal status of the org as a business entity.
2. Documents the assets and liabilities of the org from a financial perspective.
3. Documents the operations of the organization, which enable production processes
or other work to be accomplished.
Documents not defined as vital may be classified as important or simply useful.
Basic Elements of Vital Records Protection:
The RM must compare the cost of protecting records to the cost of reconstructing them
and to other direct monetary losses, if the records are destroyed prematurely. Calculated
risk.
Initiating the Program: Senior management approval.
Vital Records Protection: A form of business interruption insurance: The premiums paid
are the costs of the protective measures taken. As part of the Risk Analysis the RM must
consider the nature of potential disasters that an organization might be subject to, the
likelihood of these disasters, and the consequences should they occur.
Class 1. Nuclear attack
Class 2. Earthquake
Class 3. Fire in a manufacturing plant
Class 4. Fire in a federal records center (St. Louis)
Class 5. Bomb thrown into a tape library
Class 6. Research notes of a chemist destroyed
Class 7. Letter of commendation lost
Class 1: forget about it.
Class 2: local government must assist services. Needed: a roster of all city employees, a
list of temporary housing locations, and blueprints of all major buildings in the city.
Class 3: backup computers. One victim of a disaster in a large metropolis.
Class 4: identical to Class 3, except the fire occurs during nonworking hours.
Class 5: affects only one or two functions in an organization. It may not put you out of
business, but could cost millions. The component, although valuable, is not vital.
Class 6: dispersal would be recommended.
Class 7: a lost document. Micrographic copies or maintaining good records management
practices.
How do you determine what is vital?
Identify the FUNCTIONS that are essential to the primary mission of the org. The RM must
identify the records whose info value to the org is so great, and the consequences of loss
so severe, that special protection is justified to reduce these risks.
Very selective.
Only about 2 to 7 percent of the total records of an org are vital.
Vital Records: Loss of info would subject the org to an unacceptable level of risk.
IMPORTANT: SECONDARY
Useful: Lowest protection priority
Interviews for vital records:
1. Never ask, “What records are vital?”
2. Ask managers to envision a scenario in which all the records in their department were
destroyed. How would that affect the business?
3. More specifically, ask:
Whether the absence of the records would prevent the organization from doing business.
What the legal consequences would be.
Which records series could not be replaced.
Which records series have been dispersed.
How much it would cost to reassemble them.
How duplication would be performed.
Whether any departmental resources are available for expenditures on information
protection.
Interview senior executives.
Consider the cost of maintaining the vital record. This may include the cost of an additional copy
or the cost of reducing the size of the copy.
Defining an acceptable level of risk: If the organization does nothing, it will face
maximum exposure to the risks of a disaster. If protection is too elaborate, the risk may
be nil but the cost excessive. A careful cost-risk-benefit analysis is required.
Making reasonable judgments concerning the business consequences resulting from loss
of information constitutes a qualitative approach to risk analysis. If the organization
wishes to adopt a more quantitative approach, the following formula may be used:
R = P x C
where
R = risk, sometimes referred to as the ALE or annualized loss expectancy
P = probability that such a loss will be sustained in any given year: the likelihood of such
an occurrence based on available historical data
C = cost of the loss, usually the cost to replace the information so that it can be used to
resume normal business operations.
This formula may not apply to vital records, since losing those records is unacceptable
to begin with.
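A brief worked example of the formula in Python (all figures are invented for
illustration; they are not from the text):

# R = P x C: annualized loss expectancy (ALE).
probability_per_year = 0.05      # P: a 1-in-20-year event (hypothetical)
replacement_cost = 200_000       # C: cost to replace the info (hypothetical)
ale = probability_per_year * replacement_cost
print(ale)                       # 10000.0, i.e. $10,000 per year
# Protection costing less than the ALE per year can be justified; vital
# records fall outside this calculus because their loss is unacceptable.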
Vital records building should be detached and inconspicuous.
NFPA
Vaults for vital records should be capable of maintaining a constant temperature of 60°F
± 3° and a relative humidity of 50% ± 5%. For vault storage of microfilm, even drier
conditions are recommended; 20 to 30 percent is the optimum range per ANSI. NFPA
vault standards say that sprinklers are optional. Halon is an ozone depleter and is now
banned by the EPA.
File rooms: for important or useful records.
Fire-resistant file cabinets and safes: better than leaving records unprotected. The level
of risk depends on the environment. Reliance on them as the sole means of protection is
not reasonable. Useful for SMALL quantities of records.
Duplication: extra copies made when the record is created AND scheduled reproduction
of existing records by any process, such as microfilming.
Dispersal consists of creating duplicate hard-copy or microfilm copies of records. This
can be built in or improvised.
Improvised or built-in dispersal may take the form of creating an extra copy specifically
for dispersal to a remote location.
Microfilm will be damaged at temperatures exceeding 200°F. Packing microfilm in
airtight storage containers before placing it in the vault retards the passage of steam to
the film.
1. Full system backup: a backup copy of every file on the system.
2. Differential backup: making backup copies of all files that have been changed since
the last full backup. Restoration requires two sets of backup media: the full system
backup plus the most recent differential.
3. Incremental backup: copies only the files that have changed since the last
incremental backup.
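A minimal Python sketch of the difference between these backup types, assuming a
simple table of files and modification dates (the filenames and dates are hypothetical):

from datetime import datetime

files = {                                    # file -> last modified
    "ledger.dat": datetime(2007, 4, 10),
    "payroll.dat": datetime(2007, 4, 2),
    "minutes.doc": datetime(2007, 3, 20),
}
last_full = datetime(2007, 4, 1)             # last full system backup
last_incr = datetime(2007, 4, 8)             # last incremental backup

# Full: everything. Differential: changed since the last FULL backup.
# Incremental: changed since the last incremental backup.
full = list(files)
differential = [f for f, t in files.items() if t > last_full]
incremental = [f for f, t in files.items() if t > last_incr]
print(differential)                          # ['ledger.dat', 'payroll.dat']
print(incremental)                           # ['ledger.dat']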
Backups of pc data on LAN
Software backup options for vital pc records:
1. Use of operating software backup options: save to disk.
2. Use of special backup software: buy from vendors.
3. Use of automatic, on-line backup software: vital data is sent via telephone lines to a
remote protection site.
Diskettes are most vulnerable to loss of data if they are exposed to strong magnetic
fields or malicious tampering.
PC Security: Password protection. Encryption. No printed list of passwords.
Virus: a program that can insert a copy of itself into another program.
Traditional Magnetic Tape backup with off site rotation of updated files. Optical disk as
backup.
Transaction-processing-intensive operations are at risk if computer processes are down.
On-line electronic vaulting: has been available for protection of online data, and is now
more widely available because:
1. The price of high-speed digital telecommunications has decreased.
2. Electronic vaulting software has been developed.
Recovery of Damaged Records: The plan must have a section on the recovery of records.
Prepare a master list of vital records with precise floor, room, and cabinet locations of
these records. The vital records master list should also show which vital records have
been backed up and the off-site location of those copies, so that they can be used to
conduct business as soon as the situation demands.
Recovery plan should also include recovery measures for wet record media. Water
damage to records is a factor in at least 90 percent of all records related disasters. The
plan must indicate which vital records have top priority for recovery purposes.
Records Recovery Checklist (page 86) includes:
Names of recovery team and backups.
Names and phone numbers of document restoration experts and companies with vacuum
chambers, freezing facilities, or fumigation capabilities.
Names and phone numbers of companies having large portable fans and dehumidifiers
for rent.
A designation of locations to which damaged records can be relocated for air drying and
other restoration measures.
First Priority: Prevent further damage to records:
The first 48 to 72 hours are the most critical.
The ideal location is a cool, dry one with temperatures less than 60°F and relative
humidity of 45 percent. Records should be relocated to a site close by.
Records Removal from the Disaster Site: Wet records boxes can weigh 60 pounds
(versus the usual 30 pounds).
Remove irreplaceable vital records first.
Remove all active records.
Next move all other vital records appearing on master list.
Lastly, remove nonvital records on a priority basis.
Do not open drawers that are hot to the touch.
If possible transport entire file cabinets intact, or remove file drawers without disturbing
their contents.
Initial Restoration Steps at the Recovery Site:
It may be necessary to freeze the records.
Fungicides such as thymol, formalin, or o-phenyl phenol can be used to retard mold growth.
Air drying: for records that are damp but not completely soaked, in quantities small
enough to be dried in less than 72 hours.
Vacuum drying of water-soaked records:
1. vacuum freeze drying
2. thermal freeze drying
3. vacuum drying
Recovery of water damaged microfilms:
Film can be damaged at temperatures in excess of 150°F.
Duplicate copies of microfilms of vital records should always be stored in an off-site
location. Keep wet film in clean, cold water; it can be reprocessed. If masters exist,
there may be no need to try to recover these records.
Magnetic media can be salvaged if they are water soaked but not exposed to high
temperatures; they are imperiled at temperatures greater than 150°F. Diskettes (the most
vulnerable of all record media) are threatened with loss of data at 125°F or a relative
humidity exceeding 80 percent. Fixed magnetic media are more problematic; such media
are generally unrestorable.
Restoration of water-damaged magnetic tapes is accomplished by hand drying the tapes
with lint-free cloths, then running them through a tape cleaner or winder (not a tape
drive), then running them over cleaning tissues. When tapes are reasonably dry, they
should be run over the tissues and the blades, and then read and copied onto new media.
Water-soaked diskettes should be kept in cool distilled water, dried with lint-free towels,
and then copied onto new diskettes, provided they have not been warped or magnetically
damaged.
Vital Records Program Implementation: Simulate a disaster, with people sent to the
dispersal site for test runs.
Vital Records Master List: Provides each department with a complete list of all vital
records for which it is responsible: the records needed to maintain essential operations,
recreate the company's legal and financial positions, and meet obligations to stockholders.
Transfer of Vital Records: Records should be destroyed at regular intervals, and the
originating department should be notified. Records' status may change while they are in
storage.
Audit: An audit of vital records should be made periodically. The audit consists of reviewing
dates on the records maintenance program cards or the vital records control cards and
comparing the date of receipt of documents with the frequency noted on the vital records
master list.
Vital Records Manual: Establishes procedures. Maintain communications about the
program through seminars, memorandums, and news of developments. A vital records
manual, published separately or as part of the RM manual, is the best tool of communication.
Manual may be divided into three parts:
1. Procedures for vital records protection and a list of objectives.
2. An explanation of the master list.
3. Instructions for reconstructing vital records in the event of a disaster and for the use
of equipment that would be available.
ROBEK BROWN - CHAPTER 8 - ELECTRONIC RECORDS
Electronic Records- records containing machine readable, as opposed to human readable,
information.
Text files are usually produced by word processing programs or by other software.
ASCII, or American Standard Code for Information Interchange, is used in virtually all
minicomputers and microcomputers, and in many non-IBM mainframe computer
systems. ASCII offers the broadest compatibility for file interchange. It is the best
option for ensuring that text files can be read and processed over a period of years on a
variety of computer devices. Thus, text files having long retention requirements should
generally be preserved in ASCII format whenever possible.
EBCDIC (Extended Binary Coded Decimal Interchange Code) is used for text files
created on IBM mainframes and other IBM-compatible mainframe computers.
Data files are computer-processable files that store numeric data as quantitative values,
so that the numbers can be manipulated using arithmetic computations.
Text files also store numeric digits, but the digits are stored simply as characters, without
regard to their quantitative significance.
Image files contain computer-processable images of documents that generally existed in
hard copy format prior to having been converted to image files.
Magnetic disk drives are the preferred storage medium in high-performance computing
environments requiring very rapid, on-line access to electronic records. These disks may
be fixed or removable. Fixed disk drives with high-capacity platters are generally used in
larger mainframe and minicomputer installations, with storage capacities from 500 MB
to 5 GB. Microcomputer fixed disks hold 40 to 200 MB. High-capacity disks are 9 to 12
inches in diameter; lower-capacity disks range from 2 to 5.25 inches in diameter.
Magnetic tape is a widely used storage medium for electronic records. The most widely
used tapes are ½ inch in width, 10.5 inches in diameter, and 2,400 feet in length, and are
mounted on open reels. They are “nine-track” tapes; gamma ferric oxide is the recording
material. This medium is unsuitable for rapid on-line access; thus magnetic tapes are
used mainly for batch processing applications and backup. The IBM 3480 cartridge
measures 4 inches by 5 inches by 1 inch; the standard cartridge contains 550 feet of
magnetic tape and stores 200 MB of data. The recording material is chromium dioxide.
These cartridges are used primarily for archiving.
Diskettes come in 5.25- and 3.5-inch sizes; 3.5 inch is now the standard, providing
720 KB or 1.44 MB of storage.
Optical media are nonmagnetic storage devices with storage capacities up to 10 GB.
WORM (Write Once, Read Many) disks are used mainly to provide on-line access to
documents and data, as well as for off-line data archiving and disaster backup purposes.
ERM - Problems and issues
1. Inadequate attention is devoted to the creation of records. The RM should provide
guidelines to department personnel encouraging them to create and maintain only those
computer-generated records that have a business purpose.
2. Inadequate attention is often devoted to the organization and identification of
electronic records stored on-line on nonremovable media. Machine-readable records are
difficult to retrieve because of naming conventions and the tools needed to find them
(data dictionaries, thesauri and other index devices).
3. Inadequate attention is often devoted to the organization and identification of
electronic records stored off-line on removable media. Labels on diskettes are brief and
cryptic. No inventory is taken of the growing collection.
4. The protection and security of electronic records is often overlooked. Vital records
go unprotected; security is not properly maintained.
5. Retention and disposition of electronic records is often overlooked. Space concerns
do not exist, but litigation is still a concern (e.g., records are producible).
6. The short usable life of electronic records is insufficient to meet the organization's
retention and archival needs. Useful life is shorter than that of paper or microfilm.
7. The ownership status of electronic records is unclear. People regard them as private
files.
Inventorying ER
Similar to a paper records inventory; done on a department-by-department basis.
1. The records series concept is applied to electronic records (record series = data set).
2. All electronic records series should be inventoried. For systems operated by a
centralized data processing unit or by an outside service bureau, it will be necessary to
collect inventory data from operators of the system and from end users.
ER Inventory data
1. Name of electronic series. Not “floppy disk”; give a description of the contents.
2. Series description. Include any details relevant to the series' status as an electronic
record: the name of the information system, the file type (text, data, or image), and, for
removable media, the copy type (working copy, storage copy, or backup).
3. Medium description (fixed hard disk, magnetic tape, etc.) and physical size (e.g.,
5.25 inch).
4. Date of records and medium. Should include the date of the records as well as
manufacturing and recording dates; this may be useful in projecting the useful life of
the records.
5. Hardware and software environments (standalone word processor, IBM-compatible
microcomputer with 4 megabytes, etc.).
6. Network environment (LAN or WAN).
7. Volume of records, expressed in bytes. Helps project future growth.
8. Reference activity.
9. Retention status.
10. Protection status, e.g., creation of security copies of magnetic records to be sent to
a vault.
11. Relationship to other records. Pertains to indexes and other related records in
separate mediums. Is it duplicated or microfilmed?
Managing Active Electronic Records
IBM PCs and MS-DOS are in 75 percent of the world's PCs.
Fixed disks, directories and subdirectories.
MS-DOS
Directory trees: the root directory is the starting point for all subordinate directories.
Paths are further broken down into filename and filename extension. Disks may also be
identified by separate volume names or partitions. The file allocation table is used to
keep track of the clusters of occupied disk sectors that contain the data in various files.
Organizing directories and subdirectories.
Naming files and documents. A DOS filename may be no more than 8 characters (plus
a three-character extension).
Establishing filename conventions: two approaches (a validation sketch follows this list).
1. Develop a data dictionary or a thesaurus.
2. Relate the org's file indexes for paper records to its electronic records.
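To make this concrete, a minimal Python sketch of validating names against a
hypothetical convention within DOS's 8.3 limit (the convention and filenames are
invented for illustration, not from the text):

import re

# Hypothetical convention: 2-letter department code, 4-digit case number,
# 2-letter document type, e.g. "HR0042LT.DOC" (HR, case 42, letter).
PATTERN = re.compile(r"^[A-Z]{2}\d{4}[A-Z]{2}\.[A-Z]{3}$")

def valid_dos_name(name):
    base, _, ext = name.partition(".")
    if len(base) > 8 or len(ext) > 3:        # the DOS 8.3 limit
        return False
    return bool(PATTERN.match(name))

print(valid_dos_name("HR0042LT.DOC"))        # True
print(valid_dos_name("PERSONNEL42.DOC"))     # False: base exceeds 8 characters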
Labeling and identifying removable media
A master inventory should be made which shows:
The name of the department or organization which created the disk, tape, diskette, or
other medium.
The names of the files by records series, including a listing or description of the files
which the medium contains.
The name and version number of the software used to process the data.
The type and capacity or recording density of the medium.
A serial number or other unique identifier of the medium (e.g., tape or disk number).
The date or range of dates on which the data were recorded and/or the inclusive dates of
the records.
Any special security requirements or access restrictions applicable to the medium.
Whether the medium is a working copy, a backup copy, or a storage copy.
Software for managing ER
1. Search capabilities in previously installed software, such as the document summary
fields in word processing programs.
2. File management software for PCs.
3. Document management software for LANs and WANs.
Storage conditions for magnetic media: air-conditioned, with a constant temperature of
63 to 68°F and a relative humidity of 35 to 45 percent.
Rewind tapes every 3½ years. Use a vault approved by NFPA.
Optical storage conditions: not yet certain; use the magnetic media standards.
Applying retention to PC-based electronic records:
Managers should be aware of the undelete function; it can pose a legal risk. If records
exist, they must be produced, regardless of the difficulties that may be associated with
retrieving them.
Archival Status
Media, hardware, and software dependence: the rapid pace of product obsolescence
suggests that lengthy or permanent retention requirements would be difficult to meet on
electronic media.
3 ways to deal with archiving electronic records:
1. Preserve the records in ASCII format: the most flexible option for future processing.
2. Rely on micrographic media for long-term and archival retention purposes (COM
microfiche or COLD).
3. Retain the paper records.
CHAPTER 9 – AUTOMATION OF RECORDS MGMT SYSTEMS AND
FUNCTIONS
Benefits of Automation
1. Life cycle control of records.
2. Improves performance through timely retrieval. Recordkeeping systems should be
capable of performing at peak level 95 percent of the time.
3. Provides multiple pathways to information, giving more value to the user than ever
before.
4. Benefits in work measurement, cost reduction, productivity improvement, and better
services to clients.
5. Raises the status of RM and those who work in it, as the group responsible for
maintaining an instantly accessible corporate information database.
Requirements Analysis
Identify poorly performing recordkeeping systems.
Identify functions requiring automation. Automation must begin by defining the
business objectives of the potential application.
Conduct a physical survey of the records system. Ask who, how, for what purpose, etc.
Identify internal resources to support automation: computers and systems; consultants
and tech support; analysis of staff and budgeting support to maintain the system.
Prepare the requirements analysis (aka needs analysis). The requirements analysis can
include “functional systems specifications”: detailed specs for how the proposed system
is to perform in delivering information to the users. The requirements analysis may also
include a cost-benefit analysis, proposed hardware and software, or other details of
system acquisition and development. It should conclude with a statement of priorities
for RM automation, and should also include a request for authorization.
Selecting appropriate hardware
The PC is the least expensive platform and gives the most control to the RM. LANs and
WANs can be used for larger numbers of records or for remote users. The logical choice
of system is usually dictated by the predominant system already in place; for example,
offices with IBM PCs usually choose a compatible system.
Software Directory for Automated RM systems.
Two broad areas:
technical evaluation of product quality, and business evaluation of product and supplier.
Technical Evaluation of Features and Functions:
1. Ease of use.
2. Data retrieval and search capabilities: does the package have keyword searching
(single-word search)? Phrase searching (searching two or more consecutive words in a
field)? Boolean logic (use of AND, OR, and NOT to combine search statements to
expand or limit the scope of the search)? Wild-card searching (use of wild-card symbols
to represent multiple prefixes or suffixes of words (e.g., automate, automation) or
alternate word spellings (labor, labour))? A minimal sketch of these search types follows
this list.
3. Data field characteristics: field lengths in characters; range searches using operators
such as GT or LT (greater than, less than).
4. Data entry features: full-screen editing; code tables (use of tables to store authorized
field values which can be accessed during data entry).
5. Security features: user password access; multilevel security at the menu, record, or
field levels to restrict access to specified menus, record types, and fields.
6. Reporting features: user-defined reports that customize the search and sort parameters.
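A minimal Python sketch of the search behaviors named in item 2 above, using a tiny
in-memory index (the sample records are invented):

records = {
    1: "contract for labor services",
    2: "labour relations policy",
    3: "automation of records management",
}

def keyword(term):                           # single-word search
    return {i for i, text in records.items() if term in text.split()}

def phrase(p):                               # consecutive words in a field
    return {i for i, text in records.items() if p in text}

def wildcard(prefix):                        # "labo*" matches labor/labour
    return {i for i, text in records.items()
            if any(w.startswith(prefix) for w in text.split())}

# Boolean logic: AND / OR / NOT as set operations on result sets.
print(keyword("labor") & keyword("services"))   # {1}
print(keyword("labor") | keyword("labour"))     # {1, 2}
print(set(records) - keyword("labor"))          # {2, 3}
print(phrase("records management"))             # {3}
print(wildcard("labo"))                         # {1, 2}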
Business evaluation
1. Product price evaluation: be sure to compare base prices with multiple-user prices.
Compare “like” products to determine value.
2. After-sale support from vendors includes customizing the software, providing on-site
installation services, and providing ongoing maintenance support and training. These
are usually priced as extras. Vendor-supplied maintenance is most often priced at 12 to
15 percent of the software license fee.
3. Vendor quality and stability: the attrition rate of software companies is high, up to
20 percent each year. Check references, ask about financial condition, and visit
installations for live product demos if possible. Use a quantitative methodology (give
numeric ratings on a chart).
Justifying the cost of the new system: compare the cost of the present system with that
of the new one. Factor in hard dollars and soft dollars. Intangibles, such as improved
staff morale, can be noted. A good automated RM application should be able to generate
a return on investment within two years.
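A simple payback computation of the kind implied here, in Python (all dollar figures
are hypothetical):

present_annual_cost = 90_000     # current manual system (hypothetical)
new_annual_cost = 55_000         # automated system, ongoing (hypothetical)
one_time_cost = 60_000           # software, conversion, training (hypothetical)

annual_savings = present_annual_cost - new_annual_cost   # hard dollars
payback_years = one_time_cost / annual_savings
print(round(payback_years, 1))   # 1.7, within the two-year guideline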
System Conversion and Implementation
Conversion refers to the tasks associated with making the transition from the manual
procedures supporting the recordkeeping system to the computerized procedures of the
new system. System conversion includes planning, installing, testing, and debugging;
converting any existing computerized data that needs to be imported into the new
database; keying data from manual indexes; performing data validation and error
corrections; and training all personnel in the use of the new system.
Bar code technology for RM
Bar codes were first used in the railroad industry during the 1960s as a means of
tracking the location of rail cars. Each line in a bar code represents a numeric value.
There are several types, or symbologies, of bar codes, each of which is based on ASCII
character data sets. Code 39 is the most common; it incorporates the full 128-character
ASCII data set.
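As a side note, Code 39 also defines an optional check character computed modulo 43
over its base character set; a minimal Python sketch (the sample value is invented):

# Code 39 base set in value order (0-42); the check character is the
# character whose value equals the sum of the data values mod 43.
CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"

def code39_check_char(data):
    return CHARSET[sum(CHARSET.index(c) for c in data) % 43]

print(code39_check_char("FILE0042"))   # 'V', appended when printing the label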
Benefits of bar code technology
1. Keyless data entry.
2. Greater accuracy in location
3. Misfiles are reduced. File locations can be verified by scanning the folder and the
location on the shelf.
4. The records disposition process is facilitated. Lists can be generated to speed the
pulling and/or destruction of records.
5. Records “lost in transit” can be tracked. A report can be run of records withdrawn
from storage but not yet received at their destination. This permits an immediate search,
rather than one days or weeks later when someone discovers the problem.
Specific applications for bar code technology
1. Single forms or loose paper documents. Bar codes can be affixed to single papers
and can be scanned upon creation or receipt (e.g., postcards). Loose documents can thus
be tracked from the earlier creation or receipt stage.
2. Documents housed in file folders: portable scanners may be placed at remote user
points, increasing the folder-tracking capability.
3. Records stored in cartons in records centers. With bar codes it is easy to move
cartons from one shelf to another and still accurately track them.
4. CAR (computer-assisted retrieval) microfilm systems. As paper is filmed, the roll
number is linked with the paper's bar code number. This greatly increases index
production and reduces errors that would be made during manual keyboard indexing.
Bar code equipment:
Consists of printers, recognition hardware and software to manage and index the
database.
1. Printers: separate labels can be printed and affixed to papers or folders, or the paper
itself can be printed with the bar code. Laser printers are recommended.
2. Recognition hardware: readers, scanners, and wands. Platform scanners are
stationary and are connected to the hardware at the central processing station. Scanners
may be contact units (close range) or noncontact laser gun units. The data recorded in
them are periodically uploaded into the software's database.
Chapter 10- Microfilm Imaging Systems and Technology
Microfilm can reduce storage space requirements by 98 percent. It is very similar to
photographic technology, but requires special cameras to pick up the small subject
matter: the printed page.
Camera Film
The most commonly used is silver halide film, or more simply, black-and-white silver
microfilm. The film consists of a polyester base on which a silver halide emulsion is
coated. The emulsion bears the photographic image. The image is reversed: the portion
of the document that is white will show on the film as black, and vice versa. This is
called a negative image.
The most common film widths used today are 16mm, 35mm, and 105mm. Thickness
ranges from 2.5 to 7 mils, with 5 mils the most common. 16mm is used for small
documents; 35mm for engineering documents, large maps, and newspapers; and 105mm
for extremely large engineering drawings. 105mm is also used for microfiche and for COM.
Another camera film is called dry silver film, because no wet chemistry is used;
development is aided by a heat process. The advantages of this system are convenience,
the elimination of a separate development machine, and instant development speed.
Copy Film
Also known as print film. Copy film can be sign-maintaining (background the same as
the original) or sign-reversing. These are also known as nonreversal and reversal films.
The process is done by contact printing. The most common types of copy film are diazo,
vesicular, and silver.
Diazo film: a non-silver film used for contact printing. Blue, blue-black, or black. The
image is made by exposing the film to ammonia vapors. Some duplicators use anhydrous
ammonia, so proper ventilation is important: ammonia can be smelled at 10 ppm, and
OSHA allows 50 ppm. Diazo can also be developed without ammonia, but this technique
takes longer.
Vesicular film: known as thermal film; developed by a heat process. Along with diazo,
it is available in the same widths as camera film. Green or beige. It is a reversal film,
used extensively in COM because of its reversal capabilities and its simplicity. It is not
considered as effective as diazo or silver print film as a source film, for two reasons:
1. It does not have the range needed to duplicate the wide variety of contrast found in
source document film.
2. It reverses the normal negative image of source document film.
Vesicular film can be exposed to normal light during development.
Silver print film: available in the same widths as original camera film. Two types of
film:
1. Reversal
2. Non-reversal
Developed under darkroom conditions. A “step test” is done before using the machine;
it is required only once and is valid for all film having the same background density.
Costly and not widely used. Advantage: improved quality in duplicating low-contrast
images.
Image quality: two major quality-control tests must be done as part of the filming process.
1. resolution test
2. lighting and density test
Resolution is the measure of the sharpness of the characters of the image on the film; it
is subjective in nature. Patterns are viewed with a 50x microscope, and the numeric
value assigned to the pattern is said to be the resolution of the film. Resolution is stated
in lines per millimeter (read p. 260).
Lighting and density are measured by a densitometer. The measurement, stated in
numbers, is called background density; the standard is 1.0 to 1.2, and overexposed film
will measure higher. The density of the clear portion is referred to as D-min; that of the
black portion, D-max. D-min density is measured most often in checking the contrast of
positive-appearing COM film.
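For context, photographic density is the base-10 logarithm of opacity, the reciprocal of
transmittance; a quick computation in Python:

import math

def density(transmittance):
    return math.log10(1 / transmittance)   # D = log10(1 / T)

print(density(0.10))   # 1.0, within the 1.0-1.2 background standard
print(density(0.50))   # ~0.3, too light; the film is underexposed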
Reduction and Magnification Ratios
Reduction ratio: the size of the film image compared to the original document. A 24 to
1 reduction means the image is 1/24 the size of the original. More documents can be
captured per reel when a high reduction ratio is used than when a low ratio is used.
Small-document cameras have fixed ratios, perhaps with two settings. Large-document
(flatbed or planetary) cameras have variable ratios from 16x to 36x. Catalogs can be put
on ultrafiche, which is filmed under laboratory conditions; a 4 x 6 ultrafiche can hold
400 images.
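A worked example of the arithmetic in Python (the page size is arbitrary):

# At 24:1 reduction, an 11-inch page becomes 11/24 of an inch on film;
# a reader needs roughly 24x magnification to restore original size.
page_inches = 11
reduction_ratio = 24
print(page_inches / reduction_ratio)   # 0.458... inch film image
# Doubling the ratio shrinks both dimensions, so roughly four times as
# many documents fit in the same film area.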
Document image orientation: the manner in which the image is positioned on the film.
Modes are described as being either cine or comic, and either duplex or duo.
Cine (cinema): horizontal lines run across the width of the film. AKA portrait mode.
Comic (comic strip): horizontal lines run across the length of the film. AKA landscape
mode.
Duplex mode (two sides): achieved by a rotary-type camera.
Duo mode: places front-side images of a cine document side by side on the film. This
method masks one side of the film while imaging the other.
Microform: generic term for anything that contains microimages. Two types:
1. roll film
2. unitized microforms
Roll film: the most economical and most frequently used microform. Used for
low-reference, long-retention, sequentially ordered documents, as well as high-reference
documents randomly retrieved by automated systems.
3 types: open reel, cassette, and cartridge (or magazine).
Unitized Microforms- aperture cards, jackets, and card jackets
Aperture cards- Started out as the "tab card" in data processing; a rectangular opening
holds a slide-in piece of microfilm. Most frequently used for engineering drawings.
Available in 35mm for engineering drawings; for office documents it is available in 16mm.
The 35mm card is standardized as the MIL-D aperture card.
MIL-D – 4 types:
1. Cold seal card- pressure-sensitive transparent tape overlaps the edge of the aperture.
2. Suspension type card- thin polyester material on three sides; the film is slipped
into the unsealed side.
3. Heat-sensitive adhesive holds the film in place.
4. Diazo and vesicular copy cards, constructed like types 1 or 3, duplicate microfilm
mounted in another card or imaged on a roll.
Identification is made first on a master data card rather than directly on the aperture
card, because the aperture card is more expensive.
Card Jackets- The 16mm aperture card is referred to as a card jacket. It can hold a
combination of 35mm and 16mm film. Film mounting is done by the suspension method. The
purpose of the jacket is to group related microfilm frames (e.g., a student record) into
one microform so it can be used like a file folder.
Microfilm jackets- Two thin clear plastic sheets sealed on two sides to form a jacket or
envelope. 16 or 35 mm. The title is put on a tape strip. The most common size is 4 x 6.
Viewed in viewers similar to those for 16mm aperture cards. Most frequently used for case
files and project-type records.
Microfiche- Most common sizes are 4 x 6 and tab-card size. Most common with catalogs and
large computer reports with wide distribution. Made several ways:
1. contact print of a microfilm jacket
2. step and repeat camera, which exposes each image on a sheet of silver film; can be
updatable
3. COM recorder, directly from tape or disk
Microfiche has uniform rows and columns of images; e.g., F-12 could be a grid location.
Rotary Cameras- Film the ordinary variety of office documents, such as checks and sales slips.
Sequential numbering device- accessory used in rotary cameras. A number is given to a
batch and this can be indexed on the computer.
Image Count device- “blip system” – aids in finding documents.
Engineering planetary camera- A rectangular table with an overhead camera and lights.
The document to be filmed is stationary. Usually uses 35mm film; some use 70mm or 105mm.
A "processor planetary camera" produces an imaged frame of 35mm microfilm mounted in
an aperture card, completely processed and ready to use.
Backlighting is used in combination with toplighting when data are printed on the
reverse side of a drawing. Backlighting also improves line image quality.
Reduction ratios are achieved by raising or lowering the film unit. 12x to 36x is common;
engineering work commonly uses 16x, 24x, and 30x.
Sectionalizing is used when a document overlaps the copy board. The document must be
filmed twice with a 2 inch overlap on each side. Done with engineering drawings. Each
frame is identified with a code such as f1 or f2.
Processor Engineering Camera- Suited to applications that need instant turnaround. The
advanced indexing system allows processor cameras to be titling cameras.
Small document planetary camera- smaller version. Suitable for those that cannot be
filmed on rotary. Bound books, booklets. Also used for jacket work and microfiche work
because the comic or portrait mode cannot be obtained on the rotary if the document is
over 11 inches long.
Step and repeat microfiche camera- an overhead camera like a planetary camera is used,
but the camera is designed so that the images are exposed in uniform rows and columns
on a sheet of 105mm film. Some develop internally and cannot be changed.
Updatable Microfiche camera- A step and repeat camera that allows an image to be added to
or deleted from a microfiche at any time. Manufacturers call it a record processor. It uses
a process called TEP (transparent electrophotography). The master fiche is filed near the
camera in a protective envelope; when a change is made, the fiche is inserted into the
camera. Archival and legal acceptance should be investigated.
Film processing- deep tank, roller transport, and straight film.
Certifications and Targets.
Certifications state that film was done in the normal course of business.
Targets- Visual guides on roll film. Usually letter-size sheets of paper with large words
such as START or END printed on them. Sometimes barcoded.
Computer Assisted Retrieval (CAR) – Uses the blip (1/8 inch in size). One blip per image.
Multilevel blips- Sizes are maxi, midi, and mini, aka primary, secondary, and tertiary. For
example, a series could be arranged as follows: by date; within the date grouping, by
invoice number; and third, by an indefinite number of supporting documents for the
invoice. The primary break is assigned the largest blip, and so on.
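A hypothetical sketch of how a multilevel blip address could be resolved by counting blips
of decreasing size as the film advances (the film layout and names are illustrative, not
from the text):

    # Each frame carries one blip; 'maxi' starts a date group, 'midi' an
    # invoice within the group, 'mini' a supporting page within the invoice.
    film = ['maxi', 'mini', 'midi', 'mini', 'mini',
            'maxi', 'midi', 'mini', 'midi', 'mini']

    def find_frame(film, date_group, invoice, page):
        """Locate the frame for the nth date group / invoice / supporting
        page (all counted from 1) by scanning blips left to right."""
        groups = invoices = pages = 0
        for i, blip in enumerate(film):
            if blip == 'maxi':
                groups += 1
                invoices = pages = 0
            elif blip == 'midi' and groups == date_group:
                invoices += 1
                pages = 0
            elif blip == 'mini' and groups == date_group and invoices == invoice:
                pages += 1
            if (groups, invoices, pages) == (date_group, invoice, page):
                return i
        return None

    # Frame 7: second date group, first invoice, first supporting page.
    print(find_frame(film, date_group=2, invoice=1, page=1))  # -> 7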
Data entry must be done in CAR to achieve indexing. Page 284.
3 types of CAR data entry styles:
1. Standalone approach- Advantage: ease of installation. Disadvantages: duplicate
data capture, and it requires a dedicated computer.
2. In-stream approach- Utilizes mainframe hardware. Advantage: can utilize mainframe
information. Disadvantage: requires a lot of programming.
3. Distributed approach- A combination of in-stream and standalone. Utilizes a PC but
can communicate with the mainframe.
Data entry and the Barcode- Can be printed automatically onto the paper or affixed with a
label.
COM- The unit that records the data is called the COM recorder or unit. COM captures
the output on film.
Cathode Ray Tube recorder- Initially used.
Laser Technology- Online or offline systems.
Online-
Offline- Not used often. Magnetic tapes are used as the input for filming on the COM recorder.
Dry and wet Silver film- Dry silver film is processed by a heat drum. Dry and wet
chemicals can be used in the same system. Wet silver film is more stable than dry, but
dry is far more convenient.
Equipment configurations
The train system includes four "boxes":
1. Tape drive
2. COM recorder
3. Interlink to the duplicator
4. Duplicator, where the duplicates are made from either vesicular or diazo film
Normal polarity for COM is positive with clear background.
On line form overlay- program a form like a W-2
Feasibility of COM- Wide distribution of computer printouts; reduction in paper cost.
75,000 pages of printout (on fiche) can be sent first class for $3.00; on paper, this
would be 15 boxes of material.
ROBEK BROWN CHAPTER 11 - Electronic Imaging Systems
Ampex- Magnetic videotape for images, but it proved technically unfeasible.
EIS- Electronic Imaging System
EIM- Electronic Imaging Management
Microfilm is an analog image that can be seen with the naked eye.
An electronic image is created by a binary digital representation, which is called a bit map.
Advantages of Electronic Imaging- Instant access to the image. Workflow. Can be part of a
transaction process. Can be justified with more than cost: if the function is vital, like
customer service, introduction can be easily justified.
The system has the following components:
1. Scanner
2. Disk drive and disks
3. Monitor
4. Printer
5. Control computers and software
Jukebox- Automatically loads disks. Standalone systems work like CAR for microfilm;
they use a simple file/retrieve/refile system.
Scanners- AKA digitizer. The document is indexed so that it can be displayed on an
electronic screen at some later time.
Overview of bit mapping- A document page is electronically divided into an imaginary
grid of millions of bit locations. The scanner records, for each grid box, whether that
portion of the image is a black or a white dot; these dots, or bits, are called pixels
(picture elements).
A black bit is recorded as 1, and a white bit (the clear background) is recorded as 0.
These bits are recorded in digital raster format, from left to right and top to bottom,
as a series of 1's and 0's on an optical disk mounted on a disk drive.
The disk drive uses a laser to make extremely small pits or bubbles on the surface of the
disk. Each pit represents a 1. The absence of a pit represents a 0.
The bit map is first recorded on a magnetic disk in the processor for review, quality
control, and compression. It is then transferred to the high-density optical disk in the
disk drive for permanent storage or automated random retrieval.
Compression ratio: Controlled by software. Computer scientists found they did not need to
record all the white spaces, or 0's. This raised the compression ratio from 7:1 to 30:1.
E.g., an average 400-word page bit mapped at 200 dpi requires 467 kilobytes; compressed
at 28:1, the document image requires 17 kilobytes. The significance of compression is that
it increases effective disk capacity. Many factors are involved in compression, so a scan
sample must be taken to estimate job volume and disk capacities.
The same software must also decompress.
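A minimal sketch of the idea behind skipping the white 0's, using simple run-length
encoding (illustrative only; real imaging systems use standardized schemes such as
CCITT Group 3/4):

    # Raster-encode one scan line of pixels, then compress it by storing
    # run lengths instead of every individual bit.
    line = [0]*50 + [1]*3 + [0]*100 + [1]*2 + [0]*45   # one 200-pixel scan line

    def rle(bits):
        runs, count, current = [], 1, bits[0]
        for b in bits[1:]:
            if b == current:
                count += 1
            else:
                runs.append((current, count))
                current, count = b, 1
        runs.append((current, count))
        return runs

    # Five (value, length) pairs replace 200 stored bits.
    print(rle(line))  # [(0, 50), (1, 3), (0, 100), (1, 2), (0, 45)]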
Resolution: In micrographics it relates to a quality standard; in imaging it relates to
planned readability. Resolution is expressed in dots per inch. A dot is a pixel or bit,
the smallest element of a bit map; it may be a 1 or a 0. The dots are, in effect, square.
More dots mean a sharper image.
200 dpi is entirely satisfactory for all office documents. 200 dpi means that one square
inch would have 200 pixels horizontally and 200 pixels vertically, or 40,000 dots.
Increased resolution results in increased scan time and added storage space. An increase
from 200 dpi to 300 dpi is a 50 percent increase in each linear dimension, but because the
dot count scales with the square of the resolution, storage (and scan time) grows by 125
percent: 90,000 versus 40,000 dots per square inch.
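A small sketch of the storage arithmetic above (the 8.5 x 11 inch page size is an
assumption; the 467 KB and 28:1 figures are from the text):

    # Raw bitmap size for a bitonal (1 bit per pixel) page scan.
    def raw_bitmap_kb(width_in, height_in, dpi):
        bits = (width_in * dpi) * (height_in * dpi)
        return bits / 8 / 1000            # bits -> bytes -> kilobytes

    kb_200 = raw_bitmap_kb(8.5, 11, 200)
    print(kb_200)                         # 467.5 -- the ~467 KB figure above
    print(kb_200 / 28)                    # ~16.7 -- the ~17 KB compressed figure
    print(raw_bitmap_kb(8.5, 11, 300) / kb_200)  # 2.25 -- the jump at 300 dpi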
Two basic types of scanners:
1. Flatbed type- similar to many office copiers. Very slow; made for low volume.
2. Sheetfed digitizer- stack tray, etc. 1,000 to 7,000 pages per hour.
Micrographic scanner- Scans (digitizes) microforms.
Large-size document scanners are used to digitize engineering drawings varying from C
to E size.
Facsimile transceiver: part of a work flow system.
Gray scale or color scanning: Gray scale scanning may require more than eight times as
much disk space as simple black and white.
SCSI, pronounced "scuzzy": Small Computer System Interface, a hardware interface standard.
Paper size limitations of scanners.
Magnetic disks: An optical disk has 15 to 50 times the areal density of a magnetic disk.
Magnetic disks are used for temporary QC.
Optical disk: Some disks are proprietary.
Disk space: some is used for formatting data. Before images are scanned the disk is
evenly divided into pie shaped segments called sectors.
Disk life is estimated at 100 years, but shelf life is only five years before it is
recorded. A partially recorded disk may not be suitable for additional recording after
it is five years old.
Types of optical disks: 3 types.
1. WORM- Write Once Read Many. Most often used in imaging because it cannot be
changed (evidentiary purposes).
2. Rewritable- aka erasable or MO disk (Magneto-Optics technology). A laser light
and a magnetic field change the polarity of the bits, 0's to 1's and vice versa, so it
can alter or delete a previously recorded image.
3. CD-ROM- Read Only Memory. Was only in the lab until recently. Not usually part of a
document imaging system; rather, it is used for databases. When the ROM disk is
created in house using a mastering recorder, a disk referred to as CD-Recordable
(CD-R) is used. Feasible when distribution is fewer than 40 copies. 4.75 inches in diameter.
WORM disk with 200 dpi and 10:1 compression on a 12 inch double sided disk can hold
200,000 pages.
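A quick sanity check on that capacity figure, reusing the ~467 KB raw page size computed
earlier (the per-page size at 10:1 is an inference, not a quoted figure):

    # ~467 KB raw per page at 200 dpi; at 10:1 compression, ~47 KB per page.
    pages = 200_000
    kb_per_page = 467.5 / 10
    print(pages * kb_per_page / 1e6)  # ~9.35 GB on the 12-inch double-sided disk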
Disk Drives: Drives are interfaced with a SCSI to the control processor.
Specification sheets: include such items as data transfer rate, disk access time, and seek
time; media tolerance, which affects data reliability with older, previously recorded disks;
error correction, affecting data loss due to dust or static electricity; and estimated drive life.
Monitors: A VGA monitor is used in the typical PC environment. The five-year cost of a
knowledge worker is $150,000; if a quality monitor increases productivity by just 1 percent,
the gain is more than enough to pay for it.
Size: Minimum size is 19 inches. Should be able to see full document on screen without
scrolling.
Resolution: Resolution of the monitor screen should be compatible with the scanned
image. Why bother scanning at 200 dpi if monitor only reads at 150 dpi? Because printers
have capability of printing at the 200 dpi resolution.
Refresh rate: How often the monitor redraws the screen. Wavering causes eye fatigue;
at least 75 Hz is needed.
Image display Controller: internal electronic board that controls all of the display
functions, but primarily screen image display, clarity, and speed.
Printers: Laser printer and inkjet.
Jukeboxes: AKA autochangers, automated disk libraries or optical disk changers.
Two types of Electronic Document Imaging Systems:
1. Electronic File System: is a counterpart to the microfilm based computer assisted
retrieval. Papers are scanned in random order, indexed and recorded on an optical
disk in binary format, and retrieved and displayed on a computer monitor.
2. Single Station System: Most often used when the operator is the primary user of the
system, rather than a facilitator or information provider to another user.
Networks: LAN, MAN, WAN- MAN stands for Metropolitan Area Network.
Networks are either central or distributed.
The central computer is known as the server; workstations are known as clients.
Distributed networks are controlled equally by each workstation. The failure of one
workstation will not disable the entire system as is the case with a central system.
COLD: Most use a service bureau.
Work flow: high speed networking technology.
Accounts payable scenario: things can be routed throughout organization.
Hybrid Systems:
Scan on demand conversions: microfilm to imaging.
Aperture card applications: It is easier and faster to scan an image of a large drawing
from an aperture card than it is to scan the original document with a large document
scanner. CAD is Computer Aided Design.
COM image output- COM recorders usually record only in digital ASCII. Example: the
St. Louis military records.
Microfilm image transmission: Microfilm is scanned and then sent in less than 10 seconds
to the requestor.
Other Hybrid Uses: Scan and film at the same time. For backup. Using disk for high
activity and film for the long retention.
Microfilm is still the best media for long term storage. Not everyone needs two second
retrieval.
ROBEK CHAPTER 12 – IMAGING SYSTEMS EVALUATION AND SYSTEMS
DESIGN
Value of a Systems Design Study – One must understand the advantages, capabilities,
limitations, costs, and savings potential of imaging. A systems design study improves the
chances that imaging will be used judiciously. Study of the existing system – all aspects
of filing, interfiling, refiling, and sorting, as well as quantities and volumes of
documents and related clerical activity, must be examined.
Reasons for the Study
Systems Design Study – can be used to evaluate an existing document imaging system to
determine if it is performing as planned. The Record Manager will consider all
alternatives, including microfilm, optical disk, and other media as well as hard copy.
The Imaging System Study has four phases:
1. Data collection and analysis
2. Problem definition and system design
3. Cost justification and approval
4. Implementation
1. Data Collection and Analysis – Fact-finding report. Almost identical to the
information found in the records inventory form. The report contains many points,
including the following.
Section II - General Description, Purpose and Use of Records Series
Know why a record came into existence. Understand the function of a document.
Is the document a vital record? If so, imaging should be considered as a method of
vital records protection.
Point of Stability – The point at which no more annotations or additions are made to
the document or file. If document or file is subject to annotations throughout its
normal life, such as is the case with some engineering drawings, the analyst must
consider alternate methods of annotation.
User Consideration – Ergonomics – physical welfare and employee comfort. Know
the number of people who use the records.
Section III – Retention and Legality
Very short retention times seldom justify imaging. Long retention periods often
justify micrographics.
Section IV – Volume
Understanding the volume of records can point to what type of medium should be
used to maintain records.
Backfile – Sometimes backfile records include records that are retained beyond
their peak access period. This means that records due for destruction in the
near future are not good candidates for imaging. If the backfile has a long retention
period and a high reference rate, it may be worth imaging those records. Sometimes
backfile records are not imaged but are stored in records centers and retrieved as
needed. Sometimes a "scan on demand" conversion is performed, meaning that the
records are not imaged unless the record is retrieved. Permanent records are usually
imaged because the image is archival.
Section V – References
The purpose of knowing current retrieval time requirements is to be able to show, in a
cost analysis, potential saving or increased costs comparing a new system to an
existing one. The information will also indicate if the present system meets the time
norms of the unit it serves.
Section VI – Sorting and filing arrangements
File arrangement can be helpful in designing parameters for indexing in the
automated system. The current filing system can be evaluated for weaknesses.
Section VII – Equipment in Office
The value of the equipment is evaluated, along with the value of the real estate
currently used. The cost of labor to run the manual system is also evaluated.
Section VIII – File Supplies Utilized
Evaluate the cost of file folders etc.
Section IX: Input/Output Records
Evaluate records that are related to the main records studied. These records may
become a part of the proposed system and may require a detailed study on each
record involved.
Section X: Duplicate and Alternative Sources
Evaluate Duplication…
Section XI: File Content and Distribution
Describes the file, but also describes the documents that make up the file.
Section XII: Origination and Purpose of Component Documents
This section allows us to describe in a narrative fashion who originates the
documents, why they were created, and how they are used now.
Section XIII: Processing and Transmitting of Component Documents
Flowchart describing the life cycle of each document that makes up the file.
Problem Definition and Systems Design
Problems:
• Loss of records
• Slow retrieval
• Excessive volume of records
• Lack of space
• Excessive waiting for access to records
• Excessive person hours spent in filing and retrieval
• Delays in processing
• Loss of records in process
Flowchart for the proposed system should be developed to aid understanding.
Testing and Evaluation
A sample of the records should be imaged to test costs and work out procedures in detail.
Cost Justification and Approval
All systems have costs associated with them, but justifying these costs is another matter.
The costs fall into labor, materials, and equipment.
Cost Justification – comparing the total costs of an existing system to the total costs of a
proposed system.
Feasibility Report aka Needs Assessment report – A report submitted in order to obtain
approval from upper management to install a costly new system. It should show the cost
of the old system in terms of labor, space, materials and equipment, and the cost of the
proposed imaging system in the same terms.
RFP – Request for Proposal – the vendor is requested to respond by stating how it will be
done, what is required to do it, and the cost.
After the system has been approved, planning for implementation is finalized. Training
and facility modification will be covered.
COM – Computer Output Microfilm
COLD – Computer Output to Laser Disk
ROBEK CHAPTER 15 - REPROGRAPHICS AND COPY MANAGEMENT AND
CONTROL
A new copier results in an increase in the volume of copying of as much as 20 to 30 percent.
The average business document is copied 19 times, and 35 percent of copies are unnecessary.
Copiers and duplicators- Copiers make copies directly from an original. Duplicators
require an intermediate master.
Copier Justification
Cost per copy- Only one part of the cost. When copying volume rises at a greater rate
than total cost, cost per copy can be dropping even as total cost is rising.
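A worked example of that distinction (the numbers are illustrative assumptions):

    # Volume rises 30% while total cost rises only 10%:
    old_copies, old_cost = 100_000, 5_000.0
    new_copies, new_cost = 130_000, 5_500.0
    print(old_cost / old_copies)            # 0.05 per copy
    print(round(new_cost / new_copies, 4))  # 0.0423 per copy -- unit cost fell
    print(new_cost - old_cost)              # 500.0 -- but total spending rose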
Total Cost Factors – contract- you could buy paper from the vendor company and build
in a discount.
CBSC- Copies Between Service Calls
SCF- Service Call Frequency
The copy machine requires more service than any other piece of office equipment.
Original Copy- Stamp original copy to reduce filing of duplicate material.
Jan. 1, 1978- copyright law.
Unnecessary Copies- 2 major abuses:
1. Extra copy- just in case.
2. Information copy- magazine articles.
Remedies include reducing distribution and controlling the routing.
Program Evaluation:
Detailed records should be kept on all copying and duplicating equipment and their use.
A folder for each copier should contain all contracts, update sheets, written
correspondence, maintenance reports, service calls, and reports of oral communication
concerning the equipment.
A billing report may be calculated each month for all machines.
Locator report- volume by user.
Budget expenditure report- year to date computations on all expenses.
Service report- provides a service history of all equipment. Includes date of request, type
of problem, date of repair and nature of repair.
Productivity analysis report- monthly volume of all machines. Excessively high volumes
might be an indication that more copying equipment is needed or that inappropriate use is
being made of existing equipment.
These reports are useful in selecting equipment, in allocating resources and determining
charge back costs.
Remote Diagnostics- The first type is meter-reading capability. The second is designed to
automate service and customer support. The third, still on the drawing board, is a machine
that will interface with the copier.
Copy Control- to slow down copy use and for charge back to clients.
Feeders:
1. Single feeders
2. Stack feeders
3. Recirculating stack feeders
ADF- Automatic Document Feeder. Computer forms feeder.
Duplex- double sided.
Original size sensing- change size according to the paper size.
Finishing- Hole punching and stapling.
Clamshell access- small copiers that open up.
Convenience copiers- 10 to 40 cpm.
Copy center- In buildings with more than 50 people. 50 to 90 cpm; monthly output of
100,000 pages.
CRD- Central Reproduction Department- exceeds 90 cpm; 100,000 to 500,000 copies per
month.
DTP- desk top publishing.
MFD- Multifunction devices- laser printing, faxing and digital scanning.
IPU- intelligent processing unit- computer interface for image editing printing and
scanning.
POD- Printing on Demand.
Pen Plotters- fiber tip, felt tip, or ballpoint.
Non-electronic duplicators:
1. Offset
2. Stencil
3. Spirit
Offset- Used by print shops. Cheaper than electrostatic.
Stencil- wax coated sheet.
Spirit duplicator- solvent for the image transfer. AKA ditto machine.
Part 5 – Section F. - Data Management Definitions
Section F. contains many technical terms. I had help compiling these definitions.
Thanks to Jon Cohen, Ellie Kidd, Robert Tocher, Mary Nyce and Kathleen Roth.
F. Data Management
1. System architecture
2. Devices
3. Programs, Software, and Applications
4. Security/Accessibility
5. Data use
6. Data storage
F. DATA MANAGEMENT
1. System architecture.
2. Devices.
*terms compiled by Ellie Kidd
analog data
Also spelled analogue; describes a device or system that represents changing values as continuously
variable physical quantities. A typical analog device is a clock in which the hands move
continuously around the face. Such a clock is capable of indicating every possible time of day. In
contrast, a digital clock is capable of representing only a finite number of times (every tenth of a
second, for example). In general, humans experience the world analogically. Vision, for example,
is an analog experience because we perceive infinitely smooth gradations of shapes and colors.
When used in reference to data storage and transmission, analog format is that in which
information is transmitted by modulating a continuous transmission signal, such as amplifying a
signal's strength or varying its frequency to add or take away data. For example, telephones take
sound vibrations and turn them into electrical vibrations of the same shape before they are
transmitted over traditional telephone lines. Radio wave transmissions work in the same way.
Computers, which handle data in digital form, require modems to turn signals from digital to
analog before transmitting those signals over communication lines such as telephone lines that
carry only analog signals. The signals are turned back into digital form (demodulated) at the
receiving end so that the computer can process the data in its digital format.
digital data
(adj.) Describes any system based on discontinuous data or events. Computers are digital
machines because at their most basic level they can distinguish between just two values,
0 and 1, or off and on. There is no simple way to represent all the values in between, such
as 0.25. All data that a computer processes must be encoded digitally, as a series of
zeroes and ones.
The opposite of digital is analog. A typical analog device is a clock in which the hands
move continuously around the face. Such a clock is capable of indicating every possible
time of day. In contrast, a digital clock is capable of representing only a finite number of
times (every tenth of a second, for example).
In general, humans experience the world analogically. Vision, for example, is an analog
experience because we perceive infinitely smooth gradations of shapes and colors. Most
analog events, however, can be simulated digitally. Photographs in newspapers, for
instance, consist of an array of dots that are either black or white. From afar, the viewer
does not see the dots (the digital form), but only lines and shading, which appear to be
continuous. Although digital representations are approximations of analog events, they
are useful because they are relatively easy to store and manipulate electronically. The
trick is in converting from analog to digital, and back again.
This is the principle behind compact discs (CDs). The music itself exists in an analog
form, as waves in the air, but these sounds are then translated into a digital form that is
encoded onto the disk. When you play a compact disc, the CD player reads the digital
data, translates it back into its original analog form, and sends it to the amplifier and
eventually the speakers.
Internally, computers are digital because they consist of discrete units called bits that are
either on or off. But by combining many bits in complex ways, computers simulate
analog events. In one sense, this is what computer science is all about.
voice mail
Refers to e-mail systems that support audio. Users can leave spoken messages for one another
and listen to the messages by executing the appropriate command in the e-mail system.
fax
(v) To send a document via a fax machine. Short for facsimile machine, a device that can send or
receive pictures and text over a telephone line.
Fax machines work by digitizing an image -- dividing it into a grid of dots. Each dot is either on
or off, depending on whether it is black or white. Electronically, each dot is represented by a bit
that has a value of either 0 (off) or 1 (on). In this way, the fax machine translates a picture into a
series of zeros and ones (called a bit map) that can be transmitted like normal computer data. On
the receiving side, a fax machine reads the incoming data, translates the zeros and ones back into
dots, and reprints the picture.
The idea of fax machines has been around since 1842, when Alexander Bain invented a machine
capable of receiving signals from a telegraph wire and translating them into images on paper. In
1850, a London inventor named F. C. Blakewell received a patent for a similar machine, which he
called a copying telegraph.
But while the idea of fax machines has existed since the 1800s, fax machines did not become
popular until the mid 1980s. The spark igniting the fax revolution was the adoption in 1983 of a
standard protocol for sending faxes at rates of 9,600 bps. The standard was created by the CCITT
standards organization and is known as the Group 3 standard. Now, faxes are commonplace in
offices of all sizes. They provide an inexpensive, fast, and reliable method for transmitting
correspondence, contracts, résumés, handwritten notes, and illustrations.
A fax machine consists of an optical scanner for digitizing images on paper, a printer for printing
incoming fax messages, and a telephone for making the connection. The optical scanner generally
does not offer the same quality of resolution as stand-alone scanners. Some printers on fax
machines are thermal, which means they require a special kind of paper.
All fax machines conform to the CCITT Group 3 protocol. (There is a new protocol called Group
4, but it requires ISDN lines.) The Group 3 protocol supports two classes of resolution: 203 by 98
dpi and 203 by 196 dpi. The protocol also specifies a data-compression technique and a
maximum transmission speed of 9,600 bps.
Some of the features that differentiate one fax machine from another include the following:
• speed: fax machines transmit data at different rates, from 4,800 bps to 28,800 bps. A
9,600-bps fax machine typically requires 10 to 20 seconds to transmit one page.
• printer type: Most fax machines use a thermal printer that requires special paper that
tends to turn yellow or brown after a period. More expensive fax machines have printers
that can print on regular bond paper.
• paper size: The thermal paper used in most fax machines comes in two basic sizes:
8.5-inches wide and 10.1-inches wide. Some machines accept only the narrow-sized
paper.
• paper cutter: Most fax machines include a paper cutter because the thermal paper that
most fax machines use comes in rolls. The least expensive models and portable faxes,
however, may not include a paper cutter.
• paper feed : Most fax machines have paper feeds so that you can send multiple-page
documents without manually feeding each page into the machine.
• autodialing: fax machines come with a variety of dialing features. Some enable you to
program the fax to send a document at a future time so that you can take advantage of the
lowest telephone rates.
As an alternative to stand-alone fax machines, you can also put together a fax system by
purchasing separately a fax modem and an optical scanner. You may not even need the optical
scanner if the documents you want to send are already in electronic form.
teleconferencing
(1) To hold a conference via a telephone or network connection. Computers have given new
meaning to the term because they allow groups to do much more than just talk. Once a
teleconference is established, the group can share applications and mark up a common
whiteboard. There are many teleconferencing applications that work over private networks. One
of the first to operate over the Internet is Microsoft's NetMeeting.
(2) To deliver live events via satellite to geographically dispersed downlink sites.
Video conference - Conducting a conference between two or more participants at different sites by
using computer networks to transmit audio and video data. For example, a point-to-point (two-person) video conferencing system works much like a video telephone. Each participant has a
video camera, microphone, and speakers mounted on his or her computer. As the two participants
speak to one another, their voices are carried over the network and delivered to the other's
speakers, and whatever images appear in front of the video camera appear in a window on the
other participant's monitor.
Multipoint videoconferencing allows three or more participants to sit in a virtual conference room
and communicate as if they were sitting right next to each other. Until the mid 90s, the hardware
costs made videoconferencing prohibitively expensive for most organizations, but that situation is
changing rapidly. Many analysts believe that videoconferencing will be one of the fastest-growing segments of the computer industry in the latter half of the decade.
network
(n.) A group of two or more computer systems linked together. There are many types of
computer networks, including:
• local-area networks (LANs) : The computers are geographically close together (that
is, in the same building).
• wide-area networks (WANs) : The computers are farther apart and are connected by
telephone lines or radio waves.
• campus-area networks (CANs): The computers are within a limited geographic area,
such as a campus or military base.
• metropolitan-area networks (MANs): A data network designed for a town or city.
• home-area networks (HANs): A network contained within a user's home that
connects a person's digital devices.
In addition to these types, the following characteristics are also used to categorize different types
of networks:
• topology : The geometric arrangement of a computer system. Common topologies
include a bus, star, and ring. See the Network topology diagrams in the Quick Reference
section of Webopedia.
• protocol : The protocol defines a common set of rules and signals that computers on
the network use to communicate. One of the most popular protocols for LANs is called
Ethernet. Another popular LAN protocol for PCs is the IBM token-ring network .
• architecture : Networks can be broadly classified as using either a peer-to-peer or
client/server architecture.
Computers on a network are sometimes called nodes. Computers and devices that allocate
resources for a network are called servers.
(v.) To connect two or more computers together with the ability to communicate with each other.
VAN
- Acronym for Value Added Network: refers to a private network provider that leases
communication lines to its subscribers. VANs provide specialized services such as
assisting with EDI (electronic data interchange), extra security, message delivery, or access
to a particular database.
EDI
Short for Electronic Data Interchange, the transfer of data between different companies using
networks, such as VANs or the Internet. As more and more companies get connected to the
Internet, EDI is becoming increasingly important as an easy mechanism for companies to buy,
sell, and trade information. ANSI has approved a set of EDI standards known as the X12
standards.
LAN
A computer network that spans a relatively small area. Most LANs are confined to a single
building or group of buildings. However, one LAN can be connected to other LANs over any
distance via telephone lines and radio waves. A system of LANs connected in this way is called a
wide-area network (WAN).
Most LANs connect workstations and personal computers. Each node (individual computer ) in a
LAN has its own CPU with which it executes programs, but it also is able to access data and
devices anywhere on the LAN. This means that many users can share expensive devices, such as
laser printers, as well as data. Users can also use the LAN to communicate with each other, by
sending e-mail or engaging in chat sessions.
There are many different types of LANs, Ethernet being the most common for PCs. Most Apple
Macintosh networks are based on Apple's AppleTalk network system, which is built into
Macintosh computers.
The following characteristics differentiate one LAN from another:
• topology : The geometric arrangement of devices on the network. For example,
devices can be arranged in a ring or in a straight line.
• protocols : The rules and encoding specifications for sending data. The protocols also
determine whether the network uses a peer-to-peer or client/server architecture.
• media : Devices can be connected by twisted-pair wire, coaxial cables, or fiber optic
cables. Some networks do without connecting media altogether, communicating instead
via radio waves.
LANs are capable of transmitting data at very fast rates, much faster than data can be transmitted
over a telephone line; but the distances are limited, and there is also a limit on the number of
computers that can be attached to a single LAN.
Twisted Pair Wire - A type of cable that consists of two independently insulated wires
twisted around one another. The use of two wires twisted together helps to reduce
crosstalk and electromagnetic induction. While twisted-pair cable is used by older
telephone networks and is the least expensive type of local-area network (LAN) cable,
most networks contain some twisted-pair cabling at some point along the network. Other
types of cables used for LANs include coaxial cables and fiber optic cables.
Coaxial Cable A type of wire that consists of a center wire surrounded by
insulation and then a grounded shield of braided wire. The shield minimizes
electrical and radio frequency interference. Coaxial cabling is the primary type of
cabling used by the cable television industry and is also widely used for computer
networks, such as Ethernet. Although more expensive than standard telephone wire,
it is much less susceptible to interference and can carry much more data.
Fiber Optic - A technology that uses glass (or plastic) threads (fibers) to transmit data. A
fiber optic cable consists of a bundle of glass threads, each of which is capable of
transmitting messages modulated onto light waves.
Fiber optics has several advantages over traditional metal communications lines:
• Fiber optic cables have a much greater bandwidth than metal cables. This means that
they can carry more data.
• Fiber optic cables are less susceptible than metal cables to interference.
• Fiber optic cables are much thinner and lighter than metal wires.
• Data can be transmitted digitally (the natural form for computer data) rather than
analogically.
The main disadvantage of fiber optics is that the cables are expensive to install. In
addition, they are more fragile than wire and are difficult to splice.
Fiber optics is a particularly popular technology for local-area networks. In addition,
telephone companies are steadily replacing traditional telephone lines with fiber optic
cables. In the future, almost all communications will employ fiber optics.
MAN
Short for Metropolitan Area Network, a data network designed for a town or city. In terms of
geographic breadth, MANs are larger than local-area networks (LANs) but smaller than wide-area networks (WANs). MANs are usually characterized by very high-speed connections using
fiber-optic cable or other digital media.
WAN
A computer network that spans a relatively large geographical area. Typically, a WAN consists of
two or more local-area networks (LANs).
Computers connected to a wide-area network are often connected through public networks, such
as the telephone system. They can also be connected through leased lines or satellites. The largest
WAN in existence is the Internet.
Computer
A programmable machine. The two principal characteristics of a computer are:
• It responds to a specific set of instructions in a well-defined manner.
• It can execute a prerecorded list of instructions (a program).
Modern computers are electronic and digital. The actual machinery -- wires, transistors, and
circuits -- is called hardware; the instructions and data are called software.
All general-purpose computers require the following hardware components:
• memory : Enables a computer to store, at least temporarily, data and programs.
• mass storage device : Allows a computer to permanently retain large amounts of data.
Common mass storage devices include disk drives and tape drives.
• input device : Usually a keyboard and mouse, the input device is the conduit through
which data and instructions enter a computer.
• output device : A display screen, printer, or other device that lets you see what the
computer has accomplished.
• central processing unit (CPU): The heart of the computer, this is the component that
actually executes instructions.
In addition to these components, many others make it possible for the basic components to work
together efficiently. For example, every computer requires a bus that transmits data from one part
of the computer to another.
Computers can be generally classified by size and power as follows, though there is considerable
overlap:
• personal computer : A small, single-user computer based on a microprocessor. In
addition to the microprocessor, a personal computer has a keyboard for entering data, a
monitor for displaying information, and a storage device for saving data.
• workstation : A powerful, single-user computer. A workstation is like a personal
computer, but it has a more powerful microprocessor and a higher-quality monitor.
• minicomputer : A multi-user computer capable of supporting from 10 to hundreds of
users simultaneously.
• mainframe : A powerful multi-user computer capable of supporting many hundreds or
thousands of users simultaneously.
• supercomputer : An extremely fast computer that can perform hundreds of millions of
instructions per second.
Computer Operating System
The most important program that runs on a computer. Every general-purpose computer must have
an operating system to run other programs. Operating systems perform basic tasks, such as
recognizing input from the keyboard, sending output to the display screen, keeping track of files
and directories on the disk, and controlling peripheral devices such as disk drives and printers.
For large systems, the operating system has even greater responsibilities and powers. It is like a
traffic cop -- it makes sure that different programs and users running at the same time do not
interfere with each other. The operating system is also responsible for security, ensuring that
unauthorized users do not access the system.
Operating systems can be classified as follows:
• multi-user : Allows two or more users to run programs at the same time. Some
operating systems permit hundreds or even thousands of concurrent users. Refers to
computer systems that support two or more simultaneous users. All mainframes and
minicomputers are multi-user systems, but most personal computers and workstations are
not. Another term for multi-user is time sharing.
• multiprocessing : Supports running a program on more than one CPU. (1) Refers
to a computer system's ability to support more than one process (program) at the
same time. Multiprocessing operating systems enable several programs to run
concurrently. UNIX is one of the most widely used multiprocessing systems, but
there are many others, including OS/2 for high-end PCs. Multiprocessing systems
are much more complicated than single-process systems because the operating
system must allocate resources to competing processes in a reasonable manner.
(2) Refers to the utilization of multiple CPUs in a single computer system. This is
also called parallel processing.
• multitasking : Allows more than one program to run concurrently. The ability
to execute more than one task at the same time, a task being a program. The terms
multitasking and multiprocessing are often used interchangeably, although
multiprocessing implies that more than one CPU is involved.
In multitasking, only one CPU is involved, but it switches from one program to
another so quickly that it gives the appearance of executing all of the programs at
the same time.
There are two basic types of multitasking: preemptive and cooperative. In preemptive
multitasking, the operating system parcels out CPU time slices to each program. In
cooperative multitasking, each program can control the CPU for as long as it needs it. If a
program is not using the CPU, however, it can allow another program to use it
temporarily. OS/2, Windows 95, Windows NT, the Amiga operating system and UNIX
use preemptive multitasking, whereas Microsoft Windows 3.x and the MultiFinder (for
Macintosh computers) use cooperative multitasking.
• multithreading : Allows different parts of a single program to run concurrently. The
ability of an operating system to execute different parts of a program, called threads,
simultaneously. The programmer must carefully design the program in such a way that all
the threads can run at the same time without interfering with each other. (A short sketch
follows this section.)
• real time: Responds to input instantly. General-purpose operating systems, such as
DOS and UNIX, are not real-time. Occurring immediately. The term is used to describe a
number of different computer features. For example, real-time operating systems are
systems that respond to input immediately. They are used for such tasks as navigation, in
which the computer must react to a steady flow of new information without interruption.
Most general-purpose operating systems are not real-time because they can take a few
seconds, or even minutes, to react.
Real time can also refer to events simulated by a computer at the same speed that they
would occur in real life. In graphics animation, for example, a real-time program would
display objects moving across the screen at the same speed that they would actually
move.
Operating systems provide a software platform on top of which other programs, called
application programs, can run. The application programs must be written to run on top of a
particular operating system. Your choice of operating system, therefore, determines to a great
extent the applications you can run. For PCs, the most popular operating systems are DOS, OS/2,
and Windows, but others are available, such as Linux.
As a user, you normally interact with the operating system through a set of commands. For
example, the DOS operating system contains commands such as COPY and RENAME for
copying files and changing the names of files, respectively. The commands are accepted and
executed by a part of the operating system called the command processor or command line
interpreter. Graphical user interfaces allow you to enter commands by pointing and clicking at
objects that appear on the screen.
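A minimal Python sketch of the multithreading idea mentioned above (purely illustrative):

    import threading

    # Two parts of one program, executed as separate threads.
    def count(label, n):
        for i in range(n):
            print(label, i)

    t1 = threading.Thread(target=count, args=("A", 3))
    t2 = threading.Thread(target=count, args=("B", 3))
    t1.start(); t2.start()   # both parts now run concurrently
    t1.join(); t2.join()     # wait for both to finish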
Network Operating System
Abbreviated as NOS, an operating system that includes special functions for connecting
computers and devices into a local-area network (LAN). Some operating systems, such as UNIX
and the Mac OS, have networking functions built in. The term network operating system,
however, is generally reserved for software that enhances a basic operating system by adding
networking features. Novell Netware, Artisoft's LANtastic, Microsoft Windows Server, and
Windows NT are examples of an NOS.
3. Programs, Software, and Applications.
4. Security/Accessibility.
*terms compiled by Jon Cohen
3. Programs, Software, and Applications
a. Database
In computing, a database can be defined as a structured collection of records or data that is stored in a
computer so that a program can consult it to answer queries. The records retrieved in answer to queries
become information that can be used to make decisions.
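A minimal sketch using Python's built-in sqlite3 module, showing a query turning stored
records into information for a decision (the table and figures are invented, echoing the
widget example under data warehousing below):

    import sqlite3

    con = sqlite3.connect(":memory:")   # throwaway in-memory database
    con.execute("CREATE TABLE sales (day TEXT, widgets INTEGER)")
    con.executemany("INSERT INTO sales VALUES (?, ?)",
                    [("Mon", 12), ("Tue", 30), ("Wed", 18)])

    # Consulting the database to answer a query:
    row = con.execute(
        "SELECT day, widgets FROM sales ORDER BY widgets DESC LIMIT 1"
    ).fetchone()
    print(row)  # ('Tue', 30) -- the best sales day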
b. Data mining
also called Knowledge-Discovery in Databases (KDD) or Knowledge-Discovery and Data Mining, is
the process of automatically searching large volumes of data for patterns using tools such as classification,
association rule mining, clustering, etc.
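A toy sketch of one such primitive, support counting for association rule mining (the
transactions are invented; production tools use far more sophisticated algorithms):

    from itertools import combinations
    from collections import Counter

    # Each transaction is a set of purchased items.
    transactions = [{"bread", "milk"}, {"bread", "milk", "butter"},
                    {"bread", "butter"}, {"bread", "milk"}]

    # Count how often each pair of items occurs together (its "support").
    pair_counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            pair_counts[pair] += 1

    print(pair_counts.most_common(1))  # [(('bread', 'milk'), 3)]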
c. Data warehousing
A data warehouse is the main repository of the organization's historical data, its corporate memory. For
example, an organization would use the information that's stored in its data warehouse to find out what day
of the week they sold the most widgets in May 1992, or how employee sick leave the week before
Christmas differed between California and Quebec from 2001-2005. In other words, the data warehouse
contains the raw material for management's decision support system. The critical factor leading to the use
of a data warehouse is that a data analyst can perform complex queries and analysis (such as data mining)
on the information without slowing down the operational systems.
d. Enterprise Content Management (ECM)
Enterprise Content Management (ECM) is any of the strategies and technologies employed in the
information technology industry for managing the capture, storage, security, revision control, retrieval,
distribution, preservation and destruction of documents and content. ECM especially concerns content
imported into or generated from within an organization in the course of its operation, and includes the
control of access to this content from outside of the organization's processes.
ECM systems are designed to manage both structured and unstructured content, so that an organization,
such as a business or governmental agency, can more effectively meet business goals (increase profit or
improve the efficient use of budgets), serve its customers (as a competitive advantage, or to improve
responsiveness), and protect itself (against non-compliance, law-suits, uncoordinated departments or
turnover within the organization). In a large enterprise, ECM is not regarded as an optional expense,
because it is essential to content preservation and re-usability and to the control of access to
content; very small organizations, by contrast, may find their needs temporarily met by carefully
managed shared folders and a wiki, for example. Recent trends in business and government indicate
that ECM is becoming a core investment for organizations of all sizes, more immediately tied to
organizational goals than in the past: increasingly central to what an enterprise does and how it
accomplishes its mission.
e. Website management
Managing the content and processes of a website
f. Electronic Document Management
a computer system (or set of computer programs) used to track and store electronic documents and/or
images of paper documents. The term has some overlap with the concepts of Content Management Systems
and is often viewed as a component of Enterprise Content Management Systems and related to Digital
Asset Management.
g. Operating systems, utilities and diagnostics
An operating system (OS) is a set of computer programs that manage the hardware and software resources
of a computer. An operating system controls the computer's electronic devices in response to approved
commands. At the foundation of all system software, an operating system performs basic tasks such as
controlling and allocating memory, prioritizing system requests, controlling input and output devices,
facilitating networking, and managing file systems. Most operating systems have a command line
interpreter as a basic user interface, but they may also provide a graphical user interface (GUI) for ease of
operation. The operating system forms a platform for other system software and for application software.
Utilities and diagnostics – tools a computer uses to find and fix problems.
h. RIM software
Records and information management (RIM) software supports functions such as classification,
retention scheduling, and disposition of records.
i. Email
Electronic mail (abbreviated "e-mail" or, often, "email") is a store and forward method of composing,
sending, storing, and receiving messages over electronic communication systems. The term "e-mail" (as a
noun or verb) applies both to the Internet e-mail system based on the Simple Mail Transfer Protocol
(SMTP) and to intranet systems allowing users within one organization to e-mail each other. Often these
workgroup collaboration organizations may use the Internet protocols for internal e-mail service.
j. Instant Messaging
Instant messaging or IM is a form of real-time communication between two or more people based on
typed text. The text is conveyed via computers connected over a network such as the Internet.
k. Artificial Intelligence
The term Artificial Intelligence (AI) was first used by John McCarthy, who considers it to mean "the
science and engineering of making intelligent machines". It can also refer to intelligence as exhibited by
an artificial (man-made, non-natural, manufactured) entity. The terms strong and weak AI can be used to
narrow the definition for classifying such systems. AI is studied in overlapping fields of computer science,
psychology, neuroscience and engineering, dealing with intelligent behavior, learning and adaptation and
usually developed using customized machines or computers.
Research in AI is concerned with producing machines to automate tasks requiring intelligent behavior.
Examples include control, planning and scheduling, the ability to answer diagnostic and consumer
questions, handwriting, natural language, speech and facial recognition. As such, the study of AI has also
become an engineering discipline, focused on providing solutions to real life problems, knowledge mining,
software applications, strategy games like computer chess and other video games. One of the biggest
difficulties with AI is that of comprehension. Many devices have been created that can do amazing things,
but critics of AI claim that no actual comprehension by the AI machine has taken place.
l. Other
4. Security/Accessibility
Data security is the means of ensuring that data is kept safe from corruption and that access to it is suitably
controlled. Thus data security helps to ensure privacy. It also helps in protecting personal data.
Information security is the process of protecting data from unauthorized access, use, disclosure,
destruction, modification, or disruption. The terms information security, computer security and
information assurance are frequently used interchangeably. These fields are interrelated and share the
common goals of protecting the confidentiality, integrity and availability of information; however, there are
some subtle differences between them. These differences lie primarily in the approach to the subject, the
methodologies used, and the areas of concentration. Information security is concerned with the
confidentiality, integrity and availability of data regardless of the form the data may take: electronic, print,
or other forms.
The International Standard ISO/IEC 17799 covers data security under the topic of information security, and
one of its cardinal principles is that all stored information, i.e. data, should be owned so that it is clear whose
responsibility it is to protect and control access to that data.
a. Access rights
Most modern file systems have methods of administering permissions or access rights to specific users
and groups of users. These systems control the ability of the users affected to view or make changes to the
contents of the file system.
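A minimal sketch of the idea, as a hand-rolled access-control list in Python (real file
systems enforce this in the operating system; the users and rights below are invented):

    # Illustrative access-control list: rights per user for one resource.
    acl = {"akidd": {"read", "write"}, "jcohen": {"read"}}

    def allowed(user, right):
        return right in acl.get(user, set())

    print(allowed("jcohen", "read"))   # True
    print(allowed("jcohen", "write"))  # False -- view-only access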
b. Customer service
c. Confidentiality/privacy
Data privacy refers to the evolving relationship between technology and the legal right to, or public
expectation of privacy in the collection and sharing of data.
Privacy problems exist wherever uniquely identifiable data relating to a person or persons are collected and
stored, in digital form or otherwise. Improper or non-existent disclosure control can be the root cause for
privacy issues. The most common sources of data that are affected by data privacy issues are:
• Health information.
• Criminal justice.
• Financial information.
• Genetic information.
• Location information.
The challenge in data privacy is to share data while protecting the personally identifiable information.
Consider the example of health data which are collected from hospitals in a district; it is standard practice
to share this only in the aggregate. The idea of sharing the data in the aggregate is to ensure that only non-identifiable data are shared.
The legal protection of the right to privacy in general and of data privacy in particular varies greatly
around the world.
d. Methods of implementation
Administrative controls consist of approved written policies, procedures, standards and guidelines.
Administrative controls form the framework for running the business and managing people. They inform
people on how the business is to be run and how day to day operations are to be conducted. Laws and
regulations created by government bodies are also a type of administrative control because they inform the
business. Some industry sectors have policies, procedures, standards and guidelines that must be followed - the Payment Card Industry (PCI) Data Security Standard required by Visa and MasterCard is such an
example. Other examples of administrative controls include the corporate security policy, password policy,
hiring policies, and disciplinary policies.
Administrative controls form the basis for the selection and implementation of logical and physical
controls. Logical and physical controls are manifestations of administrative controls. Administrative
controls are of paramount importance.
Logical controls (also called technical controls) use software and data to monitor and control access to
information and computing systems. For example: passwords, network and host based firewalls, network
intrusion detection systems, access control lists, and data encryption are logical controls.
An important logical control that is frequently overlooked is the principle of least privilege. The principle
of least privilege requires that an individual, program or system process is not granted any more access
privileges than are necessary to perform the task. A blatant example of the failure to adhere to the principle
of least privilege is logging into Windows as user Administrator to read Email and surf the Web. Violations
of this principle can also occur when an individual collects additional access privileges over time. This
happens when employees' job duties change, or they are promoted to a new position, or they transfer to
another department. The access privileges required by their new duties are frequently added onto their
already existing access privileges which may no longer be necessary or appropriate.
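To make the principle concrete, here is a minimal Python sketch (not from the source; the roles and actions are hypothetical) of a default-deny permission check that grants each role only what its duties require:

    ROLE_PERMISSIONS = {
        "clerk": {"read"},
        "records_manager": {"read", "write"},
        "administrator": {"read", "write", "grant"},
    }

    def is_allowed(role, action):
        # Default deny: a role may do only what is explicitly granted to it.
        return action in ROLE_PERMISSIONS.get(role, set())

    assert is_allowed("records_manager", "write")
    assert not is_allowed("clerk", "write")   # least privilege: clerks cannot modify records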
Physical controls monitor and control the environment of the work place and computing facilities. They
also monitor and control access to and from such facilities. For example: doors, locks, heating and air
conditioning, smoke and fire alarms, fire suppression systems, cameras, barricades, fencing, security
guards, cable locks, etc. Separating the network and workplace into functional areas is also a physical
control.
5. Data use
* Compiled by Robert Tocher
• Database Administrator
(www.bls.gov) With the Internet and electronic business generating large volumes of
data, there is a growing need to be able to store, manage, and extract data effectively. Database
administrators work with database management systems software and determine ways to
organize and store data. They identify user requirements, set up computer databases, and test and
coordinate modifications to the computer database systems. An organization’s database
administrator ensures the performance of the system, understands the platform on which the
database runs, and adds new users to the system. Because they also may design and implement
system security, database administrators often plan and coordinate security measures. With the
volume of sensitive data generated every second growing rapidly, data integrity, backup systems,
and database security have become increasingly important aspects of the job of database
administrators.
• Data Sharing Methods
o Shared Drive
(www.computerhope.com) Also known as a share, a shared directory is a directory or
folder that is made accessible to multiple users or groups on a network. This is the most common
method of accessing and sharing information on a local area network.
o Electronic Document Rooms
(www.ornl.gov) The Electronic File Room (EFR) shall provide a secure, searchable, and
user-friendly electronic repository for the storage and retrieval of documents at many different
classification levels.
• Output to hardcopy and electronic forms
o COLD – Computer Output to Laser Disc
o COM – Computer Output Microfilm
o Digital Photographs – JPEG, TIFF
o Video – DVD, DVD-R, HD DVD, Blu-ray
o X-ray
(www.wikipedia.org) X-rays are a type of electromagnetic radiation
with wavelengths of around 10⁻¹⁰ meters. When medical X-rays are being
produced, a thin metallic sheet is placed between the emitter and the
target, effectively filtering out the lower energy (soft) X-rays.
o Sound recordings – MP3
• Data Processing
o Batch
(www.computing-dictionary.thefreedictionary.com) Performing a particular operation
automatically on a group of files all at once rather than manually opening, editing and saving one
file at a time. For example, graphics software that converts a selection of images from one format
to another would be a batch processing utility. See DeBabelizer.
o Processing a group of transactions at one time. Transactions are collected and processed against
the master files (master files updated) at the end of the day or some other time period. Contrast
with transaction processing.
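As a sketch of the first sense above, assuming the third-party Pillow imaging library and hypothetical folder names, a batch image-format conversion in Python takes only a few lines:

    from pathlib import Path
    from PIL import Image   # Pillow imaging library (assumed installed)

    out_dir = Path("converted")
    out_dir.mkdir(exist_ok=True)

    # Convert every TIFF in the incoming folder to PNG in one unattended run.
    for tif in Path("incoming").glob("*.tif"):
        Image.open(tif).save(out_dir / (tif.stem + ".png"))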
• Tagging metadata (SGML, HTML, XML).
(www.encyclopedia.thefreedictionary.com) In computing, an HTML element indicates
structure in an HTML document and a way of hierarchically arranging content. More specifically,
an HTML element is an SGML element that meets the requirements of one or more of the HTML
Document Type Definitions (DTDs). These elements have properties: both attributes and content,
as specified (both allowable and required) according to the appropriate HTML DTD (for
example, the HTML 4.01 strict DTD). Elements may represent headings, paragraphs, hypertext
links, lists, embedded media, and a variety of other structures.
o The purpose of tagging metadata is to index and sort, allowing searching and
reporting capabilities.
• XML
(www.orafaq.com/glossary/faqglosx.htm) XML (Extensible Markup Language) is a W3C
initiative that allows information and services to be encoded with meaningful structure and
semantics that computers and humans can understand. XML is great for information exchange,
and can easily be extended to include user-specified and industry-specified tags.
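A minimal sketch of XML tagging using Python's standard xml.etree.ElementTree module (the element names are invented for illustration); the tags give the content a structure that both people and programs can interpret:

    import xml.etree.ElementTree as ET

    record = ET.Element("record", id="2007-0412")          # hypothetical record
    ET.SubElement(record, "title").text = "Study Group Minutes"
    ET.SubElement(record, "department").text = "Records Management"
    ET.SubElement(record, "retention").text = "7 years"

    print(ET.tostring(record, encoding="unicode"))
    # <record id="2007-0412"><title>Study Group Minutes</title>...</record>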
• Index
(www.wikipedia.org) A database index is a data structure that improves the speed of operations
on a table. Indexes can be created using one or more columns. The disk space required to store the
index is typically less than the storage of the table, since indexes usually contain only the key
fields according to which the table is to be arranged and exclude all the other details in the
table. In a relational database an index is a copy of part of a table.
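A minimal sketch with Python's built-in sqlite3 module (the table and column names are invented); once the index exists, lookups on the indexed column no longer require scanning the whole table:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT, dept TEXT)")
    con.execute("CREATE INDEX idx_records_title ON records (title)")
    con.execute("INSERT INTO records (title, dept) VALUES ('Retention Schedule', 'RIM')")

    # The query planner can now satisfy this lookup through the index:
    for row in con.execute("SELECT * FROM records WHERE title = 'Retention Schedule'"):
        print(row)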
• Search Engine
(www.webopedia.com) A program that searches documents for specified keywords and returns a
list of the documents where the keywords were found. Although search engine is really a general
class of programs, the term is often used to specifically describe systems like Alta Vista and
Excite that enable users to search for documents on the World Wide Web and USENET
newsgroups.
Typically, a search engine works by sending out a spider to fetch as many documents as possible.
Another program, called an indexer, then reads these documents and creates an index based on
the words contained in each document. Each search engine uses a proprietary algorithm to create
its indices such that, ideally, only meaningful results are returned for each query.
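The indexer step can be sketched in a few lines of Python (the sample documents below are invented). An inverted index maps each word to the set of documents containing it:

    docs = {
        1: "records retention schedule",
        2: "microfilm storage standards",
        3: "retention of microfilm records",
    }

    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)

    print(index["retention"])   # {1, 3}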
• Indexing
(www.webopedia.com) (n.) In database design, a list of keys (or keywords), each of which
identifies a unique record. Indices make it faster to find specific records and to sort records by the
index field -- that is, the field used to identify each record.
(v.) To create an index for a database, or to find records using an index.
• Structured Searches?
• Text Retrieval
(www.nao.org.uk/intosai/edp/directory/misc/glossary.html) A system by which important
documents (or portions thereof) can be retrieved by searching for occurrences of key words,
phrases or sentences.
• Natural Language Processing
(www.wikipedia.org) Natural language processing (NLP) is a subfield of artificial intelligence
and linguistics. It studies the problems of automated generation and understanding of natural
human languages. Natural language generation systems convert information from computer
databases into normal-sounding human language, and natural language understanding systems
convert samples of human language into more formal representations that are easier for computer
programs to manipulate.
• Boolean Searches
(www.webopedia.com) (n.) Named after the nineteenth-century
mathematician George Boole, Boolean logic is a form of algebra in which all values are reduced
to either TRUE or FALSE. Boolean logic is especially important for computer science because it
fits nicely with the binary numbering system, in which each bit has a value of either 1 or 0.
Another way of looking at it is that each bit has a value of either TRUE or FALSE.
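Applied to searching, the Boolean operators map directly onto set operations over an inverted index like the one sketched earlier (the index contents here are invented):

    index = {
        "retention": {1, 3},
        "microfilm": {2, 3},
        "standards": {2},
    }

    print(index["retention"] & index["microfilm"])   # AND: both terms -> {3}
    print(index["retention"] | index["standards"])   # OR: either term -> {1, 2, 3}
    print(index["microfilm"] - index["standards"])   # AND NOT: difference -> {3}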
• Data Tagging
A tag is a (relevant) keyword or term associated with or assigned to a piece of information (like a
picture, article, or video clip), thus describing the item and enabling keyword-based classification
of the information it is applied to.
Tags are usually chosen informally and personally by the author/creator or the consumer of the
item, i.e. not usually as part of some formally defined classification scheme. Tags are typically
used in dynamic, flexible, automatically generated internet taxonomies for online resources such
as computer files, web pages, digital images, and internet bookmarks (both in social bookmarking
services and in the current generation of web browsers - see Flock). For this reason, "tagging"
has become associated with the Web 2.0 buzz. Many people associate "tagging" with the idea of
the semantic web; however, some believe that tagging may not be having a positive effect on the
overall drive towards the semantic web.
Typically, an item will have one or more tags associated with it.
• Metrics of search – recall and precision
(http://www.dba-oracle.com/t_search_engine_precison_recall.htm) Precision measures the quality of
what a query returns: it is the percentage of the retrieved results that are actually relevant. A
scholarly study titled "Precision and Recall of Five Search Engines for Retrieval of Scholarly
Information in the Field of Biotechnology" shows interesting academic research on the relative
precision and recall of several internet search engines.
• Recall measures completeness: it is the percentage of all relevant documents in the collection
that the query actually returns (i.e. no lost results). Recall is defined far more loosely, as it
rests on the highly variable, loosely defined word "relevant", which is at the heart of the success
of any search engine. Notes from Precision and Recall of Five Search Engines for Retrieval of
Scholarly Information in the Field of Biotechnology describe it as a metric that is impossible to
measure accurately:
"Thus it requires knowledge not just of the relevant and retrieved but also those not
retrieved (Clarke & Willet, 1997). There is no proper method of calculating absolute recall of
search engines as it is impossible to know the total number of relevant in huge databases."
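In formula terms, precision = (relevant and retrieved) / (retrieved) and recall = (relevant and retrieved) / (relevant). A worked Python sketch with invented document sets:

    relevant = {1, 2, 3, 4, 5}    # documents that actually satisfy the information need
    retrieved = {3, 4, 5, 6}      # documents the search engine returned

    hits = relevant & retrieved               # relevant documents that were retrieved
    precision = len(hits) / len(retrieved)    # 3/4 = 0.75
    recall = len(hits) / len(relevant)        # 3/5 = 0.60
    print(precision, recall)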
• Methods and forms of data output?
6. Data storage.
*Compiled by Kathleen Roth and Mary Nyce
Media Stability - (Saffady - Records and Information Management) - Stability denotes the extent to which a
given storage medium retains physical characteristics and chemical properties appropriate to its intended
purpose. Stability is the period of time that a given medium will remain useful for its intended purpose.
Storage Copies and Working Copies - (Saffady - Records and Information Management) Where
microforms are used for long term retention or permanent preservation of recorded information, a distinction
must be made between storage copies, which are used to produce one or more working copies and
seldom handled thereafter, and working copies, which are intended for display, printing, distribution, or
other purposes.
Reciprocal Agreements in Data Storage - (http://www.itl.nist.gov/lab/bulletns/archives/b995.txt)
Reciprocal agreement - An agreement that allows two organizations to back each other up. (While this
approach often sounds desirable, contingency planning experts note that this alternative has the greatest
chance of failure due to problems keeping agreements and plans up-to-date as systems and personnel
change.)
Native Format - The file format that an application normally reads and writes. The native format of a
Microsoft Word document is different from that of a WordPerfect document, etc. The problem is that there are tons
of et ceteras. Even image editing programs, which are designed to read and convert a raft of different
graphics file types, have their own built-in native format. For example, in order to build an image in layers,
Photoshop converts foreign images into its native, layered file format (.PSD extension). Contrast with
foreign format and file format.
Hard Magnetic Disks - The primary computer storage device. Like tape, it is magnetically recorded and
can be re-recorded over and over. Disks are rotating platters with a mechanical arm that moves a
read/write head between the outer and inner edges of the platter's surface. Finding a location can take
as long as one second on a floppy disk or as little as a couple of milliseconds on a fast hard disk. See
hard disk for more details.
Tracks and Spots
The disk surface is divided into concentric tracks (circles within circles). The thinner the tracks, the more
storage. The data bits are recorded as tiny magnetic spots on the tracks. The smaller the spot, the more
bits per inch and the greater the storage.
Sectors
Tracks are further divided into sectors, which hold a block of data that is read or written at one time; for
example, READ SECTOR 782, WRITE SECTOR 5448. In order to update the disk, one or more sectors
are read into the computer, changed and written back to disk. The operating system figures out how to fit
data into these fixed spaces.
Modern disks have more sectors in the outer tracks than the inner ones because the outer radius of the
platter is greater than the inner radius (see CAV). See magnetic tape and optical disk.
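The capacity arithmetic follows directly from this geometry. A Python sketch with invented figures (not taken from the text):

    bytes_per_sector = 512
    sectors_per_track = 63
    tracks_per_surface = 16_383
    surfaces = 16

    capacity = bytes_per_sector * sectors_per_track * tracks_per_surface * surfaces
    print(f"{capacity / 1e9:.1f} GB")   # about 8.5 GB for this invented geometry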
Tracks and Sectors
Tracks are concentric circles on the disk, broken up
into storage units called "sectors." The sector, which
is typically 512 bytes, is the smallest unit that can be
read or written.
Magnetic Disk Summary
Several magnetic disk technologies have been discontinued, but drives and media continue to be used
long after the products are officially withdrawn.
Magnetic Tapes - A sequential storage medium used for data collection, backup and archiving. Like
videotape, computer tape is made of flexible plastic with one side coated with a ferromagnetic material.
Tapes were originally open reels, but were superseded by cartridges and cassettes of many sizes and
shapes.
Tape has been more economical than disks for archival data, but that is changing as disk capacities have
increased enormously. If tapes are stored for long periods, they must be periodically recopied or the
tightly coiled magnetic surfaces may contaminate each other.
Sequential Medium
The major drawback of tape is its sequential format. Locating a specific record requires reading every
record in front of it or searching for markers that identify predefined partitions. Although most tapes are
used for archiving rather than routine updating, some drives allow rewriting in place if the byte count does
not change. Otherwise, updating requires copying files from the original tape to a blank tape (scratch
tape) and adding the new data in between.
Track Formats
Tracks run parallel to the edge of the tape (linear recording) or diagonally (helical scan). A linear variation
is serpentine recording, in which the tracks "snake" back and forth from the end of the tape to the
beginning.
Legacy open reel tapes used nine linear tracks (8 bits plus parity), while modern cartridges use 128 or
more tracks. Data are recorded in blocks of contiguous bytes, separated by a space called an "interrecord
gap" or "interblock gap." Tape drive speed is measured in inches per second (ips). Over the years,
storage density has increased from 200 to 38,000 bpi. See helical scan and compact tape.
Tracks on Magnetic Tape
Except for helical scan recording, most tracks on magnetic tape run
parallel to the length of the tape.
Magnetic Tape Summary
Several magnetic tape technologies remain in use. See also magnetic disk and optical disk.
Optical Disks - Direct access disks written and read by light. CD, CD-ROM, DVD-ROM and
DVD-Video are read-only optical disks that are recorded at the time of manufacture and cannot be
erased. CD-R, DVD-R, WORM and magneto-optic (in WORM mode) disks are write-once. They are
recorded in the user's environment, but cannot be erased. CD-RW, DVD-RAM, DVD-RW and MO
disks are rewritable.
Rewritable disks use either magneto-optic (MO) or phase change technology. Used in libraries that
hold multiple cartridges, magneto-optic (MO) disks are extremely robust. Phase change disks
(CD-RW, DVD-RAM, etc.) are lower cost consumer-oriented products, and DVD-RAM is expected to
become very popular.
Optical disks have some advantages over magnetic disks. They have higher capacities as
removable modules, and they are not subject to head crashes or corruption from stray magnetic
fields. They also have a 30-year life and are less vulnerable to extremes of hot and cold. See DVD,
phase change, holographic storage, ISO 13346, multilevel optical disk and legality of optical
storage. See also magnetic disk and magnetic tape.
Writability    Optical Disk Types
Read only      CD, CD-ROM, DVD-ROM, DVD-Video
Write once     CD-R, DVD-R, WORM
Rewrite        CD-RW, DVD-RAM, DVD-RW, MO, DataPlay
DVD - (Digital VideoDisc or Digital Versatile Disc) An optical digital disc for storing movies and data.
Introduced in the U.S. in 1997, and developed by both the computer and movie industries, the disc
uses the same diameter platter as a CD (120mm/4.75" diameter), but holds 4.7GB rather than
700MB. Whereas CDs use only one side, DVDs can be recorded on both sides as well as in dual
layers. DVD drives/players read most CD media as well. For the specifications of 2x, 4x, 8x, etc.
DVD drives, see DVD drives. The various flavors of DVDs are summarized below:
Standard Definition Movie DVDs
DVD-Video is the movie format, which uses MPEG-2 compression to provide approximately two
hours of video per side at standard definition TV resolution (480i resolution). When most people
mention the word "DVD," they are referring to a DVD-Video disc. See DVD-Video and DTV.
High Definition Movie DVDs
Blu-ray and HD DVD are two competing formats that have enough storage for two-hour
high-definition movies (1080i resolution). See Blu-ray, HD DVD and capacity comparisons below.
Read Only DVDs
A DVD-ROM is like a larger CD-ROM that holds data and interactive audio and video material. Like
CD-ROMs, DVD-ROMs are manufactured. See DVD-ROM.
Writable/Recordable DVDs
A DVD-RAM is a rewritable DVD that functions like a removable hard disk. DVD-RAM media can be
rewritten 100,000 times before it is no longer usable. See DVD-RAM.
DVD-R and DVD+R are competing write-once formats for movies or data. DVD-RW and DVD+RW
are competing, rewritable (re-recordable) formats that unlike DVD-RAM's 100,000 cycles, can only
be rewritten 1,000 times. Aimed at the consumer, 1,000 rewrites is considered more than sufficient.
See DVD-R, DVD+R, DVD-RW and DVD+RW.
Music DVDs
DVD-Audio is a second-generation digital music format that provides higher sampling rates than
audio CDs. Many have welcomed the new format, believing that the original audio CD was unable to
capture the total sound spectrum. See DVD-Audio.
DVD Stands For?
Originally, "Digital VideoDisc." Since the technology became important to the computer world, the
"video" was dropped, and it was just D-V-D. Later, it was dubbed "Digital Versatile Disc" by the DVD
Forum. Take your pick.
Minus (-R/-RW) and Plus (+R/+RW)
The formats endorsed by the DVD Forum (www.dvdforum.org) have a hyphen in their names and are
verbalized as "DVD Minus R" or "DVD Dash R" (DVD-R) and "DVD Minus RW" or "DVD Dash RW"
(DVD-RW). The competing formats from the DVD+RW Alliance (www.dvdrw.com) use a plus sign:
"DVD Plus R" (DVD+R) and "DVD Plus RW" (DVD+RW). Starting in 2002, drives that supported both
Minus and Plus formats were introduced. See DVD Forum and DVD+RW Alliance.
CDS - (Commercial Data Servers, Inc., Sunnyvale, CA) A former manufacturer of entry-level,
IBM-compatible mainframes that were designed to replace surviving 43xx and 9370 machines. It was founded
in 1994 by Gene Amdahl, Bill O'Connell and Ray Williams. CDS was expected to produce a high-end
mainframe using cryogenic techniques, but cancelled the project. It also stopped production of its
mainframes in late 1999 and turned the company into Xbridge Systems to focus on mainframe
connectivity software from desktop PCs. See Xbridge, Trilogy and Amdahl.
Storage Area Network (SAN) - A network of storage disks. In large enterprises, a SAN connects
multiple servers to a centralized pool of disk storage. Compared to managing hundreds of servers, each
with its own disks, SANs improve system administration. By treating all the company's storage as a
single resource, disk maintenance and routine backups are easier to schedule and control. In some
SANs, the disks themselves can copy data to other disks for backup without any processing overhead at
the host computers.
High Speed
The SAN network allows data transfers between computers and disks at the same high peripheral
channel speeds as when they are directly attached. Fibre Channel is a driving force with SANs and is
typically used to encapsulate SCSI commands. SSA and ESCON channels are also supported.
Centralized or Distributed
A centralized SAN connects multiple servers to a collection of disks, whereas a distributed SAN typically
uses one or more Fibre Channel or SCSI switches to connect nodes within buildings or campuses. For
long distances, SAN traffic is transferred over ATM, SONET or dark fiber. To guarantee complete
recovery in a disaster, dual, redundant SANs are deployed, one a mirror of the other and each in
separate locations.
Over IP
Another SAN option is IP storage, which enables data transfer via IP over fast Gigabit Ethernet locally or
via the Internet to anywhere in the world (see IP storage). See LAN free backup.
Channel Attached Vs. Network Attached
A related storage device is the network attached storage (NAS) system, which is a file server that
attaches to the LAN like any other client or server in the network. Rather than containing a full-blown
operating system, the NAS uses a slim microkernel specialized for handling only file reads and writes
(CIFS/SMB, NFS, NCP). However, the NAS is subject to the variable behavior and overhead of a network
that may contain thousands of users.
The Terminology
SAN-NAS terminology is confusing (storage area network vs. network attached storage). They both fall
under the "storage network" umbrella, but operate differently: the channel-attached SAN extends the disk
channel, whereas the NAS is another node on the network. See NAS, Fibre Channel, SCSI switch, iSCSI,
IP storage, SCSI and SNIA.
SANs and NASs
SANs are channel attached, and NASs are network attached. They all fall under the "storage network"
umbrella.
Channel Attached
EMC has been a pioneer in channel-attached storage networks, especially in the mainframe arena. Its
Symmetrix storage systems support up to 32 ports (channels) and hold up to 6TB. Network-attached
options are also available. (Image courtesy of EMC Corporation.)
Network Attached
It does not get much simpler than Snap Appliance's Snap! Server. Containing only an on/off switch and
an Ethernet port, it provides an instant storage boost by simply plugging it into the network hub.
(Image courtesy of Snap Appliance, Inc.)
Redundant Array of Independent Disks (RAID) - A disk subsystem that is used to increase
performance or provide fault tolerance or both. RAID uses two or more ordinary hard disks and a RAID
disk controller. In the past, RAID has also been implemented via software only.
In the late 1980s, the term stood for "redundant array of inexpensive disks," being compared to large,
expensive disks at the time. As hard disks became cheaper, the RAID Advisory Board changed
"inexpensive" to "independent."
Small and Large
RAID subsystems come in all sizes from desktop units to floor-standing models (see NAS and SAN).
Stand-alone units may include large amounts of cache as well as redundant power supplies. Initially used
with servers, desktop PCs are increasingly being retrofitted by adding a RAID controller and extra IDE or
SCSI disks. Newer motherboards often have RAID controllers.
Disk Striping
RAID improves performance by disk striping, which interleaves bytes or groups of bytes across multiple
drives, so more than one disk is reading and writing simultaneously.
Mirroring and Parity
Fault tolerance is achieved by mirroring or parity. Mirroring is 100% duplication of the data on two drives
(RAID 1). Parity is used to calculate the data in two drives and store the results on a third (RAID 3 or 5).
After a failed drive is replaced, the RAID controller automatically rebuilds the lost data from the other two.
RAID systems may have a spare drive (hot spare) ready and waiting to be the replacement for a drive
that fails.
The parity calculation is performed in the following manner: a bit from drive 1 is XOR'd with a bit from
drive 2, and the result bit is stored on drive 3 (see OR for an explanation of XOR).
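Because XOR is its own inverse, the same operation that creates the parity also rebuilds a lost drive. A tiny Python demonstration with invented byte values:

    d1, d2 = 0b10110010, 0b01101100   # data bytes on drives 1 and 2
    parity = d1 ^ d2                  # parity byte stored on drive 3

    # Drive 1 fails: rebuild its byte from drive 2 and the parity drive.
    rebuilt = parity ^ d2
    assert rebuilt == d1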
RAID Levels
RAID 0 - Speed
Level 0 is disk striping only, which interleaves data across multiple disks for better performance. It does
not provide safeguards against failure. RAID 0 is widely used in gaming machines for higher speed.
RAID 1 - Fault Tolerance
Uses disk mirroring, which provides 100% duplication of data. Offers highest reliability, but doubles
storage cost. RAID 1 is widely used in business applications.
RAID 2 - Speed
Bits (rather than bytes or groups of bytes) are interleaved across multiple disks. The Connection Machine
used this technique, but this is a rare method.
RAID 3 - Speed and Fault Tolerance
Data are striped across three or more drives. Used to achieve the highest data transfer, because all
drives operate in parallel. Parity bits are stored on separate, dedicated drives.
RAID 4 - Speed and Fault Tolerance
Similar to Level 3, but manages disks independently rather than in unison. Not often used.
RAID 5 - Speed and Fault Tolerance
Data are striped across three or more drives for performance, and parity bits are used for fault tolerance.
The parity bits are distributed across the drives and interspersed with user data rather than kept on a
dedicated parity drive. RAID 5 is
widely used on servers to provide speed and fault tolerance.
RAID 6 - Speed and Fault Tolerance
Highest reliability, but not widely used. Similar to RAID 5, but performs two different parity computations
or the same computation on overlapping subsets of the data.
RAID 10 - Speed and Fault Tolerance
A combination of RAID 1 and RAID 0. RAID 0 is used for performance, and RAID 1 is used for
fault tolerance.
Big RAID
EMC has been a leader in high-end RAID systems for years. Its
line of Symmetrix systems can store multiple terabytes of data.
(Image courtesy of EMC Corporation.)
Little RAID
Arco was the first to provide mirroring (RAID 1) on less
expensive IDE drives rather than their SCSI counterparts. This
self-contained RAIDcase takes up two drive bays and connects
to the IDE cable just like a single drive. (Image courtesy of Arco
Computer Products, Inc., www.arcoide.com)
Early RAID
This RAID II prototype was designed by Randy Katz and
David Patterson and built by University of California, Berkeley
graduate students in 1992. Housing 36 320MB disk drives, its total
storage was much less than a single drive today. (Image
courtesy of The Computer Museum History Center,
www.computerhistory.org)
Primary Storage - The computer's internal memory, which is typically made up of dynamic RAM
chips. Until non-volatile RAM, such as magnetic RAM (MRAM), becomes commonplace, the
computer's primary storage is temporary. When the power is turned off, the data in primary storage
are lost. Contrast with secondary storage. See dynamic RAM and static RAM.
Secondary Storage - External storage, such as disk and tape.
Proprietary Formats - file formats which are covered by a patent or copyright. Typically
such restrictions attempt to prevent reverse engineering, though reverse engineering of file formats for
the purposes of interoperability is generally believed to be legal by those who practise it. Legal
positions differ according to each country's view on, among other things, software patents.
The opposite of a proprietary format is an open format, which places no restrictions on end
users and is often also human-readable.
Privacy, Ownership, Risk and Freedom
One of the contentious issues surrounding the use of proprietary formats is that of ownership. If the
information is stored in a way which your software provider tries to keep secret, you may own the
information, but have no way to retrieve it except by using their software. If you can't retrieve it but
the software manufacturer can, they have practical control of your information. If you think of this aspect
in terms of the almost guaranteed sales it gives future releases of the software, you can understand
why this is called vendor lock-in.
The issue of risk comes about because exactly how a proprietary format works is not publicly
recorded. If the software firm owning the rights to that format stops making software which can read it,
then those who had used the format in the past may lose all information in those files. Such
situations are quite common, especially for outdated versions of software.
Prominent Proprietary Formats
• DOC - Microsoft Word Document
• DWG - AutoCad Drawing
• MP3 - MPEG Audio Layer 3
Prominent Open Formats
• txt - ASCII plain text
• HTML - standard web format
• PNG - image format common on the web
• odt - OASIS XML text document
• ods - OASIS XML spreadsheet document
• odg - OASIS XML drawing document
• odp - OASIS XML presentation document
De Facto Standard Formats - A de facto standard is a technical or other standard that is so
dominant that everybody seems to follow it like an authorized standard. The de jure standard may
be different: one example is speeding on highways. Although the de jure standard is
to drive at the speed limit or slower, in many places the de facto standard is to drive at the speed
limit or slightly faster.
A de facto standard is sometimes not formalized and may simply rely on the fact that someone has
come up with a good idea that is liked so much that it is copied. Typical creators of de facto
standards are individual companies, corporations, and consortia. In computing, de facto standards
can sometimes become de jure standards due to their share of the relevant market. For example,
JavaScript by Netscape was standardized as ECMAScript and parts of DOM Level 0 became
standardized in DOM Level 1/2 HTML Specification.
Hot Sites, Cold Sites & Warm Sites - A backup site is a location where a business can easily
relocate following a disaster, such as fire, flood, or terrorist threat. This is an integral part of the
disaster recovery plan of a business.
A backup site can be another location operated by the business, or contracted via a company that
specializes in disaster recovery services. In some cases, a business will have an agreement with a
second business to operate a joint disaster recovery facility.
There are three types of backup sites: cold sites, warm sites, and hot sites. The
differences between the types are determined by the costs and effort required to implement each.
Hot Sites
A hot site is a duplicate of the original site of the business, with full computer systems as well as
near-complete backups of user data. Following a disaster, the hot site exists so that the business
can relocate with minimal losses to normal operations. Ideally, a hot site will be up and running
within a matter of hours. This type of backup site is the most expensive to operate. Hot sites are
popular with stock exchanges and other financial institutions that may need to evacuate due to
potential bomb threats and must resume normal operations as soon as possible.
Cold Sites
A cold site is the least expensive type of backup site for a business to operate. It does not include
backed up copies of data and information from the original location of the business, nor does it
include hardware already set up. The lack of hardware contributes to the minimal startup costs of
the cold site, but requires additional time following the disaster to have the operation running at a
capacity close to that prior to the disaster.
Warm Sites
A warm site is a location, already stocked with computer hardware similar to that of the original
site, to which the business can relocate after a disaster; it does not, however, contain backed-up
copies of data and information.
USB Flash Drive - USB flash drives are NAND-type flash memory data storage devices integrated
with a USB interface. They are typically small, lightweight, removable and rewritable. As of
November 2006, memory capacities for USB Flash Drives commonly are found from 128 megabytes
up to 64 gigabytes. Capacity is limited only by current flash memory densities, although cost per
megabyte increases rapidly at higher capacities due to the expensive components.
USB flash drives have several advantages over other portable storage devices, particularly the
floppy disk. They are more compact, generally faster, hold more data, and are considered more
reliable (due to their lack of moving parts) than floppy disks. These types of drives use the USB
mass storage standard, supported natively by modern operating systems such as Linux, Mac OS X,
and Windows.
A flash drive consists of a small printed circuit board encased in a robust plastic or metal casing,
making the drive sturdy enough to be carried about in a pocket, as a key fob, or on a lanyard. Only
the USB connector protrudes from this protection, and is usually covered by a removable cap. Most
flash drives use a standard type-A USB connection allowing them to be connected directly to a port
on a personal computer.
Most flash drives are active only when powered by a USB computer connection, and require no
other external power source or battery power source; they are powered using the limited supply
afforded by the USB connection. To access the data stored in a flash drive, the flash drive must be
connected to a computer, either by direct connection to the computer's USB port or via a USB hub.
History
An original 16 megabyte "disgo"; the 8 MB version is considered to be the first USB flash drive.
The flash drive was invented in 1998 by Dov Moran, President and CEO of M-Systems Flash
Pioneers (Israel). Dan Harkabi, who is now a Vice President at SanDisk, led the development and
marketing team at M-Systems. His most significant contribution was insisting that the product be
self-reliant and free of the need to install drivers. Nearly simultaneous development of similar products was
undertaken at Netac and at Trek 2000, Ltd. All three companies have similar and disputed patents.
IBM was the first North American seller of a USB flash drive, and marketed an 8 MB version of the
product in 2001 under the "Memory Key" moniker. IBM later introduced a 16 MB version
manufactured by Trek 2000, and returned to M-Systems for the 64 MB version in 2003. Lexar can
also lay claim to a USB flash drive product. In 2000 they introduced a Compact Flash (CF) card
having an internal USB function. Lexar offered a companion card reader and USB cable that
eliminated the need for a USB hub.
The first flash drives were made by M-Systems and distributed in Europe under the "disgo" brand
in sizes of 8 MB, 16 MB, 32 MB, and 64 MB. These were marketed as "a true floppy-killer", and this
design was continued up to 256 MB. Asian manufacturers soon started making their own flash
drives that were cheaper than the disgo series.
Modern flash drives have USB 2.0 connectivity. However, they do not currently use the full
480 Mbit/s the specification supports due to technical limitations inherent in NAND flash. The fastest
drives available now use a dual channel controller, though still fall considerably short of the transfer
rate possible from a current generation hard disk, or the maximum high speed USB 2.0 throughput.
Flash drives have become iconic as a sort of "fashion statement", much like the iPod's white ear
bud headphones.
Components
One end of the device is fitted with a single male
type-A USB connector. Inside the plastic casing is a
small printed circuit board. Mounted on this board is
some simple power circuitry and a small number of
surface-mounted integrated circuits (ICs). Typically,
one of these ICs provides an interface to the USB
port, another drives the onboard memory, and the
other is the flash memory.
Essential components
There are typically three parts to a flash drive:
• Male type-A USB connector - provides an interface to the host computer.
• USB mass storage controller - implements the USB host controller and provides a linear interface
to block-oriented serial flash devices while hiding the complexities of block-orientation, block
erasure, and wear balancing, or wear levelling, although drives that actually perform this in
hardware are rare. The controller contains a small RISC microprocessor and a small amount of
on-chip ROM and RAM.
• NAND flash memory chip - stores data. NAND flash is typically also used in digital cameras.
• Crystal oscillator - produces the device's main 12 MHz clock signal and controls the device's
data output through a phase-locked loop.
Internals of a typical flash drive (Seitec brand USB 1.1 pictured):
1 USB connector
2 USB mass storage controller device
3 Test points
4 Flash memory chip
5 Crystal oscillator
6 LED
7 Write-protect switch
8 Space for second flash memory chip
Additional components
The typical device may also include:
• Jumpers and test pins - for testing during the flash drive's manufacturing or loading code into
the microprocessor.
• LEDs - indicate data transfers or data reads and writes.
• Write-protect switches - indicate whether the device should be in "write-protection" mode.
• Unpopulated space - provides space to include a second memory chip. Having this second
space allows the manufacturer to develop only one printed circuit board that can be used
for more than one storage size device, to meet the needs of the market.
• USB connector cover or cap - reduces the risk of damage due to static electricity, and
improves overall device appearance. Some flash drives do not feature a cap, but instead
have retractable USB connectors. Other flash drives have a "swivel" cap that is
permanently connected to the drive itself and eliminates the chance of losing the cap.
• Transport aid - In some cases, the cap contains a hole suitable for connection to a key
chain or lanyard or to otherwise aid transport and storage of the USB flash device.
Size and style of packaging
Flash drives come in various, sometimes bulky or novelty, shapes and sizes.
Some manufacturers differentiate their products by using unnecessarily elaborate housings. An
example is some of Lexar's Jump Drives which are often bulky and difficult to connect to the USB
port.
Recently, USB flash drives have been integrated into other things such as a watch or a pen.
Overweight or ill fitting flash drive packaging can cause disconnection from the host computer. This
can be overcome by using a short USB to USB (male to female) extension cable to relieve tension
on the port. Such cables are USB-compatible, but do not conform to the USB standard.
Common uses
Personal data transport
The most common use of flash drives is by individuals to transport and store personal
files such as documents, pictures and video.
Computer repair
Flash drives enjoy notable success in the PC repair field as a means to transfer
recovery and antivirus software to infected PCs, while allowing a portion of the host
machine's data to be archived in case of emergency.
System administration
Flash drives are particularly popular among system and network administrators, who
load them with configuration information and software used for system maintenance,
troubleshooting, and recovery.
Application carriers
Flash drives are used to carry applications that run on the host computer without requiring
installation. U3, backed by flash drive vendors, offers an API to flash drive-specific functions.
airWRX is an application framework that runs from a flash drive and turns its PC host and
other nearby PCs into a multi-screen, web-like work environment. The Mozilla Firefox
browser has a configuration for flash drives, as does Opera.
Audio players
Many companies make solid state digital audio players in a small form factor, essentially
producing flash drives with sound output and a simple user interface. Probably the best-known of
these have been Apple Computer's iPod shuffle and the Creative Labs MuVo, a small solid-state
digital audio player in a flash drive form.
To boot operating systems
In a way similar to that used in LiveCD, one can launch any operating system from a
bootable flash drive, known as a LiveUSB.
In arcades
In the arcade game In the Groove and more commonly In The Groove 2, flash drives are
used to transfer high scores, screenshots, dance edits, and combos throughout sessions.
While use of flash drives is common, the drive must be Linux compatible, causing problems
for some players. Data used can then be uploaded to Groovestats.
Strengths and weaknesses
Flash drives are nearly impervious to the scratches and dust that were problematic for previous
forms of portable storage, such as compact discs and floppy disks, and their durable solid-state
design means they often survive casual abuse. This makes them ideal for transporting personal
data or work files from one location to another, such as from home to school or office or for carrying
around personal data that the user typically wants to access in a variety of places. The near-ubiquity
of USB support on modern computers means that such a drive will work in most places.
Flash drives are also a relatively dense form of storage, where even the cheapest will store dozens
of floppy disks' worth of data. Some can hold more data than a CD (700 MB). Top-of-the-line flash
drives can store more data than a DVD (4.7 GB).
Flash drives implement the USB mass storage device class, meaning that most modern operating
systems can read and write to flash drives without any additional device drivers. Instead of exposing
the complex technical detail of the underlying flash memory devices, the flash drives export a simple
block-structured logical unit to the host operating system. The operating system can use whatever
type of filesystem or block addressing scheme it wants. Some computers have the ability to boot up
from flash drives.
Like all flash memory devices, flash drives can sustain only a limited number of write and erase
cycles before failure. Mid-range flash drives under normal conditions will support several hundred
thousand cycles, although write operations will gradually slow as the device ages. This should be a
consideration when using a flash drive to run application software or an operating system. To
address this, as well as space limitations, some developers have produced special versions of
operating systems (such as Linux) or commonplace applications (such as Mozilla Firefox) designed
to run from flash drives. These are typically optimized for size and configured to place temporary or
intermediate files in memory rather than store them temporarily on the flash drive.
Most USB flash drives do not employ a write-protect mechanism. Such a switch on the housing of
the drive itself would keep the host computer from writing or modifying data on the drive.
Write-protection would make a device suitable for repairing virus-contaminated host computers without
infecting the USB flash drive itself.
Flash drives are much more tolerant of abuse than mechanical drives, but can still be damaged or
have data corrupted if an impact such as a drop from a moving car or being hit with a blunt object
loosens a circuit connection. Improperly wired USB ports can also destroy the circuitry of a flash
drive, a danger in home-built desktop PCs.
Comparison to other portable memory forms
Flash storage devices are best compared to other common, portable, swappable data storage
devices: floppy disks, Zip disks, miniCD / miniDVD and CD-R/CD-RW discs. 3.5 inch floppy disks
and Iomega Zip disks are still available as of mid-2006, despite their declining popularity.
Floppy disks were the first publicly popular method of file transport, but have essentially become
obsolete due to their low capacity, low speed, and low durability. Virtually all new computers include
USB ports, and many of them are now sold without a floppy drive, the Apple iMac being the first to
ship this way. Floppy disks are still in use because of their low cost and ease of use with older
systems. Attempts to extend the floppy standard (such as the Imation SuperDisk) were not
successful because of a reputation for unreliability and the lack of a single standard for PC vendors
to adopt.
The Iomega Zip drive enjoyed some popularity, but never reached the point of ubiquity in
computers. Also, the larger Zip sizes (now up to 750 MB) cannot be read on older drives. Unless
one were to carry around an external drive, their usefulness as a means of moving data was rather
limited. The cost per megabyte was fairly high, with individual disks often priced at US$10 or higher.
Because the material used for creating the storage medium in Zip disks is similar to that used in
floppy disks, Zip disks have a higher risk of failure and data loss. Larger removable storage media,
like Iomega's Jaz drive, had even higher costs, both in drives and in media, and as such were never
feasible as a floppy alternative.
CD-R and CD-RW are swappable storage media alternatives. Unlike Zip and floppy drives, DVD
and CD recorders are increasingly common in personal computer systems. CD-Rs can only be
written to once, and the more expensive CD-RWs are only rated up to 1,000 erase/write cycles,
whereas modern NAND-based flash drives often last for 500,000 or more erase/write cycles. Optical
storage devices are also usually slower than their flash-based counterparts. Compact discs with an
11.5 cm diameter can also be inconveniently large and, unlike flash drives, cannot fit into a pocket
or hang from a keychain. Smaller CDs are available, and these are an exception. There is also no
standard file system for rewriteable optical media; packet-writing utilities like DirectCD and InCD
exist, but produce discs that are not universally readable, despite claiming to be based on the UDF
standard. The upcoming Mount Rainier standard addresses this shortcoming in CD-RW media, but
is still not supported by most DVD and CD recorders or major operating systems.
Security
Some flash drives feature encryption of the data stored on them, generally using full disk encryption
below the filesystem. This prevents an unauthorized person from accessing the data stored on it.
The disadvantage is that the drive is accessible only in the minority of computers which have
compatible encryption software, for which no portable standard is widely deployed.
Some encryption applications allow running without installation. The executable files can be stored
on the USB drive, together with the encrypted file image. The encrypted partition can be accessed
on any computer running Microsoft Windows. Other flash drives allow the user to configure secure
and public partitions of different sizes. Executable files for Windows, Macintosh, and Linux are
usually included on the drive.
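A minimal sketch of the kind of symmetric file encryption such drives rely on, using the third-party Python "cryptography" package (the plaintext is invented; real products manage keys in hardware or derive them from a passphrase):

    from cryptography.fernet import Fernet

    key = Fernet.generate_key()      # in practice, kept off the drive
    f = Fernet(key)

    token = f.encrypt(b"confidential record")   # ciphertext stored on the flash drive
    print(f.decrypt(token))                     # b'confidential record'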
Newer flash drives support biometric fingerprinting to confirm the user's identity. As of mid-2005, this
was a relatively costly alternative to standard password protection offered on many new USB flash
storage devices.
Some manufacturers deploy physical authentication tokens in the form of a flash drive. These are
used to control access to a sensitive system by containing encryption keys or, more commonly,
communicating with security software on the target machine. The system is designed so the target
machine will not operate except when the flash drive device is plugged into it. Some of these "PC
lock" devices also function as normal flash drives when plugged into other machines.
Flash drives present a significant security challenge for large organizations. Their small size and
ease of use allows unsupervised visitors or unscrupulous employees to smuggle confidential data
out with little chance of detection. Equally, corporate and public computers alike are vulnerable to
attackers connecting a flash drive to a free USB port and using malicious software such as rootkits
or packet sniffers. To prevent this, some organizations forbid the use of flash drives, and some
computers are configured to disable the mounting of USB mass storage devices by ordinary users,
a feature introduced in Windows XP Service Pack 2; others use third-party software to control USB
usage. In a lower-tech security solution, some organizations disconnect USB ports inside the
computer or fill the USB sockets with epoxy.
Naming
Recently, "USB flash drive" or simply "UFD" has emerged as the de facto standard term for these
devices. Many major manufacturers (SanDisk, Lexar, Kingston) and resellers use the term UFD to
describe them. However, the myriad of different brand names and terminology used, in the past and
currently, makes UFDs more difficult for manufacturers to market and for consumers to research.
Some commonly used names are actually trademarks of particular companies e.g. 'disgo'.
Future developments
Semiconductor corporations have striven to radically reduce the cost of the components in a flash
drive by integrating various flash drive functions in a single chip, thereby reducing the part-count and
overall package cost. As of 2004, some manufacturers plan to include more ICs so that the storage
and logic/communications functions are packaged in a single ultra-low-cost device.
In efforts to focus on increasing capacities, 64 MB and smaller capacity flash memory has been
largely discontinued, and 128 MB capacity flash memory is being phased out. Kanguru has recently
released a 64 GB flash memory drive that uses USB 2.0 and claims 10 years' worth of information
preservation.
Lexar is attempting to introduce a USB flash card, which would be a compact USB flash drive
intended to replace various kinds of flash memory cards.
SanDisk has introduced a new technology to allow controlled storage and usage of copyrighted
materials on flash drives, primarily for use by students. This technology is termed FlashCP.
Source: Computer Desktop Encyclopedia copyright ©1981-2007 by The Computer Language
Company Inc. All rights reserved. THIS DEFINITION IS FOR PERSONAL USE ONLY. All other
reproduction is strictly prohibited without permission from the publisher.