THE LEADING EDGE FORUM PRESENTS: EXTREME DATA: Rethinking the “I” in IT

ABOUT THE LEADING EDGE FORUM

As part of CSC’s Office of Innovation, the Leading Edge Forum (LEF) provides clients with access to a powerful knowledge base and a global network of innovative thought leaders who engage technology and business executives on the current and future role of information technology. The LEF stimulates innovation and thought leadership through two core offerings:

The LEF Executive Programme helps companies leverage IT for business benefit through an annual retainer-based service that provides CIOs and other senior executives with access to a client-driven programme of research, conferences, information exchanges and advisory services.

The LEF Technology Programs offer CTOs and other senior technologists opportunities to examine timely technology topics and explore innovative initiatives by leveraging CSC’s technology experts, alliance partners, research centres and events.

For more information about the Leading Edge Forum, please visit http://lef.csc.com.

LEF TECHNOLOGY PROGRAMS LEADERSHIP

In this ongoing series of reports about technology directions, the LEF looks at the role of innovation in the marketplace both now and in the years to come. By studying technology’s current realities and anticipating its future shape, these reports provide organizations with the necessary balance between tactical decision making and strategic planning.

WILLIAM KOFF
Vice President and Chief Technology Officer, Office of Innovation

Bill Koff is a leader in CSC’s technology community, providing vision and direction as vice president and chief technology officer for the Office of Innovation. Bill plays a key role in guiding CSC research, innovation, technology thought leadership and alliance partner activities, and in certifying CSC’s Centers of Excellence. He advises CSC and its clients on critical information technology trends, technology innovation and strategic investments in leading edge technology. A frequent speaker on technology, architecture and management issues, Bill’s areas of interest include system architecture, digital disruptions, innovative uses of data, and the open source movement.

PAUL GUSTAFSON
Director, LEF Technology Programs

Paul Gustafson is an accomplished technologist and proven leader in emerging technologies, applied research and strategy. As director of the LEF Technology Programs, Paul brings vision and leadership to a portfolio of programs and directs the technology research agenda. Astute at recognizing technology trends, how they interrelate, and their implications for business, Paul brings his insights to bear on client strategy, CSC research, leadership development and innovation strategy. He has published numerous papers and articles on strategic technology issues and speaks to executive audiences frequently on these topics.

CONTENTS

How the “I” in IT Is Changing
DATA EVERYWHERE: Data in many places, changing the rules
TIME AND PLACE: Data about when and where people and things are, and what’s happening now
SOCIAL CONNECTIONS: Data that strengthens connections between people
MEANING: Data that helps make sense of it all
APPENDIX: HANDY WEB SITES
ACKNOWLEDGMENTS

How the “I” in IT Is Changing

It is remarkable how far we have come with digital information.

Sensors can report the real-time status of an engine in the bowels of a ship so that costly emergency shut-downs can be avoided. Data from your car can report your actual driving behavior and lower your insurance rates. From your home PC you can access air quality and other EPA ratings for your neighborhood, or take an aerial “fly over” of your town or just about anywhere in the world. Cell phone blogging based on your location is taking hold. Camera phones are becoming commonplace. You can even use your multifaceted cell phone to check in for an airplane flight or, in the future, to take the place of a train ticket.

An information explosion is underway, giving rise to an era of extreme data and dramatic new applications. Extreme data is new types of data, generated by new devices, and being used in new ways – enabling new business processes, interpersonal connections and knowledge for business, government, communities and individuals. In this world, organizations need to understand and leverage their data opportunities, putting information to work for them like never before.

It wasn’t always like this. The organization’s data used to be centralized, sanitized and authorized. It resided inside a walled fortress, the data center, where it was guarded and managed. It was structured, well-defined data from databases and corporate systems. And it was official – data that was generated by the corporation and tracked.

Today, data has broken free and is growing at a breathtaking rate. Data is mobile, operating freely outside corporate boundaries. It is messy and unstructured, coming in many shapes and sizes: short-hand text messages, pictures, voice snippets, video clips. It is informal – generated on the run and much of it “in passing,” reflecting Internet time and the 24/7 pace of global business. This data has energy, as the cover image of this report reflects.

There is a strong consumer component. Data that we’ve known about for years in consumer applications but has taken a back seat in the corporate world is now front and center, propagating into the business world. Businesses are adopting technologies like instant messaging, voice over IP and MP3 data as part of their information technology infrastructure. The “I” in IT is changing.

IT organizations are recognizing and leveraging new forms of data in real-world situations today. Although much of this activity has not reached critical mass, innovative organizations are showing the way, and more will follow.

Part of the extreme data story is the sheer volume of data being generated. Nearly all organizations are coping with an explosion of documents, presentation slides, spreadsheets, e-mail messages and instant messages. New data is being measured in exabytes (10^18 bytes). According to a study by the University of California at Berkeley, some five exabytes of new data was created and stored in 2002 – enough information to fill 500,000 libraries the size of the U.S. Library of Congress print collections.1

Meanwhile, U.S. federal regulations such as Sarbanes-Oxley (accounting and auditing standards) and HIPAA (privacy standards for personal health information) have made organizations responsible for their voluminous, widely-scattered piles of information. The ability to exploit both structured and unstructured data has never been more critical.

Some say all this data is not necessarily a good thing – that it is more than we can digest and put to meaningful use, and it poses serious privacy and security issues. For example, computer storage technology is getting cheap enough that one day you will be able to record every conversation of your life and decades of photographs, but it is not clear that search technology will keep pace so you can make sense of all this data.

The vast amounts of digital data all around us make organizations and individuals increasingly vulnerable to information theft. What happens when U.S. passports get RFID tags and a person unknowingly passes by a surreptitious wireless reader that steals his passport data? The same problem exists for those smart cards in your wallet. This report does not focus on privacy and security issues but recognizes they are real and need to be addressed.

The world of extreme data is one of productivity, innovation, convenience and communication. The “I” in IT has been redefined as a broad swath of data that runs wide and deep, including mobile, personal and corporate data; new types of data that go well beyond text; and enormous volumes of data. In short, this is not your father’s data.

This report examines four dimensions of extreme data: data everywhere, time and place, social connections and meaning. These dimensions make today’s data different and put it on new footing, challenging organizations to explore the extreme side of what is possible with data today.

EXTREME DATA

DATA EVERYWHERE: Data in many places, changing the rules
Technologies: Portable devices, PDAs, smart phones, USB drives, cameras, smart cards, MP3 players, implants, wearables, embedded processors, biometrics
Applications: Identification, information, entertainment, training, diagnostics, transactions, navigation, manuals

TIME AND PLACE: Data about when and where people and things are, and what’s happening now
Technologies: Location technologies, GPS, RFID, GIS, wireless, sensors, cameras, smart dust (motes)
Applications: Tracking, positioning, monitoring, navigation, identification, real-time updates, alarms, warnings, emergencies

SOCIAL CONNECTIONS: Data that strengthens connections between people
Technologies: Messaging, conferencing, text-voice-video IM, wikis, blogs, RSS, podcasts, VoIP, Bluetooth, shared workspaces, peer-to-peer, virtual communities
Applications: Finding people, developing personal networks, collaborating, information sharing, broadcasting, narrowcasting, publishing, linking, filtering, trusting, co-creating

MEANING: Data that helps make sense of it all
Technologies: Image-audio-video-text search, XML, RDF, metadata, Semantic Web, taxonomies, ontologies, artificial intelligence, mapping, visualization
Applications: Multimedia search; integrated desktop, company and Web search; pattern identification; industry standards; machine-to-machine communications; trend analysis; data mining; expert systems

Source: CSC

DATA EVERYWHERE
DATA IN MANY PLACES, CHANGING THE RULES

The evolution of computing from centralized and monolithic to highly distributed and small is legendary, and with that has come the dispersion of data all around us, like confetti. In the past decade, data has spread from its familiar homes – the data center, the server room, the PC – to the very edge of the network where we work and play. Now data can be found all around us: in your wallet, in your pocket, in your briefcase, in your car, and even in your own body. With data everywhere, the foundation is laid for a world of extreme data and applications that are changing the rules of business and our personal lives.
As data has gotten closer to people and further away from corporate walls, it has become mobile. Data can be easily transported, in large quantities, to where it is needed. The simplicity and immediacy of having data where you need it is opening up new opportunities for business and consumers.

We are seeing enhanced field service because technicians can tote a massive field service database in their pocket; training delivered on personal digital assistants (PDAs) for road warriors; an online music industry that is taking off; car insurance rates that are based on data reported by the car – not to mention new media like podcasting and personalized TV that could one day unseat traditional radio and TV.

Consumer devices crammed with data and functionality are pressuring the enterprise to put more of its data everywhere. Many consumer devices are already being deployed in the enterprise and are driving the rethinking of business processes. With that comes a shift in IT power from corporate to consumer, as employees’ use of consumer devices at home influences IT decisions in the enterprise.2

“Everyone should remember the PC, which started out as a toy for hobbyists and was shunned by the enterprise. Consumers led the way, and a company called Microsoft became the number one software maker,” recounts Paul Gustafson, director of the LEF Technology Programs. “Today’s consumer devices that put data everywhere are challenging the enterprise to figure out how to put its data at the edge of the network to benefit customers, partners and employees.”

There are a slew of consumer devices out there: mobile phones, camera phones, sophisticated PDAs, digital cameras, digital video recorders, digital audio recorders, digital music players. Enterprises must recognize that these “edge devices” are first-class citizens that need to be incorporated into business processes and supported by IT. And as with all matters IT, there are challenges to face, notably with privacy issues and data management.

That said, the data train has left the station. We already live in a world of data everywhere, and it will only become more so. The benefits of having data everywhere are manifold, with efficiency, convenience and flexibility topping the list. Let’s look at some of the most extreme examples of data in its many places and the transformations taking place as a result. (Note: We will see examples of data everywhere throughout the report, particularly in the chapter on Time and Place.)

DATA IN THE FIELD: MOBILE BUSINESS PLATFORMS

New applications are taking enterprise data out into the field where it’s needed – at the point of contact with clients and business processes. The IT business platform has gone mobile, providing data and applications so that workers can function more effectively wherever they are. Workers are being armed with souped-up PDAs that no longer just manage personal information but can now contain complete field service databases and comprehensive training documents. Enterprises can put databases everywhere to get the job done.

Field Service. Mobile workers like field service technicians need the ability to work offline. To this end, Pocket PCs can now carry full-function databases for field service, enabling technicians to do their jobs more effectively with the information they need literally in their hands. Such a disconnected wireless database enables the worker to gather and analyze data in the field, and to consult and update large corporate databases while mobile and disconnected from the network.
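The disconnected pattern described above can be illustrated with a minimal sketch: records are written to a local store while the device is offline, then pushed to a master database once a connection is available. This is an illustration only, not any vendor's replication mechanism. The table layout and the names `open_store`, `record_observation` and `sync` are hypothetical, both databases are in-memory SQLite stand-ins, and conflict resolution between devices is deliberately left out.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a single observations table.

    Hypothetical schema; it stands in for both the handheld and the master.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " technician TEXT NOT NULL,"
        " note TEXT NOT NULL,"
        " synced INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record_observation(local: sqlite3.Connection, technician: str, note: str) -> None:
    """Write one observation to the local store; no network is required."""
    with local:  # commits on success, rolls back on error
        local.execute(
            "INSERT INTO observations (technician, note) VALUES (?, ?)",
            (technician, note),
        )


def sync(local: sqlite3.Connection, master: sqlite3.Connection) -> int:
    """Copy unsynced local rows to the master, then flag them as synced.

    Returns the number of rows pushed. Conflict handling is omitted.
    """
    rows = local.execute(
        "SELECT id, technician, note FROM observations"
        " WHERE synced = 0 ORDER BY id"
    ).fetchall()
    with master:
        master.executemany(
            "INSERT INTO observations (technician, note, synced) VALUES (?, ?, 1)",
            [(technician, note) for _, technician, note in rows],
        )
    with local:
        local.executemany(
            "UPDATE observations SET synced = 1 WHERE id = ?",
            [(row_id,) for row_id, _, _ in rows],
        )
    return len(rows)


if __name__ == "__main__":
    handheld, server = open_store(), open_store()
    record_observation(handheld, "tech-1", "Brake inspection complete")
    record_observation(handheld, "tech-1", "Question: torque spec for door actuator?")
    print("rows pushed:", sync(handheld, server))
```

Flagging rows as synced only after the master insert succeeds means a failed sync can simply be retried, at the cost of possible duplicates if the failure happens between the two steps; a real system would need a stronger guarantee.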
This is a major step forward for mobile functionality, boosting handheld capabilities to the level of desktop capabilities.

With the advent of smaller, more powerful computing devices with larger built-in storage, the time is ripe for a scaled-down version of the enterprise database coupled with replication directly to the master database on the server. In the past, handheld databases weren’t considered enterprise-ready – powerful enough or secure enough – and they replicated with a local copy of the database on your PC. But now the handheld database and replication are moving to the enterprise level.

Using the Pocket PC, wireless (WiFi) Internet, SQL Server and IIS, CSC created a pilot question and observation database for field personnel at Canadian-based Bombardier, a world-leading manufacturer of innovative transportation solutions, from regional aircraft and business jets to rail transportation equipment. Field personnel can query and record inspection information or ask questions in the system while disconnected or out of range from the local network. When the technician returns to the office, where there is a wireless base station, he or she is able to sync the data with the master database, which resides on an enterprise server.3

Bombardier is experimenting with Pocket PCs for field personnel who service the Las Vegas monorail, made by Bombardier. Field personnel can query and record inspection information in a local database while disconnected from the network, and sync the data directly with the master database later on. Such a disconnected wireless database enables field personnel to gather and analyze data in the field, and consult and update large corporate databases while mobile and disconnected from the network. Source: Bombardier

Training. Periodic training in new technologies, tools and processes is essential to keeping workers up to date, but the mobile work style makes traditional classroom training difficult to schedule. At its Training Center of Excellence, CSC has studied the use of PDAs as a platform for mobile training. Interactive training materials and applications are downloaded to a PDA, and can be accessed anywhere and anytime.

The Defense Acquisition University and other CSC clients are keenly interested in how they can use PDAs to deliver effective mobile training to a highly mobile workforce.

At the U.S. Department of Agriculture, veterinarians are now able to download training documents about the import and export of animal products onto tablet computers in the field via a wireless connection. In addition, the Training Center of Excellence’s Collaborative Writing Environment (CWE) is being used to generate documentation for USDA emergency programs pertaining to bio-security and other emergency incident response topics. The CWE exports content via a variety of open and closed source formats for use by multiple handheld and other devices. The system leverages wireless, XML and HTML technologies.

MOBILE PHONES: GOING BEYOND THE CALL

Mobile phones are already a lot more than just phones, used for e-mailing, text messaging, taking photos, playing music, accessing the Web and, yes, talking. As it continues to evolve, the multifaceted mobile phone is taking on even more roles, being used for airport check-in, watching news and other TV clips, and even scanning documents.

What’s next? How about your mobile phone as a train ticket. Use your phone instead of a smart card to pay train fares at the gate by tapping the phone on a reader; this is being tested in Japan during 2005 and planned to be operational in 2006. The phone will contain an RFID (Radio Frequency Identification) chip that is compatible with the current smart card payment system used by 10 million Japanese commuters.

In Japan, your cell phone can be used as a train ticket starting in 2006. Source: NTT DoCoMo, Inc.

With this, though, comes a note of caution. “The phone should be the reader,” asserts Doug Neal, LEF Research Fellow. “People should carry readers, and objects and fixed locations should be outfitted with RFID tags. This way the individual has control over the transaction and privacy can be maintained. The tag can’t be inadvertently read as the person passes near an RFID reader. Eventually this will happen as RFID technology matures, but the trend for the immediate future is more RFID tags in things that people carry, including cell phones.”

Airport Check-In. Travelers on Scandinavian Airlines, the fourth largest airline in Europe, are able to check in for their flights via mobile phone. Working with CSC, the airline has created a system that uses text messages to the mobile phone that alert the traveler to call a voice system for check-in. The process is convenient and cost effective, and decreases check-in lines at the airport.

Since the service was launched in January 2004, the number of subscribing passengers has increased by approximately 10 percent per month to over 22,000 today, resulting in over 23,500 text messages issued per month. Due to its success and innovation, the project received the coveted CSC Award for Technical Excellence.

The text message system and voice system recognize the traveler by his mobile phone number (the systems support four languages: Swedish, Danish, Norwegian and English). Travelers receive a Short Message Service (SMS) text message on their mobile phone approximately 22 hours prior to departure.
The message identifies the flight number and the telephone number of the voice system, which the traveler calls to check in and obtain a seat assignment. At the end of the call the traveler receives a text message with seat information as a confirmation of his check-in. The traveler can save the message and refer to the seat assignment when boarding the plane.

The same approach can be used in other areas of self-service, such as scheduling a doctor appointment. The customer receives a text message with suggested appointment times and calls a service to accept or decline the appointment. For airlines, the service can be expanded to include booking and rebooking flights, waitlist confirmation, lost luggage handling, a hot line for bypassing certain queues, and other functions. It’s all about delivering customer service in a competitive marketplace.

“We are going to continue developing products that make travel easier,” says Jörgen Lindegaard, president and CEO of The SAS Group. “We are also going to offer more mobile self-service solutions, including better SMS and voice recognition services.”

Travelers on Scandinavian Airlines can check in for a flight via their cell phones. The traveler receives a text message roughly 22 hours before departure asking him to check in by calling the voice-activated check-in system. CSC designed the check-in system, which can also be used for other self-service applications. Source: CSC

CNN and More. Your mobile phone can be used for watching news and other TV clips. Using Verizon’s VCAST service on its new 3G network, users can download and view more than 300 daily video clips from channels like CNN, NBC and ESPN on EV-DO mobile phones. Content is custom-formatted for the small screen, and some new entertainment series are being created especially for this new medium. The notion of “mobisodes” (mobile episodes) is just one more foray into digital entertainment for the person on the go.

Scanner. Mobile document imaging software by Xerox Research Center Europe in Grenoble, France turns a cell phone into a device that can photograph notes on a white board, contracts and other hand-written correspondence and then convert them to a format for processing, either in hard copy or on your computer. This software, which could be on the market by the end of 2005, is for anyone with a job that requires research in the field (architects, insurance claims adjusters, real estate professionals, etc.).

Books. The idea that you can read a book on your mobile phone has emerged in Japan, where mobile phone users push the envelope on applications. Several Web sites offer hundreds of novels, some written especially for the new medium. In some cases users can download novels in short installments and run them as Java-based applications. Although the small screen may not be ideal, the idea of reading a book on a mobile device is appealing – you can read a book just about anywhere without having to carry around a bulky text. PDAs can also play the role of portable book, loaded with book reader software and sporting a larger screen. With books on mobile phones, PDAs and even iPods, commuters can review small books (maybe The One Minute Manager), training materials, conference sessions and legal briefs on their way to work.

THE STORAGE THAT MAKES IT POSSIBLE

Hitachi now delivers a 3.5-inch hard drive that holds a whopping 500 gigabytes of data. The Japanese electronics maker also offers a tiny one-inch microdrive that packs 8-10 gigabytes. These drives help put data everywhere – specifically, video and music playback in mobile phones and other small devices. For instance, the Archos AV400 series pocket video recorder uses Hitachi 2.5-inch drives that range from 20-100 gigabytes and can record from 80-400 hours of TV programs and video content.

MUSIC, RADIO, TV: ONE SIZE NO LONGER FITS ALL

The rise of digital entertainment – songs, radio shows, TV programs – is challenging the “one size fits all” delivery model of traditional media industries. Programming once served on large platters is being sliced and diced into bite-size segments at the will of the consumer, who orders exactly what song or program he wants when he wants it. Falling by the wayside are fixed CDs – why buy the entire CD when you only want one song? – and, potentially, fixed radio and TV schedules.

This trend has been in the making for years, particularly for music. Now, with a big boost from Apple’s iPod/iTunes duet, we are witnessing a complete reorganization of the music industry and legitimization of the download business model, posing the biggest challenge to the music industry since Napster and its file-sharing cohorts started the digital threat.

The centerpiece of the model is granting the user listening rights rather than selling the music per se. Although you purchase a song, you are actually buying a license to listen to that song on several devices and to make copies on CDs for personal use. (Even peer-to-peer file sharing services are becoming legal as innovative companies “stream” or temporarily broadcast the content rather than have it occupy space on the user’s hard drive.)

Although Apple’s iTunes Music Store made downloading digital music a legitimate thing to do, and Microsoft’s MSN Music store is coming on strong, Apple and Microsoft are hardly the only players in town. At the end of 2004, some 230 legal download services were operating in 30 countries, compared to about 50 at the end of 2003.

Recognizing an important new player, a study by market research firm Shelley Taylor & Associates rated France’s FnacMusic number one of the 15 download music stores it evaluated at the end of 2004.
The store went live in September 2004 with a catalog of 300,000 tracks and immediately won recognition from the public and press for its customer-friendly features, including discounts for buying multiple tracks and the ability to download music videos and purchase concert tickets.

FnacMusic was designed and implemented by CSC in nine months, in response to Fnac’s need to redefine its strategy for online music distribution. Fnac is the top distributor of media and entertainment products in France; the company had been selling physical music in its stores for 35 years and on the Internet for five years. But facing declining CD sales, weak presence in online music distribution, the explosion of peer-to-peer network usage, and the emergence of players such as iTunes, Fnac called on CSC in 2003 to help the company plan how to re-enter the online music market, including identifying synergies with the company’s 60 stores in France and its existing Web site for books, CDs and technical products.

France’s FnacMusic online music store (www.fnacmusic.com), designed and implemented by CSC, gives consumers a broad range of functionality via a simple and friendly interface. Features labeled in the screenshot: physical products and concert tickets; download video logos and ring tones; my account; new albums and promotion; access by music style, mood and top hits; album push; packaged offers; new tracks of the week; top selling tracks; most popular playlists. Source: Fnac

“Fnac wanted to sell digital music via the FnacMusic site, providing consumers with a similar experience to its stores. CSC helped to design a site with a broad range of functionality and yet a very simple and friendly interface,” notes François Momboisse, FnacMusic director. The FnacMusic project was a CSC Award for Technical Excellence winner.

FnacMusic features tight connections to both fnac.com and the physical Fnac stores. For example, CSC and Fnac standardized the data that goes with a song and implemented cross references with the catalogs of the physical stores.
This categorization was essential for Fnac to set up the marketing necessary for online music sales, which would be coordinated with the more traditional music promotions in Fnac stores.

FnacMusic fundamentally changed how Fnac’s customers buy and listen to music, and how artists get paid. The story of legal downloadable music is a classic case of extreme data driving a new business model. Or as Ian Clarke, creator of the peer-to-peer network Freenet, once put it, “If your business model is selling water in the desert and it starts to rain, you’d better find a different business model.” As music has migrated from vinyl to CDs to the Internet, a physical product once limited in supply now flows as digital streams in seemingly limitless supply, prompting a new business model.

Being able to hand-pick your digital entertainment is impacting not just the music industry but radio too. With the advent of podcasting, or time-shifted radio, listeners are not confined to their local radio station but can listen to programs from all over (as long as the program is available as an audio file on the Web). Listeners download audio files to their audio players and listen to what they want, when and where they want.

Nearly 30 percent of the 22 million U.S. adults who own MP3 players have listened to podcasts, according to an April 2005 study by the Pew Internet & American Life Project.4 Podcasting, a term created from “iPod” and “broadcasting,” has been gaining momentum over the last year not just with listeners but with a new breed of do-it-yourself broadcasters. This combination of independent listening and independent broadcasting is what gives podcasting such potential.

For the listener, podcasting has been likened to “TiVo for radio,” putting programming control in the consumer’s hands.
Consumers can not only select what they listen to and when but also receive podcasts automatically (via RSS feeds, discussed in Meaning). For the broadcaster, just about anyone can set up shop and broadcast content without a broadcast tower or license.

Thousands of podcasts exist, with the number growing daily. Traditional shows include WGBH’s “Morning Series” and most of Air America Radio’s shows, along with a host of homegrown broadcasts covering everything from movies to politics to sports.

While not an immediate threat to radio, podcasting presents a new channel that traditional broadcasters need to recognize. As digital audio players proliferate, podcasting will only become more popular, providing mass customization for a traditionally one-size-fits-all medium. (For more on podcasting, see Social Connections.)

The same can be said of MythTV, akin to podcasting for TV. Similar to TiVo but with more functionality, MythTV lets you record TV or other video for later playback and turns your PC into a full-function cable box, eliminating the need to rent one from your cable provider. This do-it-yourself set top box lets you control what programs you watch and when, skip ads and surf the Web. In terms of data everywhere, this is TV data – typically the purview of industry executives – now at your disposal. As more and more TV shows can be downloaded and shared, this could threaten the broadcast and cable TV industries the way Napster threatened the music industry.

MythTV is software (an open source program) and hardware that transform your computer into a digital video recorder like TiVo and much more: a Web browser, an e-mail client, a game player. The creator wanted “the mythical convergence box,” he states on the MythTV Web site – a convergence of computing and TV.

Behind MythTV is peer-to-peer file-sharing software that delivers large TV (and other) files to your desktop in minutes rather than hours.
The leading program, BitTorrent, now accounts for over half the peer-to-peer traffic on the Internet and is a major innovation in the distribution of new media.

From online music stores to podcasting to MythTV, we are seeing a wake-up call to traditional media industries to be more flexible and look at ways to enable mass customization of their content – or be left behind while consumers do it.

PORTABLE ID: STRONG, SECURE

New forms of data, and lots of it, can be easily toted around in your wallet or your pocket. Data storage devices, once the size of a stack of dinner plates, have become breathtakingly small, enabling people to carry enormous quantities of data as easily as they carry a credit card or key chain. Smart cards and flash drives are being used to store biometric and other personal data, eliminating the need to carry numerous ID cards or paper documents and providing strong security and identification.

Belgium’s new electronic identity (eID) card for its 10 million citizens is a smart card containing a digitized version of the individual’s complete identity document, as well as strong electronic authentication and digital signature capabilities for verifying one’s identity online and conducting online transactions. The card has reinforced security through biometrics (currently limited to a digitized facial picture in the first version, but ready for any biometrics technique). As such, the card serves as a single identity tool for all citizens. Issued to every citizen age 12 or older, the card replaces a traditional laminated paper document considered no longer secure enough.

Belgium’s new electronic identity card for its citizens is the first national identity card of its kind – a smart card containing the individual’s full identity documentation, electronic authentication and digital signature capabilities, and biometric data.

Similar initiatives have begun elsewhere in Europe (Italy and Estonia), but Belgium is the first country in the world to introduce such an electronic identity card for every citizen. By mid-2004, 80,000 eID cards had been distributed as part of a pilot with 11 municipalities. Full rollout began in October 2004 and is expected to be completed within three years, with the first three million cards distributed by the end of 2005. Many have publicly praised the card, including Belgian government officials and Microsoft founder Bill Gates. The card supports the development of e-government applications, such as electronic tax returns, as well as online safety, such as for teen chatting, through the authentication and signature capabilities.

CSC led the design and implementation of the eID card and related software, including the security functions, and coordinated the project. This work earned the CSC Award for Technical Excellence. The work was performed in line with European Commission institutions and technology groups and the key development groups (Microsoft and major open source groups) in order to leverage the technology for future eID projects in other countries. The United Kingdom, Spain and Australia have expressed interest.

“Many people today already use the eID tools developed by CSC and are very pleased with the quality. What more can you want than happy users?” observes Bart Sijnave, eID project manager at the Belgian Ministry of Information and Communications Technology.

CSC led the design and implementation of the card and its related software. Source: CSC

Since the eID card was designed to be used in any circumstance where electronic interactions require strong authentication or signature of electronic documents and forms, an almost unlimited range of applications can exploit the eID card, including physical access control, Web site and portal access, single sign-on, e-procurement, e-invoicing, e-mail, and office documents, including PDF forms, that require digital signature.

Additionally, as designed by the team, the same card can also be used as a student card, municipal card or health care card by complementing the standard eID card with data or functions specific to those uses, such as an electronic purse for parking or canteen payment, access rights, or rebates for specific services. In terms of online safety, the software can be used to verify that a remote chatter is not older than, say, 14 based on his identity data. (Microsoft is integrating the eID technology into its MSN Messenger instant messaging software.)

In the United States, the Transportation Security Administration is piloting a biometric identity card for its workers that carries fingerprint and other biometric data. The data links workers to their identities and other related information, and eliminates the need to carry different ID cards for different transportation centers. TSA plans to test the ID cards with as many as 200,000 workers at 34 transportation hubs in six states. The idea of being able to carry biometric data on an ID card was unheard of five years ago, but today it is a reality that is reshaping security policy.

Another reality of having enormous amounts of portable data storage is that you can carry not just your ID but your medical information as well. The plug-and-play nature of flash drives makes them well-suited to store medical records, providing timely access to a person’s complete medical history. This can be crucial in an emergency when the individual is incapacitated. (In the United States, the Bush administration is pushing for electronic medical records for all Americans over the next decade.)
DATA IN A FLASH

Signaling a growing trend in secure portable data, flash drives are being used to store all sorts of data: operations manuals that can weigh over 10 pounds, playbooks for professional football players, student contact information for teachers at a day care center, sensitive client files for attorneys, and files for college students so they can access campus workstations without requiring a password or network storage. The flash drives put data everywhere in a lightweight and convenient, yet secure, form.

MedicAlert’s new E-HealthKEY is a flash drive for storing personal medical information. You carry the E-HealthKEY in your pocket or on a key chain, and a medical professional or first responder simply plugs the device into a USB port on any computer to access your medical information. The E-HealthKEY can hold a person’s complete personal health record, including medical images; this data is also uploaded to the MedicAlert database as backup, with support from the company’s response center. MedicAlert hopes the E-HealthKEY will become the standard way to communicate full medical histories, much like the company’s trademark bracelets have become the standard way to communicate special medical conditions. And, MedicAlert hopes the E-HealthKEY will be used by individuals not only for emergencies but also for ongoing management of their personal health records – a one-stop shop for prescription information, family immunizations, X-ray data and the like.

The E-HealthKEY contains your complete medical history, prescription information and other personal health data on a flash drive. Small and portable, it can be carried on a key chain or on your person. Source: MedicAlert

Privacy issues aside, being able to put more data about you on your person is providing important improvements in health care and security.

INFORMATION IMPLANTS: INVISIBLE, INDELIBLE

Implantable chips take this a step further.
Although medical implants have existed for years (e.g., rods, plates, pacemakers), you can now have an information implant that makes you and your personal information inseparable. Although implantable chips can conjure up unsettling images of “Manchurian Candidate” brainwashing, they are already being used by governments outside the U.S. for security applications and have been approved for use in the U.S. for health care applications. As implantable chips become more accepted, they may enable new identification and electronic transaction applications, like identity or credit cards that cannot be easily lost or stolen. Indeed, that’s the great thing about implanted information: it’s invisible and indelible.

The stage was set for human-implantable chips with animal-implantable RFID tags in pets. For several years, veterinarians have been implanting passive RFID tags about the size of a grain of rice into dogs and cats, so a lost pet can be identified even if the traditional metal ID collar tag is missing.

For humans in the United States, implantable RFID tags made by VeriChip have been approved for health care applications. If a patient arrives at an emergency room either unconscious or unable to effectively communicate, doctors can quickly obtain information about the patient’s condition by scanning the RFID tag with a hand-held scanner and accessing his patient records, which are stored in a patient database. The tag contains a 16-digit identifier that is used to look up the person’s records in the database.

In one experiment, the CIO of Harvard Medical School implanted a VeriChip tag near his elbow to test how it works. The tag, about the size of two grains of rice, was implanted with a needle in a five-minute procedure. When scanned, the tag’s ID number directs medical personnel to the CIO’s records at a hospital in Boston.
Although such indelible ID is new for the United States, human-implantable RFID tags are already in use in Mexico, where the Attorney General and at least 160 staffers have received RFID implants that are used to verify who the person is in order to control access to high-security rooms used in Mexico’s battle against drug cartels. Less invasive than an implant, when temporary ID will do, is an RFID stick-on tag or wristband for patients. Again, the idea is to have critical personal information available on the spot, especially if the patient is non-responsive. The U.S Food and Drug Administration has approved the use of a stick-on RFID tag that will help prevent surgery mistakes in U.S. hospitals, such as operating on the wrong patient or the wrong organ. The tag contains the patient’s name, the type and date of scheduled surgery, and the name of the surgeon. The tag is attached to the patient near the site of the surgery before the patient is sedated in the operating room. Before the surgeon begins the procedure, the tag is scanned and verified with the patient’s chart to ensure that no mistakes will be made. For all patients, not just those undergoing surgery, there is the RFID wristband, which has been tested as a replacement for the standard-issue plastic wristband. The RFID wristband contains the patient’s name, gender, birth date and medical record number. Doctors and nurses use tablet PCs with an RFID reader that picks up the information off the band over a wireless (WiFi) connection. With proper authorization, this data links to a central database containing the patient’s medical records and information from labs, pharmacy and billing. The RFID wristband has been tested as a way to streamline administrative tasks and improve the accuracy of handling patients’ treatment and medical information. 
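The pre-surgery check described above, in which the stick-on tag is scanned and compared with the patient’s chart, can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names and the sample patient are assumptions made for the example, not details of the FDA-approved product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SurgeryRecord:
    """Hypothetical layout for the four items the text says the tag carries."""
    patient_name: str
    procedure: str
    surgery_date: date
    surgeon: str


def mismatched_fields(tag: SurgeryRecord, chart: SurgeryRecord) -> list[str]:
    """Return the names of the fields where the scanned tag and the chart disagree."""
    fields = ("patient_name", "procedure", "surgery_date", "surgeon")
    return [name for name in fields if getattr(tag, name) != getattr(chart, name)]


# Invented sample data: the tag and the chart disagree on the procedure.
tag = SurgeryRecord("Pat Example", "left knee arthroscopy", date(2005, 6, 1), "Dr. Example")
chart = SurgeryRecord("Pat Example", "right knee arthroscopy", date(2005, 6, 1), "Dr. Example")

problems = mismatched_fields(tag, chart)
if problems:
    print("Do not proceed; mismatch in:", ", ".join(problems))
else:
    print("Tag matches chart")
```

Reporting which fields differ, rather than a bare pass or fail, is a design choice made here so that staff could see at a glance what needs to be resolved before surgery.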
TELEMATICS: DIAGNOSTICS AND DISCOUNTS

For years, machines ranging from spacecraft to elevators have been reporting diagnostic data about themselves to improve maintenance and productivity. This self-diagnosing gets extreme when it’s your car and it’s reporting about you, the driver. Your car has been reporting data about itself for some time, analyzing its own exhaust for inspection officials and providing location data for safety systems like OnStar. Now your car can report data about your driving habits in an effort to promote safe driving and grant insurance discounts.

Devices aimed at consumers – specifically, parents of teen drivers – are serving as event recorders in cars, similar to the “black box” event recorders in airplanes. Two of these devices are the Road Safety Teen Driver system and Davis Instruments’ CarChip. The devices, which plug into the car’s onboard diagnostics (OBDII) port, capture driving events such as speeding and aggressive driving (e.g., hard braking or aggressive starts), as well as time of the event, and enable data downloading to a PC for analysis. The devices, which sell for under $300, are like having a parent in the car at all times. “Road Safety and CarChip are the most proactive investments you can make for your teen drivers,” declares LEF Technology Programs director Paul Gustafson.

The devices are also being used to parlay safety into lower insurance rates. The CarChip is the basis of a new insurance program in the United States by Progressive Casualty Insurance Company. Using technology from Davis Instruments similar to the CarChip, Progressive has developed a system that records the driver’s behavior while he or she is driving (e.g., time trip started, duration of trip, mileage, aggressive braking events, aggressive acceleration events, speed at 10-second intervals).
The purpose of Progressive’s TripSense program is to give safe drivers, as demonstrated by their actual driving habits, discounts of up to 25 percent on their auto insurance rates. For now, Progressive is focusing on when and how the vehicle is driven, not where.

The Road Safety device is the size of a small book and can sit under the driver’s seat; the CarChip is about the size of a domino. The Road Safety system beeps when the driver exceeds thresholds set by the parent, and includes GPS (Global Positioning System) location capabilities. The CarChip includes engine diagnostic codes so drivers can perform simple diagnostics, like determining what the “check engine” light means and resetting it. Both devices come in versions for professional fleets of vehicles too.

Road Safety issues reports like these showing actual driver behavior. The report at left is a summary of driver activity from March 5 to May 11, 2005. It shows such things as highest speed during this period (82 m.p.h.) and highest gravity forces (.65 Gs) from excessive turning, braking or accelerating. Above, a second-by-second detail report shows the driver’s speed during 60 seconds on April 1. The car’s owner, who lives in Colorado, sets the thresholds for speed and Gs; if anyone driving the vehicle exceeds them, the Road Safety device sounds an alarm and reports it as an excessive event.

The CarChip snaps into your car and reports data about your actual driving habits. It is used to encourage safe driving, especially among teens. Source: Davis Instruments

Participation in the program, which launched a pilot in Minnesota in August 2004, is optional for Progressive customers. Participants plug a TripSensor recorder into the OBDII port of their cars. Periodically, participants remove the TripSensor to review their driving data on their home PC. To be eligible for the discount, participants must then upload their data to Progressive.
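The threshold alarm described for the Road Safety device, where the owner sets limits for speed and Gs and anything above them is logged as an excessive event, can be sketched as follows. The sample layout, the function name and the threshold values (75 m.p.h. and 0.5 G) are assumptions made for illustration; only the peak readings of 82 m.p.h. and .65 Gs are borrowed from the sample report. The device’s actual logic is not described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One second of recorded driving data (hypothetical layout)."""
    second: int
    speed_mph: float
    g_force: float


def excessive_events(samples, max_speed_mph, max_g):
    """Return (second, reasons) for every sample that exceeds an owner-set threshold."""
    events = []
    for s in samples:
        reasons = []
        if s.speed_mph > max_speed_mph:
            reasons.append("speed")
        if s.g_force > max_g:
            reasons.append("g_force")
        if reasons:
            events.append((s.second, reasons))
    return events


# Invented second-by-second readings.
samples = [
    Sample(0, 62.0, 0.20),
    Sample(1, 71.0, 0.25),
    Sample(2, 82.0, 0.30),  # over the assumed speed threshold
    Sample(3, 60.0, 0.65),  # over the assumed G threshold
]

events = excessive_events(samples, max_speed_mph=75.0, max_g=0.5)
for second, reasons in events:
    print(f"second {second}: excessive {', '.join(reasons)}")
```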
Such driving behavior data is enabling a new way of doing business at Progressive: usage-based discounts. A usage-based discount program is one of many firsts at Progressive, which initially tested a usage-based insurance program using GPS and cellular technology in 1998 (which proved not cost effective at the time). In 2002, Progressive shared its knowledge of usage-based programs with Norwich Union, granting the U.K. insurer exclusive rights to Progressive’s patented method of determining usage-based auto insurance premiums. Norwich Union is now running a Pay As You Drive pilot in the U.K. and expects to make the program fully available to its customers in 2006. Unlike Progressive, Norwich Union is using GPS and cellular technologies in its black box, a bit smaller than a video cassette. The device is installed in the car’s trunk, and data from it will be used to adjust drivers’ premiums on a monthly basis based on how often, when and where the vehicle is driven. The approach, modeled after similar “pay as you go” pricing schemes for gas and electricity, is intended to be fairer and to give consumers more control over their insurance rates. The response to the pilot has been overwhelming, with the company turning down many volunteers once it had the 5,000 it needed. The 5,000 volunteers have the GPS device installed in their car’s trunk. The device calculates such factors as the time and place of a car trip. The data is reported directly to Norwich Union every 24 hours via the cellular technology. This allows Norwich Union to actively determine a premium for the insurance policy marketed to parents of young drivers. The fixed component of the premium covers the standard risks not associated with the highly risky time between late evening and early morning. When insured drivers operate their vehicle during these times – for instance, a teen is driving at midnight – they are charged an additional premium. 
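The two-part premium described above, a fixed component plus an additional charge for driving in the risky late-night hours, reduces to simple arithmetic. The sketch below is a hedged illustration only: the night window (11 p.m. to 6 a.m.), the fixed amount and the per-trip surcharge are invented values, and Norwich Union’s actual pricing rules are not given in the text.

```python
from datetime import datetime, time

# Assumed values for illustration; the real window and rates are not stated.
NIGHT_START = time(23, 0)
NIGHT_END = time(6, 0)


def is_night(t: time) -> bool:
    """True if t falls in the night window, which wraps past midnight."""
    return t >= NIGHT_START or t < NIGHT_END


def monthly_premium(trip_starts, fixed=30.0, night_surcharge=1.0):
    """Fixed component plus a surcharge for each trip that starts at night."""
    night_trips = sum(1 for start in trip_starts if is_night(start.time()))
    return fixed + night_surcharge * night_trips


# Invented trip log, as the black box might report it.
trips = [
    datetime(2005, 5, 2, 8, 15),   # daytime
    datetime(2005, 5, 2, 23, 30),  # night
    datetime(2005, 5, 3, 0, 10),   # night
    datetime(2005, 5, 3, 17, 45),  # daytime
]

print(monthly_premium(trips))  # two night trips are surcharged
```

Charging per night trip is one possible reading of “an additional premium”; charging per night mile or per night minute would follow the same pattern with a different quantity summed.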
Norwich Union hopes Pay As You Drive will change customer behavior behind the wheel, increasing safety and reducing premiums. The goal is fewer accidents, which translates to higher profits for the insurer. Additionally, the approach starts to change the insurance product from something that most people do not want to buy, think costs too much, and don’t plan to ever use (i.e., file a claim) to a desired product. Using GPS data, Norwich Union can begin to provide value-added services, such as paying tolls.

We have come a long way from collecting data about the car’s diagnostics. Collecting data about driving behavior is relatively new, but it is a natural outgrowth of data collection by the car. In the U.S., 90 percent of all new cars sold come with a black box that collects crash data; some 30 million vehicles already on the road have this box, a traditional event data recorder. The data can be used for safety research, car design and accident investigations. The National Highway Traffic Safety Administration has proposed that the boxes be standardized by 2008 so that the same data is collected in the same format, leading to better analysis and, ultimately, safer car designs. The data holds obvious interest for crash investigators, insurers, consumers and lawyers. As part of the proposal, auto manufacturers would have to disclose the data to vehicle owners and make it easier for researchers and crash investigators to access the data.

From cars to cards to cell phones, data everywhere is transforming the nature of business and consumer activities. Data is being unleashed from the enterprise to the edges of the network, putting it within our grasp and providing new levels of efficiency, convenience and flexibility.

TIME AND PLACE

DATA ABOUT WHEN AND WHERE PEOPLE AND THINGS ARE, AND WHAT’S HAPPENING NOW

You are stuck in traffic and need real-time traffic information to figure out an alternate route. Time and place data can help.
The integration of location-detection technologies, digital cameras, real-time sensors, wireless and mobile devices, and geographic information systems has enabled new types of applications that focus on time and place – applications that use data about when and where people and things are, and what’s happening now.

Extreme data applications dealing with time and place encompass location, mobility, real-time and presence. Typically these elements do not appear in isolation but rather interact to enable a wide variety of exciting new capabilities:

Extreme data applications use data about the current location of people and mobile objects. Location detection technologies such as GPS and RFID, coupled with map data from GIS (Geographic Information Systems) technology, enable four important capabilities: location awareness, dynamic mapping, object tracking and rapid identification.

Users are often mobile themselves and use cell phones and other wireless devices to retrieve information that is immediately useful at their current location. Many times their current location is key to the application.

Extreme data describes what is happening now – real-time data that can drive immediate business or personal decisions.

Extreme data provides insights into what is happening at a remote location. It provides a remote presence to users or computer applications by using data from cameras and real-time sensors.

Let us look at time and place data through the lens of where the data originates: location technologies, digital cameras and real-time sensors.

Taken together, time and place data are about providing visibility – that is, giving a much more accurate picture of a business process (copper production) or situation (traffic), or simply where someone or something is. People make business and personal decisions all the time based on timing and location, so having the best data possible is critical.
Time and place data move us closer to real-time scenarios, or at the very least minimize delay. This has bottom line impact for everything from production processes and supply chains to public safety and personal connections.

LOCATION, LOCATION, LOCATION

Location Awareness. The location of a person or object can be automatically detected, and services can be provided that are tailored for that location. GPS, with its origins in the military, has become nearly commonplace today, found in everything from cell phones to car navigation systems. GPS data tells where a person or object is, providing location coordinates to within 10 meters. Several consumer transportation applications – stuck in traffic? speeding? waiting for a bus? – and two commercial applications that track vehicles and employees get our attention as extreme uses of GPS.

[Figure labels: Location Data, Basic Tracking, Mapping/GeoCoding, Translation, Location Information, Destination Addresses, Traffic & Road Conditions, Interpretation, Location Knowledge, Instant Notification, Flexible Re-Routing. Sample values shown: Long: 96.793566, Lat: 33.032327; 2097 N. Collins Blvd., Richardson, TX; “Traffic is slow”; “Road is partly blocked.”]

Location data, leveraged by mobile communications, plays a critical role in supply chain management. Source: Hanns-Christian L. Hanebeck and Bryan Tracey, “The role of location in supply chain management: how mobile communication enables supply chain best practice and allows companies to move to the next level,” International Journal of Mobile Communications (IJMC), Vol. 1, No. 1/2, 2003, pp. 148-166.

If you drive in San Francisco or certain other traffic-clogged metro areas, a service by Zipdash lets you know where traffic jams are, alternate routes, and how long you are likely to be stuck in traffic.
The service, accessed via your GPS-equipped Nextel mobile phone, displays real-time traffic speeds and congestion on a moving map, so you can pick the fastest way home or delay your trip until the traffic clears. The service transmits your car’s location and speed at regular intervals and aggregates this with similar data from other Zipdash users, as well as data from other sources, to determine traffic speed and flow. In the car itself, Pioneer Electronics has introduced a system that integrates real-time traffic data with the car’s navigation system. Traffic conditions are continually broadcast, and alternate routes plus millions of points of interest are available. The Pioneer AVIC-N2 receives traffic data from an XM Satellite Radio service and displays it, using icons, on a 6.5-inch color monitor in the car. When introduced in November 2004, this was the first aftermarket in-car navigation system to incorporate satellite-based traffic data. (Live video of traffic data is also becoming available to drivers; see Traffic and Safety later in this chapter.) Rand McNally, known for its road maps, recently announced a traffic service for roughly 90 U.S. metro areas, delivered to cell phones. Users enter zip codes for traffic areas they are interested in, and receive realtime traffic information and speed maps on their phone. Eventually, the service will be available for GPS-equipped phones, obviating the need to enter zip codes. But zip codes can be handy for the service’s Commute Wizard, which lets commuters enter the starting and ending zip codes of their routes and delivers the relevant traffic information. Now say the traffic has cleared and you are driving home fast to save time. A device from Origin blue i, a UK-based company, lets you avoid fixed speed traps by sounding an alert as you near a speed trap camera. The system uses GPS to compare your car’s location to a database containing the location of thousands of speed trap cameras. 
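The comparison of a car’s GPS position against a database of camera locations can be illustrated with a great-circle distance check. This is a minimal sketch under stated assumptions: the haversine formula is a standard way to compute distance between two latitude/longitude points, but nothing in the text says the Origin blue i device uses it, and the camera names, coordinates and 500-meter alert radius below are invented.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cameras_within(position, cameras, radius_m=500.0):
    """Return the names of cameras within radius_m of the car's position."""
    lat, lon = position
    return [
        name
        for name, (cam_lat, cam_lon) in cameras.items()
        if haversine_m(lat, lon, cam_lat, cam_lon) <= radius_m
    ]


# Hypothetical camera database and car position.
cameras = {
    "camera A": (51.5000, -0.1200),
    "camera B": (51.5200, -0.1000),
}
car = (51.5010, -0.1200)

nearby = cameras_within(car, cameras)
if nearby:
    print("Alert: approaching", ", ".join(nearby))
```

A real device would also need to consider direction of travel and would index thousands of cameras spatially rather than scanning them all, but the distance test is the core of the idea.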
Using GPS is much more accurate than conventional radar detection because many cameras being installed these days do not emit radar waves.

And if you are taking the bus instead of driving, NextBus Information Systems not only plots the location of a bus en route but predicts the arrival time of the next bus at your stop. The NextBus system outfits buses with GPS devices and feeds the location data into its proprietary modeling software, which factors in traffic and other stops to calculate when the bus will get to a specific stop. This information is updated constantly and can be accessed via Internet-enabled mobile phones, two-way pagers, PDAs and Web browsers. The result: real-time arrival information, not static bus schedules, that helps riders manage their time.

On a much larger scale, GPS is being used to track vehicles that are hundreds or even thousands of miles away. Satellite Security Services, or S3, offers a GPS-based commercial satellite tracking service for organizations and individuals. From a command center in San Diego with banks of computer screens, staffers keep track of hundreds of vehicles in real time: school buses in Washington, D.C., milk trucks in Houston, oil tank trucks in the Midwest, teenage drivers. S3 is one of a growing number of private companies providing satellite tracking services, which the company reports have improved security and efficiency for its clients.

Overall, people accept the use of GPS for tracking vehicles and, by extension, the people in them. But what about using GPS to unabashedly track mobile workers in the field? At first glance this smacks of Big Brother, which is what makes it extreme. But Xora Inc. markets its GPS TimeTrack service as an efficiency booster, and companies in everything from construction to service industries are signing on.
Employees don GPS-enabled mobile phones and go about their work day; the boss can know their whereabouts throughout the day as the phone tracks and records their locations. Employees also clock in and out of work, or when they start and finish a job, via the phone.

Overall, the GPS system is being upgraded with the next set of modernized spacecraft, with launches beginning in late 2005. Enhanced capabilities will include higher power for existing signals and new civil and military signals.

GPS has shown great commercial benefits, to be sure, but it has limitations that other technologies are addressing. GPS is for locating people and things outdoors or in line-of-sight to a GPS satellite; a complementary emerging technology, Ripple, is for locating people and things indoors. Think of GPS as global and Ripple as local. For example, Ripple technology has been piloted in an airport to locate passengers in real time so they don’t miss a flight.

Dynamic Mapping. In addition to tracking and identifying, location data can be used for mapping. People and objects can be further tracked by superimposing their location and movements on a map. Or, data about the environment can be superimposed on a map, providing a more intuitive understanding.

The U.S. Environmental Protection Agency is using geospatial data – geographically-referenced data about natural or man-made features on earth, such as rivers and roads – to generate maps for its environmental data. Through its EnviroFacts online database, the EPA provides public Web access to air quality, water quality and toxic waste information for a specific area, displaying results on dynamically created maps. Enter your zip code in the “Window to My Environment” portion of the site and see a local map; from here you can click on items such as hazardous waste and toxic releases and redraw the map to see if these things are in your area, and where.
Being able to map the data to its location is crucial to understanding at a glance where environmental problems may be.

Object Tracking and Rapid Identification. People and applications can track the movement of an object, vehicle, animal or person through a building or across the country. Moving objects, vehicles, animals or persons bearing RFID tags can be rapidly identified as they pass by a sensor; the tag’s identity data is used as a database key, and data about the object is then rapidly retrieved and updated.

BHP Billiton, a diversified energy and natural resources company based in Melbourne, Australia, is using RFID tags to track large, costly stainless steel plates used in the production of copper. RFID is used to boost production efficiency and reduce maintenance costs of the plates, which wear down over time. At the RFID trial site in Chile, CSC estimated a return of $1 million over five years. If RFID is implemented at other BHP Billiton copper plants in South America, savings are estimated to be greater than $10 million over five years.

For optimum copper production, the performance of the plates should be periodically evaluated. Poor-performing plates result in lower grade copper and are refurbished or eventually retired from use. Previously, there was no good way to identify individual plates due to the harsh environment they operate in. The plates were visually inspected but the process was not efficient.

The plates, each about one meter square, are dipped in a chemical bath in a large tank, where they are subjected to hot, acidic conditions followed by repeated mechanical shock (flexing and hammering) to get the copper off the plates. These harsh conditions prevent the use of other identifying technologies such as bar codes, which cannot withstand the acid corrosion. About 30,000 plates are in the bath at any one time; a new copper plant BHP Billiton is planning at the Spence mine in Chile will house 40,000 plates in total.

Working with CSC, BHP Billiton chose RFID technology embedded in a small, protective capsule about an inch long and affixed a capsule to each plate. The plates can be identified as they move through the tank house and their performance evaluated during each five-day production cycle, optimizing the amount of high-grade copper produced and making it more efficient to identify the plates that actually need maintenance. “The RFID technology is an enabler that will allow us to significantly reduce the capital cost of the plant,” says Alan Pangbourne, BHP Billiton project manager for Spence plant development.

In other work with BHP Billiton, CSC is designing RFID solutions to track vehicles in underground mines, to track special tools and test equipment, and to track inventory.

Many retailers are exploring RFID tags to track inventory and cut out excesses – and thus costs – in the supply chain. Retail giant Wal-Mart is leading the way, though its highly publicized effort to require its top 100 suppliers to deploy RFID tags as of January 1, 2005 was scaled back due to start-up difficulties. By mid-January, 57 of the 100 suppliers were using RFID on cases and pallets. Wal-Mart’s read rates were less than 99 percent, and several other system problems were encountered.

As Wal-Mart works out the wrinkles, its move puts a stake in the ground with RFID, which has seen sluggish adoption due to competing standards and cost. But with a giant like Wal-Mart making a commitment, suppliers and retailers are following suit, realizing that RFID is here to stay. In January 2006 an additional 200 suppliers are scheduled to go live with RFID; some of those suppliers are already up and running.

Wal-Mart, known for its IT prowess and relentless pursuit of operational excellence, expects its use of RFID to cut billions from its supply chain by reducing inventory, better matching current stock to current demand, and curtailing theft. Put another way, the number one job of RFID in the supply chain is to mitigate risk.

“Everyone knows that logistics, poorly executed, can suck the lifeblood out of a business,” stresses Peter Cochrane, co-founder of ConceptLabs and former chief technologist and head of research for BT. “Having stage-by-stage information mitigates risk. RFID tags on every item and container, with corresponding scanning capability, is the best current route to end-to-end pervasive tracking.” The tradeoff, Cochrane cautions, is having exponentially more data to manage. However, because having an inefficient supply chain is far worse, Cochrane argues that RFID is worth tackling.

A sophisticated location system by AeroScout uses RFID and other location technologies to provide several kinds of location information, including presence detection, real-time location and choke-point detection (i.e., that an asset has passed through a door or gate). The AeroScout Visibility System, based on a WiFi infrastructure, is being used by one transportation logistics provider to automate its trailer-tracking process once tractor trailers enter the company’s 60-acre cross-docking facility. The system is saving time and money as trailers are located nearly instantly rather than in 30 minutes by a human “spotter” and moved to their appropriate place in the facility for receiving or unloading cargo.
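The tracking pattern running through these examples, in which a tag’s identity serves as a database key and each read at a door or gate updates the record, can be sketched as below. This is an assumption-laden illustration rather than any vendor’s design: the in-memory dictionary stands in for a real database, and the tag IDs, asset descriptions and choke-point names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical record stored under a tag ID."""
    description: str
    last_seen: str = "unknown"
    history: list = field(default_factory=list)


def apply_read(db, tag_id, choke_point):
    """Use the tag ID as the key; update the record and report whether the tag was known."""
    asset = db.get(tag_id)
    if asset is None:
        return False
    asset.last_seen = choke_point
    asset.history.append(choke_point)
    return True


# Invented database and stream of reads from choke-point readers.
db = {
    "TAG-0001": Asset("trailer 17"),
    "TAG-0002": Asset("trailer 42"),
}
reads = [("TAG-0001", "gate A"), ("TAG-9999", "gate A"), ("TAG-0001", "dock 3")]

unknown = [tag for tag, point in reads if not apply_read(db, tag, point)]

print(db["TAG-0001"].last_seen)  # most recent choke point for trailer 17
print(unknown)                   # reads whose tag is not in the database
```

Keeping the tag itself free of business data, and holding everything else in the database, is what lets the same read event serve identification, location and history at once.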
This AeroScout device fits on a truck or other vehicle to provide location information, such as the precise current position of a truck as it moves through a cargo facility. The device uses RFID and other location technologies and WiFi wireless communications. Source: AeroScout

RFID is being used in numerous other applications to track things as varied as packages, baggage, livestock and students. Efficiency and safety are key motivators as the cost of RFID tags has fallen to as low as 25 cents per tag, making them economically viable in many applications. (Some argue that prices must fall below 10 cents to bring RFID into mainstream applications, replacing bar codes. There are security issues with RFID as well.5)

United Parcel Service, which has been using RFID for years starting with large trucks, is exploring newer RFID technologies and multi-protocol readers that would be more flexible than a single-protocol reader. UPS is conducting two RFID pilots, one with reusable containers and one with vehicles once they reach a UPS facility. The reusable containers are for holding irregularly shaped packages. Bar codes on these containers have not been easy to read and often deteriorate over time; the RFID tags improve the read rates of these containers. With the vehicles, UPS hopes that RFID will improve dispatch and security processes. Overall, the company, which delivers over 13 million packages per day worldwide, is focusing on real-time package flow information to improve efficiency and reduce delivery vehicle travel distances.

Financially strapped Delta Air Lines plans to track most baggage with RFID by 2007 to cut costs associated with misrouted baggage, which reportedly can amount to $100 million per year for the airline. RFID is more accurate than the current bar code method (99 percent of bags with RFID tags are read correctly, versus 85-89 percent with bar code tags), and thus should reduce misrouting. RFID can provide not only basic identification and routing information but also link to information about the contents of the bag gleaned from screenings.

Elsewhere, livestock is being tagged with RFID to prevent or limit the spread of disease, such as mad cow disease. Cattle producers in the United States, Canada and Australia are using RFID for tracking livestock. The U.S. National Cattlemen’s Beef Association launched a nationwide livestock RFID program in January 2005 that involves ranches, retailers and restaurants. It is worth it to the industry to make the RFID investment rather than risk losing business due to infected meat or the perception of infected meat. An RFID tag is attached to each animal’s ear; the objective is that cattle can be tracked through the supply chain in less than 48 hours versus several days of labor-intensive effort. CSC has conducted tests of RFID tagging for cattle in Switzerland; CSC has been leveraging its test and evaluation experience to propose approaches for a U.S. animal tracking system to be fielded by the U.S. Department of Agriculture.

At school, some 28,000 students near Houston are piloting the use of RFID badges that they press against a reader when they get on and off school buses, in an effort to prevent children from being lost accidentally or kidnapped.

SEEING IS BELIEVING

The power of visual information is driven home by pictures from digital cameras and other sensors that let you see what is happening at a remote location or inside an otherwise inaccessible area or process. The Web cams of CAMNET monitor air pollution and visibility for numerous U.S. cities, updating their data every 15 minutes; the Mount St. Helens VolcanoCam provides near real-time imagery of the erupting volcano. We can “be there” thanks to live Web cams that bring us to the scene of a remote situation.
Web cams are not new; they have monitored surf and ski conditions, among other things, for years. What is new is the surge of camera data being put to work in new ways, from traffic and safety to health care, mapping and information retrieval. Traffic and Safety. Cameras are key components in traffic management systems, which have become increasingly networked in an effort to manage traffic in real time for optimal flow. From Maryland to California, systems that deploy hundreds of cameras, sophisticated networks and powerful computers are seeing to it that traffic moves safely. One system aiming to integrate the use of video at new levels is the Colorado Transport Management System (CTMS). The state of Colorado is a leader in sharing video data, being one of the few states that shares video data state-wide today. The media, cities, state regions and traveling public have access to real-time video feeds via closed circuit TV and the Web (wired and wireless). CTMS, a project being done in partnership with CSC and Enroute Traffic Systems, seeks to integrate cameras with other devices for intelligent traffic management. For instance, just before a Denver Broncos football game starts, a scenario will execute across a number of devices including cameras. Dynamic message signs near the stadium will announce “Game Today – Alternative Route Advised,” while cameras at congested intersections will automatically turn to view traffic inbound to the stadium. Afterwards, a post-game scenario will execute automatically, turning cameras to view outbound traffic and later clearing the signs. “Intelligent traffic management is the vision of CTMS. Integrating camera data into an overall solution that allows the cameras to work with other devices in a complex scenario is a big part of this vision,” says Jason Westra, CSC’s system architect for CTMS. Other visions for CTMS include using cameras that can learn the norm and then report anomalies in traffic flow such as changes in speed. 
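The idea of a system that learns the norm and then reports anomalies in speed can be illustrated with a minimal Python sketch. It is not drawn from CTMS; the function name, the sample speeds and the three-standard-deviation cutoff are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_speed_anomalies(baseline_speeds, new_speeds, k=3.0):
    """Return (index, speed) pairs that deviate sharply from the learned norm.

    baseline_speeds: past readings used to "learn the norm" (hypothetical data).
    new_speeds: fresh readings to check against that norm.
    k: how many standard deviations count as an anomaly (an assumed cutoff).
    """
    norm = mean(baseline_speeds)
    spread = stdev(baseline_speeds)
    return [
        (i, speed)
        for i, speed in enumerate(new_speeds)
        if abs(speed - norm) > k * spread
    ]


# Hypothetical readings in miles per hour for one stretch of road.
baseline = [60, 62, 58, 61, 59]
latest = [60, 35, 63, 61]
anomalies = find_speed_anomalies(baseline, latest)
```

A production system would presumably learn separate norms by time of day and location; a single mean and spread is used here only to keep the sketch short.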
CTMS would automatically populate speed and incident maps for traveler awareness. Another use of cameras on the highways is the infamous red-light cam, which takes a picture of your car and license plate as you illegally pass through a red light and promptly issues you a ticket in the mail. The cameras spark fear in drivers, mostly for the good as they change their behavior and obey traffic signals. But sometimes the outcome can be bad if cam-aware drivers brake rather than proceed safely through a yellow light, and get rear-ended in the process. Cameras like these are used in over 100 American cities including Baltimore, San Diego and Charlotte, North Carolina. Camera data is also making its way into the driver’s hands, via the mobile phone, to help drivers see actual traffic conditions. TrafficLand’s AirVideo service enables mobile phone users to view live video of traffic conditions in the Washington, D.C. metropolitan area. Live traffic images from over 400 traffic cameras can be viewed on Web-enabled HTML-capable mobile phones. This is camera data getting to where it needs to be, informing drivers anywhere and anytime. Another innovative use of camera data is in the City of Westminster, in the heart of London, where some 40 cameras are being deployed in a wireless network to help fight crime. Because the cameras are mobile, they can be moved around, making it harder for criminals to elude them. In one case, suspected drug dealing was captured on camera and used as evidence in court. The dealers knew where the fixed surveillance cameras were and could avoid them by changing locations, hiding their heads or wearing masks. But when a wireless camera was affixed to a lamp post opposite a suspect corner in Soho, the dealers were caught on camera. With the wireless cameras, the City of Westminster has reported a significant reduction in crimes committed during the portion of the day when the majority of the city’s crimes occur.
The wireless cameras are part of the city’s Wireless City initiative; Intel has worked closely with the Westminster City Council as a strategic advisor, and joins Cisco Systems, BT, Vertex, Cap Gemini and Telindus in supporting the effort. In addition to lowering crime, the wireless cameras are cutting costs. In the past the City of Westminster had successfully deployed closed circuit television cameras for security monitoring, but the cost of installation was high. Wireless cameras can be installed for a quarter of the cost of wired cameras. The city of Essen, Germany, has piloted similar wireless surveillance cameras at the Zollverein mine, an open-air tourist attraction, and has explored transmitting the camera images in real time to PDAs carried by security guards. This way guards can make observations while on patrol and respond quickly as an event happens. The wireless security solution enables a few guards to monitor a large, open area that would otherwise be difficult to secure. The solution is being considered by the local police department. Face-To-Virtual-Face. Another way networked cameras are being used is to provide the presence of an expert. Telemedicine – the remote presence of a medical professional via a camera – is seen as a way to make the most of a scarce resource, doctors. At least 18 hospital systems in the United States have adopted new eICU (Enhanced Intensive Care Unit) technology to improve the care provided to patients in intensive care units. The eICU systems allow critical care doctors and nurses at a control station to monitor dozens of patients at different hospitals simultaneously, much as an air traffic controller keeps track of several planes. A camera provides the remote doctors a live view of the patient; heart rate, blood pressure and other patient information are also displayed at the remote control station.
The hospitals emphasize that the technology is meant to enhance, not replace, in-person care by allowing doctors to catch and respond to trouble more quickly. ICU patients report that they find the cameras to be reassuring rather than an invasion of privacy. Similarly, patients are reporting positive results with telerounding, whereby a camera mounted on a robot conducts the doctor’s rounds. A 5.5-foot tall, armless robot nicknamed Rudy has been making rounds at Sacramento’s UC Davis Medical Center, visiting patients after surgery. With Rudy at the patient’s side, the doctor can see, hear and navigate the area of interest – in this case, the patient. The doctor controls the robot via the Internet from a PC outfitted with a camera, microphone, joystick and software from the robot’s manufacturer, InTouch Health. The robot responds in kind, interacting with the patient and objects in its local environment via two-way audio and video. The doctor can get a close view of a surgical incision as well as hear a patient’s weak voice. Such RP-6 robots (RP stands for remote presence) from InTouch Health have been installed in over two dozen hospitals. You might ask, “Who wants to interact with a robot?” According to Dr. Yulun Wang, CEO of InTouch Health, many patients have reported high satisfaction with the robotic rounds. He cited a study by Johns Hopkins University that indicates patients would prefer to see their own doctor via the robot than to see another attending physician in person. But whether telerounding becomes widespread remains to be seen. Still, as the Baby Boomers age, if the number of doctors cannot keep pace, technologies such as Rudy may become commonplace. Just as facial cues are critical to the doctor-patient relationship, they can also enhance the company-customer relationship, resulting in more customer-friendly products.
Cameras are being used to record users’ facial expressions and comments as they test new software. In the past, only keystrokes and mouse movements were monitored, but a new application from TechSmith Corporation called Morae uses cameras to get a closer look at how users respond to software. The cameras provide emotional feedback, such as a scowl or smile, that is difficult to ascertain from traditional data. This feedback can give powerful clues about which parts of the software are easy to use and which are not, which in turn can inform software design decisions.

The doctor is in, visiting his patient remotely via a robot from InTouch Health. The robot’s camera lets the doctor see and be seen; patients report they would rather see their own doctor this way than see a different doctor in person. Source: InTouch Health

The remote presence provided by networked cameras has numerous potential uses, from conducting site visits to technology reviews to job interviews. Being able to deal with another person face-to-face without leaving the office is a powerful tool for organizations. Mapping the Planet and Your Neighborhood. Camera data is not confined to business applications but can be used broadly by everyone. Google and Microsoft provide a wealth of map and other GIS data right from their Web sites, effectively servicing the planet with satellite and aerial imagery and other geographic information. Google Maps and the more recent Google Earth are mapping services that incorporate satellite imagery. With Google Maps, you can type in a city name or street address and zoom in on it, or map a road trip. Google Earth offers a new level of functionality, enabling people to search for a location, zoom in on aerial images, and add driving directions on top of the 3D map. (Google Earth supersedes Google’s Keyhole offering; Google acquired Keyhole in 2004.) Also, users of Google Maps will be able to view better satellite imagery.
In response to Google’s services, Microsoft has announced Virtual Earth, which combines satellite imagery with local search (local search is discussed in Meaning). Users can overlay mapping data on satellite imagery to find local landmarks and buildings, for instance. Users can also view images such as buildings from a 45-degree angle, providing a unique 3D perspective. Google Earth uses satellite imagery from DigitalGlobe, a leading commercial supplier of GIS data whose roots are in the military. DigitalGlobe was the source of many before and after images of the coastlines hit by the Indian Ocean tsunami, as reported in newspapers and on TV.

DigitalGlobe was the source of many before and after images of the coastlines hit by the December 26, 2004 Indian Ocean tsunami, as reported in newspapers and on TV. The satellite images on these two pages show the hard-hit Banda Aceh northern shore on the northern tip of Sumatra. Such GIS data, once held primarily by the military, has become readily available for commercial and consumer use. Source: DigitalGlobe

Another commercial supplier of GIS data, ORBIMAGE, provides geospatial satellite imagery for applications such as pipeline routing, new construction planning, farming, forestry and travel planning. “Imagery adds an additional data layer for today’s GIS systems to help organizations manage facilities and resources, make better decisions and save money,” states Alex Fox, vice president and CIO of ORBIMAGE. Even NASA is supplying GIS data to consumers through its World Wind service, which can be downloaded free of charge from the Web. Released in January 2005, World Wind leverages satellite imagery and radar topography data from Shuttle missions to let people zoom in and visit the earth’s terrain in rich 3D.
As many more commercial applications of GIS data are expected, the OpenGIS Specifications are being developed to support solutions that “geo-enable” the Web, wireless and location-based services, and mainstream IT. The specifications enable developers to incorporate complex spatial information into all kinds of applications. In addition, the U.S. government is preparing for an impending surge in visual data. The National Geospatial-Intelligence Agency (NGA), formerly the Defense Mapping Agency, is exploring automated feature extraction and automated change detection software to exploit the wealth of visual data expected from NTM (national technical means), Unmanned Air Vehicles (UAVs), commercial imagery, motion imagery and what the military is calling “persistent surveillance” (continuous video coverage of specific locations for extended periods of time). To date, automated feature extraction and change detection have not been widely implemented in a production environment. One of the key challenges, in addition to processing so many pixels, is understanding the problem. A 100 percent solution may not necessarily be the goal. In other words, if searching for vertical obstructions, false positives may be acceptable (though not preferred), as no aircraft will crash into a false positive. If the software uncovers even one vertical obstruction, that is a better solution than putting huge amounts of imagery files in archives without being able to extract any information from them. CSC is exploring automated feature extraction and visualization techniques to support NGA’s needs. What Monument Is That? Cameras in mobile phones are being used to help people access information on the spot. Two services, ShotCodes and SnapToTell, provide information based on an image received from the mobile phone.
These services show the promise of mobile information retrieval using images rather than cryptic, hard-to-type text. The camera phone becomes the mouse for the mobile user. With ShotCodes, by Swedish mobile commerce company OP3, the user scans a shotcode, a black and white symbol that is put on a sign or dynamic plasma panel, to retrieve information associated with that shotcode. Special software reads the camera phone’s picture of the shotcode and prompts the phone’s browser to go to a particular Web site, where the information resides. Attendees at a trade show could scan the shotcode at a vendor’s booth for product information. People walking by an advertisement could scan the shotcode on it and be linked to the corresponding site, where they could browse a catalog, register for a newsletter or even purchase a product. The technology underlying shotcodes, which was developed by Intel and Cambridge University as SpotCodes, has also been commercialized by Bango.net as Bango Spots, which return tailored text, images, audio and video to the user’s phone. SnapToTell uses GPS technology in conjunction with the camera phone to access information. The experimental system is designed for tourists, who take a picture of a monument or building and, based on the location of the phone and the content of the image, receive information about the image. Developed in Singapore, the system searches a specially created scene database to match the user’s image to one it knows. Images in the database have associated text and audio descriptions, which are sent to the user. So the user can learn about the Sir Raffles Statue as he or she gazes up at it. Today the monuments, tomorrow the assembly line or battlefield.

This shotcode points to the LEF Web site when you scan it with a compatible camera phone. The shotcode is like a bar code, and your camera phone becomes a mouse.
Source: OP3

SIXTH SENSE

Sensors and other technologies are providing real-time awareness that gives us a sense of what is happening now or alerts us to conditions of immediate interest or that may help predict the future. We can develop a keen sixth sense, or intuition, about the world around us. Extra Sensor Perception. Networks of super-small sensors, sometimes called “smart dust” or “motes,” are emerging. These sensors, each about the size of a grain of sand, can be deployed in vulnerable or hard-to-reach places. As they become cheaper, they can replace wired sensors or monitor new areas where sensors were previously cost-prohibitive. Energy company BP is testing 160 motes in an oil tanker in the North Atlantic to see if they can help predict equipment failures. The sensors measure such things as vibrations in the ship’s pumps, compressors and engines. The company is also testing motes in a refinery to monitor bearing condition on essential motors. If the motes can sense and report that a bearing is starting to overheat, the problem can be fixed during regular maintenance, thus avoiding an emergency shutdown. (Lost production is very costly, especially given the high price of gasoline these days.) Elsewhere, Oil ID Systems is experimenting with motes to track oil as it is stored, shipped and sold. The motes are mixed into the oil and function like floating RFID tags, essentially “branding” the oil. The company hopes to use this technology to prevent theft by being able to physically trace the oil. Motes are also being tested for use in remote areas like forests, to detect wild fires much sooner than if detection were left to a random hiker or distant satellite.

Your camera phone becomes a mouse when you scan a shotcode on a billboard using a compatible camera phone. The shotcode takes the phone’s browser to a specific Web page, in this case one with additional information about the latest album from the Scandinavian pop group The Cardigans.
You can even order tickets to a Cardigans concert on the spot. Source: OP3

In addition to motes, larger-scale sensor networks show the broad applicability of sensor networks. Out at sea, a network of six sensor buoys in the Pacific is used for early detection of tsunamis. Tragically, no such network exists for the Indian Ocean, where the devastating 2004 tsunami struck. The United States has called for a global ocean sensing system and is expanding its network of six buoys to 38 to protect its entire coastline, not just the Pacific. More broadly, an ocean observation system that spans the globe uses an array of over 1,500 underwater sensor instruments – expected to grow to 3,000 – to measure water pressure, temperature, salinity and other factors. The data, reported via satellite from the instruments when they surface, is used for science and weather research. Some of this data was not available even a few years ago and is enabling new insights into ocean processes. Closer to home, the WeatherBug backyard weather station monitors the weather at your house and reports it to you, turning homeowners into amateur meteorologists. The station’s sensor collects data about temperature, rainfall, wind speed and other elements and continually sends it to your PC, which runs WeatherBug’s software. This data can be shared with other users and aggregated into WeatherBug’s community weather channel (much the way Zipdash, discussed earlier, aggregates individual driving data into a composite traffic picture). WeatherBug also provides severe weather alerts and forecasts via text messages to cell phones. (See more on real-time alerts below.) Sensor networks still have a way to go before they are ubiquitous; researchers are working on how to lower the costs and power consumption of sensor nodes. Core technologies and standards are still developing.
However, many believe the tide has turned – that there is more technology “pull” than “push” these days for sensor networks as organizations from the U.S. Department of Homeland Security to lighting companies explore how to leverage the technology. Real-time Alerts. Time is of the essence, and data is usually much more valuable the quicker it is received. We are moving towards a real-time or certainly an “Internet time” marketplace, in which we can be notified instantly of specific events or conditions of interest based on data received from sensors, data repositories or messages from people or applications. Alerts can be used to issue health and safety warnings, to fight crime, and to keep people up-to-date on breaking news and specific topics of interest. Say you live in Atlanta and have asthma. You need to know what the local air quality is all the time. Through the EPA’s EnviroFlash service, you can be notified every day via e-mail or pager about the current air quality in your area, giving you instant information that helps you decide whether or not to stay inside that day. A pilot of the service was launched in October 2004, and a national rollout began in May 2005. This was information you previously would have gotten via television or radio, but now you can get the information reliably without having to search for it, and you can customize it. The way EnviroFlash is designed, all subscribers receive Action Day alerts, which are issued when an Action Day is declared in your area.6 In addition, you are able to select whether you want to receive a daily air quality forecast when the forecast air quality index is at, or worse than, the level you specify from a pre-defined list (e.g., good, moderate, unhealthy, hazardous). You control the triggering of alerts, maximizing or minimizing them as desired. 
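The subscriber-controlled triggering described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not EnviroFlash’s actual design: the level names come from the example list in the text, while the `Subscriber` class, its fields and the `alerts_for` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Ordered from best to worst air quality; the four names are taken from the
# example list in the text, not from an official category table.
LEVELS = ["good", "moderate", "unhealthy", "hazardous"]


@dataclass
class Subscriber:
    """Hypothetical subscriber record."""
    email: str
    # None means the subscriber did not opt in to daily forecasts.
    forecast_threshold: Optional[str] = None


def alerts_for(sub: Subscriber, forecast_level: str, action_day: bool) -> List[str]:
    """Decide which alerts one subscriber should receive today."""
    alerts = []
    # Every subscriber receives Action Day alerts.
    if action_day:
        alerts.append("action_day")
    # The daily forecast goes out only when the forecast is at, or worse
    # than, the level the subscriber chose.
    if sub.forecast_threshold is not None:
        if LEVELS.index(forecast_level) >= LEVELS.index(sub.forecast_threshold):
            alerts.append("daily_forecast")
    return alerts


# Hypothetical subscriber who wants forecasts at "unhealthy" or worse.
asthma_patient = Subscriber("reader@example.org", forecast_threshold="unhealthy")
quiet_day = alerts_for(asthma_patient, "moderate", action_day=False)
bad_day = alerts_for(asthma_patient, "hazardous", action_day=True)
```

Keeping the levels in one ordered list makes the “at, or worse than” comparison a simple index check, and lowering the threshold is how a subscriber would maximize the alerts received.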
The EPA worked closely with CSC to design and implement EnviroFlash, a customizable public service alert system that translates into a healthier and safer population. One woman who used EnviroFlash wrote to the EPA that the service helped her not only manage her asthma medications but take better care of her husband:

“My husband has stage IV lung/brain cancer. It never occurred to me that the air quality could be so dangerous in the winter (I only knew about the cold and dry). Because of your e-mails we were able to plan his appointments and time outside around those bad days.”

That’s the goal of EnviroFlash: Reaching the public one person at a time with timely, relevant information. Others in the United States are looking at a broad public safety warning system that would enable states to warn people about everything from hazardous chemical spills to chemical attacks. The National Association of State CIOs (NASCIO) is planning a pilot system that would send alerts targeted to individual zip codes if necessary; these alerts could be picked up on mobile phones, PCs, PDAs, TV and radio. The alerts would be implemented and triggered on a state basis, with the ultimate goal being a standard nationwide public alert system called the All Alert System. The system is being modeled after Amber Alerts for abducted children, which are disseminated at the state level via TV, radio and the Internet. The All Alert System would be state-run and state-activated. Meanwhile, the Federal Emergency Management Agency is conducting tests for a similar Digital Emergency System, which uses the digital signals of public TV stations to transmit warnings to mobile devices, satellite and cable outlets, and others. FEMA and NASCIO are working to minimize any overlap. Then there is the type of alert that signals a probable activity based on a series of events or patterns.
Research by CSC has identified smart alerts as the next step in business intelligence, for they help a person build and interpret an evolving situation.7 Smart alerts can be used to warn of terrorist or drug activity, identify suspect insurance claims, or track emerging market moves. Behind smart alerts are new technologies and techniques in Web mining for identifying and analyzing Web data, both structured and unstructured. (Web mining is discussed in Meaning.) Smart alert systems would continuously mine Web-based repositories and use visualization techniques to illustrate emerging trends or developments. The resulting intelligence would be not just a textual alert but a representation of trends and patterns mined from the Web over time. For example, a customs agency may be interested in drug trading patterns. Today, the intelligence staff analyzes, clips and distributes articles deemed to be of interest. Tomorrow, with the smart alert, the news articles would be mined by computers to extract identified key concepts like drug type, source countries and known traffickers. Alerts would be dispatched automatically along with updated trend patterns or emerging relationship developments, also gleaned from Web and other data mining, providing the intelligence consumer with a high-fidelity picture of the emerging situation. Many consumers are hungry to be “in the know” or the first to know, and there are plenty of Internet-based news alert services for breaking news, sports, market events, technology updates and topics of interest. Google, The New York Times and ZDNet are just the tip of the iceberg of those offering these alerts. Google Alerts send you e-mail updates of the latest relevant Google search results for your topic based on keywords you identify. You might set up an alert to monitor a competitor, an industry, a breaking story or your favorite sports team.
The Times News Tracker directs articles to your inbox on up to 20 different topics you choose via keywords. When articles are published on nytimes.com that match the alert criteria, an e-mail alert is sent with a link to the article. Similarly, ZDNet’s email alert for technology news uses keywords and issues an e-mail with a link to the matching ZDNet article. News services are poised to deliver alerts, but what about something mechanical like a car? If your car crashes, wouldn’t you like it to notify an emergency center immediately? The ComCARE Alliance (Communications for Coordinated Assistance and Response to Emergencies) is exploring automatic crash notification as an important public safety measure. According to the Alliance, thousands of people in the United States die every year after being in a car crash and receiving insufficient or no help in the first “golden hour” after the crash. The Alliance reports that more than half of all crashes are on rural roads and over 40 percent of all fatal crashes occur at night; response time is longer in both cases.8 Automatic crash notification systems can change these numbers, providing real-time location data, crash data, personal medical data and voice contact to an emergency response unit. Some cars are already outfitted with crash notification systems today (OnStar is one such system). The Alliance wants to make such systems standard and nationwide; it is working on a recommended data set and interoperability among E911, emergency medical services (EMS), transportation systems and information systems. Time and place data let us know exactly where someone or something is, what is happening at a remote location, or if an event is about to occur. Time and place data give us powerful digital bearings that make us smarter, safer and more precise in our business and personal lives. 
SOCIAL CONNECTIONS

DATA THAT STRENGTHENS CONNECTIONS BETWEEN PEOPLE

The world of extreme data has a strong social side to it: people are interacting with each other at home, in the office and on the move in dramatically new ways. From online social networks to wikis to blogging to “MP3Jing,” technologies are in play that are redefining how we communicate and work with others. Many of these technologies are appearing at the edge of the network initially, in our personal lives, and then migrating to business. Finding friends online evolves to finding colleagues and professionals with similar interests; shared virtual spaces become collaborative work areas; personal publishing turns into a new way to reach customers; sharing digital music redefines the role of the DJ if not the club entertainment business. Several key technologies are making these social connections possible, including directory services, which help you work with people distributed on a network, and presence detection, which lets you know if a person is online and available. Today you can reach people in ways you never could before – people who share your personal or professional interests, live where you live, or are working on the same project – and get something done. Extreme social connections cluster around four areas: finding people, working together, extreme publishing and leisure.

FINDING PEOPLE

People are social beings – they like to stay in touch and they learn from keeping in touch with each other. New online services and technologies are helping people find not only long-lost friends and classmates but business colleagues. In the past, it was impractical if not impossible to find people with common business interests without “pressing the flesh,” but today’s Internet is making it possible to discover business contacts quickly, effectively and virtually. Finding Colleagues.
Web-based social networks sprang onto the digital scene a year or more ago, as a neat way to find friends and classmates. Friendster, Ryze, Tribe.net and orkut are some of the leading sites. Not surprisingly, sites have emerged for finding business contacts, and in some cases the original social networks are also being used for business contacts. The market has found that the Internet is a powerful tool for finding and linking people with common ground – be it a profession, location, political affiliation, colleague or friend. The network gains strength as more members add contacts who are visible to the entire membership.

Until recently, the main way to meet like-minded strangers online was through a dating site. But today, people with shared business interests can discover one another through such business network sites as LinkedIn, COMMON.net and Jigsaw Data. The power of these networks comes from linking business people from different organizations and locations. You can find expertise outside your company and in another city.

LinkedIn is the leading business network site, with over 2.4 million members worldwide. LinkedIn helps members find clients, employees, business partners, sales leads, industry experts, professional services and jobs. Members find what they need through trusted colleagues. You can search the entire LinkedIn member list to find people with the experience or qualifications you require, and then find a path of introductions to those people using your existing network of trusted colleagues.

People register for free and provide their own contact information as well as contact information about others that they are willing to share. The same is true of COMMON.net, a newer site that focuses on one-to-one business networking by finding common ground between “seekers” of people and “advocates” who provide the link to others. A more focused service is Jigsaw Data, which targets sales contacts.
Operating on the principle that one man’s trash is another man’s treasure, members of Jigsaw provide business contacts and can access each other’s contacts. Members pay $25 per month for access to 25 contacts, or can contribute 25 contacts in lieu of the $25 fee.

Jigsaw positions itself as a cost-effective alternative to gleaning contacts from large business databases such as Hoover’s and Dun and Bradstreet, whose fees can run to thousands of dollars a year. Jigsaw’s data is current and constantly updated. Further, the service is self-policed by members to help ensure that contact names are valid; any member can challenge a contact name he or she believes is invalid.

As one member reports on the company Web site, “Jigsaw is like having access to 1,000 Rolodexes.” Jigsaw, COMMON.net and LinkedIn are the new way to make business connections and help take the “cold” out of cold calling.

Finding People at a Conference. Making meaningful connections at a business event such as a conference can be difficult if not awkward. How do you find someone with your interests among a sea of strangers? Help is on the way from your name tag. An electronic name tag, the nTAG, beams messages to fellow conventioneers like, “Hi Jane, I’m Bob. I am interested in open source software too.” The device uses infrared sensing and RFID to communicate with other tags and even lights up in the dark for those who do their networking at the bar.

nTAGs communicate stored data when two wearers approach one another, and display messages that are customized for the two people. The badges can also be used by wearers to exchange electronic business cards and by companies to collect leads.

“People want a reason to interact,” explains nTAG inventor Rick Borovoy. “They need help. This gives them a powerful nudge in that direction.” Meeting participants have given the nTAG high marks for being an icebreaker that helps them circulate beyond their usual pool of friends or colleagues. Companies can use the data collected to evaluate attendance, session popularity, how many clients interacted with the marketing staff, or connections made within or between departments. (See Tapping the Power of Social Networks.)

[Figure: People wear an nTAG at a conference to help them interact with others with similar interests. A social network diagram can be generated in real time that illustrates these interactions. Dots represent people, lines represent the interactions (connections), and colors represent attributes of the person, indicating the basis for the interaction. Source: nTAG Interactive Corporation]

Another approach for making contacts at a corporate event uses special diagrams or maps that identify people with common business interests. At its annual technology conference in June 2005, CSC created buddy maps in advance for each conference participant based on interest profile data gathered during the registration process. Using social network analysis tools, CSC identified potential “interest buddies” and gave each participant a map of whom they should try to seek out during the conference. The initiative clearly impacted the networking dynamics of the conference. For Della Brown, a manager in CSC’s Newark, Delaware data center, it meant 10 new buddies as she successfully tracked down the “top 10” on her map, none of whom she had met before. Brown took the prize for the conference’s best networker.

Finding People on the Move. For those who are constantly traveling and on the move, online social networks are being adapted for mobile devices, enabling people to stay in touch with their most elusive colleagues and friends as well as send them location-sensitive information.

These mobile online social networks (called MoSoSos, for mobile social software) are ideal for young mobile professionals who use their mobile phones or laptops everywhere. Services like Dodgeball, Playtxt, Jambo and Plazes tell users when their friends or members of an affinity group are nearby. Another service, Crunkie, shows where your friends are and enables location-based blogging. Using their mobile phones, users submit and view notes and pictures tagged for specific locations. You could post a restaurant review tagged to the restaurant’s location, and another Crunkie user looking for a restaurant in that area could access your review using Crunkie’s mapping features.

[Figure: Jambo, a mobile online social network, helps people in close proximity find others with similar interests. Jambo uses personal area matching technology to help people with WiFi-enabled cell phones, PDAs and laptops meet each other. Whom should you be talking to at the next conference? Source: Jambo Networks]

Another type of mobile social network hails from Bluetooth wireless technology. “Bluejacking” is the act of sending unsolicited messages via Bluetooth-enabled cell phones to random people, even strangers, who also have Bluetooth-enabled phones. The technology, which scans for other Bluetooth-enabled devices within a 30-foot radius, lets you send a short message like “who r u?” to classmates during a lecture, passengers on the same bus or subway car, or co-workers attending a meeting. It is an unusual (and perhaps unwanted) way to break the ice, though meeting strangers on the Internet was met with trepidation at first too.

TAPPING THE POWER OF SOCIAL NETWORKS

How well connected are people in your organization? Despite what the organization chart says, how are people actually working together? How do we tap these invisible social networks to improve corporate performance?

Social network analysis is a burgeoning branch of management science that examines how people share and obtain information in large distributed groups. The focus is on informal organizational networks, where some contend the real work gets done. These networks are typically uncharted and unmonitored – literally off the corporate radar screen. Social network analysis uses sophisticated information tools to analyze these networks and provide recommendations for leveraging these networks to deliver revenue growth and innovation.

The outcome of social network analysis is often a change to corporate strategy or policy, as well as change at the individual level. Social network analysis may uncover that certain people are “key connectors,” holding the network together but posing potential bottlenecks, while others are loosely connected and underutilized. Groups chartered to work together may not be due to organizational boundaries, leader behavior, proximity, cultural differences or other factors.

As a result of the analysis, the organization would take steps to recognize and support key connectors, eliminate information bottlenecks and overburdened employees, pull in peripheral people who represent untapped resources, and bridge disconnects between groups or departments.

“You have to look beneath the organization chart to identify important gaps and determine ways to better interconnect,” explains Rob Cross, assistant professor of management at the University of Virginia and a leader in the field of social network analysis. In the case of a person becoming a bottleneck, for example, automate some of that person’s information or decision-making, or transfer it to other people. Make collaboration and information sharing part of the annual review and a criterion of new hires.

In one social network analysis conducted by Cross, an oil drilling company wanted to speed its decision-making process to minimize the number of days it spent drilling. Each saved day would save the organization $250,000-$300,000.

The social network analysis shown in the accompanying diagrams found that key leaders were not as connected as previously thought, one of the key connectors was a manager lower on the organization chart who simply owned access to a key database, and one department was splintered from the others due to physical distance. With these inefficiencies identified, that group found several ways to revise information access and utilize a more structured decision-making process to reduce time lost. These and other seemingly subtle changes ultimately shaved several days off the drilling process, which resulted in millions of dollars of cost reduction simply by identifying inefficiencies in the way work was getting done deep in the organization.

[Figure: FORMAL STRUCTURE. Organization chart of an Exploration & Production group: a senior vice president over the Exploration, Drilling and Production departments and their G&G, Petrophysical, Production and Reservoir teams.]

[Figure: SOCIAL NETWORK. The informal network linking the same people. Source: Rob Cross]

Getting things done often depends less on an organization’s formal structure than on an informal social network of colleagues. Social network analysis teases out the informal networks and helps companies improve individual and group connectivity for better organizational performance.

In another case, social network analysis was used to invigorate innovation at Mars, the food giant whose products include the flagship M&M’s candies as well as Uncle Ben’s rice, Whiskas pet food and beverage vending machines. Mars convened 300 of its top scientists for a social network analysis workshop led by Cross. The company felt its expertise was not well integrated, thus stifling innovation. The network analysis did indeed show opportunity points for better integration across functional lines and key technical competencies, using network techniques to identify key areas to improve network connectivity (and not just networking for networking’s sake).

Part of the workshop included having people wear nTAGs, electronic name tags, at the reception to help people connect on differences and find the expertise they needed.

“The nTAGs totally changed the dynamics of the offsite,” recalls Cross. “This was now mingling with purpose.” The nTAGs got people talking and sitting with others. “Offsites often entrench existing ways – people hang out with those they know – but the nTAG helps get people out of their comfort zone,” Cross says.

Each nTAG was encoded with the person’s existing social network and the areas of expertise the person needed to better integrate with for the company to be more successful in certain innovation efforts. So when a person passed someone in the room whom he did not know (i.e., was outside his social network) and who had expertise in areas he needed to integrate with (e.g., biochemistry), this commonality would flash on his badge and the other person’s badge and help them start a conversation. This generated a number of innovative ideas that otherwise would not have been developed at an offsite like this, where particularly technical people are prone to interacting only with those they already know and are comfortable with.

CSC is using social network analysis to study the relationships of its top account executives. This work is being done as part of CSC’s involvement in the Network Roundtable at the University of Virginia. The Network Roundtable, led by Cross, trains managers on various ways to promote performance through organizational relationships and social network analysis.

“Industry studies highlighting the value of social networks pointed us in this direction,” explains Beverly Bacon, senior manager in CSC’s global learning and development management group. “In our own survey of successful business development executives at CSC, the people surveyed consistently cited the importance of a good social network as key to their success. We are now undertaking a formal analysis of the networks these executives employ, in order to help develop other business developers and leaders across the company.”

Preliminary findings from the analysis confirm the importance of key relationships for successful business development and suggest areas for enhancement. The 70 survey respondents in the analysis assigned a value to their relationships that totaled in the billions of dollars of new business won.

However, the analysis identified several areas where CSC’s network of business developers was fragmented, providing some specific and targeted opportunities to enhance business development performance. One of these is leveraging remotely located CSCers for developing innovative thinking on key business development initiatives. This is already happening to some extent, but there is substantial potential for the practice to be broadened.

To conduct a social network analysis, researchers use a 30-minute Web-based survey to collect information from individuals about whom they consult for information and advice, as well as who comes to them seeking the same. Based on this information, network graphs are generated that depict the connections between people in the organization. The graphs are analyzed to identify problem areas and improve organizational performance.

Social network analysis is at the intersection of organizational behavior and information technology. In a world of extreme data, where organizational boundaries are routinely crossed via electronic networks, it is important to understand how people are, or need to be, working together. Behind every star performer is a well-honed social network; well-managed social networks offer a distinct competitive advantage for the organization as a whole too.

WORKING TOGETHER

Several technologies are helping people collaborate and communicate in new ways and get the job done. Advances in instant messaging, collaboration tools, Internet phone service, joint Web surfing and bookmark sharing are leading the way.

Online Presence in the Enterprise. Instant messaging, a staple of teenagers for years, is now a first-class citizen in the enterprise. The latest generation of IM tools – including Microsoft’s MSN Messenger, Yahoo Messenger, America Online’s ICQ 5 and Paltalk – has extended IM beyond simple text messages to include audio and video chatting; these advanced features are as easy to use as traditional text IM.

In the past, office workers often clandestinely installed IM tools on their computers to enhance communications with co-workers. Now many organizations, including CSC, are using secure IM tools such as Lotus Sametime as part of the supported toolset for office communications. These enterprise IM tools, including Microsoft’s new Office Communicator 2005, can integrate tightly with corporate directory services so users can easily find and exchange messages with anybody in their organization. CSC is implementing a next-generation portal that will leverage IM and directory services for its 79,000 employees.

As IM and directory services move into the enterprise, they are migrating from stand-alone services into specific applications, such as call center software.
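To make the idea of folding presence and directory lookup into an application more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Presence` states, the `Employee` record and the `available_experts` function are hypothetical and do not describe the API of any product named in this report.

```python
from dataclasses import dataclass
from enum import Enum


class Presence(Enum):
    ONLINE = "online"
    BUSY = "busy"
    OFFLINE = "offline"


@dataclass
class Employee:
    name: str
    department: str
    skills: set[str]
    presence: Presence = Presence.OFFLINE  # hypothetical default until the IM client reports in


def available_experts(directory: list[Employee], skill: str) -> list[str]:
    """Names of colleagues who list the skill and are online right now."""
    return [
        e.name
        for e in directory
        if skill in e.skills and e.presence is Presence.ONLINE
    ]
```

A call center application, for example, could run such a lookup behind a "who can help with this?" button, combining the directory's knowledge of who has a skill with the IM system's knowledge of who is reachable at that moment.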
Office Communicator 2005 integrates IM and other communication capabilities – presence awareness, notifications, voice, video and voice over IP – with Microsoft Office programs and other applications. Workers can immediately see from within the application who is online and available, call them into an online chat, and resolve problems more quickly than if they had to telephone or e-mail a colleague. “Integrating the social or ‘people’ dimension directly into the application accelerates how people work together,” observes LEF Technology Programs director Paul Gustafson. “In the past, the social dimension has been ignored in applications, served instead by freestanding IM or an employee address book. Now, by in effect integrating IM and the address book into the application, the employee is one step closer to the person he or she needs.” This raises an interesting issue around what we call “signaling” – letting people know how available you are and the best way to reach you. In the case of your phone, voice over IP, or IM session, you would want a signaling system to mediate between you and the outside world. Perhaps you would charge a fee before an outsider was put through. Trickier is how to handle calls within the firm; are you available only if the boss calls? Even trickier still is what the assumptions are about your availability if the company gives you a phone or PDA. Are you always on duty? Failure to make expectations explicit, and global, leads to frustration on both sides. Team Collaboration. Another type of socialization occurs when teams need to work together and share information. A wiki can be used as a powerful collaboration tool to unite team members and coordinate work, especially if team members are in different locations and a high degree of information sharing is required. A wiki is a shared space on the Web that is gaining traction in corporations. 
Anyone can add content to a wiki or edit content already there; users need not know HTML though they do need to learn a simple markup language. Users can enter or edit running text, submit attachments, provide links, make comments, and create hierarchies to organize topics. Security can be added as needed to a wiki, limiting access to team members only, for instance. In addition, wikis can be configured to track changes with revision control, keeping order and instilling accountability. As a writable (editable) space, wikis are an important step towards Web inventor Tim Berners-Lee’s vision of a writable Web. With today’s browser technology, the Web is primarily a read-only environment. Technologies like wikis begin to change that. Where appropriate, CSC uses wikis to help manage client projects. CSC’s Training Center of Excellence uses a wiki internally to manage best practices and lessons learned for general courseware development, as well as development issues and software documentation. “The wiki changes all the time, as do best practices,” notes David MacLuskie, center scientist at the Center. “Our developers are excited about being able to put performance information in the wiki and pull it out later on when writing proposals.” Despite its whimsical name, a wiki could become one of the more useful and easy-to-implement tools in the IT management arsenal too. The word “wiki wiki” is Hawaiian for “quick,” and the quickness with which wiki pages can be edited and searched means it’s easy for IT staff to record important information, such as configuration changes and maintenance activities, and look it up later. This is especially important when managing a large corporate network that spans many locations. Although wikis are great at organizing unstructured text, they quickly reach their limits when you try to add structure. For example, many people use wikis for a shared task list. But can you assign a due date for a particular item? Can you assign priority? 
Can you assign a task to a group, not just a single person? Can you flag items for discussion? JotSpot is a tool for adding such structure to wikis, making it easy to build simple applications using wikis. JotSpot lets you go from a free-text wiki to a custom application with very little additional code. The company offers simple starter applications to help manage recruiting, help desks, tasks, a company directory and other collaborative activities.

Collective IQ. Another ambitious effort at getting people to work together is the Open Directory Project, which brings together volunteers to catalog the Web. Styled after open source projects, Open Directory invites any Web user to apply to become an editor. Editors evaluate and categorize Web sites in a selected area of interest. The result is an ongoing organization and presentation of Web content by Web users. Any Web user can access Open Directory’s directory structure, which also powers the major Web search sites.

Bringing together the best minds to solve a problem is a tenet of the open source movement and is manifest in the popular Web encyclopedia Wikipedia, itself a wiki. Any Web user can contribute content to Wikipedia. The notion of harnessing humanity’s “collective IQ” using computers dates to the 1950s, when computer pioneer Doug Engelbart envisioned tapping people’s collective intelligence to solve complex problems. Engelbart foresees a “dynamic knowledge repository” of pooled intelligence from people everywhere, with all the best thinking on an argument in one place.9

In addition to Wikipedia, other examples of harnessing collective IQ began emerging by 2000 as Web-based collaboration started taking hold. One is the Open Mind Initiative, an open source project that is compiling common sense knowledge from thousands of volunteers who provide raw input, in an effort to create intelligent software that can be used by a range of applications.
Nonspecialists come together, joined by the Web for a common purpose, to pool their brain power. Advances in Communicating. Talking to people is a basic way to get help for accomplishing a task. The latest way to talk to others using the Internet – voice over IP – is creating some very interesting forms of interaction (not to mention challenging traditional phone service, both land-line and mobile). Skype, the free VoIP service, has created a community of over 35 million registered users who can connect to each other in special ways that set Skype apart from plain vanilla VoIP. Users can search the database of Skype users by such things as age, language and nationality. In addition, users can set their “Skype Me” flag, which invites strangers to call and chat. Many American Skype users enjoy receiving calls from Asia and other parts of the world from people who want to practice their English. The combination of anonymous random calling and the intimacy of voice creates a unique environment for communication and collaboration. People open up in ways they might not normally. The next phase of random calling may be more formalized Skype-enabled social networks like Jyve, which connects Skype users with similar interests, and SomeoneNew, which connects Skype users for romantic purposes. Only a few English-language social networking sites currently use Skype, but such sites in Asia have been very successful. Beyond voice, Skype has positioned itself as a platform for other services; it includes IM, and there are plans to add videoconferencing to the Skype for Business offering. In the IM world the reverse has happened, with PC-to-PC calling added to the IM offerings of Microsoft, Yahoo and America Online.
“As the lines between IM, voice and video blur, the emphasis is on ease of communication rather than discrete applications,” says Bill Koff, vice president and chief technology officer for CSC’s Office of Innovation.

Other Collaborations. A different form of computer-enabled collaboration is the ability to surf a Web site with someone else. A technology from Advanced Reality called JYBE (Join Your Browser with Everyone) allows two or more people in different locations to look at the same Web page together. JYBE is a free browser plug-in that works with Microsoft’s Internet Explorer and Mozilla Firefox, the open source browser. JYBE will probably not displace industrial-strength Web conferencing tools but offers a light-weight alternative for small businesses and consumers to show clients new products and experience the Web with family and friends.

Another way to share your Web experience and collaborate with others is through bookmark sharing services. These services let you share your personal Web bookmarks (favorite sites) with others, see who has linked to one of your bookmarked sites, and what else they are linking to. Colleagues can build lists of bookmarks when working on joint projects. One bookmark sharing service, del.icio.us, lets users assign keywords to their bookmarked sites so bookmarks can be easily searched. Users can view others’ bookmarks and subscribe to the links of those whose lists are most interesting. Although favorite sites can always be shared by e-mail, services like del.icio.us broaden the sharing to a community of people with like interests.

In short, there are numerous ways for people to work and communicate together today, aided by a shared Internet and mobile computing and communications devices.
Cyberspace social scientist Howard Rheingold has recognized the enormous social power of such an information environment, which can unleash what he calls “smart mobs.” Smart mobs protested at the World Trade Organization meeting in Seattle in 1999 using dynamically updated Web sites, cell phones and “swarming” tactics. In the Philippines, a million people overthrew the president by organizing peaceful public demonstrations through swarms of text messages. Being pervasively connected in real time gives the body politic new potential for collective action.

EXTREME PUBLISHING

The Web has long been a publishing paradise, enabling individuals to publish like professionals, and technology advances are taking this to new extremes. Blogging hit its stride in 2004, giving voice to the individual-as-journalist outside the mainstream (“citizen journalists”). Blogging, which began as text, has evolved to podcasting and video blogging. And with this has come a whole new social milieu.

The Blogosphere. Blogging has enabled people to voice their opinions on any subject, from politics to news to books to shopping. Now just about anyone, not merely the tech-savvy, can create a blog or Web log. Established blogging tools such as Google Blogger have made writing a blog easy – formatting and layout are done automatically. Thus anyone inclined to speak his mind and get feedback is heartily encouraged, because the tools of the trade make it easier than writing an article in HTML and designing the Web page it goes on.

New services such as Yahoo 360 and Microsoft’s MSN Spaces integrate blogging into multi-purpose personal communication services, underscoring the importance of blogging as a way to connect with others. These services let users publish blogs, share content and post pictures for family and friends in a secure environment. Blogging is also being incorporated into instant messaging, so that you can be immediately notified when there is a new blog entry.
The MSN Spaces blogging tool is integrated with MSN Messenger so that contacts in your Messenger window “gleam” when they have added an entry to their blog, signaling that there is a new blog article for you to read. This is an excellent prompt for people who need to know what’s new right away.

New Avenues of Expression. Podcasting (discussed in Data Everywhere) has been likened to audio blogging for people who are mobile. People create audio broadcasts that can be downloaded to an MP3 player or computer and listened to by anyone. The technology has given birth to a new breed of publisher, the podcaster, who sets up shop and creates running commentary on a variety of subjects. Today there are thousands of podcasters, from people recording in their basements to National Public Radio affiliates recording in professional studios. Podcasting puts those two classes of publishers, and everyone in between, on equal footing. Everyone can reach an audience on the Web.

A newer form of publishing is the video blog or vlog, a blog published in video format. The vlog, while still up and coming, is being used for works that vary from citizen reporting to avant-garde film clips. Like blogging and podcasting, vlogging levels the playing field for publishers and provides a new avenue of expression.

AT YOUR LEISURE

New ways to share music, photos, games and even a hug are transforming how we socialize with family and friends. Consider how clubbing is changing with the advent of iPod nights, when the DJ’s role is taken over by amateurs boasting – that is to say, playing – the contents of their iPods over the club’s audio system. Variously called iPod DJ parties, “no wax” nights (for no vinyl) and MP3Jing, iPod nights have sprung up in New York City, Washington, Chicago, London, Tokyo and Hong Kong, promoting a culture of music diversity in the context of a highly personal shared experience.
Think of it as karaoke for playlists – a popular new form of entertainment with a subculture of MP3 enthusiasts all its own.

Sharing photos online has become a springboard to a much richer social experience. Services like Flickr (part of Yahoo) and HeyPix (part of CNET Networks’ Webshots) enable viewers to add comments and notes to your photos, so unknown people or locations can be identified, for instance. The services integrate blogging, enabling you to post photos to your blogs, and RSS feeds (discussed in Meaning), making it easy for friends using RSS aggregators to know when you have added new photos. Flickr also allows you to tag your photos with keywords so you can search your photos as well as others (that you are permitted to view) on the site. These services are attracting hundreds of thousands of users, who can elect to share their photos with the world, with friends and family, or with a single confidant.

Integrating multiple forms of interaction – photo sharing, blogs, RSS, tagging and others – is an approach that all successful social connection sites will likely adopt.

On the gaming front, what was once a solitary experience – playing a computer game – has morphed into a highly connected experience. Massively multiplayer online games (MMOs) such as “Everquest” and “World of Warcraft” attract over 200,000 concurrent players, a mind-boggling number. There may be no other time in human history when so many people are actively engaged in playing a common game. “More people are connected than ever before,” observes Doug Neal, LEF Research Fellow, noting that some 5,000 people traveled to Dallas in 2004 for QuakeCon, the preeminent annual computer gaming convention where fans flock to play games for four days. With this has come a shift from top-down hierarchies to flatter social networks, where players lead by persuading others (e.g., come follow my character), not by force of authority. The leadership model of the social network can be seen in the business world as well. (See Tapping the Power of Social Networks.)

Online games have long had an impact on the military for simulated war game exercises. In 2004, the U.S. Army was sufficiently impressed with the potential of MMO technology that it contracted with Forterra Systems to develop a massively multiplayer training simulation called AWE (Asymmetric Warfare Environment) to help train soldiers for urban warfare. Forterra previously developed “There,” a highly successful MMO in which players socialize and express themselves with avatars that boast dozens of different emotional expressions and gestures. Forterra used “There” as the basis for the simulated world of AWE.

Beyond the social whirl of online games, photos and music, there is a more intimate social connection: a hug. Robotics researchers at Carnegie Mellon University have designed a soft, huggable pillow called the “Hug” that uses sensors and wireless technology to enable family members to remotely communicate affection during phone calls. The device, which is used by the sender and the receiver, emits vibrations based on squeezes from the sender, along with heat. The Hug provides social and emotional support for distant grandparents and other family members, and could have potential for boosting the well-being of medical patients. If the technology takes off, being able to convey the intimate feelings and emotions of a hug remotely will be an extreme new way of connecting with others.

MEANING
DATA THAT HELPS MAKE SENSE OF IT ALL

A swell of digital data begs the question: How can I make use of all this data? What does it all mean? As data increases in sheer volume and new data types have entered into the mix, data is also advancing on another front: meaning.
We are learning to better leverage the meaning behind the data to deliver value. Search is no longer just about text but includes image, video, audio, location-based information and even entire books. In addition, there are places to search beyond the Web: your hard drive and the enterprise’s hard drives. As data has moved to the edge, search has had to follow suit. There are ways to make the meaning of data – the semantics – explicit. This makes it easier for software to automatically manipulate the data for searching, integrating or interpreting across a variety of applications. Processes can be more efficient, and more can be done on behalf of the user. Having data with semantics – the Semantic Web and other metadata approaches for providing new levels of data definition – enables computers and people to work together better. Methods that help us detect patterns in data, including data mining techniques and new ways to aggregate and visualize data, improve our understanding of complex data and facilitate more informed decision-making. There may be data everywhere, but it is important to be able to use that data well, aided by advances in search, semantics and pattern detection. THE NEW SEARCH An organization’s digital assets are no longer confined to text but encompass images, video, audio and more. Organizations need to be able to make the most of all their digital assets through comprehensive search techniques that can comb through image libraries of products and parts, video and audio libraries of conference proceedings and courses, and other multimedia information. Much of this search activity is initially appearing on the Web, itself transforming into a vast multimedia repository of text, images, video, audio and full-text books. The Web searcher wants to find all these things, and search experts are rushing to add capabilities. 
Image Search. Google, Yahoo and other leading search companies have expanded to provide image search capabilities. In addition, specialized image search tools are appearing, such as Webmap PropertyView, which searches for images of real estate properties. If a picture is worth a thousand words, then the business value of image search will increase significantly for retailing, communications, real estate, marketing and other fields.

New search engines can find not only snapshot photos but also map images taken from spacecraft (NASA World Wind, discussed in Time and Place), aerial photographs (TerraServer) and topographic maps (TopoZone).

CSC is working with the National Oceanic and Atmospheric Administration on a novel system that searches environmental satellite images by two new dimensions: location and time. CLASS, the Comprehensive Large Array-data Stewardship System, is an enormous archive designed to handle a petabyte (10^15 bytes) of environmental data that can be searched by both scientists and the public through a simple portal interface. People can search CLASS by type of data, location on the planet, and time. This is important for climate research and for analyzing cloud cover, sea surface temperatures, land usage and other aspects of the Earth’s surface and atmosphere. Users can select data by a predefined region, such as the Gulf of Mexico, or a region they define. CLASS brings together many elements for a comprehensive search: people can find data for a geographic area and time of interest, in a massive image archive that is updated daily.

Advanced image search tools, such as those from piXlogic, find images by understanding the objects in the image – for instance, that it is a baby or a ball.
IBM and European researchers are developing similar tools that search for images based on the objects they contain and the meaning of the images. These approaches are intended to be more accurate than search methods that rely on text descriptions and keywords of the image.

Research is underway at Purdue University to search for 3D shapes in computer aided design parts databases. The tools being developed are intended to save money by enabling designs for products with multiple complex parts, such as aircraft, to reuse existing parts rather than create costly new ones. Purdue’s shape-based search is being extended into the biological domain for comparing protein structures and the similarity and functioning of binding sites, for drug discovery.

Video Search. Fast on the heels of image search is the rapidly evolving video search – being able to find a newscast, TV show, movie or other video clip. Google, Yahoo, blinkx TV and AOL Singingfish have all recently released new or enhanced video search capabilities for the Web. People can find video clips using the same familiar search interfaces they have been using to search for Web pages.

Each of these video search services has different strengths. For example, Yahoo wants to promote the use of metadata in video content, to describe the content and thus make it easier to find and index by search engines. In addition to standard keyword and Boolean queries, blinkx provides a conceptual search, enabling users to enter regular text for which blinkx returns results whose content is conceptually similar to the search text. While blinkx searches TV programming (news, sports, entertainment), Singingfish searches TV, movies, radio and music.

In contrast to the search sites, Videora is a search tool you download and use to find video files on the Web.
Called the Napster of video, Videora is a peer-to-peer system that makes it easy for people to automatically locate video files and download them to their computers. The system, which combines BitTorrent and RSS technologies, is designed to handle large files with ease and is primarily used to find movies. Local Search. Another type of search is location-based search, which lets you search for items in a specific geographic area, like businesses and restaurants. Google, Yahoo, Microsoft and other search engines have added such local search capabilities. Google Local, in pilot, lets you search for business establishments by zip code and returns results plotted on a map, including driving directions. You can switch back and forth between maps and satellite data for the same location. Yahoo lets you transmit the results of a local search, such as the name, address and phone number of a business, directly to your mobile phones as a text message. In this way, search results are pushed to the edge of the network where they are needed most. Microsoft’s Virtual Earth (mentioned in Time and Place) couples local directory listings with satellite imagery so you can find places of interest. Another combination of local search plus images is offered by A9.com, a subsidiary of Amazon.com. A9 provides a yellow-pages listing augmented with street-level photos of the listing and shops on either side of it, letting users in effect walk up and down the street and base their decision to go to a store on the other businesses in the immediate neighborhood. “Local search is more than just tools,” comments Bill Koff, vice president and chief technology officer for CSC’s Office of Innovation. “Each company has a different strategy for its information and models. 
Google is compute-based, Yahoo uses people to define meta models and tagging, and Microsoft uses both, with an emphasis on manual tagging of local content.” In the future we can expect to see online services that combine local search, blogging, podcasting and vlogging. People will create blogs, podcasts and vlogs that are tagged with a location, such as the Museum of Modern Art in New York City. Users will be able to search for entries by location. So if you are planning a walk up Fifth Avenue near the MOMA, you can find text, podcasts and vlogs about sites along your walk. An early sign of this is podcasts done by a professor and some students for art on display at the MOMA; you can bring these podcasts to the museum and listen to them while looking at the art (instead of, or in addition to, listening to the museum’s audio guide). Though the podcasts are not tagged with location data per se, it is easy to see the technology evolving to that. Omni Search. Advances in search encompass not just tools but content. In its quest to expand the Web beyond its current body of material, Google is collaborating with several research institutions to digitize and make searchable books, scholarly papers and special collections that have not been previously available on the Web. The effort, involving Harvard, the University of Michigan, Stanford, the New York Public Library and others, is a step towards the vision of the Internet Archive, an initiative to provide universal digital access to the world’s books and other media. Desktop and Enterprise Search. And yet, it comes as no surprise that not all of the world’s information is intended to reside on the Web; a sizeable amount of data resides on PCs or in the enterprise, posing a different set of challenges than Web-based search. Recognizing this, the search companies have extended their technology to the desktop and the enterprise. 
Google, Yahoo, Microsoft, Ask Jeeves and AOL have all announced free desktop search tools, in a flurry of competition for this emerging search space. Google’s free downloadable beta tool searches e-mail, files, Web history, instant messages, and Word, Excel and PowerPoint documents on your hard drive. The tools from Yahoo, Microsoft, Ask Jeeves and AOL are similar. With these tools, people can search for the public and private data they need.

One of the first comprehensive desktop search tools was X1 Desktop Search from X1 Technologies. The company has extended its core offering to include support for IBM Lotus Notes e-mail messages, attachments and local databases, and also offers an Enterprise Edition with additional capabilities for businesses and work groups.

Other tools have also been refined to search for data in small and mid-size businesses. Google Mini can search up to 50,000 documents in an organization, applying the same technology as its flagship search engine. As well, enterprises are evaluating the desktop search tools to enable their employees to quickly mine their own collections of documents for needed information. In business, time is money, so a faster and more accurate search is like money in the bank.

SMARTER THROUGH SEMANTICS

With so much data in digital form, it makes sense to automate wherever possible to exploit that data. To this end, organizations are recognizing the value of metadata – data about data – for encoding meaning into data, making the data more machine-readable and thus rendering the software that manipulates the data more capable. Organizations are using metadata to describe documents, databases, files, Web pages and other information in order to help organize, browse and search for information.

Mighty Metadata. Organizations are beginning to organize their metadata into structures, such as a thesaurus or taxonomy, that define categories and subcategories of topics.
These structures help people understand how topics are organized and thus better select terms to describe or search for documents.

ERIC, the Educational Resources Information Center, is a new library of education materials on the Web that is searchable using a sophisticated thesaurus. Sponsored by the U.S. Department of Education, ERIC provides public access to a premier bibliographic database of journal and non-journal education literature, containing over 1.1 million citations. CSC helped design and build ERIC, which is said to be the largest database of education materials in the world.

ERIC’s thesaurus uses semi-automated indexing, which will become fully automated over time; then, human lexicographers will only have to check samples of results for quality – a vast improvement over manually indexing every document. The indexing uses linguistic technology to categorize ERIC documents; the indexer recognizes words or groups of words in a particular context, makes some inferences, and assigns a thesaurus term. For instance, upon encountering “Head Start,” it might assign the thesaurus term “early childhood education.”

The linguistic technology understands the thesaurus through training sets, whereby a set of documents associated with a thesaurus term is fed to ERIC’s software, to train it to understand that the contents of documents like these are associated with this particular thesaurus term. If the indexer encounters similar documents, they should be assigned the same thesaurus term.

The linguistic technology can also extract concepts and entities (people, places, things) from documents and build an index of related documents based on these attributes.
All this metadata helps make the ERIC search more accurate and easier for both novice and expert searchers.

Another powerful use of metadata is for indexing video and audio material. Through its Digital Asset Management Systems architecture, CSC’s Multimedia Center of Excellence uses commercial tools such as Virage VideoLogger and ControlCenter, plus custom-developed tools, to automatically create an index of data that describes video and audio content over time. The index can be used to search and retrieve desired segments of a video or audio stream, and correlate them with other content. The metadata describes scene changes, locations, who is speaking, what they are saying, where they are speaking, and when.

The capability to search and retrieve video and audio is especially handy when reusing the segments – assuming the right segments can be found. Intelligence agencies and media companies can use the metadata to search for words of interest in thousands of television and radio broadcasts and phone conversations. The indexed video and audio can be packaged and forwarded to an analyst or editor for focused scrutiny.

This new form of metadata opens up the “black hole” of video and audio files, which in the past have been accessible only by filename or keyword, not by actual content inside the file. Now workers can review surveillance tapes, news broadcasts, music videos, phone conversations and more, looking for particular words and phrases.

Semantic Web. On a broader scale, movement is underway to endow the entire Web with meaning so people and computers can work together better to make the most of the Web’s information. That is the vision of the Semantic Web, a global effort to explicitly encode meaning with Web data to enable software agents and applications to find, integrate and work with Web data in smart ways.
As envisioned, the Semantic Web will be much more powerful than today’s Web because its data will be read and consumed by computer programs, not just people. CSC has conducted a study of the Semantic Web, examining current tools, existing applications, key players in the field, and how the Semantic Web will likely be used.10 Software agents will be able to access the Semantic Web and perform some of the tasks people do manually today, such as searching, querying and integrating information. A person might use his personal software agent to search the Web to find and schedule physical therapy sessions. The agent finds a therapist who provides the required treatment, is located nearby, and is covered by the person’s insurance carrier by retrieving and integrating information from several Web sites. Photo based on concept by Miguel Salmeron and Scientific American The vision of the Semantic Web is that people and computers can work together better to make the most of the Web’s information. Semantic data is encoded with Web data, giving the Web data meaning that can be understood by software agents and computer applications, which can then automate more activities for the user. Business processes might also be automated using Web software agents. Messages delivered via the Web might trigger software agents that collect and integrate information from multiple Web locations, make decisions based on the integrated information, and then take appropriate actions. The actions might include notifying people, updating databases, or sending control signals to physical devices that are connected to the Web. Although robust commercial applications have yet to emerge because the Semantic Web is still in a formative stage, many believe that, as with the Web itself, the true value of the Semantic Web will materialize in ways that are difficult to foresee today. 
In the meantime, a group of universities and organizations in Europe has issued the Semantic Web Challenge, an annual contest designed to stimulate good examples of how the Semantic Web will be used. The group suggests natural disaster management and personal information management as fertile areas for Semantic Web applications. The vision of the Semantic Web is generating interest in related semantic technologies, including topic maps.11 Topic maps have been standardized in the information science community as a way to categorize and describe the content of documents. A topic map is like a smart index: it describes a set of subjects (or topics) and the relationships between them. Documents in a library can then be linked to one or more subjects in the topic map. Users can search a document library to find what they want by navigating through the topic map, selecting the subjects of interest, and then retrieving the documents linked to the subjects. Topic maps provide a context for search terms; the maps give more information than a taxonomy but are less formal than an ontology. The Semantic Web has been architected to describe resources on the Web, whereas topic maps have arisen to help search document libraries. Despite this difference in purpose, the two technologies have many similarities. Both are based on graph data structures and are represented in XML. Proponents of topic maps point to available tools and argue that document search can be strengthened by implementing topic map searches today. Some organizations, such as the U.S. Internal Revenue Service, are piloting the use of topic maps to define terms that are used in search. But proponents of the Semantic Web suggest that the powerful ontologies that can be specified with the W3C’s Web Ontology Language (OWL), a key standard for the Semantic Web, will be more useful in the long run. Semantic Information Integration. 
Many Semantic Web applications will fall into the category of semantic information integration. These applications will use software agents or server-based application programs to retrieve semantic markup from multiple sources on the Web, integrate and analyze the information, and take some actions.

A simple example of semantic information integration is RSS, a headline syndication mechanism used primarily for news feeds and blogs – sites that update their entries frequently. RSS feeds are designed to deliver what’s new on a periodic basis. With RSS feeds you get a steady stream of updates without having to check Web sites or wait for e-mail updates. You can select the news you want using categories set up by the content provider, such as top stories, world news, technology and entertainment.

RSS is a family of XML file formats that includes Really Simple Syndication, Rich Site Summary and RDF Site Summary. The last of these is built on a key Semantic Web technology, the Resource Description Framework (RDF), which is a standard for representing the metadata about a piece of Web content. Because it is a common format, RDF written for one application, such as an RSS feed, can be easily used by other applications in the future.

People access RSS feeds via an RSS aggregator (also called a reader), which can be a desktop application, a plug-in to a browser or an e-mail enhancement. The content provider updates its RSS feed (the XML file) on a regular basis, and the aggregator periodically retrieves what’s new in the file. The aggregator organizes this content, which typically includes the headline of the article, the source, a link to the full story, and other data items. Aggregators can handle multiple RSS feeds as specified by the user. RSS is like finding what you want from a river of information – and it’s all automatic.
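To make the aggregator's job concrete, the following is a minimal sketch in Python of the cycle described above: read the feed's XML, pull out the headline, source and link for each entry, and report only the entries not seen before. It assumes a feed in the Really Simple Syndication (RSS 2.0) layout; the feed content, the example.com addresses and the function names `parse_feed` and `whats_new` are hypothetical and chosen for illustration only. A real aggregator would also download the file from the provider on a schedule, which is left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content (RSS 2.0 layout). A real aggregator would
# download this XML from the content provider at regular intervals.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <item>
      <title>First headline</title>
      <link>http://example.com/stories/1</link>
      <category>technology</category>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/stories/2</link>
      <category>world news</category>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return one dict per feed entry: source, headline, link, category."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    source = channel.findtext("title", default="")
    entries = []
    for item in channel.findall("item"):
        entries.append({
            "source": source,
            "headline": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "category": item.findtext("category", default=""),
        })
    return entries


def whats_new(entries, seen_links):
    """Return entries whose link has not been seen, and remember them."""
    fresh = [e for e in entries if e["link"] not in seen_links]
    seen_links.update(e["link"] for e in fresh)
    return fresh


if __name__ == "__main__":
    seen = set()
    for entry in whats_new(parse_feed(SAMPLE_FEED), seen):
        print(entry["source"], "|", entry["headline"], "|", entry["link"])
```

Keeping the set of links already seen is one simple way to deliver only what is new on each retrieval; actual readers may rely on other identifiers or on publication dates.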
DETECTING PATTERNS

In the case of search and semantics, we know the meaning of the data and apply that meaning to make the data more useful. But when we don’t know the data’s meaning, we need techniques for detecting patterns and presenting the data in understandable ways. Some of the newer techniques for exploring and understanding data involve intuitive visualization, pattern analysis, Web mining and discovery, and other advanced techniques.

Intuitive Visualization. People are visual creatures by nature, learning from what they see and watch, such as pictures, diagrams, videos and faces. New techniques for visualization present data in more intuitive, understandable ways, such as when selecting from alternatives, understanding events over time, and understanding complex data and statistics.

Many business and personal decisions involve evaluating alternatives or items in a group. Amplifying this, the trend in the retail and services industries continues toward more alternatives and increased customization. Knowledge workers and people in general need help in weighing an ever-expanding array of choices.

New tools are becoming available that provide intuitive visualization of sets of items, like a group of stocks or sales force performance data. One such tool is the Hive Group’s Honeycomb software, which creates colorful maps that let you understand complex data quickly. The software displays data as rectangles in a larger rectangle or map; the size, color and grouping of each individual rectangle provides information about the corresponding data item. The poster child for Honeycomb is SmartMoney.com’s Map of the Market, which portrays data from over 500 publicly traded companies grouped by sector or industry. Data is updated every 15 minutes.
The size of each company (rectangle) in the map corresponds to the company’s market capitalization, and the color of the rectangle indicates whether the company’s stock has gone down (red) or up (green), with color intensity (dark to light) corresponding to the degree of change (small to large). Black indicates no change. A quick glance at the map tells you if the market is up or down, and where the extremes are. By holding the cursor over a rectangle, you can get the company name and percentage change in the stock price; with another click you get a drop-down menu of additional company information.

SmartMoney.com’s Map of the Market (www.smartmoney.com/marketmap) portrays a wealth of stock and financial information from over 500 publicly traded companies, grouped by sector or industry. People know at a glance whether the market is up (green) or down (red) and can drill down for more specific information on individual companies. Source: SmartMoney.com

Maps can also be used as a product selection tool, as has been done by Peet’s Coffees and Teas online store. The map replaces long lists of products and prices, or having to go back and forth between multiple Web pages to compare products.

Visualization techniques with camera data are being used to understand events and patterns over time. In the past, this was difficult to do because of the large amounts of camera data, both video and still pictures, that had to be examined. But today’s faster processors can get the job done. Researchers at the Georgia Institute of Technology are developing techniques to succinctly display key information that is embedded in large amounts of video. For example, by extracting single images from a traffic camera and displaying them periodically in a matrix, highway traffic patterns can be easily detected and analyzed.

Also presenting information in a matrix of images is 10x10, a Web site that uses images to summarize news.
Each hour, 10x10 automatically determines the 100 most important words used by major news outlets like BBC World Edition and New York Times International News. It then selects photos from the news articles to illustrate the 100 words, and creates a 10x10 matrix of the photos. The matrix provides a unique visualization of the hour’s news, creating a visual record of a constantly evolving picture of the world.

A unique summary of world news, by the hour, is provided by 10x10 (www.tenbyten.org/now.html), which uses pictures to portray the most important 100 words in the news every hour. This snapshot of world events is completely generated by computer. Source: 10x10

Often, the data and statistics needed for decision-making are available, but the data is so voluminous that people have a difficult time grasping its meaning. The effort invested in collecting the data is squandered if it does not enable enlightened analysis and decision-making. New software tools are being developed that illustrate complex data and statistics using innovative animated visualizations and 3D displays that help decision-makers more quickly grasp the meaning of the data.

Gapminder is a non-profit initiative whose software visualizes trends in global human development, including health, income and education trends. World statistics are costly to buy, vast and difficult to understand. Gapminder’s mission is to make this data understandable, enjoyable and free (Gapminder’s reports and visualization tools are freely available from its Web site). Gapminder’s software, Trendanalyzer, turns dull time series data into attractive moving graphics on the screen. Trends are instantly graspable, and the interactive animations are fun to explore.

Vizible Corporation’s software is a platform for creating intuitive and interactive views of voluminous and disparate data that is assembled in context for the person using it.
Vizible sources information from any system and presents it in state-of-the-art interactive 2D or 3D views. The result is an easy to navigate, even artistic rendering of data that is instantly meaningful to the person using it. The view is also actionable, as the source of any information element is a click away. In one project, Vizible created a virtual operations center for a California municipality, integrating and displaying data from GIS systems; traffic videos; police, fire and emergency management systems; public utilities and the media. This visual system gave officials an integrated and contextual view of what was happening in the city, enabling people to work from a common platform when responding to an emergency situation. Vizible has done interfaces for managing other information resources, such as news sources and media files, and for organizing your day. Another powerful visualization of complex data is the GeoWall, a system for visualizing earth science data in 3D. The GeoWall, in use at over 400 schools, colleges and other institutions, is a relatively affordable system (under $10,000) and can be used to teach dozens or even hundreds of people simultaneously. Earlier 3D technology used an expensive “CAVE,” an entire room designed for immersive 3D visualization that was used by five to 10 people at a time and cost anywhere from $150,000 to over $1,000,000. The GeoWall uses stereo projection, a fast graphics card and an inexpensive computer to project images on a large screen that are viewed wearing glasses with polarizing filters, which create the stereoscopic effect. The technology, which can be loaded on a cart and moved from room to room, can be used to explore canyons, volcanoes, plate tectonics, surface geology and other spatial relationships of the Earth. The GeoWall is a cheaper, more flexible way to understand complex data about our planet. 
The next generation, GeoWall2, designed for researchers rather than students, uses an array of 15 flat panel liquid-crystal displays mounted on a wall, with an associated cluster of computers. GeoWall2 gives researchers higher resolution (about 30 million pixels, versus 780,000 to 1.3 million pixels for the GeoWall) to study images of rock cores and other geoscience data.

This virtual operations center for a California municipality is a powerful visual system integrating data from GIS systems; traffic videos; police, fire and emergency management systems; public utilities and the media. The system enables people to quickly assess and respond to emergency situations. It can be viewed on a laptop or desktop or projected onto a screen. Source: Vizible Corporation

Students examine a stereoscopic 3D model of earthquake hypocenters on a GeoWall at the Electronic Visualization Laboratory at the University of Illinois at Chicago. The stereoscopic capability of the GeoWall makes it easy to see the plate boundaries from the hypocenters (represented as points). The technology, once expensive and the size of a room, has become cheaper and portable, making it available to many more people than in the past. Source: Electronic Visualization Laboratory, University of Illinois at Chicago

Pattern Analysis. New tools and techniques are enabling people to analyze large data sets and discover previously unrecognized patterns and relationships.

Financial transactions are a treasure trove of information that can be mined for potential fraud. CSC’s FraudVision is an automated pattern recognition system that combats check fraud in high-volume, image-based payment processing operations. Every individual has a unique set of check writing traits that includes signatures, handwriting patterns, unique document types and other characteristics. FraudVision detects fraudulent checks by accurately detecting variations from these known characteristics.
FraudVision integrates multiple pattern recognition capabilities with check imaging to detect forged, altered and counterfeit checks.

Since image-based pattern recognition is compute intensive, it has traditionally been used in applications that return significant value relative to the amount of processing required, such as identifying a criminal from matching a set of fingerprints against a national crime database. In check processing, however, the return is relatively smaller – for example, identifying a fraudulent check for $500 – so FraudVision had to be very efficient in its design.

“Roughly $675 million of check fraud goes undetected in the United States each year, against an annual volume of 30 to 40 billion check images,” points out Bill Cunningham, general manager of CSC’s Integrated Payments organization. “To make FraudVision viable, we had to not only create the technology but make it economical to use. You are looking through millions of good checks to find the few bad ones.”

People’s personal information is another treasure trove of information, which can be mined for marketing and other business purposes – security issues notwithstanding. An entire industry of data aggregators has formed in the last 10 years as computing and network capabilities have improved, more data has become digitized, and companies have been able to gather and make sense of volumes of data concerning people’s public records, criminal histories and other electronic details. There has been a heightened sense of public awareness about data aggregators as identity theft has become more prominent and as data from several aggregators, including ChoicePoint, LexisNexis and Bank One, has fallen into the wrong hands.

The traditional aggregators include ChoicePoint, LexisNexis, Acxiom and the three major U.S. credit bureaus.
The aggregators assemble and mine data in surprising detail, and sell it to government and corporate clients for things like background checks for employee hiring and approving people for loans and insurance policies. Being able to tap and understand a wealth of data about individuals boosts security, speeds transactions and improves overall efficiency. Most of the data that is collected is publicly available but not under one roof. The aggregation and subsequent analysis of the data – finding the patterns and relationships – is extremely valuable.

But there is a dark side too, as evidenced in 2005 when it was reported that ChoicePoint unknowingly sold information on 145,000 people to fraudulent business clients who were actually identity thieves. ChoicePoint, with over 19 billion data records, collects details about Americans’ homes, cars, relatives, criminal records and other aspects of their lives. Having the data in one place is a godsend for business but a target for criminals. As with all information, its power can be good or bad depending on who uses it. (See They’re Watching You on the next page.)

Web Mining and Discovery. Another treasure trove of information, still relatively untapped, is the Web. CSC has conducted research into Web mining to explore how to leverage the world’s greatest information resource (note 12). The research found that fully 80 to 90 percent of the information people use in their daily work exists in unstructured, mainly textual repositories. Many of these repositories are now accessible through Web technologies, be they internal intranets or the Internet. Eventually, nearly all the information people work with will be Web accessible.

Through Web mining technologies, people can discover valuable business insights locked inside the vast information repositories that are Web accessible.
Applications have been demonstrated in marketing and business intelligence, customer relationship management, biotechnical design applications, product design and positioning, and knowledge management. CSC’s research identified smart alerts and market intelligence as two promising areas for Web mining.

Smart alerts (discussed in Time and Place) use Web mining to monitor patterns of activity over time and help users interpret the evolving situation. Smart alerts provide a high-fidelity picture of emerging situations by visualizing concepts as they develop from textual repositories. Smart alerts could be created to monitor insurance claim activities, illegal trade and emerging market moves.

Market intelligence can be enhanced through Web mining that assesses an industry sector by identifying networks of experts; assists with new product opportunities by analyzing product descriptions; and finds sales opportunities by identifying potential clients via behavior patterns. Text mining technology can be used to characterize a marketplace by discovering key concepts adopted by competing firms. Armed with this information, organizations can position their products for maximum advantage.

Beyond the prospects of Web mining lies the notion of “reality mining.” In the future, what we browse on the Web will be a rich compilation of data streamed from sensors, including images, experiences and patterns, about such things as temperature and speed. If Web mining is about making sense of unstructured data, reality mining is about making sense of sensor data – how should I change this process or allocate resources differently to respond to current conditions? The focus is on mining operations-relevant sensor data to make smart business decisions.

THEY’RE WATCHING YOU…

Geoffrey R. Stone reviews the book No Place to Hide: Behind the Scenes of Our Emerging Surveillance Society, by Robert O’Harrow Jr. This article appeared in The Washington Post on February 20, 2005 and is reprinted with Mr. Stone’s permission.

We live in an ever more convenient society. We use credit cards, buy books on Amazon, reserve plane tickets on Expedia, bid for antiques on eBay, get cash at ATMs and find jobs on Monster. We use key cards to open hotel rooms, EZ-Pass to pay tolls and GPS to get directions. We send e-mail, fill prescriptions and sexual needs on the Internet, and pay bills electronically.

These conveniences generate data. In the “old” days, we did not leave behind a readily accessible, electronic trail of our purchases, conversations, whereabouts and transactions. We took for granted the anonymity and privacy of our ordinary, day-to-day lives. No more. Today, we are constantly tagged, monitored, studied, sorted and tracked by a vast array of institutions and organizations – private and public. As Robert O’Harrow Jr. details in No Place to Hide, it is worse than we could ever have imagined. In this revealing book, O’Harrow makes clear that Americans need to think seriously about these issues now – before it is too late for us to decide that we care.

O’Harrow unveils a modern world riddled with seemingly innocuous private businesses, government agencies and software programs with such obscure names as ChoicePoint, Acxiom, Matrix, DARPA, Seisint, HOLe and NORA. Unbeknownst to most of us, these institutions and technologies are relentlessly compiling information about our names, addresses, license plates, Social Security numbers, religions, incomes, family members, sexual orientations, friends, purchases, mortgages, bank accounts, credit card transactions, credit standing, parking tickets, criminal arrests and convictions, Web browsing, e-mail correspondence, newspaper and magazine preferences, cell phone activity, vacations, fingerprints, insurance coverage, facial images, DNA, drug prescriptions and beer of choice. Computers have made possible what was barely science fiction 20 years ago.

How do they get this information? For the most part, we give it to them, though usually unwittingly, with almost every step we take. Over the past several years, with the help of increasingly sophisticated computing systems and advances in artificial intelligence, these institutions and organizations have accumulated billions of data points about American citizens, which they then share with or sell to one another and to the government. As O’Harrow notes, “personal data has become a commodity that is bought and sold essentially like sow bellies.”

Why do these companies and agencies do this? For you, of course. By gathering and sharing such data, they protect you from identity theft and credit card fraud, enable marketers to offer you precisely the right products to satisfy your tastes and needs, ensure that your fellow passengers are not terrorists, locate missing children and deadbeat dads, help police catch smugglers and murderers, and generally provide a safer society. And, in fact, they really do these things.

So what’s the problem? Should we care that there’s no place to hide? What dangers are posed by this more convenient, more secure society? In this chilling narrative, O’Harrow identifies the risks and vividly illustrates them with powerful real-life stories.

First, there is the simple risk of mistake. The data in these systems, according to Ole Poulsen, one of HOLe’s creators, are “full of errors and noise and wrong information.” As a result, individuals are denied insurance, credit, employment, the right to board an airplane, and even the right to vote when the system spins out inaccurate information. And, as O’Harrow persuasively demonstrates, correcting the record can be a nightmare.

Second, there is the risk of public disclosure. We regard much of this information as private. But hackers can all too easily capture it and use it to humiliate, blackmail and impersonate us. The Federal Trade Commission reports that in a typical year, 10 million Americans were the victims of identity theft, resulting in bounced checks, loan denials, harassment from debt collectors, cancelled insurance and false accusations of criminal conduct.

Third, there is the risk that government will use this information not only to ferret out terrorists, but also to suppress dissent and impose conformity. In the 1990s, this technology was developed primarily by private companies to enable marketers to target and profile consumers. After Sept. 11, however, the FBI, CIA, NSA, Justice Department and Department of Homeland Security aggressively sought access to these business databases, creating a vast private-public partnership in the exchange of such information. Moreover, the USA Patriot Act took full advantage of the post-9/11 crisis mentality and authorized a wide range of previously restricted government surveillance and data-gathering activities. Although the stated goal of these activities is to ensure our security, history teaches that once government has such information, it will inevitably use it to harass and silence those who question its policies.

Finally, O’Harrow warns that such massive invasion of privacy and intrusion into our ordinary anonymity may well alter the very fabric of our society. Once we understand that our every move is being tracked, monitored, recorded and collated, will we retain our essential sense of individual autonomy and personal dignity? Can freedom flourish in such a society?

Is this the long awaited coming of 1984, the Brave New World of the 21st century, or will we somehow continue business, and life, as usual?

Geoffrey R. Stone is the Harry Kalven Jr. Distinguished Service Professor of Law at the University of Chicago and the author of Perilous Times: Free Speech in Wartime from the Sedition Act of 1798 to the War on Terrorism.

© Copyright 2005 The Washington Post

Other Advanced Techniques. Other advanced techniques emerging from the lab in real-world settings are text parsing and applying scene analysis to image data. These techniques figure out what the data means, but at a much deeper level than traditional pattern analysis.

In the parsing example, the computer acts like a person who is reading, breaking down sentences into subject, verb, object and other parts of speech in order to better understand the meaning of the sentence. This enables more accurate searching, by avoiding the ambiguity that arises from searching just on keywords, as well as automated detection of relationships between the persons, places and things mentioned in the text – something difficult to do in unstructured text.

California-based Attensity, which has been partially funded by the U.S. Central Intelligence Agency, has developed such a parsing tool. It can help determine, for example, whether “bonds” refers to a financial instrument, baseball player or glue-like substance. Intelligence agencies are interested in automatically gleaning meaning from large collections of e-mail and instant messages, while companies like Whirlpool are automatically analyzing records of customer service calls to quickly detect trends. For example, knowing whether “smoke” refers to the caller’s Whirlpool microwave or the food in it makes a big difference. Other organizations such as John Deere, General Motors and U.S. intelligence agencies are analyzing unstructured text using Attensity’s technology. According to one account, if “purchase” is identified as a verb, the subject is identified as a possible customer.
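Rules of this kind, where the verb or the object of a parsed sentence decides how the subject is tagged, can be sketched in a few lines. This is a toy illustration and not Attensity's technology: the `parse_simple` splitter, the small verb lexicon and the tag names are hypothetical, and a real parser would use a full grammar rather than a word list.

```python
from typing import List, NamedTuple, Optional

class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str

# Hypothetical verb lexicon, for illustration only.
KNOWN_VERBS = {"purchased", "bought", "ordered", "requested"}
PURCHASE_VERBS = {"purchased", "bought", "ordered"}

def parse_simple(sentence: str) -> Optional[Triple]:
    """Split a simple 'subject verb object' sentence at the first known verb."""
    words = sentence.rstrip(".").split()
    for i, word in enumerate(words):
        # The verb needs at least one word on each side.
        if word.lower() in KNOWN_VERBS and 0 < i < len(words) - 1:
            subject = " ".join(words[:i])
            obj = " ".join(words[i + 1:])
            return Triple(subject, word.lower(), obj)
    return None

def tag_subject(triple: Triple) -> List[str]:
    """Apply rules keyed on the verb and on the object to tag the subject."""
    tags = []
    if triple.verb in PURCHASE_VERBS:
        tags.append("possible customer")
    if "plastic explosive" in triple.obj.lower():
        tags.append("potential enemy")
    return tags
```

For example, `parse_simple("Alice purchased a microwave.")` yields the triple `("Alice", "purchased", "a microwave")`, which `tag_subject` tags as a possible customer. Because the rules look at grammatical roles rather than bare keywords, a sentence that merely mentions the word "purchase" as a noun would not trigger the customer rule.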
If “plastic explosive” is used as an object, the subject is tagged as a potential enemy.

Although considerable progress has been made in attempting to understand the meaning behind text, the world of imagery is a tougher nut to crack. Computer understanding of still images is difficult, and little work has been done on analysis of video – until recently. Practical tools are emerging that enable computer systems to analyze digital video in real time.

One such system is a computer-aided drowning detection system by Paris-based Vision IQ. The system, called Poseidon, acts like a lifeguard to spot troubled swimmers and sound an alert. In one case, Poseidon spotted an unconscious swimmer at the bottom of a Paris public pool and alerted lifeguards, who rescued the swimmer. He fully recovered.

Poseidon, which the company reports has been installed in 120 pools in Europe and North America, uses underwater and overhead digital cameras and several patented technologies to effectively “see” what is happening underwater. The system combines stereo vision techniques (it receives images from at least two cameras simultaneously) with volume and texture analysis of the pool to distinguish between, say, a body and a shadow. The system is considered a breakthrough in computer vision technology, providing real-time scene analysis that was previously thought to be possible only by humans.

All these technologies underscore the importance of addressing the age-old problem of how to get meaning from data. As data expands from text to multimedia and mobile formats, and the volume of data increases overall, the bar is raised to extract meaning, and companies are stepping up to the challenge.

We live in a world of extreme data, marked by innovation and opportunity – data going where it has never gone before. This world of extreme data is both exhilarating and scary. It provides new products, services and ways to communicate. It provides unimagined levels of precision, convenience and speed.

If we reflect on the four dimensions of extreme data – data everywhere, time and place, social connections and meaning – we see that extreme data is mobile and pervasive; it is moving towards real-time scenarios; it is linking people in new ways, enhancing knowledge; and it is meaningful, showing patterns and understanding despite appearing chaotic and overwhelming at first.

However, all this data – new data types, aggregated in new ways, and in unprecedented amounts – is vulnerable to unauthorized surveillance and misuse if it falls into the wrong hands. A world of extreme data demands extreme responsibility and extreme accountability to manage and protect that data. It demands fresh ways of thinking about the “I” in IT such that data at the edge of the network, along with consumer devices, are recognized as first-class citizens and supported in the IT infrastructure, which in turn means they can be leveraged by the organization for business results.

Today’s organizations are challenged to put their data at the edge of the network, closer to customers and where work gets done. They are challenged to take advantage of new forms of data and to leverage new technologies for gleaning meaning from data. Extreme data raises questions for the organization and IT, which can be addressed by experimenting with extreme data technologies first-hand in targeted operations.

The time for getting started is now, because what is extreme today will be commonplace tomorrow.

NOTES

1 This data includes Internet, TV, telephone and radio. For more information, see UC Berkeley’s School of Information Management and Systems report, “How Much Information? 2003,” at http://www.sims.berkeley.edu/research/projects/how-much-info-2003/execsum.htm#summary.
2 CSC’s Leading Edge Forum is studying this consumerization effect on the corporate IT agenda.
3 David Dossett, “Disconnected Wireless Database – Working Outside the Bubble,” CSC LEF Technology Grant Paper, 2004.
4 Pew Internet & American Life Project, Podcasting, April 2005. http://www.pewinternet.org/PPF/r/154/report_display.asp
5 Peter Rehäußer, “RFID Security,” CSC LEF Technology Grant Paper, March 7, 2005.
6 An Action Day is when citizens are encouraged to take some action that day due to poor air quality, such as ride the bus instead of drive. Each locality defines an Action Day differently according to its needs.
7 Laurence Lock Lee, “Web Mining,” CSC LEF Technology Grant Paper, April 2004. http://www.csc.com/aboutus/lef/mds67_off/index.shtml#grants
8 “Telematics and Automatic Crash Notification (ACN): Delivering emergency data to public safety,” presentation, August 2002. www.comcare.org/take_action/presentations/APCO%202002%20ACN%20Panel.ppt
9 K. Oanh Ha, “He Paved the Way for the PC Revolution,” San Jose Mercury News, February 21, 2005.
10 Ed Luczak, “Adding Meaning to the Web: A Guide to the Semantic Web,” CSC LEF Technology Grant paper, April 2004. http://www.csc.com/aboutus/lef/mds67_off/index.shtml#grants
11 Paul Lerke, “Mapping the Information Landscape: Topic Map Technology,” CSC LEF Technology Grant paper, April 2005.
12 Laurence Lock Lee, “Web Mining,” April 2004.

APPENDIX: HANDY WEB SITES

DATA EVERYWHERE

Pocket PC Magazine, www.pocketpcmag.com
VCAST Services, www.getvcast.com
Archos, www.archos.com
Apple iPod and iTunes, www.apple.com/itunes
MSN Music, www.music.msn.com
FnacMusic, www.fnacmusic.com
Podcasting News, www.podcastingnews.com
MythTV, www.mythtv.org
BitTorrent, www.bittorrent.com
Belgian eID Card, www.eid.belgium.be
MedicAlert, www.medicalert.org
VeriChip, www.4verichip.com
Road Safety International, www.roadsafety.com
Davis Instruments, www.davisnet.com
Progressive Insurance, tripsense.progressive.com
Norwich Union, www.norwichunion.com

TIME AND PLACE

DeLorme Portable GPS Devices, www.delorme.com/bluelogger
Zipdash, www.zipdash.com
Rand McNally Traffic, www.randmcnally.com/rmc/company/cmpProducts.jsp?oid=-1073753515
Pioneer Navigation Receiver, www.pioneerelectronics.com/pna/product/detail/0,,2076_3151_192089333,00.html
Origin blue i, www.originbluei.com
NextBus, www.nextbus.com
Xora, www.xora.com
EPA EnviroFacts, www.epa.gov/enviro
EPA Window to My Environment, www.epa.gov/enviro/wme
AeroScout, www.aeroscout.com
CAMNET Realtime Air Pollution and Visibility Monitoring, www.hazecam.net
AirVideo, www.trafficland.com/tl-airvideo-signup.html
InTouch Health, www.intouch-health.com
Google Earth, earth.google.com
DigitalGlobe, www.digitalglobe.com
ORBIMAGE, www.orbimage.com
NASA World Wind, worldwind.arc.nasa.gov
Open Geospatial Consortium, www.opengeospatial.org
ShotCodes, www.shotcode.com
SnapToTell (white paper), www-mrim.imag.fr/publications/2005/CHE05/chevallet05a_ECIR05_SnapToTell.pdf
Argo Ocean Observation Floats, www.argo.ucsd.edu
EnviroFlash, www.epa.gov/airnow/enviroflash.html
ComCARE Alliance, www.comcare.org

SOCIAL CONNECTIONS

Friendster, www.friendster.com
Ryze, www.ryze.com
LinkedIn, www.linkedin.com
COMMON.net, www.common.net
Jigsaw Data, www.jigsaw.com
nTAG Interactive, www.ntag.com
Dodgeball, www.dodgeball.com
Playtxt, www.playtxt.net
Jambo Networks, www.jambonetworks.com
Plazes, www.plazes.com
Crunkie, www.crunkie.com
bluejack Q: Mobile Phone Bluejacking, www.bluejackq.com
Tikiwiki Community Portal, tikiwiki.org
JotSpot, www.jotspot.com
Open Directory Project, dmoz.org
Skype, www.skype.com
Jyve, www.jyve.com
SomeoneNew, www.someonenew.com
JYBE, www.jybe.com
del.icio.us, del.icio.us
Google Blogger, www.blogger.com/start
Yahoo 360, 360.yahoo.com
MSN Spaces, spaces.msn.com
Flickr, www.flickr.com
HeyPix, www.heypix.com
QuakeCon, www.quakecon.org
There, www.there.com

MEANING

Webmap PropertyView, www.webmap.com.au
TerraServer.com, www.terraserver.com
TopoZone, www.topozone.com
NOAA CLASS, www.class.noaa.gov
piXlogic, www.pixlogic.com
Purdue University’s Shape Search, engineering.purdue.edu/PRECISE/dess.html
Google Video Search, video.google.com
Yahoo! Video Search, video.search.yahoo.com
blinkx TV, www.blinkx.com
Singingfish, search.singingfish.com
Videora, www.videora.com
A9.com, www.a9.com
Internet Archive, www.archive.org
Google Desktop, desktop.google.com
X1 Technologies, www.x1.com
ERIC – Education Resources Information Center, www.eric.ed.gov
Virage, www.virage.com
Adding Meaning to the Web (CSC LEF Technology Grant paper), www.csc.com/aboutus/lef/mds67_off/index.shtml#grants
Semantic Web Challenge, challenge.semanticweb.org
Feedster (RSS Search Engine), www.feedster.com
Rojo (RSS Web service), www.rojo.com
Hive Group, www.hivegroup.com
SmartMoney.com’s Map of the Market, www.smartmoney.com/marketmap
Peet’s Coffees and Teas, www.peets.com/selector_coffee/coffee_selector.asp
10x10, www.tenbyten.org/now.html
Gapminder, www.gapminder.org
Vizible Corporation, www.vizible.com
GeoWall Consortium, geowall.org
Web Mining (CSC LEF Technology Grant paper), www.csc.com/aboutus/lef/mds67_off/index.shtml#grants
Attensity, www.attensity.com
Vision IQ, www.vision-iq.com

ACKNOWLEDGMENTS

LEF Associate Ed Luczak conducted the research for this report. A three-time recipient of the prestigious CSC Award for Technical Excellence, Ed is one of CSC’s premier technologists. He has served in many strategic roles in his 28-year career at CSC in the U.S. federal sector; among the most noteworthy are roles with the National Aeronautics and Space Administration and the Environmental Protection Agency. Ed’s specialties include expert systems, complex data sharing and data semantics.
Working on Extreme Data has expanded how Ed thinks about information technology, giving him a broad view of IT across numerous industries. He now spends more time using Skype, a Pocket PC, GPS receivers, digital audio devices, a Web-enabled camera phone, blogs, wikis and social networking software – and exploring the role that these and other extreme data technologies will play in innovation for corporations, governments and consumers. Ed is based in New Carrollton, Maryland.

The LEF would like to thank the many others who contributed to this report: Beverly Bacon, CSC; Mike Benasutti, CSC; Jerry Blodgett, CSC; Lisa Braun, CSC; Johan Bygden, CSC; Peter Cochrane, ConceptLabs; Tino Cremidis, CSC; Jerry Cronin, CSC; Rob Cross, University of Virginia; Bill Cunningham, CSC; David Dossett, CSC; Phillip Ehlen, CSC; Alex Fox, ORBIMAGE; Dennis Franklin, CSC; Lawrence Henry, CSC; Dennis Hettema, OP3; Jim Kelly, Intel; Cai Kjaer, CSC; Richard Kramer, CSC; Gilles Le Caro, CSC; Jason Leigh, University of Illinois at Chicago; Vic Leonard, DigitalGlobe; Laurence Lock Lee, CSC; David MacLuskie, CSC; Christine Matthews, CSC; David Moschella, CSC; Doug Neal, CSC; Jeff Rushton, Vizible Corporation; Don Smith, CSC; Carl Stålhandske, CSC; Marc Stern, CSC; Geoffrey Stone, University of Chicago; Dennis Timmermans, OP3; Stéphanie Tostivint, CSC; Yulun Wang, InTouch Health; Jason Westra, CSC.

Computer Sciences Corporation

Worldwide CSC Headquarters

The Americas
2100 East Grand Avenue
El Segundo, California 90245
United States
+1.310.615.0311

European Group
The Royal Pavilion
Wellesley Road
Aldershot
Hampshire GU11 1PZ
United Kingdom
+44(0)1252.534000

Australia/New Zealand
26 Talavera Road
Macquarie Park NSW 2113
Australia
+61(0)2.9034.3000

Asia
139 Cecil Street
#08-00 Cecil House
Singapore 069539
Republic of Singapore
+65.6221.9095

About CSC

Computer Sciences Corporation helps clients achieve strategic goals and profit from the use of information technology. With the broadest range of capabilities, CSC offers clients the solutions they need to manage complexity, focus on core businesses, collaborate with partners and clients, and improve operations.

CSC makes a special point of understanding its clients and provides experts with real-world experience to work with them. CSC is vendor-independent, delivering solutions that best meet each client’s unique requirements.

For more than 40 years, clients in industries and governments worldwide have trusted CSC with their business process and information systems outsourcing, systems integration and consulting needs. The company trades on the New York Stock Exchange under the symbol “CSC.”

© 2005 Computer Sciences Corporation. All rights reserved. Printed in USA 4M 8/05 AP WH712